Globus Store implementation #10162
Conflicts: pom.xml
I'm clicking approve, not because I'm claiming that I understand everything that's going on here, but because I'm doing a combination of review+QA at the same time already.
@@ -0,0 +1,19 @@
Globus support in Dataverse has been expanded to include support for using file-based Globus endpoints, including the case where files are stored on tape and are not immediately accessible,
Should Globus be listed under https://guides.dataverse.org/en/latest/admin/integrations.html ?

The setup required to enable Globus is described in the `Community Dataverse-Globus Setup and Configuration document <https://docs.google.com/document/d/1mwY3IVv8_wTspQC0d4ddFrD2deqwr-V5iAGHgOy4Ch8/edit?usp=sharing>`_ and the references therein.
More details of the setup required to enable Globus is described in the `Community Dataverse-Globus Setup and Configuration document <https://docs.google.com/document/d/1mwY3IVv8_wTspQC0d4ddFrD2deqwr-V5iAGHgOy4Ch8/edit?usp=sharing>`_ and the references therein.

As described in that document, Globus transfers can be initiated by choosing the Globus option in the dataset upload panel. (Globus, which does asynchronous transfers, is not available during dataset creation.) Analogously, "Globus Transfer" is one of the download options in the "Access Dataset" menu and optionally the file landing page download menu (if/when supported in the dataverse-globus app).

An overview of the control and data transfer interactions between components was presented at the 2022 Dataverse Community Meeting and can be viewed in the `Integrations and Tools Session Video <https://youtu.be/3ek7F_Dxcjk?t=5289>`_ around the 1 hr 28 min mark.
Is this video still worth watching, given the changes?
I think so. I gave a talk in 2023 as well, but 2022 goes into the steps in more detail, so I left it.
Can we link to this page from https://guides.dataverse.org/en/6.0/api/intro.html#lists-of-dataverse-apis ?
export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export PERSISTENT_IDENTIFIER=doi:10.5072/FK27U7YBV
export JSON_DATA="{"taskIdentifier":"3f530302-6c48-11ee-8428-378be0d9c521", \
Hmm, does this work? We might need single quotes on the outside instead of double.
... or have the double quotes on the inside escaped. But yeah, the above will export a JSON string with no double quotes in it.
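For reference, a minimal sketch of the quoting behavior discussed in the two comments above. It reuses only the `taskIdentifier` value from the diff; the variable names `BROKEN`, `SINGLE_QUOTED`, and `ESCAPED` are made up for illustration and are not part of the PR.

```shell
# Unescaped inner double quotes end and reopen the string, so the shell
# drops them and the JSON loses its quotes.
BROKEN="{"taskIdentifier":"3f530302-6c48-11ee-8428-378be0d9c521"}"

# Option 1: single quotes on the outside keep the inner double quotes.
SINGLE_QUOTED='{"taskIdentifier":"3f530302-6c48-11ee-8428-378be0d9c521"}'

# Option 2: keep the outer double quotes and escape the inner ones.
ESCAPED="{\"taskIdentifier\":\"3f530302-6c48-11ee-8428-378be0d9c521\"}"

printf '%s\n' "$BROKEN"         # {taskIdentifier:3f530302-6c48-11ee-8428-378be0d9c521}
printf '%s\n' "$SINGLE_QUOTED"  # {"taskIdentifier":"3f530302-6c48-11ee-8428-378be0d9c521"}
printf '%s\n' "$ESCAPED"        # {"taskIdentifier":"3f530302-6c48-11ee-8428-378be0d9c521"}
```

Single quotes are usually the simpler choice for a literal JSON payload. Escaped double quotes are only needed if the payload has to interpolate other shell variables.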
@@ -499,14 +499,14 @@ Logging & Slow Performance

.. _file-storage:

File Storage: Using a Local Filesystem and/or Swift and/or Object Stores and/or Trusted Remote Stores
-----------------------------------------------------------------------------------------------------
File Storage: Using a Local Filesystem and/or Swift and/or Object Stores and/or Trusted Remote Stores and/or Globus Stores
Ha. At some point we might want a more generic title instead of listing each type. 😄 Plus, Swift probably shouldn't be second forever. It's hardly used.
I'm going to assume that the war file currently deployed on the test instance (from around noon Dec. 7) is the latest build.
Also, I'm going to assume that the last Jenkins failure (build 18) is one of those random flukes where the EC2 instance fails to start up in time and has nothing to do with the branch.
My guess is that the Globus store does not have the files-not-accessible-by-dataverse flag set to true when it should. (Publish fails for the same reason: validation is on and is not disabled by this flag.)
src/main/java/edu/harvard/iq/dataverse/dataaccess/AbstractRemoteOverlayAccessIO.java
Co-authored-by: Philip Durbin <philipdurbin@gmail.com>
[this is a status update per Slack discussion] So, we have a few documentation edit requests, some more nitpicky than others. But as for the functionality in the PR, everything I could crudely test on my MacBook and on Jim’s EC2 instance worked for me. “Crudely” is the key word.
We reviewed and cleared the "managed store tied to Globus S3 connector" use case this morning. Merging.
What this PR does / why we need it: Implements new Globus functionality to handle
The overall functionality requires use of the Borealis Dataverse-Globus app which is being updated to work with the functionality added in this PR.
Which issue(s) this PR closes:
Closes #9123
Special notes for your reviewer:
~code complete - some docs but more to follow. External doc includes more info.
Suggestions on how to test this:
Without the app, testing requires some manual steps. Basically, Dataverse launches the app like an external tool to support upload (transfer to Dataverse) or download (transfer from Dataverse), so to test one can look at the URL used to launch the app and manually make the Dataverse API calls it would do, along with initiating the Globus transfer it would do (via the standard Globus app). As it sounds, this is tedious, and it requires a properly configured Dataverse instance.
For dev, I have an AWS instance (with associated Globus endpoints) that can be used for testing. Getting on a Zoom call is probably the easiest way to walk through it all.
Does this PR introduce a user interface change? If mockups are available, please link/include them here: It adds Globus-related functionality for upload/download if/when the Globus functionality is enabled.
Is there a release notes update needed for this change?: yes - tbd
Additional documentation: