Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GDCC/8750 DRS archiver #8752

Merged

Conversation

qqmyers
Copy link
Member

@qqmyers qqmyers commented May 26, 2022

What this PR does / why we need it: This adds a DRSSubmitToArchiveCommand class that can send archival Bags to Harvard's DRS. The class is also an example that supports status messaging, optional single-version semantics, per collection archiving capabilities, JWT authentication with the archival system, etc.

Which issue(s) this PR closes:

Closes #8750

Special notes for your reviewer: This essentially depends on all the 3a stuff before it, most specifically on #8751

Suggestions on how to test this: The Harvard Library has a working copy of Dataverse in it's qa network that is configured to work with a test DRS instance. There are also integration tests that create a dataset, publish it, and check for a status returned from the DRS DIMS system. The easiest way to test this is probably to deploy this specific PR to that machine, run the IT tests, and perform any additional manual tests desired (i.e. testing that publishing a dataset in a collection that isn't configured for archiving does not trigger archiving, flipping the single_version toggle from true to false to see how the Dataverse UI changes, etc.). Nominally, nothing else breaking and this working for the qa network (and eventually on Harvard Dataverse) are the real acceptance criteria.

Does this PR introduce a user interface change? If mockups are available, please link/include them here: no - nothing beyond prior PRs

Is there a release notes update needed for this change?: addressed in #8611

Additional documentation: Harvard Library has addition info on configuration and user testing plans, etc.

@qqmyers qqmyers added HDC Harvard Data Commons HDC: 3a Harvard Data Commons Obj. 3A labels May 26, 2022
@coveralls
Copy link

coveralls commented May 26, 2022

Coverage Status

Coverage decreased (-0.04%) to 19.58% when pulling 1ddbe38 on GlobalDataverseCommunityConsortium:GDCC/8750-DRS_Archiver into e1f862e on IQSS:develop.

@pdurbin
Copy link
Member

pdurbin commented Jul 25, 2022

As explained in the description and as discussed last week, we shouldn't move this pull request to QA until PR #8751 has been merged.

Here's a "compare" link we can use to see just the changes in this pull request:

GlobalDataverseCommunityConsortium/dataverse@GDCC/8749-S3Archiver...GDCC/8750-DRS_Archiver

@pdurbin pdurbin self-assigned this Jul 25, 2022
Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a full, final review because we're waiting for #8751 to be merged but I looked at the diff between the two pull requests and tried to give some feedback.

import com.auth0.jwt.exceptions.JWTCreationException;

@RequiredPermissions(Permission.PublishDataset)
public class DRSSubmitToArchiveCommand extends S3SubmitToArchiveCommand implements Command<DatasetVersion> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DRSSubmitToArchiveCommand is so Harvard-specific it would be nice to move it outside the main code base. Is this possible?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nominally the exact archiver class is picked up via reflection so at a minimum one could drop this class file in after war deployment and have it work. However, I don't know enough to know if one can compile it separately and drop a jar in instead (anyone?). Nominally any/all of the archivers could be handled that way going forward if that is figured out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't think anyone on the team knows how to do this. It would be great to figure this out. We'd like to make Dataverse more modular.

doc/sphinx-guides/source/installation/config.rst Outdated Show resolved Hide resolved
doc/sphinx-guides/source/installation/config.rst Outdated Show resolved Hide resolved
doc/sphinx-guides/source/installation/config.rst Outdated Show resolved Hide resolved
doc/sphinx-guides/source/installation/config.rst Outdated Show resolved Hide resolved
@pdurbin
Copy link
Member

pdurbin commented Aug 3, 2022

@qqmyers as discussed this PR is a bit parked until #8751 is merged. However, there are merge conflicts...

GlobalDataverseCommunityConsortium/dataverse@GDCC/8749-S3Archiver...GDCC/8750-DRS_Archiver

... so if you want to go ahead and resolve them (merge the other branch into this one), it could save some time later.

Thanks for making so many changes base on my previous review. Again, I don't have a way to test this but I think we're close. And I understand that an environment will be available for QA.

@qqmyers
Copy link
Member Author

qqmyers commented Aug 3, 2022

Hmm - the last commit was Merge branch 'GDCC/8749-S3Archiver' into GDCC/8750-DRS_Archiver

@pdurbin
Copy link
Member

pdurbin commented Aug 3, 2022

I may well have caused the conflict my making a small typo fix in the other branch (to get the tests to run).

@qqmyers
Copy link
Member Author

qqmyers commented Aug 3, 2022

Yep - fixed.

@scolapasta scolapasta added this to the 5.12 milestone Aug 5, 2022
Copy link
Member

@pdurbin pdurbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR this depended on was just merged (thanks @qqmyers for merging the latest). Tests are passing. Harvard-specific docs are gone. I think we're good. Off to QA.

@kcondon kcondon self-assigned this Aug 15, 2022
@kcondon kcondon merged commit 6bd3ec6 into IQSS:develop Aug 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
HDC Harvard Data Commons HDC: 3a Harvard Data Commons Obj. 3A
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HDC 3A: Support archiving from Dataverse to the Harvard DRS
5 participants