add devel script for copying files from the stacks to a new loca… #543

Merged: 3 commits merged into v3-legacy from get-files-from-stacks on Sep 11, 2019

peetucket (Member) commented Sep 10, 2019

A simple script used for AI work: given a CSV with a list of druids, it copies all of the files from the stacks for each druid (as well as the MODS files from the PURL cache) to another location.
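
For readers without access to the diff, the behavior described above can be sketched roughly as follows. This is a minimal illustration, not the merged script: the function names, the keyword arguments, the nested "druid tree" directory layout (bb/123/cd/4567), the cached MODS file being named `mods`, and the `mods.xml` output name are all assumptions made for the example.

```ruby
require 'csv'
require 'fileutils'

# Turns "druid:bb123cd4567" or "bb123cd4567" into the nested path
# "bb/123/cd/4567". The two-letter/three-digit/two-letter/four-digit
# pattern and the nested layout are assumptions for this sketch.
def druid_tree(druid)
  id = druid.to_s.strip.sub(/\Adruid:/, '')
  match = id.match(/\A([a-z]{2})(\d{3})([a-z]{2})(\d{4})\z/)
  raise ArgumentError, "invalid druid: #{druid.inspect}" unless match

  File.join(*match.captures)
end

# Copies every file found under the druid's stacks folder, plus the cached
# MODS file if one exists, into <output_dir>/<druid>. Returns the list of
# files written. The root paths are supplied by the caller (hypothetical).
def copy_druid_files(druid, stacks_root:, purl_root:, output_dir:)
  tree = druid_tree(druid)
  dest = File.join(output_dir, tree.delete('/'))
  FileUtils.mkdir_p(dest)
  copied = []

  source = File.join(stacks_root, tree)
  # With base:, Dir.glob returns paths relative to the source folder.
  Dir.glob('**/*', base: source).sort.each do |relative|
    path = File.join(source, relative)
    next unless File.file?(path)

    target = File.join(dest, relative)
    FileUtils.mkdir_p(File.dirname(target))
    FileUtils.cp(path, target)
    copied << target
  end

  # Assumed location and name of the cached MODS record.
  mods = File.join(purl_root, tree, 'mods')
  if File.file?(mods)
    target = File.join(dest, 'mods.xml')
    FileUtils.cp(mods, target)
    copied << target
  end

  copied
end

# Reads the 'druid' column of the input CSV (other columns are ignored)
# and copies the files for each row.
def copy_from_csv(csv_path, stacks_root:, purl_root:, output_dir:)
  copied = []
  CSV.foreach(csv_path, headers: true) do |row|
    copied.concat(
      copy_druid_files(row['druid'], stacks_root: stacks_root,
                                     purl_root: purl_root,
                                     output_dir: output_dir)
    )
  end
  copied
end
```

A call would look something like `copy_from_csv('/path/to/druids.csv', stacks_root: '/stacks', purl_root: '/purl/document_cache', output_dir: '/path/to/output')`, where every path shown is a placeholder rather than a real mount point.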

coveralls commented Sep 10, 2019

Coverage remained the same at 58.581% when pulling c0baa90 on get-files-from-stacks into 09e92de on v3-legacy.

jcoyne (Contributor) commented Sep 10, 2019

This doesn't seem to have anything to do with assembly. Is this the correct repository? Have you considered PURL or Stacks?

peetucket (Member, Author) commented

My opinion is that this code falls into the "scripty" category. Scripts of this type have historically collected in the pre-assembly codebase, because it is deployed to a server that has access to basically all mounts and services and is configured for this. It is not out of the realm of possibility that this script could be extended to fetch information not associated with stacks or PURL (and it already touches both, which makes it ambiguous which of those two repos it would belong in). I suggest we discuss at the next meeting a strategy for when a script or rake task goes in the specific repo it is directly related to versus the "script" location.

ndushay (Contributor) left a comment

Thanks for this; great comment block. I'd like it to also include the requester and a bit of context for our future selves.

# Copies files from the stacks into a separate folder given a list of druids
# Pass in the full path to the input CSV and the full path to the output location below.
# The input CSV must have a column called 'druid' containing the druid; other columns are ignored
# August 29, 2019

Contributor

Could you please indicate who requested it and the context too?

Contributor

Yes. If we don't have tests, I would like to know how long we have to support this and who is in charge of authoring and testing any changes that need to be made. Ideally this would have a sunset date after which we can delete it.

Member Author

added some comments/context

Contributor

Can you explain why this script has to circumvent the existing APIs to accomplish what it needs to?

Member Author

You mean public APIs, like crawling PURL pages and downloading content from the stacks? Or the dor-services-app APIs? Either way, it was just expediency, under the presumption that this is possibly a one-time-ish request. If it turns out not to be, it could be formalized further.

Contributor

Okay, in that case, can we add "This work was done as a one-time request and does not need to be maintained" to the documentation?

Member Author

Sure, done. I'd also like to see us agree on a general strategy for these types of requested scripts to minimize time spent on them. For example, a general README in this "devel" folder (assuming that is the specified place) that stipulates the scripts are generally unsupported and untested, and not suitable for regular production usage.

peetucket changed the title from "add devel script for copying files from the stacks to a new location" to "add devel script for copying files from the stacks to a new loca…" on Sep 11, 2019
peetucket merged commit b37185e into v3-legacy on Sep 11, 2019
peetucket deleted the get-files-from-stacks branch on September 11, 2019 at 21:30
4 participants