Ingest corporate archive images #1126
Questions - just to make sure I understand what it is I'm supposed to be doing:

Regarding the Glacier aspect, I think we can trigger a Bulk retrieval, then use Notifications to trigger the next step.

There is a mention of this forming a "Repeatable pipeline is established for future ingests from S3 buckets" - are there other ingests expected in the near future? Knowing this would help establish what kinds of options/parameters I might need to support.
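The Glacier step mentioned above could be sketched with boto3 along these lines. This is only an illustration, not the project's actual configuration: the function names are made up, and the idea is that S3's `s3:ObjectRestore:Completed` event notification would then trigger the next step.

```python
# Sketch of a Bulk-tier Glacier retrieval via boto3 (illustrative only).

def bulk_restore_request(days):
    """Build the RestoreRequest payload for a Bulk-tier Glacier restore."""
    return {
        "Days": days,
        "GlacierJobParameters": {"Tier": "Bulk"},
    }

def restore_object(bucket, key, days=7):
    """Ask S3 to restore a Glacier object at the cheap Bulk tier.

    Once the restore completes, S3 can emit an
    s3:ObjectRestore:Completed event notification, which could
    trigger the next step of the pipeline.
    """
    import boto3  # imported lazily so the sketch is testable offline

    s3 = boto3.client("s3")
    s3.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest=bulk_restore_request(days),
    )
```

Bulk is the cheapest (and slowest) retrieval tier, which seems a reasonable fit for a batch ingest where nothing is waiting interactively.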
@paul-butcher my understanding is that the next, and most likely, ingest would be when the next year of corporate photography shoots is accessioned, so it is likely to be a largely similar/uniform kind of thing from a similar sort of place - if that is vaguely precise enough for now.
That's great. At the lowest level, the main thing I was wondering about is whether Glacier will normally be involved; if it's only going to be involved occasionally, it might be easiest to do that bit manually. I don't need an answer on this, as I'll probably work it out as I go along.
Don't know; assume not. I am out on Tuesday/Wednesday, so I suggest you get in touch with Ashley?
Ah. I've just spotted the minor wrinkle that the two buckets are in different accounts. That's a little bit of a pain.
Archivematica can be a bit flaky when ingesting large amounts of data, so we may need to do some kind of retry. The limit on ingest is by number of files per package - Ashley recalls that the maximum is probably 500. I will set the maximum to 250 in order to steer well clear of that: if we have to retry because of ephemeral issues, I'd like to be pretty sure we aren't also failing because the packages are too big.
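The 250-file cap described above could be applied by chunking the file listing before packaging. A minimal sketch (the function name and the example filenames are placeholders, not part of the actual pipeline):

```python
# Split a file listing into packages of at most MAX_FILES_PER_PACKAGE
# files, to stay well clear of Archivematica's (roughly 500-file) limit.

MAX_FILES_PER_PACKAGE = 250

def chunk_files(files, size=MAX_FILES_PER_PACKAGE):
    """Yield successive packages of at most `size` files."""
    for start in range(0, len(files), size):
        yield files[start:start + size]
```

For example, a 600-file shoot would come out as three packages of 250, 250, and 100 files.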
I assume that the target for these is Should it go into some (new?) subfolder in that bucket? When ephemeral failures occur, is it just a matter of moving a zip from If they fail for a legitimate reason, would it be appropriate to download the failed zip from there, modify/split it, then upload? (as opposed to storing the zips elsewhere and fetching the one that corresponds to the failure)
If we want to practice this (and I think we should, because it's really hard to delete things in storage if you mess up) then it'll need to go into the born-digital-accession folder of The If you need to resubmit the zip because it failed in Archivematica (usually because Archivematica fell over rather than anything legitimately wrong with the zip), you can just copy the zip into the same location; it'll overwrite, and the Lambda should pick it up again. If they fail for a legitimate reason then you should be able to pick it back up out of
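The resubmission step described above could be sketched as an S3 copy back into the accessions location. All bucket names and function names here are hypothetical placeholders; only the `born-digital-accession` folder name comes from the discussion.

```python
# Sketch: copy a failed zip back into the accessions location so that the
# bucket's ObjectCreated notification re-triggers the ingest Lambda.

ACCESSION_PREFIX = "born-digital-accession"

def resubmit_args(source_bucket, dest_bucket, zip_name):
    """Build the copy_object arguments for putting a zip back in place.

    Writing to the same key overwrites the previous object, and the
    overwrite fires the ObjectCreated notification again.
    """
    return {
        "CopySource": {"Bucket": source_bucket, "Key": zip_name},
        "Bucket": dest_bucket,
        "Key": f"{ACCESSION_PREFIX}/{zip_name}",
    }

def resubmit(source_bucket, dest_bucket, zip_name):
    import boto3  # lazy import: the arg-building above is testable offline

    s3 = boto3.client("s3")
    s3.copy_object(**resubmit_args(source_bucket, dest_bucket, zip_name))
```

If the zip has been modified or split locally instead, an `upload_file`/`put_object` to the same key would have the same re-triggering effect.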
To support the accession of corporate photography into the archive, we would like to find a way to ingest it into Archivematica automatically.
https://www.notion.so/wellcometrust/Ingest-corporate-archive-images-09d2b2fc47b846a0a377900a6c7e386d?pvs=4