make sure we know what pre-assembly outputs #349

Closed
jmartin-sul opened this issue Sep 24, 2018 · 6 comments

@jmartin-sul
Member

and what we need to keep. what gets produced:

  • progress log file - we know we want this. allows resumption of job.
  • what used to go to standard out -- is this valuable?
    • if it is valuable, and we want to keep it, does it get kept separately in a log file in the bundle context output dir? do we need to present the results to the user? or is it something we can just send to the regular rails logs.
  • application-wide logs: these just go to the rails log.
  • other pre-assembly output -- what are the other artifacts of pre-assembly? datastreams? organized content? this is what the assembly robot picks up, right? if output is large, that's what the smpl symlink style is for?
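
to make the "allows resumption of job" point concrete, here is a minimal sketch of how a progress log could support resuming. everything in it is an assumption for illustration: the JSON-lines format, the `druid`/`success` fields, and the helper names `log_progress`, `completed_druids` and `remaining_druids` are hypothetical, not what pre-assembly actually writes or calls.

```ruby
require 'json'

# Hypothetical helper: append one JSON line per processed object.
# The file format and field names are assumptions, not pre-assembly's real log format.
def log_progress(log_path, druid, success)
  File.open(log_path, 'a') do |f|
    f.puts(JSON.generate({ 'druid' => druid, 'success' => success }))
  end
end

# Hypothetical helper: druids that an earlier run recorded as successful.
def completed_druids(log_path)
  return [] unless File.exist?(log_path)

  File.readlines(log_path, chomp: true)
      .reject(&:empty?)
      .map { |line| JSON.parse(line) }
      .select { |entry| entry['success'] }
      .map { |entry| entry['druid'] }
end

# On resume, only objects without a successful entry are processed again.
def remaining_druids(log_path, all_druids)
  all_druids - completed_druids(log_path)
end
```

with a log like this, a resumed job would skip anything already recorded as successful and retry failures along with anything it never reached.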
@jmartin-sul
Member Author

jmartin-sul commented Sep 24, 2018

also, how exactly does the staging_style_symlink behavior differ in terms of where it puts artifacts or what artifacts it creates?

@blalbrit
Contributor

on the symlink question: for very large data transfers (like media objects), the symlinks are created in the dor/assembly folder, passed up to the /dor/workspace folder, and the content is pulled over from the storage server at the shelve and sdr-ingest-transfer steps. that way, we're not dealing with network transfer times for (sometimes) TB of content except at the ingest to preservation step. so - other than the way that content is moved around the system - the symlink approach is producing the same kinds of objects and outputs as non-symlinked.
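
a rough sketch of the difference between the two staging styles, assuming the only change is whether the staged entry is a copy or a link back to the source. `stage_file` and its `style:` option are hypothetical names for illustration, not pre-assembly's actual method or configuration.

```ruby
require 'fileutils'

# Hypothetical helper: place a source file into an object's staging directory.
# :copy moves the bytes now; :symlink only records a pointer to the source,
# so the content itself is read later by whichever step follows the link.
def stage_file(source, dest_dir, style: :copy)
  FileUtils.mkdir_p(dest_dir)
  dest = File.join(dest_dir, File.basename(source))

  case style
  when :copy
    FileUtils.cp(source, dest)
  when :symlink
    # Link to the absolute path so the link still resolves from the new location.
    FileUtils.ln_s(File.expand_path(source), dest)
  else
    raise ArgumentError, "unknown staging style: #{style}"
  end

  dest
end
```

either way the staged directory ends up with an entry of the same name, which fits the point above that the outputs are the same kinds of objects and only the content movement differs.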

@blalbrit
Contributor

> what used to go to standard out -- is this valuable?
> if it is valuable, and we want to keep it, does it get kept separately in a log file in the bundle context output dir? do we need to present the results to the user? or is it something we can just send to the regular rails logs.

^^ this should be what we're keeping in the progress log file - the user does not need to be presented with the results of standard out except through the log itself.

@blalbrit
Contributor

> other pre-assembly output -- what are the other artifacts of pre-assembly? datastreams? organized content? this is what the assembly robot picks up, right? if output is large, that's what the smpl symlink style is for?

^^ pre-assembly creates a stub contentMD output which it deposits in /dor/assembly at the bottom of the druid tree for each object. assembly robots pick up from there. the symlink style is for reduction of content movement across the network for very large files.

pre-assembly also creates organized content in the /dor/assembly druid tree path that gets picked up by assembly robots.
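
for anyone unfamiliar with the druid tree, here is a small sketch of how an object's directory under /dor/assembly could be derived. it assumes the conventional layout where the druid is split into its four segments and then repeated as the final directory. `druid_tree_path` and `DRUID_PATTERN` are hypothetical names, the pattern is a simplification, and the sketch does not show where under that directory the content and the stub contentMD go.

```ruby
# Simplified druid shape: two letters, three digits, two letters, four digits,
# with an optional "druid:" prefix. Real druids may restrict the allowed letters.
DRUID_PATTERN = /\A(?:druid:)?([a-z]{2})(\d{3})([a-z]{2})(\d{4})\z/

# Hypothetical helper: build the druid tree path for an object.
# The default base follows the /dor/assembly location mentioned in this thread.
def druid_tree_path(druid, base = '/dor/assembly')
  match = DRUID_PATTERN.match(druid)
  raise ArgumentError, "invalid druid: #{druid}" unless match

  segments = match.captures          # e.g. ["ab", "123", "cd", "4567"]
  File.join(base, *segments, segments.join)
end

puts druid_tree_path('druid:ab123cd4567')
# prints /dor/assembly/ab/123/cd/4567/ab123cd4567
```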

it also may still be doing something with smpl techMD files - @peetucket , can you weigh in on that?

@peetucket
Member

I believe SMPL provides incoming techMD files that are combined in some way algorithmically, see https://github.com/sul-dlss/pre-assembly/blob/master/app/lib/pre_assembly/digital_object.rb#L182

@jmartin-sul
Member Author

i think we got the info we needed, and we're just starting in on UAT, so closing this.

3 participants