Optimized batch_import_marc #2995
Thanks, @damien-git, that's a great idea; I did not realize that SolrMarc supported this behavior. I'm adding a TODO item to port these changes to the .bat versions of the files if possible, and I'll look into doing that when time permits.
Thanks again, @damien-git. I haven't actually tried this out yet, but I had time to give it a closer look today. I found one possible bug and had two suggestions that might or might not be helpful -- let me know what you think about them, but there's no pressure to implement if you don't think they're worth the effort.
Once you've had a chance to look at this, I'll see about working on the Windows port of the new functionality. I suspect that adding the -x functionality to batch-import-marc.bat may be more effort than it's worth due to the limitations of batch files (in which case I can just add an error message if somebody tries to pass -x, and we can spend more time if a Windows user actually cares), but I can probably at least add the multiple file support to marc-import.bat.
@damien-git, I've done a little more work on the Windows equivalent functionality -- for now, if you pass a -x flag to the batch-import-marc.bat file, it fails with a message that the feature is not supported in Windows. I suspect that this could be implemented -- probably by using a counter and stringing together the filenames inside the main loop -- but I'm not sure it's worth the effort, since I don't think we have many Windows users. Unless anyone objects, I'll just leave it at this until somebody complains. Is there any other work you want to do on this, or is it now ready to merge from your perspective?
I've opened VUFIND-1626 to track the missing Windows functionality, so people can comment/vote there if they want it. For good measure, I've also linked to the ticket from the Windows error message, so if anyone encounters the situation, they'll know how to report their needs.
I think it's ready to merge.
Thanks again, @damien-git, this is a great improvement. Merging now!
- Allows multiple files to be passed to import-marc.sh (Linux + Windows)
- Utilizes multiple file loading to improve performance of batch-import-marc.sh (Linux only)
---------
Co-authored-by: Demian Katz <demian.katz@villanova.edu>
Currently, when imported files are small, there is a large overhead for each file because the JVM has to be loaded and SolrMarc has to prepare to process files. But SolrMarc supports taking multiple files as input.
This PR changes batch-import-marc.sh and import-marc.sh so that several files can be sent to SolrMarc at once (the batch size is set to 10 by default). When files only contain a few records (which happens, for instance, with incremental harvests), performance is much better.
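To illustrate the batching idea, here is a minimal bash sketch. It is not the code from this PR: `run_in_batches` and `fake_import` are hypothetical names made up for the example, and `fake_import` only stands in for a real importer call so the grouping logic can be seen on its own.

```shell
#!/bin/bash
# Hypothetical helper (not part of VuFind): run a command on files in
# groups of at most BATCH_SIZE, so a costly startup is paid once per
# group rather than once per file.
# Usage: run_in_batches BATCH_SIZE COMMAND FILE...
run_in_batches() {
  local batch_size=$1 cmd=$2
  shift 2
  local batch=()
  local file
  for file in "$@"; do
    batch+=("$file")
    # Once the group is full, hand the whole group to the command.
    if [ "${#batch[@]}" -ge "$batch_size" ]; then
      "$cmd" "${batch[@]}"
      batch=()
    fi
  done
  # Flush the final, possibly partial, group.
  if [ "${#batch[@]}" -gt 0 ]; then
    "$cmd" "${batch[@]}"
  fi
}

# Stand-in for the real importer (an assumption for this sketch):
# it only reports how many files it received in one invocation.
fake_import() {
  echo "importing $# file(s): $*"
}
```

For example, `run_in_batches 10 fake_import a.xml b.xml c.xml` would call `fake_import` once with all three names, while 25 files would lead to three calls (10, 10, and 5 files).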
TODO