
Troubleshooting issues when ingesting items via etd_loader or ingest_etd

Dan Kerchner edited this page Aug 21, 2019 · 4 revisions

General tip:

  • If the stack trace from etd_loader doesn't provide as much information as you'd like, copy the rake command from the console output (look for the line containing Command is:) and run it by hand.

As an example:

INFO:__main__:Unzipping /data/etd-loader/etd_to_be_imported/ to /tmp/tmp78c6eu7c
INFO:__main__:Importing 686240. ETD file is /tmp/tmp78c6eu7c/Hardy_gwu_0075M_14840.pdf and attachements are []
INFO:__main__:Command is: rake RAILS_ENV=production gwss:ingest_etd -- --manifest=/tmp/tmp78c6eu7c/metadata.json --primaryfile=/tmp/tmp78c6eu7c/Hardy_gwu_0075M_14840.pdf
DEPRECATION WARNING: human_readable_type= is deprecated and will be removed from a future release (human_readable_type is deprecated. Set the i18n key for activefedora.models.#{model_name.i18n_key} instead. This will be removed in Hyrax 3). (called from <class:GwEtd> at /opt/scholarspace/scholarspace-hyrax/app/models/gw_etd.rb:11)
DEPRECATION WARNING: human_readable_type= is deprecated and will be removed from a future release (human_readable_type is deprecated. Set the i18n key for activefedora.models.#{model_name.i18n_key} instead. This will be removed in Hyrax 3). (called from <class:GwWork> at /opt/scholarspace/scholarspace-hyrax/app/models/gw_work.rb:11)
INFO:__main__:Repository id for 686240 is dr26xz169
INFO:__main__:Adding 686240 with dr26xz169
INFO:__main__:Unzipping /data/etd-loader/etd_to_be_imported/ to /tmp/tmpuaehg2gy
INFO:__main__:Importing 686809. ETD file is /tmp/tmpuaehg2gy/Stiegler_gwu_0075A_14848.pdf and attachements are []
INFO:__main__:Command is: rake RAILS_ENV=production gwss:ingest_etd -- --manifest=/tmp/tmpuaehg2gy/metadata.json --primaryfile=/tmp/tmpuaehg2gy/Stiegler_gwu_0075A_14848.pdf

followed by the rest of the stack trace. You would then take the last Command is: line before the stack trace and run it yourself:

cd /opt/scholarspace/scholarspace-hyrax

rake RAILS_ENV=production gwss:ingest_etd -- --manifest=/tmp/tmpuaehg2gy/metadata.json --primaryfile=/tmp/tmpuaehg2gy/Stiegler_gwu_0075A_14848.pdf

and you'll often get a more informative stack trace.
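If the console output has been saved to a file, pulling out the rake commands can be scripted. The following is a minimal sketch, not part of etd_loader: the function name extract_rake_commands is hypothetical, and the INFO:__main__:Command is: prefix is assumed to match the sample output above exactly.

```python
import re

# Assumption: the prefix below matches the etd_loader console output shown above.
COMMAND_RE = re.compile(r"^INFO:__main__:Command is: (?P<command>.+)$")


def extract_rake_commands(log_text):
    """Return every command logged on a 'Command is:' line, in log order."""
    commands = []
    for line in log_text.splitlines():
        match = COMMAND_RE.match(line.strip())
        if match:
            commands.append(match.group("command"))
    return commands
```

In the example above the stack trace follows the second Command is: line, so the last element of the returned list is the command to re-run by hand.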

Specific Scenarios:

Symptom: rake aborted! Ldp::BadRequest: RDF was not parsable: [line: 8, col: 1 ] Broken token (newline): ;
Probable root cause: Control characters such as tabs, etc. in metadata fields in the XML. Ref
Recommended solution: Unzip the zip file, edit the control characters out, update the zip with the new XML (use zip -u), then re-run

Symptom: TBD
Probable root cause: TBD
Recommended solution: TBD
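For the control-character scenario, finding the offending characters in the unzipped XML by eye can be tedious. The sketch below is one possible helper, not part of etd_loader: the function names are hypothetical, and the set of characters treated as problematic (tabs, the other C0 control characters except line feed and carriage return, and DEL) is an assumption. A line break inside a metadata field is not touched and would still need to be edited by hand.

```python
import re

# Assumption: tab, the other C0 controls except LF (\x0a) and CR (\x0d), and DEL
# are the characters to remove. Adjust the class if other characters cause trouble.
_CONTROL_CHARS = re.compile(r"[\x00-\x09\x0b\x0c\x0e-\x1f\x7f]")


def find_control_chars(xml_text):
    """Return (line, column, character) for each control character, 1-indexed."""
    hits = []
    # Split on "\n" only: str.splitlines() would also split on some control characters.
    for line_no, line in enumerate(xml_text.split("\n"), start=1):
        for match in _CONTROL_CHARS.finditer(line):
            hits.append((line_no, match.start() + 1, match.group()))
    return hits


def strip_control_chars(xml_text):
    """Replace each matched control character with a single space."""
    return _CONTROL_CHARS.sub(" ", xml_text)
```

After writing the cleaned text back to the extracted XML file, updating the zip with zip -u and re-running proceed as described in the recommended solution.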