EPrints_Json_Importer

Julie Allinson edited this page Jul 3, 2018 · 8 revisions

EPrints JSON Importer

Add the gem

gem 'leaf_addons', :git => 'https://github.com/leaf-research-technologies/leaf_addons.git'

Run bundle install

Run the importers generator: rails g leaf_addons:importers

Run:

bin/import_from_eprints_json <server> <json_file>

For example:

bin/import_from_eprints_json localhost path/to/my/file.json

What the script does

  • Reads through the EPrints JSON and, for each record, builds an object of a particular type
  • Downloads the attached files and adds them to the object, either as FileSets or as Files within FileSets
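The first step can be pictured with a minimal sketch. It is not the importer's own code: the helper name each_eprint and the sample record are hypothetical, and it assumes the export is either a JSON array of records or a single record.

```ruby
require 'json'

# Hypothetical helper (not part of leaf_addons): parse an EPrints JSON
# export and yield each record in turn.
def each_eprint(json_text)
  parsed = JSON.parse(json_text)
  # Assumption: the export is an array of records, or a single record.
  records = parsed.is_a?(Array) ? parsed : [parsed]
  records.each { |record| yield record }
end

# Illustrative sample data; the field names are assumptions.
sample = '[{"eprintid": 1, "type": "article", "title": "A paper", "documents": []}]'

each_eprint(sample) do |record|
  puts "#{record['eprintid']}: #{record['type']} (#{record['documents'].length} documents)"
end
```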

The Code

The importer code can be found here:

lib/importer/eprints

Supported Models

From DogBiscuits:

  • PublishedWork
  • ConferenceItem
  • JournalArticle

To use different models, make sure each model has a Factory and override the 'find_model' method in lib/importer/eprints/json_attributes_overrides.rb
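A sketch of what such an override might look like is given here. This page does not show the real signature of find_model, so the argument (an EPrints type string), the return value (a model name) and the 'Thesis' model are all assumptions for illustration only. Check the method in the core code before copying this shape.

```ruby
# Hypothetical sketch only; the real find_model signature may differ.
module JsonAttributesOverridesSketch
  # Assumed mapping from EPrints 'type' values to model names.
  MODEL_FOR_TYPE = {
    'article' => 'JournalArticle',
    'conference_item' => 'ConferenceItem',
    'thesis' => 'Thesis' # assumed local model; it would need its own Factory
  }.freeze

  # @param type [String] the EPrints type of the record (assumed argument)
  # @return [String] the name of the model to build (assumed return value)
  def find_model(type)
    # Fall back to PublishedWork for any type without an explicit mapping.
    MODEL_FOR_TYPE.fetch(type, 'PublishedWork')
  end
end

# Stand-in for the importer class that would include the overrides.
class ImporterSketch
  include JsonAttributesOverridesSketch
end

importer = ImporterSketch.new
puts importer.find_model('thesis')
puts importer.find_model('book')
```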

Supported Fields

If a field isn't supported, an error message is printed to the screen while the import runs.

To add support for a new field, add a method to lib/importer/eprints/json_attributes.rb (core code) or, in the local application, to lib/importer/eprints/json_attributes_overrides.rb

For example, to process an EPrints field called 'lovely_field' and add it to a Hyku property called 'lovely_field':

    # Add lovely_field to attributes
    #
    # @param val [String] the value
    # @param attributes [Hash] hash of attributes to update
    # @return [Hash] attributes
    def lovely_field(val, attributes)
      attributes[:lovely_field] = [val]
      attributes
    end

Note:

  • The property MUST be available to the specified model in Hyku
  • This only works for fields at the top level of the EPrints JSON record
  • The method can be more complex, e.g. see 'creator'
  • Any field that should not be processed at all should be added to the 'ignored' method in lib/importer/eprints/json_attributes.rb (or the local _overrides.rb file)
  • Any field that will be processed via a different method should be added to the 'special' method

Files

During the initial processing of the JSON, two attributes relating to attached files are created:

  • attributes[:remote_files] is processed during object creation: the files are downloaded from their remote URLs and added to the object. This is for 'primary files', i.e. those that are not the subject of a relation.
  • attributes[:files_hash] is processed after object creation has completed and is used to update existing files with related files, for example 'extracted_text'

Currently only the isIndexCodesVersionOf relation is supported. Thumbnails are ignored because Hyku will create new thumbnails; no other relations have been added to the code.

Note:

  • This assumes that the files can be downloaded over the web without restriction
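The split between the two attributes might be sketched as follows. This is an illustration under stated assumptions, not the importer's code: the method name split_documents, the shape of the 'documents' entries ('files', 'relation', 'type', 'uri') and the structure stored under each key are all hypothetical.

```ruby
# Hypothetical sketch: split EPrints documents into primary and related files.
INDEX_CODES = 'isIndexCodesVersionOf'

def split_documents(documents)
  attributes = { remote_files: [], files_hash: {} }

  documents.each do |doc|
    file = Array(doc['files']).first
    next unless file

    relations = Array(doc['relation'])
    entry = { url: file['uri'], file_name: file['filename'] }

    if relations.empty?
      # Primary file: not the subject of any relation.
      attributes[:remote_files] << entry
    else
      # Only the index codes relation is kept; thumbnails and any other
      # related documents fall through and are skipped.
      index_rel = relations.find { |r| r['type'].to_s.end_with?(INDEX_CODES) }
      next unless index_rel

      # Key the related file by the document it is a version of (assumed).
      (attributes[:files_hash][index_rel['uri']] ||= []) <<
        entry.merge(type: 'extracted_text')
    end
  end

  attributes
end

# Illustrative sample data; URLs and relation values are made up.
documents = [
  { 'files' => [{ 'filename' => 'paper.pdf', 'uri' => 'http://localhost/1/1/paper.pdf' }] },
  { 'files' => [{ 'filename' => 'indexcodes.txt', 'uri' => 'http://localhost/1/2/indexcodes.txt' }],
    'relation' => [{ 'type' => 'http://eprints.org/relation/isIndexCodesVersionOf',
                     'uri' => '/id/document/1' }] },
  { 'files' => [{ 'filename' => 'small.jpg', 'uri' => 'http://localhost/1/3/small.jpg' }],
    'relation' => [{ 'type' => 'http://eprints.org/relation/issmallThumbnailVersionOf',
                     'uri' => '/id/document/1' }] }
]

p split_documents(documents)
```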