
Customizing Bulkrax

Rob Kaufman edited this page Apr 7, 2023 · 5 revisions

Configuration

Please see the Configuring Bulkrax guide for how to customize Bulkrax through configuration.

Local Metadata Processing

When you run the bulkrax install (rails g bulkrax:install), a file called has_local_processing.rb is written to app/models/concerns/bulkrax.

This file contains an empty method stub called .add_local. This method is run at the end of the .build_metadata method.

The purpose of .build_metadata is to process the raw metadata from the import and turn it into parsed_metadata - a data hash that can be consumed by the factory and used to create the work.

The .add_local method is provided as a convenient place to overwrite the default behaviour of the underlying Bulkrax code.

For example:

I have a specific use case to prepend a string to the title.

The default .build_metadata creates the following parsed_metadata:

{ title: 'My title', other fields }

I can override the title using .add_local like so:

	def add_local
		# Read the title produced by build_metadata, then prefix it.
		current_title = self.parsed_metadata['title']
		self.parsed_metadata['title'] = "Some prepended text : #{current_title}"
	end

Local Overrides

Where overrides through has_local_processing.rb do not suffice (for example, to override parser behavior rather than entry behavior), other mechanisms can be used to alter specific methods. For example, class_eval or prepend can be used at load time to alter specific code.
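As a rough illustration of the two mechanisms named above, here is a minimal sketch. Bulkrax::BarParser and its total method are stand-ins defined only so the snippet runs on its own, and LocalParserOverrides and local_label are hypothetical names. In a real application the target class comes from the Bulkrax gem, and the override would normally be applied at load time (for example from an initializer).

```ruby
# Stand-in for a class that would normally come from the Bulkrax gem.
module Bulkrax
  class BarParser
    def total
      0
    end
  end
end

# Hypothetical local override module.
module LocalParserOverrides
  def total
    # super calls the original implementation; adjust its result here.
    super + 1
  end
end

# prepend places the module ahead of the class in the method lookup chain,
# so LocalParserOverrides#total runs first and can still call super.
Bulkrax::BarParser.prepend(LocalParserOverrides)

# class_eval reopens the class to add (or replace) a method directly.
Bulkrax::BarParser.class_eval do
  def local_label
    'local bar parser'
  end
end
```

prepend is usually the gentler choice because the original method stays reachable through super, whereas redefining a method inside class_eval replaces it outright.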

Custom Parsers and Entries

Bulkrax comes with a small (and growing) set of Parsers and Entries, but there will inevitably be times when support for a different format is needed.

For common formats, we encourage contributions to Bulkrax. The following guide applies both to work within Bulkrax itself and to local application implementations.

Before continuing, please read about the Anatomy of Bulkrax.

Creating an Entry

eg. app/models/bulkrax/foo_entry.rb

The Entry class:

  • MUST inherit from Bulkrax::Entry
  • OPTIONALLY inherit from a subclass of Bulkrax::Entry

For import:

  • MUST implement self.fields_from_data(data)
  • MUST implement self.read_data(path)
  • MUST implement self.data_for_entry(data, path = nil)
  • OPTIONALLY implement self.collection_field
  • OPTIONALLY implement self.children_field
  • OPTIONALLY implement self.matcher_class
  • MUST implement record
  • MUST implement build_metadata
  • OPTIONALLY implement collections_created?
  • OPTIONALLY implement find_or_create_collection_ids

For export:

  • MUST implement build_export_metadata

Notes:

  • Refer to the Bulkrax codebase for documentation on the purpose of each method.
  • If subclassing an existing Entry, some of the listed methods may not need to be overridden.
  • If supporting collections, an accompanying Collection Entry is required (eg. FooCollectionEntry), and collections_created? and find_or_create_collection_ids will be needed on FooEntry.
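The import and export requirements listed above could be sketched roughly as follows. This is only an outline under stated assumptions: the "foo" format is imagined as a JSON file holding an array of flat records, the Bulkrax::Entry class here is a stand-in so the snippet runs outside a Rails application, and the hash shapes returned by data_for_entry and build_metadata are illustrative rather than what Bulkrax itself produces.

```ruby
require 'json'

module Bulkrax
  # Stand-in base class; in a real application Bulkrax::Entry comes from the gem.
  class Entry
    attr_accessor :raw_metadata, :parsed_metadata
  end

  # Hypothetical Entry for a made-up "foo" format (JSON array of flat records).
  class FooEntry < Entry
    # Field names present in a single record.
    def self.fields_from_data(data)
      data.keys.map(&:to_s)
    end

    # Read the source file and return its records.
    def self.read_data(path)
      JSON.parse(File.read(path))
    end

    # Build the raw metadata to store on the entry for one record.
    # The keys used here are assumptions for illustration.
    def self.data_for_entry(data, path = nil)
      { 'source_identifier' => data['id'], 'data' => data, 'path' => path }
    end

    # The single source record this entry describes.
    def record
      @record ||= raw_metadata['data']
    end

    # Turn the raw record into parsed_metadata.
    # (Real Bulkrax entries run add_local at the end of this method.)
    def build_metadata
      self.parsed_metadata = { 'title' => Array(record['title']) }
    end

    # Export direction: flatten values back into a single string per field.
    def build_export_metadata
      self.parsed_metadata = { 'title' => Array(record['title']).join(' | ') }
    end
  end
end
```

The class methods work on the source file as a whole, while record and build_metadata work on one stored entry, which is why the two groups are kept apart.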

Creating a Parser

eg. app/parsers/bulkrax/foo_parser.rb

The Parser class:

  • MUST inherit from Bulkrax::ApplicationParser
  • OPTIONALLY inherit from a subclass of Bulkrax::ApplicationParser
  • MUST implement entry_class
  • MUST implement collection_entry_class
  • MUST implement records(opts = {})
  • MUST implement create_collections
  • MUST implement create_works
  • MUST implement setup_export_file (if export_supported?)
  • MUST implement write_files (if export_supported?)
  • OPTIONALLY implement valid_import? (default: true)
  • OPTIONALLY implement total (default: 0)
  • OPTIONALLY implement self.export_supported? (default: false)
  • OPTIONALLY implement self.import_supported? (default: true)

Notes:

  • Refer to the Bulkrax codebase for documentation on the purpose of each method.
  • If subclassing an existing Parser, some of the listed methods may not need to be overridden.
  • collection_entry_class is only required if supporting collections.
  • The Parser and Entry do not have to have the same base name (eg. BagitParser can import CsvEntry or RdfEntry)
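A skeleton covering the methods listed above might look like the following. It is a sketch, not the Bulkrax implementation: the ApplicationParser, FooEntry and FooCollectionEntry classes are stand-ins so the snippet runs on its own, load_source is a hypothetical helper returning hard-coded records, and create_collections / create_works only return the ids they would process, where a real parser would create entries and queue import jobs.

```ruby
module Bulkrax
  # Stand-ins; in a real application these come from Bulkrax or the local app.
  class ApplicationParser
    def self.export_supported?
      false
    end

    def self.import_supported?
      true
    end
  end

  class FooEntry; end
  class FooCollectionEntry; end

  # Hypothetical parser for the made-up "foo" format.
  class FooParser < ApplicationParser
    def entry_class
      Bulkrax::FooEntry
    end

    def collection_entry_class
      Bulkrax::FooCollectionEntry
    end

    # All records found in the import source.
    def records(_opts = {})
      @records ||= load_source
    end

    # Placeholder: returns the ids of the collection records.
    def create_collections
      records.select { |r| r['type'] == 'collection' }.map { |r| r['id'] }
    end

    # Placeholder: returns the ids of the work records.
    def create_works
      records.select { |r| r['type'] == 'work' }.map { |r| r['id'] }
    end

    def valid_import?
      records.any?
    end

    def total
      records.size
    end

    private

    # Hypothetical helper standing in for reading the real source.
    def load_source
      [
        { 'id' => 'c1', 'type' => 'collection', 'title' => 'A collection' },
        { 'id' => 'w1', 'type' => 'work', 'title' => 'A work' }
      ]
    end
  end
end
```

Because export_supported? keeps its default of false here, setup_export_file and write_files are left out.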

Creating a Matcher

eg. app/matchers/bulkrax/foo_matcher.rb

A Matcher is not required. Create a Matcher if specific parsing behaviour is required for the new Entry.

  • MUST inherit from Bulkrax::ApplicationMatcher
  • OPTIONALLY inherit from a subclass of Bulkrax::ApplicationMatcher

ApplicationMatcher is a complete class, so no methods are required on a subclass. Override or add any methods needed for parsing the entry data.
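A subclass can therefore be very small. The sketch below uses a stand-in ApplicationMatcher so it runs on its own, and squish_value is a hypothetical helper, not part of Bulkrax. Check ApplicationMatcher in the Bulkrax codebase for the methods that are actually available to override.

```ruby
module Bulkrax
  # Stand-in; the real class ships with Bulkrax.
  class ApplicationMatcher; end

  # Hypothetical matcher for the made-up "foo" format.
  class FooMatcher < ApplicationMatcher
    # Hypothetical helper: trim a raw value and collapse runs of whitespace.
    def squish_value(content)
      content.to_s.strip.gsub(/\s+/, ' ')
    end
  end
end
```

The Entry would then return this class from self.matcher_class.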

Creating a field partial

eg. app/views/bulkrax/importers/_foo_fields.html.erb or app/views/bulkrax/exporters/_foo_fields.html.erb

The field partial defines fields specific to the Parser which will be displayed on the Importer#new and Importer#edit pages. The name of the field partial for each parser is defined in the configuration. It may be that the new parser can use one of the existing field partials. The outer div must have a class that matches the field partial's name. For example _xml_fields.html.erb has <div class='xml_fields'> as its outer div.

Note: If the field partial isn't showing up on the page when selecting the newly created parser in the dropdown, try clearing the cache and refreshing the page. In a Docker setup, run the following command in the terminal, replacing <service> with the name of your application's service: docker-compose exec <service> rm -r tmp/cache/*

Configure your Parser and Entry

In Bulkrax

lib/bulkrax.rb and lib/generators/bulkrax/templates/config/initializers/bulkrax.rb

In a local application

config/initializers/bulkrax.rb
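In a local application, registering the new parser could look roughly like this. The sketch assumes the Bulkrax.setup block and config.parsers list found in the generated initializer; the parser name, class name and partial are the hypothetical Foo examples used in this guide.

```ruby
# config/initializers/bulkrax.rb
Bulkrax.setup do |config|
  # Register the hypothetical FooParser so it can be selected in the parser dropdown.
  # partial names the field partial to render (here _foo_fields.html.erb).
  config.parsers += [
    { name: 'Foo - hypothetical Foo parser', class_name: 'Bulkrax::FooParser', partial: 'foo_fields' }
  ]
end
```

The Entry does not need its own registration here, since the Parser points to it through entry_class.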