Migration Guide: Islandora to Islandora using OAI-PMH

Brandon Weigel edited this page Feb 5, 2021 · 2 revisions

Overview

In 2020, the Vancouver Public Library decided to move their This Vancouver project from their own Islandora installation to Arca (https://vpl.arcabc.ca). They did not have full access to their Fedora back-end, and needed a simple approach to migration that would allow metadata editing to meet Arca's standards before ingest could take place.

The solution was to use OAI-PMH to list all collections, to use the MIK (Move to Islandora Kit) OAI-PMH toolchain to download both files and metadata, and then to standardize the collected MODS in a text editor (in this case, BBEdit).

This approach is favoured because:

  • Importing objects directly from Fedora into a shared repository would be complex in terms of logistics, metadata standards, and namespace changes
  • With a third-party module, the Islandora OAI module can be configured to expose each object's download URL, which makes acquiring the objects simple

Step 1: Configure the source repository

  • To simplify the process, install Islandora OAI With Download Links. This modifies the OAI request handler to include a download link in the returned records.

  • Configure Islandora OAI to use the new request handler. Then, at admin/islandora/tools/islandora-oai/handler, under "Download Links", set the element that will carry the download link. In this example, we use <location><url access="download">%url%</url></location>.

  • Ensure that the CModel to datastream mapping provides the datastreams you wish to acquire for the objects you'll be ingesting.
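Once the handler is configured, it can help to confirm that a returned MODS record actually carries the download link. The following is a minimal sketch, not part of the original workflow: the function name download_urls and the sample record (including its URL) are hypothetical, and it assumes the element configured above ends up in the MODS namespace, as the xpath_expression in the Step 3 INI file suggests. Unlike that expression, the sketch also filters on access="download".

```python
import xml.etree.ElementTree as ET

# MODS v3 namespace, bound to the "mods" prefix for the path below.
MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical record for illustration only; a real one comes from the OAI endpoint.
SAMPLE_MODS = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Example object</title></titleInfo>
  <location>
    <url access="download">http://mysite/islandora/object/example:1/datastream/OBJ/download</url>
  </location>
</mods>"""


def download_urls(mods_xml):
    """Return the text of every <location><url access="download"> in a MODS record."""
    root = ET.fromstring(mods_xml)
    urls = root.findall(".//mods:location/mods:url[@access='download']", MODS_NS)
    return [u.text.strip() for u in urls if u.text]


print(download_urls(SAMPLE_MODS))
```

An empty list for a record you expect to be downloadable would suggest that the request handler or the datastream mapping still needs attention.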

Step 2: Get a list of collections

  • In your source repository, go to /oai2?verb=ListSets. This will give you a list of sets that you can use to build your INI file so you can download by collection instead of getting the whole repository at once.
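If the repository has many collections, the ListSets response can be read by a script instead of by eye. This is a rough sketch using only the Python standard library; the function names are hypothetical, the endpoint is whatever your source site exposes, and it assumes the whole list fits in one response (it does not follow an OAI-PMH resumptionToken).

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_set_specs(list_sets_xml):
    """Return (setSpec, setName) pairs from a ListSets response (str or bytes)."""
    root = ET.fromstring(list_sets_xml)
    pairs = []
    for set_el in root.findall(".//oai:set", OAI_NS):
        spec = set_el.findtext("oai:setSpec", default="", namespaces=OAI_NS)
        name = set_el.findtext("oai:setName", default="", namespaces=OAI_NS)
        pairs.append((spec.strip(), name.strip()))
    return pairs


def fetch_set_specs(endpoint):
    """Request ListSets from an OAI-PMH endpoint and parse the response."""
    with urllib.request.urlopen(endpoint + "?verb=ListSets") as response:
        return parse_set_specs(response.read())


# Example call (needs network access to the source repository):
# for spec, name in fetch_set_specs("http://mysite/oai2"):
#     print(spec, name)
```

Each setSpec value returned this way is a candidate for a set_spec line in the INI file.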

Step 3: Set up your INI file

; MIK configuration file for an OAI-PMH toolchain.

[CONFIG]
config_id = My migration
last_updated_on = "2020-11-15"
last_update_by = "bw"

[SYSTEM]
date_default_timezone = 'America/Vancouver'
verify_ca = 0

[FETCHER]
class = Oaipmh
oai_endpoint = "http://mysite/oai2"
metadata_prefix = mods
; Uncomment one set_spec per run; the values come from /oai2?verb=ListSets.
;set_spec = islandora_collection1
;set_spec = islandora_collection2
;set_spec = islandora_collection3
;set_spec = islandora_collection4

temp_directory = "/Volumes/Arca/tmp/oaitest_temp"

[METADATA_PARSER]
class = mods\OaiToMods

[FILE_GETTER]
class = OaipmhModsXpath
xpath_expression = "//mods:location/mods:url"
temp_directory = "/Volumes/Mydrive/tmp/oaitest_temp"

[WRITER]
class = Oaipmh
output_directory = "/Volumes/Mydrive/output/**set_name**"


[LOGGING]
path_to_log = "/Volumes/Arca/tmp/oaitest_output/mik.log"
path_to_manipulator_log = "/Volumes/Arca/tmp/oaitest_output/manipulator.log"

To download a given collection, uncomment the set_spec line for the set you want, comment out the one used for the previous run, and change output_directory to a new subdirectory named for that set. Then run MIK with this configuration file.

Step 4: Fix up your metadata

In this case, among other issues, dates had been stored in the dateCreated element instead of the required dateIssued element, and dates were formatted DD-MM-YYYY instead of the standard YYYY-MM-DD. BBEdit's multi-file find and replace made these batch edits straightforward, and its Grep find/replace handled pattern-based changes such as transposing days and years.
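The same two fixes can also be expressed as regular expressions in a script, which may be useful where BBEdit is not available. This is a sketch of the idea, not the edits actually used in the VPL migration: fix_dates is a hypothetical name, it assumes every dateCreated element should become dateIssued, and it only rewrites values that are exactly DD-MM-YYYY.

```python
import re

# Opening or closing dateCreated tag, with an optional namespace prefix such as "mods:".
DATE_CREATED = re.compile(r"<(/?)((?:\w+:)?)dateCreated\b")

# A dateIssued element whose entire value is a DD-MM-YYYY date.
DMY_DATE = re.compile(
    r"(<(?:\w+:)?dateIssued\b[^>]*>)\s*(\d{2})-(\d{2})-(\d{4})\s*(</(?:\w+:)?dateIssued>)"
)


def fix_dates(mods_xml):
    """Rename dateCreated to dateIssued, then rewrite DD-MM-YYYY as YYYY-MM-DD."""
    renamed = DATE_CREATED.sub(r"<\1\2dateIssued", mods_xml)
    return DMY_DATE.sub(r"\1\4-\3-\2\5", renamed)


print(fix_dates("<originInfo><dateCreated>05-11-1998</dateCreated></originInfo>"))
```

Dates already in YYYY-MM-DD form do not match the second pattern and are left unchanged, so the function can be run more than once on the same file.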

Step 5: Fix file extensions

Audio CModel objects might be downloaded with file extensions that Islandora cannot recognize. In the VPL migration, .wav OBJs were downloaded as .x-wav, and .mp3 OBJs were downloaded as .mpeg. All that's needed is to change the file extensions.

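This can be done in bulk in the terminal or shell. For example, with the util-linux version of rename, the Linux command to rename .x-wav files to .wav is: rename .x-wav .wav *.x-wav. To turn .mpeg into .mp3: rename .mpeg .mp3 *.mpeg. The Perl-based rename shipped by some distributions (such as Debian and Ubuntu) takes an expression instead, for example: rename 's/\.x-wav$/.wav/' *.x-wav.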
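Where the rename command is unavailable or behaves differently, a short script can do the same job. The following is a minimal sketch with hypothetical names (EXTENSION_MAP, fix_extensions); it assumes the two mappings seen in the VPL migration, and unlike the shell commands it walks subdirectories as well.

```python
from pathlib import Path

# Extension fixes observed in this migration; extend the mapping as needed.
EXTENSION_MAP = {".x-wav": ".wav", ".mpeg": ".mp3"}


def fix_extensions(directory, mapping=EXTENSION_MAP):
    """Rename files under `directory` whose extension appears in `mapping`."""
    renamed = []
    # sorted() collects the paths first, so renaming does not disturb the walk.
    for path in sorted(Path(directory).rglob("*")):
        new_suffix = mapping.get(path.suffix.lower())
        if not (path.is_file() and new_suffix):
            continue
        target = path.with_suffix(new_suffix)
        if target.exists():
            continue  # never overwrite a file that already has the target name
        path.rename(target)
        renamed.append(target)
    return renamed
```

Calling fix_extensions on a set's output directory returns the list of new paths, which can be checked against the number of audio objects expected in that collection.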

Step 6: Upload and ingest

Create the relevant collections in the new repository, map your collection directories to the new collections, and run batch ingests.
