Home

Ere Maijala edited this page Feb 27, 2017 · 29 revisions

Introduction

RecordManager provides metadata management functionality that is useful, for example, for getting data into the search index of a discovery service. It uses MongoDB or TokuMX as the database for storing and processing the metadata. Multiple record formats are supported through pluggable record drivers. RecordManager provides the following main functions:

  • Harvest records using the OAI-PMH protocol
  • Import records from files
  • Split records using PHP classes or XSLT
  • Normalize records using PHP classes or XSLT
  • Deduplicate records using the fancy built-in algorithm
  • Export records to files
  • Provide records to other harvesters with the built-in OAI-PMH Provider
  • Harvest database descriptions (IRDs) [from MetaLib](wiki/Harvesting MetaLib IRDs)
  • Harvest [SFX knowledgebase records](wiki/Harvesting SFX Objects)
  • Directly update a Solr index (VuFind)
  • Preview records, e.g. for Voyager’s “Send Record to WebVoyage VuFind” function (requires a bit of custom code in VuFind)
  • Enrich records with pluggable enrichment modules (includes a sample enrichment that fetches additional data for topics from an ontology)

The basic idea is that metadata can be harvested from multiple data sources and indexed into a single Solr index. To make sure records from different sources can be identified and kept separate, RecordManager prefixes every ID field with the data source ID (e.g. local record ID 123 becomes source.123). This means that any software using the Solr index can identify the source of a record, but it must also strip the prefix before, for instance, linking to a UI in the source system or fetching holdings information via an API.
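The prefixing convention can be illustrated with a minimal Python sketch of what a client of the Solr index might do. This is not RecordManager code: the function names are hypothetical, and the sketch assumes that the data source ID itself contains no dot, so that everything after the first dot is the local record ID.

```python
from typing import Tuple


def add_source_prefix(source_id: str, local_id: str) -> str:
    """Build a prefixed ID such as 'source.123' (illustrative helper)."""
    return f"{source_id}.{local_id}"


def split_prefixed_id(prefixed_id: str) -> Tuple[str, str]:
    """Split 'source.123' into ('source', '123').

    Assumption: the source ID contains no dot, so the split happens at the
    first dot only and any later dots stay part of the local ID.
    """
    source_id, separator, local_id = prefixed_id.partition(".")
    if not separator or not source_id or not local_id:
        raise ValueError(f"Not a prefixed record ID: {prefixed_id!r}")
    return source_id, local_id


# Example: recover the local ID before linking to the source system.
source, local = split_prefixed_id(add_source_prefix("source", "123"))
```

Splitting at the first dot only keeps local IDs that contain dots intact, which matters when the result is passed on to the source system.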

Installation

See the accompanying README.md file for short installation instructions.

Getting Records into RecordManager

There are currently four different ways to get records into RecordManager's MongoDB database:

  1. OAI-PMH harvesting
  2. Direct file load
  3. MetaLib IRD harvesting
  4. SFX export harvesting

The usual methods are OAI-PMH harvesting and file loading. The Configuration page explains all the related settings, and conf/datasources.ini.sample contains several sample configurations.

OAI-PMH harvesting is driven by the harvest.php script, and file loads are done with the import.php script. See the Usage page for examples and more information on the tools.
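For orientation, an OAI-PMH harvest at the protocol level is a series of ListRecords requests, each continued with the resumptionToken returned by the previous response. The Python sketch below only builds such request URLs. It does not show how harvest.php works internally, and the function name and the example.org endpoint are placeholders.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL (illustrative helper)."""
    if resumption_token:
        # Per the OAI-PMH specification, resumptionToken is an exclusive
        # argument: it is sent with the verb and nothing else.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


# First request of a harvest, then a follow-up using a returned token.
first = list_records_url("https://example.org/oai")
following = list_records_url("https://example.org/oai", resumption_token="abc")
```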

Configuration

See the Configuration wiki page for information on the settings.

Usage

See the Usage wiki page for basic instructions.

Command Line Functions

See the [Command Line Reference](wiki/Command Line) for more information.

Additional Information

Installing MongoDB on MAMP in Mac OS X: http://www.davidgolding.net/mongodb/installing-mongodb-on-mamp-1-9-5.html