Web application for harvesting, merging and publishing metadata feeds. Supports RIF-CS and Atom-RDC formats and the OAI-PMH, Atom-PMH and SRU protocols. From the ANDS-funded UQ Data Collections Registry project.

Miletus

Overview

Miletus is an application for aggregating and merging collection metadata records at an institutional level. If you have never heard of RIF-CS and ANDS, then likely it's not for you.

Its most common use case is to harvest records from systems outputting collection metadata, merge records that describe the same concept (e.g. a person), and look up further information from institutional systems storing HR and funding data.

Features

Miletus supports the following harvest methods:

  • OAI-PMH
    • RIF-CS
  • Direct document fetching
    • RDC Atom
    • RIF-CS
  • Atom feeds (via rel="alternate" links)
    • RDC Atom
    • RIF-CS

Output feeds are available in:

  • OAI-PMH
    • RIF-CS (rif)
    • Dublin Core (oai_dc)
  • Atom
    • RIF-CS (application/rifcs+xml)

The web interface is also heavily sprinkled with RDFa for good search engine optimisation.

Static content can be updated through the admin interface using Markdown.

Usage

The database connection is provided by the DATABASE_URL environment variable.

Take advantage of Foreman's .env file:

echo "DATABASE_URL=postgres:///my_db" > .env
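To see what such a URL encodes, here is a small sketch that splits a DATABASE_URL into its parts using Ruby's standard `uri` library. The helper name `describe_database_url` and the second example URL (user, host and password) are hypothetical and only for illustration; this is not how Miletus itself reads the variable.

```ruby
require "uri"

# Split a DATABASE_URL into the parts a database client would need.
# Illustrative helper only; it is not part of Miletus.
def describe_database_url(url)
  uri = URI.parse(url)
  host = uri.host.to_s
  {
    scheme:   uri.scheme,
    host:     host.empty? ? nil : host,     # no host: a local connection
    port:     uri.port,
    user:     uri.user,
    database: uri.path.sub(%r{\A/}, "")     # drop the leading slash
  }
end

# The URL from the example above: no host, database "my_db".
p describe_database_url("postgres:///my_db")

# A hypothetical remote database with credentials.
p describe_database_url("postgres://miletus:secret@db.example.org:5432/miletus")
```

The three slashes in `postgres:///my_db` mean the host part is empty, so only the database name is specified.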

To run Rake tasks ad hoc:

foreman run rake jobs:work

Or to run the whole lot:

foreman start

To export a system script that runs Miletus on port 8000, managed by bluepill:

sudo gem install bluepill
foreman export bluepill /tmp -a miletus -u <miletus_user> -p 8000 -t ./foreman
sudo cp /tmp/miletus.pill /etc/bluepill/miletus.pill
sudo cp ./foreman/miletus-bluepill.init /etc/init.d/miletus
sudo service miletus start

To configure feeds and lookup services once running, go to /admin/. The default username is admin@example.com and the default password is password. You are strongly advised to change both of these after you log in.

The OAI-PMH output feed can be found at /oai. The Atom feed can be found at /atom.
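As a rough sketch, the following builds OAI-PMH request URLs for the `/oai` endpoint using the standard protocol verbs and the `rif` and `oai_dc` metadata prefixes listed under Features. The base URL `http://localhost:8000` is an assumption (it matches the port used in the bluepill export above), and `oai_request_url` is a hypothetical helper name.

```ruby
require "uri"

# Build an OAI-PMH request URL from a base URL, a verb and extra arguments.
# Illustrative helper only; it is not part of Miletus.
def oai_request_url(base_url, verb, params = {})
  uri = URI.parse(base_url)
  uri.query = URI.encode_www_form({ "verb" => verb }.merge(params))
  uri.to_s
end

# Assumed location of a local Miletus instance.
base = "http://localhost:8000/oai"

puts oai_request_url(base, "Identify")
puts oai_request_url(base, "ListRecords", { "metadataPrefix" => "rif" })
puts oai_request_url(base, "ListRecords", { "metadataPrefix" => "oai_dc" })
```

The resulting URLs can be fetched with any HTTP client or pasted into a browser to inspect the feed.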

Production & SSL

Miletus enforces the use of HTTPS for admin logins in a production environment (i.e. RAILS_ENV=production). If you need to run in an environment where HTTPS is not available, you can set DISABLE_HTTPS=1 in the environment to disable this check.
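The rule described above can be sketched as a single predicate. This is not the project's actual code: the method name `enforce_https?` is hypothetical, and it assumes that only the literal value `1` disables the check.

```ruby
# Decide whether admin logins should be forced onto HTTPS.
# Sketch only: assumes DISABLE_HTTPS must be exactly "1" to switch the check off.
def enforce_https?(env = ENV)
  env["RAILS_ENV"] == "production" && env["DISABLE_HTTPS"] != "1"
end

puts enforce_https?({ "RAILS_ENV" => "production" })
puts enforce_https?({ "RAILS_ENV" => "production", "DISABLE_HTTPS" => "1" })
puts enforce_https?({ "RAILS_ENV" => "development" })
```

Passing a plain hash instead of `ENV` makes the rule easy to check without changing the real environment.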

Architecture

The harvest and output sections are loosely coupled to the merge process, and are persisted separately. As a result, changes take time to flow through the system, but output performance should be independent of the time it takes to merge facets.

An example of the sequence for OAI-PMH RIF-CS records is as follows:

┌─────────────────────────────────────────┐
│ Miletus::Harvest::OAIPMH::RIFCS::Record │
└─────────────────────────────────────────┘
  │
  │ :after_save
  ▼
┌─────────────────────────────────────────┐
│           RifcsRecordObserver           │
└─────────────────────────────────────────┘
  │
  │ :create
  ▼
┌─────────────────────────────────────────┐
│          Miletus::Merge::Facet          │
└─────────────────────────────────────────┘
  │
  │ :reindex
  ▼
┌─────────────────────────────────────────┐
│         Miletus::Merge::Concept         │
└─────────────────────────────────────────┘
  │
  │ :after_save
  ▼
┌─────────────────────────────────────────┐
│          OaipmhOutputObserver           │
└─────────────────────────────────────────┘
  │
  │ :create
  ▼
┌─────────────────────────────────────────┐
│     Miletus::Output::OAIPMH::Record     │
└─────────────────────────────────────────┘
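The sequence above can be imitated in plain Ruby to show how separately persisted stages hand records to each other through save hooks. Everything here is a simplified stand-in: the `Store` class, the `build_pipeline` helper, the record keys and the name-joining "merge" are hypothetical. The hooks also run synchronously in this sketch, so it does not show the delay described above.

```ruby
# A tiny store with after_save hooks, standing in for a persisted model.
class Store
  def initialize
    @records = {}
    @hooks = []
  end

  # Register a block to run after every save.
  def after_save(&hook)
    @hooks << hook
  end

  def save(key, value)
    @records[key] = value
    @hooks.each { |hook| hook.call(key, value) }
    value
  end

  def [](key)
    @records[key]
  end
end

def build_pipeline
  harvested = Store.new # stand-in for Miletus::Harvest::OAIPMH::RIFCS::Record
  facets    = Store.new # stand-in for Miletus::Merge::Facet
  concepts  = Store.new # stand-in for Miletus::Merge::Concept
  output    = Store.new # stand-in for Miletus::Output::OAIPMH::Record

  # RifcsRecordObserver stand-in: a saved harvest record creates a facet.
  harvested.after_save { |key, record| facets.save(key, record) }

  # A saved facet reindexes the concept it belongs to (toy merge: collect names).
  facets.after_save do |_key, record|
    merged = (concepts[record[:concept]] || []) + [record[:name]]
    concepts.save(record[:concept], merged.uniq)
  end

  # OaipmhOutputObserver stand-in: a saved concept creates an output record.
  concepts.after_save { |key, names| output.save(key, names.join("; ")) }

  { harvested: harvested, output: output }
end

pipeline = build_pipeline
pipeline[:harvested].save("source-a/1", { concept: "person:42", name: "J. Smith" })
pipeline[:harvested].save("source-b/9", { concept: "person:42", name: "Jane Smith" })
puts pipeline[:output]["person:42"]
```

Because the output store holds its own copy of each record, reading from it never has to wait for a merge.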

Acknowledgements

This project is supported by the Australian National Data Service (ANDS). ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative.
