Helper for ReDBox-Mint. Reads ReDBox and Mint OAI-PMH RIF-CS portals via a network connection and generates a static web page for each RIF-CS record. This allows the metadata to be exposed to the internet via static web pages whilst the ReDBox-Mint web applications are not.

FlindersRedbox-rif2website

Purpose

"ReDBox is a metadata registry application for describing research data. The Mint is an name-authority and vocabulary service that complements ReDBox." See http://www.redboxresearchdata.com.au/. The purpose of this script is to read ReDBox and Mint OAI-PMH RIF-CS (XML) via a network connection and generate a static web page for each registryObject element (ie. each RIF-CS record). Each static web page is generated by extracting information from the XML according to rules specified in a file.
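
As an illustration of the idea described above, here is a minimal sketch of turning each registryObject into an HTML fragment from a list of rules. It is not the script's actual implementation: the rule format (label plus XPath), the names `SAMPLE_RULES`, `SAMPLE_XML`, `escape_html` and `pages_from_rifcs`, and the sample record are all hypothetical, and the real rule files may look quite different.

```ruby
require 'rexml/document'

# RIF-CS namespace, bound to the prefix "r" for the XPath expressions below.
RIFCS_NS = { 'r' => 'http://ands.org.au/standards/rif-cs/registryObjects' }

# Hypothetical rules: [row label, XPath relative to a registryObject].
SAMPLE_RULES = [
  ['Key', 'r:key'],
  ['Family name', 'r:party/r:name/r:namePart']
]

# Made-up record used only for illustration.
SAMPLE_XML = <<~XML
  <registryObjects xmlns="http://ands.org.au/standards/rif-cs/registryObjects">
    <registryObject group="Example">
      <key>example.edu.au/party/1</key>
      <party type="person">
        <name type="primary"><namePart type="family">Smith</namePart></name>
      </party>
    </registryObject>
  </registryObjects>
XML

# Escape the characters that are unsafe inside HTML text.
def escape_html(text)
  text.gsub(/[&<>"]/, '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;')
end

# Return a hash of RIF-CS key => HTML table, one entry per registryObject.
def pages_from_rifcs(xml, rules)
  doc = REXML::Document.new(xml)
  pages = {}
  REXML::XPath.each(doc, '//r:registryObject', RIFCS_NS) do |obj|
    rows = rules.map do |label, path|
      node = REXML::XPath.first(obj, path, RIFCS_NS)
      value = node ? node.text.to_s.strip : ''
      "<tr><th>#{escape_html(label)}</th><td>#{escape_html(value)}</td></tr>"
    end
    key = REXML::XPath.first(obj, 'r:key', RIFCS_NS).text.strip
    pages[key] = "<table>\n#{rows.join("\n")}\n</table>"
  end
  pages
end
```

Calling `pages_from_rifcs(SAMPLE_XML, SAMPLE_RULES)` yields one table per record, which could then be placed into an html-template and written to disk.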

Notes

  • Has been tested and designed for use on ReDBox and Mint dev-handle build.

  • In order for handles to point to these pages, each data source template (eg. Mint home/harvest/Parties_People.json) should have its urlTemplate set like:

    "urlTemplate": "http://MY_STATIC_PAGES_VHOST/MY_PATH/[[OID]].html",

    At the time of writing, all Mint urlTemplates are:

    "urlTemplate": "http://MY_STATIC_PAGES_VHOST/md/m/[[OID]].html",

    and ReDBox urlTemplate (in home/harvest/workflows/dataset.json) is:

    "urlTemplate": "http://MY_STATIC_PAGES_VHOST/md/r/[[OID]].html",

  • Because the source information comes from the ReDBox-Mint OAI-PMH portals, which are available over a network, this script (and so the destination website) can run on a host other than the ReDBox or Mint servers.
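
To make the urlTemplate convention in the notes above concrete, the following sketch shows how an OID maps to a page URL and to a matching file path. The helper names `expand_url_template` and `static_page_path`, the example OID and the target directory are assumptions for illustration only; they are not taken from the script.

```ruby
# Replace the [[OID]] placeholder in a urlTemplate with a record's OID.
def expand_url_template(template, oid)
  template.gsub('[[OID]]', oid)
end

# Hypothetical layout: one sub-directory per repository ("m" for Mint,
# "r" for ReDBox) under a target root, mirroring the /md/m/ and /md/r/
# paths used in the templates above.
def static_page_path(root_dir, repo_dir, oid)
  File.join(root_dir, repo_dir, "#{oid}.html")
end

mint_template = 'http://MY_STATIC_PAGES_VHOST/md/m/[[OID]].html'
puts expand_url_template(mint_template, 'abc123')
puts static_page_path('/var/www/md', 'm', 'abc123')
```

Keeping the URL and the file path derived from the same OID is what lets a handle resolve to the generated static page.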

Application environment

Read the INSTALL file.

Installation

Read the INSTALL file.

Features

  • The following OAI-PMH harvest methods are permitted:

    • The first harvest of ReDBox (or Mint) must be a full harvest (--full-harvest)
    • Subsequent harvests of ReDBox (or Mint) may optionally be incremental harvests (--incr-harvest). Incremental harvests use the OAI-PMH "from" argument to obtain all records added or updated since the specified from-datestamp. The summary page (discussed below) covers all records even when an incremental harvest is used, provided a full harvest has been performed in the past and there are no 'gaps' in the incremental harvest datestamps.
  • Use the RIF-CS key to look up the Fascinator OID:

    • with redirect (from Handle.net) 1 level deep
  • Store Fascinator OIDs in a local cache in order to bypass the (Handle.net) lookups above. This gives a large performance improvement (80 times faster on our test system).

  • If there is more than one OAI-PMH page of records, iterate through all pages by using the resumption token

  • Use the following config files (with hash elements which can be overwritten):

    • main (containing RIFCS URL, target root dir, target html-template, user-agent)
    • multiple rule-files according to the user's preference, eg. perhaps 1 per record type (eg. collection, party) and subtype (eg. dataset, person)
  • log file (eg. errors, warnings)

  • HtmlHelper class

  • Allow one invocation for Mint and another for ReDBox.

  • Use a replacement token for the xpath PRIMARY RECORD TYPE so the user can reference other rulesets, eg. ActivityProjectRules = PartyPersonRules

  • Allow the program to determine which RIF-CS records will be processed based on which *Rules arrays exist.

  • Security checks before running eval().

  • Convert URLs into hyperlinks.

  • Ensure shell script will run as a cronjob.

  • Cope with ReDBox-Mint being offline.

  • Make a handle-landing page for each retired record.

  • Make a rule to show the OID.

  • Make a rule to show the ANDS "Registry View" URL for record. ANDS Services say this is no longer possible since RDA Release 10.

  • Make a rule to show the RDA URL for record.

  • Split dest dir by redbox/mint repo; may be necessary to avoid an OID namespace clash!

  • Creates a summary page (eg. index.html) which points to all static pages created by this script.

  • Use common html-template for both summary page and individual pages.

  • Allow selected output table rows to be highlighted (eg. with bold or italic text).

  • Ruby source code produces rdoc documentation.

  • Rules have been written for record types:

    • party-person
    • activity-project
    • collection-* (but not complete)

    No rules have been written for service-*
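
The harvest features listed above (incremental harvests via the "from" argument, and paging via the resumption token) can be sketched as follows. This is a rough illustration rather than the script's own code: the function names, the `fetch` callable and the default metadataPrefix value of "rif" are assumptions. The one protocol detail relied on is that, in OAI-PMH, a request carrying a resumptionToken must not repeat the other arguments.

```ruby
require 'uri'
require 'rexml/document'

# OAI-PMH 2.0 namespace, bound to the prefix "o" for XPath.
OAI_NS = { 'o' => 'http://www.openarchives.org/OAI/2.0/' }

# Build a ListRecords URL. With a resumption token, only verb and
# resumptionToken are sent; otherwise metadataPrefix and an optional from.
def list_records_url(base_url, from: nil, resumption_token: nil, metadata_prefix: 'rif')
  query =
    if resumption_token
      { 'verb' => 'ListRecords', 'resumptionToken' => resumption_token }
    else
      first = { 'verb' => 'ListRecords', 'metadataPrefix' => metadata_prefix }
      first['from'] = from if from
      first
    end
  "#{base_url}?#{URI.encode_www_form(query)}"
end

# Return the resumption token in a response, or nil if absent or empty.
def next_resumption_token(xml)
  node = REXML::XPath.first(REXML::Document.new(xml), '//o:resumptionToken', OAI_NS)
  token = node && node.text.to_s.strip
  token.nil? || token.empty? ? nil : token
end

# Collect every page of a harvest. `fetch` is any callable that takes a
# URL and returns the response body (hypothetical; eg. a Net::HTTP wrapper).
def harvest_pages(base_url, fetch:, from: nil)
  pages = []
  url = list_records_url(base_url, from: from)
  loop do
    xml = fetch.call(url)
    pages << xml
    token = next_resumption_token(xml)
    break unless token

    url = list_records_url(base_url, resumption_token: token)
  end
  pages
end
```

A full harvest would omit `from`; an incremental harvest would pass the datestamp of the previous run. Passing `fetch` in as a callable keeps the paging logic testable without a live ReDBox or Mint server.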

Todo

  • Consider untarring images/css in ruby via config file (perhaps using minitar gem).
  • Consider instructing the web browser not to cache the page.

Acknowledgement

The development of this software was a component of a larger [Flinders University](http://www.flinders.edu.au/) project funded by the Australian National Data Service (ANDS).
