Skip to content

A set of tools for interacting with data from NERC's Research Outputs Database

Notifications You must be signed in to change notification settings

okfn/nerc-rod-tools

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

How to use these tools

Create and populate the database:

$ cat rod-schema.sql | sqlite3 rod.db
$ bzcat rod-complete.sql.bz2 | sqlite3 rod.db

Install Flask and run the viewer:

$ virtualenv --distribute pyenv
$ source pyenv/bin/activate
$ pip install flask
$ python viewer.py

Initial extraction

The data has been extracted from an MS Access (JET) database, with the help of mdbtools.

First extract a basic schema. This is a rudimentary conversion, and the end result required extensive editing to make the output SQLite-compatible. PRIMARY KEY and FOREIGN KEY constraints were also added where obvious.

$ mdb-schema rod.mdb > rod-schema.sql

Now get a list of tables we're interested in:

$ ROD_TABLES="$(mdb-tables rod-snap-20100426.mdb | tr ' ' '\n' | grep ROD)"

Extract each of these tables to CSV:

$ echo $ROD_TABLES | while read table; do mdb-export rod-snap-20100426.mdb $table > $table.csv; done

You can optionally convert these CSVs to SQL "INSERT" statement files using csv2sql.py:

$ echo $ROD_TABLES | while read table; do python csv2sql.py $table < $table.csv | bzip2 > $table.sql.bz2; done

Note that you will need to ensure you load the tables into the SQL database in the correct order to ensure FOREIGN KEY constraints are met. In particular, load ROD_GRANTS first, and ROD_OUTPUTS second.

NB: The foreign key constraints won't be met by the data at the moment. The apparent absence of the concept of such constraints from Access means this is a rather messy dataset.

About

A set of tools for interacting with data from NERC's Research Outputs Database

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages