Source code of the website.


OpenUP RBINS Contribution

What is it?

This repository contains all the source code written to help the RBINS publish its high-resolution beetle pictures to the OpenUP / Europeana project.

It consists of:
  • Data import/transformation scripts that are used to turn the original data (Excel files and high-resolution JPEG files) into usable data for OpenUP (to be served through BioCASE)
  • The source code (Django project) of the website built to showcase these pictures

(Webapp) Requirements

The required Python packages are listed in requirements.txt. Install them with:

$ pip install -r requirements.txt

The application requires the following environment variables to run:

  • DATABASE_URL # postgres://username:password@host:port/db_name
  • GOOGLE_ANALYTICS_TABLE_ID #something like ga:123456
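
As an illustration of the DATABASE_URL format above, here is a minimal sketch of how such a URL can be split into its parts. The function name parse_database_url and the dictionary keys (chosen to resemble a Django DATABASES entry) are assumptions for this example, not code taken from this repository; the project may well rely on a dedicated package for this.

```python
from urllib.parse import urlsplit


def parse_database_url(url):
    """Split a postgres://username:password@host:port/db_name URL.

    Hypothetical helper: the returned keys mimic a Django DATABASES
    entry, but this is not the project's actual settings code.
    """
    parts = urlsplit(url)
    return {
        "NAME": parts.path.lstrip("/"),   # database name follows the slash
        "USER": parts.username or "",
        "PASSWORD": parts.password or "",
        "HOST": parts.hostname or "",
        "PORT": str(parts.port) if parts.port else "",
    }
```

In a settings module this would typically be fed with os.environ["DATABASE_URL"], so that a missing variable fails loudly at startup.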

(Webapp) Running locally

To keep auto-reload working:

$ foreman run python manage.py runserver

Or, to run with the same setup as on Heroku:

$ foreman start

(Webapp) Deploy to staging

$ git push heroku master

(Webapp) Install:

$ python manage.py syncdb

Data import:

The data import process takes as input:

  • Original, high resolution images (JPEG, with fancy filenames and sorted in different directories, by family).
  • The Excel file managed by S. Kerkhof.

And will output:

  • Transformed images that will be served on the web (two versions for each: one resized with just the animal, the other resized with additional overlays: scientific name, license info, ...).
  • A database (referencing these images) to be used as a source for OpenUP publication (through the BioCASE provider). The only really necessary output consists of the "rbinsphotos" (consumed by BioCASE + website import) and "rbinsmetadata" views.
  • Another database for the website (Django project).

Process overview:

!!! Steps 1-3 are overly fragile and complex, but currently necessary because the CORRECT taxonomic data is not present in the Excel file and has to be extracted from file paths/names !!!
!!! Notes for rewriting this process (and the related data requirements) can be found in data_import_tools/import_review.rst !!!

  1. data_import_tools/images_transformation/transform.rb resizes the images, adds overlays, some padding, ...
  2. The Excel file and the result of an image directory "walk" are reconciled and imported into the "OpenUP" PostgreSQL database (which will be consumed by the BioCASE provider).
  3. data_import_tools/images_transformation/step2/move_files.rb loops over the newly created rbinsphotos table and, for each row, renames the associated image to <ROW_ID>.jpg and moves it to a flat directory structure.
  4. We publish these static files on the Internet.
  5. We use a CSV dump of the "OpenUP" database to populate the "website" database.
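
To make step 3 more concrete, the rename-and-flatten logic can be sketched as below. This is not a port of move_files.rb: the function name flatten_images and the (row_id, source_path) input shape are assumptions that stand in for the rows read from rbinsphotos.

```python
import shutil
from pathlib import Path


def flatten_images(rows, dest_dir):
    """Move each image to dest_dir/<ROW_ID>.jpg and return the new paths.

    rows: iterable of (row_id, source_path) pairs. This shape is a
    hypothetical stand-in for the rows of the rbinsphotos table.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for row_id, source_path in rows:
        target = dest / f"{row_id}.jpg"
        # Refuse to overwrite: two rows must never map to the same file.
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        shutil.move(str(source_path), str(target))
        moved.append(target)
    return moved
```

Naming files after the row ID keeps the published URLs independent of the original family directories and "fancy" filenames.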

Step 1: details

  • Requires rmagick
  • Configure constants in transform.rb

Step 2: details

  • The OpenUP database is built using data_import_tools/create.sql

  • This SQL script relies on 2 CSV data sources:
    • the main CSV file is just an export of the source Excel file (field separator: ; / encoding: UTF-8). !! You will have to add a "line" column after "remarks", containing the line number !!
    • walk.csv is generated using walk.rb (that walks over the images directory)
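
Adding the line-number column to the exported CSV by hand is error-prone, so it could be scripted along these lines. This is a sketch under assumptions: the header is taken to be spelled exactly "remarks", the new column is called "line", and numbering starts at 2 for the first data row (the header being line 1, as in a spreadsheet). Check what create.sql actually expects before relying on it.

```python
import csv


def add_line_column(src_path, dest_path, after="remarks", name="line"):
    """Copy a ;-separated UTF-8 CSV, inserting a line-number column.

    Assumptions (not confirmed by the repository): the header row is
    line 1, so data rows are numbered from 2, and one CSV record
    corresponds to one row of the source Excel file.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dest_path, "w", newline="", encoding="utf-8") as dest:
        reader = csv.reader(src, delimiter=";")
        writer = csv.writer(dest, delimiter=";")
        header = next(reader)
        pos = header.index(after) + 1  # insert right after `after`
        writer.writerow(header[:pos] + [name] + header[pos:])
        for line_number, row in enumerate(reader, start=2):
            writer.writerow(row[:pos] + [str(line_number)] + row[pos:])
```

Note that records are counted rather than physical lines, so a quoted field containing a line break does not shift the numbering.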

Step 5: details

  • We first extract data into CSV from Andre's view:
$ psql -h dev -U postgres -d openup_rbins
openup_rbins=# COPY (SELECT * FROM rbinsphotos) TO 'openup_export.csv' WITH CSV HEADER;
$ ./manage.py load_rbins_data data/openup_export.csv --truncate