OCR-D website

All the OCR-D documentation and information in one place



Ubuntu 18.04, >= 8 GB RAM


First, install some development packages:

sudo apt install make git ruby-dev ruby-bundler openjdk-8-jre python3-pip

NOTE: The openjdk-8-jre dependency is only required for building the GT guidelines.

Then install jekyll from within the repository:

make jekyll

This will install jekyll into ./vendor/bundle.


The OCR-D site requires quite a few subrepositories, which are laid out in the ./repo directory:

make help

Run make help to see a list of commands.


deps-ubuntu       ubuntu deps
jekyll            Install jekyll dependencies
shinclude         Install shinclude
bootstrap         Set up the repos, site and tools
gt                Build gt-guidelines. This takes a few minutes. Be patient.
build-modules     TODO Build module information
build-processors  TODO Build processor information
serve             serve the site dynamically
build-site        build the site
core-docs         Build sphinx documentation for core
spec              Build the spec documents TODO translate
workflows         Rebuild the workflow document from wiki fragments


REPODIR          Directory containing this Makefile. Don't change it. Default '/home/kba/build/'
JEKYLL           Which jekyll binary to use. Default 'jekyll'
DSTDIR           Where to build site. Default '/home/kba/build/'
SRCDIR           Where site is stored. Default '/home/kba/build/'
GTDIR            Repositories with the DITA source texts. Default: /home/kba/build/
JEKYLL_HOST      host to serve from. Default:
KWALITEE_CONFIG  Configuration file for ocrd-kwalitee. Default: /home/kba/build/
LANGS            Languages to build. Default: 'de en'
LANGS_DST        Guideline langs to build. Default: 

Activate any virtualenvs before running make.

To ensure a complete setup on Debian/Ubuntu-based Linux systems, run make bootstrap. This tests whether all the tools are installed and offers remediation if not.
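
To illustrate the kind of check described here, the following Python sketch tests whether a set of command-line tools is available on the PATH. It is not the Makefile's actual implementation, and the tool names are assumptions guessed from the apt packages listed earlier (for example, bundle for ruby-bundler).

```python
import shutil

# Hypothetical list of executables, guessed from the apt packages in this README.
ASSUMED_TOOLS = ["make", "git", "ruby", "bundle", "java", "pip3"]


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(ASSUMED_TOOLS)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All assumed tools were found.")
```

Unlike make bootstrap, this sketch only reports what is missing and does not offer any remediation.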

Directory structure

  • docs: This is where the site will be built. Never touch it.
  • site: This is the jekyll site. Posts and Pages live here.
  • repo: Contains required subrepos
  • layout.html: Layout template for sphinx-doc to use. It must be run through shinclude first.

Rebuild gt-guidelines

make gt


Most elements of the page should be made available as both German and English texts.

Use the keys lang and lang-ref in YAML front matter to control language:

  • lang should be either en or de.
  • lang-ref is a unique arbitrary identifier that marks two pages as translations of each other.

E.g. to create a new page about cars:


---
title: The interestingness of cars never ceases to amaze
lang: en
lang-ref: that-weird-cars-page
---

# Cars ...

amazing aren't they?


---
title: Autos sollen gekauft werden
lang: de
lang-ref: that-weird-cars-page
---

weil es fuer die wirtschaft gut ist.

You could then go to and to from there.
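
The pairing rule can be checked mechanically. The Python sketch below groups pages by lang-ref and reports which languages are still missing for each identifier. It assumes the front matter has already been parsed into dictionaries; the path values and the hypothetical-trains-page entry are made up for illustration.

```python
from collections import defaultdict


def find_untranslated(pages, langs=("de", "en")):
    """Map each lang-ref to the languages for which no page exists yet."""
    seen = defaultdict(set)
    for page in pages:
        seen[page["lang-ref"]].add(page["lang"])
    return {
        ref: sorted(set(langs) - found)
        for ref, found in seen.items()
        if set(langs) - found
    }


# Hypothetical page metadata; only the cars pages come from the example above.
pages = [
    {"path": "en/cars.md", "lang": "en", "lang-ref": "that-weird-cars-page"},
    {"path": "de/autos.md", "lang": "de", "lang-ref": "that-weird-cars-page"},
    {"path": "en/trains.md", "lang": "en", "lang-ref": "hypothetical-trains-page"},
]

print(find_untranslated(pages))  # {'hypothetical-trains-page': ['de']}
```

The cars pages share a lang-ref and cover both languages, so only the untranslated trains page is reported.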

Changing the menu

The menus are generated from the YAML file site/_data/menu.yml.

Every menu entry

  • MUST have a url field
  • MUST have a label field
  • MAY have a children field for a submenu

Both url and label can be either a string or an object with keys de and en. In the former case, the url or label is the same across languages; in the latter case, you can adapt it per language.

url should not include the /en or /de prefix unless the page in question is only available in one language.
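
The rules for menu entries can be sketched as a small validator. The Python example below assumes the YAML has already been loaded into plain lists and dictionaries, and it treats an object as valid only if it has exactly the keys de and en, which is a stricter reading than the text above strictly requires. The sample entries are hypothetical.

```python
def is_valid_value(value):
    """A url/label is a string or an object with exactly the keys de and en."""
    if isinstance(value, str):
        return True
    return isinstance(value, dict) and set(value) == {"de", "en"}


def validate_entry(entry):
    """Return a list of problems found in one menu entry and its children."""
    problems = []
    for field in ("url", "label"):
        if field not in entry:
            problems.append(f"missing required field: {field}")
        elif not is_valid_value(entry[field]):
            problems.append(f"{field} must be a string or an object with keys de and en")
    # children is optional and holds a submenu of further entries
    for child in entry.get("children", []):
        problems.extend(validate_entry(child))
    return problems


def localize(value, lang):
    """Resolve a url/label for one language."""
    return value if isinstance(value, str) else value[lang]


# Hypothetical menu data, not taken from site/_data/menu.yml.
menu = [
    {"url": "/about", "label": {"de": "Über uns", "en": "About"}},
    {"label": "Broken entry", "children": [{"url": "/en/only", "label": "English only"}]},
]

for entry in menu:
    print(validate_entry(entry))
```

Here localize(menu[0]["label"], "de") yields the German label, while the string url is returned unchanged for either language.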

Updating publications

  • Go to
  • Select all items (Hold Shift to mark in bulk, Ctrl-leftclick to mark the first entry)
  • Export as "Zotero RDF"
  • Open Zotero Desktop
  • Import collection from file
  • For "articles", delete all "Presentation" items; for "presentations", delete everything else
  • Sort reverse by date
  • Select all
  • Right click -> export bibliography
  • Use style
  • Export as HTML, save as pub.html
  • Edit pub.html, crop to just the <body> contents
  • Replace some minor inconsistencies in Zotero's HTML output:
    • sed -i 's,>/slides,>,' pub.html
    • sed -i 's,,,' pub.html
  • Paste pub.html into site/en/ or site/de/
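
The cropping and clean-up steps could also be scripted. The Python sketch below is an assumption about how one might do that, not part of the repository: it keeps only the contents of the <body> element and mirrors the first sed substitution from the list, replacing the first occurrence of >/slides on each line with >. The function name and the sample HTML are hypothetical.

```python
import re


def clean_bibliography(html):
    """Crop to the <body> contents and strip the stray '/slides' link text."""
    match = re.search(r"<body[^>]*>(.*)</body>", html, flags=re.DOTALL | re.IGNORECASE)
    body = match.group(1) if match else html
    # Like sed without the g flag: only the first match on each line is replaced.
    lines = [line.replace(">/slides", ">", 1) for line in body.splitlines()]
    return "\n".join(lines).strip()


# Hypothetical sample input standing in for Zotero's exported HTML.
sample = '<html><body>\n<div><a href="x">/slides</a></div>\n</body></html>'
print(clean_bibliography(sample))  # <div><a href="x"></a></div>
```

A regular expression is sufficient here only because the export is a single, simple document; a real HTML parser would be more robust.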

Updating workflows

The workflows page is built from pages on individual steps in the OCR-D wiki.

To automate this, you need to have shinclude installed with make shinclude.

Make sure that repo/ is up-to-date: cd repo/; git pull origin master.

make workflows will generate site/en/ from the wiki fragments. Inspect it for consistency before merging.

Building and deploying the site

make build-site: Rebuild the website to render the changes from Markdown to HTML

make deploy: Copy all the contents of ./docs to, commit and push the changes in

or simply run both at once:

make build-site deploy

URL Shortcuts

To add a shortcut to use the mechanism:

  • Add a line consisting of the shortcut name, a space character and the URL to shortcuts.txt
  • make shortcuts
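
The line format of shortcuts.txt can be illustrated with a short parser. This Python sketch is not what make shortcuts does internally; it only shows how a line splits at the first space into name and URL. Skipping blank lines and raising on malformed lines are assumptions, and the example URL is a placeholder.

```python
def parse_shortcuts(text):
    """Parse lines of the form '<name> <url>' into a dict."""
    shortcuts = {}
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        # Split at the first space: everything before it is the shortcut name.
        name, sep, url = line.partition(" ")
        if not sep or not url.strip():
            raise ValueError(f"line {number}: expected '<name> <url>'")
        shortcuts[name] = url.strip()
    return shortcuts


print(parse_shortcuts("docs https://example.org/docs\n"))
# {'docs': 'https://example.org/docs'}
```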


