Thrift definitions, making HLT data specifications concrete

Copyright 2012-2018 Johns Hopkins University HLTCOE. All rights reserved. This software is released under the 2-clause BSD license. See LICENSE in the project root directory.


Current version: 4.15

Please consult for more information about changes between versions.


Concrete is an attempt to map out various NLP data types in a Thrift schema for use in projects across Johns Hopkins University. This standardized schema gives researchers a common underlying data model for all NLP tasks, which facilitates integration between projects.

Browsable Schema Documentation

This repository contains HTML documentation for the Concrete schema. The documentation content is generated from the .thrift schema files. The HTML documentation contains exactly the same content as the schema text files, but the HTML format makes it easier to browse and to explore relations between different Concrete data structures.

To view the HTML documentation, open the file:


in your favorite web browser.

Documentation Webserver

The repository comes with an (optional) simple Bottle-based Python web server for hosting the documentation. You can install Bottle using pip:

pip install bottle

and then start the web server with the command:

python [--port PORT_NUMBER]

If you omit the --port option, the web server listens on the default port, 8097.

Point your browser to http://localhost:8097 to navigate to the documentation (assuming port 8097).
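
For readers curious how a server with an optional --port flag fits together, here is a minimal sketch. It is not the repository's Bottle-based script: it uses Python's standard library http.server instead, and the function names and the "docs" directory are assumptions made for illustration only.

```python
import argparse
import functools
from http.server import HTTPServer, SimpleHTTPRequestHandler

DEFAULT_PORT = 8097  # default port stated in this README


def parse_args(argv=None):
    """Parse the optional --port flag, falling back to the default port."""
    parser = argparse.ArgumentParser(description="Serve HTML documentation locally")
    parser.add_argument("--port", type=int, default=DEFAULT_PORT,
                        help="port to listen on (default: %(default)s)")
    return parser.parse_args(argv)


def make_server(directory, port):
    """Build an HTTP server that serves static files from `directory`."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return HTTPServer(("localhost", port), handler)


def main(argv=None):
    """Start serving until interrupted (hypothetical entry point)."""
    args = parse_args(argv)
    # "docs" is an assumed location for the HTML files, not taken from the repository.
    server = make_server("docs", args.port)
    print(f"Serving documentation at http://localhost:{args.port}")
    server.serve_forever()
```

Calling main() at the bottom of such a file would start the server. Keeping argument parsing in its own function makes the default port easy to check without opening a socket.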

Regenerating Documentation

If you do not have write access to this repository, then you can safely ignore this section.

The HTML documentation is a modified version of the documentation generated by the Thrift compiler. To regenerate the documentation, you will need both the Thrift compiler and the Python library beautifulsoup4. You can regenerate the documentation by running the script:

cd docs
./ path_to_thrift_compiler

This script will call thrift --gen html to generate HTML files for each .thrift file, and then copy modified versions of each HTML file to the schema/ directory. Not all files in the schema/ directory are auto-generated.
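
The first step described above, one thrift --gen html call per .thrift file, might look roughly like the sketch below. The function names and the directory argument are hypothetical, and the beautifulsoup4 post-processing and the copy into schema/ are deliberately left out because this README does not describe them in detail.

```python
import subprocess
from pathlib import Path


def build_thrift_commands(thrift_compiler, thrift_dir):
    """Return one `thrift --gen html` command per .thrift file in `thrift_dir`."""
    thrift_files = sorted(Path(thrift_dir).glob("*.thrift"))
    return [[thrift_compiler, "--gen", "html", str(path)] for path in thrift_files]


def generate_html(thrift_compiler, thrift_dir):
    """Run the Thrift compiler on every schema file, stopping on the first failure."""
    for command in build_thrift_commands(thrift_compiler, thrift_dir):
        # By default the compiler writes its HTML output under gen-html/
        # in the current working directory.
        subprocess.run(command, check=True)
```

Building the command lists separately from running them makes it possible to inspect what would be executed before invoking the compiler.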
