
CTB - Custom Taxonomy Builder


Given a list of terms and a set of UMLS files, the CTB generates a subset of the UMLS containing the supplied terms and their word-based variants.


The following files should be placed in the data/input directory:

  • MRCONSO.RRF concepts file
  • MRSTY.RRF concept -> semantic types file
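Before indexing, it can help to confirm that the input files look like intact RRF files. The sketch below is not part of CTB; `check_rrf` is a hypothetical helper name. It relies on the standard UMLS layout, in which every field is terminated by a `|` (so MRCONSO.RRF has 18 columns and MRSTY.RRF has 6), and reports any line whose column count differs.

```shell
# check_rrf FILE EXPECTED_COLUMNS   (hypothetical helper, not shipped with CTB)
# Succeeds only if every line of FILE has EXPECTED_COLUMNS pipe-terminated fields.
check_rrf() {
    file=$1
    expected=$2
    # Each RRF line ends with a trailing "|", so awk sees one extra empty field.
    awk -F'|' -v want="$expected" '
        BEGIN { bad = 0 }
        NF != want + 1 { printf "line %d: %d columns\n", NR, NF - 1; bad = 1 }
        END { exit bad }
    ' "$file"
}
```

For example, `check_rrf data/input/MRCONSO.RRF 18 && check_rrf data/input/MRSTY.RRF 6` would print nothing and succeed when both files are well formed.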

Supplied to Web Interface

  • list of supplied terms

Output


  • Custom version of mrconso.rrf
  • Custom version of mrsty.rrf


To use CTB you must first create indexes of your UMLS files and then start the tool.

Prepare Knowledge Sources

Copy MRCONSO.RRF and MRSTY.RRF to ctb/data/input/'your data set name'/.
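The copy step can be scripted roughly as follows. This is a minimal sketch, not a script that ships with CTB: `stage_umls_files` is a hypothetical name, and the location of your UMLS download is an assumption you must supply. It refuses to copy anything unless both files are present.

```shell
# stage_umls_files SRC_DIR DATASET [CTB_DIR]   (hypothetical helper)
# Copies MRCONSO.RRF and MRSTY.RRF from SRC_DIR into CTB_DIR/data/input/DATASET/.
# CTB_DIR defaults to "ctb".
stage_umls_files() {
    src_dir=$1
    dataset=$2
    ctb_dir=${3:-ctb}
    dest="$ctb_dir/data/input/$dataset"
    # Check both files first so a partial data set is never created.
    for f in MRCONSO.RRF MRSTY.RRF; do
        if [ ! -f "$src_dir/$f" ]; then
            echo "missing $src_dir/$f" >&2
            return 1
        fi
    done
    mkdir -p "$dest" &&
        cp "$src_dir/MRCONSO.RRF" "$src_dir/MRSTY.RRF" "$dest/"
}
```

A call such as `stage_umls_files /path/to/umls/META 2016AA` (the source path is only a placeholder) would populate ctb/data/input/2016AA/.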

In the ctb directory run:

bin/ 'your data set name'

For example:

bin/ 2016AA

Update the system configuration file

There should be a file called in the "config" directory. In change:

ctb.ivf.dataroot: ...

to:

ctb.ivf.dataroot: data/ivf/<your data set name>
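If you prefer to script this change, something like the following may work. It is a sketch under assumptions: `set_dataroot` is a hypothetical helper, the configuration file is passed in as an argument, and the file is assumed to hold one `key: value` pair per line, as in the snippet above. Data set names containing `|`, `&`, or `\` would need escaping.

```shell
# set_dataroot CONFIG_FILE DATASET   (hypothetical helper)
# Rewrites the ctb.ivf.dataroot line of CONFIG_FILE to point at data/ivf/DATASET.
set_dataroot() {
    config_file=$1
    dataset=$2
    tmp_file=$(mktemp) || return 1
    # Replace only the dataroot line; every other line passes through unchanged.
    sed "s|^ctb\.ivf\.dataroot:.*|ctb.ivf.dataroot: data/ivf/${dataset}|" \
        "$config_file" > "$tmp_file" &&
        cat "$tmp_file" > "$config_file"   # keeps the original file's permissions
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

With the 2016AA example used earlier, the edited line would read `ctb.ivf.dataroot: data/ivf/2016AA`.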

Adding LVG to configuration file for term expansion

If you want to use the Lexical Tools Lexical Variant Generator (LVG) to supply term combinations not found in the UMLS, download LVG from the Lexical Systems Group website and install it according to its directions. After installing the Lexical Tools, add the following to the file: {LVGDIR}

Where LVGDIR is the location of your LVG installation.

Start up system

In the top-level ctb directory run:

java -jar target/ctb-0.1.0-SNAPSHOT-standalone.jar [port]

or if you have Leiningen:

lein ring server [port]

Then point your web browser to localhost:3000 (or, if you supplied a port number, to that port).

Supply Term List

Paste your term list into the "Input Terms" (first) page and press "Submit".

Filter synonyms

Select or de-select terms in Synonym Set View to filter the synonyms generated by the tool and press "Submit".

Generate Data Set

The generated dataset will be placed in the directory resources/public/output/user//.

The directory should contain the following files:


For Developers

Running the system in Apache Tomcat

If you have Tomcat, you can deploy the system to it using the file target/ctb-0.1.0-SNAPSHOT-standalone.war.

Currently, the application expects the config directory and the data directory containing the indexes to be in the top-level apache-tomcat directory (where the directories conf, webapps, etc. reside).

Note: CTB has not been extensively tested in Tomcat and may require modification to work properly.


CTB is a product of the U.S. Government and is not subject to copyright.

For more information see:
