tranSMART Arborist ETL toolkit

A toolkit for ETL curation for the tranSMART data warehouse. The tranSMART curation toolkit (tmtk) can be used to edit and validate studies before loading them with transmart-batch.

For general documentation, visit Read the Docs.


Installing via the Anaconda Cloud or PyPI package managers


$   conda install -c conda-forge tmtk


$   pip3 install tmtk
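
After either command finishes, it can be useful to confirm that the package landed in the environment you are actually using. The following is a minimal sketch, not part of tmtk itself: it assumes the import name is `tmtk` (as used in the `tmtk.arborist` extension commands in this README), and `is_installed` is a hypothetical helper name chosen for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be found on the current import path."""
    # find_spec looks the module up without importing it.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("tmtk"):
        print("tmtk is available in this environment")
    else:
        print("tmtk was not found; check that the right environment is active")
```

Run it with the same `python3` interpreter that `pip3` or `conda` installed into; a "not found" result usually points to a different environment being active.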

Installing manually

Initialize a virtualenv

$ pip install virtualenv
$ virtualenv -p /path/to/python3.x/installation env
$ source env/bin/activate

For macOS users, this will most likely be:

$ pip install virtualenv
$ virtualenv -p python3 env
$ source env/bin/activate

Alternatively, do this using virtualenvwrapper.

Installation from source

To install tmtk and all dependencies into your Python environment, and enable the Arborist Jupyter notebook extension, run:

$   pip3 install -r requirements.txt
$   python3 setup.py install

Or, if you want to run the tool from source in development mode:

$   pip3 install -r requirements.txt
$   python3 setup.py develop
$   jupyter-nbextension install --py tmtk.arborist
$   jupyter-serverextension enable tmtk.arborist


The following dependencies are required:
  • pandas>=0.22.0
  • ipython>=5.3.0
  • jupyter>=1.0.0
  • jupyter-client>=5.0.0
  • jupyter-core>=4.3.0
  • jupyter-console>=5.1.0
  • notebook>=4.4.1
  • requests>=2.13.0
  • tqdm>=4.11.0
  • xlrd>=1.0.0
  • click>=6.0
  • arrow>=0.10.0
Optional dependencies:
  • mygene>=3.0.0
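
To see which of these minimum versions an existing environment already satisfies, something along the following lines may help. It is a rough sketch under stated assumptions: it needs Python 3.8 or later for `importlib.metadata`, the helper names (`parse_version`, `older_than`, `check`) are hypothetical, and the version comparison is deliberately simple (numeric components only, with suffixes such as `rc1` ignored). It is not how tmtk or pip resolves dependencies.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the dependency list in this README.
REQUIRED = {
    "pandas": "0.22.0", "ipython": "5.3.0", "jupyter": "1.0.0",
    "jupyter-client": "5.0.0", "jupyter-core": "4.3.0",
    "jupyter-console": "5.1.0", "notebook": "4.4.1",
    "requests": "2.13.0", "tqdm": "4.11.0", "xlrd": "1.0.0",
    "click": "6.0", "arrow": "0.10.0",
}
OPTIONAL = {"mygene": "3.0.0"}


def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3); parsing stops at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def older_than(installed, minimum):
    """Return True if `installed` is a lower version than `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that '1.0' and '1.0.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def check(requirements):
    """Return {name: problem} for every package that is missing or too old."""
    problems = {}
    for name, minimum in requirements.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if older_than(installed, minimum):
            problems[name] = f"found {installed}, need >= {minimum}"
    return problems


if __name__ == "__main__":
    for name, problem in check(REQUIRED).items():
        print(f"{name}: {problem}")
    for name, problem in check(OPTIONAL).items():
        print(f"{name} (optional): {problem}")
```

No output for the required packages means every minimum above is met; a missing `mygene` is reported separately because it is optional.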


