adtl is a data transformation language (DTL) used by some applications in Global.health, notably the ISARIC clinical data pipeline at globaldothealth/isaric and the InsightBoard project dashboard at globaldothealth/InsightBoard.
Documentation: ReadTheDocs
You can install this package using either pipx or pip. Installing via pipx offers advantages if you just want to use the adtl tool standalone from the command line, as it isolates the Python package dependencies in a virtual environment. On the other hand, pip installs packages to the global environment, which is generally not recommended as it can interfere with other packages on your system.
- Installation via pipx: pipx install git+https://github.com/globaldothealth/adtl
- Installation via pip: python3 -m pip install git+https://github.com/globaldothealth/adtl
If you are writing code which depends on adtl (instead of using the command-line program), then it is best to add a dependency on git+https://github.com/globaldothealth/adtl to your Python build tool of choice.
Most existing data transformation languages are XML dialects, though there are recent variations in other file formats. In addition, many DTLs use a custom domain-specific language. The primary utility of this DTL is to provide an easy-to-use Python library for basic data transformations, which are specified in a JSON file. It is not meant to be comprehensive, and adtl can be used as one step within a larger data processing pipeline.
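To illustrate what specifying a transformation in a JSON file means in general, here is a minimal sketch of a declarative field mapping. The TOY_SPEC layout and the apply_toy_spec helper are hypothetical and are not adtl's specification schema; see the documentation for the real format.

```python
import json

# Hypothetical toy specification: output field -> source column.
# This is NOT adtl's specification schema, only an illustration of the idea.
TOY_SPEC = json.loads('{"subject_id": "ID", "age_years": "Age"}')


def apply_toy_spec(spec, row):
    """Build one output record by looking up each mapped source column."""
    return {target: row.get(source) for target, source in spec.items()}


# The "Site" column is not mentioned in the specification, so it is dropped.
record = apply_toy_spec(TOY_SPEC, {"ID": "p001", "Age": "42", "Site": "A"})
print(record)
```

The point of the declarative style is that the mapping lives in data rather than in code, so it can be reviewed and changed without touching the program that applies it.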
adtl can be used from the command line or as a Python library.
As a CLI:
adtl specification-file input-file
Here specification-file is the parser specification (as TOML or JSON) and input-file is the data file (not the data dictionary) that adtl will transform using the instructions in the specification.
Python library:
import adtl
parser = adtl.Parser(specification)
print(parser.tables) # list of tables created
for row in parser.parse().read_table(table):
    print(row)
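If you want the rows of every table rather than a single one, the same calls can be wrapped in a small helper. This is a sketch that relies only on the attributes shown in the snippet above (tables, parse() and read_table()); the collect_rows name and the forwarding of extra arguments to parse() are assumptions made for illustration, so check the documentation for what parse() expects.

```python
def collect_rows(parser, *parse_args):
    """Return {table_name: [rows]} for every table the parser reports.

    `parser` is assumed to expose `tables`, `parse()` and `read_table()`
    as in the snippet above; any extra arguments are passed to `parse()`.
    """
    parsed = parser.parse(*parse_args)
    # Iterating over `tables` yields table names for a list or a dict.
    return {table: list(parsed.read_table(table)) for table in parser.tables}
```

With a parser created as above, collect_rows(parser) would gather all rows in memory, which is convenient for small files but may not suit large datasets.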
If adtl is not in your PATH, the command above may give an error. Either add the location where the adtl script is installed to your PATH, or try running adtl as a module:
python3 -m adtl specification-file input-file
Running adtl will create output files in the current working directory, named after the parser and suffixed with the table names.
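When calling the command-line tool from a script, the PATH fallback and the working-directory behaviour described above can be handled together. The following is a minimal sketch using only the standard library; build_adtl_command and run_adtl are hypothetical helper names, and the only adtl invocations used are the two shown above.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def build_adtl_command(spec_file, input_file):
    """Return the argument list for running adtl on one input file."""
    if shutil.which("adtl"):
        prefix = ["adtl"]  # the adtl script is on PATH
    else:
        prefix = [sys.executable, "-m", "adtl"]  # fall back to module form
    return prefix + [str(spec_file), str(input_file)]


def run_adtl(spec_file, input_file, output_dir="."):
    """Run adtl with output_dir as the working directory.

    adtl writes its output files to the current working directory, so the
    input paths are made absolute before the working directory changes.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    command = build_adtl_command(
        Path(spec_file).resolve(), Path(input_file).resolve()
    )
    subprocess.run(command, cwd=output_dir, check=True)
    return sorted(output_dir.iterdir())  # everything now in output_dir
```

Because check=True is set, a non-zero exit status from adtl raises subprocess.CalledProcessError instead of failing silently.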
Install pre-commit and set up the pre-commit hooks (pre-commit install), which will run linting checks before each commit.