i2b2/tranSMART etl-client

etl-client 19.1 is deployed as a Docker container and can be used to load a properly delimited dataset (for example, the publicly available NHANES data) into an i2b2/TM 19.1 compatible database. Once you are familiar with the process, you can use it to load any clinical data file into an i2b2/tranSMART 19.1 database. Our process has been tested extensively on Oracle databases, so if you are running any other database, please adjust the environment accordingly.
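Since the loader expects a properly delimited dataset, it can help to sanity-check a file before loading it. The following is a minimal sketch, not part of etl-client: the function name `check_delimited` and the sample column names are hypothetical, and it only verifies that every row has the same number of fields as the header.

```python
import csv
import io


def check_delimited(text, delimiter=","):
    """Return (line_number, field_count) for each row whose width differs from the header.

    Hypothetical helper for illustration; it is not part of etl-client.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        raise ValueError("file is empty")
    expected = len(header)
    bad_rows = []
    for row in reader:
        # reader.line_num is the physical line number of the row just read
        if len(row) != expected:
            bad_rows.append((reader.line_num, len(row)))
    return bad_rows


if __name__ == "__main__":
    # Sample data with made-up column names; the second data row is missing a field.
    sample = "PATIENT_ID,AGE,GENDER\n1,34,F\n2,51\n3,47,M\n"
    print(check_delimited(sample))  # prints [(3, 2)]
```

An empty result means every row matched the header width. To check a file on disk, read it into a string first and pass the delimiter the file actually uses.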

DISCLAIMER:

The data loader truncates and reloads the i2b2demodata/i2b2metadata schema tables.

Please double-check before running any data loaders in a production environment.

Installation

This etl-client Docker image can be deployed in either of the following 2 ways:

Locally (on Mac/Linux)
- Docker should be up and running
On a VM from a cloud vendor such as AWS, GCP, or Azure
- perform some additional steps to make the machine (VM/EC2) ready for the Docker image
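Both deployment options require a working Docker daemon. As a rough sketch of one way to verify this from a script, the hypothetical helper below checks that the `docker` CLI is on the PATH and that `docker info` exits successfully. The `run` and `which` parameters are assumptions added only so the function can be exercised without Docker installed.

```python
import shutil
import subprocess


def docker_is_running(run=subprocess.run, which=shutil.which):
    """Return True if the docker CLI is found and `docker info` exits with status 0.

    Hypothetical helper for illustration; it is not part of etl-client.
    """
    if which("docker") is None:
        return False  # CLI not installed or not on PATH
    try:
        result = run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("Docker is up" if docker_is_running() else "Docker is not available")
```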

There are 2 ways you can use this etl-client.

Use Case 19.1 (preferred if you are loading your own data in production with the 19.1 stack):

ETL client with 19.1 Database

Use case (Quickstart 18.1b):

ETL client with the Quickstart 18.1b stack

This option can be used if you are still using the Quickstart 18.1b stack.

Data Loading Overview

It is recommended that you first try loading the example data files.

If you are already familiar with the process and want to load your own custom data file, follow the steps below.

Make sure the i2b2/tranSMART 19.1 DB is up and running.
Install etl-client-docker; the container should be up and running (follow the installation instructions above).
Validate the connections to the DB; they should be successful.
Start with the data file you want to load.
Build the initial mapping file using MappingGenerator.
Fix the mapping file to match your i2b2 tree and data types; try using the provided Mapping Editor for this.
Run EntityGenerator to generate a CSV file for each table.
Run the Workflow scripts to load the data into your DB.
Test your i2b2/tranSMART application with the latest data loaded.
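To give a feel for why the generated mapping file usually needs a manual pass over its data types, here is a minimal sketch of guessing a type per column from a delimited file. It does not reproduce MappingGenerator or its file format, which this README does not describe: the function `draft_mapping`, the labels `NUMERIC` and `TEXT`, and the sample column names are all hypothetical.

```python
import csv
import io


def is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
    except ValueError:
        return False
    return True


def draft_mapping(text, delimiter=","):
    """Guess "NUMERIC" or "TEXT" for each column from its non-empty values.

    Hypothetical helper for illustration; not the MappingGenerator output format.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    guesses = {name: None for name in (reader.fieldnames or [])}
    for row in reader:
        for name in guesses:
            value = (row.get(name) or "").strip()
            if not value or guesses[name] == "TEXT":
                continue  # skip blanks; once a column is TEXT it stays TEXT
            guesses[name] = "NUMERIC" if is_number(value) else "TEXT"
    # Columns with no values at all fall back to TEXT
    return {name: (kind or "TEXT") for name, kind in guesses.items()}


if __name__ == "__main__":
    sample = "PATIENT_ID,AGE,GENDER\n1,34,F\n2,,M\n"
    print(draft_mapping(sample))
```

A guess like this treats coded values such as `1` and `2` as numeric even when they stand for categories, which is one reason an automatically generated mapping is worth reviewing by hand before running the later steps.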

Example to load NHANES dataset subsets

small ~ 100 patients
large ~ 5k patients

Load NHANES data subsets example
