The goal of this repository is to apply the Clinical Trials data conventions to 'real' data. As a first attempt, the PhUSE dataset was chosen. It is a CDISC-compliant synthetic SDTM dataset.
For more information about the Clinical Trials conventions, please visit our wiki or download the PDF version.
For information about the group, see here.
The ETL process is based on the Apache Spark™ analytics engine running in a Docker container, so the only thing you need is Docker. You can find instructions on how to install Docker on your system at the official site. After installing Docker, run the conversion in three easy steps:
- Clone the repository into a folder on your machine:
$ git clone https://github.com/OHDSI/ClinicalTrialsWGETL.git
- Download all the necessary vocabularies from Athena and put them into the vocab/omop folder
- Finally, from the repository root folder, run the following command:
$ docker-compose run --rm --service-ports phuse_etl
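Before starting the conversion, it may help to confirm that the prerequisites from the steps above are in place. The following is only a sketch: the function names `count_csv` and `preflight` are hypothetical and not part of this repository, and it assumes the Athena vocabulary files are unpacked as `.csv` files directly inside vocab/omop.

```shell
# Sketch of a pre-flight check; the function names are hypothetical.

# Count the *.csv files directly inside a folder (default: vocab/omop).
count_csv() {
    dir="${1:-vocab/omop}"
    n=0
    for f in "$dir"/*.csv; do
        # An unmatched glob stays literal, so test that the file exists.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Return non-zero if a required tool or the vocabulary files are missing.
preflight() {
    for tool in docker docker-compose; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            return 1
        fi
    done
    # Assumption: vocabularies are stored as CSV files in vocab/omop.
    if [ "$(count_csv vocab/omop)" -eq 0 ]; then
        echo "no CSV files found in vocab/omop" >&2
        return 1
    fi
    echo "ready"
}

# Usage, from the repository root:
#   preflight && docker-compose run --rm --service-ports phuse_etl
```

The check only verifies that some vocabulary files exist, not that every vocabulary the ETL needs has been downloaded.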
After the conversion is done, the resulting CDM tables (in CSV format) are in the data/cdm folder.
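As a quick sanity check on the output, the CSV files can be listed together with their line counts. This is a minimal sketch under stated assumptions: `summarize_cdm` is a hypothetical helper, not part of the repository, and because the exact layout under data/cdm is not described here, it searches the folder recursively for any file ending in `.csv`.

```shell
# Hypothetical helper: print "<file name><TAB><line count>" for every
# CSV file found under a folder (default: data/cdm).
summarize_cdm() {
    dir="${1:-data/cdm}"
    # Assumption: output files end in .csv; subfolders are searched too.
    find "$dir" -type f -name '*.csv' 2>/dev/null | while IFS= read -r f; do
        # wc -l counts newline characters, so a header row (if present)
        # is included in the count.
        lines=$(wc -l < "$f" | tr -d ' ')
        printf '%s\t%s\n' "$(basename "$f")" "$lines"
    done
}

# Usage, from the repository root:
#   summarize_cdm data/cdm
```

Line counts are only a rough indicator of table size, since quoted fields that contain line breaks would inflate them.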
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.