ProtocolData/trialtracker

Trial Tracker

Improving cancer research with data!

What is it?

trialtracker is a Python package that provides methods to easily extract, transform, and download clinical trial data. It aims to create standardized data infrastructure for clinical trial digitalization, focusing on structured representation of clinical trial protocols.

Main Features

Here are some of the things trialtracker allows you to do:


- Download pre-curated clinical trial and clinical trial eligibility criteria datasets
- Easily query data from clinicaltrials.gov
- Apply state-of-the-art natural language processing methods to extract useful information from raw clinicaltrials.gov data
- Visualize and analyze clinical trial data
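
As a rough illustration of the querying feature listed above, the sketch below builds a request against the public ClinicalTrials.gov REST API and flattens the response into simple rows. It does not use trialtracker's own interface, which is not documented in this README. The endpoint, the `query.cond` and `pageSize` parameters, and the JSON field names are assumptions based on version 2 of the ClinicalTrials.gov API, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: version 2 of the ClinicalTrials.gov REST API.
API_URL = "https://clinicaltrials.gov/api/v2/studies"


def build_query_url(condition, page_size=10):
    """Build a search URL for studies matching a condition (hypothetical helper)."""
    params = {"query.cond": condition, "pageSize": page_size, "format": "json"}
    return API_URL + "?" + urlencode(params)


def summarize_studies(payload):
    """Reduce a response payload to one small dict per study.

    Field names are assumptions about the v2 response layout; missing
    fields fall back to None or an empty string instead of raising.
    """
    rows = []
    for study in payload.get("studies", []):
        protocol = study.get("protocolSection", {})
        ident = protocol.get("identificationModule", {})
        eligibility = protocol.get("eligibilityModule", {})
        rows.append({
            "nct_id": ident.get("nctId"),
            "title": ident.get("briefTitle"),
            "criteria": eligibility.get("eligibilityCriteria", ""),
        })
    return rows


def fetch_studies(condition, page_size=10):
    """Download one page of results. Requires network access."""
    url = build_query_url(condition, page_size)
    with urlopen(url, timeout=30) as response:
        return summarize_studies(json.load(response))
```

Calling `fetch_studies("breast cancer", page_size=5)` would return up to five rows if the service responds as assumed. `summarize_studies` can also be run offline on a saved response.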

The current version of the package is primarily focused on cancer trials, which are an important area for clinical development. Improved data infrastructure is especially helpful in this area given the complexity of the disease and treatments.
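
To give a sense of what structuring raw eligibility text involves, here is a deliberately simple sketch that splits a free-text criteria block into inclusion and exclusion lists. It is a plain regular-expression baseline, not the natural language processing pipeline the package refers to. The function name, the sample text, and the assumption that sections are introduced by "Inclusion Criteria" and "Exclusion Criteria" header lines are all illustrative.

```python
import re
from typing import Dict, List

# Assumption: each section starts with a header line such as "Inclusion Criteria:".
HEADER = re.compile(r"^\s*(inclusion|exclusion)\s+criteria\s*:?\s*$", re.IGNORECASE)
# Leading list markers such as "-", "*", "1." or "2)".
BULLET = re.compile(r"^\s*(?:[-*]|\d+[.)])\s*")


def split_criteria(text: str) -> Dict[str, List[str]]:
    """Group criteria lines under 'inclusion' or 'exclusion' (hypothetical helper)."""
    sections = {"inclusion": [], "exclusion": []}  # type: Dict[str, List[str]]
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1).lower()
            continue
        item = BULLET.sub("", line).strip()
        # Lines that appear before any header are ignored.
        if item and current is not None:
            sections[current].append(item)
    return sections


# Made-up sample text, for illustration only.
SAMPLE = """Inclusion Criteria:
* Age 18 years or older
* Histologically confirmed breast cancer

Exclusion Criteria:
* Pregnant or breastfeeding
"""

sections = split_criteria(SAMPLE)
print(len(sections["inclusion"]), "inclusion;", len(sections["exclusion"]), "exclusion")
```

Real protocols often depart from this layout (nested bullets, criteria written as paragraphs, missing headers), which is part of why dedicated parsers are used for this step.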

Impact

Cancer is one of the leading causes of death worldwide. New treatments are tested and approved through clinical trials, but 97% of cancer trials fail, driven by an inability to recruit enough patients. At the same time, many patients are routinely excluded from trials, including minority groups who are most affected by the disease.

The key to solving these problems lies in changing how we design trials, recruit patients, and report results. Clinical trial registration became a regulatory requirement in 2017, making semi-structured trial protocol data available on clinicaltrials.gov. Today, this data is not systematically used in trial design, patient recruitment, or reporting decisions in oncology. This project aims to unlock the value of clinical trial data to help accelerate cancer research and improve the lives of cancer patients.

Built With

Technologies and methods used to build this project!

Getting Started

To get a local copy up and running, follow the steps below.

Prerequisites

Get up and running with conda. Given the many dependencies of this project, we use conda as a package/environment manager to make sure we're running things in the same environment and that nothing breaks :)

Installation

  1. Clone the repo
     git clone https://github.com/zfx0726/trialtracker.git
  2. Navigate into the trialtracker project directory and recreate the conda environment.
     conda env create --file=trialtrackerenv_py36.yaml
  3. Activate conda python environment
     conda activate trialtrackerenv_py36
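
After activating the environment, a quick sanity check can confirm that Python is running inside it. The sketch below relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation. It also assumes that the `_py36` suffix in the environment name means Python 3.6 is pinned, which this README does not state explicitly. The function name is hypothetical.

```python
import os
import sys

# Environment name taken from the activation step above.
EXPECTED_ENV = "trialtrackerenv_py36"
# Assumption: the "_py36" suffix indicates Python 3.6.
EXPECTED_PYTHON = (3, 6)


def check_environment(environ=None, version_info=None):
    """Return a list of problems found; an empty list means both checks passed."""
    environ = os.environ if environ is None else environ
    version_info = sys.version_info if version_info is None else version_info

    problems = []
    active = environ.get("CONDA_DEFAULT_ENV")
    if active != EXPECTED_ENV:
        problems.append(
            "active conda environment is {!r}, expected {!r}".format(active, EXPECTED_ENV)
        )
    if tuple(version_info[:2]) != EXPECTED_PYTHON:
        problems.append(
            "Python {}.{} is running, expected {}.{}".format(
                version_info[0], version_info[1], *EXPECTED_PYTHON
            )
        )
    return problems


for problem in check_environment():
    print("WARNING:", problem)
```

The optional arguments exist only so the check can be exercised without changing the real environment.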

Running eligibility criteria extraction with FB Clinical Trial Parser

  1. Download the MeSH vocabulary by running the following from the root directory:
     ./extract/src/github.com/facebookresearch/Clinical-Trial-Parser/script/mesh.sh

Running eligibility criteria extraction with pyMeSHSim

  1. Download and extract MetaMap as per:

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Forrest Xiao - zfx0726@gmail.com