Yiwen-Shi/drug-labeling-extraction

Drug Labeling Retrieval Application

Install dependencies

pip install -r requirements.txt

Data Source Files

| Data Source | Data File Name | File Path | Download Website |
| --- | --- | --- | --- |
| OrangeBook | products.txt | /datafiles/orangebook | https://www.fda.gov/drugs/drug-approvals-and-databases/orange-book-data-files |
| DailyMed | dm_spl_zip_files_meta_data.txt | /datafiles/dailymed | https://dailymed.nlm.nih.gov/dailymed/spl-resources-all-mapping-files.cfm |
| DrugBank | fulldatabase.xml | /datafiles/drugbank | https://go.drugbank.com/releases/latest |
| Drugs@FDA | ApplicationDocs.txt, Applications.txt | /datafiles/drugsfda | https://www.fda.gov/drugs/drug-approvals-and-databases/drugsfda-data-files |
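
Before running the collection and preprocessing scripts, it can help to confirm that the downloaded files are where the table says they should be. The following is a minimal sketch, not part of the repository: the helper name `find_missing_files` is hypothetical, and it assumes the listed paths are relative to the repository root.

```python
from pathlib import Path

# Expected files, taken from the data source table. Paths are ASSUMED to be
# relative to the repository root (the README lists them with a leading slash).
EXPECTED_FILES = {
    "OrangeBook": ["datafiles/orangebook/products.txt"],
    "DailyMed": ["datafiles/dailymed/dm_spl_zip_files_meta_data.txt"],
    "DrugBank": ["datafiles/drugbank/fulldatabase.xml"],
    "Drugs@FDA": [
        "datafiles/drugsfda/ApplicationDocs.txt",
        "datafiles/drugsfda/Applications.txt",
    ],
}


def find_missing_files(repo_root="."):
    """Return {source: [missing relative paths]} for files not found on disk."""
    root = Path(repo_root)
    missing = {}
    for source, rel_paths in EXPECTED_FILES.items():
        absent = [p for p in rel_paths if not (root / p).is_file()]
        if absent:
            missing[source] = absent
    return missing


if __name__ == "__main__":
    # Hypothetical usage: run from the repository root.
    for source, paths in find_missing_files().items():
        for path in paths:
            print(f"[{source}] missing: {path}")
```

If the script prints nothing, all five files were found under the assumed layout.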

Data Collection

DailyMed

python dailymed_data_collection.py

Drugs@FDA

python drugsfda_data_collection.py

DrugBank

Download the full database from https://go.drugbank.com/releases/latest and place fulldatabase.xml in /datafiles/drugbank.

Data Preprocessing

DailyMed

python dailymed_preprocess.py

Drugs@FDA

python drugsfda_preprocess.py

DrugBank

python drugbank_preprocess.py

Comparison of Source Coverage and Overlap of Unique Drug and Drug Labeling Sections

Combine Drug Information from Multiple Data Sources and Basic Statistics

python multi_data_source_combine.py

Calculate Data Source Coverage and Overlap

python calculate_coverage_overlap.py
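
The internals of calculate_coverage_overlap.py are not described here. As a rough illustration of what coverage and overlap can mean for several drug sources, the sketch below treats each source as a set of drug identifiers. The function name `coverage_and_overlap`, the definitions of the two metrics, and the toy data are all assumptions made for this example and may differ from what the script computes.

```python
from itertools import combinations


def coverage_and_overlap(drugs_by_source):
    """Compute per-source coverage and pairwise overlap of drug identifiers.

    drugs_by_source: dict mapping a source name to a set of drug identifiers.
    Coverage (as ASSUMED here) is the fraction of the union of all sources
    that a single source contains; overlap is the number of identifiers
    that two sources share.
    """
    all_drugs = set().union(*drugs_by_source.values())
    total = len(all_drugs)
    coverage = {
        name: (len(drugs) / total if total else 0.0)
        for name, drugs in drugs_by_source.items()
    }
    overlap = {
        (a, b): len(drugs_by_source[a] & drugs_by_source[b])
        for a, b in combinations(sorted(drugs_by_source), 2)
    }
    return coverage, overlap


if __name__ == "__main__":
    # Made-up toy data for illustration only, not taken from the real sources.
    toy = {
        "DailyMed": {"aspirin", "ibuprofen", "metformin"},
        "DrugBank": {"aspirin", "metformin", "warfarin"},
        "Drugs@FDA": {"aspirin"},
    }
    coverage, overlap = coverage_and_overlap(toy)
    for name, value in coverage.items():
        print(f"{name}: coverage {value:.2f}")
    for pair, shared in overlap.items():
        print(f"{pair[0]} and {pair[1]}: {shared} shared")
```

Using plain sets keeps the metrics easy to audit. The same idea extends to drug labeling sections by keying each set on a (drug, section) pair instead of a drug name.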

Food Effect Classification Model

Data Preprocessing

python food_effect_model_data_preprocess.py

Model Training & Evaluation

Run food_model.ipynb and food_model_bert.ipynb

Citation

@article{shi2021information,
  title={Information Extraction from FDA Drug Labeling to Enhance Product-Specific Guidance Assessment Using Natural Language Processing},
  author={Shi, Yiwen and Ren, Ping and Zhang, Yi and Gong, Xiajing and Hu, Meng and Liang, Hualou},
  journal={Frontiers in Research Metrics and Analytics},
  volume={6},
  pages={40},
  year={2021},
  publisher={Frontiers}
}
