pip install -r requirements.txt
Data Source | Data File Name | File Path | Download Website |
---|---|---|---|
OrangeBook | products.txt | /datafiles/orangebook | https://www.fda.gov/drugs/drug-approvals-and-databases/orange-book-data-files |
DailyMed | dm_spl_zip_files_meta_data.txt | /datafiles/dailymed | https://dailymed.nlm.nih.gov/dailymed/spl-resources-all-mapping-files.cfm |
DrugBank | fulldatabase.xml | /datafiles/drugbank | https://go.drugbank.com/releases/latest |
Drugs@FDA | ApplicationDocs.txt, Applications.txt | /datafiles/drugsfda | https://www.fda.gov/drugs/drug-approvals-and-databases/drugsfda-data-files |
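
Before running the collection and preprocessing scripts, it can help to confirm that the files listed in the table are in place. The following is a minimal sketch, not part of the repository: the helper name `find_missing_files` is hypothetical, and it assumes the `/datafiles/...` paths in the table are relative to the repository root.

```python
from pathlib import Path

# Expected files per data source, copied from the table above.
# Assumption: "/datafiles/..." is resolved relative to the repository root.
EXPECTED_FILES = {
    "datafiles/orangebook": ["products.txt"],
    "datafiles/dailymed": ["dm_spl_zip_files_meta_data.txt"],
    "datafiles/drugbank": ["fulldatabase.xml"],
    "datafiles/drugsfda": ["ApplicationDocs.txt", "Applications.txt"],
}


def find_missing_files(root="."):
    """Return the expected data files that are not present under `root`."""
    root_path = Path(root)
    missing = []
    for folder, names in EXPECTED_FILES.items():
        for name in names:
            path = root_path / folder / name
            if not path.is_file():
                missing.append(path.as_posix())
    return missing


if __name__ == "__main__":
    missing = find_missing_files()
    if missing:
        print("Missing data files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected data files are present.")
```

Run it from the repository root; any path it prints still needs to be downloaded from the corresponding website in the table.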
python dailymed_data_collection.py
python drugsfda_data_collection.py
Download the full database from https://go.drugbank.com/releases/latest
python dailymed_preprocess.py
python drugsfda_preprocess.py
python drugbank_preprocess.py
Combine Drug Information from Multiple Data Sources and Basic Statistics
python multi_data_source_combine.py
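
As a rough illustration of what combining records from several sources can look like, here is a minimal sketch. It is not the logic of `multi_data_source_combine.py`: the function `combine_sources`, the shared drug key, and the field names are all hypothetical, and the values are placeholders.

```python
from collections import defaultdict


def combine_sources(sources):
    """Merge per-source records into one record per drug key.

    `sources` maps a source name to {drug_key: {field: value}}.
    Each field is prefixed with its source name to avoid collisions.
    """
    combined = defaultdict(dict)
    for source_name, records in sources.items():
        for drug_key, fields in records.items():
            for field, value in fields.items():
                combined[drug_key][f"{source_name}.{field}"] = value
    return dict(combined)


# Placeholder records; keys and fields are illustrative only.
example = {
    "dailymed": {"KEY1": {"label": "label text"}},
    "drugbank": {"KEY1": {"name": "drug one"}, "KEY2": {"name": "drug two"}},
}
merged = combine_sources(example)
print(merged["KEY1"])
```

Prefixing each field with its source keeps the provenance of every value visible after the merge, which makes later per-source statistics straightforward.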
Calculate Data Source Coverage and Overlap
python calculate_coverage_overlap.py
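
One plausible way to think about coverage and overlap is in terms of sets of drug identifiers. The sketch below assumes that coverage means the fraction of a reference list found in a source and that overlap means the number of identifiers two sources share; these definitions, the function names, and the placeholder IDs are assumptions and may differ from what `calculate_coverage_overlap.py` computes.

```python
def coverage(source_ids, reference_ids):
    """Fraction of reference IDs that also appear in the source."""
    reference = set(reference_ids)
    if not reference:
        return 0.0
    return len(reference & set(source_ids)) / len(reference)


def pairwise_overlap(sources):
    """Number of shared IDs for every unordered pair of named sources."""
    names = sorted(sources)
    return {
        (a, b): len(set(sources[a]) & set(sources[b]))
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Placeholder identifiers, for illustration only.
ids_by_source = {
    "dailymed": ["A", "B"],
    "drugbank": ["B", "C"],
    "drugsfda": ["C"],
}
print(coverage(ids_by_source["dailymed"], ["A", "B", "C", "D"]))
print(pairwise_overlap(ids_by_source))
```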
python food_effect_model_data_preprocess.py
Run food_model.ipynb and food_model_bert.ipynb.
@article{shi2021information,
title={Information Extraction from FDA Drug Labeling to Enhance Product-Specific Guidance Assessment Using Natural Language Processing},
author={Shi, Yiwen and Ren, Ping and Zhang, Yi and Gong, Xiajing and Hu, Meng and Liang, Hualou},
journal={Frontiers in Research Metrics and Analytics},
volume={6},
pages={40},
year={2021},
publisher={Frontiers}
}