Scripts to extract and structure the content of the "thesaurus of drug interactions" published as PDF files by the French National Agency for Medicines and Health Products Safety (ANSM) at this address: https://ansm.sante.fr/documents/reference/thesaurus-des-interactions-medicamenteuses-1
The ANSM interaction working group publishes guidelines once or twice a year. All the files are in the thesauri folder.
You need to download tika-app-1.11.jar:
wget https://archive.apache.org/dist/tika/tika-app-1.11.jar
bash extract.sh --help
bash extract.sh -p -t
The program creates a TXT folder and a JSON folder in each thesaurus folder.
First, PDF files are transformed into txt files with Apache Tika.
For example:
java -jar tika-app-1.11.jar -h ./thesauri/2019_09/PDF/index_des_substances_09_2019.pdf > ./index_des_substances_09_2019.txt
java -jar tika-app-1.11.jar -t ./thesauri/2019_09/PDF/Thesaurus_09_2019.pdf > ./Thesaurus_09_2019.txt
The message "ERROR FlateFilter: stop reading corrupt stream due to a DataFormatException" can be ignored: the content is still extracted correctly.
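The Tika commands above can also be driven from Python. The following is a minimal sketch, not the logic of extract.sh: the helper names (`tika_command`, `txt_target`, `convert`) are hypothetical, and it assumes the jar sits in the current directory and that each thesaurus folder holds a PDF folder next to its TXT folder, as the paths in this README suggest.

```python
import subprocess
from pathlib import Path

TIKA_JAR = "tika-app-1.11.jar"  # assumed to be in the current directory


def tika_command(pdf_path, html=False, jar=TIKA_JAR):
    """Build the Tika command for one PDF: -h outputs HTML, -t outputs plain text."""
    flag = "-h" if html else "-t"
    return ["java", "-jar", jar, flag, str(pdf_path)]


def txt_target(pdf_path):
    """Map <thesaurus>/PDF/name.pdf to <thesaurus>/TXT/name.txt (assumed layout)."""
    pdf_path = Path(pdf_path)
    return pdf_path.parent.parent / "TXT" / (pdf_path.stem + ".txt")


def convert(pdf_path, html=False):
    """Run Tika on one PDF and write its standard output into the TXT folder."""
    target = txt_target(pdf_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, "wb") as out:
        # Tika writes the extracted content to stdout; messages such as the
        # FlateFilter error go to stderr and do not end up in the file.
        subprocess.run(tika_command(pdf_path, html), stdout=out, check=True)
    return target
```

Calling `convert("./thesauri/2019_09/PDF/Thesaurus_09_2019.pdf")` requires Java and the Tika jar; the two helper functions can be used on their own to preview the command and the output path.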
Next, the txt files are transformed into JSON files. For example:
python extractSubstanceDrugClasses.py -f ./index_des_substances_09_2019.txt
python extractInteraction.py -f ./thesauri/2019_09/TXT/Thesaurus_09_2019.txt
The error message "SeverityLevelerror while extraction PDDI between X and Y" means that the program failed to structure the mechanism of action and the severity level of the interaction between X and Y. To fix this, structure the interaction manually and add it to "./python/Interactions/pddis_manually_extracted.json".
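To see which interactions still need manual structuring, the pairs can be pulled out of a saved run log. This is a minimal sketch under stated assumptions: the function `failed_pairs` is hypothetical and not part of the repository, the program output was captured as text, each message sits on one line in exactly the form quoted above, and the first substance name does not itself contain " and ".

```python
import re

# Pattern built from the error message quoted in this README.
PATTERN = re.compile(
    r"SeverityLevelerror while extraction PDDI between (.+?) and (.+)"
)


def failed_pairs(log_text):
    """Return the distinct (X, Y) pairs reported as failed, in order of appearance."""
    pairs = []
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            pair = (match.group(1).strip(), match.group(2).strip())
            if pair not in pairs:  # skip repeated reports of the same pair
                pairs.append(pair)
    return pairs
```

Each returned pair is a candidate entry for pddis_manually_extracted.json; the structure of that file is not described here, so check its existing entries before adding new ones.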
Run all the tests with this command:
python -m unittest discover ./
Detect Potential Drug Drug Interactions: https://github.com/scossin/pddiansm
@mastersthesis{cossin:dumas-01442668,
TITLE = {{Interactions m{\'e}dicamenteuses : donn{\'e}es li{\'e}es et applications}},
AUTHOR = {Cossin, S{\'e}bastien},
URL = {https://dumas.ccsd.cnrs.fr/dumas-01442668},
PAGES = {80},
YEAR = {2016},
MONTH = Nov,
KEYWORDS = {Interaction m{\'e}dicamenteuse ; Web S{\'e}mantique ; Interop{\'e}rabilit{\'e} ; Pharmacovigilance },
PDF = {https://dumas.ccsd.cnrs.fr/dumas-01442668/file/Med_spe_2016_Cossin.pdf},
HAL_ID = {dumas-01442668},
HAL_VERSION = {v1},
}
Cossin S. Interactions médicamenteuses : données liées et applications. 30 nov 2016.