Drug repositioning based on heterogeneous networks and text mining
- 'data' directory: contains the nine drug-related networks.
- compute_similarity.m: computes Jaccard similarity from the association networks (i.e., the drug-disease, drug-side-effect, and drug-protein networks).
- PPMI_Matrix.m: generates the PPMI (positive pointwise mutual information) matrix.
- fuse_similarity_matrix.m: fuses the nine PPMI matrices.
- SAE.py: extracts the drug features.
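
To make the similarity and PPMI steps listed above more concrete, here is a minimal Python sketch of both computations. It is not a port of compute_similarity.m or PPMI_Matrix.m: the function names `jaccard_similarity` and `ppmi` are hypothetical, the input is assumed to be a small non-negative matrix given as nested lists, and the MATLAB scripts may preprocess the networks differently before this step.

```python
import math


def jaccard_similarity(assoc):
    """Pairwise Jaccard similarity between the rows of a binary association matrix.

    Each row is assumed to describe one drug; each column one disease,
    side effect, or protein (an assumption made for illustration).
    """
    # Represent each row as the set of column indices it is associated with.
    sets = [{j for j, v in enumerate(row) if v} for row in assoc]
    n = len(sets)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            union = len(sets[i] | sets[k])
            if union:  # two rows with no associations keep similarity 0
                sim[i][k] = len(sets[i] & sets[k]) / union
    return sim


def ppmi(mat):
    """Positive pointwise mutual information of a non-negative matrix."""
    total = sum(sum(row) for row in mat)
    row_sums = [sum(row) for row in mat]
    col_sums = [sum(col) for col in zip(*mat)]
    out = [[0.0] * len(col_sums) for _ in mat]
    for i, row in enumerate(mat):
        for j, v in enumerate(row):
            if v > 0:
                # PMI = log( p(i, j) / (p(i) * p(j)) ); negative values are clipped to 0.
                pmi = math.log(v * total / (row_sums[i] * col_sums[j]))
                out[i][j] = max(pmi, 0.0)
    return out
```

For example, `jaccard_similarity([[1, 1, 0], [1, 0, 0]])` compares two drugs that share one of two associated entities. The nested-list form is chosen only to keep the sketch dependency-free; a matrix library would be the natural choice for networks of realistic size.
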
Part of the code and parameters comes from BioBERT (https://github.com/dmis-lab/biobert).
- Requirements: tensorflow-gpu >= 1.11.0 (GPU version of TensorFlow); sklearn (to evaluate RE answers); pandas==0.23.
- Fine-tuning BioBERT for relation extraction (RE): run the shell file train.sh, and remember to check the file paths in it. For details, see https://github.com/dmis-lab/biobert. (Fine-tuning produces the model checkpoint file, i.e. model.ckpt-4496, which is needed to obtain the vocabulary embeddings.)
- getEmbedding_1229.py: gets the embeddings of diseases. Run the code file getEmbedding_1229.py.
Because of the file size limit, 'Get disease feature.zip' is provided as a separate download: https://github.com/stjin-XMU/HeTDR/releases/download/HeTDR/Get.disease.feature.zip
- 'data' directory: contains the gold standard drug-disease associations and the example dataset.
- main.py: produces the drug-disease association prediction results.
For example, you can run:
python main.py --input data --features data/feature