SemaTyP: a knowledge graph based literature mining method for drug discovery

This repository contains the source code and data for the drug discovery task described in our paper: "SemaTyP: a knowledge graph based literature mining method for drug discovery".

Requirements

scikit-learn
numpy
tqdm

Data

To use the code, you need to provide the following data:

Therapeutic Target Database (TTD): You don't need to download it yourself; the full TTD 2016 version is already included in <./data/TTD>.
SemMedDB: You need to download it from here with password:1234 to obtain the whole knowledge graph. After downloading the "predications.txt" file, replace the file <./data/SemedDB/predications.txt> with the newly downloaded file.
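Since the repository ships only a small sample of "predications.txt", it can help to check which version is in place before building the knowledge graph. The following is a minimal sketch, not part of the repository: it assumes the file stores one relation per line (the file layout is not documented here), and the function names `count_relations` and `describe_predications` are hypothetical. The two counts are the figures quoted in the File declaration section of this README.

```python
from pathlib import Path

# Relation counts quoted in this README. Treating one line as one relation
# is an assumption about the file layout, not something the README confirms.
FULL_RELATION_COUNT = 39_133_975
SAMPLE_RELATION_COUNT = 100


def count_relations(path):
    """Count non-empty lines in a predications file, streaming line by line."""
    with Path(path).open("r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


def describe_predications(path):
    """Return 'sample', 'full', or 'unknown' depending on the line count."""
    n = count_relations(path)
    if n == SAMPLE_RELATION_COUNT:
        return "sample"
    if n == FULL_RELATION_COUNT:
        return "full"
    return "unknown"
```

For example, `describe_predications("./data/SemedDB/predications.txt")` would report "sample" for the bundled file if the one-line-per-relation assumption holds. A result of "unknown" may simply mean that a different SemMedDB release was downloaded.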

Run the code

Install the environment.

pip install -r requirements.txt

Construct training and test data.

python experimental_data.py

Train and test the model.

python main.py
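The three steps above can also be chained from a single script, so that a failure in one step stops the run before the next one starts. This is only a sketch under stated assumptions: it is run from the repository root, the helper name `run_pipeline` and the constant `PIPELINE_STEPS` are hypothetical, and the script names are taken from this README.

```python
import subprocess
import sys

# The three steps listed above, in order. sys.executable keeps every step
# on the same Python interpreter that runs this script.
PIPELINE_STEPS = [
    [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
    [sys.executable, "experimental_data.py"],
    [sys.executable, "main.py"],
]


def run_pipeline(steps=PIPELINE_STEPS, runner=subprocess.run):
    """Run each step in order and stop at the first failing command."""
    for command in steps:
        print("Running:", " ".join(command))
        # check=True makes subprocess.run raise CalledProcessError
        # when the command exits with a non-zero status.
        runner(command, check=True)
```

Calling `run_pipeline()` from the repository root executes all three steps. The `runner` parameter exists only so the sequence can be inspected or dry-run without launching any process.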

Illustration of feature selection

An illustration of the features constructed in our work.

File declaration

data/SemmedDB: contains all relations extracted from SemMedDB, which are used to construct the Knowledge Graph in our experiment. The whole "predications.txt" contains 39,133,975 relations; we leave only a small sample "predications.txt" file here, which contains 100 relations. The whole "predications.txt" file could be downloaded from

data/TTD: contains the drug, target and disease relations retrieved from the Therapeutic Target Database.

experimental_data.py: constructs the drug-target-disease associations from TTD and the Knowledge Graph.

knowledge_graph.py: constructs the Knowledge Graph used in our experiment.

data_loader.py: loads the training and test data.

main.py: trains and tests the models.
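To give a rough idea of what building a knowledge graph from relations and looking up drug-target-disease associations can involve, here is a toy sketch. It is not the code in knowledge_graph.py or experimental_data.py: it assumes relations are available as (subject, predicate, object) triples, the functions `build_graph` and `two_hop_paths` are hypothetical, and the entity and predicate names are placeholders rather than real data.

```python
from collections import defaultdict


def build_graph(triples):
    """Map each subject to the list of (predicate, object) edges leaving it."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def two_hop_paths(graph, start, end):
    """Return every start -> middle -> end path as (pred1, middle, pred2)."""
    paths = []
    for pred1, middle in graph.get(start, []):
        for pred2, target in graph.get(middle, []):
            if target == end:
                paths.append((pred1, middle, pred2))
    return paths


# Hypothetical toy triples; placeholder entity and predicate names only.
toy_triples = [
    ("drug_A", "INHIBITS", "target_B"),
    ("target_B", "ASSOCIATED_WITH", "disease_C"),
    ("drug_A", "INTERACTS_WITH", "target_D"),
]
graph = build_graph(toy_triples)
print(two_hop_paths(graph, "drug_A", "disease_C"))
# prints [('INHIBITS', 'target_B', 'ASSOCIATED_WITH')]
```

An adjacency list keyed by subject keeps path lookups cheap, which matters when the full graph holds tens of millions of relations. The repository's own scripts may store and traverse the graph differently.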

Cite

Please cite our paper if you use this code in your own work:

@article{sang2018sematyp,
  title={SemaTyP: a knowledge graph based literature mining method for drug discovery},
  author={Sang, Shengtian and Yang, Zhihao and Wang, Lei and Liu, Xiaoxia and Lin, Hongfei and Wang, Jian},
  journal={BMC bioinformatics},
  volume={19},
  number={1},
  pages={1--11},
  year={2018},
  publisher={Springer}
}


ShengtianSang/SemaTyP
