camilothorne/tree-kernel

tree-kernel

Code for a baseline method that automatically detects disease-chemical relationships in biomedical papers.

The method works by computing word embeddings on the training corpus, concatenating the embeddings of each disease-chemical pair into a single vector of ~100 dimensions, and training an SVM with a quadratic kernel on those vectors. Tree kernels were also tried, but they hurt classification performance compared with embeddings or bags of words.
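
The feature construction described above can be illustrated with a minimal sketch. It is not taken from the repository: the toy 3-dimensional embeddings, the names `EMBEDDINGS`, `pair_vector`, `quadratic_kernel` and `gram_matrix`, and the reading of "quadratic kernel" as a degree-2 polynomial kernel with offset `coef0` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical toy embeddings (3 dimensions each). In the real experiment the
# embeddings are learned from the training corpus and are much larger.
EMBEDDINGS: Dict[str, List[float]] = {
    "aspirin": [0.2, -0.1, 0.4],
    "ulcer": [0.0, 0.3, -0.2],
    "lithium": [-0.3, 0.1, 0.1],
    "tremor": [0.5, 0.2, 0.0],
}


def pair_vector(chemical: str, disease: str) -> List[float]:
    """Concatenate the chemical and disease embeddings into one feature vector."""
    return EMBEDDINGS[chemical] + EMBEDDINGS[disease]


def quadratic_kernel(x: Sequence[float], y: Sequence[float], coef0: float = 1.0) -> float:
    """Degree-2 polynomial kernel: (x . y + coef0) ** 2 (assumed form)."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** 2


def gram_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise kernel values between all feature vectors."""
    return [[quadratic_kernel(u, v) for v in vectors] for u in vectors]


# Two candidate (chemical, disease) pairs turned into feature vectors.
pairs = [("aspirin", "ulcer"), ("lithium", "tremor")]
features = [pair_vector(c, d) for c, d in pairs]
gram = gram_matrix(features)
```

A Gram matrix like `gram` is what an SVM implementation that accepts precomputed kernels would consume, together with one relation/no-relation label per pair. How the repository actually builds its vectors and configures its SVM may differ.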

For training and testing we use the well-known CDR corpus (BioCreative). The baseline reaches an accuracy of 80%. The whole experiment is self-contained. Download the package and, from the package directory, type on the command line:

   python main.py

to run the experiment.

