lasigeBioTM/IBEnt

IBEnt

Framework for identifying biomedical entities

Dependencies:

Configuration

A Dockerfile is provided to simplify installation: build the image, then run it with the -i flag. After setting up the dependencies, run python src/config/config.py to set the configuration values. You can use the CHEMDNER-patents sample data to check that the system works, then run ./benchmarks/check_setup.sh to confirm that everything is set up correctly.

Usage

To run distant supervision multi-instance learning experiments, use src/trainevaluate.py; see mil.sh for an example.

You can run the system in either batch mode or server mode. Batch mode expects specific data formats and can be used to train classifiers and evaluate them on a test set. For example, to train a classifier models/class1.ser.gz from the data in corpus1:

python src/main.py load_corpus --goldstd corpus1
python src/main.py train --goldstd corpus1 --models models/class1

To test with this classifier on corpus2 and save the results to data/results1.pickle:

python src/main.py load_corpus --goldstd corpus2
python src/main.py test --goldstd corpus2 -o pickle data/results1 --models models/class1

To evaluate the results on the corpus2 gold standard:

python src/evaluate.py evaluate corpus2 --results data/results1 --models models/class1

To map the terms in the results to the ChEBI ontology:

python src/evaluate.py chebi corpus2 --results data/results1 --models models/class1
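
The batch steps above can also be chained from a short Python script. The following is a minimal sketch, not part of IBEnt: the helper names build_pipeline and run_pipeline are hypothetical, and the commands are copied from the examples in this README. It assumes Python 3 and that it is run from the repository root.

```python
import subprocess


def build_pipeline(train_corpus, test_corpus, model, results):
    """Return the batch-mode commands from this README as argv lists.

    Hypothetical helper: it only assembles the command lines shown above.
    """
    main = ["python", "src/main.py"]
    evaluate = ["python", "src/evaluate.py"]
    return [
        # Load the training corpus and train the classifier.
        main + ["load_corpus", "--goldstd", train_corpus],
        main + ["train", "--goldstd", train_corpus, "--models", model],
        # Load the test corpus and save predictions as a pickle.
        main + ["load_corpus", "--goldstd", test_corpus],
        main + ["test", "--goldstd", test_corpus, "-o", "pickle", results,
                "--models", model],
        # Evaluate against the gold standard, then map to ChEBI.
        evaluate + ["evaluate", test_corpus, "--results", results,
                    "--models", model],
        evaluate + ["chebi", test_corpus, "--results", results,
                    "--models", model],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(argv, check=True)
```

Calling run_pipeline(build_pipeline("corpus1", "corpus2", "models/class1", "data/results1")) only prints the six commands; pass dry_run=False to execute them in order.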

If you only want to send text to previously trained classifiers and get results, use server mode. Start the server with python src/server.py and send text with python src/client. You can also write your own client that sends a POST request to the address in config.host_ip.
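
A custom client could look roughly like the sketch below. This README does not document the request format, so the port, the URL path, the JSON body and its "text" field are all assumptions for illustration; check src/server.py for what the server actually expects. The function names are hypothetical, and the sketch uses only the Python 3 standard library.

```python
import json
import urllib.request


def build_request(host, text, port=8080, path="/"):
    """Build a POST request carrying the text to annotate.

    Assumptions (not confirmed by the README): the port, the path,
    and a JSON body of the form {"text": ...}.
    """
    url = "http://{}:{}{}".format(host, port, path)
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_text(host, text, **kwargs):
    """Send the request and return the raw response body as a string."""
    request = build_request(host, text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

Here host would be the value of config.host_ip; for example, send_text("127.0.0.1", "Aspirin inhibits cyclooxygenase.") sends one sentence to a server running locally, provided the assumed port and payload match the server.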