BELB Benchmark

Code base to reproduce benchmarking experiments on BELB (Biomedical Entity Linking Benchmark) reported in:

@article{10.1093/bioinformatics/btad698,
    author = {Garda, Samuele and Weber-Genzel, Leon and Martin, Robert and Leser, Ulf},
    title = {{BELB}: a {B}iomedical {E}ntity {L}inking {B}enchmark},
    journal = {Bioinformatics},
    pages = {btad698},
    year = {2023},
    month = {11},
    issn = {1367-4811},
    doi = {10.1093/bioinformatics/btad698},
    url = {https://doi.org/10.1093/bioinformatics/btad698},
    eprint = {https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btad698/53483107/btad698.pdf},
}

Setup

We assume you have a working installation of belb in your Python environment:

git clone https://github.com/sg-wbi/belb
cd belb
pip install -e .

Then install the other requirements:

(belb-venv) user $ pip install -r requirements.txt
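
Before running any experiment it can help to confirm that the installation succeeded. The following is a minimal sketch, assuming the package installed from the belb repository is importable under the module name `belb` (an assumption based on the repository name, not confirmed here). The helper `is_importable` is a hypothetical name used only for illustration.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located in the current environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "belb" is the assumed import name of the package installed above.
    status = "found" if is_importable("belb") else "missing"
    print(f"belb: {status}")
```

If the check reports `missing`, the active environment is probably not the one in which `pip install -e .` was run.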

Models

There are two types of models: rule-based, entity-specific systems and those based on pretrained language models (PLM).

Rule-based entity-specific

Entity	System	Status	Note
Gene	GNormPlus	✅	NER+EL
Species	Linnaeus	✅	NER+EL
Species	SR4GN	✅	NER+EL
Species	SPECIES	❌	Compilation fails
Disease	TaggerOne	✅	NER+EL
Chemical	BC7T2W	✅	Installation fails on Linux. NER+EL
Variant	tmVar (v3)	✅	NER+EL
Cell line	TaggerOne	❌	Model not available and training fails
UMLS	MetaMap	✅	NER+EL
UMLS	QuickUMLS	❌	Installation fails
UMLS	SciSpacy	✅	

For each system there is a run_*.sh script in the bin folder. The script installs the software in the user-specified directory, runs the tool and collects the output in the data/results/ directory.

E.g. to run GNormPlus:

(belb) user $ chmod +x ./bin/run_gnormplus.sh
(belb) user $ ./bin/run_gnormplus.sh <BELB directory> <tool directory>
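
To run several systems in sequence, the same invocation can be scripted. The sketch below assumes that every script follows the `run_<tool>.sh` naming seen for GNormPlus and takes the same two positional arguments. Only `run_gnormplus.sh` is confirmed by this README, so check the `bin` folder for the actual script names. The function names `build_run_command` and `run_tool` are hypothetical, and calling the script through `bash` is an assumption about the interpreter.

```python
import subprocess
from pathlib import Path


def build_run_command(tool: str, belb_dir: str, tool_dir: str, bin_dir: str = "bin") -> list[str]:
    """Build the argument list for: bin/run_<tool>.sh <BELB directory> <tool directory>."""
    script = Path(bin_dir) / f"run_{tool}.sh"
    return [script.as_posix(), belb_dir, tool_dir]


def run_tool(tool: str, belb_dir: str, tool_dir: str) -> None:
    """Run one script through bash (so no chmod is needed) and raise if it fails."""
    subprocess.run(["bash", *build_run_command(tool, belb_dir, tool_dir)], check=True)


if __name__ == "__main__":
    # Placeholder paths; replace them with your BELB and tool directories.
    print(build_run_command("gnormplus", "/path/to/belb", "/path/to/tools"))
```

Calling `run_tool("gnormplus", belb_dir, tool_dir)` from the repository root would then be equivalent to the shell commands above.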

BC7T2W

See the instructions in the corresponding README.md.

SciSpacy

(belb) user $ python -m benchmark.scispacy.scispacy --run output --in_dir test --belb_dir ~/data/belb

MetaMap

See the instructions in the corresponding README.md.

PLM-based

These types of models require training. We only provide code to create the input in the required format and to parse the output generated by each model. Detailed instructions on how to run these models on BELB can be found in the corresponding folders (e.g. benchmark/arboel).
