BioELMo is a biomedical version of embeddings from language model (ELMo), pre-trained on PubMed abstracts.

bioelmo

BioELMo is a biomedical version of Embeddings from Language Models (ELMo), pre-trained on PubMed abstracts. Pre-training uses 10M recent PubMed abstracts (2.46B tokens in total), and BioELMo achieves an averaged forward and backward perplexity of 31.37 on a held-out test set. As shown in our paper, BioELMo representations encode biomedical entity-type and relational information well.
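
As a rough illustration of what an averaged forward and backward perplexity means, the sketch below computes it from per-token log-probabilities. It assumes the two directions' mean negative log-likelihoods are averaged before exponentiating; taking the arithmetic mean of the two perplexities is another possible reading, and this README does not say which one was used. The function names are hypothetical and are not part of the BioELMo or bilm-tf code.

```python
import math


def perplexity(token_log_probs):
    """Perplexity of one direction: exp of the mean negative log-likelihood.

    token_log_probs holds natural-log probabilities, one per predicted token.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def averaged_perplexity(forward_log_probs, backward_log_probs):
    """Average the two directions' losses, then exponentiate (an assumption)."""
    fwd_nll = math.log(perplexity(forward_log_probs))
    bwd_nll = math.log(perplexity(backward_log_probs))
    return math.exp(0.5 * (fwd_nll + bwd_nll))


# Toy inputs: the forward model assigns probability 1/4 to every token,
# the backward model assigns 1/16.
forward = [math.log(0.25)] * 5
backward = [math.log(0.0625)] * 5
print(perplexity(forward), perplexity(backward))
print(averaged_perplexity(forward, backward))
```

Under this convention the result is the geometric mean of the two per-direction perplexities.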

Download Weights

You can use BioELMo as a fixed-feature extractor for downstream tasks using these weights:

Download TensorFlow Checkpoints

You can further fine-tune BioELMo on other corpora using the TensorFlow checkpoint. See this for details.

Usage

Please visit https://github.com/allenai/bilm-tf for usage instructions. BioELMo is used the same way as ELMo; only the pre-trained weights differ.
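
When ELMo-style embeddings are used as fixed features, a downstream model typically combines the biLM layers with softmax-normalized scalar weights and a scale factor, while the biLM weights stay frozen. The sketch below shows that combination in plain Python on toy vectors. It is only an illustration of the idea, not the bilm-tf implementation (which provides its own `weight_layers` helper); `scalar_mix`, `layer_logits`, and `gamma` are hypothetical names chosen here.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scalar_mix(layers, layer_logits, gamma=1.0):
    """Combine per-layer token vectors into one vector per token.

    layers:       list of layers, each a list of token vectors (lists of floats)
    layer_logits: one unnormalized weight per layer (learned downstream)
    gamma:        scalar that rescales the mixed vector
    """
    if len(layers) != len(layer_logits):
        raise ValueError("need exactly one logit per layer")
    weights = softmax(layer_logits)
    n_tokens = len(layers[0])
    dim = len(layers[0][0])
    mixed = []
    for t in range(n_tokens):
        vec = [
            gamma * sum(w * layer[t][d] for w, layer in zip(weights, layers))
            for d in range(dim)
        ]
        mixed.append(vec)
    return mixed


# Toy example: 3 layers, 2 tokens, 2 dimensions (real embeddings are far larger).
toy_layers = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[4.0, 5.0], [6.0, 7.0]],
    [[7.0, 8.0], [9.0, 10.0]],
]
# Equal logits give a plain average of the layers.
print(scalar_mix(toy_layers, [0.0, 0.0, 0.0]))
```

In a real pipeline the layer activations would come from the BioELMo weights loaded through bilm-tf, and the logits and scale would be trained with the downstream task.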

Probing Experiments

Please visit https://github.com/Andy-jqa/probing_biomed_embeddings (currently under construction) for the code of the probing experiments described in our paper.

Citation

Please cite the following paper if you use BioELMo:

@inproceedings{jin2019probing,
  title={Probing Biomedical Embeddings from Language Models},
  author={Jin, Qiao and Dhingra, Bhuwan and Cohen, William and Lu, Xinghua},
  booktitle={Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP},
  pages={82--89},
  year={2019}
}