Shuailong/bilm-tf
Use ELMo as a Language Model

Purpose

Run a pretrained ELMo model to compute single-sentence perplexity.
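As background, sentence perplexity is derived from the model's per-token log-probabilities. A minimal sketch (using hypothetical natural-log probabilities, not output from this repo) of both the unnormalized variant reported by this project and the length-normalized variant:

```python
import math

def sentence_perplexity(token_logprobs):
    """Compute perplexity from per-token natural-log probabilities.

    Returns (unnormalized, normalized): the unnormalized form is
    exp(-total log-prob) and grows with sentence length; the
    normalized form divides by the token count first.
    """
    total = sum(token_logprobs)
    unnormalized = math.exp(-total)
    normalized = math.exp(-total / len(token_logprobs))
    return unnormalized, normalized

# Hypothetical log-probabilities for a 3-token sentence
unnorm, norm = sentence_perplexity([-2.0, -1.5, -2.5])
```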

Modified from AllenAI's bilm-tf.

Installation

pip install tensorflow-gpu==1.2 h5py
python setup.py install

Run Evaluation

  1. Data file format: each line in the file is one sentence whose perplexity will be calculated. data

  2. Split the data file into pieces, one sentence per piece:

cd data
split sents.txt -d -l 1 -a 4 cs

  3. Run the evaluation script:

sh evaluate.sh

  4. The perplexity scores are printed to stdout:
...
5946: 129.57085
5947: 1412.2032
5948: 5172.711
5949: 2126.5542
...

Each line shows the sentence line number followed by its perplexity (not normalized by sentence length).
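If you want to post-process these scores, the output above can be parsed line by line; a small sketch, assuming the exact `line_number: score` format shown:

```python
def parse_scores(lines):
    """Parse 'line_number: perplexity' pairs from the evaluation output."""
    scores = {}
    for line in lines:
        idx, _, value = line.partition(":")
        scores[int(idx)] = float(value)
    return scores

output = ["5946: 129.57085", "5947: 1412.2032"]
scores = parse_scores(output)
```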

Finetune

To finetune ELMo on an additional corpus, first download the pretrained model to models/.

The tensorflow checkpoint is available by downloading these files:

vocabulary checkpoint options 1 2 3

|--models
    |--vocab-2016-09-10.txt
    |--checkpoint
        |--checkpoint
        |--options.json
        |--model.ckpt-935588.meta
        |--model.ckpt-935588.index
        |--model.ckpt-935588.data-00000-of-00001
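Before launching finetuning, it can help to confirm the layout above is in place. A hypothetical helper (`missing_files` is not part of this repo) that checks for the expected files:

```python
from pathlib import Path

def missing_files(models_dir):
    """Return the expected pretrained-model files that are absent."""
    models_dir = Path(models_dir)
    ckpt = models_dir / "checkpoint"
    required = [
        models_dir / "vocab-2016-09-10.txt",
        ckpt / "checkpoint",
        ckpt / "options.json",
        ckpt / "model.ckpt-935588.meta",
        ckpt / "model.ckpt-935588.index",
        ckpt / "model.ckpt-935588.data-00000-of-00001",
    ]
    return [str(p) for p in required if not p.exists()]

# Report anything still missing from models/
if missing_files("models"):
    print("Missing:", missing_files("models"))
```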

Then use the following script.

sh finetune.sh

After finetuning the model, you can run the evaluation again to see the effect of finetuning.

About

SenMaking ELMo baseline