BERT-NER

Use Google BERT to do CoNLL-2003 NER!

This project implements NER on top of Google's BERT code.

First, clone Google's BERT repository: git clone https://github.com/google-research/bert.git

Second, download the files in this project and arrange them as follows:

    BERT
    |____ bert
    |____ BERT_NER.py
    |____ checkpoint
    |____ output
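
The NERdata folder holds the CoNLL-2003 data files. For reference, the standard CoNLL-2003 format has one token per line with four space-separated columns (token, POS tag, chunk tag, NER tag) and a blank line between sentences; a sample in this format:

    EU NNP B-NP B-ORG
    rejects VBZ B-VP O
    German JJ B-NP B-MISC
    call NN B-NP O
    to TO B-VP O
    boycott VB I-VP O
    British JJ B-NP B-MISC
    lamb NN B-NP O
    . . O O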

Third, run:

    python BERT_NER.py \
        --task_name="NER" \
        --do_train=True \
        --do_eval=True \
        --do_predict=True \
        --data_dir=NERdata \
        --vocab_file=checkpoint/vocab.txt \
        --bert_config_file=checkpoint/bert_config.json \
        --init_checkpoint=checkpoint/bert_model.ckpt \
        --max_seq_length=128 \
        --train_batch_size=32 \
        --learning_rate=2e-5 \
        --num_train_epochs=3.0 \
        --output_dir=./output/result_dir/

Result:

The predicted results are placed in the folder ./output/result_dir/. It contains two files: token_test.txt holds the tokens and label_test.txt holds the predicted label for each token. For a more accurate evaluation, use the conlleval.pl script.
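
As a rough sketch (assuming the two files are line-aligned, one whitespace-separated sentence per line; this layout is an assumption, not verified against the code), the outputs can be paired back into token/label lines. Note that conlleval.pl scores lines whose last two columns are the gold and predicted tags, so you would also merge in the gold labels from the test set:

    # Sketch: line up each predicted label with its token.
    # Assumes token_test.txt and label_test.txt are line-aligned,
    # one whitespace-separated sentence per line (unverified).
    with open("output/result_dir/token_test.txt") as tok_f, \
            open("output/result_dir/label_test.txt") as lab_f:
        for tok_line, lab_line in zip(tok_f, lab_f):
            for token, label in zip(tok_line.split(), lab_line.split()):
                print(token, label)
            print()  # blank line between sentences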

Note that the evaluation results reported here differ from the official CoNLL-2003 evaluation.

Note: I have not modified any of the model's parameters; all values are the BERT defaults. You can tune them yourself to find better parameters for this task.

The F1-score evaluation code comes from https://github.com/guillaumegenthial/tf_metrics/blob/master/tf_metrics/__init__.py.
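
For orientation, tf_metrics follows the (value, update_op) convention of TF 1.x's tf.metrics; a minimal sketch of computing a micro-averaged F1 score with it (the label ids and class count below are made up for illustration):

    import tensorflow as tf  # TF 1.x, as used by the original BERT code
    import tf_metrics

    # Hypothetical label ids; index 0 stands for the "O" tag here.
    labels = tf.constant([[1, 2, 0], [3, 0, 0]])
    predictions = tf.constant([[1, 2, 0], [1, 0, 0]])

    # Micro-averaged F1 over the entity classes only (pos_indices).
    f1, f1_update = tf_metrics.f1(labels, predictions, num_classes=4,
                                  pos_indices=[1, 2, 3], average="micro")

    with tf.Session() as sess:
        sess.run(tf.local_variables_initializer())
        sess.run(f1_update)
        print(sess.run(f1))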

Reference: