cedar33/bert_ner

bert_ner

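1. Theory overview

This is a solution to the NER task based on BERT and a BiLSTM+CRF layer. The BERT model and its TensorFlow implementation come from Google's GitHub repository; the BiLSTM+CRF part is adapted from Guillaume Genthial's code. For a detailed explanation of the theory, see the Zhihu column by 右左瓜子.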

2. Requirements

3. How to use

  • 1. Download the code and files listed in section 2, then put bilm_crf.py and bert_bilm_crf.py into Google's BERT directory, so that the two Python files sit at the same level as modeling.py.
  • 2. Run the training script:
python .\bert_bilm_crf.py --task_name=ner --do_train=true --do_eval=false --do_predict=false --data_dir=path\to\yourdata \
--vocab_file=path\to\chinese_L-12_H-768_A-12\vocab.txt --bert_config_file=path\to\chinese_L-12_H-768_A-12\bert_config.json \
--init_checkpoint=path\to\chinese_L-12_H-768_A-12\bert_model.ckpt --max_seq_length=50 --train_batch_size=32 \
--learning_rate=5e-5 --num_train_epochs=2.0 --output_dir=/tmp/ner_output/
  • 3. Export the trained model: run export.py
  • 4. Deploy on the server: run export.sh
  • 5. Example client call: see clien.py
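
The training command in step 2 mixes Windows-style paths with Unix-style line continuations, so it may need adjusting for your shell. As a minimal sketch, the same flags can be assembled in Python, which avoids shell quoting and path-separator differences. The helper name build_train_command is hypothetical and not part of this repository; the flag names and values are copied from the command above.

```python
import sys
from pathlib import Path


def build_train_command(bert_dir, data_dir, output_dir, script="bert_bilm_crf.py"):
    """Return the training command as an argument list (hypothetical helper)."""
    bert_dir = Path(bert_dir)
    # Flag names and default values mirror the command shown in step 2.
    flags = {
        "task_name": "ner",
        "do_train": "true",
        "do_eval": "false",
        "do_predict": "false",
        "data_dir": Path(data_dir),
        "vocab_file": bert_dir / "vocab.txt",
        "bert_config_file": bert_dir / "bert_config.json",
        "init_checkpoint": bert_dir / "bert_model.ckpt",
        "max_seq_length": 50,
        "train_batch_size": 32,
        "learning_rate": "5e-5",
        "num_train_epochs": 2.0,
        "output_dir": Path(output_dir),
    }
    return [sys.executable, script] + [f"--{name}={value}" for name, value in flags.items()]


if __name__ == "__main__":
    cmd = build_train_command(
        "path/to/chinese_L-12_H-768_A-12", "path/to/yourdata", "/tmp/ner_output"
    )
    # Print the command for inspection; to start training, pass the list to
    # subprocess.run(cmd, check=True) from inside the BERT directory.
    print(" ".join(cmd))
```

Building the command as a list rather than a single string means each path is passed as one argument even if it contains spaces.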

4. About the training data

example.tsv contains two sample rows that show the expected format. Convert your data into this format, or modify the input_fn in bert_bilm_crf.py to read your own format.
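
Before training, it can help to check that every row of your converted file is well formed. The sketch below is only an illustration: the layout it assumes (two tab-separated columns, space-separated tokens in the first and one tag per token in the second) is a guess and is not taken from example.tsv, so adapt the column handling to the actual file. The function name read_tagged_rows and the sample row are hypothetical.

```python
import csv
import io


def read_tagged_rows(handle):
    """Yield (tokens, tags) pairs from a TSV stream.

    ASSUMED layout (not taken from example.tsv): column 1 holds
    space-separated tokens, column 2 holds one space-separated tag per token.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_no, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(row)}")
        tokens, tags = row[0].split(), row[1].split()
        if len(tokens) != len(tags):
            raise ValueError(
                f"line {line_no}: {len(tokens)} tokens but {len(tags)} tags"
            )
        yield tokens, tags


# Hypothetical sample row in the assumed layout, for illustration only.
sample = "John lives in Paris\tB-PER O O B-LOC\n"
rows = list(read_tagged_rows(io.StringIO(sample)))
```

A mismatch between the number of tokens and tags raises a ValueError that names the offending line, which is usually easier to debug than a shape error during training.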


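Update: added a sentiment analysis method, bert_senta.py, and a prediction script, senta_pred.py. The data-loading code in senta_pred.py is commented out; add your own data-loading logic before using it. Prediction feeds inputs through a dataset-based data stream, and a single prediction takes about 10 ms.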

About

BERT-based NER using BiLSTM+CRF
