Dataset: https://www.clips.uantwerpen.be/conll2000/chunking/
connlleval.py: https://github.com/sighsmile/conlleval/blob/master/conlleval.py
- Download and unzip the dataset.
- Set the dataset path in `preprocessing.py`.
- Set the hyperparameters in `train.py`.
- Change any other settings you want to change.
- Run `python train.py`. If you want to train with BERT, run `python bert_train.py`.
- Set the model path and run `python pred.py` or `python bert_train.py`.
- Run `python connlleval.py`.
- You can monitor or check the loss curve with TensorBoard.
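For orientation before setting the dataset path, here is a minimal sketch of reading the CoNLL-2000 chunking format, where each line holds a word, its POS tag, and its chunk tag, and a blank line separates sentences. The function name `read_conll2000` and the inline sample are illustrative assumptions and are not taken from `preprocessing.py`.

```python
from typing import Iterable, List, Tuple

# One token is (word, POS tag, chunk tag); a sentence is a list of tokens.
Sentence = List[Tuple[str, str, str]]


def read_conll2000(lines: Iterable[str]) -> List[Sentence]:
    """Group 'word POS chunk' lines into sentences split on blank lines.

    Hypothetical helper for illustration; not the repository's own loader.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        word, pos, chunk = line.split()
        current.append((word, pos, chunk))
    if current:
        # The last sentence may not be followed by a blank line.
        sentences.append(current)
    return sentences


# Small inline sample in the dataset's three-column format.
sample = """Confidence NN B-NP
in IN B-PP
the DT B-NP
pound NN I-NP

He PRP B-NP
reckons VBZ B-VP
""".splitlines()

sentences = read_conll2000(sample)
```

To read a real file, pass an open file handle instead of `sample`, for example `read_conll2000(open(path, encoding="utf-8"))`, where `path` points at the unzipped training or test file.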
trained models: https://drive.google.com/drive/folders/11uyVskbp9oLQVsj7A5lfIsKhhPJOoFKr?usp=sharing
current score
- LSTM-crf: 92.6
- LSTM-w2v-crf: 92.49
- BERT-crf: 96.38 (current output.txt)
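
Assuming the scores above are the chunk-level F1 that conlleval-style evaluation reports, the following is a rough sketch of how such a score is computed for one tag sequence. It is a simplified illustration, not the code in `connlleval.py`; the names `extract_chunks` and `chunk_f1` are hypothetical.

```python
from typing import List, Set, Tuple


def extract_chunks(tags: List[str]) -> Set[Tuple[str, int, int]]:
    """Return chunks as (type, start, end) with end exclusive, from IOB2 tags."""
    chunks: Set[Tuple[str, int, int]] = set()
    start, ctype = None, None
    # The trailing "O" is a sentinel that closes a chunk ending at the last tag.
    for i, tag in enumerate(tags + ["O"]):
        prefix, _, label = tag.partition("-")
        # An open chunk ends unless this tag is an I- tag of the same type.
        if ctype is not None and (prefix != "I" or label != ctype):
            chunks.add((ctype, start, i))
            start, ctype = None, None
        # A B- tag, or an I- tag with no open chunk, starts a new chunk.
        if prefix == "B" or (prefix == "I" and ctype is None):
            start, ctype = i, label
    return chunks


def chunk_f1(gold: List[str], pred: List[str]) -> float:
    """F1 over exactly matching chunks (same type and same boundaries)."""
    gold_chunks = extract_chunks(gold)
    pred_chunks = extract_chunks(pred)
    correct = len(gold_chunks & pred_chunks)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_chunks)
    recall = correct / len(gold_chunks)
    return 2 * precision * recall / (precision + recall)


gold = ["B-NP", "I-NP", "B-VP", "B-NP", "O"]
pred = ["B-NP", "I-NP", "B-VP", "I-VP", "O"]
score = chunk_f1(gold, pred)  # one of three gold chunks is matched exactly
```

A chunk counts as correct only when both its type and its boundaries match, so a prediction that gets most tokens right can still lose a whole chunk. This is why chunk-level F1 is usually lower than token-level accuracy.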