T-NER is a Python tool for language model finetuning on named entity recognition (NER), implemented in PyTorch and available via pip. It provides an easy interface to finetune models and to test them on cross-domain and multilingual datasets. T-NER currently integrates nine publicly available NER datasets and makes it easy to add custom datasets. All models finetuned with T-NER can be deployed on our web app for visualization.
**Paper Accepted:** Our paper demonstrating T-NER has been accepted to EACL 2021 🎉 paper link.
**Pretrained Models:** We release 46 XLM-RoBERTa models finetuned for NER on the HuggingFace transformers model hub; see here for more details and model cards.
- Setup
- Web API
- Pretrained Models
- Model Finetuning
- Model Evaluation
- Model Inference
- Datasets
- Reference
| Description | Link |
|---|---|
| Model Finetuning | |
| Model Evaluation | |
| Model Prediction | |
| Multilingual NER Workflow | |
Install the pip package:

```shell
pip install tner
```

or install directly from the repository for the latest version:

```shell
pip install git+https://github.com/asahi417/tner
```
To start the web app, first clone the repository:

```shell
git clone https://github.com/asahi417/tner
cd tner
```

then launch the server:

```shell
uvicorn app:app --reload --log-level debug --host 0.0.0.0 --port 8000
```

and open http://0.0.0.0:8000 in your browser once ready.
You can specify the model to deploy via the environment variable `NER_MODEL`, which defaults to `asahi417/tner-xlm-roberta-large-ontonotes5`. `NER_MODEL` can be either a path to your local model checkpoint directory or a model name on the transformers model hub.
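For example, to serve a locally finetuned checkpoint instead of the default model (a minimal sketch; the checkpoint path below is hypothetical):

```shell
# point the app at a local checkpoint directory (hypothetical path)
export NER_MODEL='./ckpt_tner'
uvicorn app:app --reload --log-level debug --host 0.0.0.0 --port 8000
```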
**Acknowledgement:** The app interface is heavily inspired by this repository.
Language model finetuning on NER can be done with a few lines:

```python
import tner
trainer = tner.TrainTransformersNER(checkpoint_dir='./ckpt_tner', dataset="data-name", transformers_model="transformers-model")
trainer.train()
```
where `transformers_model` is a pre-trained model name from the transformers model hub and `dataset` is a dataset alias or a path to a custom dataset, as explained in the dataset section. Model files will be generated at `checkpoint_dir`, and they can be uploaded to the transformers model hub without any changes.
To show validation accuracy at the end of each epoch:

```python
trainer.train(monitor_validation=True)
```
To tune training parameters such as batch size, number of epochs, or learning rate, please take a look at the argument description.
**Train on multiple datasets:** The model can be trained on a concatenation of multiple datasets by providing a list of dataset names.

```python
trainer = tner.TrainTransformersNER(checkpoint_dir='./ckpt_merged', dataset=["ontonotes5", "conll2003"], transformers_model="xlm-roberta-base")
```

Custom datasets can also be added to the list, e.g. `dataset=["ontonotes5", "./examples/custom_data_sample"]`.
**Command line tool:** Finetune models from the command line:

```shell
tner-train [-h] [-c CHECKPOINT_DIR] [-d DATA] [-t TRANSFORMER] [-b BATCH_SIZE] [--max-grad-norm MAX_GRAD_NORM] [--max-seq-length MAX_SEQ_LENGTH] [--random-seed RANDOM_SEED] [--lr LR] [--total-step TOTAL_STEP] [--warmup-step WARMUP_STEP] [--weight-decay WEIGHT_DECAY] [--fp16] [--monitor-validation] [--lower-case]
```
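For instance, the Python example above corresponds roughly to the following invocation (flags taken from the synopsis above; the batch size and learning rate values are only illustrative):

```shell
# finetune xlm-roberta-base on OntoNotes 5 with validation monitoring (example values)
tner-train -c ./ckpt_tner -d ontonotes5 -t xlm-roberta-base -b 32 --lr 1e-5 --monitor-validation
```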
Evaluation of NER models can be done easily, in both in-domain and out-of-domain settings.
```python
import tner
trainer = tner.TrainTransformersNER(checkpoint_dir='path-to-checkpoint', transformers_model="language-model-name")
trainer.test(test_dataset='data-name')
```
**Entity span prediction:** For a better understanding of out-of-domain accuracy, we provide an entity span prediction pipeline, which ignores the entity type and computes metrics only on the IOB entity positions.
```python
trainer.test(test_dataset='data-name', entity_span_prediction=True)
```
**Command line tool:** Model evaluation from the command line:

```shell
tner-test [-h] -c CHECKPOINT_DIR [--lower-case] [--test-data TEST_DATA] [--test-lower-case] [--test-entity-span]
```
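For example, to evaluate a finetuned model on CoNLL 2003 (the checkpoint path is hypothetical):

```shell
# in-domain or out-of-domain evaluation against a dataset alias
tner-test -c ./ckpt_tner --test-data conll2003
```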
If you just want predictions from a finetuned NER model, this is the best option for you.
```python
import tner
classifier = tner.TransformersNER('transformers-model')
test_sentences = [
    'I live in United States, but Microsoft asks me to move to Japan.',
    'I have an Apple computer.',
    'I like to eat an apple.'
]
classifier.predict(test_sentences)
```
**Command line tool:** Model inference from the command line:

```shell
tner-predict [-h] [-c CHECKPOINT]
```
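For example (hypothetical checkpoint path):

```shell
# run inference with a local checkpoint
tner-predict -c ./ckpt_tner
```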
Public datasets that can be fetched with T-NER are summarized below. Please cite the corresponding reference if you use any of these datasets.
| Name (alias) | Genre | Language | Entity types | Data size (train/valid/test) | Note |
|---|---|---|---|---|---|
| OntoNotes 5 (`ontonotes5`) | News, Blog, Dialogue | English | 18 | 59,924/8,582/8,262 | |
| CoNLL 2003 (`conll2003`) | News | English | 4 | 14,041/3,250/3,453 | |
| WNUT 2017 (`wnut2017`) | SNS | English | 6 | 1,000/1,008/1,287 | |
| FIN (`fin`) | Finance | English | 4 | 1,164/-/303 | |
| BioNLP 2004 (`bionlp2004`) | Chemical | English | 5 | 18,546/-/3,856 | |
| BioCreative V CDR (`bc5cdr`) | Medical | English | 2 | 5,228/5,330/5,865 | split into sentences to reduce sequence length |
| WikiAnn (`panx_dataset/en`, `panx_dataset/ja`, etc.) | Wikipedia | 282 languages | 3 | 20,000/10,000/10,000 | |
| Japanese Wikipedia (`wiki_ja`) | Wikipedia | Japanese | 8 | -/-/500 | test set only |
| Japanese WikiNews (`wiki_news_ja`) | Wikipedia | Japanese | 10 | -/-/1,000 | test set only |
| MIT Restaurant (`mit_restaurant`) | Restaurant review | English | 8 | 7,660/-/1,521 | lower-cased |
| MIT Movie (`mit_movie_trivia`) | Movie review | English | 12 | 7,816/-/1,953 | lower-cased |
To take a closer look into each dataset, one can use `tner.get_dataset_ner`:

```python
import tner
data, label_to_id, language, unseen_entity_set = tner.get_dataset_ner('data-name')
```

where `data` has the following structure:
```python
{
    'train': {
        'data': [
            ['@paulwalk', 'It', "'s", 'the', 'view', 'from', 'where', 'I', "'m", 'living', 'for', 'two', 'weeks', '.', 'Empire', 'State', 'Building', '=', 'ESB', '.', 'Pretty', 'bad', 'storm', 'here', 'last', 'evening', '.'],
            ['From', 'Green', 'Newsfeed', ':', 'AHFA', 'extends', 'deadline', 'for', 'Sage', 'Award', 'to', 'Nov', '.', '5', 'http://tinyurl.com/24agj38'], ...
        ],
        'label': [
            [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
            [0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], ...
        ]
    },
    'valid': ...
}
```
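The labels are integer ids. As a minimal sketch, they can be mapped back to tag strings by inverting `label_to_id` (assuming it maps tag strings to integer ids, as the name suggests):

```python
# invert the tag-to-id mapping and pair each token with its tag string
id_to_label = {i: t for t, i in label_to_id.items()}
tokens = data['train']['data'][0]
tags = [id_to_label[i] for i in data['train']['label'][0]]
print(list(zip(tokens, tags)))
```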
To go beyond the public datasets, you can use your own dataset by formatting it into the IOB format described in the CoNLL 2003 NER shared task paper: each data file contains one word per line, with empty lines representing sentence boundaries. At the end of each line there is a tag which states whether the current word is inside a named entity or not; the tag also encodes the type of named entity. Here is an example sentence:
```
EU B-ORG
rejects O
German B-MISC
call O
to O
boycott O
British B-MISC
lamb O
. O
```
Words tagged with O are outside of named entities and the I-XXX tag is used for words inside a
named entity of type XXX. Whenever two entities of type XXX are immediately next to each other, the
first word of the second entity will be tagged B-XXX in order to show that it starts another entity.
The custom dataset should have `train.txt` and `valid.txt` files in the same folder. Please take a look at the sample custom data.
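Once formatted, the custom dataset can be used for finetuning by passing its folder path as the dataset argument (a minimal sketch reusing the sample path from above; the checkpoint directory is arbitrary):

```python
import tner

# finetune on a custom dataset folder containing train.txt and valid.txt
trainer = tner.TrainTransformersNER(
    checkpoint_dir='./ckpt_custom',
    dataset='./examples/custom_data_sample',
    transformers_model='xlm-roberta-base'
)
trainer.train()
```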
If you use any of these resources, please cite the following paper:
```bibtex
@inproceedings{ushio-camacho-collados-2021-ner,
    title = "{T}-{NER}: An All-Round Python Library for Transformer-based Named Entity Recognition",
    author = "Ushio, Asahi and
      Camacho-Collados, Jose",
    booktitle = "Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations",
    month = apr,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.eacl-demos.7",
    pages = "53--62",
    abstract = "Language model (LM) pretraining has led to consistent improvements in many NLP downstream tasks, including named entity recognition (NER). In this paper, we present T-NER (Transformer-based Named Entity Recognition), a Python library for NER LM finetuning. In addition to its practical utility, T-NER facilitates the study and investigation of the cross-domain and cross-lingual generalization ability of LMs finetuned on NER. Our library also provides a web app where users can get model predictions interactively for arbitrary text, which facilitates qualitative model evaluation for non-expert programmers. We show the potential of the library by compiling nine public NER datasets into a unified format and evaluating the cross-domain and cross-lingual performance across the datasets. The results from our initial experiments show that in-domain performance is generally competitive across datasets. However, cross-domain generalization is challenging even with a large pretrained LM, which has nevertheless capacity to learn domain-specific features if fine-tuned on a combined dataset. To facilitate future research, we also release all our LM checkpoints via the Hugging Face model hub.",
}
```