Named Entity Recognition
Named entity recognition (NER) is a fundamental task that aims to identify named entities in raw text and assign them pre-defined categorical tags such as PER (Person), ORG (Organization), LOC (Location), etc.
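To make the task concrete, the following minimal sketch groups token-level tags into entity spans. It assumes the common BIO tagging scheme (B- begins an entity, I- continues it, O is outside any entity); the scheme, the function name `bio_to_spans`, and the sample sentence are illustrative assumptions, not taken from this project.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    spans = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type
        )
        if starts_new:
            # A B- tag (or a stray I- tag of a different type) opens a new entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-"):
            # Continue the entity that is currently open.
            current_tokens.append(token)
        else:
            # An "O" tag closes any open entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


tokens = ["Barack", "Obama", "visited", "Google", "in", "California", "."]
tags = ["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC", "O"]
print(bio_to_spans(tokens, tags))
# [('Barack Obama', 'PER'), ('Google', 'ORG'), ('California', 'LOC')]
```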
This is an implementation of the paper "Attention-based Multi-level Feature Fusion for Named Entity Recognition". Intuitively, multi-level features help when recognizing named entities in complex sentences. The paper proposes a framework called attention-based multi-level feature fusion (AMFF), which captures multi-level features from different perspectives to improve NER.
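As a rough illustration of the general idea of attention-based fusion, the NumPy sketch below weights several per-token feature matrices with a softmax over dot-product scores and sums them. This is a generic attention-weighted sum, not the AMFF architecture from the paper; the feature levels, the `query` vector, and all names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def attention_fuse(features, query):
    """Fuse feature levels with per-token attention weights.

    features: list of arrays, each of shape (seq_len, dim), one per level.
    query:    array of shape (dim,), a hypothetical learned scoring vector.
    Returns (fused, weights) with shapes (seq_len, dim) and (seq_len, n_levels).
    """
    stacked = np.stack(features, axis=1)      # (seq_len, n_levels, dim)
    scores = stacked @ query                  # (seq_len, n_levels)
    weights = softmax(scores, axis=-1)        # weights sum to 1 over levels
    fused = np.sum(weights[..., None] * stacked, axis=1)
    return fused, weights


rng = np.random.default_rng(0)
seq_len, dim = 5, 8
# Three hypothetical feature levels for the same 5-token sentence.
levels = [rng.normal(size=(seq_len, dim)) for _ in range(3)]
query = rng.normal(size=dim)

fused, weights = attention_fuse(levels, query)
print(fused.shape, weights.shape)  # (5, 8) (5, 3)
```

Each token receives its own weighting over the levels, so the fused representation can lean on different levels for different tokens.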
Requirements
- Ubuntu
- Python 3.6.9+
- TensorFlow-gpu 1.13.1
- CUDA 10+
- pathlib
- numpy
- json
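A quick way to sanity-check an environment against this list is sketched below. It only tests the Python version and whether each module can be found; it does not verify the TensorFlow or CUDA versions. The helper name `check_environment` is hypothetical, and the sketch assumes the TensorFlow-gpu package is importable as `tensorflow`.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6, 9)
# Import names for the packages listed above.
REQUIRED_MODULES = ["tensorflow", "numpy", "pathlib", "json"]


def check_environment(min_python=MIN_PYTHON, modules=REQUIRED_MODULES):
    """Return a dict mapping each requirement to True (satisfied) or False."""
    report = {"python_ok": sys.version_info[:3] >= min_python}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for item, ok in check_environment().items():
    print(f"{item}: {'ok' if ok else 'MISSING'}")
```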
Datasets
Usage
- Switch to the corresponding virtual environment and install the tf_metrics package, which provides the metrics used for NER:
pip install git+https://github.com/guillaumegenthial/tf_metrics.git
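For intuition about the kind of scores reported for NER, here is a minimal pure-Python sketch of token-level micro-averaged precision, recall, and F1 that ignores the "O" tag. It does not use or reproduce the tf_metrics API; the function name `micro_prf` and the sample tags are assumptions made for illustration.

```python
def micro_prf(gold, pred, negative="O"):
    """Token-level micro precision, recall, and F1 over all non-negative tags."""
    # A true positive is a token whose predicted entity tag equals the gold tag.
    tp = sum(1 for g, p in zip(gold, pred) if g == p and g != negative)
    n_pred = sum(1 for p in pred if p != negative)  # predicted entity tokens
    n_gold = sum(1 for g in gold if g != negative)  # gold entity tokens
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = ["B-PER", "I-PER", "O", "B-ORG", "O"]
pred = ["B-PER", "O", "O", "B-ORG", "B-LOC"]
precision, recall, f1 = micro_prf(gold, pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this example two of the three predicted entity tokens are correct and two of the three gold entity tokens are recovered, so all three scores equal 2/3.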
- Put the dataset into the corresponding directory, e.g., data/sample, and preprocess it into the CoNLL format.
- Get the pre-trained GloVe word embeddings and put them into data/sample.
Switch to ../data/sample, run preprocess.py, and make sure the first line of the output file is not blank:
python preprocess.py
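The blank-first-line check can be automated with a few lines of Python. The sketch below also includes a small reader for a CoNLL-style file, assuming one token per line with the tag in the last whitespace-separated column and a blank line between sentences; the exact layout produced by preprocess.py is not documented here, so treat the file layout, the function names, and the sample data as assumptions.

```python
from pathlib import Path
import tempfile


def first_line_is_blank(path):
    """Return True if the file is empty or its first line is only whitespace."""
    with Path(path).open(encoding="utf-8") as f:
        return f.readline().strip() == ""


def read_conll(path):
    """Read sentences as lists of (token, tag) pairs from a CoNLL-style file."""
    sentences, current = [], []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                # A blank line ends the current sentence.
                if current:
                    sentences.append(current)
                    current = []
                continue
            parts = line.split()
            current.append((parts[0], parts[-1]))  # first column: token, last: tag
    if current:
        sentences.append(current)
    return sentences


# Demo on a temporary file so the sketch runs without the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.conll"
    sample.write_text("John B-PER\nlives O\nin O\nParis B-LOC\n\nHello O\n", encoding="utf-8")
    print(first_line_is_blank(sample))  # False
    print(read_conll(sample))
```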
Decompress GloVe:
unzip glove.840B.300d.zip -d glove.840B.300d.txt
rm glove.840B.300d.zip
Build the vocabulary and the GloVe embeddings:
python build_vocab.py
python build_glove.py
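Conceptually, these two steps collect the words that occur in the data and then keep only the GloVe vectors for those words. The sketch below shows one way this could be done; it is not the code in build_vocab.py or build_glove.py, and the function names, the `min_count` parameter, and the tiny 3-dimensional sample file are hypothetical. It relies only on the GloVe text format, where each line holds a word followed by its space-separated vector values.

```python
from collections import Counter
from pathlib import Path
import tempfile

import numpy as np


def build_vocab(sentences, min_count=1):
    """Return a sorted list of words that occur at least min_count times."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return sorted(word for word, count in counts.items() if count >= min_count)


def build_embeddings(vocab, glove_path, dim):
    """Build a (len(vocab), dim) matrix; words missing from GloVe stay zero."""
    word_to_idx = {word: i for i, word in enumerate(vocab)}
    embeddings = np.zeros((len(vocab), dim), dtype=np.float32)
    found = 0
    with Path(glove_path).open(encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            # Skip words outside the vocab and lines with an unexpected length.
            if word in word_to_idx and len(values) == dim:
                embeddings[word_to_idx[word]] = [float(v) for v in values]
                found += 1
    return embeddings, found


# Demo with a tiny GloVe-style file instead of the real embeddings.
with tempfile.TemporaryDirectory() as tmp:
    glove_file = Path(tmp) / "tiny_glove.txt"
    glove_file.write_text(
        "paris 0.1 0.2 0.3\njohn 0.4 0.5 0.6\nunused 0.7 0.8 0.9\n",
        encoding="utf-8",
    )
    vocab = build_vocab([["john", "lives", "in", "paris"]])
    embeddings, found = build_embeddings(vocab, glove_file, dim=3)
    print(vocab)                    # ['in', 'john', 'lives', 'paris']
    print(embeddings.shape, found)  # (4, 3) 2
```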
- Switch to the root directory and get started with main_amff.py:
python main_amff.py
References
- SciBERT: A Pretrained Language Model for Scientific Text, Beltagy et al.
- CollaboNet: Collaboration of Deep Neural Networks for Biomedical Named Entity Recognition, Yoon et al.
- Contextual String Embeddings for Sequence Labeling, Akbik et al.
- Neural Architectures for Named Entity Recognition, Lample et al.
- End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF, Ma et al.
- Named Entity Recognition with Bidirectional LSTM-CNNs, Chiu et al.
- Bidirectional LSTM-CRF Models for Sequence Tagging, Huang et al.
- Sequence-tagging-with-tensorflow, guillaumegenthial.
...
Citation
Please cite:
@InProceedings{yang2020AMFF,
title = {Attention-based Multi-level Feature Fusion for Named Entity Recognition},
author = {Zhiwei Yang and Hechang Chen and Jiawei Zhang and Jing Ma and Yi Chang},
booktitle = {IJCAI},
year = {2020},
}