Code for our ICASSP 2021 paper: "Improving NER in Social Media via Entity Type-Compatible Unknown Word Substitution"
Ubuntu 16.04.1 LTS (Release 16.04)
NVIDIA Tesla V100
Python 3.6.9
pip install flair==0.5.1
pip install transformers==3.2.0
pip install torch==1.6.0 torchvision==0.7.0
Or install all dependencies at once with:
pip install -r requirements.txt
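To confirm that the environment matches the pinned versions above, a small check along these lines may help. It is only a sketch: `PINNED`, `installed_version`, and `find_mismatches` are illustrative names that are not part of this repository, and it assumes each package exposes a `__version__` attribute.

```python
import importlib

# Versions pinned in the install commands above
PINNED = {
    "flair": "0.5.1",
    "transformers": "3.2.0",
    "torch": "1.6.0",
    "torchvision": "0.7.0",
}


def installed_version(name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except Exception:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose version differs to (expected, found)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Some builds append a local suffix such as "+cu101"; ignore it
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `find_mismatches(PINNED)` returns an empty dict when every package matches its pin, and otherwise lists the expected and found versions.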
Systems are evaluated with a modified version of conlleval.py provided by the WNUT-17 committee.
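For readers unfamiliar with conlleval-style scoring, the sketch below illustrates the general idea of entity-level precision, recall, and F1 over BIO tags. It is not the WNUT-17 script and does not reproduce its modifications; `extract_spans` and `entity_f1` are illustrative names.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in a BIO sequence."""
    spans = set()
    start, ent_type = None, None
    # A trailing "O" sentinel closes any span still open at the end
    for i, tag in enumerate(list(tags) + ["O"]):
        prefix, _, label = tag.partition("-")
        # A span begins at B-, or at an I- that does not continue the open span
        begins = prefix == "B" or (prefix == "I" and label != ent_type)
        if start is not None and (prefix == "O" or begins):
            spans.add((start, i, ent_type))
            start, ent_type = None, None
        if begins:
            start, ent_type = i, label
    return spans


def entity_f1(gold_sentences, pred_sentences):
    """Exact-match entity precision, recall and F1 over parallel sentences."""
    correct = n_gold = n_pred = 0
    for gold, pred in zip(gold_sentences, pred_sentences):
        gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
        correct += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = [["B-person", "I-person", "O", "B-location"]]
pred = [["B-person", "I-person", "O", "O"]]
precision, recall, f1 = entity_f1(gold, pred)  # precision 1.0, recall 0.5
```

An entity counts as correct only when both its boundaries and its type match the gold annotation.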
The datasets can be downloaded from https://noisy-text.github.io/2017/emerging-rare-entities.html
The crawl vocabulary is extracted from the fastText Crawl word vectors.
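One possible way to pull such a vocabulary out of a fastText text-format vector file is sketched below. It assumes the usual `.vec` layout (a header line with the word count and dimension, then one word per line followed by its vector); `read_vec_vocab` is an illustrative name, and the extraction actually used in this repository may differ.

```python
import io


def read_vec_vocab(handle, limit=None):
    """Collect the words from a fastText-style .vec text stream."""
    handle.readline()  # skip the "<word count> <dimension>" header line
    vocab = []
    for line in handle:
        if limit is not None and len(vocab) >= limit:
            break
        # The word is everything before the first space on the line
        word = line.rstrip("\n").split(" ", 1)[0]
        if word:
            vocab.append(word)
    return vocab


# Tiny in-memory stand-in for a real .vec file
sample = io.StringIO("2 3\nhello 0.1 0.2 0.3\nworld 0.4 0.5 0.6\n")
print(read_vec_vocab(sample))  # ['hello', 'world']
```

For a real file, pass an open handle such as `open(path, encoding="utf-8")`, where `path` points to the downloaded vectors.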
The POS tagging model can be downloaded from https://nlp.informatik.hu-berlin.de/resources/models/upos/en-pos-ontonotes-v0.4.pt
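Once downloaded, the model can be loaded with flair's `SequenceTagger`. The snippet below is a minimal sketch, not the loading code used by the scripts in this repository: `load_pos_tagger` and `pos_tag` are illustrative helpers, and the local file path is whatever location you saved the model to.

```python
from pathlib import Path


def load_pos_tagger(model_path):
    """Load a downloaded flair POS model from a local file."""
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"POS model not found: {path}")
    # Imported lazily so the path check works even before flair is installed
    from flair.models import SequenceTagger
    return SequenceTagger.load(str(path))


def pos_tag(tagger, text):
    """Tag one sentence and return it as a string with inline tags."""
    from flair.data import Sentence
    sentence = Sentence(text)
    tagger.predict(sentence)
    return sentence.to_tagged_string()
```

For example, `pos_tag(load_pos_tagger("en-pos-ontonotes-v0.4.pt"), "I love Berlin")` returns the sentence with a tag attached to each token, assuming the file sits in the working directory.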
python ner/ner_wordEmbedding.py
python ner/ner_bert.py
python etc_main.py
python eval.py
python embedding/vec2bin.py