
NFLAT4NER

This is the code for the paper NFLAT: Non-Flat-Lattice Transformer for Chinese Named Entity Recognition.

Introduction

We propose a novel lexical enhancement method, InterFormer, which effectively reduces computational and memory costs by constructing a non-flat lattice. With InterFormer as the backbone, we implement NFLAT for Chinese NER. NFLAT decouples lexicon fusion from context feature encoding. Compared with FLAT, it avoids unnecessary "word-character" and "word-word" attention computations, which reduces memory usage by about 50% and allows more extensive lexicons or larger batch sizes during training.
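For intuition, the decoupling described above can be pictured as a cross-attention stage in which character representations query only the matched lexicon words, followed by ordinary self-attention over the characters alone. The PyTorch sketch below illustrates that fusion step; the module, tensor names, and head count are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class InterAttention(nn.Module):
    """Sketch of InterFormer-style lexicon fusion: character queries
    attend to matched-word keys/values, so no "word-word" or
    "word-character" attention is ever computed."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads)

    def forward(self, char_repr, word_repr, word_pad_mask=None):
        # char_repr: (num_chars, batch, d_model) character sequence
        # word_repr: (num_words, batch, d_model) matched lexicon words
        fused, _ = self.attn(char_repr, word_repr, word_repr,
                             key_padding_mask=word_pad_mask)
        return char_repr + fused  # residual: characters enriched with word info

# Afterwards, an ordinary Transformer encoder runs over the (shorter)
# character sequence only, which is where the memory saving over
# FLAT's concatenated "char + word" flat lattice comes from.
```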

Environment Requirements

The code has been tested under Python 3.7. The required packages are as follows:

torch==1.5.1
numpy==1.18.5
FastNLP==0.5.0
fitlog==0.3.2
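Assuming you manage packages with pip, the pinned versions above can be installed in one command (these older versions may require a matching Python 3.7 environment):

pip install torch==1.5.1 numpy==1.18.5 FastNLP==0.5.0 fitlog==0.3.2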

You can click here to learn more about FastNLP, and click here to learn more about fitlog.
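For orientation, a typical fitlog logging workflow looks roughly like the sketch below (assuming a project initialized with `fitlog init`; the logged values are placeholders, not this repository's actual training loop):

```python
import fitlog

fitlog.set_log_dir("logs/")                 # must be a fitlog-initialized dir
fitlog.add_hyper({"dataset": "weibo", "lr": 1e-3})  # record hyperparameters

for step in range(100):
    loss = 1.0 / (step + 1)                 # placeholder loss for illustration
    fitlog.add_loss(loss, name="loss", step=step)

fitlog.add_best_metric({"test": {"f1": 0.0}})  # record the best test metric
fitlog.finish()                             # mark this run as finished
```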

Example to Run the Code

  1. Download the pretrained character embeddings and word embeddings and put them in the data folder.

  2. Modify utils/paths.py to point to the pretrained embeddings and the datasets.

  3. Clip long sentences for the MSRA and OntoNotes datasets (a sketch of this step follows the list):

     python sentence_clip.py

  4. Merge the character embeddings and word embeddings:

     python char_word_mix.py

  5. Train and evaluate the model:
     • Weibo dataset: python main.py --dataset weibo
     • Resume dataset: python main.py --dataset resume
     • OntoNotes dataset: python main.py --dataset ontonotes
     • MSRA dataset: python main.py --dataset msra
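As a rough idea of what step 3 does: sentences longer than the model's length limit are split into shorter pieces, typically cutting after sentence-final punctuation so that entity spans stay intact. A minimal sketch follows; the `max_len` value and punctuation set are assumptions, not necessarily what sentence_clip.py uses:

```python
def clip_sentence(chars, labels, max_len=200, split_puncts="。！？；"):
    """Split one over-long (chars, labels) pair into chunks of at most
    max_len characters, preferring to cut just after punctuation."""
    chunks, start = [], 0
    while len(chars) - start > max_len:
        cut = start + max_len
        # search backwards for a punctuation mark to cut after
        for i in range(cut - 1, start, -1):
            if chars[i] in split_puncts:
                cut = i + 1
                break
        chunks.append((chars[start:cut], labels[start:cut]))
        start = cut
    chunks.append((chars[start:], labels[start:]))  # remaining tail
    return chunks
```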

Acknowledgements
