CFIGER: Chinese Fine-Grained entity typing under FIGER ontology

A repository for a Chinese fine-grained entity typing dataset based on the FIGER ontology. This repository is part of the software release of our paper Cross-lingual Inference with a Chinese Entailment Graph. The dataset is based on A Chinese Corpus for Fine-grained Entity Typing.

Annotation Process

The dataset has been annotated through label mapping: we manually mapped the tokens from each of the ~6000 ultra-fine-grained types to a FIGER type; for more details, please check our paper. The resulting mappings are here and should be put under ./u2figer. The resulting re-annotated dataset is here; unzip the file and put its contents under the root directory.
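As a rough illustration of what label mapping means here, the sketch below converts a mention's ultra-fine-grained labels into FIGER types with a lookup table. The dictionary entries, the function name `map_to_figer`, and the choice to also add ancestor types (e.g. `/person` for `/person/artist`) are assumptions made for this example; they are not taken from the files under ./u2figer or from the paper.

```python
from typing import Dict, Iterable, Set

# Hypothetical mapping entries, for illustration only.
# The real mappings are the files placed under ./u2figer.
U2FIGER_EXAMPLE: Dict[str, str] = {
    "歌手": "/person/artist",
    "演员": "/person/actor",
    "城市": "/location/city",
}


def map_to_figer(ultra_fine_labels: Iterable[str],
                 mapping: Dict[str, str]) -> Set[str]:
    """Map ultra-fine-grained labels to a set of FIGER types.

    Labels without an entry in `mapping` are skipped. Adding the
    ancestors of each mapped type is an assumption of this sketch.
    """
    figer_types: Set[str] = set()
    for label in ultra_fine_labels:
        figer_type = mapping.get(label)
        if figer_type is None:
            continue  # no FIGER counterpart for this label
        parts = figer_type.strip("/").split("/")
        # Add the type itself and every ancestor on its path.
        for depth in range(1, len(parts) + 1):
            figer_types.add("/" + "/".join(parts[:depth]))
    return figer_types


if __name__ == "__main__":
    print(sorted(map_to_figer(["歌手", "城市"], U2FIGER_EXAMPLE)))
```

A mention labelled with several ultra-fine-grained types ends up with the union of their FIGER types, so duplicates collapse naturally in the returned set.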

Baselines

We updated the CFET baseline in accordance with our re-annotated data. To run the baseline, take the following steps:

Download the Chinese model from fastText here;
Run preprocess.py in the embed, data and pred modes, in that order; remember to set the correct path to the downloaded fastText model;
Train with python train.py; configurations can be set in config.py;
For inference on datasets in other domains, please refer to predict.py.
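For readers unfamiliar with fastText vectors, the sketch below reads embeddings from the plain-text `.vec` format that fastText distributes (a header line with the vocabulary size and dimension, then one word and its values per line). This README does not say which file format preprocess.py expects, so this is only an assumption about the kind of input the embed step consumes; `load_vec` is a name invented for this example and is not part of this repository.

```python
import io
from typing import Dict, List, TextIO


def load_vec(handle: TextIO) -> Dict[str, List[float]]:
    """Read word vectors from a fastText-style text (.vec) stream."""
    header = handle.readline().split()
    dim = int(header[1])  # header line is "<vocab_size> <dim>"
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            continue  # skip malformed lines
        vectors[word] = [float(v) for v in values]
    return vectors


if __name__ == "__main__":
    # Tiny in-memory sample instead of the real downloaded model file.
    sample = "2 3\n中国 0.1 0.2 0.3\n北京 0.4 0.5 0.6\n"
    vectors = load_vec(io.StringIO(sample))
    print(len(vectors), vectors["北京"])
```

With a real file, the same function can be called on `open(path, encoding="utf-8")`, where `path` points to the downloaded model.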

We have also built a second baseline model based on HierType which, as shown below, generalizes better than the baseline above. The Chinese HierType baseline can be found in another repository here.

Results

Citing Us

Coming soon.

