KAMG

Code repository for the EMNLP 2020 proceedings paper "Multi-label Few/Zero-shot Learning with Knowledge Aggregated from Multiple Label Graphs".

Introduction

We introduce a knowledge aggregation mechanism to improve few/zero-shot learning performance on the MIMIC-II/III datasets and on the EU legislation dataset. We extend the work of Rios & Kavuluru (2018), building on the Neural classifier implementation (Liu et al., 2019).
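As a rough illustration of what aggregating knowledge from multiple label graphs can mean, the sketch below propagates label features over each graph separately and then fuses the per-graph results. It is a simplification and not the code in this repository: the function names (`normalize_adjacency`, `propagate`, `aggregate`) are hypothetical, the propagation step has no learned weights or non-linearity, and a plain element-wise mean stands in for whatever fusion the model in `model/fusion.py` actually uses.

```python
from typing import List

Matrix = List[List[float]]


def normalize_adjacency(adj: Matrix) -> Matrix:
    """Add self-loops, then row-normalise so every row sums to 1."""
    n = len(adj)
    with_loops = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return [[value / sum(row) for value in row] for row in with_loops]


def propagate(adj: Matrix, feats: Matrix) -> Matrix:
    """One propagation step: each label averages itself and its neighbours."""
    norm = normalize_adjacency(adj)
    n, dim = len(feats), len(feats[0])
    return [
        [sum(norm[i][k] * feats[k][d] for k in range(n)) for d in range(dim)]
        for i in range(n)
    ]


def aggregate(graphs: List[Matrix], feats: Matrix) -> Matrix:
    """Fuse per-graph label representations with an element-wise mean.

    Assumption: mean fusion is used here only because it is the simplest
    choice; it is a stand-in, not the fusion method from the paper.
    """
    outputs = [propagate(graph, feats) for graph in graphs]
    n, dim = len(feats), len(feats[0])
    return [
        [sum(out[i][d] for out in outputs) / len(outputs) for d in range(dim)]
        for i in range(n)
    ]


# Toy example: three labels with 2-d features and two label graphs.
label_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
graph_a = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # edge 0-1
graph_b = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]  # edge 1-2
fused = aggregate([graph_a, graph_b], label_feats)
```

In this toy setup label 2 has no neighbours in `graph_a` but does in `graph_b`, so its fused representation still picks up information from label 1. This is the intuition behind combining several graphs for labels that have few or no training samples.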

To use our code, please make sure that you have read the README file of Neural classifier.

Supported text encoders

Other encoders for conventional text classification tasks are described in Neural classifier.

Dataset

Because of the MIMIC access policy, you need to follow the instructions on the MIMIC web page to apply for access to all of the datasets. Tokenization and other data-cleaning code will be provided very soon.

The EU legislation dataset can be downloaded from Chalkidis et al. (2019). The code for running the EU legislation experiments was developed from their original code, so it is not included here.

Graph data will be uploaded to Google Drive with open access so that the experimental results can be reproduced. The Google Drive link will be given here very soon. The Python scripts for generating the graph data are included in the source code.
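Until the graph data is published, the following minimal sketch shows one way a label graph could be turned into an adjacency matrix from a list of label relations. It is an assumption-laden illustration and does not reproduce `dataset/generate_graph_data.py`: the helper `build_adjacency`, the undirected treatment of edges, and the example label codes are all hypothetical.

```python
from typing import List, Sequence, Tuple


def build_adjacency(
    labels: Sequence[str], edges: Sequence[Tuple[str, str]]
) -> List[List[float]]:
    """Build a symmetric 0/1 adjacency matrix over `labels`.

    Edges that mention a label outside `labels` are skipped, so the
    matrix always has shape len(labels) x len(labels).
    """
    index = {label: i for i, label in enumerate(labels)}
    adj = [[0.0] * len(labels) for _ in labels]
    for source, target in edges:
        if source in index and target in index:
            i, j = index[source], index[target]
            adj[i][j] = 1.0
            adj[j][i] = 1.0  # assumption: relations are treated as undirected
    return adj


# Hypothetical example: a parent-child relation between two label codes,
# plus one edge to a label that is not in the label set and is ignored.
example_labels = ["428", "428.0", "401.9"]
example_edges = [("428", "428.0"), ("428", "999")]
example_adj = build_adjacency(example_labels, example_edges)
```

One such matrix per relation type (for example hierarchy or description similarity) would give the set of label graphs that a multi-graph model consumes.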

Config files

To make it quick to reproduce our experimental results, the config files will also be made available in the future.

To run the code, use:

cd NeuralNLP-NeuralClassifier-KAMG
python train.py conf/config.py

This is the same procedure as for Neural classifier.

Acknowledgement

Our code is based on Neural classifier. We have added or updated the following files, which therefore differ from the original source:

  • model/rnn.py (update)

  • model/fusion.py (add)

  • model/graph_convolution.py (add)

  • model/zlwacnn.py (add)

  • model/zlwarnn.py (add)

  • model/zsjlcnn.py (add)

  • model/zsjlrnn.py (add)

  • evaluate/classification_evaluate.py (update)

  • dataset/collator (update)

  • dataset/dataset.py (update)

  • dataset/classification_dataset.py (update)

  • dataset/generate_graph_data.py (add)

  • dataset/graph_dataset.py (add)

  • util.py (add)

Citation

@inproceedings{lu-etal-2020-multi,
    title = "Multi-label Few/Zero-shot Learning with Knowledge Aggregated from Multiple Label Graphs",
    author = "Lu, Jueqing  and
      Du, Lan  and
      Liu, Ming  and
      Dipnall, Joanna",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.emnlp-main.235",
    doi = "10.18653/v1/2020.emnlp-main.235",
    pages = "2935--2943",
    abstract = "Few/Zero-shot learning is a big challenge of many classifications tasks, where a classifier is required to recognise instances of classes that have very few or even no training samples. It becomes more difficult in multi-label classification, where each instance is labelled with more than one class. In this paper, we present a simple multi-graph aggregation model that fuses knowledge from multiple label graphs encoding different semantic label relationships in order to study how the aggregated knowledge can benefit multi-label zero/few-shot document classification. The model utilises three kinds of semantic information, i.e., the pre-trained word embeddings, label description, and pre-defined label relations. Experimental results derived on two large clinical datasets (i.e., MIMIC-II and MIMIC-III ) and the EU legislation dataset show that methods equipped with the multi-graph knowledge aggregation achieve significant performance improvement across almost all the measures on few/zero-shot labels.",
}

About

EMNLP2020-short
