- Install Python 3 requirements:
$ pip install -r requirements.txt
- Initialize GloVe as follows:
$ wget https://github.com/explosion/spacy-models/releases/download/en_vectors_web_lg-2.3.0/en_vectors_web_lg-2.3.0.tar.gz -O en_vectors_web_lg-2.3.0.tar.gz
$ pip install en_vectors_web_lg-2.3.0.tar.gz
You should first create a model configuration file under configs/
(see the provided sample). The following script will run the training:
$ python3 main.py --RUN train --MODEL_CONFIG <model_config> --DATASET <dataset>
Checkpoints are saved under ckpts/<dataset>/
and logs under results/log/
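As a purely hypothetical sketch of what a model configuration file under configs/ might contain (none of these key names are confirmed by the source; the provided sample file is the authoritative template), it would typically hold model and training hyperparameters:

```yaml
# Hypothetical sketch only: all key names here are illustrative assumptions,
# not taken from the repository. Copy and adapt the provided sample under
# configs/ rather than this fragment.
MODEL: transformer
HIDDEN_SIZE: 512
NUM_LAYERS: 6
DROPOUT: 0.1
LR: 0.0001
BATCH_SIZE: 32
MAX_EPOCH: 30
```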
Important parameters:
- --VERSION str, to assign a name to the model.
- --GPU str, to train the model on a specified GPU. For multi-GPU training, use e.g. --GPU '0, 1, 2, ...'.
- --SEED int, to set the seed for this experiment.
- --RESUME True, to start training from a saved checkpoint. You should assign the checkpoint version --CKPT_V str and the resumed epoch --CKPT_E int.
- --NW int, to accelerate data loading speed.
- --DATA_ROOT_PATH str, to set the path to your dataset.
To check all possible parameters, use --help.
You can evaluate on the validation or test set using --RUN {val, test}. For example:
$ python3 main.py --RUN test --MODEL_CONFIG <model_config> --DATASET <dataset> --CKPT_V <model_version> --CKPT_E <model_epoch>
or with absolute path:
$ python3 main.py --RUN test --MODEL_CONFIG <model_config> --DATASET <dataset> --CKPT_PATH <path_to_checkpoint>.ckpt
To obtain incremental evaluation on the test sets, use the flag --INCR_EVAL
We do not upload the original datasets as some of them need license agreements. The preprocessing steps are described in the paper.
As it is, the code can run experiments on:
- Chunking, CoNLL 2000 (chunk)
- Named Entity Recognition, OntoNotes 5.0, WSJ (ner-nw-wsj)
- PoS Tagging, OntoNotes 5.0, WSJ (pos-nw-wsj)
- Slot filling and intent detection, ATIS (atis-slot & atis-intent) from here
- Slot filling and intent detection, SNIPS (snips-slot & snips-intent) from here
- Sentiment classification, Pros/Cons (proscons)
- Sentiment classification, Positive/Negative (sent-negpos)
Data has to be split into three files (data/train/train.<task>, data/valid/valid.<task> and data/test/test.<task>) as in /configs/path_config.yml, all of them following the format:
- Sequence tagging:
token \t label \n token \t label \n
with an extra \n between sequences.
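As a minimal sketch of how the sequence-tagging format above could be read (the function name and helper are illustrative, not part of the repository's code):

```python
# Hypothetical reader for the tagging format described above:
# one "token \t label" per line, with a blank line between sequences.
def read_tagging_file(text):
    """Split tab-separated token/label lines into (tokens, labels) sequences."""
    sequences = []
    tokens, labels = [], []
    for line in text.splitlines():
        if not line.strip():              # blank line ends the current sequence
            if tokens:
                sequences.append((tokens, labels))
                tokens, labels = [], []
            continue
        token, label = line.split("\t")
        tokens.append(token)
        labels.append(label)
    if tokens:                            # last sequence may lack a trailing blank line
        sequences.append((tokens, labels))
    return sequences

sample = "He\tB-NP\nsaw\tB-VP\n\nShe\tB-NP\n"
print(read_tagging_file(sample))
# -> [(['He', 'saw'], ['B-NP', 'B-VP']), (['She'], ['B-NP'])]
```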
- Sequence classification:
<LABEL>: atis_airfare \n token \n token \n
with an extra \n between sequences.
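Likewise, a hedged sketch of a reader for the sequence-classification format (again, the function name is an assumption for illustration only):

```python
# Hypothetical reader for the classification format described above:
# each block starts with "<LABEL>: name", followed by one token per line,
# with a blank line between sequences.
def read_classification_file(text):
    """Parse blocks of '<LABEL>: name' plus token lines into (label, tokens) pairs."""
    examples = []
    for block in text.strip().split("\n\n"):
        lines = block.splitlines()
        label = lines[0].split(":", 1)[1].strip()   # drop the '<LABEL>:' prefix
        tokens = lines[1:]
        examples.append((label, tokens))
    return examples

sample = "<LABEL>: atis_airfare\nshow\nfares\n\n<LABEL>: atis_flight\nlist\nflights\n"
print(read_classification_file(sample))
# -> [('atis_airfare', ['show', 'fares']), ('atis_flight', ['list', 'flights'])]
```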
If this repository is helpful for your research, we would really appreciate it if you could cite the paper:
@inproceedings{kahardipraja-etal-2021-towards,
title = "Towards Incremental Transformers: An Empirical Analysis of Transformer Models for Incremental {NLU}",
author = "Kahardipraja, Patrick and
Madureira, Brielen and
Schlangen, David",
booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2021",
address = "Online and Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.emnlp-main.90",
pages = "1178--1189",
}