
Towards Incremental Transformers: An Empirical Analysis of Transformer Models for Incremental NLU

Setup

  • Install Python 3 requirements: pip install -r requirements.txt
  • Initialize GloVe as follows:
$ wget https://github.com/explosion/spacy-models/releases/download/en_vectors_web_lg-2.3.0/en_vectors_web_lg-2.3.0.tar.gz -O en_vectors_web_lg-2.3.0.tar.gz
$ pip install en_vectors_web_lg-2.3.0.tar.gz
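
To sanity-check the setup, you can load the vectors with spaCy. This is a quick check of our own, not a step from the repository; the token "flight" is just an example, and the package should expose 300-dimensional GloVe vectors:

import spacy

# Sanity check (illustrative, not part of the repo): the package
# installed above should load as a spaCy model with GloVe vectors.
nlp = spacy.load("en_vectors_web_lg")
print(nlp.vocab["flight"].vector.shape)  # expected: (300,)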

Training

You should first create a model configuration file under configs/ (see the provided sample). The following script will run the training:

$ python3 main.py --RUN train --MODEL_CONFIG <model_config> --DATASET <dataset>

with checkpoints saved under ckpts/<dataset>/ and logs under results/log/.

Important parameters:

  1. --VERSION str, to assign a name to the model.
  2. --GPU str, to train the model on the specified GPU. For multi-GPU training, use e.g. --GPU '0, 1, 2, ...'.
  3. --SEED int, to set the seed for this experiment.
  4. --RESUME True, to resume training from a saved checkpoint. You must also set the checkpoint version --CKPT_V str and the epoch to resume from --CKPT_E int.
  5. --NW int, to accelerate data loading.
  6. --DATA_ROOT_PATH str, to set the path to your dataset.

To check all possible parameters, use --help.
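
For example, a concrete training run could look like the following, where the version name, seed, GPU id, and worker count are illustrative values of our own:

$ python3 main.py --RUN train --MODEL_CONFIG <model_config> --DATASET <dataset> --VERSION my_run --GPU '0' --SEED 42 --NW 4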

Testing

You can evaluate on the validation or test set using --RUN {val, test}. For example:

$ python3 main.py --RUN test --MODEL_CONFIG <model_config> --DATASET <dataset> --CKPT_V <model_version> --CKPT_E <model_epoch>

or with absolute path:

$ python3 main.py --RUN test --MODEL_CONFIG <model_config> --DATASET <dataset> --CKPT_PATH <path_to_checkpoint>.ckpt

To obtain incremental evaluation on the test sets, use the flag --INCR_EVAL.
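
For instance, combining the flags above, an incremental evaluation of a saved checkpoint on the test set would look like this:

$ python3 main.py --RUN test --MODEL_CONFIG <model_config> --DATASET <dataset> --CKPT_V <model_version> --CKPT_E <model_epoch> --INCR_EVAL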

Data

We do not upload the original datasets, as some of them require license agreements. The preprocessing steps are described in the paper.

As it is, the code can run experiments on the sequence tagging and sequence classification tasks described below.

Data has to be split into three files (data/train/train.<task>, data/valid/valid.<task>, and data/test/test.<task>), as specified in configs/path_config.yml, each following one of the formats below (a minimal reader sketch follows the list):

  • Sequence tagging:
token \t label \n token \t label \n

with an extra \n between sequences.

  • Sequence classification:
<LABEL>: atis_airfare \n token \n token \n

with an extra \n between sequences.
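
As a minimal sketch of how such files can be parsed (our illustration, not code from the repository; the function name and file handling are assumptions), a reader for the sequence-tagging format could look like this:

def read_tagging_file(path):
    # Read tab-separated token/label pairs; a blank line ends a sequence.
    sequences, tokens, labels = [], [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:  # blank line: close the current sequence
                if tokens:
                    sequences.append((tokens, labels))
                    tokens, labels = [], []
                continue
            token, label = line.split("\t")
            tokens.append(token)
            labels.append(label)
    if tokens:  # file may not end with a trailing blank line
        sequences.append((tokens, labels))
    return sequences

The sequence-classification format can be read analogously, treating the first line of each block (<LABEL>: ...) as the label and the remaining lines as tokens.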

If this repository is helpful for your research, we would really appreciate it if you could cite the paper:

@inproceedings{kahardipraja-etal-2021-towards,
    title = "Towards Incremental Transformers: An Empirical Analysis of Transformer Models for Incremental {NLU}",
    author = "Kahardipraja, Patrick  and
      Madureira, Brielen  and
      Schlangen, David",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.90",
    pages = "1178--1189",
}
