Contextual information integration for stance detection via cross-attention

This repository includes the code for integrating contextual information for supervised text classification tasks using a dual-encoder approach and information exchange via cross-attention.

Further details can be found in our publication Robust Integration of Contextual Information for Cross-Target Stance Detection (https://arxiv.org/abs/2211.01874).

Abstract: Stance detection deals with identifying an author’s stance towards a target. Most existing stance detection models are limited because they do not consider relevant contextual information which allows for inferring the stance correctly. Complementary context can be found in knowledge bases, but integrating the context into pretrained language models is non-trivial due to the graph structure of standard knowledge bases. To overcome this, we explore an approach to integrate contextual information as text which allows for integrating contextual information from heterogeneous sources, such as structured knowledge sources and by prompting large language models. Our approach can outperform competitive baselines on a large and diverse stance detection benchmark in a cross-target setup, i.e. for targets unseen during training. We demonstrate that it is more robust to noisy context and can regularize for unwanted correlations between labels and target-specific vocabulary. Finally, it is independent of the pretrained language model in use.
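
Conceptually, the injection can be pictured as multi-head cross-attention between the hidden states of the two encoders. The following PyTorch sketch illustrates the idea only; the class and variable names are our own and this is not the repository's implementation (see src/model for that).

import torch
from torch import nn

class CrossAttentionInjection(nn.Module):
    # Minimal sketch of the injection idea (not the repository's exact code):
    # the input encoder's hidden states attend to the context encoder's hidden
    # states via multi-head cross-attention, followed by a residual connection.
    def __init__(self, hidden_size=768, num_heads=12):
        super().__init__()
        self.cross_attention = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.layer_norm = nn.LayerNorm(hidden_size)

    def forward(self, input_states, context_states):
        # Queries come from the input text, keys/values from the retrieved context.
        attended, _ = self.cross_attention(input_states, context_states, context_states)
        return self.layer_norm(input_states + attended)

# Example: fuse context states into input states at one encoder layer.
inject = CrossAttentionInjection()
input_states = torch.randn(8, 128, 768)    # (batch, input tokens, hidden)
context_states = torch.randn(8, 256, 768)  # (batch, context tokens, hidden)
fused = inject(input_states, context_states)  # same shape as input_states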

Information

Contact person: Tilman Beck, tilman.beck@tu-darmstadt.de

https://www.ukp.tu-darmstadt.de/

https://www.tu-darmstadt.de/

Don't hesitate to e-mail us or report an issue if something is broken (and it shouldn't be) or if you have further questions.

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.

Project structure

  • data/ -- container for the data; includes the scripts for processing the benchmark datasets and for retrieving tweets for the Twitter datasets
  • src/analysis -- Python scripts for data analysis, attention attribution, significance testing, and some visualization utilities
  • src/model -- contains the model files
  • src/retrieve -- the code for retrieving contextual information from external knowledge sources (e.g. ConceptNet, CauseNet, T0pp)
  • src/train -- utility files for training

Requirements

  • Python 3.6 or higher
  • PyTorch 1.10.2 or higher

Data

We make use of the benchmark datasets provided by Schiller et al. 2021 and Hardalov et al. 2021. The datasets are linked in the respective authors' repositories.

Once you have obtained all datasets, put them in a folder (e.g. benchmark_original) and run

$ python preprocess_benchmark.py --root_dir /path/to/benchmark_original --output_dir /path/to/benchmark_processed

Now all datasets should be available in the same JSON format, that is:

{"text":"This is sample text", "label":1, "target":"example target", "split":"train"}

Our dataset numbers differ slightly from the ones reported by Hardalov et al. (2021) due to the following reasons:

  • rumor: not all tweets could be downloaded
  • mtsd: we were provided the full dataset by the original authors

To produce the final files for running the benchmark experiments, you can either use the pre-computed context or compute it yourself. You can also add other tasks: create a new folder within the processed benchmark folder (e.g. benchmark_processed) for each new task, put the samples there as .jsonl files in the same format as above, and compute the context on your own (a minimal sketch follows).
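
As a sketch, a custom task could be added like this (the folder and file names are hypothetical):

import json, os

# Hypothetical new task folder inside the processed benchmark folder.
task_dir = "benchmark_processed/my_task"
os.makedirs(task_dir, exist_ok=True)

samples = [{"text": "This is sample text", "label": 1, "target": "example target", "split": "train"}]
with open(os.path.join(task_dir, "samples.jsonl"), "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")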

Use Pre-Computed Context

The pre-computed context is available for download (TBD) and is the same as used in our paper. It includes the following sources:

  • ConceptNet
  • CauseNet
  • T0pp descriptions of key noun phrases (with and without the target)

After downloading the pre-computed context, extract it into a folder (e.g. benchmark_context). The following command then composes the final files to run the experiments:

$ python finalize_sd_benchmark.py --root_dir /path/to/benchmark_processed --context_dir /path/to/benchmark_context --output_dir /path/to/benchmark_final

Do-it-yourself

TODO

  • download ConceptNet, extract it to a graph file, and find relevant nodes
  • download CauseNet, encode the CauseNet statements and task inputs, and find the closest matches (sketched below)
  • find noun phrases (NPs) and describe them using T0pp
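
As a rough illustration of the second step (the actual retrieval code lives in src/retrieve), one can embed the CauseNet statements and task inputs with a sentence encoder and keep the top-k nearest candidates; the model choice and variable names below are assumptions, not the repository's:

from sentence_transformers import SentenceTransformer, util

# Encode candidate context statements and task inputs, then retrieve the
# top-k most similar candidates per input (k=2, as in the experiments).
model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder choice
causenet_statements = ["smoking causes cancer", "rain causes flooding"]
task_inputs = ["This is sample text"]

context_embeddings = model.encode(causenet_statements, convert_to_tensor=True)
input_embeddings = model.encode(task_inputs, convert_to_tensor=True)

hits = util.semantic_search(input_embeddings, context_embeddings, top_k=2)
for text, results in zip(task_inputs, hits):
    print(text, [causenet_statements[hit["corpus_id"]] for hit in results])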

Setup

  • Clone the repository
$ git clone https://github.com/UKPLab/arxiv2022-context-injection-stance
$ cd arxiv2022-context-injection-stance
  • Create the environment and install dependencies
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

Running the experiments

To run the INJECT model with k=2 context candidates retrieved from ConceptNet for the argmin dataset:

$ python run.py --task argmin --variation conceptnet --setting INJECT_JOINED_TOPIC --k 2 --output_dir /path/to/results/

Parameter description

  • --setting
    • The experiment setting which defines the context injection to use (choose from [BASELINE, BASELINE_TOPIC, BASELINE_JOINED_TOPIC, INJECT_TOPIC, INJECT_JOINED_TOPIC])
  • --k
    • The number of context candidates to use (default 2)
  • --is_cross_topic
    • Whether the evaluation setting is cross-target (True) or in-target (False)
  • --input_encoder_layers
    • The layer of the input encoder at which cross-attention is applied (default 11)
  • --context_encoder_layers
    • The layer of the context encoder at which cross-attention is applied (default 11)
  • --task
    • The task which is used (choose from [emergent,argmin,vast,poldeb,mtsd,rumor,wtwt,iac1,scd,semeval2019t7,semeval2016task6,perspectrum,ibmcs,arc,fnc1,snopes])
  • --variation
    • The context source to use (choose from [conceptnet, causenet, t0pp_key_np, t0pp_key_np_target])
  • --model_name
    • The backbone transformer model which is used for the encoders (e.g. bert-base-uncased)
  • --output_dir
    • The directory to write the experiment outputs
  • --data_folder
    • The directory where the data files are stored
  • --shared_model
    • Use shared weights for the input encoder and the context encoder
  • --frozen_model
    • Only train the classification head; freeze the base-model parameters of both encoders
  • --truncation_length
    • Cut the input at truncation_length
  • --random_seed
    • The random seed to use
  • --batch_size
    • The batch size for training
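
For illustration, several of these parameters combined into one call (all values are arbitrary examples; check run.py for the exact flag semantics):

$ python run.py --task argmin --variation conceptnet --setting INJECT_JOINED_TOPIC --k 2 --is_cross_topic True --model_name bert-base-uncased --data_folder /path/to/benchmark_final --output_dir /path/to/results/ --random_seed 42 --batch_size 16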

Citing

Please use the following citation:

@inproceedings{beck-etal-2023-robust,
    title = "Robust Integration of Contextual Information for Cross-Target Stance Detection",
    author = "Beck, Tilman  and
      Waldis, Andreas  and
      Gurevych, Iryna",
    booktitle = "Proceedings of the The 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.starsem-1.43",
    pages = "494--511"
}
