This repository includes the code for integrating contextual information for supervised text classification tasks using a dual-encoder approach and information exchange via cross-attention.
Further details can be found in our publication Robust Integration of Contextual Information for Cross-Target Stance Detection.
Abstract: Stance detection deals with identifying an author’s stance towards a target. Most existing stance detection models are limited because they do not consider relevant contextual information which allows for inferring the stance correctly. Complementary context can be found in knowledge bases but integrating the context into pretrained language models is non-trivial due to the graph structure of standard knowledge bases. To overcome this, we explore an approach to integrate contextual information as text which allows for integrating contextual information from heterogeneous sources, such as structured knowledge sources and by prompting large language models. Our approach can outperform competitive baselines on a large and diverse stance detection benchmark in a cross-target setup, i.e. for targets unseen during training. We demonstrate that it is more robust to noisy context and can regularize for unwanted correlations between labels and target-specific vocabulary. Finally, it is independent of the pretrained language model in use.
Contact person: Tilman Beck, tilman.beck@tu-darmstadt.de
https://www.ukp.tu-darmstadt.de/
Don't hesitate to e-mail us or report an issue if something is broken (and it shouldn't be) or if you have further questions.
This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
data/
-- container for the data, including the scripts for processing the benchmark datasets and retrieving tweets for the Twitter datasets
src/analysis
-- Python scripts to analyze data, attention attribution, compute significance and some visualization utils
src/model
-- contains the model files
src/retrieve
-- the code for retrieving contextual information from external knowledge sources (e.g. ConceptNet, CauseNet, T0pp)
src/train
-- utility files for training
- Python3.6 or higher
- PyTorch 1.10.2 or higher
We make use of the benchmark datasets provided by Schiller et al. 2021 and Hardalov et al. 2021. The datasets are linked in the respective repositories here and here
Once you have obtained all datasets, put them in a folder (e.g. `benchmark_original`) and run
$ preprocess_benchmark.py --root_dir /path/to/benchmark_original --output_dir /path/to/benchmark_processed
Now all datasets should be available in the same JSON format, that is
{"text":"This is sample text", "label":1, "target":"example target", "split":"train"}
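Each line of a processed file is one such JSON object, so the files can be consumed with a few lines of standard Python. A minimal sketch (the helper name is illustrative, not part of the repository):

```python
import json

def load_split(path, split):
    """Read a processed benchmark .jsonl file and keep only one split."""
    samples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            sample = json.loads(line)
            if sample["split"] == split:
                samples.append(sample)
    return samples
```

For example, `load_split("benchmark_processed/argmin.jsonl", "train")` would return the training samples of a (hypothetical) processed `argmin` file.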
Our dataset numbers differ slightly from the ones reported by Hardalov et al. (2021) due to the following reasons:
- rumor: not all tweets could be downloaded
- mtsd: we were provided the full dataset by the original authors
To produce the final files for running the benchmark experiments, you can either use the pre-computed context or compute it yourself.
Note that you can also add other tasks. To do so, create a new folder within the processed benchmark folder (e.g. `benchmark_processed`) for each task and compute the context on your own. Put the samples there as `.jsonl` files in the same format as mentioned above.
The pre-computed context is available to download (TBD) and is the same as we used in our paper. It includes the following sources:
- CauseNet (Heindorf et al. 2020)
- ConceptNet (here)
- T0pp-NP (using Sanh et al. 2021)
- T0pp-NP-Targ (using Sanh et al. 2021)
- All (combination of the four sources above)
- Random* (context sampled from the Gutenberg Project here)
After downloading the pre-computed context, extract it into a folder (e.g. `benchmark_context`). The following command will then compose the final files to run the experiments.
$ finalize_sd_benchmark.py --root_dir /path/to/benchmark_processed --context_dir /path/to/benchmark_context --output_dir /path/to/benchmark_final
TODO
- download ConceptNet, extract it to a graph file, find relevant nodes
- download CauseNet, encode CauseNet and the task inputs, find the closest matches
- find noun phrases (NPs), describe them using T0pp
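The CauseNet step above amounts to a nearest-neighbour search over sentence embeddings. A minimal sketch of that selection step, assuming the embeddings have already been computed with an encoder of your choice (the function name and inputs are illustrative, not part of the repository):

```python
import numpy as np

def top_k_context(input_emb, context_embs, k=2):
    """Indices of the k context embeddings with the highest
    cosine similarity to the input embedding."""
    a = input_emb / np.linalg.norm(input_emb)
    b = context_embs / np.linalg.norm(context_embs, axis=1, keepdims=True)
    sims = b @ a                      # cosine similarity per context entry
    return np.argsort(-sims)[:k].tolist()
```

The selected indices can then be mapped back to the CauseNet statements and stored as context candidates for the corresponding sample.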
- Clone the repository
$ git clone https://github.com/UKPLab/arxiv2022-context-injection-stance
$ cd arxiv2022-context-injection-stance
- Create the environment and install dependencies
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
To run the INJECT model with `k=2` context candidates retrieved from `conceptnet` for dataset `argmin`:
$ python run.py --task argmin --variation conceptnet --setting INJECT_JOINED_TOPIC --k 2 --output_dir /path/to/results/
--setting
- The experiment setting which defines which context injection to use, from [BASELINE, BASELINE_TOPIC, BASELINE_JOINED_TOPIC, INJECT_TOPIC, INJECT_JOINED_TOPIC]
--k
- The number of context candidates to use (default 2)
--is_cross_topic
- Whether the evaluation setting should be cross-target (True) or in-target (False)
--input_encoder_layers
- The layer of the input encoder at which cross-attention should be applied (default 11)
--context_encoder_layers
- The layer of the context encoder at which cross-attention should be applied (default 11)
--task
- The task to use, chosen from [emergent, argmin, vast, poldeb, mtsd, rumor, wtwt, iac1, scd, semeval2019t7, semeval2016task6, perspectrum, ibmcs, arc, fnc1, snopes]
--variation
- The context source to use, chosen from [conceptnet, causenet, t0pp_key_np, t0pp_key_np_target]
--model_name
- The backbone transformer model used for the encoders (e.g. bert-base-uncased)
--output_dir
- The directory to write the experiment outputs to
--data_folder
- The directory where the data files are stored
--shared_model
- Use shared weights for both the input encoder and the context encoder
--frozen_model
- Only train the classification head; freeze the base model parameters of both encoders
--truncation_length
- Cut the input off at truncation_length tokens
--random_seed
- The random seed to use
--batch_size
- The batch size for training
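For intuition, the information exchange that `--input_encoder_layers` and `--context_encoder_layers` control can be sketched as one scaled dot-product attention step in which the input tokens attend over the context tokens. This is a simplified numpy illustration, not the actual model code: the real implementation uses learned query/key/value projections and multiple heads inside the transformer layers.

```python
import numpy as np

def cross_attention(input_states, context_states):
    """One cross-attention step: input tokens attend over context tokens.

    input_states:   (n_in, d)  hidden states of the input encoder layer
    context_states: (n_ctx, d) hidden states of the context encoder layer
    """
    d = input_states.shape[-1]
    # queries come from the input, keys/values from the context
    scores = input_states @ context_states.T / np.sqrt(d)
    # softmax over the context positions (numerically stabilized)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # each input token becomes a convex combination of context states
    return weights @ context_states
```

Setting the layer arguments to 11 (the default) applies this exchange at the last encoder layer of a 12-layer model such as bert-base-uncased.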
Please use the following citation:
@inproceedings{beck-etal-2023-robust,
title = "Robust Integration of Contextual Information for Cross-Target Stance Detection",
author = "Beck, Tilman and
Waldis, Andreas and
Gurevych, Iryna",
booktitle = "Proceedings of the The 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)",
month = jul,
year = "2023",
address = "Toronto, Canada",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.starsem-1.43",
pages = "494--511"
}