Chitchat as Interference: Adding User Backstories to Task-Oriented Dialogues

During task-oriented dialogues (TODs), human users naturally introduce questions or information beyond the immediate scope of the task. This creates interferences which TOD systems cannot always handle. In this project, we create a testbed for building more resilient TOD systems. Using few-shot prompting with an LLM, we create a novel chitchat augmentation for MultiWOZ. In this augmentation, a chatty user adds elements of backstory to their request, an interference to which the system responds with support and understanding while also advancing the task. We assess the resilience of three model variants, all based on SimpleToD.


This project uses Python 3.11.

Create a virtual environment:

conda create -n cc_as_inter python=3.11

Install the requirements:

git clone
cd chitchat-as-interference
pip install -r requirements.txt

Generating the Interferences

The augmented dialogues are available directly in data/interference_data/. Each augmented dialogue has an augmented_idx key listing the indices of the augmented turns. In an augmented turn, the user has a backstory key holding the backstory to append to the original text, and the system has a reaction key holding the supportive reaction to prepend to the original text. You will still need to download the FusedChat dialogues (see step 1 below).
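As an illustration of how these keys could be merged into the final utterances, here is a minimal sketch. It assumes a MultiWOZ 2.1-style layout in which each dialogue stores its turns in a log list and each turn has a text field; that layout, the function name apply_interference, and the toy dialogue are assumptions for illustration and should be checked against the actual files.

```python
from copy import deepcopy


def apply_interference(dialogue):
    """Return a copy of `dialogue` with backstories and reactions merged into the text.

    Assumption: turns live in dialogue["log"] and carry a "text" field
    (MultiWOZ 2.1 style). Only the "backstory" and "reaction" keys come
    from the README; everything else here is hypothetical.
    """
    merged = deepcopy(dialogue)
    for turn in merged["log"]:
        if "backstory" in turn:
            # User turn: the backstory is appended to the original request.
            turn["text"] = f'{turn["text"]} {turn["backstory"]}'
        if "reaction" in turn:
            # System turn: the supportive reaction is prepended to the reply.
            turn["text"] = f'{turn["reaction"]} {turn["text"]}'
    return merged


# Toy dialogue with invented content, only to show the merge.
toy = {
    "augmented_idx": [0],
    "log": [
        {
            "text": "I need a cheap hotel in the north.",
            "backstory": "My flat flooded last night.",
        },
        {
            "text": "There are two options. Any preference on parking?",
            "reaction": "I'm sorry to hear that.",
        },
    ],
}

merged = apply_interference(toy)
for turn in merged["log"]:
    print(turn["text"])
```

The function works on a copy, so the original dialogue with its separate backstory and reaction fields stays available for filtering or evaluation.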

For reproduction and/or modification, here is the augmentation pipeline used:

  1. Download the MultiWOZ2.2 dataset from here. Follow the guidelines at the bottom of the page to convert the data to MultiWOZ2.1 format: you should end up with a single .json containing all the annotated dialogues. Also download the FusedChat dialogues from here. The file needed is fusedchat_prepended.json. Add both .json files to data.

  2. Generate seed situations in a few-shot manner using the prepended FusedChat exchanges. Check the script for paths and set the desired arguments. You can separate the training data into several batches to parallelize generation, see argparse in script.

python --model_name=<...> --data_split=<...> --training_batch_number=<...>
  3. Augment random user turns with a backstory, and the system responses that follow them with a supportive reaction. Pass in the path to the generated situations. Optionally add the backstory path if backstories have already been generated. Check the argparse arguments for more details.
python --model_name=<...> --gen_sit_path=<...> --gen_back_path=<...>

For steps 2 and 3, you can also use and modify the following scripts to run the jobs on Slurm:

  4. Filter and merge the interferences into MultiWOZ:
python --interference_path=<...> --split=<...>
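Step 2 mentions splitting the training data into several batches so that generation can run in parallel. The sketch below shows one way such a split could work. The function name get_batch, the zero-based batch numbering, and the strided assignment are assumptions for illustration; the script's argparse help describes the actual behaviour of --training_batch_number.

```python
def get_batch(dialogue_ids, batch_number, num_batches):
    """Return the dialogue ids assigned to one batch.

    Hypothetical helper: ids are sorted first so every parallel job sees the
    same order, then assigned round-robin. Batch numbers start at 0 here.
    """
    if not 0 <= batch_number < num_batches:
        raise ValueError("batch_number must be in [0, num_batches)")
    ordered = sorted(dialogue_ids)
    return ordered[batch_number::num_batches]


# Placeholder ids, not real dataset entries.
example_ids = [f"dialogue_{i:02d}" for i in range(10)]
for b in range(3):
    print(b, get_batch(example_ids, b, 3))
```

Because each id lands in exactly one batch, the per-batch outputs can be concatenated afterwards without deduplication.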

Preparing Data for Training

Prepare the training data for SimpleToD by running the preparation script with python from the data directory. Check that the paths in the script are correct. This will output an lm_data directory.

Train, gen and eval

Train the LLM (Llama-2-7B, though any other LLM is possible as long as you adapt the LoraConfig), generate outputs on an eval set, and evaluate the generated outputs. This can all be done in one go. Check the script to make sure the paths are correct and set the desired arguments.

python --train_data_dir=<...> --eval_data_json=<...> --eval_split=<...> --lr=<...> etc.

# you can also modify the following script to run the job on slurm

Possible train+eval setups are:

  • train on vanilla mwoz | eval: vanilla mwoz (as reference)
  • train on vanilla mwoz | eval: interference
  • train on prepended fusedchat | eval: interference
  • train on interference | eval: interference
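On adapting the LoraConfig when swapping the base model: the main thing that changes is usually the list of modules that LoRA targets, since layer names differ between architectures. The sketch below builds keyword arguments that could be passed to peft's LoraConfig. The helper name lora_kwargs and the hyperparameter values are illustrative assumptions, not the settings used in this project.

```python
# Attention projection names differ between architectures, so the LoRA
# target modules have to follow the base model.
TARGET_MODULES = {
    "llama": ["q_proj", "v_proj"],  # Llama-style separate query/value projections
    "gpt2": ["c_attn"],             # GPT-2 fuses query, key and value in one layer
}


def lora_kwargs(model_family, r=8, lora_alpha=16, lora_dropout=0.05):
    """Build keyword arguments for peft.LoraConfig (values are illustrative)."""
    if model_family not in TARGET_MODULES:
        raise ValueError(f"No target modules registered for {model_family!r}")
    return {
        "r": r,
        "lora_alpha": lora_alpha,
        "lora_dropout": lora_dropout,
        "target_modules": TARGET_MODULES[model_family],
        "task_type": "CAUSAL_LM",
    }


kwargs = lora_kwargs("llama")
print(kwargs)
# With peft installed, the config would then be created as:
#   from peft import LoraConfig
#   config = LoraConfig(**kwargs)
```

For a model family not listed here, inspecting the module names of the loaded model is a reasonable way to find the attention projections to target.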

To run only generation and evaluation with previously trained LoRA weights, run the script or modify and submit

Eval Interface

Interfaces for evaluating the quality of the generated interferences and of the different model responses are in the eval directory.

