# Computing With MultiWoZ Dialogues

The process for preparing MultiWoZ dialogues in the dataflow format appears to execute dataflow programs as part of the conversion. This notebook explores the methods involved to get a better idea of what it would take to actually 'implement' a dialogue with MultiWoZ.

## The outcome of processing dialogues

Their MultiWoZ processing script takes dialogues in the intent-slot-value format used in the [TRADE paper](https://arxiv.org/abs/1905.08743) (a format that is itself derived from the MultiWoZ dataset) and creates programs for each dialogue. This can be reproduced by running
`scripts/multiwoz/download_multiwoz_and_build_dataflow_programs.sh`

The cells below load the same validation example before and after processing:

In [16]:
import json
from pprint import pprint
from typing import Dict

In [26]:
RAW_VALID_TRADE_DIALOGUES = "/scratchdata/bking2/tod_as_df_synthesis/multiwoz/output/trade_dialogues/dev_dials.json"
with open(RAW_VALID_TRADE_DIALOGUES, 'r') as f:
    # the TRADE file is a single JSON list of dialogues; keep the first one
    raw_trade_example: Dict = json.load(f)[0]

VALID_DATAFLOW_DIALOGUES = "/scratchdata/bking2/tod_as_df_synthesis/multiwoz/output/dataflow_dialogues/valid.dataflow_dialogues.jsonl"
with open(VALID_DATAFLOW_DIALOGUES, 'r') as f:
    # the dataflow file is JSON Lines (one dialogue per line); read only the first line
    dataflow_example: Dict = json.loads(f.readline())

# make sure these correspond:
trade_utterance = raw_trade_example['dialogue'][0]['transcript']
dataflow_utterance = dataflow_example['turns'][0]['user_utterance']['original_text']
assert trade_utterance == dataflow_utterance, f"Trade: {trade_utterance}, Dataflow:{dataflow_utterance}"

In [27]:
# A single turn in the raw TRADE dialogue
pprint(raw_trade_example['dialogue'][0])

{'belief_state': [{'act': 'inform', 'slots': [['hotel-area', 'east']]},
                  {'act': 'inform', 'slots': [['hotel-stars', '4']]}],
 'domain': 'hotel',
 'system_acts': [],
 'system_transcript': '',
 'transcript': 'i need to book a hotel in the east that has 4 stars .',
 'turn_idx': 0,
 'turn_label': [['hotel-area', 'east'], ['hotel-stars', '4']]}


In [28]:
# corresponds to the following program
pprint(dataflow_example['turns'][0]['lispress'])

'(find (Constraint[Hotel] :area (?= "east") :stars (?= "4")))'
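
To make the correspondence between the two representations concrete, here is a minimal sketch that renders the `turn_label` above as the same `find` expression. The helper `slots_to_find_lispress` is hypothetical and written only for this notebook; it is not part of the dataflow package, and it covers only this simple single-domain, first-turn case (no `refer`, no `revise`, no multi-domain turns).

```python
from typing import Sequence


def slots_to_find_lispress(domain: str, slots: Sequence[Sequence[str]]) -> str:
    """Render one domain's slot-value pairs as a `find` constraint expression.

    Hypothetical helper for illustration; not the conversion the package performs.
    """
    constraints = []
    for slot_name, value in slots:
        # TRADE slot names look like "<domain>-<slot>", e.g. "hotel-area"
        slot_domain, slot = slot_name.split("-", 1)
        if slot_domain != domain:
            continue  # this sketch handles a single domain only
        constraints.append(f':{slot} (?= "{value}")')
    joined = " ".join(constraints)
    return f"(find (Constraint[{domain.capitalize()}] {joined}))"


turn_label = [["hotel-area", "east"], ["hotel-stars", "4"]]
lispress = slots_to_find_lispress("hotel", turn_label)
print(lispress)
```

For this one turn the sketch reproduces the program string printed above; that says nothing about how the real conversion treats later turns.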


## Understanding their processing scripts

Their processing script ultimately calls three Python modules:

### 1) Creating Data in `src/dataflow/multiwoz/trade_dst/create_data.py`

```bash
python -m dataflow.multiwoz.trade_dst.create_data \
    --use_multiwoz_2_1 \
    --output_dir ${raw_trade_dialogues_dir}
```

This enters at `src.dataflow.multiwoz.trade_dst.create_data.main`, which makes two sub-calls:
- `src.dataflow.multiwoz.trade_dst.create_data.createData`
- `src.dataflow.multiwoz.trade_dst.create_data.divideData` (divides into train/valid/test files)

#### `src.dataflow.multiwoz.trade_dst.create_data.createData`


This takes two arguments: whether to use MultiWoZ 2.1 or 2.0, and where to write the data. It then does the following:

1. Downloads and unpacks the chosen version of MultiWoZ
   - see `src.dataflow.multiwoz.trade_dst.create_data.loadData`
2. Normalizes and pre-tokenizes each dialogue (e.g. lowercasing and spacing)
   - see `src.dataflow.multiwoz.trade_dst.create_data.createData` after the call to `loadData`
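
As a rough illustration of what "lowercasing and spacing" means in step 2, the sketch below lowercases an utterance and separates punctuation into its own tokens. The function `normalize_utterance` and the input sentence are made up for this example; the actual normalization in `createData` is not reproduced here and likely covers more cases.

```python
import re


def normalize_utterance(text: str) -> str:
    """Lowercase and space-separate punctuation (illustrative only)."""
    text = text.lower().strip()
    # Put spaces around sentence punctuation so each mark becomes its own token.
    text = re.sub(r"([.,!?])", r" \1 ", text)
    # Collapse the extra whitespace introduced by the previous step.
    return re.sub(r"\s+", " ", text).strip()


# Made-up raw input; the result has the same shape as the 'transcript' field shown earlier.
normalized = normalize_utterance("I need to book a hotel in the East that has 4 stars.")
print(normalized)
```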
   
### 2) Patching in `src/dataflow/multiwoz/patch_trade_dialogues.py`

This step is not central to this analysis, but the module's docstring describes what it accomplishes:

```python
"""
Semantic Machines\N{TRADE MARK SIGN} software.

Patches TRADE-processed dialogues.

In TRADE, there are extra steps to fix belief state labels after the data are dumped from `create_data.py`.
It makes the evaluation and comparison difficult b/c those label correction and evaluation are embedded in the training
code rather than separate CLI scripts.
This new script applies the TRADE label corrections (fix_general_label_errors) and re-dumps the dialogues in the same format:

NOTE: This only patches the "belief_state". Other fields including "turn_label" are unchanged. Thus, there can be
inconsistency between "belief_state" and "turn_label".
"""
```
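
Because only `belief_state` is patched, it can be useful to check where the two fields disagree. The sketch below is a hypothetical check, not something the package provides. It assumes the turn layout printed earlier in this notebook, and it assumes that every slot in `turn_label` is expected to appear with the same value in `belief_state`. The example turn is made up.

```python
from typing import Dict, List, Optional, Tuple


def find_label_mismatches(turn: Dict) -> List[Tuple[str, str, Optional[str]]]:
    """Return (slot, turn_label value, belief_state value) for each disagreement."""
    # Flatten the belief state into a slot -> value mapping.
    belief = {
        slot: value
        for item in turn["belief_state"]
        for slot, value in item["slots"]
    }
    mismatches = []
    for slot, value in turn["turn_label"]:
        if belief.get(slot) != value:
            mismatches.append((slot, value, belief.get(slot)))
    return mismatches


# Made-up turn in which patching changed "belief_state" but not "turn_label".
patched_turn = {
    "belief_state": [
        {"act": "inform", "slots": [["hotel-area", "east"]]},
        {"act": "inform", "slots": [["hotel-stars", "4"]]},
    ],
    "turn_label": [["hotel-area", "east"], ["hotel-stars", "four"]],
}
mismatches = find_label_mismatches(patched_turn)
print(mismatches)
```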

### 3) Actually converting to programs in `src/dataflow/multiwoz/create_programs.py`

This step is executed via the following bash fragment:

```bash
for subset in "train" "valid" "test"; do
    python -m dataflow.multiwoz.create_programs \
        --trade_data_file ${patched_trade_dialogues_dir}/${subset}_dials.json \
        --outbase ${dataflow_dialogues_dir}/${subset}
done
```

The arguments above point to input and output files, but the script also supports flags for in-lining operations like `refer` and `revise`, as done in some of the paper's experiments. Their argument parser provides useful explanations for each.
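
The output files can be inspected with a few lines of Python. The sketch below assumes the output is JSON Lines named like `valid.dataflow_dialogues.jsonl` and uses only the fields already accessed earlier in this notebook (`turns`, `user_utterance.original_text`, `lispress`). The helper `iter_turn_programs` is hypothetical, and the demo writes a tiny stand-in file rather than reading real output.

```python
import json
import os
import tempfile
from typing import Iterator, Tuple


def iter_turn_programs(jsonl_path: str) -> Iterator[Tuple[str, str]]:
    """Yield (user utterance, lispress program) for every turn in the file."""
    with open(jsonl_path, "r") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            dialogue = json.loads(line)
            for turn in dialogue["turns"]:
                yield turn["user_utterance"]["original_text"], turn["lispress"]


# Stand-in dialogue with only the fields this sketch reads.
sample = {
    "turns": [
        {
            "user_utterance": {
                "original_text": "i need to book a hotel in the east that has 4 stars ."
            },
            "lispress": '(find (Constraint[Hotel] :area (?= "east") :stars (?= "4")))',
        }
    ]
}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "valid.dataflow_dialogues.jsonl")
    with open(path, "w") as f:
        f.write(json.dumps(sample) + "\n")
    programs = list(iter_turn_programs(path))

for utterance, program in programs:
    print(utterance, "->", program)
```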

Their `main` function does the following:

- Instantiate the appropriate salience model (the vanilla model they provide returns the most recent value with a compatible type)
  - **IMPORTANT NOTE ON HOW THIS IMPACTS EVALUATION:** because this salience model decides the program used as the **gold** reference when training the dialogue model, its error rate is very important. Evaluation also can't fairly be done directly on the dataflow synthesis validation set, because that set is inherently a simplification of the MultiWoZ ground truth.
- For each dialogue, create the programs for it and save them to the output file
  - this function is called once per processed TRADE dialogue: `src.dataflow.multiwoz.create_programs.create_programs_for_trade_dialogue` (detailed below)
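
To make the "most recent value with a compatible type" behavior concrete, here is a toy sketch. The class `MostRecentSalienceModel`, its method names, and the type names are all hypothetical and are not the package's API. "Compatible" is simplified here to exact type-name equality, which is an assumption.

```python
from typing import List, Optional, Tuple


class MostRecentSalienceModel:
    """Toy salience model: resolve a reference to the newest value of a given type."""

    def __init__(self) -> None:
        # (type name, value) pairs, oldest first
        self._history: List[Tuple[str, str]] = []

    def observe(self, type_name: str, value: str) -> None:
        """Record a value mentioned in the dialogue so far."""
        self._history.append((type_name, value))

    def resolve(self, type_name: str) -> Optional[str]:
        """Return the most recently observed value of this type, or None."""
        for seen_type, value in reversed(self._history):
            if seen_type == type_name:  # assumption: compatible means identical
                return value
        return None


model = MostRecentSalienceModel()
model.observe("Hotel.area", "east")
model.observe("Restaurant.area", "centre")
model.observe("Hotel.area", "north")
resolved = model.resolve("Hotel.area")
missing = model.resolve("Taxi.destination")
print(resolved, missing)
```

A model this simple picks the wrong antecedent whenever the user refers back to an older value, which is one way errors could enter the gold programs.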

### Detailed conversion of a single TRADE dialogue to a Dataflow Program

TODO