Instruction Clarification Requests in the CoDraw Dataset

This is the code repository accompanying the following publication:

  • Brielen Madureira and David Schlangen (2024). Taking Action Towards Graceful Interaction: The Effects of Performing Actions on Modelling Policies for Instruction Clarification Requests. Presented at the UnImplicit Workshop at EACL 2024.

We implement Transformer-based models that learn a policy for when instruction clarification requests (iCRs) should be made and what to ask about in the CoDraw dialogue game.
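As a rough illustration only (not the actual architecture from the paper), a policy for deciding whether to make an iCR can be sketched as a Transformer encoder over the dialogue tokens followed by a binary classification head; all class names, dimensions, and the pooling choice below are hypothetical.

import torch
import torch.nn as nn

class ICRPolicySketch(nn.Module):
    # Hypothetical sketch, not the paper's model: encode the dialogue
    # context and decide whether an iCR should be made.
    def __init__(self, vocab_size=30522, d_model=256, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.make_icr = nn.Linear(d_model, 2)  # logits over {no iCR, iCR}

    def forward(self, token_ids):
        hidden = self.encoder(self.embed(token_ids))  # (batch, seq, d_model)
        pooled = hidden.mean(dim=1)                   # simple mean pooling
        return self.make_icr(pooled)

The real models also predict what to ask about, which would add further prediction heads over the annotated iCR categories.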

Description

The directories are:

  • checkpoints/: contains the trained model checkpoints at their best validation epoch.
  • codrawmodels/: contains one script from the CoDraw authors with minor adaptations to make it work in our setting.
  • ../data/: where all downloaded data should live.
  • env/: contains the files to reconstruct the conda environment (see below).
  • icr/: the code of our implementation.
  • notebooks/: Jupyter notebooks used for evaluation.
  • outputs/: the generated outputs of the experiments.
  • scripts/: contains bash scripts.

Dependencies

The directory env/ contains the files that can be used to recreate the conda environment. Running

sh scripts/setup.sh

should create it by running the same installation commands, one by one, as we did. In case it does not work, this directory also contains the .yml files and a spec file auto-generated by comet.ml, as well as the output of pip freeze as requirements.txt.

Data

The following data assets are necessary to run these scripts. They should live at ../data/, or the path to each one can be passed as a command-line argument:

  • CoDraw data code: download from link. It should replace codrawmodels/, except for the file we adjusted (codrawmodels/codrawmodels/codraw_data.py). This is necessary for computing scene similarity scores.
  • CoDraw data file: download from link.
  • CoDraw-iCR (v2) annotation: available at OSF (https://osf.io/gcjhz/). It can be downloaded manually or cloned via the osfclient.
  • Step-by-step scenes: follow the steps in this repository. The images must be saved as numpy arrays using h5py objects, as done here (see the sketch after this list).
  • AbstractScenes: download at this link.
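For the step-by-step scenes, a minimal sketch of storing images as numpy arrays in an h5py file is shown below; the file name, group/dataset keys, and image shape are assumptions, not necessarily the format the repository expects.

import h5py
import numpy as np

# Hypothetical layout: one group per game, one dataset per dialogue turn.
scenes = {"game_0001": [np.zeros((300, 400, 3), dtype=np.uint8) for _ in range(5)]}

with h5py.File("../data/scenes.h5", "w") as f:
    for game_id, steps in scenes.items():
        group = f.create_group(game_id)
        for turn, image in enumerate(steps):
            group.create_dataset(str(turn), data=image, compression="gzip")

# Reading one scene back as a numpy array:
with h5py.File("../data/scenes.h5", "r") as f:
    first_scene = f["game_0001"]["0"][()]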

Replicating the results

The results can be replicated by regenerating the pretrained embeddings and then calling the bash script that runs all experiments (a sketch of the embedding-extraction step follows the commands below). search.py was used for hyperparameter search.

mkdir outputs
mkdir ../data/text_embeddings/
python3 scripts/get_text_embeddings.py -model bert-base-uncased
python3 scripts/get_bounding_boxes.py
sh scripts/run_experiments.sh
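For reference, extracting text embeddings with bert-base-uncased via the transformers library looks roughly like the sketch below; the pooling strategy and the exact inputs and outputs of scripts/get_text_embeddings.py are assumptions here.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

utterance = "place the sun in the top left corner"
inputs = tokenizer(utterance, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One fixed-size vector per utterance via mean pooling (an assumption; the
# script may instead use the [CLS] token or store token-level embeddings).
embedding = outputs.last_hidden_state.mean(dim=1).squeeze(0)  # shape: (768,)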

General usage

main.py can be used to run other experiments. It performs both training and evaluation and accepts different hyperparameters via the CLI; check the arguments in icr/config.py for details. Not all of them made it into the paper's results, but we kept all arguments to facilitate further research. A sketch of inspecting the resulting checkpoints follows the command below.

python3 main.py
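Since training uses PyTorch Lightning (see Credits), the files in checkpoints/ can be inspected with plain torch to see the stored weights and hyperparameters; the checkpoint file name below is hypothetical.

import torch

# Hypothetical file name; see checkpoints/ for the actual checkpoints.
ckpt = torch.load("checkpoints/example.ckpt", map_location="cpu")

# Lightning checkpoints are dictionaries; typical keys include
# 'state_dict' (model weights) and 'hyper_parameters' (the saved config).
print(ckpt.keys())
print({name: tensor.shape for name, tensor in ckpt["state_dict"].items()})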

Credits

We thank the developers of all the open libraries we use: comet.ml, csv, h5py, matplotlib, numpy, pandas, PIL, positional_encodings, python, pytorch, lightning, scikit-learn, scipy, seaborn, torchmetrics, torchvision, transformers, tqdm.

This work is based on the CoDraw dataset (Kim et al., 2019) and AbstractScenes (see the Data section above).

License

Our source code is licensed under the MIT License.

Citation

If you use our work, please cite:

TBA
