ZRIGF

This is the code for ZRIGF: An Innovative Multimodal Framework for Zero-Resource Image-Grounded Dialogue Generation.

Reference

If you use any source code included in this repo in your work, please cite the following paper.

@inproceedings{10.1145/3581783.3611810,
  author = {Zhang, Bo and Wang, Jian and Ma, Hui and Xu, Bo and Lin, Hongfei},
  title = {ZRIGF: An Innovative Multimodal Framework for Zero-Resource Image-Grounded Dialogue Generation},
  year = {2023},
  isbn = {9798400701085},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3581783.3611810},
  doi = {10.1145/3581783.3611810},
  booktitle = {Proceedings of the 31st ACM International Conference on Multimedia},
  pages = {5464–5473},
  numpages = {10},
  location = {Ottawa ON, Canada},
  series = {MM '23}
}

Requirements

Python 3.10
PyTorch 2.0
CUDA 11.8

To install the Python dependencies, run:

pip install -r requirements.txt

To install nlg-eval, run:

git clone https://github.com/Maluuba/nlg-eval
cd nlg-eval
pip install -e .

To make nlg-eval work with this setup, two files in the cloned nlg-eval directory need to be modified:

nlg-eval/requirements.txt: change gensim~=3.8.3 to gensim>=3.8.3
nlg-eval/nlgeval/word2vec/evaluate.py: replace line 40 with the following line:

return vectors[self.m.key_to_index[key]]
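The two edits above can also be scripted. The following is a minimal sketch and not part of this repo: the helper names relax_gensim_pin and replace_line are hypothetical, and it assumes the nlg-eval checkout still contains the pinned gensim~=3.8.3 requirement and that line 40 of evaluate.py is the line to replace, as stated above.

```python
from pathlib import Path


def relax_gensim_pin(requirements: Path) -> bool:
    """Rewrite 'gensim~=3.8.3' to 'gensim>=3.8.3'; return True if the file changed."""
    text = requirements.read_text(encoding="utf-8")
    patched = text.replace("gensim~=3.8.3", "gensim>=3.8.3")
    if patched == text:
        return False
    requirements.write_text(patched, encoding="utf-8")
    return True


def replace_line(source: Path, line_no: int, new_code: str) -> None:
    """Replace the 1-indexed line `line_no`, keeping its indentation and line ending."""
    lines = source.read_text(encoding="utf-8").splitlines(keepends=True)
    old = lines[line_no - 1]
    body = old.rstrip("\r\n")
    indent = body[: len(body) - len(body.lstrip())]
    ending = old[len(body):]
    lines[line_no - 1] = f"{indent}{new_code}{ending}"
    source.write_text("".join(lines), encoding="utf-8")
```

With the repository cloned into nlg-eval, the helpers would be called as relax_gensim_pin(Path("nlg-eval/requirements.txt")) and replace_line(Path("nlg-eval/nlgeval/word2vec/evaluate.py"), 40, "return vectors[self.m.key_to_index[key]]"). Check the file by hand afterwards, since the line number may differ in other nlg-eval revisions.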

Datasets

Download MS COCO 2017

This example uses the COCO dataset (2017) through a custom dataset script, so you must download the dataset manually before training.

cd data/coco
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/zips/test2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
wget http://images.cocodataset.org/annotations/image_info_test2017.zip
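The downloaded archives still have to be unpacked. This README does not state the directory layout the custom dataset script expects, so the sketch below simply extracts every archive in place inside data/coco; that target location and the helper name extract_archives are assumptions.

```python
import zipfile
from pathlib import Path


def extract_archives(folder: Path) -> list[str]:
    """Extract every .zip found in `folder` into `folder`; return the archive names handled."""
    done = []
    for archive in sorted(folder.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(folder)
        done.append(archive.name)
    return done
```

Calling extract_archives(Path("data/coco")) after the downloads finish would unpack all five archives. If the dataset script expects a different layout, move the extracted folders accordingly.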

Download Open Images

This example uses images from Open Images as candidate images for retrieval. To download the images, refer to here. You can build the image index at whatever size you need (500,000 images in our experiments).

If you already have the Open Images dataset on disk, arrange the images as follows:

data
|-- open_images
    |-- images
         |-- 14928b4f367c217e.jpg
         |-- 289d643a8761aa83.jpg
         |-- ......
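To keep the image index at a fixed size, one option is to choose a subset of the files in data/open_images/images. The sketch below illustrates this with reproducible random sampling. It is an assumption for illustration only: this README does not say how the 500,000 images were selected, and pick_index_images is a hypothetical helper, not a function in this repo.

```python
import random
from pathlib import Path


def pick_index_images(image_dir: Path, size: int, seed: int = 0) -> list[Path]:
    """Return up to `size` .jpg files from `image_dir`, chosen reproducibly."""
    images = sorted(image_dir.glob("*.jpg"))
    if len(images) <= size:
        # Fewer images than requested: use everything that is available.
        return images
    # A seeded generator makes the selection repeatable across runs.
    return sorted(random.Random(seed).sample(images, size))
```

For example, pick_index_images(Path("data/open_images/images"), 500_000) returns the paths to keep; the remaining files can stay on disk or be moved elsewhere.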

Download Reddit Conversation

Please download the Reddit data from here.

Download Image-Chat

The Image-Chat dataset can be accessed via ParlAI, with -t image_chat.

Running the Code

Contrastive pre-training:

bash scripts/run_contrastive_train.sh

Extracting the vision features and tokenizing the dialogue corpus:

bash scripts/run_extract_vokenize.sh

Generative pre-training:

bash scripts/run_generative_train.sh
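The three stages can be chained so that a failure stops the run. This is a minimal sketch, assuming the stages are meant to run in the order listed above and from the repository root; run_stages is a hypothetical wrapper, not a script shipped with this repo.

```python
import subprocess

# Script paths as listed in this README, in the order they appear.
STAGES = (
    "scripts/run_contrastive_train.sh",
    "scripts/run_extract_vokenize.sh",
    "scripts/run_generative_train.sh",
)


def run_stages(stages=STAGES, run=subprocess.run) -> list[str]:
    """Run each script with bash in order; stop at the first non-zero exit code."""
    finished = []
    for script in stages:
        # check=True raises CalledProcessError if the script fails.
        run(["bash", script], check=True)
        finished.append(script)
    return finished
```

The run parameter exists only so the function can be exercised without launching training; calling run_stages() with no arguments uses subprocess.run.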
