Promoting Coherence and Diversity in Image Captioning
This repository includes the reference code for conventional diverse image captioning models and CLIP-CVAE.
- Python 3.8
- PyTorch 1.9
- transformers 4.12
To run the code, the annotations and images of the COCO dataset are needed. Please download the zip files containing the images (train2014.zip, val2014.zip) and the zip file containing the annotations (annotations_trainval2014.zip), then extract them. The paths to the extracted files will be passed as command-line arguments later.
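Once extracted, the caption annotations are plain JSON in the standard COCO format (an `images` list keyed by `id` and an `annotations` list keyed by `image_id`). As a minimal sketch of how they can be grouped per image, the snippet below uses a tiny inline dict mimicking `annotations/captions_train2014.json`; in practice you would `json.load` the real file from the path you extracted to.

```python
import json
from collections import defaultdict

# Tiny inline stand-in for annotations/captions_train2014.json
# (same structure as the real COCO captions file; values are illustrative).
coco = {
    "images": [
        {"id": 1, "file_name": "COCO_train2014_000000000001.jpg"},
    ],
    "annotations": [
        {"image_id": 1, "caption": "A dog running on the beach."},
        {"image_id": 1, "caption": "A brown dog plays in the sand."},
    ],
}
# With the real file you would instead do:
# with open("annotations/captions_train2014.json") as f:
#     coco = json.load(f)

# Map image_id -> list of reference captions for that image.
captions_by_image = defaultdict(list)
for ann in coco["annotations"]:
    captions_by_image[ann["image_id"]].append(ann["caption"])

for img in coco["images"]:
    print(img["file_name"], len(captions_by_image[img["id"]]), "captions")
```

Each COCO image typically has five reference captions; grouping them by `image_id` up front is the usual first step before building training pairs.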
This repository builds on code released on GitHub and Hugging Face. Thanks to the respective authors for releasing their code.