Evaluation code for MS COCO caption generation.

Requirements:
- Java 1.8.0
- Python 2 or 3
- gensim (used by the WMD metric)
Files:
./
- cocoEvalCapDemo.py (demo script)

./annotation
- captions_val2014.json (MS COCO 2014 caption validation set)
- Visit the MS COCO download page for more details.

./results
- captions_val2014_fakecap_results.json (an example of fake results for running the demo)
- Visit the MS COCO format page for more details.
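A results file like the fake one above follows the COCO caption results format: a JSON array of objects, each pairing an image_id with a generated caption. A minimal sketch (the image IDs and captions below are made up for illustration):

```python
import json

# Hypothetical entries in the COCO caption results format:
# a JSON array of {"image_id": int, "caption": str} objects, one per image.
results = [
    {"image_id": 404464, "caption": "a black and white photo of a street"},
    {"image_id": 380932, "caption": "a group of people standing around a table"},
]

# Serializing this list (e.g. to captions_val2014_fakecap_results.json)
# yields a file the evaluation server and this code can consume.
payload = json.dumps(results)
```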
./pycocoevalcap: The folder where all evaluation code is stored.
- eval.py: Contains the COCOEvalCap class that can be used to evaluate results on COCO.
- tokenizer: Python wrapper of the Stanford CoreNLP PTBTokenizer
- bleu: BLEU evaluation code
- meteor: METEOR evaluation code
- rouge: ROUGE-L evaluation code
- cider: CIDEr evaluation code
- spice: SPICE evaluation code
- wmd: Word Mover's Distance evaluation code
Setup:
- You will first need to download the Stanford CoreNLP 3.6.0 code and models for use by SPICE. To do this, run: bash get_stanford_models.sh
- Note: SPICE will try to create a cache of parsed sentences in ./pycocoevalcap/spice/cache/. This dramatically speeds up repeated evaluations. The cache directory can be moved by setting 'CACHE_DIR' in ./pycocoevalcap/spice. In the same file, caching can be turned off by removing the '-cache' argument to 'spice_cmd'.
- You will also need to download the Google News negative 300 word2vec model for use by WMD. To do this, run: bash get_google_word2vec_model.sh
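Once the models are downloaded, an evaluation run follows the flow of cocoEvalCapDemo.py. The sketch below mirrors that demo; it assumes pycocotools is installed, the annotation and results files exist on disk, and that the COCOEvalCap import path matches the one used by the demo script (check cocoEvalCapDemo.py if your fork differs). The function name evaluate_captions is hypothetical.

```python
def evaluate_captions(ann_file, res_file):
    """Score a caption results file against ground-truth annotations.

    Illustrative sketch of the flow in cocoEvalCapDemo.py; assumes
    pycocotools is installed and both JSON files exist.
    """
    from pycocotools.coco import COCO
    from pycocoevalcap.eval import COCOEvalCap  # import path per the demo script

    coco = COCO(ann_file)              # ground-truth captions
    coco_res = coco.loadRes(res_file)  # generated captions
    coco_eval = COCOEvalCap(coco, coco_res)
    # Restrict evaluation to the images present in the results file.
    coco_eval.params['image_id'] = coco_res.getImgIds()
    coco_eval.evaluate()
    return coco_eval.eval              # dict of metric name -> corpus score
```

For example, evaluate_captions('annotation/captions_val2014.json', 'results/captions_val2014_fakecap_results.json') would score the fake results shipped with the repo.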
AllSPICE is a metric that measures both the diversity and the accuracy of a generated caption set. It was proposed in "Analysis of diversity-accuracy tradeoff in image captioning".
See cocoEvalAllSPICEDemo.ipynb to learn how to use it.
You can also check out ruotianluo/self-critical.pytorch/eval_multi.py to see how it is used in practice, and ruotianluo/SPICE to see what changes were made to the original SPICE code to implement AllSPICE.
References:
- Microsoft COCO Captions: Data Collection and Evaluation Server
- PTBTokenizer: We use the Stanford Tokenizer, which is included in Stanford CoreNLP 3.4.1.
- BLEU: BLEU: a Method for Automatic Evaluation of Machine Translation
- Meteor: Project page with related publications. We use the latest version (1.5) of the code. Changes have been made to the source code to properly aggregate the statistics for the entire corpus.
- Rouge-L: ROUGE: A Package for Automatic Evaluation of Summaries
- CIDEr: CIDEr: Consensus-based Image Description Evaluation
- SPICE: SPICE: Semantic Propositional Image Caption Evaluation
- WMD: From Word Embeddings to Document Distances (original metric publication) and Re-evaluating Automatic Metrics for Image Captioning (publication with the metric adapted for caption generation)
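To make one of the metrics above concrete, here is a minimal sketch of clipped (modified) unigram precision, the core quantity that BLEU aggregates over n-gram orders. It is illustrative only, not the full BLEU score (no higher-order n-grams, no brevity penalty), and the function name is hypothetical.

```python
from collections import Counter

def clipped_unigram_precision(candidate, references):
    """Modified (clipped) unigram precision, the core quantity in BLEU.

    Each candidate word is credited at most as many times as it appears
    in any single reference, which penalizes degenerate repetition.
    """
    cand_counts = Counter(candidate.split())
    # For each word, the maximum count over all references (the "clip" ceiling).
    max_ref = Counter()
    for ref in references:
        for word, count in Counter(ref.split()).items():
            max_ref[word] = max(max_ref[word], count)
    clipped = sum(min(count, max_ref[word]) for word, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

# The degenerate caption "the the the" gets only one credited "the"
# against the reference "the cat sat", so its precision is 1/3.
```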
Acknowledgements:
- Stop words distributed with the NLTK Stopwords Corpus (nltk.corpus.stopwords.words('english')), which originate from https://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/snowball/stopwords/ and were later augmented in nltk/nltk_data#22, were extracted into a text file in pycocoevalcap/wmd/data so that users do not need to install NLTK.
- Special thanks to David Semedo (https://github.com/davidfsemedo/coco-caption) for writing the first Python 3 compatible version of coco-caption, which was used as a reference for this fork.
Developers:
- Xinlei Chen (CMU)
- Hao Fang (University of Washington)
- Tsung-Yi Lin (Cornell)
- Ramakrishna Vedantam (Virginia Tech)
- David Chiang (University of Notre Dame)
- Michael Denkowski (CMU)
- Alexander Rush (Harvard University)
- Mert Kilickaya (Hacettepe University)