GitHub - jmhessel/catrank: Pretrained models for the ranking task described in Cats and Captions vs. Creators and the Clock (WWW 2017)

What's in here?

This repo contains pretrained image-only and image + text models that predict relative upvotes on Reddit. The models were trained on /r/pics, /r/aww, /r/cats, /r/FoodPorn, /r/MakeupAddiction, and /r/RedditLaqueristas, so if you want to know whether an image would probably be upvoted within these communities, you've come to the right place! If you want to read more about the technical details, check out the project page and paper here.

What is required to run this package?

To install requirements, run

pip install -r requirements.txt

How do I score images?

If you want to score an image according to the /r/aww community, run

python score_example.py examples/bodhi.jpg aww

which outputs:

examples/bodhi.jpg		34.8/100

The first column is the filename, and the second column is the image's score out of 100 (higher is better). The reported score is a percentile: it says where the image's raw model score falls among the scores on a test split.
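To make the percentile reading concrete, here is a minimal sketch of how a raw score could be turned into a percentile against a set of reference scores. The function name `percentile_of` and the reference values are hypothetical and chosen for illustration; this is not the code in score_example.py, and the repo may handle ties or rounding differently.

```python
from bisect import bisect_right


def percentile_of(raw_score, reference_scores):
    """Return the percentage of reference scores that are <= raw_score."""
    ordered = sorted(reference_scores)
    if not ordered:
        raise ValueError("reference_scores must not be empty")
    # bisect_right counts how many reference scores are <= raw_score.
    rank = bisect_right(ordered, raw_score)
    return 100.0 * rank / len(ordered)


# Hypothetical raw model outputs for a test split (made-up numbers).
test_split_scores = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5, 2.1, 2.8]

print(f"{percentile_of(0.5, test_split_scores):.1f}/100")
```

Under this reading, a score of 34.8/100 would mean the image's raw score is at or above roughly a third of the test-split scores.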

How do I score images plus their captions?

If you want to score a cat image alongside a caption according to the /r/cats community, run

python score_example.py examples/taz.jpg cats --caption "Please don't sit on me!"

which outputs

examples/taz.jpg		please dont sit on me		55.8/100
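Note that the caption in the output is lowercased and has its punctuation removed. The sketch below reproduces that displayed form with a simple regular expression. It is an assumption based only on the example output above, not the repo's actual preprocessing, and `normalize_caption` is a hypothetical name.

```python
import re


def normalize_caption(caption):
    """Lowercase a caption and drop everything except a-z, 0-9, and spaces."""
    lowered = caption.lower()
    # Assumption: punctuation is deleted outright, so "don't" becomes "dont".
    no_punct = re.sub(r"[^a-z0-9\s]", "", lowered)
    # Collapse any runs of whitespace left behind.
    return " ".join(no_punct.split())


print(normalize_caption("Please don't sit on me!"))  # please dont sit on me
```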

How do I score lots of images/captions?

If you want to score many images/captions at once, you can use --list_mode True; in this case, the image and caption arguments are assumed to be text files. The image text file has one filename per line, and the caption text file has one caption per line. The first line of the image file should correspond to the first line of the caption file, and so on. For example, you can run

python score_example.py examples/example_image_list.txt --caption examples/example_caption_list.txt cats --list_mode True

which outputs

examples/bodhi.jpg            	who says bulldogs cant be c...	22.1/100
examples/lizzy.jpg            	my 20 year old little girl ...	99.4/100
examples/taz.jpg              	please dont sit on me         	55.8/100

Unsurprisingly, the model doesn't like a dog (Bodhi) being posted in /r/cats, though the model likes the story about an elderly cat (Lizzy). As an interesting experiment, you can check the effect the captions had on the scores by running

python score_example.py examples/example_image_list.txt cats --list_mode True

and comparing to the previous output.
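If you save both outputs as text, the comparison can be automated. The sketch below assumes the tab-separated layout shown in the examples above, with the filename first and the `NN.N/100` score last. The helper names, filenames, and scores in it are all hypothetical placeholders, not real model results.

```python
def parse_scores(output_text):
    """Map each filename to its score, given tab-separated output lines."""
    scores = {}
    for line in output_text.splitlines():
        # Drop empty fields so repeated tabs and padding are tolerated.
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if len(fields) < 2:
            continue
        filename, score_field = fields[0], fields[-1]
        scores[filename] = float(score_field.split("/")[0])
    return scores


def caption_effect(with_captions, image_only):
    """Return score(with caption) - score(image only) for each shared file."""
    a = parse_scores(with_captions)
    b = parse_scores(image_only)
    return {name: round(a[name] - b[name], 1) for name in a if name in b}


# Made-up outputs for two placeholder files, for illustration only.
with_captions = "a.jpg\tsome caption\t60.0/100\nb.jpg\tanother caption\t20.5/100"
image_only = "a.jpg\t\t45.0/100\nb.jpg\t\t30.0/100"

for name, delta in caption_effect(with_captions, image_only).items():
    print(f"{name}\t{delta:+.1f}")
```

A positive difference means the caption raised the score relative to the image alone; a negative one means it lowered it.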

I want to train my own models!

If you want to train your own models, you'll need to get the datasets that these were trained on, which are not in this repo. They are available for download here.

Citation and contact

If you find the models here useful, please cite our paper!

@inproceedings{hessel2017cats,
	title={Cats and Captions vs. Creators and the Clock: Comparing Multimodal Content to Context in Predicting Relative Popularity},
	author={Hessel, Jack and Lee, Lillian and Mimno, David},
	booktitle={Proceedings of the 26th International Conference on the World Wide Web},
	year={2017},
	organization={International World Wide Web Conferences Steering Committee}
}

If you have any questions, you can contact jhessel@cs.cornell.edu
