CIFAR-10H is a new dataset of soft labels reflecting human perceptual uncertainty for the 10,000-image CIFAR-10 test set, first appearing in the paper:
Joshua C. Peterson*, Ruairidh M. Battleday*, Thomas L. Griffiths, & Olga Russakovsky (2019). Human uncertainty makes classification more robust. In Proceedings of the IEEE International Conference on Computer Vision. (preprint)
And more recently in:
Ruairidh M. Battleday*, Joshua C. Peterson*, & Thomas L. Griffiths (2020). Capturing human categorization of natural images by combining deep networks and cognitive models. Nature Communications, 11(1), 1-14. (paper)
Pulkit Singh, Joshua C. Peterson, Ruairidh M. Battleday, & Thomas L. Griffiths (2020). End-to-end deep prototype and exemplar models for predicting human behavior. Proceedings of the 42nd Annual Conference of the Cognitive Science Society. (preprint)
10000 x 10 numpy matrix containing human classification counts (out of ~50) for each image and class.
10000 x 10 numpy matrix containing normalized human classification counts (probabilities) for each image and class. These are the labels used for training and evaluation in the above paper.
The order of the 10,000 labels matches the original CIFAR-10 test set order.
data/cifar10h-raw.zip - Zip archive containing
cifar10h-raw.csv, raw, annotator-level data. The columns are as follows:
annotator_idis a unique integer ID (starting at 0) for the annotator from Amazon Turk (but is not their Worker ID). For each ID, there are 210 associated ratings trials (10 attention checks + 200 normal trials).
trial_indexindexes the trial for each subject, from 0 to 219. The was cleaned up to not count the fixation click jspsych trials nor count practice trials which may vary in amount across participants.
is_attn_checkis 0 for normal trials and 1 for attention checks
true_categoryis the true string class name (e.g., "cat") of the image
chosen_categoryis the string class name (e.g., "cat") chosen by the annotator for the image
true_labelis the true integer class label of the image
chosen_labelthe integer class label chosen by the annotator for the image
correct_guesswhether the guess was correct (i.e., when true_label==chosen_label), marked by 1, or 0 otherwise
cifar10_test_set_idxis the index of the image in the original CIFAR-10 test set. Attention check trials are marked -99999 to prevent people from indexing the last image of the dataset by accident when using python.
image_filenameis the filename of the PNG filename for the stimulus
subcategoryshows the subordinate class label, which we've never used before, but which is recoverable from the image filenames and may be useful in the future.
reaction_timehow long in milliseconds taken by the annotator to provide a response for the current trial
time_elapsedgives milliseconds passed at the end of each trial. The first is equal to the rt for the first trial. The rest are essentially cumulative rt values.
The mapping from category names to labels is: "airplane": 0, "automobile": 1, "bird": 2, "cat": 3, "deer": 4, "dog": 5, "frog": 6, "horse": 7, "ship": 8, "truck": 9, which match the original CIFAR-10 dataset.
- Dataset statistics / summary
- Keras loading example
- PyTorch loading example
- Classifier evaluation comparison table
- Example training scripts
Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images (Vol. 1, No. 4, p. 7). Technical report, University of Toronto. (website)