kreimanlab/DeepLearning-vs-HighLevelVision

 
 

This repository contains the code and images used in Jacquot et al., 2020 [arXiv, CVPR]. A 1-minute video presentation of our work is available on YouTube.

Summary

Our work builds on the observation that image datasets used in machine learning contain many biases. Those biases help convolutional neural networks to classify images. For example, in the UCF101 dataset, algorithms can rely on the background color to classify human activities.



To address this issue, we followed a rigorous method to build three image datasets corresponding to three human behaviors: drinking, reading, and sitting. Below are some example images from our dataset. The models misclassified the bottom left, middle top, and bottom right pictures, whereas humans correctly classified all six pictures.



We reduced biases in our image datasets by running 100 to 300 cross-validations of a fine-tuned deep convolutional network (computer-vision/keras/misclassification_rate and computer-vision/matlab/alexnet_misclass_rate.m). These repeated cross-validations allowed us to rank images by their misclassification rate. We then excluded images that were classified too easily. The resulting datasets are less biased and more difficult for algorithms to classify.
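The ranking-and-filtering idea can be illustrated with a minimal sketch. It is not the code in this repository. It assumes repeated random held-out splits stand in for the cross-validations, that `train_and_predict` is a hypothetical callback wrapping whatever fine-tuned network is used, and that `min_rate` is an illustrative threshold rather than a value from the paper.

```python
import random
from collections import defaultdict


def misclassification_rates(labels, train_and_predict, n_rounds=100,
                            test_fraction=0.2, seed=0):
    """Estimate a per-image misclassification rate over repeated splits.

    labels: dict mapping image id -> true label.
    train_and_predict: hypothetical callback (train_ids, test_ids) -> dict
        mapping each test id to a predicted label.
    """
    rng = random.Random(seed)
    ids = sorted(labels)
    n_test = max(1, int(len(ids) * test_fraction))
    errors = defaultdict(int)   # times each image was misclassified
    counts = defaultdict(int)   # times each image was held out

    for _ in range(n_rounds):
        shuffled = ids[:]
        rng.shuffle(shuffled)
        test_ids, train_ids = shuffled[:n_test], shuffled[n_test:]
        predictions = train_and_predict(train_ids, test_ids)
        for image_id in test_ids:
            counts[image_id] += 1
            if predictions[image_id] != labels[image_id]:
                errors[image_id] += 1

    # Only images that were held out at least once get a rate.
    return {i: errors[i] / counts[i] for i in ids if counts[i] > 0}


def drop_easy_images(rates, min_rate):
    """Keep images whose misclassification rate is at least min_rate,
    ordered from hardest to easiest."""
    kept = [i for i, r in rates.items() if r >= min_rate]
    return sorted(kept, key=lambda i: rates[i], reverse=True)
```

In use, `train_and_predict` would fine-tune the network on `train_ids` and return its predictions for `test_ids`; the images returned by `drop_easy_images` are the ones the model found hard to classify.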


The ground truth label for each image was created by asking 3 participants to assign the image to a yes or no class for each action. We also conducted a separate psychophysics experiment (human-vision) in which images were presented to human participants. Each trial consisted of fixation (500 ms), image presentation (50, 150, 400, or 800 ms), and a forced-choice yes/no question that remained until the participant answered.

Example GIF with 800 ms image presentation, played on repeat.
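The trial structure described above can be written down as a small schedule generator. This is only a sketch. The timings come from the description, but the random assignment of one presentation duration per image, the shuffled trial order, and all names (`build_trials`, the dictionary keys) are assumptions and are not taken from the human-vision code.

```python
import random

FIXATION_MS = 500                      # fixation period before each image
PRESENTATION_MS = (50, 150, 400, 800)  # possible image presentation durations


def build_trials(image_ids, seed=0):
    """Build a shuffled list of trials, one per image.

    Assumption: each image gets one randomly chosen presentation duration.
    """
    rng = random.Random(seed)
    trials = []
    for image_id in image_ids:
        trials.append({
            "image": image_id,
            "fixation_ms": FIXATION_MS,
            "presentation_ms": rng.choice(PRESENTATION_MS),
            # Forced-choice yes/no answer, filled in when the participant
            # responds; the question stays up until then.
            "response": None,
        })
    rng.shuffle(trials)
    return trials
```

Each dictionary describes one trial in order: fixation, image presentation, then the yes/no question.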

License

We release our work under the Kreiman Lab's license.

Citation

If you find our dataset useful in your research, please consider citing:

@InProceedings{Jacquot_2020_CVPR,
author = {Jacquot, Vincent and Ying, Zhuofan and Kreiman, Gabriel},
title = {Can Deep Learning Recognize Subtle Human Activities?},
booktitle = {The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}

About

Code and database for Jacquot et al. CVPR 2020. Can we decode subtle human activities?
