Skip to content


Folders and files

Last commit message
Last commit date

Latest commit



57 Commits

Repository files navigation

iFood 2019 Challenge @ FGVC6, CVPR 2019

What did you eat today? Can you develop a computational model smart enough to identify what you are eating? Automatic food identification can assist towards food intake monitoring to maintain a healthy diet. Food classification is a challenging problem due to the large number of food categories, high visual similarity between different food categories, as well as the lack of datasets that are large enough for training deep models. In this competition, we extend our last year's dataset to 251 fine-grained (prepared) food categories with 118,475 training images collected from the web. We provide human verified labels for both the validation set of 11,994 images and the test set of 28,377 images. The goal is to build a model to predict the fine-grained food-category label given an image.

The main challenges are:

  • Fine-grained Classes: The classes are fine-grained and visually similar. For example, the dataset has 15 different types of cakes, and 10 different types of pastas.

  • Noisy Data: Since the training images are crawled from the web, they often include images of raw ingredients or processed and packaged food items. This is refered to as cross-domain noise. Further, due to the fine-grained nature of food-categories, a training image may either be incorrectly labeled into a visually similar class or be annotated with with a single label despite having multiple food items.

This competition is part of the fine-grained visual-categorization workshop (FGVC6 workshop) at CVPR 2019. Individuals/teams with top submissions will present their work as a poster at the FGVC6 workshop.



Data Released April 22, 2019
Submission Deadline June 3, 2019
Winners Announced June 2019

Evaluation Server

The challenge is hosted on Kaggle


There is a total of 251 food categories in the dataset. A complete list of classes is available here.

Training Data

The training data consists of 120,216 images from 251 classes. The training data is collected from web images and consists of noisy labels.

Validation Data

The validation data consists of 12,170 images from 251 classes. The test data is collected from web images and the labels are human verified. It does not contain noisy labels.

Test Data

The training data consists of 28,399 images from 251 classes. The test data is collected from web images and the labels are human verified. It does not contain noisy labels.

Data Download and Format

Data can be downloaded from the links below or from Kaggle.

Annotations (3.0 MB)

  • Running md5sum annot.tar on the tar file should produce 0c632c543ceed0e70f0eb2db58eda3ab
  • The tar contains 4 files
    • class_list.txt: Contains the names of 251 class labels. This can be used to map class_ids with class names.
    • train_info.csv: Each line of this csv containing the "image_name,label" pair for training data. For example, "train_00000.jpg,94" refers to image train_00000.jpg having class_id 94. The class_id can be mapped to class name using class_list.txt.
    • val_info.csv: Each line of this csv containing the "image_name,label" pair for validation data.
    • test_info.csv: csv only provides the list of test images.
  • We provide separate tars for train, val and test images as mentioned below.

Train Images (2.3 GB)

  • Running md5sum train.tar on the tar file should produce 8e56440e365ee852dcb0953a9307e27f
  • Contains training images.
  • For label information see annotation file train_info.csv.

Validation Images (231 MB)

  • Running md5sum val.tar on the tar file should produce fa9a4c1eb929835a0fe68734f4868d3b
  • Contains validation images.
  • For label information see annotation file val_info.csv.

Test Images (548 MB)

  • Running md5sum train.tar on the tar file should produce 32479146dd081d38895e46bb93fed58f
  • Contains testing images.
  • The label will be evaluation on the evaluation server.


We follow a similar metric to the classification tasks of the ILSVRC. For each image , an algorithm will produce 3 labels , . For this competition each image has one ground truth label , and the error for that image is:


The overall error score for an algorithm is the average error over all test images:

Submission File Format

image_name,label1 label2 label3 
test_0001.jpg,0 1 10 
test_0002.jpg,1 3 5 
test_0003.jpg,0 5 1 

Please include the header as shown above for correct parsing. Each line will correspond to one test image and will be identified by the id (e.g test_0001.jpg refers to image test_0001.jpg) for computing accuracy.


  • Participants should use only the provided training and validation images for training models. Validation data should only be used for validation.
  • We do not allow augmentation with any prior datasets or additional data during training. Pretraining with additional data (such as ImageNet) is allowed as long as participants do not actively collect additional data for the target categories in iFood 2019 challenge. Use of any external data should be properly acknowledged and cited. The general rule is that we want participants to use only the provided training and validation images to train a model to classify the test images.
  • Collecting additional annotations for the train images is not allowed.
  • Hand labeling of test data is not allowed and will lead to disqualification.

Terms of Use

By downloading this dataset you agree to the following terms:

  • You will use the data only for non-commercial research and educational purposes.
  • You will NOT distribute the above images.
  • The organizers make no representations or warranties regarding the data, including but not limited to warranties of non-infringement or fitness for a particular purpose.
  • You accept full responsibility for your use of the data and shall defend and indemnify the organizers, including its employees, officers and agents, against any and all claims arising from your use of the data, including but not limited to your use of any copies of copyrighted images that you may create from the data.


We would like to thank CVDF Foundation and Tsung-Yi Lin for helping us with hosting the data.


Please cite the following paper if you are using this dataset for you research.

  title={FoodX-251: A Dataset for Fine-grained Food Classification},
  author={Kaur, Parneet and and Sikka, Karan and Wang, Weijun and Belongie, serge and Divakaran, Ajay},
  journal={arXiv preprint arXiv:1907.06167},


Karan Sikka, SRI International
Parneet Kaur*
Weijun Wang, Google
Ajay Divakaran, SRI International
Serge Belongie, Cornell University and Cornell Tech

*work done while Parneet was an intern at SRI International

For any further inquiries please contact us at


No description, website, or topics provided.






No releases published


No packages published