masterTW/Tensorflow_NIMA_Food_Photo_Aesthetic_Evaluation

Tensorflow_NIMA_Food_Photo_Aesthetic_Evaluation

An implementation of Google's NIMA (Neural Image Assessment) paper using TensorFlow-Slim. It scores food photos with a VGG16 model: a score of 5 or above indicates a successful photo, and a score below 5 indicates a weaker one.

User Story:

  • A user wants to publish a popular food journal but does not know which photos to pick from a large collection. This system helps the user select the most appealing food photos.

Introduction:

  • Google's NIMA paper [1] evaluates photos aesthetically using the AVA aesthetic photo dataset [3]. Another paper [2] shows that a model for evaluating food photos can be trained successfully with only a small number of AVA food photos. This project therefore uses 5000 AVA food photos as its dataset.
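The NIMA paper [1] trains the network to predict a distribution over the scores 1 to 10 and compares it with the ground-truth distribution using an earth mover's distance (EMD) loss with r = 2. The following is a minimal pure-Python sketch of that loss for a single image. The function name `emd_loss` is hypothetical, and whether this project's training script uses exactly this form is an assumption.

```python
def emd_loss(p_true, p_pred, r=2):
    """Earth mover's distance between two discrete score distributions.

    Both inputs are sequences of probabilities over the same ordered
    score bins (for NIMA, the ten scores 1..10), each summing to 1.
    """
    if len(p_true) != len(p_pred):
        raise ValueError("distributions must have the same number of bins")

    cdf_true = 0.0
    cdf_pred = 0.0
    total = 0.0
    for t, q in zip(p_true, p_pred):
        # Accumulate the cumulative distributions bin by bin.
        cdf_true += t
        cdf_pred += q
        total += abs(cdf_true - cdf_pred) ** r

    # Average over the bins, then take the r-th root.
    return (total / len(p_true)) ** (1.0 / r)
```

Because the loss works on cumulative distributions, predicting a score of 6 when the truth is 7 is penalized less than predicting a score of 1, which plain cross-entropy would not capture.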

Requirements:

  • Python 3
  • TensorFlow

Training:

  1. Download the AVA dataset.
  2. Download the Slim VGG16 pre-trained model. The Slim pre-trained CNNs were trained on the ILSVRC-2012-CLS image classification dataset.
  3. Convert the AVA food data to TFRecords by running the following command.
    python3 convert_tfrecord.py --ava_dir=<path to ava_dir> --dataset_dir=<TFRecord storage path>
  4. Train the model by running the following command.
    sudo python3 train_nima_vgg16.py --checkpoint_path=<path to pre-trained model> --dataset_dir=<path to TFRecords_dir>
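Before images can be written to TFRecords, each AVA annotation has to be turned into a training label. The sketch below shows one way to do that, assuming the usual AVA.txt layout (index, image ID, then ten vote counts for the scores 1 to 10, followed by tag and challenge IDs). The function name `ava_line_to_distribution` is hypothetical and is not taken from convert_tfrecord.py.

```python
def ava_line_to_distribution(line):
    """Parse one AVA.txt line into (image_id, normalized score distribution).

    Assumed column layout (whitespace separated):
      0      running index
      1      image ID
      2-11   number of votes for scores 1..10
      12+    semantic tag IDs and challenge ID (ignored here)
    """
    fields = line.split()
    if len(fields) < 12:
        raise ValueError("expected at least 12 columns, got %d" % len(fields))

    image_id = fields[1]
    counts = [int(c) for c in fields[2:12]]
    total = sum(counts)
    if total == 0:
        raise ValueError("image %s has no votes" % image_id)

    # Normalize vote counts so they sum to 1 and can serve as the label.
    distribution = [c / total for c in counts]
    return image_id, distribution
```

For a made-up line such as "1 12345 0 0 1 2 4 2 1 0 0 0 0 0 0", this returns the image ID "12345" and a distribution that peaks at score 5.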

Evaluation:

  1. Download the model trained by this project, or use your own trained model.
  2. Run the command below; the program loads all photos in the given photo directory and evaluates them aesthetically.
    python3 evaluate_nima_vgg16.py --photo_dir=<path to photodir> --vgg16_path=<path to vgg16>
    example:
    python3 evaluate_nima_vgg16.py --photo_dir=image/ --vgg16_path=vgg/nima-22500
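Once every photo has a score, picking candidates for publication is a filter and a sort. The sketch below assumes the scores have already been collected into a dictionary that maps file names to mean scores. The output format of evaluate_nima_vgg16.py is not documented here, so that dictionary and the name `select_best_photos` are hypothetical.

```python
def select_best_photos(scores, threshold=5.0, top_k=None):
    """Return (file name, score) pairs at or above the threshold, best first.

    scores    -- dict mapping photo file name to its mean aesthetic score
    threshold -- minimum score for a photo to count as successful
    top_k     -- optional cap on the number of photos returned
    """
    good = [(name, score) for name, score in scores.items() if score >= threshold]
    # Highest score first.
    good.sort(key=lambda item: item[1], reverse=True)
    return good if top_k is None else good[:top_k]
```

The default threshold of 5 follows the project's own rule that a score of 5 or above means a successful photo.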

Result:

  • An aesthetic score of 5 or above means a successful photo; a score below 5 means a weaker one. The model was tested on 500 AVA food photos and reached an accuracy of up to 73.5%, which matches the result reported in [2].
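A NIMA-style model outputs a probability for each score from 1 to 10, and the single score compared against 5 is the mean of that distribution. The sketch below shows this computation together with one plausible way to measure accuracy: the share of photos for which the predicted and ground-truth distributions fall on the same side of the threshold. How this project computed its accuracy figure is not stated, so the accuracy definition and all function names here are assumptions.

```python
def mean_score(distribution):
    """Mean of a distribution over the ordered scores 1..len(distribution)."""
    return sum(score * p for score, p in enumerate(distribution, start=1))


def is_successful(distribution, threshold=5.0):
    """A photo counts as successful when its mean score reaches the threshold."""
    return mean_score(distribution) >= threshold


def binary_accuracy(true_dists, pred_dists, threshold=5.0):
    """Fraction of photos where prediction and ground truth agree on success."""
    pairs = list(zip(true_dists, pred_dists))
    if not pairs:
        raise ValueError("no photos to evaluate")
    hits = sum(
        is_successful(t, threshold) == is_successful(p, threshold)
        for t, p in pairs
    )
    return hits / len(pairs)
```

A uniform distribution over ten scores has a mean of 5.5, so it would be classed as successful under this rule.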

To-Do List:

  • Reduce the size of the model file by switching to MobileNet.
  • Create a web version with TensorFlow.js.

References:

  1. Hossein Talebi, Peyman Milanfar. "NIMA: Neural Image Assessment", IEEE Transactions on Image Processing, 2017
  2. Jiayu Lou, Hang Yang. "Food Image Aesthetic Quality Measurement by Distribution Prediction", 2018
  3. Naila Murray, Luca Marchesotti, Florent Perronnin. "AVA: A Large-Scale Database for Aesthetic Visual Analysis", 2012
