Add image classification tutorial

PreferredAI · Jun 1, 2018 · 33860fe
1 parent a7884e0

Showing 25 changed files with 2,034 additions and 0 deletions.
diff --git a/1806-image-classification/README.md b/1806-image-classification/README.md
@@ -0,0 +1,267 @@
+# Prerequisites
+
+This code is written in [Python 3](https://www.python.org/downloads/) and requires [TensorFlow 1.7+](https://www.tensorflow.org/install/). It uses the TensorFlow 1.x API (for example `tf.read_file`), so a TensorFlow 2 installation will not run it unchanged. You also need to install a few more packages to process the data; [Anaconda](https://www.anaconda.com/download/) is recommended for managing the Python environment.
+
+First, you need to clone our repository.
+
+```bash
+$ git clone https://github.com/PreferredAI/Tutorials.git
+$ cd Tutorials/1806-image-classification
+```
+
+Run the command below to install the required packages.
+
+```bash
+$ pip3 install -r requirements.txt
+```
+
+# Tutorial #1 - Facial Expression Recognition
+
+## Dataset
+
+In this tutorial, we provide data consisting of 48x48 pixel grayscale images of faces. The task is to categorize each face, based on the emotion shown in the facial expression, into one of two categories (0=Sad, 1=Happy).
+
+We provide a script to download the dataset.
+
+```bash
+$ cd face-emotion
+$ chmod +x download.sh
+$ ./download.sh
+```
+
+The data is already split into training and test sets, with the statistics shown in the table below.
+
+| Class       | Training (# images) | Test (# images) |
+| :---------: | :-----------------: | :-------------: |
+| happy (1)   | 4347                | 483             |
+| sad (0)     | 4347                | 483             |
+| **Total**   | 8694                | 966             |
+
+## Model
+
+### Multilayer Perceptron (MLP)
+
+The MLP architecture can be viewed as:
+
+| Layer     | Dim/Kernel | Parameters                  |
+| :-------: | :--------: | --------------------------: |
+| fc1       | 512        | 48 x 48 x 512+1 = 1,179,649 |
+| fc2       | 512        | 512 x 512+1 = 262,145       |
+| output    | 2          | 512 x 2+1 = 1025            |
+| **Total** |            | 1,442,819                   |
+
+### Shallow CNN
+
+The shallow CNN architecture can be viewed as:
+
+| Layer     | Dim/Kernel | Parameters                       |
+| :-------: | :--------: | -------------------------------: |
+| conv      | 5 x 5      | 5 x 5 x 32+1 = 801               |
+| pool      |            | 0                                |
+| fc        | 512        | 24 x 24 x 32 x 512+1 = 9,437,185 |
+| output    | 2          | 512 x 2+1 = 1025                 |
+| **Total** |            | 9,439,011                        |
+
+
+### Deep CNN
+
+The deep CNN architecture can be viewed as:
+
+| Layer     | Dim/Kernel | Parameters                        |
+| :-------: | :--------: | --------------------------------: |
+| conv1     | 5 x 5      | 5 x 5 x 32 + 1 = 801              |
+| pool1     |            | 0                                 |
+| conv2     | 3 x 3      | 3 x 3 x 64 + 1 = 577              |
+| pool2     |            | 0                                 |
+| conv3     | 3 x 3      | 3 x 3 x 128 + 1 = 1153            |
+| conv4     | 3 x 3      | 3 x 3 x 128 + 1 = 1153            |
+| pool3     |            | 0                                 |
+| fc        | 512        | 6 x 6 x 128 x 512 + 1 = 2,359,297 |
+| output    | 2          | 512 x 2 + 1 = 1025                |
+| **Total** |            | 2,364,006                         |
+
+
+## Training and Evaluation
+
+Train a model (`mlp`, `shallow`, or `deep`):
+
+```bash
+$ cd src
+$ python3 train.py --model [model_name]
+```
+
+```
+optional arguments:
+  -h, --help                show this help message and exit
+  --data_dir                DATA_DIR
+                              Path to data folder (default: ../data)
+  --model                   MODEL
+                              Type of CNN model (shallow or deep)
+  --checkpoint_dir          CHECKPOINT_DIR
+                              Path to checkpoint folder (default: ../checkpoint)
+  --num_classes             NUM_CLASSES
+                              Number of label classes (default: 2)
+  --num_checkpoints         NUM_CHECKPOINTS
+                              Number of checkpoints to store (default: 1)
+  --log_dir                 LOG_DIR
+                              Path to data folder (default: ../log)
+  --num_epochs              NUM_EPOCHS
+                              Number of training epochs (default: 10)
+  --batch_size              BATCH_SIZE
+                              Batch Size (default: 32)
+  --dropout_rate            DROPOUT_RATE
+                              Probability of dropping neurons (default: 0.25)
+  --learning_rate           LEARNING_RATE
+                              Learning rate (default: 0.001)
+  --allow_soft_placement    ALLOW_SOFT_PLACEMENT
+                              Allow device soft device placement
+```
+
+
+## Results
+
+### Multilayer Perceptron
+
+```bash
+$ python3 train.py --model mlp
+```
+
+```text
+Epoch number: 1
+Training: 100%|███████████████████████████████████████████████████████████████████| 272/272 [00:10<00:00, 26.85it/s, loss=0.608]
+train_loss = 0.6452, train_acc = 63.61 %
+Testing: 100%|███████████████████████████████████████████████████████████████████| 31/31 [00:00<00:00, 37.88it/s, loss=0.536]
+test_loss = 0.5819, test_acc = 68.74 %
+
+...
+...
+...
+
+Epoch number: 10
+Training: 100%|███████████████████████████████████████████████████████████████████| 272/272 [00:09<00:00, 28.55it/s, loss=0.53]
+train_loss = 0.4391, train_acc = 79.03 %
+Testing: 100%|███████████████████████████████████████████████████████████████████| 31/31 [00:00<00:00, 35.52it/s, loss=0.18]
+test_loss = 0.5077, test_acc = 74.53 %
+Saved model checkpoint to ..\checkpoints\mlp\epoch_10
+
+Best accuracy = 74.53 %
+```
+
+
+### Shallow CNN
+
+```bash
+$ python3 train.py --model shallow
+```
+
+```text
+Epoch number: 1
+Training: 100%|███████████████████████████████████████████████████████████████████| 272/272 [00:33<00:00,  8.11it/s, loss=0.594]
+train_loss = 0.6214, train_acc = 65.38 %
+Testing: 100%|███████████████████████████████████████████████████████████████████| 31/31 [00:01<00:00, 22.09it/s, loss=0.396]
+test_loss = 0.5418, test_acc = 73.19 %
+
+...
+...
+...
+
+Epoch number: 10
+Training: 100%|███████████████████████████████████████████████████████████████████| 272/272 [00:33<00:00,  8.15it/s, loss=0.64]
+train_loss = 0.2343, train_acc = 90.40 %
+Testing: 100%|███████████████████████████████████████████████████████████████████| 31/31 [00:01<00:00, 20.50it/s, loss=0.108]
+test_loss = 0.4628, test_acc = 78.67 %
+
+Best accuracy = 78.88 %
+```
+
+### Deep CNN
+
+```bash
+$ python3 train.py --model deep
+```
+
+```text
+Epoch number: 1
+Training: 100%|███████████████████████████████████████████████████████████████████| 272/272 [01:00<00:00,  4.47it/s, loss=0.649]
+train_loss = 0.6766, train_acc = 56.68 %
+Testing: 100%|███████████████████████████████████████████████████████████████████| 31/31 [00:02<00:00, 13.47it/s, loss=0.639]
+test_loss = 0.6473, test_acc = 62.94 %
+
+...
+...
+...
+
+Epoch number: 10
+Training: 100%|███████████████████████████████████████████████████████████████████| 272/272 [01:01<00:00,  4.41it/s, loss=0.374]
+train_loss = 0.3042, train_acc = 86.71 %
+Testing: 100%|███████████████████████████████████████████████████████████████████| 31/31 [00:02<00:00, 12.41it/s, loss=0.586]
+test_loss = 0.3569, test_acc = 84.06 %
+
+Best accuracy = 86.13 %
+```
+
+## Test trained models with other images
+
+To run a trained model on other images:
+
+```bash
+$ python3 test.py --model [model_name] --data_dir [path_to_image_folder]
+```
+
+Some images are already provided in the *test_images* folder for a quick test.
+
+```bash
+$ python3 test.py --model deep --data_dir ../test_images
+```
+
+
+# Tutorial #2 - Visual Sentiment Analysis
+
+
+## Dataset and pre-trained models
+
+```bash
+$ cd vs-cnn
+$ chmod +x download.sh
+$ ./download.sh
+$ cd src
+```
+
+## Base Model (VS-CNN)
+
+Evaluate the pre-trained model.
+
+```bash
+$ python3 eval_base.py --dataset user
+```
+
+```text
+Boston
+Loading data file: ../data/user/val_Boston.txt
+Testing: 100%|███████████████████████████████████████████████████████████████████| 19/19 [00:36<00:00,  1.94s/it]
+Pointwise Accuracy = 0.544
+Pairwise Accuracy = 0.546
+
+Avg. Pointwise Accuracy = 0.544
+Avg. Pointwise Accuracy = 0.546
+```
+
+
+## Factor Model (uVS-CNN)
+
+Evaluate the pre-trained model with the trained factor weights.
+
+```bash
+$ python3 eval_factor.py --dataset user
+```
+
+```text
+Boston
+Loading data file: ../data/user/val_Boston.txt
+Testing: 100%|███████████████████████████████████████████████████████████████████| 1194/1194 [00:48<00:00, 24.41it/s]
+Pointwise Accuracy = 0.664
+Pairwise Accuracy = 0.720
+
+Avg. Pointwise Accuracy = 0.664
+Avg. Pointwise Accuracy = 0.720
+```
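Note on the parameter tables in README.md: every row follows the README's own convention of multiplying the listed dimensions and adding a single bias term (`... + 1`). The sketch below replays that arithmetic in Python so the per-layer values and the totals can be cross-checked. The helper name `table_count` is ours and is not part of the repository. A conventional count would use one bias per output unit and, for conv layers, include the input-channel dimension, so the sizes reported by TensorFlow may differ from these figures.

```python
from math import prod  # Python 3.8+


def table_count(*dims):
    """Product of the listed dimensions plus one bias term, as in the README tables."""
    return prod(dims) + 1


# Layer dimensions copied from the three README tables.
mlp = {
    "fc1": table_count(48, 48, 512),
    "fc2": table_count(512, 512),
    "output": table_count(512, 2),
}
shallow = {
    "conv": table_count(5, 5, 32),
    "fc": table_count(24, 24, 32, 512),
    "output": table_count(512, 2),
}
deep = {
    "conv1": table_count(5, 5, 32),
    "conv2": table_count(3, 3, 64),
    "conv3": table_count(3, 3, 128),
    "conv4": table_count(3, 3, 128),
    "fc": table_count(6, 6, 128, 512),
    "output": table_count(512, 2),
}

for name, layers in [("mlp", mlp), ("shallow", shallow), ("deep", deep)]:
    for layer, count in layers.items():
        print("%-8s %-7s %12s" % (name, layer, format(count, ",")))
    print("%-8s %-7s %12s" % (name, "total", format(sum(layers.values()), ",")))
```

Pooling layers are omitted because they have no parameters.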
diff --git a/1806-image-classification/face-emotion/.gitignore b/1806-image-classification/face-emotion/.gitignore
@@ -0,0 +1,8 @@
+data/
+log/
+checkpoints/
+
+src/.idea/
+src/__pycache__/
+src/data_prepare.py
+src/*.pyc
diff --git a/1806-image-classification/face-emotion/download.sh b/1806-image-classification/face-emotion/download.sh
@@ -0,0 +1,17 @@
+#!/bin/bash
+
+if [ -d "data" ]; then
+  echo "'data' folder already exists!"
+  echo "You need to remove it before downloading again."
+else
+  if [ ! -f 'data.tar.gz' ]; then
+    echo 'Data downloading ...'
+    echo
+    curl -L 'https://static.preferred.ai/tutorial/face-emotion/data.tar.gz' -o data.tar.gz
+  fi
+
+  echo "Data extracting ..."
+  tar -zxf data.tar.gz
+  rm data.tar.gz
+  echo "Data is ready!"
+fi
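Note on download.sh: the training log in README.md shows a Windows-style checkpoint path, so some users may not have `bash`, `curl`, and `tar` at hand. As a rough, portable sketch of the same control flow (skip if `data` exists, download only if the archive is missing, extract, delete the archive), something like the following could work. The function name `fetch_data` and its `workdir` parameter are hypothetical and not part of the repository; the URL is the one used by the script, and the archive is assumed to unpack into a `data` folder, as the script's existence check implies.

```python
import os
import tarfile
import urllib.request

# URL taken from download.sh
DATA_URL = "https://static.preferred.ai/tutorial/face-emotion/data.tar.gz"


def fetch_data(url=DATA_URL, archive="data.tar.gz", target="data", workdir="."):
    """Mirror download.sh; return True if data was extracted, False if skipped."""
    target_path = os.path.join(workdir, target)
    archive_path = os.path.join(workdir, archive)

    # Same guard as the script: never overwrite an existing data folder.
    if os.path.isdir(target_path):
        print("'%s' folder already exists!" % target)
        print("You need to remove it before downloading again.")
        return False

    # Download only when no archive is present yet.
    if not os.path.isfile(archive_path):
        print("Data downloading ...")
        urllib.request.urlretrieve(url, archive_path)

    print("Data extracting ...")
    # extractall trusts the archive contents; only use it on archives you trust.
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(workdir)
    os.remove(archive_path)
    print("Data is ready!")
    return True
```

Calling `fetch_data()` from the `face-emotion` folder would then play the role of `./download.sh`.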
diff --git a/1806-image-classification/face-emotion/src/data_generator.py b/1806-image-classification/face-emotion/src/data_generator.py
@@ -0,0 +1,87 @@
+import tensorflow as tf
+import pandas as pd
+import numpy as np
+
+class DataGenerator(object):
+  def __init__(self, train_file, test_file, batch_size, num_threads, buffer_size=10000, train_shuffle=True):
+    self.batch_size = batch_size
+    self.num_threads = num_threads
+    self.buffer_size = buffer_size
+    self.train_shuffle = train_shuffle
+
+    # read datasets from csv files
+    self.train_img_paths, self.train_labels = self._read_csv_file(train_file)
+    self.test_img_paths, self.test_labels = self._read_csv_file(test_file)
+
+    # number of batches per epoch
+    self.train_batches_per_epoch = int(np.ceil(len(self.train_labels) / batch_size))
+    self.test_batches_per_epoch = int(np.ceil(len(self.test_labels) / batch_size))
+
+    # build datasets
+    self._build_train_set()
+    self._build_test_set()
+
+    # create a reinitializable iterator given the dataset structure
+    self.iterator = tf.data.Iterator.from_structure(self.train_set.output_types,
+                                                    self.train_set.output_shapes)
+    self.train_init_opt = self.iterator.make_initializer(self.train_set)
+    self.test_init_opt = self.iterator.make_initializer(self.test_set)
+    self.next = self.iterator.get_next()
+
+  def load_train_set(self, session):
+    session.run(self.train_init_opt)
+
+  def load_test_set(self, session):
+    session.run(self.test_init_opt)
+
+  def get_next(self, session):
+    return session.run(self.next)
+
+  def _read_csv_file(self, data_file):
+    """Read a header-less CSV file and return arrays of image paths and labels."""
+    df = pd.read_csv(data_file, header=None)
+    img_paths = df[0].values
+    labels = df[1].values
+    return img_paths, labels
+
+  def _build_data_set(self, img_paths, labels, map_fn, shuffle=False):
+    img_paths = tf.convert_to_tensor(img_paths, dtype=tf.string)
+    labels = tf.convert_to_tensor(labels, dtype=tf.int32)
+    data = tf.data.Dataset.from_tensor_slices((img_paths, labels))
+    if shuffle:
+      data = data.shuffle(buffer_size=self.buffer_size)
+    data = data.map(map_fn, num_parallel_calls=self.num_threads)
+    data = data.batch(self.batch_size)
+    data = data.prefetch(self.num_threads)
+    return data
+
+  def _build_train_set(self):
+    self.train_set = self._build_data_set(self.train_img_paths,
+                                          self.train_labels,
+                                          self._parse_function_train,
+                                          self.train_shuffle)
+
+  def _build_test_set(self):
+    self.test_set = self._build_data_set(self.test_img_paths,
+                                         self.test_labels,
+                                         self._parse_function_test)
+
+  def _parse_function_train(self, filename, label):
+    # load and preprocess the image
+    img_string = tf.read_file(filename)
+    img_decoded = tf.cast(tf.image.decode_jpeg(img_string), dtype=tf.float32)
+    # Data augmentation (training only): random
+    # horizontal flip, applied before the pixel
+    # values are scaled and centered.
+    img_flipped = tf.image.random_flip_left_right(img_decoded)
+    img_scaled = tf.divide(img_flipped, tf.constant(255.0, dtype=tf.float32))
+    img_centered = tf.subtract(img_scaled, tf.constant(0.5, dtype=tf.float32))
+    return img_centered, label
+
+  def _parse_function_test(self, filename, label):
+    # load and preprocess the image
+    img_string = tf.read_file(filename)
+    img_decoded = tf.cast(tf.image.decode_jpeg(img_string), dtype=tf.float32)
+    img_scaled = tf.divide(img_decoded, tf.constant(255.0, dtype=tf.float32))
+    img_centered = tf.subtract(img_scaled, tf.constant(0.5, dtype=tf.float32))
+    return img_centered, label
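Note on data_generator.py: two pieces of arithmetic in this file are easy to check without TensorFlow. The sketch below is plain Python, not the repository's code. `batches_per_epoch` repeats the `ceil(N / batch_size)` formula from `__init__`, and `preprocess` mimics the parse functions on nested lists: optional left-right flip, divide by 255, subtract 0.5. The deterministic `flip` argument is our simplification; `tf.image.random_flip_left_right` flips at random.

```python
import math


def batches_per_epoch(num_examples, batch_size):
    """Same formula as DataGenerator.__init__: ceil(N / batch_size)."""
    return int(math.ceil(num_examples / batch_size))


def preprocess(pixels, flip=False):
    """Map 8-bit pixel rows to [-0.5, 0.5], optionally mirroring each row first."""
    rows = [list(reversed(row)) if flip else list(row) for row in pixels]
    return [[p / 255.0 - 0.5 for p in row] for row in rows]


# Dataset sizes from the README table, default batch size 32.
print(batches_per_epoch(8694, 32))  # 272, the 272/272 in the training progress bars
print(batches_per_epoch(966, 32))   # 31, the 31/31 in the testing progress bars

# A tiny 2x2 "image": black maps to -0.5 and white to 0.5.
print(preprocess([[0, 255], [51, 204]]))
print(preprocess([[0, 255], [51, 204]], flip=True))
```

The last batch of each epoch is smaller than 32 (8694 and 966 are not multiples of 32), which is why the count is rounded up rather than down.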