## Steps for Building Network
[Step 1](#step1): Unzip image dataset and check out train, validation and test files.

[Step 2](#step2): Show image and json files in train & validation dataset.

[Step 3](#step3): Initialize features(input) and labels(output) from images and json list.

- read images from train/validation/test path.
- read labels from train/validation json file.
- resize and normalize images.
- get batch and return feature_batch and label_batch.

[Step 4](#step4): Build convolutional network, return training accuracy and training loss.

[Step 5](#step5): Train on steps = 100.

[Step 6](#step6): Train on full dataset.
- epoch x, batch x, training loss, validation accuracy, evluation accuracy

[Step 7](#step7): Test and write json submit file.

<a id='step1'></a>
## Step 1: Unzip image dataset and check out train, validation and test files.

In [23]:
import os, zipfile

train_path = 'E:/ai_challenger/scene classification/dataset/ai_challenger_scene_train_20170904.zip'
validation_path = 'E:/ai_challenger/scene classification/dataset/ai_challenger_scene_validation_20170908.zip'
test_a_path = 'E:/ai_challenger/scene classification/dataset/ai_challenger_scene_test_a_20170922.zip'
extract_path = 'E:/ai_challenger/scene classification/dataset'

def unzip(zipfile_path, extract_path, zipfile_name):
    zipfile = zipfile.ZipFile(zipfile_path, 'r')
    print('Extracting {} ...'.format(zipfile_name))
    zipfile.extractall()
    zipfile.close()
    print('{} has been extracted.'.format(zipfle_name))

if os.path.exists(extract_path):
    print('Found extraced dataset')
else:
    unzip(train_path, extract_path, 'training dataset')
    unzip(validation_path, extract_path, 'validation dataset')
    unzip(test_a_path, extract_path, 'test dataset')

Found extraced dataset


<a id='step2'></a>
## Step 2: Show image and json files in train & validation dataset.

In [24]:
import json
import glob
from scipy.misc import imread, imresize, imsave
import numpy as np

train_features_path = r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_train_20170904\scene_train_images_20170904'
train_labels_path = r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_train_20170904\scene_train_annotations_20170904.json'
validation_features_path =r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_validation_20170908\scene_validation_images_20170908'
validation_labels_path =r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_validation_20170908\scene_validation_images_20170908\scene_validation_annotations_20170908.json'
test_a_features_path = r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_test_a_20170922\scene_test_a_images_20170922'

# Show train label list
with open(train_labels_path, 'r') as f:
    train_label_list = json.load(f)
    print(train_label_list[:10])
    train_dict = {}
    for image in train_label_list:
        train_dict[image['image_id']] = int(image['label_id'])
    print('\n')
    print(len(train_dict))     

# Show train features list resulting out of memory...

[{'image_url': 'https://n1-q.mafengwo.net/s1/M00/6B/72/wKgBm04Wc5WzFXU0AAHf09bdpiY84.jpeg?imageView2%2F2%2Fw%2F600%2Fq%2F90', 'label_id': '66', 'image_id': '79f993ae0858ae238b22968c5934d1ddba585ae4.jpg'}, {'image_url': 'http://news.sogou.com/', 'label_id': '61', 'image_id': 'e963208fe9e90df0c385f7367bcdb6d0d5d0b165.jpg'}, {'image_url': 'http://img2.fawan.com/2016/12/30/e967f93e7713c57cd2b00b832dd6091a_500x-_90.jpg', 'label_id': '64', 'image_id': '02df5ecbf7c749ccc9d833f129bbd5d9837940ce.jpg'}, {'image_url': 'https://b1-q.mafengwo.net/s1/M00/F2/C9/wKgBm04Wx3a-gk2FAAKbPKX7E9w91.jpeg?imageView2%2F2%2Fw%2F600%2Fq%2F90', 'label_id': '31', 'image_id': '5620eb385b7567fb087813cf5233b5ceecdeeca3.jpg'}, {'image_url': 'http://news.sogou.com/', 'label_id': '19', 'image_id': 'f8b4d42001a562fc63b9b39c02531661c0e236ca.jpg'}, {'image_url': 'http://www.user2.jqw.com/2014/01/06/1347666/product/b201401072000291460.JPG', 'label_id': '11', 'image_id': '57e7eb438670a4519041dab1482f2594a92f8a09.jpg'}, {'imag

<a id='step3'></a>
## Step 3: Initialize features(input) and labels(output) from images and json list.

In [25]:
import json
from scipy.misc import imread, imresize
import numpy as np
import os

class initialize(object):
    # Get image-label list for train and validation
    def __init__(self, feature_path, label_path):
        self.image_label_dict = {}
        with open(label_path, 'r') as f:
            label_list = json.load(f)
        for image in label_list:
            self.image_label_dict[image['image_id']] = int(image['label_id'])
        self.start = 0
        self.end = 0
        self.length = len(self.image_label_dict) # number of feature images
        self.image_name = list(self.image_label_dict.keys())
        self.feature_path = feature_path
    
    # Read image in feature path, resize and normalize to [-1, 1]
    def get_image(self, image_path, image_size):
        image = imread(image_path)
        image = imresize(image, [image_size, image_size])       
        image = np.array(image).astype(np.float32)
        image = 2 * (image - np.min(image)) / np.ptp(image) - 1
        return image
    
    # Get feature and label batch
    def get_batch(self, batch_size, image_size):
        self.start = self.end
        if self.start >= self.length:
            self.start = 0
        batch_feature = []
        batch_label = []
        index = self.start

        while len(batch_feature) < batch_size:
            i_image_path = os.path.join(self.feature_path, self.image_name[index])
            i_image = self.get_image(i_image_path, image_size)
            i_label = self.image_label_dict[self.image_name[index]]

            batch_feature.append(i_image)
            batch_label.append(i_label)
            index += 1
        self.end = index
        return batch_feature, batch_label

<a id='step4'></a>
## Step 4: Build convolutional network, return training accuracy and training loss.

In [26]:
import tensorflow as tf

def conv_network(feature, label, num_class, image_size, keep_prob):
    # Input layer
    input_layer = tf.reshape(feature, [-1, image_size, image_size, 3])

    # Conv1 
    conv1 = tf.layers.conv2d(
        inputs=input_layer, 
        filters=32, 
        kernel_size=3,
        strides=1, 
        padding='same', 
        activation=tf.nn.relu,
        kernel_initializer=tf.truncated_normal_initializer()
        )
    # Batch normalization 1
    bn1 = tf.layers.batch_normalization(conv1)
    pool1 = tf.layers.max_pooling2d(inputs=bn1, pool_size=[2, 2], strides=2)
    # Max pooling 1: [-1, image_size/2, image_size/2, 32]

    # Conv2
    conv2 = tf.layers.conv2d(
        inputs=pool1, 
        filters=64, 
        kernel_size=3,
        strides=1,
        padding='same', 
        activation=tf.nn.relu,
        kernel_initializer=tf.truncated_normal_initializer()
        )
    # Batch normalization 2
    bn2 = tf.layers.batch_normalization(conv2) 
    pool2 = tf.layers.max_pooling2d(inputs=bn2, pool_size=[2, 2], strides=2)
    # Max pooling 2: [-1, image_size/4, image_size/4, 64]

    # Conv3
    conv3 = tf.layers.conv2d(
        inputs=pool2, 
        filters=128, 
        kernel_size=3,
        strides=1,
        padding='same', 
        activation=tf.nn.relu,
        kernel_initializer=tf.truncated_normal_initializer()
        )
    # Batch normalization 2
    bn3 = tf.layers.batch_normalization(conv3)
    pool3 = tf.layers.max_pooling2d(inputs=bn3, pool_size=[2, 2], strides=2)
    # Max pooling 2: [-1, image_size/8, image_size/8, 128]

    # Flatten layer
    flatten = tf.reshape(pool3, [-1, image_size * image_size * 2]) 

    # Fully connected layer
    dense = tf.layers.dense(inputs=flatten, units=1024)
    dropout = tf.nn.dropout(dense, keep_prob) # or tf.layers.dropout(inputs, rate)

    # Output layer: returns logits and predictions
    logits = tf.layers.dense(dropout, units=num_class) 
    output = tf.sigmoid(logits)

    # Loss and optimizer
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=label))
    train_opt = tf.train.AdamOptimizer(learning_rate).minimize(cost)
    
    return train_opt, cost, logits

<a id='step5'></a>
## Step 5: Train on steps = 1000

In [30]:
import tensorflow as tf

def train(train_feature_path, train_label_path, num_class, batch_size, image_size, train_steps):
    train = initialize(train_feature_path, train_label_path)
        
    feature = tf.placeholder(tf.float32, shape=[None, image_size, image_size, 3], name='feature')
    label = tf.placeholder(tf.float32, shape=[None], name='label')
    keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    one_hot_label = tf.one_hot(indices=tf.cast(label, tf.int32), depth=80)
    train_opt, cost, logits = conv_network(feature, one_hot_label, num_class, image_size, keep_prob)
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_label, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float32'))
    
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        
        for steps in range(train_steps):
            train_feature_batch, train_label_batch = train.get_batch(batch_size, image_size)
            sess.run(train_opt, feed_dict={feature: train_feature_batch, label: train_label_batch, keep_prob: 0.5})
                
            if steps % 10 == 0:
                train_accuracy = sess.run(accuracy, feed_dict={feature: train_feature_batch, label: train_label_batch, keep_prob: 0.5})
                train_loss = sess.run(cost, feed_dict={feature: train_feature_batch, label: train_label_batch, keep_prob: 0.5})
                print('Step {}'.format(steps+1),
                      'Training Accuracy {:.3f}...'.format(train_accuracy),
                      'Training Loss {:.3f}...'.format(train_loss),
                     ) 
        print('Training completed')

# Train on 100 steps:
train_feature_path = r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_train_20170904\scene_train_images_20170904'
train_label_path = r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_train_20170904\scene_train_annotations_20170904.json'
num_class = 80
image_size = 32
batch_size = 64
learning_rate =1e-3
train_steps = 100
train(train_feature_path, train_label_path, num_class, batch_size, image_size, train_steps)

Step 1 Training Accuracy 0.062... Training Loss 4287.052...
Step 11 Training Accuracy 0.016... Training Loss 5484.894...
Step 21 Training Accuracy 0.062... Training Loss 2701.367...
Step 31 Training Accuracy 0.078... Training Loss 1672.494...
Step 41 Training Accuracy 0.047... Training Loss 1068.504...
Step 51 Training Accuracy 0.047... Training Loss 773.654...
Step 61 Training Accuracy 0.062... Training Loss 697.697...
Step 71 Training Accuracy 0.047... Training Loss 583.394...
Step 81 Training Accuracy 0.172... Training Loss 459.619...
Step 91 Training Accuracy 0.141... Training Loss 415.706...
Training completed
