## Steps for Building Network
[Step 1](#step1): Unzip image dataset and check out train, validation and test files.

[Step 2](#step2): Show image and json files in train & validation dataset.

[Step 3](#step3): Initialize features(input) and labels(output) from images and json list.

- read images from train/validation/test path.
- read labels from train/validation json file.
- resize and normalize images.
- get batch and return feature_batch and label_batch.

[Step 4](#step4): Build convolutional network, return training accuracy and training loss.

[Step 5](#step5): Train on steps = 20000.

[Step 6](#step6): Train on full dataset.
- epoch x, batch x, training loss, validation accuracy, evluation accuracy

[Step 7](#step7): Test and write json submit file.

<a id='step1'></a>
## Step 1: Unzip image dataset and check out train, validation and test files.

In [1]:
import os, zipfile

train_path = 'E:/ai_challenger/scene classification/dataset/ai_challenger_scene_train_20170904.zip'
validation_path = 'E:/ai_challenger/scene classification/dataset/ai_challenger_scene_validation_20170908.zip'
test_a_path = 'E:/ai_challenger/scene classification/dataset/ai_challenger_scene_test_a_20170922.zip'
extract_path = 'E:/ai_challenger/scene classification/dataset'

def unzip(zipfile_path, extract_path, zipfile_name):
    zipfile = zipfile.ZipFile(zipfile_path, 'r')
    print('Extracting {} ...'.format(zipfile_name))
    zipfile.extractall()
    zipfile.close()
    print('{} has been extracted.'.format(zipfle_name))

if os.path.exists(extract_path):
    print('Found extraced dataset')
else:
    unzip(train_path, extract_path, 'training dataset')
    unzip(validation_path, extract_path, 'validation dataset')
    unzip(test_a_path, extract_path, 'test dataset')

Found extraced dataset


<a id='step2'></a>
## Step 2: Show image and json files in train & validation dataset.

In [2]:
import json
import glob
from scipy.misc import imread, imresize, imsave
import numpy as np

train_features_path = r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_train_20170904\scene_train_images_20170904'
train_labels_path = r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_train_20170904\scene_train_annotations_20170904.json'
validation_features_path =r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_validation_20170908\scene_validation_images_20170908'
validation_labels_path =r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_validation_20170908\scene_validation_images_20170908\scene_validation_annotations_20170908.json'
test_a_features_path = r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_test_a_20170922\scene_test_a_images_20170922'

# Show train label list
with open(train_labels_path, 'r') as f:
    train_label_list = json.load(f)
    print(train_label_list[:10])
    train_dict = {}
    for image in train_label_list:
        train_dict[image['image_id']] = int(image['label_id'])
    print('\n')
    print(len(train_dict))     

# Show train features list resulting out of memory...

[{'image_id': '79f993ae0858ae238b22968c5934d1ddba585ae4.jpg', 'label_id': '66', 'image_url': 'https://n1-q.mafengwo.net/s1/M00/6B/72/wKgBm04Wc5WzFXU0AAHf09bdpiY84.jpeg?imageView2%2F2%2Fw%2F600%2Fq%2F90'}, {'image_id': 'e963208fe9e90df0c385f7367bcdb6d0d5d0b165.jpg', 'label_id': '61', 'image_url': 'http://news.sogou.com/'}, {'image_id': '02df5ecbf7c749ccc9d833f129bbd5d9837940ce.jpg', 'label_id': '64', 'image_url': 'http://img2.fawan.com/2016/12/30/e967f93e7713c57cd2b00b832dd6091a_500x-_90.jpg'}, {'image_id': '5620eb385b7567fb087813cf5233b5ceecdeeca3.jpg', 'label_id': '31', 'image_url': 'https://b1-q.mafengwo.net/s1/M00/F2/C9/wKgBm04Wx3a-gk2FAAKbPKX7E9w91.jpeg?imageView2%2F2%2Fw%2F600%2Fq%2F90'}, {'image_id': 'f8b4d42001a562fc63b9b39c02531661c0e236ca.jpg', 'label_id': '19', 'image_url': 'http://news.sogou.com/'}, {'image_id': '57e7eb438670a4519041dab1482f2594a92f8a09.jpg', 'label_id': '11', 'image_url': 'http://www.user2.jqw.com/2014/01/06/1347666/product/b201401072000291460.JPG'}, {'imag

<a id='step3'></a>
## Step 3: Initialize features(input) and labels(output) from images and json list.

In [3]:
import json
from scipy.misc import imread, imresize
import numpy as np
import os

class initialize(object):
    # Get image-label list for train and validation
    def __init__(self, feature_path, label_path):
        self.image_label_dict = {}
        with open(label_path, 'r') as f:
            label_list = json.load(f)
        for image in label_list:
            self.image_label_dict[image['image_id']] = int(image['label_id'])
        self.start = 0
        self.end = 0
        self.length = len(self.image_label_dict) # number of feature images
        self.image_name = list(self.image_label_dict.keys())
        self.feature_path = feature_path
    
    # Read image in feature path, resize and normalize to [-1, 1]
    def get_image(self, image_path, image_size):
        image = imread(image_path)
        image = imresize(image, [image_size, image_size])       
        image = np.array(image).astype(np.float32)
        image = 2 * (image - np.min(image)) / np.ptp(image) - 1
        return image
    
    # Get feature and label batch
    def get_batch(self, batch_size, image_size):
        self.start = self.end
        if self.start >= self.length:
            self.start = 0
        batch_feature = []
        batch_label = []
        index = self.start
        while len(batch_feature) < batch_size:
            if index >= self.length:
                index = 0
            i_image_path = os.path.join(self.feature_path, self.image_name[index])
            i_image = self.get_image(i_image_path, image_size)
            i_label = self.image_label_dict[self.image_name[index]]
            batch_feature.append(i_image)
            batch_label.append(i_label)
            index += 1
        self.end = index
        return batch_feature, batch_label

<a id='step4'></a>
## Step 4: Build convolutional network, return training accuracy and training loss.

In [10]:
import tensorflow as tf

is_training = tf.placeholder(tf.bool)

def conv(input_layer, filters, kernel_size):
    output_layer = tf.layers.conv2d(
        inputs=input_layer, 
        filters=filters, 
        kernel_size=kernel_size,
        strides=1, 
        padding='same', 
        activation=None,
        kernel_initializer=tf.truncated_normal_initializer()
    )
    return output_layer

def batch_norm(input_layer):
    output_layer = tf.layers.batch_normalization(
        inputs=input_layer,
        axis=-1,
        momentum=0.9,
        epsilon=0.001,
        center=True,
        scale=True,
        training=is_training,
        name='conv1_bn'
    )
    return output_layer

def maxpool(input_layer):
    return tf.layers.max_pooling2d(inputs=input_layer, pool_size=[2, 2], strides=2)

In [12]:
def conv_network(feature, label, num_class, image_size, keep_prob):
    # Input layer
    input_layer = tf.reshape(feature, [-1, image_size, image_size, 3])

    # Conv1_1: [3, 3]: [image_size x image_size x1]-->[image_size x image_size x64] 
    conv1_1 = conv(input_layer, 64, 3)
    
    # Conv1_2: [3, 3]: [image_size x image_size x 64]-->[image_size x image_size x 64]
    conv1_2 = conv(conv1_1, 64, 3)

    # Batch Norm 1
    conv1_bn = batch_norm(conv1_2)
    
    # ReLU 1
    conv1_relu = tf.nn.relu(conv1_bn)
    
    # Maxpool [2, 2]: [image_size x image_size x 64]-->[image_size/2 x image_size/2 x 64]
    maxpool1 = maxpool(conv1_relu)
 
    # Conv2_1 [3, 3]: [image_size/2 x image_size/2 x 64]-->[image_size/2 x image_size/2 x 128]
    conv2_1 = conv(maxpool1, 128, 3)
    
    # Conv2_2 [3, 3]: [image_size/2 x image_size/2 x 128]-->[image_size/2 x image_size/2 x 128]
    conv2_2 = conv(conv2_1, 128, 3)
    
    # Batch Norm 2
    conv2_bn = batch_norm(conv2_2)
    
    # ReLU 2
    conv2_relu = tf.nn.relu(conv2_bn)
    
    # Maxpool [2, 2]: [image_size/2 x image_size/2 x 128]-->[image_size/4 x image_size/4 x 128]
    maxpool2 = maxpool(conv2_relu)

    # Conv3_1 [3, 3]: [image_size/4 x image_size/4 x 128]-->[image_size/4 x image_size/4 x 256]
    conv3_1 = conv(maxpool2, 256, 3)
    
    conv3_2 = conv(conv3_1, 256, 3)
    
    conv3_3 = conv(conv3_2, 256, 3)
    
    # Conv3_4 [3, 3]: [image_size/4 x image_size/4 x 256]-->[image_size/4 x image_size/4 x 256]
    conv3_4 = conv(conv3_3, 256, 3)

    # Batch Norm 3
    conv3_bn = batch_norm(conv3_4)
    
    # ReLU 3
    conv3_relu = tf.nn.relu(conv3_bn)
    
    # Maxpool [2, 2]: [image_size/4 x image_size/4 x 256]-->[image_size/8 x image_size/8 x 256]
    maxpool3 = maxpool(conv3_relu)
    
    # Conv4_1 [3, 3]: [image_size/8 x image_size/8 x 256]-->[image_size/8 x image_size/8 x 512]
    conv4_1 = conv(maxpool3, 512, 3)
    
    conv4_2 = conv(conv4_1, 512, 3)
    
    conv4_3 = conv(conv4_2, 512, 3)
    
    # Conv4_4 [3, 3]: [image_size/8 x image_size/8 x 512]-->[image_size/8 x image_size/8 x 512]
    conv4_4 = conv(conv4_3, 512, 3)

    # Batch Norm 3
    conv4_bn = batch_norm(conv4_4)
    
    # ReLU 4
    conv4_relu = tf.nn.relu(conv4_bn)
    
    # Maxpool [2, 2]: [image_size/8 x image_size/8 x 512]-->[image_size/16 x image_size/16 x 512]
    maxpool4 = maxpool(conv4_relu)

    # Conv5_1 [3, 3]: [image_size/16 x image_size/16 x 512]-->[image_size/16 x image_size/16 x 512]
    conv5_1 = conv(maxpool4, 512, 3)
    
    conv5_2 = conv(conv5_1, 512, 3)
    
    conv5_3 = conv(conv5_2, 512, 3)
    
    # Conv5_4 [3, 3]: [image_size/16 x image_size/16 x 512]-->[image_size/16 x image_size/16 x 512]
    conv5_4 = conv(conv5_3, 512, 3)

    # Batch Norm 3
    conv5_bn = batch_norm(conv5_4)
    
    # ReLU 5
    conv5_relu = tf.nn.relu(conv5_bn)
    
    # Maxpool [2, 2]: [image_size/16 x image_size/16 x 512]-->[image_size/32 x image_size/32 x 512]
    maxpool5 = maxpool(conv5_relu)

    # Flatten layer
    flatten = tf.reshape(maxpool5, [-1, image_size * image_size / 2]) 

    # Fully connected layer
    dense = tf.layers.dense(inputs=flatten, units=1024)
    dropout = tf.nn.dropout(dense, keep_prob) # or tf.layers.dropout(inputs, rate)

    # Output layer: returns logits and predictions
    logits = tf.layers.dense(dropout, units=num_class) 
    output = tf.sigmoid(logits)

    # Loss and optimizer
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=label))
    train_opt = tf.train.AdamOptimizer(learning_rate).minimize(cost)
    
    return train_opt, cost, logits

<a id='step5'></a>
## Step 5: Train on steps = 20000

In [13]:
import tensorflow as tf

def train(train_feature_path, train_label_path, checkpoint_path, num_class, batch_size, image_size, max_step):
    train = initialize(train_feature_path, train_label_path)
        
    feature = tf.placeholder(tf.float32, shape=[None, image_size, image_size, 3], name='feature')
    label = tf.placeholder(tf.float32, shape=[None], name='label')
    keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    one_hot_label = tf.one_hot(indices=tf.cast(label, tf.int32), depth=80)
    train_opt, cost, logits = conv_network(feature, one_hot_label, num_class, image_size, keep_prob)
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_label, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, 'float32'))
    
    with tf.Session() as sess:
        saver = tf.train.Saver()
        ckpt = tf.train.get_checkpoint_state(checkpoint_path)
        if ckpt and ckpt.model_checkpoint_path:
            print('Restore the model from checkpoint {}.'.format(ckpt.model_checkpoint_path))
            start_step = int(ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1])
        else:
            sess.run(tf.global_variables_initializer())
            start_step = 0
            print('Start training from new start.')
        
        for steps in range(start_step, start_step + max_step):
            train_feature_batch, train_label_batch = train.get_batch(batch_size, image_size)
            sess.run(train_opt, feed_dict={feature: train_feature_batch, label: train_label_batch, keep_prob: 0.5})
                
            if steps % 10 == 0:
                train_accuracy = sess.run(accuracy, feed_dict={feature: train_feature_batch, label: train_label_batch, keep_prob: 0.5})
                train_loss = sess.run(cost, feed_dict={feature: train_feature_batch, label: train_label_batch, keep_prob: 0.5})
                print('Step {}'.format(steps),
                      'Training Accuracy {:.3f}...'.format(train_accuracy),
                      'Training Loss {:.3f}...'.format(train_loss),
                     ) 
            if steps % 1000 == 0:
                saver.save(sess, checkfile, global_step=steps)
                print('Writing checkpoing at step {}'.format(steps))

        print('Training completed')

# Train on 100 steps:
train_feature_path = r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_train_20170904\scene_train_images_20170904'
train_label_path = r'E:\ai_challenger\scene classification\dataset\ai_challenger_scene_train_20170904\scene_train_annotations_20170904.json'
checkpoint_path = './checkpoint/'
checkfile = './checkpoint/model.ckpt'

num_class = 80
batch_size = 128
image_size = 128
max_step = 20000
learning_rate =1e-3

train(train_feature_path, train_label_path, checkpoint_path, num_class, batch_size, image_size, max_step)

ValueError: Variable conv1_bn/beta already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

  File "D:\Work\Anaconda\envs\ai_challenger\lib\site-packages\tensorflow\python\framework\ops.py", line 1228, in __init__
    self._traceback = _extract_stack()
  File "D:\Work\Anaconda\envs\ai_challenger\lib\site-packages\tensorflow\python\framework\ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "D:\Work\Anaconda\envs\ai_challenger\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 768, in apply_op
    op_def=op_def)


<a id='step6'></a>
## Step 6: Train on full dataset.