
# Image Retraining

This mainly comes from a point of confusion I ran into while doing an exercise:

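When a pretrained InceptionV3 is retrained on a new image classification problem, overfitting happens **very easily**: accuracy on the training set quickly reaches 100%, but accuracy on the test/validation set stays below 50% and is hard to push any higher. This suggests the model really does learn features of the images, but its learning capacity is so strong that it overfits.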

Some references found by googling:

- https://stackoverflow.com/questions/37605611/would-adding-dropout-help-reduce-overfitting-when-following-tensorflows-transfe

- [Image Retraining](https://www.tensorflow.org/tutorials/image_retraining)

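Following Image Retraining, the approach here is actually quite simple: take the last layer, add a new trainable layer on top of it, and retrain on a different kind of images.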


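1. Prepare the images: image processing, dataset splitting, and a batch method
2. Prepare the keras model
    1. Set the last x layers to be retrained and keep the earlier layers frozen
    2. Add new trainable layers
    3. Add evaluation methods
3. Train and evaluate the model
4. Error analysis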


In [11]:
import os
import re
import hashlib
import sys
import urllib.request  # urlretrieve lives in the urllib.request submodule
import tarfile
import random
import keras

import scipy.io

import numpy as np
import tensorflow as tf
import keras.backend as K

from datetime import datetime
from tensorflow.python.util import compat
from tensorflow.python.platform import gfile


def reset_tf_session():
    K.clear_session()
    tf.reset_default_graph()
    return K.get_session()


## 0. Global parameters

In [3]:

flowers102_tar_path = "../readonly/week3/102flowers.tgz"
labels_mat_path = '../readonly/week3/imagelabels.mat'


# Model parameters
learning_rate = 0.01
training_steps = 4000
eval_step_interval = 100
train_batch_size = 100
validation_batch_size = 100
test_batch_size = -1


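## 1. Prepare the training data

This is the 102flowers dataset, which covers 102 kinds of flowers.

Source: <http://www.robots.ox.ac.uk/~vgg/data/flowers/102/index.html>

Prepare the training, validation, and test sets together with their labels, plus a batch method.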


In [12]:
# read filenames directly from tar
def get_all_filenames(tar_fn):
    with tarfile.open(tar_fn) as f:
        return [m.name for m in f.getmembers() if m.isfile()]

all_files = sorted(get_all_filenames(flowers102_tar_path))  # list all files in tar sorted by name
all_labels = scipy.io.loadmat(labels_mat_path)['labels'][0] - 1  # read class labels (0, 1, 2, ...)
# all_files and all_labels are aligned now
N_CLASSES = len(np.unique(all_labels))
print(N_CLASSES)

102
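The split into training, validation, and test sets is not shown in the cells above. One possible way to do it, sketched here under the assumption that a deterministic hash-based split is acceptable, is to hash each file name into a bucket. The names `which_set` and `build_image_lists`, the 10%/10% percentages, and the toy file names are illustrative assumptions; the resulting layout (label name mapped to `'training'`/`'testing'`/`'validation'` lists) is chosen only to match how `image_lists` is used later in this notebook.

```python
import hashlib


def which_set(file_name, validation_pct=10, testing_pct=10):
    """Deterministically assign a file to 'training', 'validation', or 'testing'."""
    digest = hashlib.sha1(file_name.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable value in [0, 100)
    if bucket < validation_pct:
        return "validation"
    if bucket < validation_pct + testing_pct:
        return "testing"
    return "training"


def build_image_lists(file_names, labels):
    """Group files by label name, then by split."""
    image_lists = {}
    for file_name, label in zip(file_names, labels):
        splits = image_lists.setdefault(
            str(label), {"training": [], "testing": [], "validation": []})
        splits[which_set(file_name)].append(file_name)
    return image_lists


# Toy data standing in for all_files / all_labels (hypothetical values).
demo_files = ["jpg/image_%05d.jpg" % i for i in range(1, 201)]
demo_labels = [i % 4 for i in range(200)]
demo_lists = build_image_lists(demo_files, demo_labels)
```

Because the assignment depends only on the file name, a given image always lands in the same split, even if more images are added later.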


## 2. Prepare the model

### 2.1 Use the inception_v3 model

For TensorFlow model files, see this article: <https://www.tensorflow.org/extend/tool_developers/>

In [6]:

# Download the inception_v3 model file (.pb file)

data_url = 'http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz'
model_dir = '/tmp/imagenet'
bottleneck_dir = '/tmp/bottleneck'
model_file_name = 'classify_image_graph_def.pb'
summaries_dir = '/tmp/retrain_logs2/'

# Create the '/tmp/imagenet' folder if it does not exist
if not os.path.exists(model_dir):
    os.makedirs(model_dir)
filename = data_url.split('/')[-1]
filepath = os.path.join(model_dir, filename)

# Progress bar
def _progress(count, block_size, total_size):
    sys.stdout.write('\r>> Downloading %s %.1f%%' %
                     (filename,
                      float(count * block_size) / float(total_size) * 100.0))
    sys.stdout.flush()
    
# Download the archive if it is not present
if not os.path.exists(filepath):
    filepath, _ = urllib.request.urlretrieve(data_url, filepath, _progress)
    print()
    statinfo = os.stat(filepath)
    tf.logging.info('Successfully downloaded %s %d bytes.', filename, statinfo.st_size)

# Extract the archive if the model file is not present
model_path = os.path.join(model_dir, model_file_name)
if not os.path.exists(model_path):
    print('Extracting file from ', filepath)
    tarfile.open(filepath, 'r:gz').extractall(model_dir)
    

In [7]:

# Create the graph from the .pb file:
print('Model path: ', model_path)

def create_model_graph():
    tf.reset_default_graph()
    with tf.Graph().as_default() as graph:
        print(graph)
        with gfile.FastGFile(model_path, 'rb') as f:
            graph_def = tf.GraphDef()
            graph_def.ParseFromString(f.read())
            bottleneck_tensor, resized_input_tensor = (tf.import_graph_def(
                graph_def,
                name='',
                return_elements=['pool_3/_reshape:0', 'Mul:0']
            ))
    return graph, bottleneck_tensor, resized_input_tensor

graph, bottleneck_tensor, resized_input_tensor =  create_model_graph()

Model path:  /tmp/imagenet/classify_image_graph_def.pb
<tensorflow.python.framework.ops.Graph object at 0x7f1a607b0c88>


In [8]:
graph

<tensorflow.python.framework.ops.Graph at 0x7f1a607b0c88>

In [9]:
print(resized_input_tensor)
print(bottleneck_tensor)

Tensor("Mul:0", shape=(1, 299, 299, 3), dtype=float32)
Tensor("pool_3/_reshape:0", shape=(1, 2048), dtype=float32)


### 3.2 Add the new trainable layer

In [10]:

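bottleneck_tensor_size = int(bottleneck_tensor.shape[-1])
# NOTE: `image_lists` is not defined in the cells shown here. From its later use it is
# assumed to be a dict: label name -> {'training': [...], 'testing': [...], 'validation': [...]}.
class_count = len(image_lists)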

In [11]:
def variable_summaries(var):
    """Attach a lot of summaries to a Tensor (for TensorBoard visualization)."""
    with tf.name_scope('summaries'):
        mean = tf.reduce_mean(var)
        tf.summary.scalar('mean', mean)
        with tf.name_scope('stddev'):
            stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
        tf.summary.scalar('stddev', stddev)
        tf.summary.scalar('max', tf.reduce_max(var))
        tf.summary.scalar('min', tf.reduce_min(var))
        tf.summary.histogram('histogram', var)

with graph.as_default():
    # BottleneckInputPlaceholder: defaults to the bottleneck output
    with tf.name_scope('input'):
        bottleneck_input = tf.placeholder_with_default(
            bottleneck_tensor,
            shape=[None, bottleneck_tensor_size],
            name='BottleneckInputPlaceholder')

        ground_truth_input = tf.placeholder(
            tf.int64, [None], name='GroundTruthInput')

    # Organizing the following ops so they are easier to see in TensorBoard.
    with tf.name_scope('final_retrain_ops'):
        with tf.name_scope('weights'):
            initial_value = tf.truncated_normal(
                [bottleneck_tensor_size, class_count], stddev=0.001)
            layer_weights = tf.Variable(initial_value, name='final_weights')
            variable_summaries(layer_weights)

        with tf.name_scope('biases'):
            layer_biases = tf.Variable(tf.zeros([class_count]), name='final_biases')
            variable_summaries(layer_biases)
            
        # add dropout
        with tf.name_scope('dropout'):
            keep_prob = tf.placeholder_with_default(tf.constant(1.0), [], name='keep_prob')
            drop = tf.nn.dropout(bottleneck_input, keep_prob)
            
        with tf.name_scope('Wx_plus_b'):
            logits = tf.matmul(drop, layer_weights) + layer_biases
            tf.summary.histogram('pre_activations', logits)

    # final output
    final_tensor = tf.nn.softmax(logits, name='final_result')
    tf.summary.histogram('activations', final_tensor)
    
# here we have : (train_step, cross_entropy_mean, bottleneck_input, ground_truth_input, final_tensor)


In [12]:
tf.constant(1.0).shape

TensorShape([])
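To make the dropout step concrete, here is a minimal pure-Python sketch of what `tf.nn.dropout(x, keep_prob)` does in TensorFlow 1.x: each element is kept with probability `keep_prob` and scaled by `1 / keep_prob`, otherwise set to zero. The helper name `dropout` and the toy inputs are illustrative only and are not part of the notebook.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each element with probability keep_prob and scale it by 1/keep_prob."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
features = [1.0] * 10000
dropped = dropout(features, keep_prob=0.5, rng=rng)

# Roughly half of the elements survive, and the scaling keeps the mean near 1.0.
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)

# With keep_prob = 1.0 nothing is dropped, so the input passes through unchanged.
identity = dropout([0.3, -1.2], keep_prob=1.0, rng=rng)
```

This is also why `keep_prob` is built with `placeholder_with_default(tf.constant(1.0), ...)`: any `sess.run` call that does not feed `keep_prob` (such as an evaluation step) runs with dropout effectively switched off.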

In [13]:
# Cross entropy and train step:
with graph.as_default():
    # cross entropy
    with tf.name_scope('cross_entropy'):
        cross_entropy_mean = tf.losses.sparse_softmax_cross_entropy(
            labels=ground_truth_input, logits=logits)
    tf.summary.scalar('cross_entropy', cross_entropy_mean)

    # train step, optimizer
    with tf.name_scope('train'):
        # `FLAGS` is not defined in this notebook; use the global `learning_rate` from section 0.
        optimizer = tf.train.GradientDescentOptimizer(learning_rate)
        train_step = optimizer.minimize(cross_entropy_mean)

In [14]:
# Evaluation layer
with graph.as_default():
    with tf.name_scope('accuracy'):
        with tf.name_scope('correct_prediction'):
            prediction = tf.argmax(final_tensor, 1)
            correct_prediction = tf.equal(prediction, ground_truth_input)
        with tf.name_scope('accuracy'):
            evaluation_step = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    tf.summary.scalar('accuracy', evaluation_step)


### 3.3 Prepend a `jpeg_decoding` layer to the original model to handle image input



In [15]:
# Add a JPEG decoding layer at the input

input_width = 299
input_height = 299
input_depth = 3
input_mean = 128   # 256/2
input_std = 128    # 256/2
# resize and normalize: (x - 128) / 128 maps pixel values 0..255 to roughly [-1, 1)

with graph.as_default():
    jpeg_data_tensor = tf.placeholder(tf.string, name='DecodeJPGInput')
    decoded_image = tf.image.decode_jpeg(jpeg_data_tensor, channels=input_depth)
    decoded_image_as_float = tf.cast(decoded_image, dtype=tf.float32)
    decoded_image_4d = tf.expand_dims(decoded_image_as_float, 0)
    resize_shape = tf.stack([input_height, input_width])
    resize_shape_as_int = tf.cast(resize_shape, dtype=tf.int32)
    resized_image = tf.image.resize_bilinear(decoded_image_4d,
                                             resize_shape_as_int)
    offset_image = tf.subtract(resized_image, input_mean)
    decoded_image_tensor = tf.multiply(offset_image, 1.0 / input_std)
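With `input_mean = 128` and `input_std = 128`, the subtract-then-scale step maps raw pixel values 0..255 to roughly [-1, 1). A tiny pure-Python check of that arithmetic (the helper name `normalize_pixel` is illustrative only and not part of the notebook):

```python
def normalize_pixel(value, mean=128.0, std=128.0):
    """Mirror of tf.subtract(x, mean) followed by tf.multiply(..., 1.0 / std)."""
    return (value - mean) * (1.0 / std)


lowest = normalize_pixel(0)     # darkest pixel
middle = normalize_pixel(128)   # mid-gray
highest = normalize_pixel(255)  # brightest pixel
```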
    


In [16]:
jpeg_data_tensor.graph

<tensorflow.python.framework.ops.Graph at 0x7f1a607b0c88>

### 3.4 Compute and cache the bottleneck outputs (for all images)


In [17]:
# Compute the bottleneck layer outputs and cache them, mainly to avoid recomputation and speed up training

def ensure_dir_exists(dir_name):
    if not os.path.exists(dir_name):
        os.makedirs(dir_name)
    
def run_bottleneck_on_image(sess, image_data):
    # jpeg_data_tensor -> decoded_image_tensor | resized_input_tensor -> bottleneck_tensor
    # image_data -> resized_input_values -> bottleneck_values
    resized_input_values = sess.run(decoded_image_tensor, {jpeg_data_tensor: image_data})
    # Then run it through the recognition network.
    bottleneck_values = sess.run(bottleneck_tensor, {resized_input_tensor: resized_input_values})
    bottleneck_values = np.squeeze(bottleneck_values)
    return bottleneck_values

def create_bottleneck_file(sess, image_path, bottleneck_path):
    if not gfile.Exists(image_path):
        tf.logging.fatal('File does not exist %s', image_path)
    # Read the raw JPEG bytes and close the file handle afterwards.
    with gfile.FastGFile(image_path, 'rb') as image_file:
        image_data = image_file.read()
    try:
        bottleneck_values = run_bottleneck_on_image(sess, image_data)
    except Exception as e:
        raise RuntimeError('Error during processing file %s (%s)' % (image_path, str(e)))
        
    bottleneck_string = ','.join(str(x) for x in bottleneck_values)
    with open(bottleneck_path, 'w') as bottleneck_file:
        bottleneck_file.write(bottleneck_string)

ensure_dir_exists(bottleneck_dir)
def get_or_create_bottleneck(sess, label_name, image_name):
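    # NOTE: `image_dir` is not defined in the cells shown here; it is assumed to be the
    # root folder that holds one sub-folder of images per label.
    image_path = os.path.join(image_dir, label_name, image_name)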
    bottleneck_path = os.path.join(bottleneck_dir, label_name, image_name + '.txt')
    
    if not os.path.exists(bottleneck_path):
        create_bottleneck_file(sess, image_path, bottleneck_path)
    with open(bottleneck_path, 'r') as bottleneck_file:
        bottleneck_string = bottleneck_file.read()
    bottleneck_values = [float(x) for x in bottleneck_string.split(',')]
    return bottleneck_values
    
    
def cache_bottlenecks(sess):
    how_many_bottlenecks = 0
    for label_name, label_lists in image_lists.items():
        sub_dir = os.path.join(bottleneck_dir, label_name)
        ensure_dir_exists(sub_dir)
        for category in ['training', 'testing', 'validation']:
            category_list = label_lists[category]
            for image_name in category_list:
                get_or_create_bottleneck(sess, label_name, image_name)
                how_many_bottlenecks += 1
#                 if how_many_bottlenecks % 100 == 0:
#                     tf.logging.info(str(how_many_bottlenecks) + ' bottleneck files created.')
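The caching idea above (compute once, store as comma-separated text, read back on later calls) can be tried in isolation without TensorFlow. The sketch below is a simplified stand-in: `get_or_create`, `fake_bottleneck`, and the file name are hypothetical, and a fixed list of floats replaces the real bottleneck computation.

```python
import os
import tempfile


def write_bottleneck(path, values):
    """Store the values as one comma-separated line of text."""
    with open(path, "w") as f:
        f.write(",".join(str(x) for x in values))


def read_bottleneck(path):
    """Parse the comma-separated text back into floats."""
    with open(path, "r") as f:
        return [float(x) for x in f.read().split(",")]


def get_or_create(path, compute_fn):
    """Compute and cache on the first call, read from disk afterwards."""
    if not os.path.exists(path):
        write_bottleneck(path, compute_fn())
    return read_bottleneck(path)


calls = []


def fake_bottleneck():
    """Stand-in for the expensive network forward pass."""
    calls.append(1)
    return [0.25, 1.5, 0.0, 3.125]


with tempfile.TemporaryDirectory() as cache_dir:
    cache_path = os.path.join(cache_dir, "image_00001.jpg.txt")
    first = get_or_create(cache_path, fake_bottleneck)
    second = get_or_create(cache_path, fake_bottleneck)  # served from the cache
```

Text files are easy to inspect but slow to parse; for larger datasets a binary format may be preferable, though that is a design choice outside what this notebook does.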
                    



## 4. Start training



In [18]:
# Define the batch method:

def get_random_cached_bottlenecks(sess, batch_size, category):
    """Retrieves bottleneck values for cached images.
    Args:
      category: training, testing, or validation.
    Returns:
      List of bottleneck arrays, their corresponding ground truths, and the
      relevant filenames.
    """
    class_count = len(image_lists.keys())
    bottlenecks = []
    ground_truths = []
    filenames = []
    if batch_size >= 0:
        for unused_i in range(batch_size):
            # random label
            label_index = random.randrange(class_count)
            label_name = list(image_lists.keys())[label_index]
            image_list = image_lists[label_name][category]
            # MAX_NUM_IMAGES_PER_CLASS is not defined in this notebook; draw the index directly.
            image_index = random.randrange(len(image_list))
            image_name = image_list[image_index]
            bottleneck = get_or_create_bottleneck(sess, label_name, image_name)
            bottlenecks.append(bottleneck)
            ground_truths.append(label_index)
            filenames.append(image_name)
    else:
        # Retrieve all bottlenecks.
        for label_index, label_name in enumerate(image_lists.keys()):
            for image_index, image_name in enumerate(image_lists[label_name][category]):
                bottleneck = get_or_create_bottleneck(sess, label_name, image_name)
                bottlenecks.append(bottleneck)
                ground_truths.append(label_index)
                filenames.append(image_name)
                
    return bottlenecks, ground_truths, filenames
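One property of this batch method is worth spelling out: it first picks a label uniformly at random and only then picks an image within that label, so every class is sampled equally often regardless of how many images it has. A minimal sketch of that sampling scheme, using a hypothetical two-class `toy_lists` and an illustrative `sample_batch` helper:

```python
import random
from collections import Counter


def sample_batch(image_lists, category, batch_size, rng):
    """Pick a label uniformly, then an image uniformly within that label."""
    labels = list(image_lists.keys())
    batch = []
    for _ in range(batch_size):
        label_index = rng.randrange(len(labels))
        image_list = image_lists[labels[label_index]][category]
        image_name = image_list[rng.randrange(len(image_list))]
        batch.append((image_name, label_index))
    return batch


# One class with a single image, one class with 99 images (toy data).
toy_lists = {
    "rare": {"training": ["r0.jpg"]},
    "common": {"training": ["c%d.jpg" % i for i in range(99)]},
}
rng = random.Random(0)
batch = sample_batch(toy_lists, "training", 10000, rng)
label_counts = Counter(label for _, label in batch)
```

In this toy setup both labels appear about equally often, which means the single "rare" image is repeated many times. With small or imbalanced classes, that repetition may be one contributor to overfitting.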


In [19]:
# sess = tf.InteractiveSession(graph=graph)

# summaries_dir = '/tmp/retrain_logs2/'

# with tf.Session(graph=graph) as sess:
    


In [20]:
summaries_dir = '/tmp/retrain_logs2/'

In [21]:
training_steps= 4000
dropout_keep_prob = 0.5

with tf.Session(graph=graph) as sess:
    # cache bottlenecks
    cache_bottlenecks(sess)

    # summary
    merged = tf.summary.merge_all()

    train_writer = tf.summary.FileWriter(summaries_dir + '/train', sess.graph)
    validation_writer = tf.summary.FileWriter(summaries_dir + '/validation')
    train_saver = tf.train.Saver()

    # 
    # Set up all our weights to their initial default values.
    init = tf.global_variables_initializer()
    sess.run(init)
    # Run the training loop for `training_steps` iterations.
    for i in range(training_steps):
        (train_bottlenecks, train_ground_truth, _) = get_random_cached_bottlenecks(
                sess, train_batch_size, 'training')
        # Feed the bottlenecks and ground truth into the graph, and run a training
        # step. Capture training summaries for TensorBoard with the `merged` op.
        train_summary, _ = sess.run(
            [merged, train_step],
            feed_dict={
                bottleneck_input: train_bottlenecks,
                ground_truth_input: train_ground_truth,
                keep_prob: dropout_keep_prob,
            })
        train_writer.add_summary(train_summary, i)

        # Every so often, print out how well the graph is training.
        is_last_step = (i + 1 == training_steps)
        if (i % eval_step_interval) == 0 or is_last_step:
            train_accuracy, cross_entropy_value = sess.run(
                [evaluation_step, cross_entropy_mean],
                feed_dict={
                    bottleneck_input: train_bottlenecks,
                    ground_truth_input: train_ground_truth
                })
            tf.logging.info('%s: Step %d: Cross entropy = %f, Train accuracy = %.1f%%' % (
                datetime.now(), i, cross_entropy_value, train_accuracy * 100))
            
            validation_bottlenecks, validation_ground_truth, _ = (
                get_random_cached_bottlenecks(sess, validation_batch_size, 'validation'))

            # Run a validation step and capture validation summaries for TensorBoard
            # with the `merged` op.
            validation_summary, validation_accuracy = sess.run(
                [merged, evaluation_step],
                feed_dict={
                    bottleneck_input: validation_bottlenecks,
                    ground_truth_input: validation_ground_truth
                })
            validation_writer.add_summary(validation_summary, i)
            tf.logging.info('%s: Step %d: Validation accuracy = %.1f%% (N=%d)' %
                            (datetime.now(), i, validation_accuracy * 100,
                             len(validation_bottlenecks)))
    # This would save the model and parameters held in the session:
    # train_saver.save(sess, CHECKPOINT_NAME)

    # We've completed all our training, so run a final test evaluation on
    # some new images we haven't used before.
    test_bottlenecks, test_ground_truth, test_filenames = (
        get_random_cached_bottlenecks(sess, test_batch_size, 'testing'))
        
    test_accuracy, predictions = sess.run(
        [evaluation_step, prediction],
        feed_dict={
            bottleneck_input: test_bottlenecks,
            ground_truth_input: test_ground_truth
        })
    tf.logging.info('Final test accuracy = %.1f%% (N=%d)' %
                    (test_accuracy * 100, len(test_bottlenecks)))

INFO:tensorflow:2018-03-20 23:21:07.670677: Step 0: Cross entropy = 1.599881, Train accuracy = 38.0%
INFO:tensorflow:2018-03-20 23:21:07.827266: Step 0: Validation accuracy = 31.0% (N=100)
INFO:tensorflow:2018-03-20 23:21:11.690599: Step 100: Cross entropy = 1.187948, Train accuracy = 73.0%
INFO:tensorflow:2018-03-20 23:21:11.730365: Step 100: Validation accuracy = 76.0% (N=100)
INFO:tensorflow:2018-03-20 23:21:15.606394: Step 200: Cross entropy = 1.030228, Train accuracy = 84.0%
INFO:tensorflow:2018-03-20 23:21:15.646094: Step 200: Validation accuracy = 72.0% (N=100)
INFO:tensorflow:2018-03-20 23:21:19.494418: Step 300: Cross entropy = 0.911234, Train accuracy = 82.0%
INFO:tensorflow:2018-03-20 23:21:19.533439: Step 300: Validation accuracy = 77.0% (N=100)
INFO:tensorflow:2018-03-20 23:21:23.396786: Step 400: Cross entropy = 0.785340, Train accuracy = 82.0%
INFO:tensorflow:2018-03-20 23:21:23.436329: Step 400: Validation accuracy = 82.0% (N=100)
INFO:tensorflow:2018-03-20 23:21:27.291



Without dropout: INFO:tensorflow:Final test accuracy = 91.6% (N=369)

With one dropout layer added (keep_prob=0.8): INFO:tensorflow:Final test accuracy = 91.9% (N=369)

A very slight improvement.

keep_prob=0.5: INFO:tensorflow:Final test accuracy = 92.7% (N=369)

A small further improvement.

Changing the learning rate to 0.001: Final test accuracy = 88.1% (N=369), which is lower.