[View in Colaboratory](https://colab.research.google.com/github/JozeeLin/google-tensorflow-exercise/blob/master/%E5%8D%B7%E7%A7%AF%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C.ipynb)

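## Introduction to Convolutional Neural Networks
Convolutional neural networks (CNNs) were originally designed for problems such as image recognition, but their use is no longer limited to images and video: they also apply to time-series signals such as audio and to text data.

In early image recognition research, the biggest challenge was how to organize features, because image data, unlike other kinds of data, does not lend itself to feature extraction based on human understanding.

In models such as stock prediction, we can extract financial factors from the raw data, such as past price fluctuations, price-to-earnings ratio, price-to-book ratio, and earnings growth; this is feature engineering.

For images, however, it is hard to hand-craft features that are both effective and rich.

Before deep learning, we had to rely on algorithms such as SIFT and HoG to extract discriminative features, and then combine them with machine learning algorithms such as SVM to perform image recognition.

SIFT is invariant, within limits, to distortions such as scaling, translation, rotation, viewpoint change, and brightness change, and it was one of the most important image feature extraction methods of its time.

Back then, reasonably reliable image recognition was only barely achievable, and only with the help of feature extractors such as SIFT.

A CNN can take the raw pixels of an image as input directly, without first extracting features with algorithms such as SIFT. This removes much of the repetitive and tedious preprocessing that traditional algorithms such as SVM require.

The defining characteristic of a CNN is the weight-sharing structure of convolution, which greatly reduces the number of parameters in the network, guarding against overfitting while also lowering the complexity of the network.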
 
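A typical convolutional neural network consists of several convolutional layers, each of which usually performs the following operations:
 1. The image is filtered by several different convolution kernels, with a bias added, to extract local features; each kernel maps the input to a new 2D image.
 2. The filter outputs are passed through a nonlinear activation function. ReLU is the most common choice today; sigmoid was used more often in the past.
 3. The activation outputs are pooled (downsampled, for example reducing each 2x2 patch to a single 1x1 value). Max pooling is the usual choice today; it keeps the most salient feature and improves the model's tolerance to distortion.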
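
The three operations of a convolutional layer can be sketched in plain Python on a tiny single-channel image. This is only an illustration under simplifying assumptions (one kernel, no padding, stride 1, hypothetical helper names), not the TensorFlow implementation used later in this notebook.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Slide `kernel` over `image` (no padding, stride 1) and add a bias."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            acc = bias
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out

def relu(fmap):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap):
    """Non-overlapping 2x2 max pooling (assumes even height and width)."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]), 2)]
            for i in range(0, len(fmap), 2)]

# A 5x5 image with a vertical edge between column 1 and column 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# A 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d_valid(image, kernel)   # step 1: filter + bias
activated = relu(feature_map)               # step 2: nonlinearity
pooled = max_pool(activated)                # step 3: downsampling
print(pooled)
```

The edge response survives pooling while the feature map shrinks from 4x4 to 2x2.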
 
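These steps make up the most common form of convolutional layer. An LRN (local response normalization) layer can be added as well, and Batch Normalization is another trick that is very popular at present.

A convolutional layer can contain several different kernels, each producing its own filtered image. Every pixel of one such image is computed by exactly the same kernel; this is the weight sharing of convolution kernels.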
 
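Pixels within a small region of an image are correlated with each other, so a neuron does not need to receive information from every pixel. It only needs a local patch of pixels as input, and the local information gathered by all the neurons can then be combined into global information.
This lets us replace the earlier fully connected pattern with local connections: previously every hidden node was connected to all pixels, whereas now each hidden node is connected only to a local patch of pixels.

Local connections reduce the number of connections from one trillion to 100 million (figures that correspond to, for example, a 1000x1000-pixel image with one million hidden nodes), but that is still too many and the parameter count needs to come down further. Each hidden node is now connected to a 10x10 patch of pixels, so each hidden node has 100 parameters. If the local connection is a convolution, that is, **every hidden node uses exactly the same parameters**, the number of parameters is no longer 100 million but 100. **The parameter count depends only on the size of the convolution kernel; this is what weight sharing means.**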
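
The arithmetic behind these figures can be checked in a few lines. The image size and the number of hidden nodes are assumptions chosen so that the totals match the numbers quoted in the text; only the 10x10 patch size comes from the text itself.

```python
# Assumed sizes: a 1000x1000-pixel image and one million hidden nodes.
image_pixels = 1000 * 1000
hidden_nodes = 1000 * 1000
patch_pixels = 10 * 10          # each hidden node sees a 10x10 patch

fully_connected = image_pixels * hidden_nodes   # every node sees every pixel
locally_connected = hidden_nodes * patch_pixels  # every node sees one patch
weight_shared = patch_pixels                     # one kernel reused everywhere

print(fully_connected, locally_connected, weight_shared)
```

With weight sharing the count no longer depends on the image size or on the number of hidden nodes, only on the kernel size.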
 
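The key ideas of a convolutional neural network are local connections, weight sharing, and downsampling in the pooling layers. Local connections and weight sharing reduce the number of parameters, which lowers training complexity and mitigates overfitting.

Weight sharing also gives the network tolerance to translation, while downsampling in the pooling layers further reduces the number of outputs and gives the model tolerance to slight deformation, improving its ability to generalize.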
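
The tolerance that pooling provides can be seen on a toy feature map. The sketch below uses a hypothetical pure-Python `max_pool` helper with non-overlapping 2x2 windows: moving a single activation by one pixel inside the same window leaves the pooled output unchanged, while moving it into a neighbouring window does not.

```python
def max_pool(fmap):
    """Non-overlapping 2x2 max pooling (assumes even height and width)."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]), 2)]
            for i in range(0, len(fmap), 2)]

def single_activation(row, col, size=4, value=9):
    """A size x size map that is zero except for one activated position."""
    fmap = [[0] * size for _ in range(size)]
    fmap[row][col] = value
    return fmap

original = max_pool(single_activation(0, 0))
small_shift = max_pool(single_activation(1, 1))   # still inside the first window
large_shift = max_pool(single_activation(0, 2))   # crosses into the next window

print(original, small_shift, large_shift)
```

The tolerance is therefore limited to shifts that are small relative to the pooling window.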
 
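LeNet5 had the following characteristics at the time:
1. Each convolutional layer has three parts: convolution, pooling, and a nonlinear activation function
2. Convolution is used to extract spatial features
3. Subsampling is done with average pooling layers
4. Tanh or sigmoid is used as the activation function
5. An MLP serves as the final classifier
6. Sparse connections between layers reduce the computational cost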
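
To make the alternation of convolution and subsampling concrete, the sketch below traces feature-map sizes through a LeNet-5 style stack. The layer sizes (32x32 input, 5x5 kernels, 2x2 pooling) follow the commonly cited description of LeNet-5 and are assumptions here; they do not appear in this notebook.

```python
def conv_out(size, kernel):
    """Output side length of a stride-1 convolution without padding."""
    return size - kernel + 1

def pool_out(size, window=2):
    """Output side length of non-overlapping pooling."""
    return size // window

side = 32                  # assumed 32x32 input image
c1 = conv_out(side, 5)     # first convolution
s2 = pool_out(c1)          # first average-pooling (subsampling) layer
c3 = conv_out(s2, 5)       # second convolution
s4 = pool_out(c3)          # second subsampling layer
c5 = conv_out(s4, 5)       # final convolution collapses each map to 1x1

print(c1, s2, c3, s4, c5)
```

After the last convolution the spatial dimensions are gone, which is where the MLP classifier takes over.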
  

In [15]:
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
sess = tf.InteractiveSession()

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [0]:
def weight_variable(shape):
  # Helper for creating weights; it is reused by every layer below.
  # Truncated normal noise (stddev 0.1) breaks the symmetry between units.
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  # With ReLU, a small positive bias (0.1) helps avoid dead neurons.
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)
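
What "truncated" means can be illustrated without TensorFlow. The sketch below is a hypothetical stand-in, not the library code: it draws from a normal distribution and redraws any value that falls more than two standard deviations from the mean, which is the behaviour documented for `tf.truncated_normal`.

```python
import random

def truncated_normal(n, mean=0.0, stddev=0.1, seed=0):
    """Draw n normal samples, redrawing any that lie beyond 2 stddev."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:
            samples.append(value)
    return samples

weights = truncated_normal(1000, stddev=0.1)
print(min(weights), max(weights))
```

Every initial weight stays within [-0.2, 0.2], so no unit starts with an extreme value.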

The convolutional and pooling layers are also reused below, so we define helper functions for creating them as well.

In [0]:
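def conv2d(x,W):
  '''
  tf.nn.conv2d is TensorFlow's 2D convolution. x is the input and W holds the kernel parameters,
  e.g. [5,5,1,32]: the first two values are the kernel size, the third is the number of input
  channels (1 for grayscale, 3 for RGB), and the last is the number of kernels.
  strides is the step of the sliding kernel; all ones means every position of the image is visited.
  padding selects the boundary handling; SAME pads the border so the output has the same size as the input.
  '''
  return tf.nn.conv2d(x,W,strides=[1,1,1,1], padding='SAME')

def max_pool_2x2(x):
  '''
  tf.nn.max_pool is TensorFlow's max pooling. A 2x2 window with stride 2 reduces each 2x2 block
  to a single value, keeping the largest value in the block, i.e. the most salient feature.
  '''
  return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1],padding='SAME')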

In [0]:
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
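# A CNN exploits spatial structure, so each flat 784-vector is reshaped back into its original 28x28 image. In [-1,28,28,1], -1 leaves the number of samples unspecified and 1 is the number of color channels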
x_image = tf.reshape(x, [-1,28,28,1])

In [0]:
# First convolutional layer. Weights and biases are created with the helper functions defined earlier
W_conv1 = weight_variable([5,5,1,32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1)+b_conv1)
h_pool1 = max_pool_2x2(h_conv1)  # pool the output of the convolution

In [0]:
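# Second convolutional layer; the only difference is that it has 64 kernels (each reading the 32 channels produced by the first layer)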
W_conv2 = weight_variable([5,5,32,64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2)+b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

In [0]:
W_fc1 = weight_variable([7*7*64,1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat,W_fc1)+b_fc1)
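
The `7*7*64` in the first fully connected layer follows from the layer shapes. The sketch below assumes the rule for `SAME` padding, where the output side length is the input side length divided by the stride, rounded up; the helper name is hypothetical.

```python
import math

def same_out(size, stride):
    """Output side length of a conv or pool layer with SAME padding."""
    return math.ceil(size / stride)

side = 28
side = same_out(side, 1)   # conv1, stride 1: size unchanged
side = same_out(side, 2)   # pool1, stride 2: halved
side = same_out(side, 1)   # conv2
side = same_out(side, 2)   # pool2
flat_size = side * side * 64          # 64 feature maps after conv2

# Parameter counts (weights + biases) for the layers defined so far.
conv1_params = 5 * 5 * 1 * 32 + 32
conv2_params = 5 * 5 * 32 * 64 + 64
fc1_params = flat_size * 1024 + 1024

print(side, flat_size, conv1_params, conv2_params, fc1_params)
```

Most of the parameters sit in the fully connected layer, not in the convolutions, which is the effect of weight sharing.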

In [0]:
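# A dropout layer reduces overfitting. During training a random subset of node outputs is dropped; at prediction time all of them are kept for the best predictive performance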
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
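
The effect of `keep_prob` can be sketched in plain Python. This is an illustrative stand-in for `tf.nn.dropout`, assuming its documented behaviour: each element is kept with probability `keep_prob` and scaled by `1/keep_prob`, otherwise set to zero, so the expected sum is unchanged.

```python
import random

def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and rescale the survivors."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)
activations = [1.0] * 10000

train_out = dropout(activations, 0.5, rng)   # training: half the nodes dropped
eval_out = dropout(activations, 1.0, rng)    # evaluation: everything kept

mean_after_dropout = sum(train_out) / len(train_out)
print(mean_after_dropout)
```

Because of the rescaling, feeding `keep_prob` of 1.0 at evaluation time needs no further correction.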

In [0]:
W_fc2 = weight_variable([1024,10])
b_fc2 = bias_variable([10])
# Finally, feed the dropout output into a softmax layer to obtain the class probabilities
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop,W_fc2)+b_fc2)

In [0]:
# The loss is cross entropy, as before, but the optimizer is Adam with a fairly small learning rate
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_*tf.log(y_conv),reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
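
The loss above combines a softmax with a cross entropy over one-hot labels. A minimal pure-Python sketch of the same computation for a single example follows; the small `eps` inside the logarithm is an assumption added here to avoid `log(0)`, and it is not part of the notebook's code.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(one_hot, probs, eps=1e-12):
    """-sum(y * log(p)) for one example; eps guards against log(0)."""
    return -sum(t * math.log(p + eps) for t, p in zip(one_hot, probs))

probs = softmax([2.0, 1.0, 0.1])
loss_if_class0 = cross_entropy([1, 0, 0], probs)   # label agrees with the top score
loss_if_class2 = cross_entropy([0, 0, 1], probs)   # label is the lowest-scored class

print(probs, loss_if_class0, loss_if_class2)
```

In TensorFlow 1.x, `tf.nn.softmax_cross_entropy_with_logits` fuses the two steps and is the more numerically robust option when `tf.log` on very small probabilities becomes a problem.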

In [0]:
# Next, define the op that measures accuracy
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
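
The accuracy op compares the index of the largest predicted probability with the index of the 1 in the one-hot label and averages the matches. The same logic on three made-up examples (the values are illustrative only):

```python
def argmax(row):
    """Index of the largest element in a list."""
    return max(range(len(row)), key=row.__getitem__)

predictions = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4]]
labels = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]

correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
accuracy = sum(correct) / len(correct)
print(correct, accuracy)
```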

In [26]:
# Start training. Initialize all variables first; keep_prob is 0.5 during training and 1.0 when measuring accuracy
tf.global_variables_initializer().run()
for i in range(20000):
  batch = mnist.train.next_batch(50)
  if i%100 == 0:
    train_accuracy = accuracy.eval(feed_dict={x:batch[0],y_:batch[1],keep_prob:1.0})
    print("step %d, training accuracy %g" % (i, train_accuracy))

  train_step.run(feed_dict={x:batch[0], y_:batch[1], keep_prob:0.5})

TypeError: ignored

In [27]:
print('test accuracy %g' % accuracy.eval(feed_dict={
    x:mnist.test.images,y_:mnist.test.labels,keep_prob:1.0
}))

test accuracy 0.9911


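## Implementing a More Advanced Convolutional Network
This section uses the CIFAR-10 dataset, which contains 60000 32x32 color images: 50000 for training and 10000 for testing. There are 10 classes with 6000 images each: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.

This convolutional network uses a few new techniques:
- L2 regularization on the weights
- Data augmentation such as flipping and random cropping, which creates more samples
- An LRN layer paired with each convolution and max pooling stage, which improves the model's ability to generalize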

In [51]:
!git clone https://github.com/tensorflow/models.git 'mymodels'

Cloning into 'mymodels'...
remote: Counting objects: 16005, done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 16005 (delta 8), reused 16 (delta 6), pack-reused 15958
Receiving objects: 100% (16005/16005), 423.97 MiB | 36.35 MiB/s, done.
Resolving deltas: 100% (9459/9459), done.
Checking out files: 100% (2161/2161), done.


In [52]:
!ls

datalab  MNIST_data  models  mymodels


In [64]:
!ls mymodels/tutorials/image/

alexnet  cifar10  cifar10_estimator  imagenet  __init__.py  mnist  __pycache__


In [0]:
import numpy as np
import time

In [66]:
from keras.datasets import cifar10

Using TensorFlow backend.


In [0]:
import os
os.chdir('mymodels/tutorials/image/cifar10')

In [0]:
import cifar10,cifar10_input

In [0]:
max_steps = 3000
batch_size = 128

In [71]:
!wget http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz /tmp/

--2018-05-03 05:24:34--  http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz
Resolving www.cs.toronto.edu (www.cs.toronto.edu)... 128.100.3.30
Connecting to www.cs.toronto.edu (www.cs.toronto.edu)|128.100.3.30|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 170052171 (162M) [application/x-gzip]
Saving to: ‘cifar-10-binary.tar.gz’


2018-05-03 05:25:19 (3.67 MB/s) - ‘cifar-10-binary.tar.gz’ saved [170052171/170052171]

/tmp/: Scheme missing.
FINISHED --2018-05-03 05:25:19--
Total wall clock time: 45s
Downloaded: 1 files, 162M in 44s (3.67 MB/s)


In [0]:
!tar -zxf cifar-10-binary.tar.gz

In [0]:
data_dir = 'cifar-10-batches-bin'

In [0]:
def variable_with_weight_loss(shape, stddev, w1):
  var = tf.Variable(tf.truncated_normal(shape,stddev=stddev))
  if w1 is not None:
    weight_loss = tf.multiply(tf.nn.l2_loss(var),w1,name='weight_loss')
    tf.add_to_collection('losses',weight_loss)
  return var
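
The weight loss added to the `losses` collection is the L2 loss of the variable scaled by the coefficient `w1`. The sketch below mirrors that computation in plain Python, assuming the documented definition of `tf.nn.l2_loss` as half the sum of squares; the weight values and the cross-entropy figure are made up for illustration.

```python
def l2_loss(values):
    """Half the sum of squares, matching the definition of tf.nn.l2_loss."""
    return sum(v * v for v in values) / 2.0

weights = [0.5, -1.0, 2.0]      # illustrative weights
coefficient = 0.004             # the value passed as w1 for the dense layers

weight_loss = l2_loss(weights) * coefficient

# The total loss later sums every entry in the collection,
# i.e. the data loss plus all weight losses.
losses = [weight_loss, 1.2]     # 1.2 stands in for a cross-entropy value
total_loss = sum(losses)
print(weight_loss, total_loss)
```

A coefficient of 0.0 contributes nothing to the total, so in effect only the two dense layers that pass 0.004 are regularized.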

In [80]:
# Use the cifar10 module to download the dataset and extract it to its default location
cifar10.maybe_download_and_extract()

UnrecognizedFlagError: ignored

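### Data Augmentation
Data augmentation here includes random horizontal flips, random cropping of a 24x24 patch (tf.random_crop), and random brightness and contrast changes (tf.image.random_brightness, tf.image.random_contrast), followed by standardizing the data with tf.image.per_image_whitening (renamed tf.image.per_image_standardization in later TensorFlow releases), which subtracts the mean and divides by the standard deviation so that each image has zero mean and unit variance. These operations yield more (noisy) samples: one original image becomes many, which effectively enlarges the dataset and helps accuracy considerably. Data augmentation does, however, consume a lot of CPU time.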
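
The augmentation steps can be sketched on a single-channel image held in nested lists. This is an illustration with hypothetical helper names, not the `cifar10_input` implementation: it covers the random crop, the random horizontal flip, and the per-image standardization, and it leaves out the brightness and contrast changes. The lower bound on the standard deviation is an assumption borrowed from how TensorFlow guards against uniform images.

```python
import math
import random

def random_crop(image, size, rng):
    """Cut a size x size patch at a random position."""
    top = rng.randint(0, len(image) - size)
    left = rng.randint(0, len(image[0]) - size)
    return [row[left:left + size] for row in image[top:top + size]]

def random_flip_left_right(image, rng):
    """Mirror the image horizontally with probability 0.5."""
    return [row[::-1] for row in image] if rng.random() < 0.5 else image

def standardize(image):
    """Subtract the mean and divide by the standard deviation."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    std = max(std, 1.0 / math.sqrt(n))   # avoid dividing by ~0 on flat images
    return [[(p - mean) / std for p in row] for row in image]

rng = random.Random(0)
# A 32x32 gradient stands in for one channel of a CIFAR-10 image.
image = [[float(r * 32 + c) for c in range(32)] for r in range(32)]

patch = standardize(random_flip_left_right(random_crop(image, 24, rng), rng))
print(len(patch), len(patch[0]))
```

Each call with a different random state produces a different 24x24 patch from the same source image, which is how one sample turns into many.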

In [81]:
# Use distorted_inputs to produce the training data (features and their labels); it returns ready-made tensors
images_train,labels_train=cifar10_input.distorted_inputs(data_dir=data_dir, batch_size=batch_size)

Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.


In [0]:
# Build the test data
images_test, labels_test = cifar10_input.inputs(eval_data=True,data_dir=data_dir,batch_size=batch_size)

In [0]:
# Placeholders for the input data: features and labels
image_holder = tf.placeholder(tf.float32, [batch_size,24,24,3])
label_holder = tf.placeholder(tf.int32,[batch_size])

In [0]:
weight1 = variable_with_weight_loss(shape=[5,5,3,64],stddev=5e-2,w1=0.0)
kernel1 = tf.nn.conv2d(image_holder,weight1,[1,1,1,1],padding='SAME')
bias1 = tf.Variable(tf.constant(0.0, shape=[64]))
conv1 = tf.nn.relu(tf.nn.bias_add(kernel1, bias1))
pool1 = tf.nn.max_pool(conv1, ksize=[1,3,3,1], strides=[1,2,2,1],padding='SAME')
norm1 = tf.nn.lrn(pool1, 4, bias=1.0, alpha=0.01/9.0,beta=0.75)
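
Local response normalization divides each channel value by a term built from the squares of its neighbouring channels at the same pixel. The sketch below follows the formula documented for `tf.nn.lrn` for the channels of one pixel; the helper and the input values are illustrative, and the parameters are passed explicitly to match the call above.

```python
def lrn(channels, depth_radius, bias, alpha, beta):
    """Normalize each channel by the squared sum of its depth neighbours."""
    out = []
    for d, value in enumerate(channels):
        lo = max(0, d - depth_radius)
        hi = min(len(channels), d + depth_radius + 1)
        sqr_sum = sum(c * c for c in channels[lo:hi])
        out.append(value / (bias + alpha * sqr_sum) ** beta)
    return out

pixel_channels = [1.0, 50.0, 2.0]   # illustrative activations at one pixel

normalized = lrn(pixel_channels, depth_radius=4, bias=1.0,
                 alpha=0.01 / 9.0, beta=0.75)
unchanged = lrn(pixel_channels, depth_radius=4, bias=1.0,
                alpha=0.0, beta=0.75)
print(normalized)
```

A large response in one channel shrinks its neighbours along with itself, so strong activations are emphasized relative to weak ones.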

In [0]:
# Second convolutional layer
weight2 = variable_with_weight_loss(shape=[5,5,64,64],stddev=5e-2,w1=0.0)
kernel2 = tf.nn.conv2d(norm1, weight2,[1,1,1,1],padding='SAME')
bias2 = tf.Variable(tf.constant(0.1, shape=[64]))
conv2 = tf.nn.relu(tf.nn.bias_add(kernel2, bias2))
norm2 = tf.nn.lrn(conv2,4,bias=1.0, alpha=0.001/9.0, beta=0.75)
pool2 = tf.nn.max_pool(norm2, ksize=[1,3,3,1], strides=[1,2,2,1],padding='SAME')

In [0]:
# Fully connected layer; first flatten the output of the second convolutional layer
reshape = tf.reshape(pool2, [batch_size,-1])
dim = reshape.get_shape()[1].value
weight3 = variable_with_weight_loss(shape=[dim, 384], stddev=0.04, w1=0.004)
bias3 = tf.Variable(tf.constant(0.1, shape=[384]))
local3 = tf.nn.relu(tf.matmul(reshape,weight3)+bias3)

In [0]:
# Fully connected layer with half as many hidden nodes
weight4 = variable_with_weight_loss(shape=[384,192], stddev=0.04, w1=0.004)
bias4 = tf.Variable(tf.constant(0.1, shape=[192]))
local4 = tf.nn.relu(tf.matmul(local3, weight4)+bias4)

In [0]:
# Final layer
weight5 = variable_with_weight_loss(shape=[192,10], stddev=1/192.0, w1=0.0)
bias5 = tf.Variable(tf.constant(0.0, shape=[10]))
logits = tf.add(tf.matmul(local4, weight5),bias5)

In [0]:
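def compute_loss(logits, labels):
  # Per-example softmax cross entropy on integer class labels
  labels = tf.cast(labels, tf.int64)
  cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
      logits = logits, labels=labels, name='cross_entropy_per_example'
  )
  cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
  tf.add_to_collection('losses', cross_entropy_mean)

  # Total loss = mean cross entropy + the L2 weight losses already in the 'losses' collection
  return tf.add_n(tf.get_collection('losses'), name='total_loss')

In [0]:
# The function is named compute_loss so that the tensor `loss` does not shadow it
loss = compute_loss(logits,label_holder)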

In [0]:
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

In [0]:
top_k_op = tf.nn.in_top_k(logits, label_holder, 1)
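
`tf.nn.in_top_k` reports, for each example, whether the true label is among the k highest-scoring classes. A pure-Python sketch of that check follows; the helper and the scores are illustrative, and counting only strictly higher scores is an assumption about how ties are treated.

```python
def in_top_k(logits_batch, labels, k):
    """True where fewer than k classes score strictly above the true class."""
    results = []
    for logits, label in zip(logits_batch, labels):
        higher = sum(1 for score in logits if score > logits[label])
        results.append(higher < k)
    return results

scores = [[0.1, 0.9, 0.0], [0.5, 0.2, 0.3]]
true_labels = [1, 2]

top1 = in_top_k(scores, true_labels, 1)
top2 = in_top_k(scores, true_labels, 2)
print(top1, top2)
```

With k set to 1, as in this notebook, the result is ordinary classification accuracy per example.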

In [0]:
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

In [98]:
tf.train.start_queue_runners()

[<Thread(QueueRunnerThread-input_producer-input_producer/input_producer_EnqueueMany, started daemon 140556638086912)>,
 <Thread(QueueRunnerThread-shuffle_batch/random_shuffle_queue-shuffle_batch/random_shuffle_queue_enqueue, started daemon 140556629694208)>,
 <Thread(QueueRunnerThread-shuffle_batch/random_shuffle_queue-shuffle_batch/random_shuffle_queue_enqueue, started daemon 140556621301504)>,
 <Thread(QueueRunnerThread-shuffle_batch/random_shuffle_queue-shuffle_batch/random_shuffle_queue_enqueue, started daemon 140556612908800)>,
 <Thread(QueueRunnerThread-shuffle_batch/random_shuffle_queue-shuffle_batch/random_shuffle_queue_enqueue, started daemon 140556604516096)>,
 <Thread(QueueRunnerThread-shuffle_batch/random_shuffle_queue-shuffle_batch/random_shuffle_queue_enqueue, started daemon 140556595861248)>,
 <Thread(QueueRunnerThread-shuffle_batch/random_shuffle_queue-shuffle_batch/random_shuffle_queue_enqueue, started daemon 140556587468544)>,
 <Thread(QueueRunnerThread-shuffle_batch/

In [99]:
for step in range(max_steps):
  start_time = time.time()
  image_batch, label_batch=sess.run([images_train,labels_train])
  _, loss_value = sess.run([train_op, loss],
                          feed_dict={image_holder:image_batch, label_holder:label_batch})
  duration = time.time()-start_time
  
  if step % 10 ==0:
    examples_per_sec = batch_size/duration
    sec_per_batch=float(duration)
    
    format_str=('step %d, loss=%.2f(%.1f examples/sec; %.3f sec/batch)')
    print(format_str % (step, loss_value, examples_per_sec, sec_per_batch))

step 0, loss=4.68(152.4 examples/sec; 0.840 sec/batch)
step 10, loss=3.76(176.4 examples/sec; 0.726 sec/batch)
step 20, loss=3.09(175.7 examples/sec; 0.729 sec/batch)
step 30, loss=2.59(172.1 examples/sec; 0.744 sec/batch)
step 40, loss=2.64(176.8 examples/sec; 0.724 sec/batch)
step 50, loss=2.40(177.1 examples/sec; 0.723 sec/batch)
step 60, loss=2.18(148.4 examples/sec; 0.862 sec/batch)
step 70, loss=1.99(176.5 examples/sec; 0.725 sec/batch)
step 80, loss=2.23(175.1 examples/sec; 0.731 sec/batch)
step 90, loss=2.05(178.3 examples/sec; 0.718 sec/batch)
step 100, loss=1.98(175.8 examples/sec; 0.728 sec/batch)
step 110, loss=2.02(180.2 examples/sec; 0.710 sec/batch)
step 120, loss=1.88(174.7 examples/sec; 0.733 sec/batch)
step 130, loss=1.96(178.4 examples/sec; 0.718 sec/batch)
step 140, loss=1.85(179.2 examples/sec; 0.714 sec/batch)
step 150, loss=1.77(177.8 examples/sec; 0.720 sec/batch)
step 160, loss=1.95(177.4 examples/sec; 0.722 sec/batch)
step 170, loss=1.73(178.9 examples/sec; 0.

step 580, loss=1.57(182.0 examples/sec; 0.703 sec/batch)
step 590, loss=1.41(179.5 examples/sec; 0.713 sec/batch)
step 600, loss=1.39(178.9 examples/sec; 0.716 sec/batch)
step 610, loss=1.52(179.5 examples/sec; 0.713 sec/batch)
step 620, loss=1.50(181.9 examples/sec; 0.704 sec/batch)
step 630, loss=1.28(181.5 examples/sec; 0.705 sec/batch)
step 640, loss=1.23(177.5 examples/sec; 0.721 sec/batch)
step 650, loss=1.58(180.2 examples/sec; 0.710 sec/batch)
step 660, loss=1.43(180.2 examples/sec; 0.710 sec/batch)
step 670, loss=1.35(180.2 examples/sec; 0.710 sec/batch)
step 680, loss=1.41(178.7 examples/sec; 0.716 sec/batch)
step 690, loss=1.39(180.0 examples/sec; 0.711 sec/batch)
step 700, loss=1.41(179.9 examples/sec; 0.711 sec/batch)
step 710, loss=1.31(178.1 examples/sec; 0.719 sec/batch)
step 720, loss=1.42(182.0 examples/sec; 0.703 sec/batch)
step 730, loss=1.34(182.0 examples/sec; 0.703 sec/batch)
step 740, loss=1.25(177.0 examples/sec; 0.723 sec/batch)
step 750, loss=1.28(177.7 examp

step 1160, loss=1.22(181.6 examples/sec; 0.705 sec/batch)
step 1170, loss=1.36(181.3 examples/sec; 0.706 sec/batch)
step 1180, loss=1.32(178.4 examples/sec; 0.717 sec/batch)
step 1190, loss=1.31(179.0 examples/sec; 0.715 sec/batch)
step 1200, loss=1.00(179.7 examples/sec; 0.712 sec/batch)
step 1210, loss=1.19(178.6 examples/sec; 0.717 sec/batch)
step 1220, loss=1.45(180.6 examples/sec; 0.709 sec/batch)
step 1230, loss=1.28(181.5 examples/sec; 0.705 sec/batch)
step 1240, loss=1.06(180.3 examples/sec; 0.710 sec/batch)
step 1250, loss=1.36(179.4 examples/sec; 0.714 sec/batch)
step 1260, loss=1.26(180.4 examples/sec; 0.710 sec/batch)
step 1270, loss=1.37(179.5 examples/sec; 0.713 sec/batch)
step 1280, loss=1.26(183.0 examples/sec; 0.699 sec/batch)
step 1290, loss=1.28(179.6 examples/sec; 0.713 sec/batch)
step 1300, loss=1.24(181.6 examples/sec; 0.705 sec/batch)
step 1310, loss=1.16(183.3 examples/sec; 0.698 sec/batch)
step 1320, loss=1.30(180.8 examples/sec; 0.708 sec/batch)
step 1330, los

step 1740, loss=1.28(182.2 examples/sec; 0.703 sec/batch)
step 1750, loss=1.18(179.9 examples/sec; 0.712 sec/batch)
step 1760, loss=1.08(180.1 examples/sec; 0.711 sec/batch)
step 1770, loss=1.24(179.9 examples/sec; 0.711 sec/batch)
step 1780, loss=1.15(179.8 examples/sec; 0.712 sec/batch)
step 1790, loss=1.17(180.7 examples/sec; 0.708 sec/batch)
step 1800, loss=1.12(181.8 examples/sec; 0.704 sec/batch)
step 1810, loss=1.01(183.3 examples/sec; 0.698 sec/batch)
step 1820, loss=1.14(181.7 examples/sec; 0.704 sec/batch)
step 1830, loss=1.06(183.4 examples/sec; 0.698 sec/batch)
step 1840, loss=1.06(179.0 examples/sec; 0.715 sec/batch)
step 1850, loss=1.12(181.7 examples/sec; 0.704 sec/batch)
step 1860, loss=1.17(182.7 examples/sec; 0.701 sec/batch)
step 1870, loss=1.05(183.6 examples/sec; 0.697 sec/batch)
step 1880, loss=1.24(181.3 examples/sec; 0.706 sec/batch)
step 1890, loss=1.21(179.1 examples/sec; 0.715 sec/batch)
step 1900, loss=1.24(181.0 examples/sec; 0.707 sec/batch)
step 1910, los

step 2320, loss=0.98(179.3 examples/sec; 0.714 sec/batch)
step 2330, loss=0.96(181.3 examples/sec; 0.706 sec/batch)
step 2340, loss=1.17(180.8 examples/sec; 0.708 sec/batch)
step 2350, loss=1.02(180.5 examples/sec; 0.709 sec/batch)
step 2360, loss=1.22(182.9 examples/sec; 0.700 sec/batch)
step 2370, loss=1.10(184.3 examples/sec; 0.695 sec/batch)
step 2380, loss=0.92(183.5 examples/sec; 0.698 sec/batch)
step 2390, loss=0.95(183.3 examples/sec; 0.698 sec/batch)
step 2400, loss=1.01(186.0 examples/sec; 0.688 sec/batch)
step 2410, loss=1.16(179.3 examples/sec; 0.714 sec/batch)
step 2420, loss=1.10(180.9 examples/sec; 0.708 sec/batch)
step 2430, loss=1.27(186.3 examples/sec; 0.687 sec/batch)
step 2440, loss=0.96(180.6 examples/sec; 0.709 sec/batch)
step 2450, loss=0.89(186.2 examples/sec; 0.688 sec/batch)
step 2460, loss=1.14(184.4 examples/sec; 0.694 sec/batch)
step 2470, loss=1.04(185.4 examples/sec; 0.691 sec/batch)
step 2480, loss=1.06(183.9 examples/sec; 0.696 sec/batch)
step 2490, los

step 2900, loss=1.10(181.6 examples/sec; 0.705 sec/batch)
step 2910, loss=1.04(183.5 examples/sec; 0.698 sec/batch)
step 2920, loss=1.08(186.6 examples/sec; 0.686 sec/batch)
step 2930, loss=0.97(185.2 examples/sec; 0.691 sec/batch)
step 2940, loss=1.00(183.6 examples/sec; 0.697 sec/batch)
step 2950, loss=1.01(183.0 examples/sec; 0.699 sec/batch)
step 2960, loss=1.23(188.1 examples/sec; 0.681 sec/batch)
step 2970, loss=1.09(183.7 examples/sec; 0.697 sec/batch)
step 2980, loss=0.98(182.6 examples/sec; 0.701 sec/batch)
step 2990, loss=1.26(181.8 examples/sec; 0.704 sec/batch)


In [100]:
num_examples = 10000
import math
num_iter = int(math.ceil(num_examples/batch_size))
true_count = 0
total_sample_count = num_iter*batch_size
step = 0
while step < num_iter:
  image_batch, label_batch = sess.run([images_test,labels_test])
  predictions = sess.run([top_k_op], feed_dict={image_holder:image_batch,label_holder:label_batch})
  true_count += np.sum(predictions)
  step+=1
  
  
precision = true_count/total_sample_count
print('precision @ 1 = %.3f' % precision)

precision @ 1 = 0.712
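
The denominator in the evaluation loop is the number of samples actually run, which is slightly larger than the test set because the last batch is rounded up to a full batch. The arithmetic, using the values from this notebook:

```python
import math

num_examples = 10000
batch_size = 128

num_iter = math.ceil(num_examples / batch_size)   # batches needed to cover the test set
total_sample_count = num_iter * batch_size        # samples actually evaluated
extra = total_sample_count - num_examples         # samples beyond the test set size

print(num_iter, total_sample_count, extra)
```

Assuming the input queue cycles through the test files, the extra samples are test images seen a second time, so the reported figure is a close approximation and not an exact count over 10000 images.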
