# TensorFlow Tutorial
- NamHyuk Ahn

### Reference
- TensorFlowKR-2017-talk by Jongwook Choi, Beomjun Shin
    + https://wookayin.github.io/TensorFlowKR-2017-talk-bestpractice/ko/
- Stanford CS224d Lecture 7
    + http://cs244d.stanford.edu/lectures/CS244d-Lecture7.pdf(X)
    + 현재 주소 : http://web.stanford.edu/class/cs224n/lectures/cs224n-2017-tensorflow.pdf

### What is TensorFlow?
- 구글에서 개발한 딥러닝 오픈소스 라이브러리 (Open-source deep learning library developed by Google)
- why do people use TensorFlow?
    + Easy numerical computation
    + <font color='red'>Automatic derivatives</font> computation
    + Deploy computation to multiple CPUs or GPUs
    + 강력한 데이터 시각화 툴킷 (Powerful data visualization toolkit)
- https://github.com/tensorflow/tensorflow

### What is Tensor?
- Terminology used in math and physics

- Simply, tensor is a expansion of vector and matrix (not correct actually...)

- Zero-order tensor: Scalar
- First-order tensor: Vector
- Second-order tensor: Matrix

- So, tensor can be represented as a N-Dimension array

### If you know Numpy* 

In [1]:
import numpy as np
a = np.zeros((2, 2)); b = np.ones((2, 2))
np.sum(b, axis=1)

array([ 2.,  2.])

In [2]:
a.shape

(2, 2)

* Python library for numerical computation

### Repeat in Tensorflow

In [50]:
import tensorflow as tf
sess = tf.Session()

In [51]:
a = tf.zeros((2, 2)); b = tf.ones((2, 2))

In [62]:
print(a) #모든 원소의 값이 0인 Tensor
print(b) #모든 원소의 값이 1인 Tensor

Tensor("zeros_18:0", shape=(2, 2), dtype=float32)
Tensor("ones_1:0", shape=(2, 2), dtype=float32)


In [65]:
sess.run(tf.reduce_sum(b, axis=1)) #모든 원소의 값이 1인 2x2텐서를 1x2로 합 : [2, 2]

array([ 2.,  2.], dtype=float32)

In [63]:
sess.run(tf.reduce_sum(a, axis=1))

array([ 0.,  0.], dtype=float32)

In [54]:
a.get_shape()

TensorShape([Dimension(2), Dimension(2)])

In [55]:
a.get_shape()

TensorShape([Dimension(2), Dimension(2)])

In [67]:
sess.run(tf.reshape(a, (1, 4)))

array([[ 0.,  0.,  0.,  0.]], dtype=float32)

In [68]:
sess.run(tf.reshape(b, (1, 4)))

array([[ 1.,  1.,  1.,  1.]], dtype=float32)

### sess.run()

In [69]:
sess = tf.Session()
a = np.zeros((2, 2)); ta = tf.zeros((2, 2))
print(a)

[[ 0.  0.]
 [ 0.  0.]]


In [70]:
print(ta)

Tensor("zeros_19:0", shape=(2, 2), dtype=float32)


ta <- 
Tensorflow first defines a <font color='red'>computation graph</font>
that has no numerical value until evaluated.
(very similar concept to Theano)

텐서플로우는 Computation Graph를 먼저 정의함. Theano와 매우 유사한 개념

In [74]:
print(sess.run(ta))

[[ 0.  0.]
 [ 0.  0.]]


In [76]:
print(sess.run(ta+1*2))

[[ 2.  2.]
 [ 2.  2.]]


### Computation Graph

- In TensorFlow program, it first assembles a <font color='blue'>graph</font>, and uses a session to <font color='blue'>execute</font> ops in the graph.

In [11]:
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b

In [78]:
print(c)

Tensor("mul:0", shape=(), dtype=float32)


- Define a <font color='red'>computation graph.</font> At this time, we cannot see real value.

위 시점에서, Computation Graph르 정의함, 이 시점에서 실제 값을 볼 수 없음

In [12]:
sess = tf.Session()
print(sess.run(c))

30.0


- Evaluate pre-defined computation graph. Graph evaluation must be followed by tf.<font color='blue'>Session</font> creation.

- 미리 정의된 그래프, sess.run에서 계산

### TensorFlow Session

- A <font color='blue'>Session</font> object encapsulates the environment in which operation and tensor objects are evaluated

- Both type of the session creation has same behavior

In [13]:
sess = tf.Session()
print(sess.run(c))

30.0


In [14]:
with tf.Session() as sess:
    print(sess.run(c))
    print(c.eval())

30.0
30.0


c.eval() <- Syntactic sugar for sess.run(c) in the current active session. (Must locate inside of the tf.<font color='blue'>Session</font> scope)

### TensorFlow Variables

- All tensors used previously were constant tensors
- To train a model, we need variable type tensor to hold and update values
  
  모델을 학습하기 위해, 값을 hold하고 업데이트하기 위한 variable 타입의 텐서가 필요함

In [79]:
#Attempting to use uninitialized value weight
w = tf.Variable(tf.zeros((2, 2)), name="weight")
with tf.Session() as sess:
    print(sess.run(w))

FailedPreconditionError: Attempting to use uninitialized value weight_2
	 [[Node: weight_2/_0 = _Send[T=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_3_weight_2", _device="/job:localhost/replica:0/task:0/gpu:0"](weight_2)]]
	 [[Node: weight_2/_1 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_3_weight_2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

### Variables need <font color='red'>initialization</font>!

In [85]:
w = tf.Variable(tf.random_normal([5, 2], stddev=0.1), name="weight") #5x2 shape의 random value를 같는 variable 생성
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer()) # 다른 op가 수행되기 전에 실행되어야 함.
    print(sess.run(w))

[[-0.04597397  0.01791486]
 [-0.16906111 -0.03670346]
 [-0.12624092  0.12996647]
 [-0.18826525 -0.04081801]
 [ 0.09149408 -0.01670777]]


tf.<font color='blue'>Variable</font> can be initialized from constants or random values

variable 타입은 상수나 랜덤값으로 초기화될 수 있음

tf.<font color='blue'>Variable</font> needs initialization step.

variable 타입은 반드시 초기화 단계를 거쳐야 함

### Updating Variables

In [94]:
state = tf.Variable(0, name="counter") # 상수 0으로 variable 생성 : 변수
new_value = tf.add(state, tf.constant(1)) # 1을 더하는 op : 함수?
update = tf.assign(state, new_value) # 업데이트를 적용하는 op : 함수?

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(state))
    for _ in range(3): # 0~3 반복
        sess.run(update)
        print(sess.run(state))

0
1
2
3


### Fetching Variables

In [100]:
x1 = tf.constant(1)
x2 = tf.constant(2)
x3 = tf.constant(3)

temp = tf.add(x2, x3)
mul = tf.multiply(x1, temp)

with tf.Session() as sess:
    result1, result2 = sess.run([mul, temp])
    print(result1, result2)

5 5


Calling sess.run(var) outputs its value. 

Of course, we can output multiple tensors

Simultaneously with sess.run([var1, ... ,])

<img src="https://github.com/you359/aiclass/blob/master/1632036005_YOUNGJINKIM/files/tensorflow-tutorial-22-638.jpg?raw=true"/>

### TemsorFlow Placeholder

- All previous examples have manually defined tensors
    + How can we input external data into TensorFlow?
    + Most simple way is using tf.placeholder and feed_dict

In [102]:
a = tf.placeholder(tf.int16)
b = tf.placeholder(tf.int16)

Create a <font color='red'>placeholder</font> to hold data.

tf.placeholder is dummy node to provide entry points for data to computation graph.

placeholder는 계산 시, 데이터를 위한 entry point를 제공하기 위한 더미 노드임

In [103]:
add = tf.add(a, b)
mul = tf.multiply(a, b)

#runtime에서 데이터가 들어감?
with tf.Session() as sess:
    print(sess.run(add, feed_dict={a: 2, b: 3}))
    print(sess.run(mul, feed_dict={a: 2, b: 3}))

5
6


A feed_dict is a python dictionary from tf.placeholder to data value

In [104]:
# using tf.constant
matrix1 = tf.constant([[3., 3.]])
matrix2 = tf.constant([[2.], [2.]])
product = tf.matmul(matrix1, matrix2)

with tf.Session() as sess:
    result = sess.run(product)
    print(result)
    
# using placeholder
import numpy as np

matrix1 = tf.placeholder(tf.float32, [1, 2])
matrix2 = tf.placeholder(tf.float32, [2, 1])
product = tf.matmul(matrix1, matrix2)

with tf.Session() as sess:
    mv1 = np.array([[3., 3.]])
    mv2 = np.array([[2.], [2.]])
    result = sess.run(product, feed_dict={matrix1: mv1, matrix2: mv2})
    print(result)

[[ 12.]]
[[ 12.]]


### Example = MNIST with MLP

In [105]:
import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data", one_hot=True)

learning_rate = 0.001
max_steps = 15000
batch_size = 128

x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])

Extracting MNIST_data\train-images-idx3-ubyte.gz
Extracting MNIST_data\train-labels-idx1-ubyte.gz
Extracting MNIST_data\t10k-images-idx3-ubyte.gz
Extracting MNIST_data\t10k-labels-idx1-ubyte.gz


In [106]:
def MLP(inputs): #3Layer MLP(784- 256 - 10)
    w_1 = tf.Variable(tf.random_normal([784,256]))
    b_1 = tf.Variable(tf.zeros([256]))
    
    w_2 = tf.Variable(tf.random_normal([256,256]))
    b_2 = tf.Variable(tf.zeros([256]))
    
    w_out = tf.Variable(tf.random_normal([256, 10]))
    b_out = tf.Variable(tf.zeros([10]))
    
    h_1 = tf.add(tf.matmul(inputs, w_1), b_1)
    h_1 = tf.nn.relu(h_1)
    
    h_2 = tf.add(tf.matmul(h_1, w_2), b_2)
    h_2 = tf.nn.relu(h_2)
    
    out = tf.add(tf.matmul(h_2, w_out), b_out)
    
    return out

net = MLP(x)

# define loss and optimizer
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=net, labels=y))
opt = tf.train.AdamOptimizer(learning_rate).minimize(loss_op)

In [107]:
# initializing the variables
init_op = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init_op)

# train model
for step in range(max_steps):
    batch_X, batch_y = mnist.train.next_batch(batch_size)
    _, loss = sess.run([opt, loss_op], feed_dict={x: batch_X, y: batch_y})
    
    if (step+1) % 1000 == 0:
        print("[{}/{}] loss:{:.3f}".format(step+1, max_steps, loss))
print("Optimization Finished!")

#test model
correct_prediction = tf.equal(tf.argmax(net, 1), tf.argmax(y, 1))

#calculate accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print("Train accuracy: {:.3f}".format(sess.run(accuracy, feed_dict={x: mnist.train.images, y: mnist.train.labels})))
print("Test accuracy: {:.3f}".format(sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})))

[1000/15000] loss:16.770
[2000/15000] loss:14.138
[3000/15000] loss:22.632
[4000/15000] loss:3.010
[5000/15000] loss:1.108
[6000/15000] loss:1.487
[7000/15000] loss:0.400
[8000/15000] loss:0.000
[9000/15000] loss:0.969
[10000/15000] loss:0.539
[11000/15000] loss:0.000
[12000/15000] loss:0.546
[13000/15000] loss:0.000
[14000/15000] loss:0.000
[15000/15000] loss:0.715
Optimization Finished!
Train accuracy: 0.997
Test accuracy: 0.951


### Training a model
- <font color='red'>tf.nn.softmax_cross_entropy_with_logits</font> is for calculate cross entropy loss between predictions and labels (instead, you can make own loss function)
- <font color='red'>tf.train.####Optimizer</font> create an optimizer such as SGD, RMSProp, Adam etc..*
- <font color='red'>tf.train.####Optimizer.minimize(loss_op)</font> adds optimization operation to computation graph
- we don't have to be concern about computing gradients and updating variables
    * https://www.tensorflow/org/api_docs/python/train/optimizers(X)
    + https://www.tensorflow.org/api_docs/python/tf/train/Optimizer

### Variable Scope
- Complicated TensorFlow models can have hundreds of variables - 모델은 수백개 이상의 variable를 갖음, 따라서 분리가 필요
    - tf.variable_scope() provides simples name-spacing to avoid clashes
    - tf.get_variable() creates/accesses variables from within a variable scope

### tf.variable_scope()
- Variable scope is a simple type of name-spacing that adds prefixes to variable names within scope

In [25]:
var1 = tf.Variable([1], name="var")
with tf.variable_scope("foo"):
    with tf.variable_scope("bar"):
        var2 = tf.Variable([1], name="var")
        var3 = tf.Variable([1], name="var")
        
print("var1 {}".format(var1.name))
print("var2 {}".format(var2.name))
print("var3 {}".format(var3.name))

var1 var:0
var2 foo/bar/var:0
var3 foo/bar/var_1:0


### tf.get_variable()
- tf.Variable doesn't provide variable reuse => variable 재사용을 제공하지 않음
    + Variable reuse is necessary to implement RNN or Recursive NN => RNN이나 Recursive NN을 구현하기 위해서는 재사용이 필요

In [108]:
var1 = tf.Variable([1], name="var")
with tf.variable_scope("foo"):
    with tf.variable_scope("bar") as scp:
        var2 = tf.Variable([1], name="var")
        scp.reuse_variables() # allow resue variables
        var3 = tf.Variable([1], name="var")
        
print("var1 {}".format(var1.name))
print("var2 {}".format(var2.name))
print("var3 {}".format(var3.name))

var1 var_3:0
var2 foo_4/bar/var:0
var3 foo_4/bar/var_1:0


We want var3 refer to foo/bar/var:0 But impossible with tf.Variable
- This works

In [113]:
import tensorflow as tf
tf.reset_default_graph() #그래프 리셋

var1 = tf.get_variable("var", [1])
with tf.variable_scope("foo"):
    with tf.variable_scope("bar") as scp:
        var2 = tf.get_variable("var", [1])
        scp.reuse_variables() # allow resue variables
        var3 = tf.get_variable("var", [1])
        
print("var1 {}".format(var1.name))
print("var2 {}".format(var2.name))
print("var3 {}".format(var3.name))

var1 var:0
var2 foo/bar/var:0
var3 foo/bar/var:0


- Behavior depends on whether variable reuse enabled
- <font color='blue'>Case 1</font>: reuse flag set to False
    + Create new variable and return
    + Raise <font color='green'>ValueError</font> if there exists variable with given name
- <font color='blue'>Case 2</font>: reuse flag set to True
    + Search for existing variable with given name
    + Raise <font color='green'>ValueError</font> if not found

### Parameter sharing

In [118]:
import tensorflow as tf
tf.reset_default_graph() #그래프 리셋

with tf.variable_scope("foo"):
    with tf.variable_scope("bar") as scp:
        var1 = tf.get_variable("var", [1])
        scp.reuse_variables() # allow resue variables
        var2 = tf.get_variable("var", [1])
        
    with tf.variable_scope("bar", reuse=True):
        var3 = tf.get_variable("var", [1])
        
print("var1 {}".format(var1.name))
print("var2 {}".format(var2.name))
print("var3 {}".format(var3.name))

var1 foo/bar/var:0
var2 foo/bar/var:0
var3 foo/bar/var:0


### Tips
- <font color='red'>ALWAYS</font> use tf.get_variable instead of tf.Variable
- 항상 tf.get_variable을 사용하자
    + (tf.Variable doesn't provide parameter sharing)
    + tf.Variable 은 parameter sharing을 제공하지 않음
    + (Can't use initializer and regularizer in tf.Variable)
    + tf.variable은 initializer나 regularizer를 사용할 수 없음
- <font color='red'>GOOD scoping</font> is essential
    + When visualize and recognize network
    + 네트워크(모델)의 시각화나 이해를 위해
    + Easy to get variables and ops by name or suffix
    tf.###.get_variables_by_name("my_var", "my_scope")
    + this is useful when debug, load pre-trained model
    + 디버깅이나 pre-trained 모델을 로드할 때 유용함

### Example - MNIST with CNN
- Most of the code is same as previous MLP example except code for building network
- CNN model needs conv, pool and FC layer
    + tf.nn.conv2d(input, filter, strides, padding, name=None)
    + tf.nn.##_pool(value, ksize, strides, padding, name=None)
    + https://www.tensorflow.org/api_docs/python/

### Wrappers

In [119]:
from tensorflow.contrib.layers import variance_scaling_initializer
he_init = variance_scaling_initializer()

Another init strategy called He init introduced in PReLuNet

In [120]:
def conv(bottom, num_filter, ksize=3, stride=1, padding="SAME", scope=None):
    bottom_shape = bottom.get_shape().as_list()[3]
    # To calculate shape of previous(input) layer
    
    with tf.variable_scope(scope or "conv"):
        W = tf.get_variable("W", [ksize, ksize, bottom_shape, num_filter], initializer=he_init)
        b = tf.get_variable("b", [num_filter], initializer=tf.constant_initializer(0))
        
        x = tf.nn.conv2d(botton, W, strides=[1, stride, stride, 1], padding=padding)
        x = tf.nn.relu(tf.nn.bias_add(x, b))
        
    return x

In [121]:
def maxpool(bottom, ksize=2, stride=2, padding="SAME", scope=None):
    with tf.variable_scope(scope or "maxpool"):
        pool = tf.nn.max_pool(bottom, ksize=[1, ksize, ksize, 1], strides=[1, stride, stride, 1], padding=padding)
    return pool

In [122]:
def fc(bottom, num_dims, scope=None):
    bottom_shape = bottom.get_shape().as_list()
    if len(bottom_shape) > 2:
        bottom = tf.reshape(bottom, [-1, reduce(lambda x, y: x*y, bottom_shape[1:])])
        bottom_shape = bottom.get_shape().as_list()
    
    with tf.variable_scope(scope or "fc"):
        W = tf.get_variable("W", [bottom_shape[1], num_dims], initializer=he_init)
        b = tf.get_variable("b", [num_dims], initializer=tf.constant_initializer(0))
        out = tf.nn.bias_add(tf.matmul(bottom, W), b)
    return out

if dim of bottom layer is more than 2, flatten
   - output of conv has (N, H, W, C) shape
   - output of FC has (N, C) shape

In [123]:
def fc_relu(bottom, num_dims, scope=None):
    with tf.variable_scope(scope or "fc"):
        out = fc(bottom, num_dims, scope="fc")
        relu = tf.nn.relu(out)
        
    return relu

### All Together

In [124]:
keep_prob = tf.placeholder(tf.float32, None)

def conv_net(x, keep_prob):
    x = tf.reshape(x, shape=[-1, 28, 28, 1])
    
    conv1 = conv(x,32, 5, scope="conv_1")
    conv1 = maxpool(conv1, scope="maxpool_1")
    conv2 = conv(conv1, 64, 5, scope="conv_2")
    conv2 = maxpool(conv2, scope="maxpool_2")
    
    fc1 = fc_relu(conv2, 1024, scope="fc_1")
    fc1 = tf.nn.dropout(fc1, keep_prob)
    
    out = fc(fc1, 10, scope="out")
    return out

To hold keeping probability in dropout

Feed value between(0, 1.0) in training, feed 1.0 in test time

In [125]:
config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
sess = tf.Session(config=config)

### Session config
- by default, session consume all the GPU resource, no matter how a model capacity is
    + With Killing all the other process!!! :(

- Try this config

In [46]:
config = tf.ConfigProto(
    gpu_options=tf.GPUOptions(allow_growth=True))
sess = tf.Session(config=config)

- There are a bunch of configuration, see

  https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/config.proto

### tf.contrib
- tf.contrib contains contributed code
    + Contain contributions that eventually should get merged into core Tensorflow
    + Code in tf.contrib isn't supported by the Tensorflow team
      it is included in the hope that it is helful, but it might change or be removed at any time
    + Many useful interfaces or utils exist
    + RNN, Seq2Seq, layers(wrapper) and slim are quite useful

### tf.contrib.slim
- Lightweight library in TensorFlow
    + Components of slim can be freely mixed with native TensorFlow
    + Gather useful tf.contrib component into tf.contrib.slim namespace

- Why tf.slim?
    + Makes building, training and evaluation neural networks simple
    + Allows to define models compactly by eliminating boilerplate code
    + Serveral widely used models have been developed in slim
    + Other useful utilities are provided (initializer, regularizers, losses ...)

In [126]:
import tensorflow.contrib.slim as slim

<img src="https://github.com/you359/aiclass/blob/master/1632036005_YOUNGJINKIM/files/tensorflow-tutorial-55-638.jpg?raw=true"/>

<img src="https://github.com/you359/aiclass/blob/master/1632036005_YOUNGJINKIM/files/tensorflow-tutorial-56-638.jpg?raw=true"/>

### Layers

In [133]:
import tensorflow as tf
input = ...
with tf.name_scope('conv1_1') as scope:
    kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32, stddev=1e-1), name="weights")
    conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32), trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv1 = tf.nn.relu(bias, name=scope)

TypeError: Expected binary or unicode string, got Ellipsis

In TF-Slim code:

In [130]:
input = tf.placeholder(tf.float32)#...
net = slim.conv2d(input, 128,[3,3], padding = 'SAME', scope='conv1_1')

ValueError: ('Convolution not supported for input with rank', None)

In [134]:
# 1. simple network generation with slim
net = ...
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool3')

# 1. cleaner by repeat operation:
net = ...
net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3],
scope='conv3')
net = slim.max_pool(net, [2, 2], scope='pool3')

# 2. Verbose way:
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')

# 2. Equivalent, TF-Slim way using slim.stack:
slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')

TypeError: Expected binary or unicode string, got Ellipsis

### Initializer, Regularizer

- Initializer:
    + tf.truncated_normal_initializer (Included in tf)
    + slim.xavier_initializer
    + slim.variance_scaling_initializer

- Regularizer:
    + slim.l1_regularizer
    + slim.l2_regularizer
    + …

In [135]:
net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME', weights_initializer=slim.xavier_initializer(),weights_regularizer=slim.l2_regularizer(0.0005),scope='conv1')

NameError: name 'inputs' is not defined

### argscope


- Eliminate boilerplate codes

In [136]:
he_init = slim.variance_scaling_initializer()
xavier_init = slim.xavier_initializer()

with slim.arg_scope([slim.conv2d, slim.fully_connected], activation_fn=tf.nn.relu, weights_initializer=he_init, weights_regularizer=slim.l2_regularizer(0.0005)):
    with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
        net = slim.conv2d(inputs, 64, [11, 11], 4, scope='conv1')
        net = slim.conv2d(net, 256, [5, 5],weights_initializer=xavier_init,scope='conv2')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')

NameError: name 'inputs' is not defined

- Arguments are injected to [..]
- Arguments are overwritten

### Losses

- Easy to handle multiple loss and regularize loss

In [137]:
# Define the loss functions and get the total loss.
loss1 = slim.losses.softmax_cross_entropy(pred1, label1)
loss2 = slim.losses.mean_squared_error(pred2, label2)
# The following two lines have the same effect:
total_loss = loss1 + loss2
slim.losses.get_total_loss(add_regularization_losses=False)
# If you want to add regularization loss
reg_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss = loss1 + loss2 + reg_loss
# or
total_loss = slim.losses.get_total_loss()

NameError: name 'pred1' is not defined

### Collection

- Included in tf (not slim), anyway it's quite useful
    - GLOBAL_VARIABLES
    - TRAINABLE_VARIABLES
    - SUMMARIES
    - MOVING_AVERAGE_VARIABLES
    - REGULARIZATION_LOSSES
    - WEIGHTS
    - BIASES
    - ACTIVATIONS

In [138]:
@add_arg_scope
def add_loss(loss, loss_collection=ops.GraphKeys.LOSSES):
    if loss_collection:
        ops.add_to_collection(loss_collection, loss)


NameError: name 'add_arg_scope' is not defined

Add custom loss to LOSSES collection

In [139]:
def get_losses(scope=None, loss_collection=ops.GraphKeys.LOSSES):
    return ops.get_collection(loss_collection, scope)

def get_regularization_losses(scope=None):
    return ops.get_collection(ops.GraphKeys.REGULARIZATION_LOSSES, scope)

NameError: name 'ops' is not defined

Easy to get all losses with graph Collection

In [140]:
for var in tf.trainable_variables():
    print(var.name)
    
def trainable_variables():
    return ops.get_collection(ops.GraphKeys.TRAINABLE_VARIABLES)

foo/bar/var:0
conv1_1/weights:0
conv1_1_1/weights:0
conv1_1_2/weights:0
conv1_1_2/biases:0
conv1_1_4/weights:0
conv1_1_4/biases:0
conv1_1_5/weights:0
conv1_1_6/weights:0


Useful to visualize or debugging

### Save/Restore

- pages 71 ~ 74
- My save wrapper function in my custom network class

In [141]:
def save(self, ckpt_dir, global_step=None):
    if self.config.get("saver") is None:
        self.config["saver"] = \
            tf.train.Saver(max_to_keep=30)            

Create saver to save a model

In [142]:
    saver = self.config["saver"]
    sess = self.config["sess"]
    
    dirname = os.path.join(ckpt_dir, self.name)

    if not os.path.exists(dirname):
        os.makedirs(dirname)
    saver.save(sess, dirname, global_step)

NameError: name 'self' is not defined

Save a model in dirname directory.
global_step is for recording "current step" in filename

- My restore wrapper function

In [143]:
def load_latest_checkpoint(self, ckpt_dir, exclude=None):
    path = tf.train.latest_checkpoint(ckpt_dir)
    if path is None:
        raise AssertionError("No ckpt exists in {0}.".format(ckpt_dir))
    print("Load {} save file".format(path))
    self._load(path, exclude)
def load_from_path(self, ckpt_path, exclude=None):
    self._load(ckpt_path, exclude)
def _load(self, ckpt_path, exclude):
    init_fn = slim.assign_from_checkpoint_fn(ckpt_path,
        slim.get_variables_to_restore(exclude=exclude),
        ignore_missing_vars=True)
    init_fn(self.config["sess"])

Exclude exclude variables when load a model
If False, raise error when missing variables exist in checkpoint(save) file.
When fine-tune model, have to set True to avoid error

### VGG-16

In [144]:
def vgg_16(inputs,
           num_classes=1000,
           is_training=True,
           dropout_keep_prob=0.5,
           spatial_squeeze=True,
           scope='vgg_16'):
    with tf.variable_scope(scope, 'vgg_16', [inputs]) as sc:
        end_points_collection = sc.name + '_end_points'
        # collect outputs for conv2d, fully_connected, slim.max_pool2d],
        with slim.arg_scope([slim.conv2d, slim.fully_connected, slim.max_pool2d],
                           outputs_collections=end_points_collection):
            net = slim.report(input, 2, slim.conv2d, 64, [3,3],scope='conv1')
            net = slim.max_pool2d(net, [2,2], scope='pool1')
            net = slim.repeat(net, 2, slim.conv2d, 128, [3.3], scope='conv2')
            net = slim.max_pool2d(net, [2,2], scope='pool2')
            net = slim.repeat(net, 3, slim.conv2d, 256, [3,3], scope='conv3')
            net = slim.max_pool2d(net, [2,2], scope='pool3')
            net = slim.repeat(net, 3, slim.conv2d, 512, [3,3], scope='conv4')            
            net = slim.max_pool2d(net, [2,2], scope='pool4')
            net = slim.repeat(net, 3, slim.conv2d, 512, [3,3], scope='conv5')            
            net = slim.max_pool2d(net, [2,2], scope='pool5')
            # Use conv2d instead of fully_connected layers.
            net = slim.conv2d(net, 4096, [7,7], padding='VALID', scope='fc6')
            net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                              scope='dropout6')
            net = slim.conv2d(net, 4096, [1,1], scope='fc7')
            net = slim.dropout(net, dropout_keep_prob, is_training=is_training,
                              scope='dropout7')
            net = slim.conv2d(net, num_classes, [1,1],
                             activation_fn=None,
                             normalizer_fn=None,
                             scope='fc8') 
            # Convert end_points_collection into a end_point dict.
            end_points = slim.utils.convert_collection_to_dict(end_points_collection)
            if spatial_squeeze:
                net = tf.squeeze(net, [1,2], name='fc8/squeezed')
                end_points[sc.name + '/fc8'] = net
            return net, end_points
vgg_16.default_image_size = 224

def vgg_arg_scope(weight_decay=0.0005):
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                       activation_fn=tf.nn.relu,
                       weights_regularizer=slim.l2_regularizer(weight_decay),
                       biases_initializer=tf.zeros_initializer):
        with slim.arg_scope([slim.conv2d],padding='SAME') as arg_sc:
            return arg_sc

https://github.com/tensorflow/models/blob/master/slim/nets/vgg.py

In [152]:
X = tf.placeholder(tf.float32, [None, 224, 224, 3], name="X")
y = tf.placeholder(tf.int32, [None, 8], name="y")
is_training = tf.placeholder(tf.bool, name="is_training")
with slim.arg_scope(vgg.vgg_arg_scope()):
    net, end_pts = vgg.vgg_16(X, is_training=is_training,
                              num_classes=1000)
with tf.variable_scope("losses"):
    cls_loss = slim.losses.softmax_cross_entropy(net, y)
    reg_loss = tf.add_n(slim.losses.get_regularization_losses())
    loss_op = type_loss + reg_loss
with tf.variable_scope("opt"):
    opt = tf.train.AdamOptimizer(0.001).minimize(loss_op)
self.load_from_path(ckpt_path=VGG_PATH, exclude=["vgg_16/fc8"])
...

NameError: name 'vgg' is not defined

### TensorBoard
- Very useful visualization tool
    + Graph visualization, loss/accuracy plot, weight histogram plot ...

- Simple usage
1. Make a tf.summary.FileWriter to write summaries
2. Create summaries using tf.summary.###
3. Run summaries with sess.run(...)
4. Add summaries to FileWriter with add_summary API
5. Run command
   tensorboard -- logdir=## --host=## --port=##
6. Default port is 6006

Exclude fc8 layer to train last layer from scratch
To freeze other layers, trainable=False argument is needed in argscope

In [153]:
tf.summary.FileWriter

tensorflow.python.summary.writer.writer.FileWriter

In [154]:
tf.summary.tenboard

AttributeError: module 'tensorflow.python.summary.summary' has no attribute 'tenboard'

### TensorBoard & slim example

In [165]:
import tensorflow as tf
slim = tf.contrib.slim

# Import MINST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data", one_hot=True)

max_steps = 10000
batch_size = 128
lr = 0.001
keep_prob = 0.5
weight_decay = 0.0004
logs_path = "MNIST_data/tensorflow_logs/example"

def my_arg_scope(is_training, weight_decay):
    with slim.arg_scope([slim.conv2d],
        activation_fn=tf.nn.relu,
        weights_regularizer=slim.l2_regularizer(weight_decay),
        weights_initializer=slim.variance_scaling_initializer(),
        biases_initializer=tf.zeros_initializer,
        stride=1, padding="SAME"):
        with slim.arg_scope([slim.dropout],
                            is_training=is_training) as arg_sc:
            return  arg_sc

Extracting MNIST_data\train-images-idx3-ubyte.gz
Extracting MNIST_data\train-labels-idx1-ubyte.gz
Extracting MNIST_data\t10k-images-idx3-ubyte.gz
Extracting MNIST_data\t10k-labels-idx1-ubyte.gz


In [166]:
def my_net(x, keep_prob, outputs_collections="my_net"):
    x = tf.reshape(x, shape=[-1, 28, 28, 1])
    
    with slim.arg_scope([slim.conv2d, slim.max_pool2d], outputs_collections=outputs_collections):
        net = slim.conv2d(x, 64, [3, 3], scope="conv1")
        net = slim.max_pool2d(net, [2, 2], scope="pool1")
        net = slim.conv2d(net, 128, [3, 3], scope="conv2")
        net = slim.max_pool2d(net, [2, 2], scope="pool2")
        net = slim.conv2d(net, 256, [3, 3], scope="conv3")
        # global average pooling
        net = tf.reduce_mean(net, [1, 2], name="pool3", keep_dims=True)
        net = slim.dropout(net, keep_prob, scope="dropout3")
        net = slim.conv2d(net, 1024, [1, 1], scope="fc4")
        net = slim.dropout(net, keep_prob, scope="dropout4")
        net = slim.conv2d(net, 10, [1, 1],
                          activation_fn=None, scope="fc5")
    end_points = slim.utils.convert_collection_to_dict(outputs_collections)
    return tf.reshape(net, [-1, 10]), end_points

Gather variables into end_points

In [174]:
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
is_training = tf.placeholder(tf.bool)

with slim.arg_scope(my_arg_scope(is_training, weight_decay)):
    net, end_pts = my_net(x, keep_prob)
    pred = slim.softmax(net, scope="prediction")

with tf.variable_scope("losses"):
    cls_loss = slim.losses.softmax_cross_entropy(net, y)
    reg_loss = tf.add_n(slim.losses.get_regularization_losses())
    loss_op = cls_loss + reg_loss
    
with tf.variable_scope("Adam"):
    opt = tf.train.AdamOptimizer(lr)
    # Op to calculate every variable gradient
    grads = tf.gradients(loss_op, tf.trainable_variables())
    grads = list(zip(grads, tf.trainable_variables()))
    # Op to update all variables according to their gradient
    apply_grads = opt.apply_gradients(grads_and_vars=grads)
    
with tf.variable_scope("accuracy"):
    correct_op = tf.equal(tf.argmax(net, 1), tf.argmax(y, 1))
    acc_op = tf.reduce_mean(tf.cast(correct_op, tf.float32))

TypeError: __init__() got an unexpected keyword argument 'partition_info'

To gather grad to visualize

In [163]:
# Create a summary to monitor loss and accuracy
summ_loss = tf.summary.scalar("loss", loss_op)
summ_acc = tf.summary.scalar("accuracy_test", acc_op)

# Create summaries to visualize weights and grads
for var in tf.trainable_variables():
    tf.summary.histogram(var.name, var, collections=["my_summ"])
for grad, var in grads:
    tf.summary.histogram(var.name + "/gradient", grad,
collections=["my_summ"])

summ_wg = tf.summary.merge_all(key="my_summ")

sess = tf.Session()
sess.run(tf.global_variables_initializer())
summary_writer = tf.summary.FileWriter(logs_path,
                                       graph=sess.graph)

NameError: name 'acc_op' is not defined

In [164]:
for step in range(max_steps):
    batch_X, batch_y = mnist.train.next_batch(batch_size)
    _, loss, plot_loss, plot_wg = sess.run([apply_grads, loss_op,
                                           summ_loss, summ_wg],
                                           feed_dict={x: batch_X, y: batch_y, is_training: True})

    summary_writer.add_summary(plot_loss, step)
    summary_writer.add_summary(plot_wg, step)

    if (step+1) % 100 == 0:
        plot_acc = sess.run(summ_acc, feed_dict={x: mnist.test.images,
                                                 y: mnist.test.labels,
                                                 is_training: False})
        summary_writer.add_summary(plot_acc, step)
print("Optimization Finished!")

test_acc = sess.run(acc_op, feed_dict={x: mnist.test.images,
                                       y: mnist.test.labels,
                                       is_training: False})

print("Test accuracy: {:.3f}".format(test_acc))

NameError: name 'apply_grads' is not defined

### Graph visualization

<img src="https://raw.githubusercontent.com/you359/aiclass/master/1632036005_YOUNGJINKIM/files/graph%20visualization.JPG"/>