https://www.tensorflow.org/ 

![TensorFlow](images/tensorflow.png)

# Learning TensorFlow





## Install 

Install with conda on Linux:
```bash
conda create -n tensorflow python=3.5
source activate tensorflow
conda install pandas matplotlib jupyter notebook scipy scikit-learn
pip install tensorflow
```

## Activate/Deactivate python env


```bash
source activate tensorflow
source deactivate
```

## Write the first program 

In [129]:
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
# Download the mnist dataset. 
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


In [130]:
import tensorflow as tf
import numpy as np
helloworld = tf.constant("Hello World !") # Objects created by TF are called tensors
output = tf.Session().run(helloworld)     # Run the constant in the TF session
print(output)

b'Hello World !'


---
# Tensor

A tensor is a mathematical object analogous to but more general than a vector, represented by an array of components that are functions of the coordinates of a space. **The central unit of data in TensorFlow is the tensor**. Everything in TF is stored as a tensor object.

- **Rank 0 Tensor**

A scalar is a rank 0 tensor; in TF it is represented as follows:

```
A = tf.constant(123)
``` 

*This is a constant tensor that does not change*

- **Rank 1 Tensor**

A vector is a rank 1 tensor; in TF it is represented as follows:

```
B = tf.constant([23, 49, 42])
```

- **Rank 2 Tensor**

A matrix is a rank 2 tensor; in TF it is represented as follows:

```
C = tf.constant([[23,21,89], [34,982,83]])
```

- **Rank 3 Tensor**

A 3D array can be represented as a rank 3 tensor; in TF it is represented as follows:

```
D = tf.constant([[[1., 2., 3.]], [[7., 8., 9.]]])
```
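
The rank is simply the number of nested dimensions. As a rough illustration, the plain-Python sketch below computes the rank and shape of the same values written as nested lists. The helper names `shape_of` and `rank_of` are hypothetical (they are not TensorFlow functions), and the sketch assumes regular, non-empty lists.

```python
def shape_of(value):
    """Return the shape of a regular, non-empty nested list as a tuple."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))  # size of the current dimension
        value = value[0]          # descend into the first element
    return tuple(shape)


def rank_of(value):
    """The rank is the number of dimensions, i.e. the length of the shape."""
    return len(shape_of(value))


# The same values as the tf.constant examples, as plain Python data.
A = 123
B = [23, 49, 42]
C = [[23, 21, 89], [34, 982, 83]]
D = [[[1., 2., 3.]], [[7., 8., 9.]]]

for name, value in [("A", A), ("B", B), ("C", C), ("D", D)]:
    print(name, "rank:", rank_of(value), "shape:", shape_of(value))
```

The rank 3 example has shape `(2, 1, 3)`: two blocks, each holding one row of three numbers.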


---
# Session

The TF API is built around the idea of building a computational graph of nodes and then running it. A session encapsulates the control and state of the TensorFlow runtime.


#### Work with constant tensors

In [2]:
node1 = tf.constant(3.0, dtype=tf.float32)
node2 = tf.constant(4.0) 
print(node1)

Tensor("Const_1:0", shape=(), dtype=float32)


Add *node1* and *node2*. This produces another tensor.

In [3]:
node3 = tf.add(node1, node2)

![](images/tf.add.run.jpeg)

In [4]:
sess = tf.Session()
sess.run(node3)

7.0

#### Work with non-constant tensors 

In [5]:
x = tf.placeholder(tf.float32)  # x is a placeholder that is fed a float32 value at run time
y = tf.placeholder(tf.float32)  # y is a placeholder that is fed a float32 value at run time
sum_xy = x + y # this is same as tf.add(x, y)

In [6]:
sess.run(sum_xy, feed_dict={x:10, y:2})

12.0

In [7]:
sess.run(sum_xy, feed_dict={x:2, y:2})

4.0
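
To make the "build first, run later" idea concrete, here is a minimal plain-Python sketch of a deferred graph. It is not how TensorFlow is implemented; the classes `Const`, `Placeholder`, `Add` and the function `run` are hypothetical stand-ins for `tf.constant`, `tf.placeholder`, `tf.add` and `sess.run`.

```python
class Const:
    """A node that always evaluates to the same value."""
    def __init__(self, value):
        self.value = value

    def evaluate(self, feed):
        return self.value


class Placeholder:
    """A node with no value of its own; it must be fed at run time."""
    def __init__(self, name):
        self.name = name

    def evaluate(self, feed):
        if self not in feed:
            raise KeyError("no value fed for placeholder " + self.name)
        return feed[self]


class Add:
    """A node that adds the results of two other nodes."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) + self.right.evaluate(feed)


def run(node, feed_dict=None):
    """Evaluate a node, loosely mimicking sess.run(node, feed_dict=...)."""
    return node.evaluate(feed_dict or {})


# Building the graph computes nothing yet.
x = Placeholder("x")
y = Placeholder("y")
sum_xy = Add(x, y)
node3 = Add(Const(3.0), Const(4.0))

# Values are only produced when the graph is run.
print(run(node3))                      # 7.0
print(run(sum_xy, {x: 10.0, y: 2.0}))  # 12.0
print(run(sum_xy, {x: 2.0, y: 2.0}))   # 4.0
```

The same graph object `sum_xy` is reused with different fed values, which is what the two `sess.run` calls above do.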

#### TF Variables 

TF variables can be modified like ordinary variables.
Variables are not initialized when they are created; you have to call *global_variables_initializer()* and run the resulting op within the session.

In [8]:
x = tf.Variable(5) # x is initialized with 5 and can be modified
init = tf.global_variables_initializer()
sess.run(init)
sess.run(x)

5

# TF Math 

In [9]:
a = tf.add(5, 2)  # 7

In [10]:
b = tf.subtract(10, 4) # 6

In [11]:
c = tf.multiply(2, 5)  # 10

In [12]:
print(sess.run([a, b, c]))

[7, 6, 10]


# Type Casting in TF

Cast a float to an integer:

In [13]:
x = tf.constant(2.0) # Floating point number 
print(sess.run(x))

2.0


In [14]:
y = tf.cast(x, tf.int32) # Integer number
print(sess.run(y))

2
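
Casting a float to an integer discards the fractional part, so it matters whether the conversion truncates or rounds. The plain-Python sketch below shows truncation toward zero with `int()`. That `tf.cast(x, tf.int32)` behaves the same way is an assumption here and worth checking on your TensorFlow version.

```python
values = [2.0, 2.9, -2.9]

# int() truncates toward zero: it drops the fractional part, it does not round.
truncated = [int(v) for v in values]

for v, t in zip(values, truncated):
    print(v, "->", t)
```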


# Define Linear Function 

$y=xW + b$

![](images/linearfunction.jpg)

## Define the input variable $x$


$x_i$ is an input. For example, it can be an image flattened into a vector. Let's assume each vector has length 6.

![](images/x_matrix.jpg)

In [15]:
number_of_data_vectors = 120 # this could be the number of images I have in my data set
length_of_data_vector = 6 # length of the observed data vector
number_of_categories = 5 # this is the number of categories I have to categorize

_Note: The $x_i$ vectors are randomly generated here just for demonstration. In a real project these are the observed data (e.g. a (20px) x (20px) image gives a vector of 400 pixel values)._

In [16]:
x = tf.Variable(tf.truncated_normal((number_of_data_vectors, length_of_data_vector)))
sess.run(tf.global_variables_initializer()) # Initialize the variable x in the session (sess)
sess.run(x)[:10] # Print the first 10 

array([[-0.45693526, -0.52290267,  1.3766675 , -0.27219331,  1.14847934,
         0.87284625],
       [-1.28326416, -0.63142151,  1.43161869,  1.36415756, -0.79241031,
         1.11476231],
       [ 0.6918447 , -0.76002383, -0.12883557, -0.61266303,  0.21287961,
         1.67242241],
       [-0.14879832, -1.96100414,  0.17042372,  0.97559816,  0.72694212,
         0.36855826],
       [ 0.10577615,  1.41718757,  0.01636093, -0.94101745, -1.8351202 ,
         0.23742022],
       [ 0.56623954,  0.37197945, -0.10749652, -0.11388438,  1.10594249,
        -0.05199287],
       [-1.00691795, -0.4253588 ,  0.30210772,  0.33538565,  0.04371909,
         1.69481599],
       [ 0.08324673, -0.52695876,  0.19547546,  0.58698237,  0.57427913,
        -0.66436243],
       [-0.62323833, -1.92079937,  0.91602671, -0.6252895 ,  0.78574425,
         1.99637783],
       [-1.6445117 , -1.76367831, -1.78615165, -0.20332737,  1.77431822,
         0.52247059]], dtype=float32)

## Define weights $W$

Initializing the weights with random values is important. Randomizing the weights helps keep the model from getting stuck in the same local minimum every time we train it. Drawing the weights from a normal distribution is good practice.


![](images/w_matrix.jpg)

In [17]:
W = tf.Variable(tf.truncated_normal((length_of_data_vector, number_of_categories)))
sess.run(tf.global_variables_initializer())
sess.run(W)

array([[ 1.15524042, -0.07092341,  1.57022703, -1.08708596, -0.00983607],
       [-1.58749902, -0.26416558, -0.8540172 ,  0.38636714,  0.32315025],
       [ 0.37003389, -0.66748691,  0.02387263, -0.80765718, -1.68267286],
       [ 1.24784064,  0.44291234, -0.42946509, -0.41863248,  0.36794561],
       [-0.83994818,  1.56017733, -0.48999667, -0.5300253 , -1.06569219],
       [ 1.08756256,  0.84270841, -1.73785377,  0.88978809, -1.71383536]], dtype=float32)
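
`tf.truncated_normal` draws from a normal distribution but re-draws any value that falls more than two standard deviations from the mean, which keeps the initial weights from being extreme. The sketch below imitates that rule in plain Python. The function name and its `seed` parameter are hypothetical, and this is only an illustration of the idea, not TensorFlow's implementation.

```python
import random


def truncated_normal(n, mean=0.0, stddev=1.0, seed=None):
    """Draw n normal samples, re-drawing any that lie beyond 2 stddev."""
    rng = random.Random(seed)
    values = []
    while len(values) < n:
        v = rng.gauss(mean, stddev)
        if abs(v - mean) <= 2.0 * stddev:  # keep only values within 2 stddev
            values.append(v)
    return values


sample = truncated_normal(1000, seed=0)
print(min(sample), max(sample))  # both lie within [-2, 2]
```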

## Define biases $b$

Biases can also be initialized with random values. Since the weights are already initialized with random values, we can simply initialize the biases to zero.

![](images/b.jpg)

In [18]:
b = tf.Variable(tf.zeros(number_of_categories))

In [19]:
sess.run(tf.global_variables_initializer()) 
sess.run(b)

array([ 0.,  0.,  0.,  0.,  0.], dtype=float32)

## Calculate the linear function   
$y=xW + b$

![](images/linear.jpg)

In [20]:
y = tf.add(tf.matmul(x, W), b)
sess.run(y)[:10]

array([[-3.21003699,  1.57533538,  0.76778084, -0.08080262,  0.38994288],
       [ 0.10521334, -2.03897095, -0.10381129,  1.59654737, -2.12001419],
       [ 2.67974472,  0.5302121 ,  0.70923263,  0.91936636, -1.91123331],
       [ 0.06339495,  1.68909192,  1.76008499, -0.25631928,  3.09792948],
       [-1.24846613, -3.65432572, -1.36266029,  1.56352353, -0.31966019],
       [ 2.00285029, -1.44086969,  1.31967235,  1.45819366, -2.31067204],
       [ 1.04547918, -0.48274046,  1.90127373,  3.09686327, -1.12880874],
       [-0.77865219, -1.7714349 ,  0.35757378,  0.81294727, -1.32572901],
       [-2.71481752, -2.95453715,  0.03781496,  1.68836153, -2.41609144],
       [ 1.00461364, -0.07493253,  0.37313727, -2.09890938, -1.20498121]], dtype=float32)
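
The shapes explain the result: an input of shape (120, 6) times weights of shape (6, 5) gives (120, 5), and the bias of length 5 is added to every row. Below is a small worked example in plain Python with made-up numbers. The helpers `matmul` and `add_bias` and the `*_small` values are hypothetical and only mirror what `tf.matmul` and `tf.add` do on a tiny case.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix given as nested lists."""
    assert len(a[0]) == len(b), "inner dimensions must match"
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add_bias(rows, bias):
    """Add the bias vector to every row, as broadcasting does in tf.add."""
    return [[value + bias[j] for j, value in enumerate(row)] for row in rows]


x_small = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]]   # 2 data vectors of length 3
W_small = [[1.0, 0.0],
           [0.0, 1.0],
           [1.0, 1.0]]        # 3 inputs mapped to 2 categories
b_small = [0.5, -0.5]         # one bias per category

y_small = add_bias(matmul(x_small, W_small), b_small)
print(y_small)  # [[4.5, 4.5], [10.5, 10.5]]
```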

# Calculate the softmax

The softmax function rescales each row of $y$ so that every value lies between $0$ and $1$ and the values in a row sum to $1$.

![](images/softmax.jpg)

![](images/softmaxy.jpg)

In [21]:
y = tf.nn.softmax(y)    
sess.run(y)[:10]

array([[ 0.00428082,  0.51261044,  0.22859724,  0.09784438,  0.15666719],
       [ 0.15433052,  0.01808191,  0.12521996,  0.68569332,  0.01667431],
       [ 0.69538766,  0.08103952,  0.09692692,  0.11959262,  0.00705327],
       [ 0.03025218,  0.15374035,  0.16505161,  0.02197387,  0.62898201],
       [ 0.04726622,  0.00426284,  0.04216548,  0.78665167,  0.11965372],
       [ 0.46939933,  0.01499526,  0.23705114,  0.27227083,  0.00628353],
       [ 0.08724091,  0.01892443,  0.20529906,  0.6786173 ,  0.00991834],
       [ 0.10024288,  0.03714442,  0.31225556,  0.49235275,  0.05800442],
       [ 0.00994737,  0.00782708,  0.15601324,  0.8128019 ,  0.01341045],
       [ 0.49353677,  0.16767895,  0.26246583,  0.02215525,  0.05416325]], dtype=float32)
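
Written out, the softmax of a row is $\mathrm{softmax}(y)_j = e^{y_j} / \sum_k e^{y_k}$. The plain-Python sketch below applies it to the first row of the linear output shown earlier. The function is a hypothetical stand-in for `tf.nn.softmax`, and subtracting the row maximum is a common numerical-stability trick that does not change the result.

```python
import math


def softmax(row):
    """Softmax of one row: exponentiate, then normalize so the row sums to 1."""
    largest = max(row)
    exps = [math.exp(v - largest) for v in row]  # shift by max for stability
    total = sum(exps)
    return [e / total for e in exps]


# First row of the linear output y printed earlier.
logits_row = [-3.21003699, 1.57533538, 0.76778084, -0.08080262, 0.38994288]
probs = softmax(logits_row)

print([round(p, 4) for p in probs])
print(sum(probs))
```

The values should closely match the first row of the `tf.nn.softmax` output above, and they add up to 1 (up to floating-point error).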

# Calculate Cross Entropy in TensorFlow

In [104]:
# start from here 
y_hat_data = mnist.test.labels.astype(np.float32) # just to simulate, load labels from the MNIST dataset
y_hat_data = y_hat_data[:,[1, 2, 3, 4, 5]][:120]
print(y_hat_data[:10]) # show the first 10 rows 
y_hat = tf.Variable(y_hat_data)
sess.run(tf.global_variables_initializer())
print(y_hat)

[[ 0.  0.  0.  0.  0.]
 [ 0.  1.  0.  0.  0.]
 [ 1.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  1.  0.]
 [ 1.  0.  0.  0.  0.]
 [ 0.  0.  0.  1.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  1.]
 [ 0.  0.  0.  0.  0.]]
<tf.Variable 'Variable_13:0' shape=(120, 5) dtype=float32_ref>


![](images/crossentropydata.jpg)

In [114]:
crosse = -tf.reduce_sum(tf.multiply(tf.log(y), y_hat))
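
The expression above computes $-\sum_i \sum_j \hat{y}_{ij} \ln y_{ij}$, where $y$ holds the softmax outputs and $\hat{y}$ (`y_hat`) holds the labels. Only the entry for the true class contributes, and a row whose label is all zeros (as happens above for digits outside the selected columns) contributes nothing. A plain-Python sketch with made-up numbers:

```python
import math


def cross_entropy(predicted_rows, label_rows):
    """Sum of -label * ln(prediction) over every row and category."""
    total = 0.0
    for predicted, labels in zip(predicted_rows, label_rows):
        for p, t in zip(predicted, labels):
            if t != 0.0:  # zero labels add nothing (for p > 0), so skip them
                total -= t * math.log(p)
    return total


# Hypothetical softmax outputs and one-hot labels for three samples.
predicted = [[0.7, 0.2, 0.1],
             [0.1, 0.8, 0.1],
             [0.3, 0.3, 0.4]]
labels = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0]]  # all-zero row: contributes nothing

loss = cross_entropy(predicted, labels)
print(loss)  # equals -(ln 0.7 + ln 0.8)
```

The loss shrinks toward 0 as the predicted probability of the true class approaches 1.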

# Mini Batching 

Training a model on a large dataset all at once takes a lot of memory, which may be unrealistic even when memory is cheap. As a solution, we can use mini-batching: take subsets of the data and train the model on one subset at a time. This is not ideal, but it allows us to train the model with much less memory.
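
A rough back-of-the-envelope calculation shows the difference for the feature matrix alone. It assumes 55,000 training images (the size of `mnist.train` in the tutorial loader, not stated in this document), 784 float32 values per image, and a batch size of 128.

```python
BYTES_PER_FLOAT32 = 4
n_input = 784      # 28 * 28 pixels per image
n_train = 55000    # assumption: number of images in mnist.train
batch_size = 128   # assumption: batch size used for the comparison

# Memory needed to hold the features fed to the model in one step.
full_bytes = n_train * n_input * BYTES_PER_FLOAT32
batch_bytes = batch_size * n_input * BYTES_PER_FLOAT32

print(full_bytes / 1e6, "MB for the whole training set")
print(batch_bytes / 1e6, "MB for one mini-batch")
```

MNIST is small enough to fit either way; the gap becomes decisive for datasets with larger images or many more samples.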

In [30]:
n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)

# Import MNIST data
# mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)

# The features are already scaled and the data is shuffled
train_features = mnist.train.images
test_features = mnist.test.images

train_labels = mnist.train.labels.astype(np.float32)
test_labels = mnist.test.labels.astype(np.float32)

# Weights & bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

## Divide the data into batches

## Create the model


![](images/graph1.jpg)

In [154]:
learning_rate = 0.001
n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)

# The features are already scaled and the data is shuffled
train_features = mnist.train.images
test_features = mnist.test.images

train_labels = mnist.train.labels.astype(np.float32)
test_labels = mnist.test.labels.astype(np.float32)

# Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

# Weights & bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))
 
# Logits - xW + b
logits = tf.add(tf.matmul(features, weights), bias)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


# TODO: Set batch size
batch_size = 128
assert batch_size is not None, 'You must set the batch size'
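
Each run of the `optimizer` op performs one gradient descent update, moving every parameter a small step against its gradient: $w \leftarrow w - \eta \, \partial \text{cost} / \partial w$, with $\eta$ the learning rate. The sketch below applies the same rule to a made-up one-dimensional function. The function names and the toy objective are hypothetical; it only illustrates why a single step with a small learning rate barely moves the parameters.

```python
def gradient_descent(grad_fn, start, learning_rate, steps):
    """Repeatedly apply w <- w - learning_rate * gradient(w)."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * grad_fn(w)
    return w


def grad(w):
    """Gradient of the toy objective f(w) = (w - 3)^2, minimized at w = 3."""
    return 2.0 * (w - 3.0)


one_step = gradient_descent(grad, start=0.0, learning_rate=0.001, steps=1)
many_steps = gradient_descent(grad, start=0.0, learning_rate=0.001, steps=5000)

print(one_step)    # barely moved from the starting point 0.0
print(many_steps)  # close to the minimum at 3.0
```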

# Train the model using entire dataset

In [149]:
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    # Use the entire dataset
    sess.run(optimizer, feed_dict={features: train_features, labels: train_labels})
    test_accuracy = sess.run(
        accuracy,
        feed_dict={features: test_features, labels: test_labels})

print('Test Accuracy: {}'.format(test_accuracy))


Test Accuracy: 0.10689999908208847
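
The accuracy is the fraction of samples whose highest logit falls on the labelled class. With ten classes, guessing at random gives roughly 0.1, so a value near 0.1 after a single update means the model has learned almost nothing yet. A plain-Python sketch of the same calculation on made-up data (the helper names and numbers are hypothetical):

```python
def argmax(row):
    """Index of the largest value in a row."""
    return max(range(len(row)), key=lambda j: row[j])


def accuracy_of(logit_rows, label_rows):
    """Fraction of rows where the predicted class matches the labelled class."""
    hits = [argmax(l) == argmax(t) for l, t in zip(logit_rows, label_rows)]
    return sum(hits) / len(hits)


logit_rows = [[2.0, 0.1, -1.0],
              [0.3, 0.2, 0.9],
              [1.5, 1.6, 0.0],
              [0.0, 0.0, 3.0]]
label_rows = [[1, 0, 0],
              [0, 1, 0],
              [0, 1, 0],
              [0, 0, 1]]

print(accuracy_of(logit_rows, label_rows))  # 0.75 (3 of 4 rows match)
```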


# Train the model using Mini-batches

## Create batches
![](images/batch.jpg)

In [153]:
def batches(batch_size, features, labels):
    """
    Create batches of features and labels
    :param batch_size: The batch size
    :param features: List of features
    :param labels: List of labels
    :return: Batches of (Features, Labels)
    """
    assert len(features) == len(labels)
    output_batches = []
    
    sample_size = len(features)
    # Step through the data in strides of batch_size; the last batch may be smaller
    for start_i in range(0, sample_size, batch_size):
        end_i = start_i + batch_size
        batch = [features[start_i:end_i], labels[start_i:end_i]]
        output_batches.append(batch)
        
    return output_batches
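
A quick check on toy data shows how the slicing behaves when the sample count is not a multiple of the batch size. The block repeats a compact version of the function so that it runs on its own; the `toy_*` names and values are made up for illustration.

```python
def batches(batch_size, features, labels):
    """Compact equivalent of the function above: slice in strides of batch_size."""
    assert len(features) == len(labels)
    return [[features[i:i + batch_size], labels[i:i + batch_size]]
            for i in range(0, len(features), batch_size)]


toy_features = [[i, i] for i in range(10)]  # 10 samples with 2 features each
toy_labels = [i % 2 for i in range(10)]     # 10 matching labels

toy_batches = batches(4, toy_features, toy_labels)
sizes = [len(batch_features) for batch_features, _ in toy_batches]

print(sizes)  # [4, 4, 2] -- the last batch holds the remainder
```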


## Train 

In [152]:
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)    
    # Train the optimizer for each batch
    for batch_features, batch_labels in batches(batch_size, train_features, train_labels):
        sess.run(optimizer, feed_dict={features: batch_features, labels: batch_labels})

    # Calculate accuracy for test dataset
    test_accuracy = sess.run(
        accuracy,
        feed_dict={features: test_features, labels: test_labels})

print('Test Accuracy: {}'.format(test_accuracy))


Test Accuracy: 0.06440000236034393


# Epochs

- An epoch is a single pass (forward and backward) over the entire dataset. If we are using mini-batches, we have to train on all the batches within an epoch.
- Training for multiple epochs can increase the accuracy of the model without requiring more data.

![](images/epochs.jpg)
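
In code, epochs are just an outer loop around the batch loop, so the number of parameter updates is the number of epochs times the number of batches per epoch. The sketch below counts those updates in plain Python. The figures are assumptions: 55,000 training samples (the size of `mnist.train` in the tutorial loader), a batch size of 128, and a hypothetical 10 epochs.

```python
import math

n_train = 55000   # assumption: number of training samples
batch_size = 128  # assumption: batch size
epochs = 10       # hypothetical number of epochs

batches_per_epoch = math.ceil(n_train / batch_size)

updates = 0
for epoch in range(epochs):
    # One pass over the whole dataset, one batch at a time.
    for start in range(0, n_train, batch_size):
        updates += 1  # stands in for one sess.run(optimizer, ...) call

print(batches_per_epoch)  # 430
print(updates)            # 4300
```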