# <center>Introduction to Tensorflow Notes</center>

<center><i>Notes from the Tensorflow section from the Self Driving Car Nanodegree on Udacity</i></center>

#### Notable Tensorflow Functions

1. `tf.matmul(a, b)` - the matrix product of `a` and `b` (rows of `a` dotted with columns of `b`)
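To make the shape rule concrete, here is a minimal pure-Python sketch of a matrix product. The helper name `matmul` is only for illustration and is not the TensorFlow implementation; it assumes both inputs are rectangular lists of lists.

```python
def matmul(a, b):
    """Matrix product of a (n x k) and b (k x m), as nested lists."""
    assert len(a[0]) == len(b), "inner dimensions must match"
    # Entry (i, j) is the dot product of row i of a with column j of b.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

a = [[1, 2, 3],
     [4, 5, 6]]         # shape (2, 3)
b = [[7, 8],
     [9, 10],
     [11, 12]]          # shape (3, 2)
product = matmul(a, b)  # shape (2, 2)
print(product)          # [[58, 64], [139, 154]]
```

The inner dimensions (3 and 3 here) have to match, and the result takes the outer dimensions.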



#### Helpful Tips

Inputs like features and labels use placeholders, not variables.

Weights and biases use variables because they are supposed to change throughout the `tf.Session()`.

In [1]:
import tensorflow as tf

## Tensorflow Constants

In [2]:
hello_constant = tf.constant('Hellow World!')

This sets the variable `hello_constant` to a TensorFlow Tensor object. In this case, `hello_constant` is a 0-dimensional string tensor (a scalar, with shape `()`). Let's see what happens when you try to print it:

In [8]:
print(hello_constant)
print(type(hello_constant))

Tensor("Const:0", shape=(), dtype=string)
<class 'tensorflow.python.framework.ops.Tensor'>


If we want to print the constant as the string we passed to `tf.constant()`, we have to evaluate it inside a `tf.Session()`:

In [9]:
with tf.Session() as sess:
    
    output = sess.run(hello_constant)
    print(output)
    print(type(output))

b'Hellow World!'
<class 'bytes'>


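As you can see, it still doesn't print the constant as a Python `str`. `sess.run()` returns string tensors as `bytes`, which is why the output carries the `b'...'` prefix; calling `.decode()` on it gives a regular string.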
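The `bytes` result can be turned into a regular Python string with `.decode()`. In this short sketch the `output` value is typed in by hand to stand in for what `sess.run()` returned above, and UTF-8 is assumed as the encoding.

```python
# Stand-in for the value returned by sess.run(hello_constant) above.
output = b'Hellow World!'

# bytes -> str; UTF-8 is an assumption about the encoding.
text = output.decode('utf-8')
print(text)
print(type(text))
```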

`tf.constant()` objects can be more than just 0-dimensional strings

In [16]:
string = tf.constant('abc')
print(string.shape, string.dtype)

A = tf.constant(1234)
print(A.shape, A.dtype)

B = tf.constant([123, 456, 789])
print(B.shape, B.dtype)

C = tf.constant([[123, 456, 789], [987, 654, 321]])
print(C.shape, C.dtype)

() <dtype: 'string'>
() <dtype: 'int32'>
(3,) <dtype: 'int32'>
(2, 3) <dtype: 'int32'>


`obj.dtype` returns the data type of the object. As you can see, `tf.constant()` objects can be any shape. Object `C` is a 2x3 matrix and would be written like so:

`|123 456 789|`

`|987 654 321|`

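`tf.Session()` creates a session, the environment in which the graph is run. In the example above, `hello_constant` is evaluated inside one.

A session is responsible for allocating the operations to devices such as CPUs and GPUs.

`sess.run()` evaluates the tensor and returns the result.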

## Inputs with non-constants

If we want to pass in a non-constant value, we first have to instantiate a `tf.placeholder()`:

In [48]:
x = tf.placeholder(tf.string)
y = tf.placeholder(tf.int32, shape = (3,))
print(x)
print(x.shape)
print(x.dtype)
print('\n')
print(y)
print(y.shape)
print(y.dtype)

Tensor("Placeholder_19:0", dtype=string)
<unknown>
<dtype: 'string'>


Tensor("Placeholder_20:0", shape=(3,), dtype=int32)
(3,)
<dtype: 'int32'>


Starting with `x`: we've instantiated a `tf.placeholder()` with its one required positional argument, the data type. We've set it to `tf.string`, which we will use later. Because we didn't pass the `shape` parameter, we can feed in a string value of any shape.

With `y`, we've instantiated it with dtype `tf.int32` and a shape of `(3,)`. This means we can only feed in int32 data with a shape of `(3,)`.

In [49]:
with tf.Session() as sess:
    output = sess.run(x, feed_dict={x: 'Hello World'})
    print(output)
    print(type(output))

Hello World
<class 'numpy.ndarray'>


In [50]:
with tf.Session() as sess:
    output = sess.run(y, feed_dict = {y: [123, 456, 789]})
    print(output)
    print(type(output))

[123 456 789]
<class 'numpy.ndarray'>


If we have a placeholder and want to run something using the `tf.Session()`, we can use the `feed_dict` parameter to set `x` and `y` to 'Hello World' and [123, 456, 789] respectively.

The biggest takeaway from printing the types of the outputs is that both are NumPy arrays. This comes from `sess.run()` rather than from `tf.placeholder`: per the `tf.Session.run` documentation, a fetched `tf.Tensor` is returned as a NumPy `ndarray` holding its value (although the scalar string constant earlier came back as plain `bytes`).

You can also pass both `x` and `y` in at the same time

In [51]:
with tf.Session() as sess:
    output_x = sess.run(x, feed_dict={x:'Test', y: [1, 2, 3]})
    print(output_x)

Test


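Here only `x` is fetched, so only its value is returned. To get both back from one call, pass a list of fetches as the first argument, e.g. `sess.run([x, y], feed_dict={x: 'Test', y: [1, 2, 3]})`.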

## Tensorflow Math

In [64]:
x = tf.add(5, 2)
print(x)

y = tf.subtract(10, 4)
print(y)

z = tf.multiply(2, 5)
print(z)

with tf.Session() as sess:
    print('x: ' + str(sess.run(x)))
    print('y: ' + str(sess.run(y)))
    print('z: ' + str(sess.run(z)))

Tensor("Add_14:0", shape=(), dtype=int32)
Tensor("Sub_3:0", shape=(), dtype=int32)
Tensor("Mul_3:0", shape=(), dtype=int32)
x: 7
y: 6
z: 10


All operands must have the same data type when using TensorFlow math ops:

In [66]:
#tf.subtract(tf.constant(2.0), tf.constant(1))
# TypeError: Input 'y' of 'Sub' Op has type int32 
# that does not match type float32 of argument 'x'.

Instead, cast one operand so that both share a type: either the float to `tf.int32` or the int to `tf.float32`.

In [74]:
sub = tf.subtract(tf.cast(tf.constant(2.1), tf.int32), tf.constant(1))

with tf.Session() as sess:
    print(sess.run(sub))

1


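When casting from a float to an int, the fractional part is <b><u>ALWAYS</u></b> dropped (truncation toward zero), so positive values such as `2.1` and `2.9` both round down to `2`.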
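Python's built-in `int()` also drops the fractional part, so it can illustrate the difference between truncating and rounding down. This sketch assumes `tf.cast` truncates toward zero in the same way; the two behaviours only differ for negative values.

```python
import math

values = [2.1, 2.9, -2.1, -2.9]

truncated = [int(v) for v in values]        # drop the fractional part
floored = [math.floor(v) for v in values]   # always round down

print(truncated)  # [2, 2, -2, -2]
print(floored)    # [2, 2, -3, -3]
```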

QUIZ

Convert this python code into tensorflow math

In [80]:
x1 = 10
y1 = 2
z1 = x1/y1 - 1

# TODO: Print z from a session as the variable output
output1 = z1
print(output1)

4.0


1. `divxy = tf.divide(x, y)` - returns `x / y` as a float
2. `cast_divxy = tf.cast(divxy, tf.int32)` - casts `x / y` to an int32
3. `out = tf.subtract(cast_divxy, tf.constant(1))` - subtracts the int `1` from `int(x / y)`
4. `tf.cast(out, tf.float32)` - casts `out` back to a float32

In [81]:
x = tf.constant(10)
y = tf.constant(2)
z = x/y - 1
z = tf.cast(tf.subtract(tf.cast(tf.divide(x, y), tf.int32), 
                        tf.constant(1)), tf.float32)

# TODO: Print z from a session as the variable output
with tf.Session() as sess:
    output = sess.run(z)
    print(output)

4.0


## Tensorflow Linear Function

`tf.Variable()` creates a tensor with an initial value that can be changed, just like a normal Python variable

In [88]:
x = tf.Variable(5)

`tf.global_variables_initializer()` returns an op that initializes the state of all Variable tensors when it is run in a session:

In [90]:
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

`tf.Variable()` lets us change the weights and biases, but we do need to pass in an initial value. We will normally initialize the weights with random numbers from a normal distribution. Randomizing the weights helps keep the model from getting stuck in the same place every time it is trained.

We will use `tf.truncated_normal()` to generate these random numbers from a normal distribution. Its one required argument is the shape of the output, passed as a tuple or list such as `(n_features, n_labels)`.

It returns a tensor of random values that are no more than 2 standard deviations from the mean.

In [93]:
n_features = 120
n_labels = 5
weights = tf.Variable(tf.truncated_normal((n_features, n_labels)))

There is no need to randomize the bias. The easiest thing to do is to set the bias to 0.

In [94]:
n_labels = 5
bias = tf.Variable(tf.zeros(n_labels))

## Softmax using Numpy

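Softmax equation:

<b>softmax(x_i) = e^(x_i) / sum_j(e^(x_j))</b>

We have to specify which axis of the array to sum over. Here we sum along the first axis, so we set `axis = 0`: `np.sum(np.exp(x), axis = 0)`. For a 1-D list of logits that is the only axis; for a 2-D array it means each column is treated as one set of logits.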

In [98]:
import numpy as np

In [99]:
def softmax(x):
    return np.exp(x)/np.sum(np.exp(x), axis = 0)

In [100]:
logits = [3.0, 1.0, 0.2]
print(softmax(logits))

[0.8360188  0.11314284 0.05083836]
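For large logits `np.exp` can overflow. A common workaround, sketched here in plain Python rather than NumPy, is to subtract the largest logit before exponentiating, which leaves the result unchanged. The name `stable_softmax` is made up for this example.

```python
import math

def stable_softmax(logits):
    """Softmax over a flat list, shifted by the max to avoid overflow."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = stable_softmax([3.0, 1.0, 0.2])
print(probs)                            # roughly [0.836, 0.113, 0.051]

# math.exp(1000.0) on its own would raise OverflowError.
big = stable_softmax([1000.0, 999.0])
print(big)
```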


## Softmax using Tensorflow

In [102]:
def softmax_tensorflow():
    output = None
    logit_data = [2.0, 1.0, 0.1]
    logits = tf.placeholder(tf.float32)
    
    softmax = tf.nn.softmax(logits)
    
    with tf.Session() as sess:
        output = sess.run(softmax, feed_dict={logits: logit_data})
        
    return output
        
print(softmax_tensorflow())

[0.6590012  0.24243298 0.09856589]


## Cross Entropy

Cross entropy measures the distance between two probability vectors: (1) the softmax output and (2) the one-hot encoded correct label.

`S = softmax output`;
`L = onehot encoded output`

`D(S, L) = cross entropy`

`D(S, L) = - sum(L * log(S))`

`S = softmax(Wx + b)`
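As a worked example of the formula, here is a plain-Python sketch. The vectors are made-up values, and the helper name `cross_entropy` is hypothetical. With a one-hot label, only the log-probability of the correct class contributes.

```python
import math

def cross_entropy(S, L):
    """D(S, L) = -sum(L * log(S)) for two equal-length probability vectors."""
    # Skip terms where the label is 0 so log(0) is never evaluated.
    return -sum(l * math.log(s) for s, l in zip(S, L) if l != 0)

S = [0.7, 0.2, 0.1]   # softmax output (made-up)
L = [1.0, 0.0, 0.0]   # one-hot label: class 0 is correct

loss = cross_entropy(S, L)
print(loss)           # equals -log(0.7), about 0.357
```

The loss grows as the probability assigned to the correct class shrinks: with `S = [0.1, 0.2, 0.7]` and the same label it would be `-log(0.1)`, about 2.3.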

#### Multinomial Logistic Classification

<i>`D(S(Wx+b), L)`</i> (axis=1)

#### Training Loss

The average cross-entropy of the entire training set

n = number of training examples

`l = 1/n * sum(D(S(Wx+b), L))`

To minimize the training loss, we must update the weights and bias. The easiest way to do this is through gradient descent
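The training loss can be sketched in plain Python as the mean of the per-example cross-entropies. The softmax outputs and labels below are made-up values, and `training_loss` is a hypothetical helper name.

```python
import math

def cross_entropy(S, L):
    """D(S, L) = -sum(L * log(S)); zero-label terms are skipped."""
    return -sum(l * math.log(s) for s, l in zip(S, L) if l != 0)

def training_loss(outputs, labels):
    """Average cross-entropy over all (softmax output, one-hot label) pairs."""
    losses = [cross_entropy(S, L) for S, L in zip(outputs, labels)]
    return sum(losses) / len(losses)

outputs = [[0.7, 0.2, 0.1],   # fairly confident, correct
           [0.1, 0.8, 0.1],   # confident, correct
           [0.3, 0.3, 0.4]]   # unsure
labels = [[1, 0, 0],
          [0, 1, 0],
          [0, 0, 1]]

loss = training_loss(outputs, labels)
print(loss)   # about 0.499
```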

<b>Next Steps:</b>

1. How do you feed image pixels to this classifier?
2. Where do you initialize the optimization?

#### Numerical stability in Python

Adding `1e-6` to one billion a million times and then subtracting one billion should give exactly 1.0, but the code says it does not:

In [120]:
a = 1000000000
for i in range(1000000):
    a = a + 1e-6
print(a - 1000000000)

0.95367431640625


How do we avoid this?

We want our variables to have zero mean and equal variance whenever possible, so that values of very different magnitudes are not mixed in the same computation.
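Re-running the same experiment with a small starting value shows why magnitude matters. In this sketch (the function name is made up), the same million additions are applied once to one billion and once to 1.

```python
def accumulated_error(start, step=1e-6, n=1_000_000):
    """Add `step` to `start` n times; return the distance from the ideal n * step."""
    a = start
    for _ in range(n):
        a = a + step
    return abs((a - start) - n * step)

big_start_error = accumulated_error(1_000_000_000)
small_start_error = accumulated_error(1)

print(big_start_error)     # on the order of 0.05
print(small_start_error)   # many orders of magnitude smaller
```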

#### Normalize inputs

For each color channel in an image, subtract 128 from every pixel value and divide the result by 128:

`(R - 128) / 128`

`(G - 128) / 128`

`(B - 128) / 128`
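A minimal sketch of this normalization on a handful of made-up 8-bit channel values; the function name `normalize_channel` is hypothetical.

```python
def normalize_channel(pixels):
    """Map 8-bit values in [0, 255] to roughly [-1, 1) via (p - 128) / 128."""
    return [(p - 128) / 128 for p in pixels]

red = [0, 64, 128, 255]          # made-up channel values
normalized = normalize_channel(red)
print(normalized)                # [-1.0, -0.5, 0.0, 0.9921875]
```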

#### Normalize weights

Draw the weights randomly from a Gaussian distribution with mean 0 and standard deviation sigma.

Large sigma = output distribution with large peaks, so the model starts out very opinionated

Small sigma = output distribution close to uniform, so the model starts out uncertain

It is better to start out with uncertainty and let the model become more confident as it sees the data.
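One way to see why sigma matters: the logits are `Wx + b`, so scaling the weights scales the logits. This plain-Python sketch uses made-up logits and scales them directly, as a stand-in for drawing the weights with a larger or smaller sigma.

```python
import math

def softmax(logits):
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [3.0, 1.0, 0.2]                    # made-up values

large = softmax([v * 10 for v in logits])   # large-magnitude logits
small = softmax([v / 10 for v in logits])   # small-magnitude logits

print(large)   # almost all probability on one class
print(small)   # close to uniform
```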

## Measuring Performance and Overfitting

Using training, validation, and testing sets

The larger the validation set, the more precise your numbers will be

The bigger the test set, the less noise there will be

#### Rule of 30

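A change that affects at least 30 examples in the validation set, one way or the other, is usually statistically significant and can typically be trusted. So the larger the validation set, the smaller the change in accuracy you can trust: with 3,000 validation examples, 30 examples corresponds to a 1% change.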
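As a quick arithmetic sketch, assuming the rule is read as "a change that flips at least 30 validation examples can be trusted", the smallest trustworthy change in accuracy is 30 divided by the validation set size. The function name is hypothetical.

```python
def smallest_trusted_change(n_validation, n_examples=30):
    """Accuracy change (as a fraction) that corresponds to n_examples flips."""
    return n_examples / n_validation

for n in (3000, 30000):
    print(n, smallest_trusted_change(n))
```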

#### Stochastic Gradient Descent

Scales very well with both data and model size, but as an optimizer it is fundamentally pretty bad, because every step is based on a noisy estimate.

Instead of computing the actual loss, we compute an estimate of it. Take a small random sliver of the training data (for example 1/1000 of it), compute the loss and its derivative on that sliver, and pretend that derivative is the right direction to move in for the whole training set.
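The idea can be sketched on a toy problem in plain Python. Everything here is an assumption for illustration (the data `y = 3 * x`, the squared-error loss, the batch size and the learning rate); the point is only that each step uses the gradient from a small random batch.

```python
import random

random.seed(0)

# Made-up training set: y = 3 * x, so the ideal weight is 3.0.
xs = [i / 100 for i in range(1, 101)]
data = [(x, 3.0 * x) for x in xs]

w = 0.0
learning_rate = 0.5
batch_size = 10

for step in range(200):
    batch = random.sample(data, batch_size)   # small random sliver
    # Derivative of the mean squared error, computed on the batch only.
    grad = sum(2 * (w * x - y) * x for x, y in batch) / batch_size
    w -= learning_rate * grad                 # step as if it held for all data

print(w)   # close to 3.0
```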

#### Hyperparameters

If you ever have problems, ALWAYS try lowering your learning rate first

ADAGRAD: takes care of the initial learning rate, learning rate decay, and momentum for you. It often makes learning less sensitive to hyperparameters.
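For intuition, here is a minimal sketch of the basic AdaGrad update on a one-dimensional toy loss `(w - 3)^2`. It only shows how the step size is scaled by the accumulated squared gradients; the toy loss, starting point, and constants are assumptions for illustration, and this is not TensorFlow's implementation.

```python
import math

w = 0.0
learning_rate = 1.0
grad_sq_sum = 0.0      # running sum of squared gradients
eps = 1e-8             # avoids division by zero

for step in range(200):
    grad = 2 * (w - 3.0)              # d/dw of (w - 3)^2
    grad_sq_sum += grad ** 2
    # The effective step size shrinks as squared gradients accumulate.
    w -= learning_rate * grad / (math.sqrt(grad_sq_sum) + eps)

print(w)   # close to 3.0
```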

## Creating our Model

#### todo: a breakdown



In [124]:
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import numpy as np
import math
'''
Don't worry about how to do this right now; this was given as a helper
function
'''
def batches(batch_size, features, labels):
    """
    Create batches of features and labels
    :param batch_size: The batch size
    :param features: List of features
    :param labels: List of labels
    :return: Batches of (Features, Labels)
    """
    assert len(features) == len(labels)
    output_batches = []

    sample_size = len(features)
    for start_i in range(0, sample_size, batch_size):
        end_i = start_i + batch_size
        batch = [features[start_i:end_i], labels[start_i:end_i]]
        output_batches.append(batch)

    return output_batches

def print_epoch_stats(epoch, sess, last_features, last_labels):
    current_cost = sess.run(cost, feed_dict={features:last_features,
                                            labels: last_labels})
    valid_accuracy = sess.run(accuracy, feed_dict={features:valid_features,
                                                  labels: valid_label})
    print('Epoch: {:<4} - Cost: {:<8.3} Valid Accuracy: {:<5.3}'.format(
            epoch, current_cost, valid_accuracy))
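To see what the batching helper produces, here is a self-contained sketch with a compact stand-in for `batches` (same slicing logic, hypothetical name `make_batches`) applied to made-up data. Note that the last batch is smaller when the sample count is not a multiple of the batch size.

```python
def make_batches(batch_size, features, labels):
    """Compact stand-in for batches(): slice both lists in lockstep."""
    assert len(features) == len(labels)
    return [[features[i:i + batch_size], labels[i:i + batch_size]]
            for i in range(0, len(features), batch_size)]

features = [[f, f + 0.5] for f in range(7)]   # 7 made-up samples, 2 features each
labels = [f % 2 for f in range(7)]

example_batches = make_batches(3, features, labels)
print([len(b[0]) for b in example_batches])   # [3, 3, 1]
```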

In [None]:
'''
DO NOT RUN THIS CODE BLOCK

tensorflow datasets don't seem to be working
'''
learning_rate = 0.001
n_input = 784 # mnist is (28,28,1)
n_classes = 10 # number 0-9

mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot= True)

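train_features = mnist.train.images
valid_features = mnist.validation.images  # used by print_epoch_stats
test_features = mnist.test.images

train_labels = mnist.train.labels.astype(np.float32)
valid_label = mnist.validation.labels.astype(np.float32)  # name matches print_epoch_stats
test_labels = mnist.test.labels.astype(np.float32)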

features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

# one video has these and so does the lab but one has the one below
#weights = tf.Variable(tf.truncated_normal((n_input, n_classes)))
#bias = tf.Variable(tf.zeros(n_classes))

weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

# Logits: xW + b
logits = tf.add(tf.matmul(features, weights), bias)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                             labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


In [None]:
batch_size = 128
epochs = 10

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    
    for epoch in range(epochs):
    
        for batch_features, batch_labels in batches(batch_size, 
                                                    train_features, 
                                                    train_labels):
            sess.run(optimizer, feed_dict={features: batch_features, 
                                          labels: batch_labels})
            
        print_epoch_stats(epoch, sess, batch_features, batch_labels)
        
    test_accuracy = sess.run(accuracy, feed_dict={features:test_features,
                                                 labels:test_labels})
    print('Test Accuracy: {}'.format(test_accuracy))

## todo for the lab

1. Normalize the features
2. Use TensorFlow to create features, labels, weights, and biases
3. Tune the learning rate and epochs


1. Normalizing a Grayscale Image

    - Min-max scaling will map the pixel values into the range 0.1 to 0.9
    
    - Because the images are grayscale, the raw pixel values range from 0 to 255


In [127]:
# Step 1: Normalizing a Grayscale Image

def normalize_grayscale(image_data):
    
    a = 0.1
    b = 0.9
    x_min = 0
    x_max = 255
    
    X = a + ((image_data-x_min)*(b-a))/(x_max-x_min)
    
    return X
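A quick sanity check of the min-max formula on plain Python numbers. This is a scalar sketch with a hypothetical name, `min_max_scale`, using the same constants as the function above; the endpoints 0 and 255 should land on the ends of the target range.

```python
def min_max_scale(value, a=0.1, b=0.9, x_min=0, x_max=255):
    """Scalar min-max scaling: a + (x - x_min) * (b - a) / (x_max - x_min)."""
    return a + ((value - x_min) * (b - a)) / (x_max - x_min)

for pixel in (0, 127.5, 255):
    print(pixel, min_max_scale(pixel))   # approximately 0.1, 0.5, 0.9
```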

In [None]:
# Step 2: Features, labels, and weights

features_count = 784
labels_count = 10

# TODO: Set the features and labels tensors
features = tf.placeholder(tf.float32, [None, features_count])
labels = tf.placeholder(tf.float32, [None, labels_count])

# TODO: Set the weights and biases tensors
weights = tf.Variable(tf.truncated_normal((features_count, labels_count)))
biases = tf.Variable(tf.zeros(labels_count))