# [TensorFlow](https://www.tensorflow.org/) basics

In this tutorial we are going to classify images from the notMNIST dataset. The goal is to automatically detect the letter shown in each image of the dataset.

In [2]:
import tensorflow as tf

In [3]:
# Create TensorFlow object called hello_constant
hello_constant = tf.constant('Hello World!')

with tf.Session() as sess:
    # Run the tf.constant operation in the session
    output = sess.run(hello_constant)
    print(output)

b'Hello World!'


## Tensor

In TensorFlow, data is not stored as integers, floats, or strings. These values are encapsulated in an object called a [tensor](https://en.wikipedia.org/wiki/Tensor). In the case of `hello_constant = tf.constant('Hello World!')`, `hello_constant` is a 0-dimensional string tensor, but tensors come in a variety of sizes, as shown below:

In [5]:
# A is a 0-dimensional int32 tensor
A = tf.constant(1234)

# B is a 1-dimensional int32 tensor
B = tf.constant([123, 456, 789])

# C is a 2-dimensional int32 tensor
C = tf.constant([[123, 456, 789], [222, 333, 444]])

The tensor returned by `tf.constant()` is called a constant tensor, because the value of the tensor never changes.
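
To make the 0-, 1-, and 2-dimensional cases concrete, here is a minimal sketch in plain Python, without TensorFlow. The helper `nested_shape` is hypothetical and only for illustration; it assumes non-empty, regularly nested lists. The number of dimensions is simply how deeply the values are nested.

```python
def nested_shape(value):
    """Return the shape of a regularly nested list as a tuple.

    A plain number has shape (), a flat list has shape (n,), and so on.
    Assumes every list is non-empty and that all sub-lists at a given
    depth have the same length.
    """
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0]
    return tuple(shape)


shape_a = nested_shape(1234)                                # ()
shape_b = nested_shape([123, 456, 789])                     # (3,)
shape_c = nested_shape([[123, 456, 789], [222, 333, 444]])  # (2, 3)

# The number of dimensions is the length of the shape tuple.
print(len(shape_a), len(shape_b), len(shape_c))
```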

## Session

TensorFlow’s API is built around the idea of a computational graph. The previous TensorFlow code can be turned into a graph:

![TensorFlow_Session](./figures/session.png)

A "TensorFlow Session", as shown above, is an environment for running a graph. The session is in charge of allocating the operations to GPU(s) and/or CPU(s), including remote machines.

In [13]:
with tf.Session() as sess:
    # Run the tf.constant operation in the session
    output = sess.run(hello_constant)

The code has already created the tensor, `hello_constant`, from the previous lines. The next step is to evaluate the tensor in a session.

The code creates a session instance, `sess`, using `tf.Session`. The `sess.run()` function then evaluates the tensor and returns the results.

## Input

To use a non-constant value, we use [`tf.placeholder()`](https://www.tensorflow.org/api_docs/python/tf/placeholder) and `feed_dict`. Next we go over the basics of feeding data into TensorFlow.

### tf.placeholder()

We use [`tf.placeholder()`](https://www.tensorflow.org/api_docs/python/tf/placeholder) as a placeholder for arbitrary input data. This allows TensorFlow to take in different datasets with different parameters.

[`tf.placeholder()`](https://www.tensorflow.org/api_docs/python/tf/placeholder) returns a tensor that gets its value from data passed to the `tf.Session.run()` function, allowing you to set the input right before the session runs.

### Session's feed_dict

In [17]:
x = tf.placeholder(tf.string)

with tf.Session() as sess:
    output = sess.run(x, feed_dict={x: 'Hello World'})

Use the `feed_dict` parameter in [`tf.Session.run()`](https://www.tensorflow.org/api_docs/python/tf/Session#run) to set the placeholder tensor. The above example shows the tensor `x` being set to the string `'Hello World'`. It's also possible to set more than one tensor using `feed_dict`, as shown below.

In [18]:
x = tf.placeholder(tf.string)
y = tf.placeholder(tf.int32)
z = tf.placeholder(tf.float32)

with tf.Session() as sess:
    output = sess.run(x, feed_dict={x: 'Test String', y: 123, z: 45.67})

**Note:** If the data passed to the `feed_dict` doesn’t match the tensor type and can’t be cast into the tensor type, we get the error `ValueError: invalid literal for ...`.

## TensorFlow Math

After getting the input, we are going to use it by applying basic math functions (add, subtract, multiply, and divide) to tensors. There are many more math functions; see the [documentation](https://www.tensorflow.org/api_docs/python/math_ops/).

### Addition

In [16]:
x = tf.add(5, 2)  # 7

The [`tf.add()`](https://www.tensorflow.org/api_guides/python/math_ops) function does exactly what you expect it to do. It takes in two numbers, two tensors, or one of each, and returns their sum as a tensor.

### Subtraction and Multiplication

Here’s an example with subtraction and multiplication.

In [27]:
x = tf.subtract(10, 4) # 6
y = tf.multiply(2, 5)  # 10

The `x` tensor will evaluate to `6`, because `10 - 4 = 6`. The `y` tensor will evaluate to `10`, because `2 * 5 = 10`.

### Converting types

It may be necessary to convert between types to make certain operators work together. For example, if we tried the following, it would fail with an exception:

In [22]:
# tf.subtract(tf.constant(2.0), tf.constant(1))
# Fails with ValueError: Tensor conversion requested dtype float32
# for Tensor with dtype int32

That's because the constant `1` is an integer but the constant `2.0` is a floating point value, and `tf.subtract()` expects them to match.

In cases like these, we can either make sure our data is all of the same type, or cast a value to another type. In this case, converting the `2.0` to an integer before subtracting, like so, gives the correct result:

In [23]:
tf.subtract(tf.cast(tf.constant(2.0), tf.int32), tf.constant(1))   # 1

<tf.Tensor 'Sub_3:0' shape=() dtype=int32>

In [28]:
x = tf.constant(10)
y = tf.constant(2)
z = tf.subtract(tf.cast(tf.divide(x, y), tf.int32), tf.constant(1))

# Note: TensorFlow has multiple ways to divide.
#   tf.divide(x,y) uses Python 3 division semantics and will return a float here
#          It would be the best choice if all the other values had been floats
#   tf.div(x,y) uses Python 2 division semantics and will return an integer here
#          TensorFlow documentation suggests we should prefer tf.divide
#   tf.floordiv(x,y) will do floating point division and then round down to the nearest
#          integer (but the documentation says it may still represent
#          its result as a floating point value)
#   tf.cast(tf.divide(x,y), tf.int32)
#          This lets us do floating point division and then cast it to an integer
#          to match the 1 passed to subtract


# Print z from a session
with tf.Session() as sess:
    output = sess.run(z)
    print(output)

4
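
The comments above refer to Python 2 and Python 3 division semantics. As a rough analogy in plain Python 3 (not TensorFlow code), the sketch below shows the difference between true division and floor division, and the divide-then-convert pattern used to compute `z`. Here `int()` stands in for the cast to an integer type; this is an illustration of the idea rather than of TensorFlow's behavior in every case.

```python
x = 10
y = 2

true_quotient = x / y      # Python 3 "true" division: always a float
floor_quotient = x // y    # floor division: stays an integer for integer inputs

# Convert the float quotient to an integer before subtracting an integer,
# mirroring the cast-then-subtract pattern used for z above.
z = int(x / y) - 1

print(true_quotient, floor_quotient, z)
```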


### Recap

We did the following:
- Ran operations in [`tf.Session`](https://www.tensorflow.org/api_docs/python/tf/Session).
- Created a constant tensor with [`tf.constant()`](https://www.tensorflow.org/api_docs/python/tf/constant).
- Used [`tf.placeholder()`](https://www.tensorflow.org/api_docs/python/tf/placeholder) and `feed_dict` to get input.
- Applied the [`tf.add()`](https://www.tensorflow.org/api_docs/python/tf/add), [`tf.subtract()`](https://www.tensorflow.org/api_docs/python/tf/subtract), [`tf.multiply()`](https://www.tensorflow.org/api_docs/python/tf/multiply), and [`tf.divide()`](https://www.tensorflow.org/api_docs/python/tf/divide) functions using numeric data.
- Learned about casting between types with [`tf.cast()`](https://www.tensorflow.org/api_docs/python/tf/cast).

These are the basics of TensorFlow. Next we learn about one of the most popular applications of neural networks - classification.

## Supervised Classification

For background, see [statistical classification](https://en.wikipedia.org/wiki/Statistical_classification) on Wikipedia.

## Training a Logistic Classifier

A logistic classifier takes an input $X$, e.g. the pixels of an image, and applies a linear function to it to generate its predictions.

$$
WX + b = y
$$

$W$ is the weight matrix and $b$ is the bias term. The output vector $y$ reflects the class of the input. Ideally it would be a probability vector, where the probability of the correct class is very close to one and the probability of every other class is close to zero.
The way to turn scores into probabilities is to use a softmax function:

$$
S(y_i) = \frac{e^{y_i}}{\sum_j e^{y_j}}
$$

This function, denoted by $S$, can turn any set of scores into proper probabilities.
Proper probabilities sum to one; they are large when the scores are large and small when the scores are comparatively smaller. In the context of logistic regression, scores are also often called logits.

By training our network, we try to find values for the weights and bias that produce correct predictions.



## Linear functions in TensorFlow

The most common operation in neural networks is calculating the linear combination of inputs, weights, and biases. As a reminder, we can write the output of the linear operation as

$$
y = xW + b
$$

Here, $W$ is a matrix of the weights connecting two layers. The output $y$, the input $x$, and the biases $b$ are all vectors.
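
As a small worked instance of $y = xW + b$, the sketch below computes the linear operation in plain Python with nested lists. The helper `linear` and all of the numbers are hypothetical, chosen only to show how the shapes fit together: 3 input features and 2 outputs, so $W$ has 3 rows and 2 columns.

```python
def linear(x, W, b):
    """Compute y = xW + b for a row vector x, a matrix W, and a bias vector b."""
    n_inputs = len(W)
    n_outputs = len(b)
    return [
        sum(x[i] * W[i][j] for i in range(n_inputs)) + b[j]
        for j in range(n_outputs)
    ]


# Hypothetical values: 3 input features and 2 outputs.
x = [1.0, 2.0, 3.0]
W = [[0.1, 0.4],
     [0.2, 0.5],
     [0.3, 0.6]]
b = [0.5, -0.5]

y = linear(x, W, b)
print(y)  # one output value per column of W
```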

### Weights and Bias in TensorFlow

The goal of training a neural network is to modify weights and biases to best predict the labels. In order to use weights and bias, we'll need a Tensor that can be modified. This leaves out [`tf.placeholder()`](https://www.tensorflow.org/api_docs/python/tf/placeholder) and [`tf.constant()`](https://www.tensorflow.org/api_docs/python/tf/constant), since those Tensors can't be modified. This is where the [`tf.Variable`](https://www.tensorflow.org/api_docs/python/tf/Variable) class comes in.

#### tf.Variable()

In [29]:
x = tf.Variable(5)

The [`tf.Variable`](https://www.tensorflow.org/api_docs/python/tf/Variable) class creates a tensor with an initial value that can be modified, much like a normal Python variable. This tensor stores its state in the session, so we must initialize the state of the tensor manually. We'll use the [`tf.global_variables_initializer()`](https://www.tensorflow.org/api_docs/python/tf/global_variables_initializer) function to initialize the state of all the Variable tensors.

##### Initialization

In [30]:
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)

The [`tf.global_variables_initializer()`](https://www.tensorflow.org/api_docs/python/tf/global_variables_initializer) call returns an operation that will initialize all TensorFlow variables from the graph. We call the operation using a session to initialize all the variables as shown above. Using the [`tf.Variable`](https://www.tensorflow.org/api_docs/python/tf/Variable) class allows us to change the weights and bias, but an initial value needs to be chosen.

Initializing the weights with random numbers from a normal distribution is good practice. Randomizing the weights helps keep the model from becoming stuck in the same place every time we train it. We'll see more about this in the next section, when we study gradient descent.

Similarly, choosing weights from a normal distribution prevents any one weight from overwhelming other weights. We'll use the [`tf.truncated_normal()`](https://www.tensorflow.org/api_docs/python/tf/truncated_normal) function to generate random numbers from a normal distribution.

#### tf.truncated_normal()

In [31]:
n_features = 120
n_labels = 5
weights = tf.Variable(tf.truncated_normal((n_features, n_labels)))

The [`tf.truncated_normal()`](https://www.tensorflow.org/api_docs/python/tf/truncated_normal) function returns a tensor with random values from a normal distribution whose magnitude is no more than 2 standard deviations from the mean.
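
To illustrate what "no more than 2 standard deviations from the mean" means, here is a plain-Python sketch that redraws any value falling outside that range. The helper `truncated_normal_sample` is hypothetical and is not TensorFlow's implementation; it only shows the idea using the standard `random` module.

```python
import random


def truncated_normal_sample(mean=0.0, stddev=1.0, rng=random):
    """Draw one value from a normal distribution, redrawing any value that
    falls more than 2 standard deviations from the mean."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [truncated_normal_sample(rng=rng) for _ in range(1000)]

print(min(samples), max(samples))  # both lie within [-2.0, 2.0]
```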

Since the weights are already helping prevent the model from getting stuck, you don't need to randomize the bias. Let's use the simplest solution, setting the bias to 0.

#### tf.zeros()

In [32]:
n_labels = 5
bias = tf.Variable(tf.zeros(n_labels))

The [`tf.zeros()`](https://www.tensorflow.org/api_docs/python/tf/zeros) function returns a tensor with all zeros.

### Linear Classifier Example