# Chapter One. Introduction to TensorFlow
What is graph-based computation? In this chapter, you'll learn about the engine that powers TensorFlow
and what makes it such an attractive choice for data science projects. We will talk about constants and
variables, basic operations, such as addition and multiplication, and advanced operations, such as
differentiation. By the end of the chapter, you'll know how to construct and solve graph-based
computational models.

> **Topics:**
- 1. Constants and variables
    - 1.1 Defining constants with convenience functions
    - 1.2 Defining variables
    - 1.3 Checking properties of tensors
- 2. Basic operations
    - 2.1 Performing element-wise multiplication
    - 2.2 Making predictions with matrix multiplication
    - 2.3 Summing over tensor dimensions
- 3. Advanced operations
    - 3.1 Reshaping tensors
    - 3.2 Optimizing with gradients
    - 3.3 Working with image data

In [1]:
import numpy as np
import tensorflow as tf

filepath = '../_datasets/'


## 1. Constants and variables

### What is TensorFlow?
- **Open-source library for graph-based numerical computation**
    - Developed by the Google Brain Team
- **Low and high level APIs**
    - Addition, multiplication, differentiation
    - Machine learning models
- **Important changes in TensorFlow 2.0**
    - Eager execution by default
    - Model building with Keras and Estimators

### What is a tensor?
- Generalization of vectors and matrices
- Collection of numbers
- Specific shape

![][tensor]

### Defining tensors in TensorFlow
```Python
import tensorflow as tf
# Single-element tensor of shape (1,); a true 0D scalar would be tf.ones(())
d0 = tf.ones((1,))

# 1D Tensor
d1 = tf.ones((2,))

# 2D Tensor
d2 = tf.ones((2, 2))

# 3D Tensor
d3 = tf.ones((2, 2, 2))

# Print the 3D tensor
print(d3.numpy())
```
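
To make the notion of dimensionality concrete, here is a minimal sketch that prints the shape and rank of a few tensors. The variable names are illustrative and not part of the original material; note that a true scalar has an empty shape and rank 0.

```Python
import tensorflow as tf

# Illustrative tensors of increasing rank
scalar = tf.constant(1.0)      # shape (), rank 0
vector = tf.ones((2,))         # shape (2,), rank 1
matrix = tf.ones((2, 2))       # shape (2, 2), rank 2
cube = tf.ones((2, 2, 2))      # shape (2, 2, 2), rank 3

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    # t.shape is a TensorShape; its rank is the number of dimensions
    print(name, t.shape, t.shape.rank)
```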

### Defining constants in TensorFlow
- **A constant is the simplest category of tensor**
    - Not trainable
    - Can have any dimension

```Python
from tensorflow import constant

# Define a 2x3 constant.
a = constant(3, shape=[2, 3])

# Define a 2x2 constant.
b = constant([1, 2, 3, 4], shape=[2, 2])
```

### Using convenience functions to define constants

![][02-functions]
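
The figure is not reproduced in text form, so the following is only a sketch of some convenience functions that TensorFlow provides for building constant tensors. The selection of functions and the variable names are assumptions and may not match the figure exactly.

```Python
import tensorflow as tf

zeros_23 = tf.zeros([2, 3])             # 2x3 tensor filled with 0.0
ones_23 = tf.ones([2, 3])               # 2x3 tensor filled with 1.0
sevens_23 = tf.fill([2, 3], 7)          # 2x3 tensor filled with the integer 7
zeros_like = tf.zeros_like(sevens_23)   # zeros with the shape and dtype of sevens_23
ones_like = tf.ones_like(sevens_23)     # ones with the shape and dtype of sevens_23

print(sevens_23.numpy())
print(zeros_like.dtype, ones_like.shape)
```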

### Defining and initializing variables

```Python
import tensorflow as tf

# Define a variable
a0 = tf.Variable([1, 2, 3, 4, 5, 6], dtype=tf.float32)
a1 = tf.Variable([1, 2, 3, 4, 5, 6], dtype=tf.int16)

# Define a constant
b = tf.constant(2, tf.float32)

# Compute their product
c0 = tf.multiply(a0, b)
c1 = a0*b
```

[tensor]: _Docs/01-tensor.png
[02-functions]: _Docs/02-functions.png

### 1.1 Defining constants with convenience functions
A constant is the simplest category of tensor. It can't be trained, which makes it a bad choice for a model's parameters, but a good choice for input data. Input data may be transformed after it is defined or loaded, but is typically not modified by the training process.

In this exercise, we will practice defining constants with the operations `fill()`, `ones_like()`, and `constant()`. The original exercise imports these operations individually from TensorFlow 2.0; in this notebook they are called through the `tf` alias imported in the first cell.

In [2]:
# Define a 3x4 tensor with all values equal to 9
A34 = tf.fill([3, 4], 9)

# Define a tensor of ones with the same shape as A34
B34 = tf.ones_like(A34)

# Define the one-dimensional vector, C1
C1 = tf.constant([1, 2, 3, 4])

# Print C1 as a numpy array
print(C1.numpy())

[1 2 3 4]


### 1.2 Defining variables
Unlike a constant, a variable's value can be modified. This will be quite useful when we want to train a model by updating its parameters. Constants can't be used for this purpose, so variables are the natural choice.

Let's try defining and working with a variable.

In [3]:
# Define the 1-dimensional variable A1
A1 = tf.Variable([1, 2, 3, 4])

# Print the variable A1
print(A1)

# Convert A1 to a numpy array and assign it to B1
B1 = A1.numpy()

# Print B1
print(B1)

<tf.Variable 'Variable:0' shape=(4,) dtype=int32, numpy=array([1, 2, 3, 4])>
[1 2 3 4]
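
As a small illustration of what "can be modified" means, the sketch below updates a variable in place with `assign()` and `assign_add()`. The variable name `v` and the values are made up for the example.

```Python
import tensorflow as tf

# Define a variable and overwrite its value in place
v = tf.Variable([1, 2, 3, 4])
v.assign([10, 20, 30, 40])

# Add to the current value in place
v.assign_add([1, 1, 1, 1])

print(v.numpy())  # [11 21 31 41]
```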


### 1.3 Checking properties of tensors
In later chapters, you will make use of constants and variables to train models. Your datasets will be represented as constant tensors of type `tf.Tensor` or as NumPy arrays. The model's parameters will be represented by variables that are updated during computation.

In this exercise, you will examine the properties of two tensors, `A1` and `B1`, which are defined in the cell below. Use the `print()` function to determine which statement about `A1` and `B1` is true.



In [4]:
A1 = tf.constant([1., 2., 3., 4.])
B1 = tf.Variable([7., 3., 4.])

print(A1)
print(B1)

tf.Tensor([1. 2. 3. 4.], shape=(4,), dtype=float32)
<tf.Variable 'Variable:0' shape=(3,) dtype=float32, numpy=array([7., 3., 4.], dtype=float32)>



**Possible Answers**
- [ ] A1 is a variable and B1 is a constant.
- [ ] A1 is a constant, B1 is a variable, and both are of type int16.
- [ ] A1 is a 1-dimensional tensor of shape 3.
- [x] A1 is a constant, B1 is a variable, and A1 is the tensor [1, 2, 3, 4].
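
Besides reading the printed representation, the same properties can be checked programmatically. The following is a minimal sketch that re-creates the two tensors and inspects their type, dtype, and shape.

```Python
import tensorflow as tf

A1 = tf.constant([1., 2., 3., 4.])
B1 = tf.Variable([7., 3., 4.])

# Only B1 is an instance of tf.Variable
print(isinstance(A1, tf.Variable), isinstance(B1, tf.Variable))

# Both hold float32 values; the shapes differ
print(A1.dtype, A1.shape)
print(B1.dtype, B1.shape)
```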

## 2. Basic operations

### What is a TensorFlow operation?

![][03-TensorFlow_operation]

### Applying the addition operator

```Python
# Import constant and add from tensorflow
from tensorflow import constant, add

# Define single-element tensors (shape (1,)); a true 0-dimensional scalar would be constant(1)
A0 = constant([1])
B0 = constant([2])

# Define 1-dimensional tensors
A1 = constant([1, 2])
B1 = constant([3, 4])

# Define 2-dimensional tensors
A2 = constant([[1, 2], [3, 4]])
B2 = constant([[5, 6], [7, 8]])

# Perform tensor addition with add()
C0 = add(A0, B0)
C1 = add(A1, B1)
C2 = add(A2, B2)
```

### Performing tensor addition

- The `add()` operation performs element-wise addition with two tensors
- Element-wise addition requires both tensors to have the same shape:
    - Scalar addition: $1 + 2 = 3$
    - Vector addition: $[1, 2] + [3, 4] = [4, 6]$
    - Matrix addition: 
    \begin{eqnarray} 
        \\ \nonumber    
        \left(
        \begin{array}{ll}
          1 & 2 \\
          3 & 4 \\
        \end{array}
        \right) + 
        \left(
        \begin{array}{ll}
        5 & 6   \\
        7 & 8   \\
        \end{array}
        \right) = 
        \left(
        \begin{array}{ll}
        6  & 8  \\
        10 & 12  \\
        \end{array}
        \right)
    \end{eqnarray}
- The `+` operator is overloaded, so `A + B` is equivalent to `add(A, B)`
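
A short sketch of the matrix addition above, computed once with `add()` and once with the `+` operator to show that both give the same result:

```Python
import tensorflow as tf

A2 = tf.constant([[1, 2], [3, 4]])
B2 = tf.constant([[5, 6], [7, 8]])

# Element-wise addition, written two ways
C_add = tf.add(A2, B2)
C_plus = A2 + B2

print(C_plus.numpy())
# True if every element of the two results matches
print(bool(tf.reduce_all(tf.equal(C_add, C_plus))))
```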

### How to perform multiplication in TensorFlow
- **Element-wise multiplication** performed using `multiply()` operation
    - The tensors multiplied must have the same shape
    - E.g. [1,2,3] and [3,4,5] or [1,2] and [3,4]
- **Matrix multiplication** performed with `matmul()` operator
    - The `matmul(A,B)` operation multiplies `A` by `B`
    - Number of columns of `A` must equal the number of rows of `B`

### Applying the multiplication operators

```Python
# Import operators from tensorflow
from tensorflow import ones, matmul, multiply

# Define tensors
A0 = ones(1)
A31 = ones([3, 1])
A34 = ones([3, 4])
A43 = ones([4, 3])
```

- What types of operations are valid?
    - `multiply(A0, A0)` , `multiply(A31, A31)` , and `multiply(A34, A34)`
    - `matmul(A43, A34)`, but not `matmul(A43, A43)`
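
The shape rules can be checked directly. This sketch runs one valid and one invalid `matmul()` call; the flag `mismatch_rejected` is a name introduced here for illustration, and the exact exception type raised may vary between TensorFlow versions, so two candidates are caught.

```Python
import tensorflow as tf

A34 = tf.ones([3, 4])
A43 = tf.ones([4, 3])

# Valid: A43 has 3 columns and A34 has 3 rows, so the result is 4x4
product = tf.matmul(A43, A34)
print(product.shape)

# Invalid: the first A43 has 3 columns but the second A43 has 4 rows
try:
    tf.matmul(A43, A43)
    mismatch_rejected = False
except (tf.errors.InvalidArgumentError, ValueError) as err:
    mismatch_rejected = True
    print("matmul(A43, A43) was rejected:", type(err).__name__)
```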

### Summing over tensor dimensions
- The `reduce_sum()` operator sums over the dimensions of a tensor
    - `reduce_sum(A)` sums over all dimensions of A
    - `reduce_sum(A, i)` sums over dimension i

```Python
# Import operations from tensorflow
from tensorflow import ones, reduce_sum

# Define a 2x3x4 tensor of ones
A = ones([2, 3, 4]) 

# Sum over all dimensions
B = reduce_sum(A)

# Sum over dimensions 0, 1, and 2
B0 = reduce_sum(A, 0)
B1 = reduce_sum(A, 1)
B2 = reduce_sum(A, 2)
```
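
To see what each call returns, the sketch below repeats the reductions and prints the resulting shapes. Summing over a dimension removes it from the shape, and each remaining element is the count of ones that were added together.

```Python
import tensorflow as tf

A = tf.ones([2, 3, 4])

B = tf.reduce_sum(A)        # scalar: 2 * 3 * 4 ones summed
B0 = tf.reduce_sum(A, 0)    # shape (3, 4), every element is 2.0
B1 = tf.reduce_sum(A, 1)    # shape (2, 4), every element is 3.0
B2 = tf.reduce_sum(A, 2)    # shape (2, 3), every element is 4.0

print(B.numpy())
print(B0.shape, B1.shape, B2.shape)
```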

[03-TensorFlow_operation]:_Docs/03-TensorFlow_operation.png

### 2.1 Performing element-wise multiplication
Element-wise multiplication in TensorFlow is performed using two tensors with identical shapes. This is because the operation multiplies elements in corresponding positions in the two tensors. An example of an element-wise multiplication, denoted by the ⊙ symbol, is shown below:

\begin{eqnarray} 
        \\ \nonumber    
        \left(
        \begin{array}{ll}
            1 & 2 \\
            2 & 1 \\
        \end{array}
        \right) \odot
        \left(
        \begin{array}{ll}
            3 & 1   \\
            2 & 5   \\
        \end{array}
        \right) = 
        \left(
        \begin{array}{ll}
            3  & 2  \\
            4 & 5  \\
        \end{array}
        \right)
    \end{eqnarray}


In this exercise, you will perform element-wise multiplication, paying careful attention to the shape of the tensors you multiply.

In [5]:
# Define tensors A1 and A23 as constants
A1 = tf.constant([1, 2, 3, 4])
A23 = tf.constant([[1, 2, 3], [1, 6, 4]])

# Define B1 and B23 to have the correct shape
B1 = tf.ones_like(A1)
B23 = tf.ones_like(A23)

# Perform element-wise multiplication
C1 = tf.multiply(A1,B1)
C23 = tf.multiply(A23, B23)

# Print the tensors C1 and C23
print(C1.numpy())
print(C23.numpy())

[1 2 3 4]
[[1 2 3]
 [1 6 4]]
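
As a cross-check, the worked example from the equation in this section can be reproduced with `multiply()`. The names `left` and `right` are introduced only for this sketch.

```Python
import tensorflow as tf

# The two 2x2 matrices from the worked example
left = tf.constant([[1, 2], [2, 1]])
right = tf.constant([[3, 1], [2, 5]])

# Multiply elements in corresponding positions
product = tf.multiply(left, right)
print(product.numpy().tolist())  # [[3, 2], [4, 5]]
```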


### 2.2 Making predictions with matrix multiplication
In later chapters, you will learn to train linear regression models. This process will yield a vector of parameters that can be multiplied by the input data to generate a vector of predictions. In the exercise, we will use the following tensors:

![][04-tensor]

`X` is the matrix of input data, `b` is the parameter vector, and `y` is the target vector. You will use `matmul()` to perform matrix multiplication of `X` by `b` to generate predictions, `ypred`, which you will compare with `y`. 

[04-tensor]:_Docs/04-tensor.png


In [6]:
# Define X, b, and y as constants
X = tf.constant([[1, 2], [2, 1], [5, 8], [6, 10]])
b = tf.constant([[1], [2]])
y = tf.constant([[6], [4], [20], [23]])

# Compute ypred using X and b
ypred = tf.matmul(X,b)

# Compute and print the error
error = y - ypred
print(error.numpy())

[[ 1]
 [ 0]
 [-1]
 [-3]]


### 2.3 Summing over tensor dimensions
You've been given a matrix, `wealth`, which contains the value of bond and stock wealth for five individuals, given in thousands of dollars.

![][05-Summing]

The first row corresponds to bonds and the second corresponds to stocks. Each column gives the stock and bond wealth for a single individual. Use `wealth`, `reduce_sum()`, and `.numpy()` to determine which statements are correct about wealth.

[05-Summing]:_Docs/05-Summing.png

In [7]:
wealth = tf.constant([[11,  7,  4,  3, 25], [50,  2, 60,  0, 10]])
print('Sum over columns:', tf.reduce_sum(wealth,0).numpy())
print('Sum over rows:', tf.reduce_sum(wealth,1).numpy())

Sum over columns: [61  9 64  3 35]
Sum over rows: [ 50 122]


**Possible Answers**
- [ ] The individual in the first column has the highest total wealth (i.e. stocks + bonds).
- [ ] Combined, the 5 individuals hold USD 50,000 in stocks.
- [X] Combined, the 5 individuals hold USD 50,000 in bonds.
- [ ] The individual in the second column has the lowest total wealth (i.e. stocks + bonds).

## 3. Advanced operations

### Overview of advanced operations
- We have covered basic operations in TensorFlow
    - `add()` , `multiply()` , `matmul()` , and `reduce_sum()`
- In this lesson, we explore advanced operations
    - `gradient()` , `reshape()` , and `random()`
    
### Overview of advanced operations

![][06-advanced_operations]

### Finding the optimum
- In many problems, we will want to find the optimum of a function.
    - Minimum: Lowest value of a loss function.
    - Maximum: Highest value of objective function.
- We can do this using the `gradient()` operation.
    - Optimum: Find a point where gradient = 0.
    - Minimum: Change in gradient > 0
    - Maximum: Change in gradient < 0
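
The "change in gradient" in the list above is the second derivative. One way to compute it, sketched here under the assumption that nested `GradientTape` contexts are acceptable for this purpose, is to differentiate the gradient itself. For $y = x^2$ the gradient is $2x$ and its change is the constant $2 > 0$, which is consistent with a minimum.

```Python
import tensorflow as tf

x = tf.Variable(-1.0)

# The outer tape records the computation of the first derivative
with tf.GradientTape() as outer_tape:
    with tf.GradientTape() as inner_tape:
        y = tf.multiply(x, x)
    # First derivative: dy/dx = 2x
    dy_dx = inner_tape.gradient(y, x)

# Second derivative: d2y/dx2 = 2
d2y_dx2 = outer_tape.gradient(dy_dx, x)

print(dy_dx.numpy(), d2y_dx2.numpy())  # -2.0 2.0
```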

### Calculating the gradient

![][07-y_x]
![][08-y_x2]

### Gradients in TensorFlow

```Python
# Import tensorflow under the alias tf
import tensorflow as tf

# Define x
x = tf.Variable(-1.0)

# Define y within instance of GradientTape
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.multiply(x, x)

# Evaluate the gradient of y at x = -1
g = tape.gradient(y, x)
print(g.numpy())  # -2.0
```

### Images as tensors

![][09-images_as_tensors]

### How to reshape a grayscale image
```Python
# Import tensorflow as alias tf
import tensorflow as tf

# Generate grayscale image
gray = tf.random.uniform([2, 2], maxval=255, dtype='int32')

# Reshape grayscale image
gray = tf.reshape(gray, [2*2, 1])
```

![][10-reshape_grayscale]

### How to reshape a color image

```Python
# Import tensorflow as alias tf
import tensorflow as tf

# Generate color image
color = tf.random.uniform([2, 2, 3], maxval=255, dtype='int32')

# Reshape color image
color = tf.reshape(color, [2*2, 3])
```
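
Because the images above are random, it is hard to see where each pixel ends up. The sketch below uses fixed values instead (an assumption made purely for readability) to show that `reshape()` keeps elements in row-major order.

```Python
import tensorflow as tf

# A 2x2 "grayscale image" with known values
gray = tf.constant([[1, 2], [3, 4]])
gray_flat = tf.reshape(gray, [2 * 2, 1])
print(gray_flat.numpy().tolist())  # [[1], [2], [3], [4]]

# A 2x2x3 "color image" whose values are 0..11
color = tf.reshape(tf.range(12), [2, 2, 3])
color_flat = tf.reshape(color, [2 * 2, 3])
# Each row holds the three channel values of one pixel
print(color_flat.numpy().tolist())
```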

![][11-reshape_color]

[06-advanced_operations]:_Docs/06-advanced_operations.png
[07-y_x]:_Docs/07-y_x.png
[08-y_x2]:_Docs/08-y_x2.png
[09-images_as_tensors]:_Docs/09-images_as_tensors.png
[10-reshape_grayscale]:_Docs/10-reshape_grayscale.png
[11-reshape_color]:_Docs/11-reshape_color.png

### 3.1 Reshaping tensors
In many machine learning problems, you will need to reshape your input data. 

Some models are designed to use image data as an input. Many, however, require you to transform the data into a vector. In this exercise, we will use the `reshape()` operation to practice transforming tensors. 

Reshape the grayscale `image` into a 256x1 `image_vector` and a 4x4x4x4 `image_tensor`.

In [8]:
# Define input image
image = tf.ones([16, 16])

# Reshape image into a vector
image_vector = tf.reshape(image, (256, 1))

# Reshape image into a higher dimensional tensor
image_tensor = tf.reshape(image, (4, 4, 4, 4))

Adapt `image` by adding three color channels and then adjust `image_vector` and `image_tensor` accordingly.

In [9]:
# Add three color channels
image = tf.ones([16, 16, 3])

# Reshape image into a vector
image_vector = tf.reshape(image, (768, 1))

# Reshape image into a higher dimensional tensor
image_tensor = tf.reshape(image, (4, 4, 4, 4, 3))
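
Computing sizes such as 768 by hand is easy to get wrong. As an optional alternative, `tf.reshape()` accepts `-1` for one dimension and infers its size from the total number of elements. The name `pixel_rows` is introduced here for illustration.

```Python
import tensorflow as tf

image = tf.ones([16, 16, 3])

# Let TensorFlow infer the length of the vector (16 * 16 * 3 = 768)
image_vector = tf.reshape(image, (-1, 1))
print(image_vector.shape)

# One row per pixel, one column per color channel
pixel_rows = tf.reshape(image, (-1, 3))
print(pixel_rows.shape)
```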

### 3.2 Optimizing with gradients
You are given a loss function, $y=x^2$, which you want to minimize. You can do this by computing the slope using the `GradientTape()` operation at different values of `x`. If the ***slope is positive***, you can ***decrease the loss by lowering `x`***. If it is ***negative***, you can ***decrease it by increasing `x`***. **This is how gradient descent works**.

The image shows a plot of y equals x squared. It also shows the gradient at x equals -1, x equals 0, and x equals 1.

![][08-y_x2]

In practice, you will use a high-level TensorFlow operation to perform gradient descent automatically. In this exercise, however, you will compute the slope at `x` values of `-1`, `1`, and `0`. 

[08-y_x2]:_Docs/08-y_x2.png

In [10]:
def compute_gradient(x0):
    # Define x as a variable with an initial value of x0
    x = tf.Variable(x0)
    with tf.GradientTape() as tape:
        tape.watch(x)
        # Define y using the multiply operation
        y = tf.multiply(x, x)
    # Return the gradient of y with respect to x
    return tape.gradient(y, x).numpy()

# Compute and print gradients at x = -1, 1, and 0
print(compute_gradient(-1.0))
print(compute_gradient(1.0))
print(compute_gradient(0.0))

-2.0
2.0
0.0


Notice that the slope is positive at `x` = 1, which means that we can lower the loss by reducing `x`. The slope is negative at `x` = -1, which means that we can lower the loss by increasing `x`. The slope at `x` = 0 is 0, which means that we cannot lower the loss by either increasing or decreasing `x`. This is because the loss is minimized at `x` = 0.
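
To connect these slopes to gradient descent, here is a minimal hand-written sketch of the update rule. The step size `learning_rate` and the number of steps are hypothetical values chosen for illustration. Each step moves `x` against the slope, so `x` moves toward 0 from its starting point of -1.

```Python
import tensorflow as tf

x = tf.Variable(-1.0)
learning_rate = 0.1  # hypothetical step size

for step in range(20):
    with tf.GradientTape() as tape:
        y = tf.multiply(x, x)
    # Slope of the loss at the current x
    slope = tape.gradient(y, x)
    # Move x against the slope: x <- x - learning_rate * slope
    x.assign_sub(learning_rate * slope)

# x is now much closer to the minimum at 0 than the starting value of -1
print(x.numpy())
```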

### 3.3 Working with image data
You are given a black-and-white image of a letter, which has been encoded as a tensor, `letter`. You want to determine whether the letter is an X or a K. You don't have a trained neural network, but you do have a simple model, `model`, which can be used to classify `letter`.

The 3x3 tensor, `letter`, and the 1x3 tensor, `model`, are defined in the cell below. You can determine whether `letter` is a K by multiplying `letter` by `model`, summing over the result, and then checking if it is equal to 1. As with more complicated models, such as neural networks, `model` is a collection of weights, arranged in a tensor.

Note that the functions `reshape()`, `matmul()`, and `reduce_sum()` are called through the `tf` alias in this notebook.

In [11]:
letter = np.array([[1., 0., 1.],
                   [1., 1., 0.],
                   [1., 0., 1.]])
model = np.array([[ 1.,  0., -1.]])

In [12]:
# Reshape model from a 1x3 to a 3x1 tensor
model = tf.reshape(model, (3, 1))

# Multiply letter by model
output = tf.matmul(letter, model)

# Sum over output and print prediction using the numpy method
prediction = tf.reduce_sum(output)
print(prediction.numpy())

1.0
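
The same steps can be wrapped in a small helper. This is only a sketch: the function name `is_k` and the 3x3 encoding of the letter X (`letter_x`) are hypothetical and were chosen for illustration, not taken from the exercise.

```Python
import numpy as np
import tensorflow as tf

def is_k(letter, model):
    """Return True if the simple linear model scores the letter as a K."""
    # Reshape the 1x3 weights into a 3x1 column
    weights = tf.reshape(model, (3, 1))
    # Multiply the letter by the weights and sum the result
    output = tf.matmul(letter, weights)
    return float(tf.reduce_sum(output).numpy()) == 1.0

model = np.array([[1., 0., -1.]])
letter_k = np.array([[1., 0., 1.],
                     [1., 1., 0.],
                     [1., 0., 1.]])
# Hypothetical encoding of an X, assumed for this example
letter_x = np.array([[1., 0., 1.],
                     [0., 1., 0.],
                     [1., 0., 1.]])

print(is_k(letter_k, model), is_k(letter_x, model))  # True False
```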
