# Introduction to TensorFlow in Python
Not long ago, cutting-edge computer vision algorithms couldn’t differentiate between images of cats and dogs. Today, a skilled data scientist equipped with nothing more than a laptop can classify tens of thousands of objects with greater accuracy than the human eye. In this course, you will use TensorFlow 2.6 to develop, train, and make predictions with the models that have powered major advances in recommendation systems, image classification, and FinTech. You will learn both high-level APIs, which will enable you to design and train deep learning models in 15 lines of code, and low-level APIs, which will allow you to move beyond off-the-shelf routines. You will also learn to accurately predict housing prices, credit card borrower defaults, and images of sign language gestures.

**Instructor:** Isaiah Hull, senior economist at Sweden's Central Bank

In [1]:
import tensorflow as tf
from tensorflow import constant, add, ones, matmul, multiply, reduce_sum

In [22]:
tf.__version__

'2.4.0'

In [26]:
def compute_gradient(x0):
    # Define x as a variable with an initial value of x0
    x = tf.Variable(x0)
    with tf.GradientTape() as tape:
        tape.watch(x)
        # Define y using the multiply operation
        y = multiply(x, x)
    # Return the gradient of y with respect to x
    return tape.gradient(y, x).numpy()

# $\star$ Chapter 1: Introduction to TensorFlow
Before you can build advanced models in TensorFlow 2, you will first need to understand the basics. In this chapter, you’ll learn how to define constants and variables, perform tensor addition and multiplication, and compute derivatives. Knowledge of linear algebra will be helpful, but not necessary.

### Constants and variables
* TensorFlow's two basic objects of computation are: **constants** and **variables**

#### What is TensorFlow?
* An open-source library for graph-based numerical computation
    * Developed by the Google Brain Team
* Low- and high-level APIs
    * Addition, multiplication, differentiation
    * Design and train machine learning models
* Important changes in TensorFlow 2.0
    * Eager execution enabled by default
        * Allows users to write simpler and more intuitive code
    * Model building with Keras and Estimators (high-level APIs)
        
#### What is a tensor?
* The TensorFlow documentation describes a **tensor** as a "generalization of vectors and matrices to potentially higher dimensions."
* If you're not familiar with linear algebra, think of a tensor as **a collection of numbers, which is arranged into a particular shape**.
    * 0-dimensional: a single number (a point)
    * 1-dimensional: a list of numbers (a line)
    * 2-dimensional: a grid of numbers (a matrix), and so on
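
As a minimal sketch of the "collection of numbers arranged into a shape" idea, the snippet below builds tensors with zero, one, and two dimensions and inspects their shapes. The variable names are illustrative assumptions, not part of the course material.

```python
import tensorflow as tf

# 0-dimensional tensor (scalar): a single number, shape ()
scalar = tf.constant(3.0)

# 1-dimensional tensor (vector): shape (3,)
vector = tf.constant([1.0, 2.0, 3.0])

# 2-dimensional tensor (matrix): shape (2, 3)
matrix = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    # rank is the number of dimensions; as_list() gives the size of each one
    print(name, t.shape.rank, t.shape.as_list())
```

The rank grows by one each time another level of nesting is added to the input.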
    
### Defining tensors in TensorFlow
* Each object defined below will be a `tf.Tensor` object

In [3]:
# import tensorflow as tf

# Single-element tensor used as a scalar (shape (1,); a true 0D tensor has shape ())
d0 = tf.ones((1,))

# 1D Tensor
d1 = tf.ones((2,))

# 2D Tensor
d2 = tf.ones((2, 2))

# 3D Tensor
d3 = tf.ones((2, 2, 2))

If we want to print the array contained in that object, we can apply the `.numpy()` method and pass the resulting object to the `print` function.

In [4]:
# Print the 3D tensor
print(d3.numpy())

[[[1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]]]


### Defining constants in TensorFlow
* A **constant** is the simplest category of tensor
* A constant does not change and cannot be trained
    * Immutable
    * Untrainable
* A constant can have any dimension
* In the code below, we've defined two constants:
    * `a` is a 2x3 tensor of 3s
    * `b` is a 2x2 tensor constructed from the 1-dimensional list of values 1, 2, 3, 4

In [5]:
# from tensorflow import constant

# Define a 2x3 constant
a = constant(3, shape=[2, 3])

# Define a 2x2 constant
b = constant([1, 2, 3, 4], shape=[2, 2])

In [6]:
print(a.numpy())

[[3 3 3]
 [3 3 3]]


In [7]:
print(b.numpy())

[[1 2]
 [3 4]]


* Above we worked exclusively with the constant operation
* However, in some cases, there are more convenient options for defining certain types of special tensors
<img src='data/convenience_functions.png' width="400" height="200" align="center"/>

* Use the `zeros()` or `ones()` operations to generate a tensor of arbitrary (but defined) shape that is populated entirely with zeros or ones
* Use the `zeros_like()` or `ones_like()` operations to populate a tensor with zeros or ones, copying the shape of the input tensor passed to them
* Use the `fill()` operation to populate a tensor of arbitrary shape with the same scalar value in each element

In [8]:
fill_ex = tf.fill([3, 3],7)

In [9]:
print(fill_ex.numpy())

[[7 7 7]
 [7 7 7]
 [7 7 7]]
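
The cell above demonstrates only `fill`. As a hedged sketch of the other convenience functions, the snippet below uses `tf.zeros`, `tf.ones_like`, and `tf.zeros_like`; the tensor names (`z`, `ref`, `o_like`, `z_like`) are assumptions made for illustration.

```python
import tensorflow as tf

# A 2x3 tensor of zeros
z = tf.zeros([2, 3])

# A reference tensor whose shape we want to copy
ref = tf.constant([[1, 2], [3, 4]])

# Tensors of ones and zeros with the same shape and dtype as ref
o_like = tf.ones_like(ref)
z_like = tf.zeros_like(ref)

print(z.numpy())
print(o_like.numpy())
print(z_like.numpy())
```

Note that the `*_like` operations copy the data type of the input as well as its shape, so `o_like` and `z_like` hold integers here.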


### Defining and initializing variables
* Unlike a constant, a variable's value can change during computation
* The value of a variable is **shared**, **persistent**, and **modifiable**.
* A variable's **data type and shape are fixed**.
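
To illustrate what "modifiable, but with a fixed data type and shape" can look like in practice, here is a small sketch using the `assign()` and `assign_add()` methods of `tf.Variable`. The variable `v` and the flag `shape_changed` are illustrative names only.

```python
import tensorflow as tf

# A variable with a fixed dtype (float32) and shape (3,)
v = tf.Variable([1.0, 2.0, 3.0], dtype=tf.float32)

# The value can be modified in place
v.assign([4.0, 5.0, 6.0])
v.assign_add([1.0, 1.0, 1.0])
print(v.numpy())  # [5. 6. 7.]

# Assigning a value with a different shape is rejected
try:
    v.assign([1.0, 2.0])
    shape_changed = True
except (ValueError, tf.errors.InvalidArgumentError):
    shape_changed = False
print(shape_changed)  # False
```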

In [10]:
# import tensorflow as tf

# Define a variable
a0 = tf.Variable([1, 2, 3, 4, 5, 6], dtype=tf.float32)
a1 = tf.Variable([1, 2, 3, 4, 5, 6], dtype=tf.int16)

# Define a constant
b = tf.constant(2, tf.float32)

# Compute their product
c0 = tf.multiply(a0, b)
c1 = a0 * b

In [11]:
print(c0.numpy())
print(c1.numpy())

[ 2.  4.  6.  8. 10. 12.]
[ 2.  4.  6.  8. 10. 12.]


* Note that certain TensorFlow operations, such as `tf.multiply`, are overloaded, which allows us to use the simpler `a0 * b` expression instead.

#### Exercises: Defining data as constants
Throughout this course, we will use `tensorflow` version 2.6.0 and will exclusively import the submodules needed to complete each exercise. This will usually be done for you, but you will do it in this exercise by importing `constant` from `tensorflow`.

After you have imported `constant`, you will use it to transform a `numpy` array, `credit_numpy`, into a `tensorflow` constant, `credit_constant`. This array contains feature columns from a dataset on credit card holders and is previewed in the image below. We will return to this dataset in later chapters.

Note that `tensorflow` 2 allows you to use data as either a `numpy` array or a `tensorflow` `constant` object. Using a constant will ensure that any operations performed with that object are done in `tensorflow`.

```
# Import constant from TensorFlow
from tensorflow import constant

# Convert the credit_numpy array into a tensorflow constant
credit_constant = constant(credit_numpy)

# Print constant datatype
print('\n The datatype is:', credit_constant.dtype)

# Print constant shape
print('\n The shape is:', credit_constant.shape)
```

#### Exercises: Defining variables
Unlike a constant, a variable's value can be modified. This will be useful when we want to train a model by updating its parameters.

Let's try defining and printing a variable. We'll then convert the variable to a `numpy` array, print again, and check for differences. Note that `Variable()`, which is used to create a variable tensor, has been imported from `tensorflow` and is available to use in the exercise.

```
# Define the 1-dimensional variable A1
A1 = Variable([1, 2, 3, 4])

# Print the variable A1
print('\n A1: ', A1)

# Convert A1 to a numpy array and assign it to B1
B1 = A1.numpy()

# Print B1
print('\n B1: ', B1)
```

### Basic operations
* TensorFlow has a model of computation that revolves around the use of graphs
* A TensorFlow graph contains edges and nodes, where the edges are tensors and the nodes are operations

<img src='data/tf_operation_flow.png' width="400" height="200" align="center"/>

### Applying the addition operator
* We first import the constant and add operations so that we may now define 0-, 1-, and 2-dimensional tensors. 

In [12]:
# Import constant and add from tensorflow
# from tensorflow import constant, add

# Define single-element tensors (shape (1,)), used here as scalars
A0 = constant([1])
B0 = constant([2])

# Define 1-dimensional tensors
A1 = constant([1, 2])
B1 = constant([3, 4])

# Define 2-dimensional tensors
A2 = constant([[1, 2], [3, 4]])
B2 = constant([[5, 6], [7, 8]])

### Applying the addition operator
* Finally, let's add them together using the operation for tensor addition
* Note that we can perform scalar addition with `A0` and `B0`, vector addition with `A1` and `B1`, and matrix addition with `A2` and `B2`
* The `add()` operation performs **element-wise addition** with two tensors
* **Element-wise addition requires that both tensors have the same shape** (or shapes that TensorFlow can broadcast to a common shape):
    * Scalar addition: 1 + 2 = 3
    * Vector addition: [1, 2] + [3, 4] = [4, 6]
    * Matrix addition:

```
A = [[1, 2],
     [3, 4]]
B = [[5, 6], 
     [7, 8]]
A + B = [[6, 8],
         [10,12]]
```
* Furthermore, the `add()` operator is **overloaded**
    * We can also perform addition using the plus symbol
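
A minimal sketch of the overloading mentioned above: `tf.add()` and the `+` symbol should give the same result for the 2x2 matrices from the worked example. The names `C_add`, `C_plus`, and `same` are assumptions for illustration.

```python
import tensorflow as tf

A2 = tf.constant([[1, 2], [3, 4]])
B2 = tf.constant([[5, 6], [7, 8]])

# Explicit operation and overloaded operator
C_add = tf.add(A2, B2)
C_plus = A2 + B2

# Check that every element matches
same = tf.reduce_all(tf.equal(C_add, C_plus))
print(C_plus.numpy())
print(bool(same.numpy()))  # True
```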

In [13]:
# Perform tensor addition with add()
C0 = add(A0, B0)
C1 = add(A1, B1)
C2 = add(A2, B2)

In [14]:
print(C0.numpy())
print(C1.numpy())
print(C2.numpy())

[3]
[4 6]
[[ 6  8]
 [10 12]]


### How to perform multiplication in TensorFlow
* We will consider both element-wise and matrix multiplication
* **Element-wise multiplication** is performed using the `multiply()` operation
    * Tensors involved **must have the same shape** (or shapes that TensorFlow can broadcast together, as when a vector is multiplied by a scalar)
* **Matrix multiplication** is performed with the `matmul()` operator
    * The `matmul(A, B)` operation multiplies `A` by `B`
    * **Note** that the number of columns of `A` must equal the number of rows of `B`
    
#### Applying the multiplication operators

In [15]:
# Import operators from tensorflow
# from tensorflow import ones, matmul, multiply

# Define tensors
A0 = ones(1)
A31 = ones([3, 1])
A34 = ones([3, 4])
A43 = ones([4, 3])

* What types of operations are valid on these tensors of ones?
    * We can perform element-wise multiplication of any tensor by itself
        * `multiply(A0, A0)`, `multiply(A31, A31)`, and `multiply(A34, A34)`
    * We can perform the matrix multiplication `matmul(A43, A34)`
        * but **not** `matmul(A43, A43)`
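
The shape rules above can be checked directly. The sketch below redefines the tensors of ones so that it is self-contained, and it assumes that an incompatible `matmul` raises an error in eager mode; the flag `invalid_ok` is an illustrative name.

```python
import tensorflow as tf

A34 = tf.ones([3, 4])
A43 = tf.ones([4, 3])

# Element-wise product of a tensor with itself keeps the shape
elementwise = tf.multiply(A34, A34)
print(elementwise.shape.as_list())  # [3, 4]

# (4x3) times (3x4) is valid and yields a 4x4 matrix
product = tf.matmul(A43, A34)
print(product.shape.as_list())  # [4, 4]

# (4x3) times (4x3) is invalid: 3 columns do not match 4 rows
try:
    tf.matmul(A43, A43)
    invalid_ok = True
except (tf.errors.InvalidArgumentError, ValueError):
    invalid_ok = False
print(invalid_ok)  # False
```

Each entry of `product` is 3, since it is the sum of three products of ones.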
        
### Summing over tensor dimensions
* The `reduce_sum()` operator sums over the dimensions of a tensor
* This can be used to sum over all dimensions of a tensor or just one
    * `reduce_sum(A)` sums over all dimensions of `A`
    * `reduce_sum(A, i)` sums over dimension `i`

In [16]:
# Import operations from tensorflow
# from tensorflow import ones, reduce_sum

# Define a 2x3x4 tensor of ones
F = ones([2, 3, 4])

* If we sum over all elements of `F`, we get 24, since the tensor contains 24 elements, all of which are 1

In [17]:
# Sum over all dimensions
D = reduce_sum(F)

# Sum over dimensions 0, 1, and 2
D0 = reduce_sum(F, 0)
D1 = reduce_sum(F, 1)
D2 = reduce_sum(F, 2)

* If we sum over dimension 0, we get a 3x4 matrix of 2s
* If we sum over dimension 1, we get a 2x4 matrix of 3s
* If we sum over dimension 2, we get a 2x3 matrix of 4s
* In each case, we reduce the size of the tensor by summing over one of its dimensions

In [18]:
print(D)

tf.Tensor(24.0, shape=(), dtype=float32)


In [19]:
print(D0)

tf.Tensor(
[[2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]], shape=(3, 4), dtype=float32)


In [20]:
print(D1)

tf.Tensor(
[[3. 3. 3. 3.]
 [3. 3. 3. 3.]], shape=(2, 4), dtype=float32)


In [21]:
print(D2)

tf.Tensor(
[[4. 4. 4.]
 [4. 4. 4.]], shape=(2, 3), dtype=float32)


#### Exercises: Making predictions with matrix multiplication
In later chapters, you will learn to train linear regression models. This process will yield a vector of parameters that can be multiplied by the input data to generate predictions. In this exercise, you will use input data, `features`, and a target vector, `bill`, which are taken from a credit card dataset we will use later in the course.

<img src='data/mat_mult.png' width="400" height="200" align="center"/>

The matrix of input data, `features`, contains two columns: education level and age. The target vector, `bill`, is the size of the credit card borrower's bill.

Since we have not trained the model, you will enter a guess for the values of the parameter vector, `params`. You will then use `matmul()` to perform matrix multiplication of `features` by `params` to generate predictions, `billpred`, which you will compare with `bill`. Note that we have imported `matmul()` and `constant()`.

```
# Define features, params, and bill as constants
features = constant([[2, 24], [2, 26], [2, 57], [1, 37]])
params = constant([[1000], [150]])
bill = constant([[3913], [2682], [8617], [64400]])

# Compute billpred using features and params
billpred = matmul(features, params)

# Compute and print the error
error = bill-billpred
print(error.numpy())
```

### Advanced Operations
* In this lesson, we explore advanced operations:
    * `gradient()`
    * `reshape()`
    * `random()`
    
<img src='data/adv_ops.png' width="400" height="200" align="center"/>

* **`gradient()`:** 
    * We will use this function in conjunction with gradient tape
    * Computes the slope of a function at a point
* **`reshape()`:**
    * Changes the shape of a tensor (e.g. 10x10 to 100x1)
* **`random()`:**
    * Generates a tensor out of randomly-drawn values (in TensorFlow, through functions in the `random` module, such as `random.uniform()`)

#### Finding the optimum 
* In many ML problems, we will need to find the optimum (minimum or maximum) of a function 
    * **Minimum:** Lowest value of a loss function
    * **Maximum:** Highest value of an objective function
* We can do this using the `gradient()` operation, which tells us the slope of a function at a point
    * We start this process by passing points to the gradient operation until we find one where the gradient is zero
    * **Optimum:** Find a point where gradient = 0
    * **Minimum:** Change in gradient > 0 (if it is increasing, we have a minimum)
    * **Maximum:** Change in gradient < 0 (if it is decreasing, we have a maximum)
  
<img src='data/fixed_gradient.png' width="400" height="200" align="center"/>

* The plot above shows the function `y = x`; notice that the gradient (the slope at a given point) is constant
* This is not true if we instead consider the function `y = x**2` ($y=x^2$)
    * When `x` is less than 0, `y` decreases when `x` increases
    * When `x` is greater than 0, `y` increases when `x` increases
    * Thus, the gradient is initially negative, but becomes positive for `x` larger than 0.
    * This means that `x = 0` **minimizes** `y`
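
The reasoning above (gradient equal to zero, with the gradient increasing around that point) can be checked numerically. This is a sketch rather than part of the course material: it assumes nested `tf.GradientTape` contexts to obtain the change in the gradient, and the names `slope` and `curvature` are illustrative.

```python
import tensorflow as tf

x = tf.Variable(0.0)

# Nested tapes: the inner tape gives the slope, and the outer tape
# differentiates that slope to get the change in the gradient
with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        y = tf.multiply(x, x)
    slope = inner.gradient(y, x)
curvature = outer.gradient(slope, x)

print(slope.numpy())      # 0.0, so x = 0 is an optimum
print(curvature.numpy())  # 2.0, positive, so the optimum is a minimum
```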

<img src='data/varying_gradient.png' width="400" height="200" align="center"/>

### Gradients in TensorFlow
* We define `x` as `-1.0`
* We then define `y` as `x**2` *within an instance of gradient tape*.
* **Note** that we apply the `watch()` method to an instance of gradient tape and then pass the variable `x`.
* This will allow us to compute the rate of change of `y` with respect to `x`
* Next, we compute the gradient of `y` with respect to `x` using `tape`, our instance of `GradientTape`
* **Note that y is the first argument and x is the second**
* As written, the operation computes the slope of `y` at a point

In [23]:
# Import tensorflow under the alias tf
# import tensorflow as tf

# Define x
x = tf.Variable(-1.0)

# Define y within instance of GradientTape
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.multiply(x, x)
    
# Evaluate the gradient of y at x = -1
g = tape.gradient(y, x)
print(g.numpy())

-2.0


* Running the code and printing the result, we find that the slope is -2 at `x = -1`, which means that `y` is initially decreasing in `x`, as seen in the graph above.
* Much of the differentiation you do in deep learning models will be handled by high-level APIs
* However, **gradient tape remains an invaluable tool for building advanced and custom models.**
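
One detail worth illustrating is when `watch()` matters. To the best of our understanding, a gradient tape tracks trainable variables automatically, while constants must be watched explicitly. The sketch below compares the three cases; the names `x_const`, `x_var`, `unwatched`, `watched`, and `auto` are assumptions for illustration.

```python
import tensorflow as tf

x_const = tf.constant(-1.0)

# Without watch(), a constant is not tracked, so the gradient is None
with tf.GradientTape() as tape:
    y = tf.multiply(x_const, x_const)
unwatched = tape.gradient(y, x_const)

# With watch(), the tape tracks the constant
with tf.GradientTape() as tape:
    tape.watch(x_const)
    y = tf.multiply(x_const, x_const)
watched = tape.gradient(y, x_const)

# A trainable variable is tracked automatically
x_var = tf.Variable(-1.0)
with tf.GradientTape() as tape:
    y = tf.multiply(x_var, x_var)
auto = tape.gradient(y, x_var)

print(unwatched)        # None
print(watched.numpy())  # -2.0
print(auto.numpy())     # -2.0
```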

### Reshaping images as tensors
* A tool that is particularly useful for image classification problems: **reshaping**
* While some algorithms allow you to exploit the shape of the original image, others require you to `reshape` matrices into vectors before using them as inputs, as shown in the diagram

#### Reshaping a grayscale image
* Below we create a random grayscale image by drawing numbers from the set of integers between 0 and 255 (grayscale pixel scale) and use these to populate a 2x2 matrix
* We can then reshape this into a 4x1 vector

In [24]:
# Import tensorflow as alias tf
# import tensorflow as tf

# Generate grayscale image
gray = tf.random.uniform([2, 2], maxval=256, dtype='int32')  # maxval is exclusive, so values range over 0..255

# Reshape grayscale image
gray = tf.reshape(gray, [2*2, 1])

<img src='data/reshape_grayscale.png' width="300" height="150" align="center"/>

#### How to reshape a color image
* For color images, we generate 3 such matrices to form a 2x2x3 tensor
* We could then reshape the image into a 4x3 tensor, as shown in the diagram

In [25]:
# Import tensorflow as alias tf
# import tensorflow as tf

# Generate color image
color = tf.random.uniform([2, 2, 3], maxval=256, dtype='int32')  # maxval is exclusive, so values range over 0..255

# Reshape color image
color = tf.reshape(color, [2*2, 3])
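
Because the random images above differ on every run, the effect of reshaping is easier to see with known values. The sketch below is an illustration only: it assumes a 2x2x3 tensor filled with the integers 0 to 11 in place of a real image, and the names `pixels` and `vector` are hypothetical.

```python
import tensorflow as tf

# A 2x2x3 stand-in for a color image, holding the values 0..11
color = tf.reshape(tf.range(12), [2, 2, 3])

# Flatten the two spatial dimensions, keeping the 3 color channels
pixels = tf.reshape(color, [2 * 2, 3])

# Flatten everything into a single column vector
vector = tf.reshape(color, [2 * 2 * 3, 1])

print(pixels.shape.as_list())  # [4, 3]
print(vector.shape.as_list())  # [12, 1]
print(pixels.numpy()[0])       # [0 1 2], the channels of the first pixel
```

Reshaping changes only how the elements are arranged; the number of elements and their order stay the same.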

#### Exercises: Reshaping tensors
Later in the course, you will classify images of sign language letters using a neural network. In some cases, the network will take 1-dimensional tensors as inputs, but your data will come in the form of images, which will be either 2- or 3-dimensional tensors, depending on whether they are grayscale or color images.

The figure below shows grayscale and color images of the sign language letter A. The two images have been imported for you and converted to the numpy arrays `gray_tensor` and `color_tensor`. Reshape these arrays into 1-dimensional vectors using the `reshape` operation, which has been imported for you from `tensorflow`. Note that the shape of `gray_tensor` is 28x28 and the shape of `color_tensor` is 28x28x3.

<img src='data/asl_a.png' width="200" height="100" align="center"/>

```
# Reshape the grayscale image tensor into a vector
gray_vector = reshape(gray_tensor, (28*28, 1))

# Reshape the color image tensor into a vector
color_vector = reshape(color_tensor, (28*28*3, 1))
```

#### Exercises: Optimizing with gradients
You are given a loss function, $y = x^2$, which you want to minimize. You can do this by computing the slope using the `GradientTape()` operation at different values of `x`. If the slope is positive, you can decrease the loss by lowering `x`. If it is negative, you can decrease it by increasing `x`. This is how gradient descent works.

<img src='data/varying_gradient.png' width="300" height="150" align="center"/>

In practice, you will use a high-level `tensorflow` operation to perform gradient descent automatically. In this exercise, however, you will compute the slope at `x` values of -1, 1, and 0. The following operations are available: `GradientTape()`, `multiply()`, and `Variable()`.

```
def compute_gradient(x0):
    # Define x as a variable with an initial value of x0
    x = Variable(x0)
    with GradientTape() as tape:
        tape.watch(x)
        # Define y using the multiply operation
        y = multiply(x, x)
    # Return the gradient of y with respect to x
    return tape.gradient(y, x).numpy()

# Compute and print gradients at x = -1, 1, and 0
print(compute_gradient(-1.0))
print(compute_gradient(1.0))
print(compute_gradient(0.0))
```
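
The update rule described in this exercise (lower `x` when the slope is positive, raise it when the slope is negative) can be sketched as a simple loop. This is a minimal illustration rather than the course's own method: the step size `learning_rate` and the number of steps are assumed values.

```python
import tensorflow as tf

x = tf.Variable(-1.0)
learning_rate = 0.1  # hypothetical step size

for step in range(50):
    # Compute the slope of y = x**2 at the current x
    with tf.GradientTape() as tape:
        y = tf.multiply(x, x)
    slope = tape.gradient(y, x)
    # Move x against the slope to decrease the loss
    x.assign_sub(learning_rate * slope)

print(x.numpy())
```

With these settings each step multiplies `x` by 0.8, so `x` moves steadily toward the minimum at 0.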

#### Exercises: Working with image data
You are given a black-and-white image of a letter, which has been encoded as a tensor, `letter`. You want to determine whether the letter is an X or a K. You don't have a trained neural network, but you do have a simple model, `model`, which can be used to classify `letter`.

The 3x3 tensor, `letter`, and the 1x3 tensor, `model`, are available in the Python shell. You can determine whether `letter` is a K by multiplying `letter` by `model`, summing over the result, and then checking if it is equal to 1. As with more complicated models, such as neural networks, `model` is a collection of weights, arranged in a tensor.

Note that the functions `reshape()`, `matmul()`, and `reduce_sum()` have been imported from `tensorflow` and are available for use.

```
# Reshape model from a 1x3 to a 3x1 tensor
model = reshape(model, (3, 1))

# Multiply letter by model
output = matmul(letter, model)

# Sum over output and print prediction using the numpy method
prediction = reduce_sum(output)
print(prediction.numpy())
```
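
Since `letter` and `model` are supplied by the exercise environment, the snippet below uses hypothetical values chosen only to make the calculation concrete; they are not taken from the course data. It follows the same three steps: reshape, matrix-multiply, and sum.

```python
import tensorflow as tf

# Hypothetical values, chosen only for illustration
letter = tf.constant([[1.0, 0.0, 1.0],
                      [1.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0]])
model = tf.constant([[1.0, 0.0, -1.0]])

# Reshape the 1x3 weights into a 3x1 column
model_col = tf.reshape(model, (3, 1))

# Each row of letter is weighted by the model: shape (3, 1)
output = tf.matmul(letter, model_col)

# Sum the row scores into a single prediction
prediction = tf.reduce_sum(output)
print(output.numpy().ravel())  # [0. 1. 0.]
print(prediction.numpy())      # 1.0
```

With these assumed values the prediction equals 1, which the exercise would interpret as a K.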

<img src='data/mat_mult.png' width="400" height="200" align="center"/>