# Class 01 - TensorFlow Fundamentals

# Some Setup

In [None]:
import random
import numpy as np

# This notebook uses the TensorFlow 1.x graph-and-session API,
# so we import the compatibility module directly as `tf`
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

Instructions for updating:
non-resource variables are not supported in the long term


# Computation Graphs

The primary construct in the TensorFlow API is the "computation graph", or just "graph." A computation graph is a useful way to illustrate the flow of data through multiple operations. Here's a basic example showing the graph for adding two numbers together:

<img src="https://i.ibb.co/zNNGfZR/01.png" width=40% />


The arrows represent data flowing through the graph (in this case the numbers 7, 3, and 10), while the nodes represent some form of computation (in this case addition).

We can chain multiple of these computations together to form a more complex transformation of the data:

<img src="https://i.ibb.co/CQFMbrJ/02.png" width=60% />

Here, the numbers 3 and 7 are sent to an addition operation and a multiplication operation. The outputs of both of these are then sent to a subtraction operation. There are a few ways to illustrate the above operations.

* We could show this as a one-liner math equation, using parentheses to show the order of operations:

$$
output = \left( 3 + 7 \right) - \left( 3 \times 7 \right) = -11
$$

* We could also define the separate operations as mathematical functions:

$$
\begin{split}
a(x, y) &= x + y \\
b(x, y) &= x \times y \\
c(x, y) &= a(x, y) - b(x, y) \\ \\
c(3, 7) &= a(3, 7) - b(3, 7) = -11
\end{split}
$$

* Programmatically, the previous equations could be created with something like this:

In [None]:
def add(x, y):
    return x + y

def multiply(x, y):
    return x * y

def subtract(x, y):
    return x - y

x = 3
y = 7
a = add(x, y)
b = multiply(x, y)
c = subtract(a, b)

c

-11

The data isn't just one and done, either; it can be reused multiple times throughout the computation:

<img src="https://i.ibb.co/LQspNMt/03.png" width=70%/>

This graph is similar to the previous model, but now we're reusing the 3 by adding it back in at the end. The code might look like this:

In [None]:
x = 3
y = 7
a = add(x, y)
b = multiply(x, y)
c = subtract(a, b)
d = add(c, x)

d

-8

# My First TensorFlow Graph

## Fundamental TensorFlow workflow

This pattern will be used again and again throughout the course. When working with TensorFlow, your code will effectively be divided into two parts:

1. Define a graph which contains your model
2. Run the graph.  Two special cases are:
  * Train the model
  * Test/predict using the model

## Step 1: Define a computation graph

In [None]:
# tf.placeholder creates an "input" node.
# We MUST give it a value when we run our model.
# Placeholders can hold data we want to learn from or
# values of hyper-parameters for our model.
a = tf.placeholder(tf.int32, name="input_a")
b = tf.placeholder(tf.int32, name="input_b")

# tf.add creates an addition node
c = tf.add(a, b, name="add")

# tf.multiply creates a multiplication node
d = tf.multiply(a, b, name="multiply")

# Add up the results of the previous two nodes
out = tf.add(c, d, name="output")

## Step 2: Run the graph

In [None]:

# Start a session
sess = tf.Session()

# Create a "feed_dict" dictionary to define input values
# Keys to dictionary are handles to our placeholders
# Values to dictionary are values we'd like to feed in
feed_dict = { a: 4, b: 3 }

# Execute the graph using `sess.run()`, which takes two parameters:
# - `fetches` lists which node(s) we'd like to receive as output
# - `feed_dict` feeds in key-value pairs to input to various nodes
# In this case, we pass in the Tensor `out` as our value for `fetches`,
# which causes the value for out to be computed and returned
result = sess.run(out, feed_dict=feed_dict)

# Print the value of `out`
print("({0}*{1}) + ({0}+{1}) = {2!s}".format(feed_dict[a], feed_dict[b], result))

# Close the session
sess.close()

(4*3) + (4+3) = 19
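
The same define-then-run pattern can also be written with a `with` block, which closes the session automatically. The following is a minimal sketch rather than the notebook's own cell: it builds the graph inside a separate `tf.Graph()` (an assumption made here so the example does not collide with nodes defined elsewhere), and the name `graph` is purely illustrative.

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Step 1: define the graph (kept in its own Graph object for isolation)
graph = tf.Graph()
with graph.as_default():
    a = tf.placeholder(tf.int32, name="input_a")
    b = tf.placeholder(tf.int32, name="input_b")
    out = tf.add(tf.add(a, b), tf.multiply(a, b), name="output")

# Step 2: run the graph; the session is closed when the block exits
with tf.Session(graph=graph) as sess:
    result = sess.run(out, feed_dict={a: 4, b: 3})

print(result)  # (4+3) + (4*3) = 19
```

Using the context manager avoids having to remember `sess.close()`, which matters once a notebook opens several sessions.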


# TensorFlow Core API

Our main TensorFlow objects are:

  * `tf.Tensor`
  * `tf.Operation`
  * `tf.Graph`
  * `tf.Session`
  * `tf.Variable`

We will introduce each of these below.

## `Tensor` Objects

##### What is a Tensor?

Tensors, for our purposes, are $n$-dimensional matrices (or tables of numbers).
  * a 0-dimensional tensor is a single number (or scalar)
  * a 1-dimensional tensor is a vector, and
  * a 2-dimensional tensor is a standard matrix.
  
Higher-dimensional tensors are simply referred to as *$n$-D tensors*.  Every value that is passed through a TensorFlow model is a `Tensor` object: the TensorFlow representation of a tensor.

##### Defining tensors by hand

You can define `Tensor` object values in two main ways:

1. Native Python types
2. NumPy arrays (recommended)

Both of these can be automatically converted into TensorFlow `Tensor` objects.

### Tensors from Native Python

In [None]:
# 0-D tensor (scalar)
t_0d_py = 4

# 1-D tensor (vector)
t_1d_py = [1, 2, 3]

# 2-D tensor (matrix)
t_2d_py = [[1, 2],
           [3, 4],
           [5, 6]]

# 3-D tensor
t_3d_py = [[[0, 0], [0, 1], [0, 2]],
           [[1, 0], [1, 1], [1, 2]],
           [[2, 0], [2, 1], [2, 2]]]

python_defined_tensors = [t_0d_py, t_1d_py, t_2d_py, t_3d_py]

# tf.constant creates a tf.Tensor from a fixed value
# you can read more here:
#  https://www.tensorflow.org/api_docs/python/tf/constant
for pdt in python_defined_tensors:
    print(tf.constant(pdt))

Tensor("Const:0", shape=(), dtype=int32)
Tensor("Const_1:0", shape=(3,), dtype=int32)
Tensor("Const_2:0", shape=(3, 2), dtype=int32)
Tensor("Const_3:0", shape=(3, 3, 2), dtype=int32)


Interestingly, we haven't *run* these constant tensors yet.  We've only defined them.  Even so, each one already knows a little bit about itself:  its shape and its data type.  We'll get into these details in just a bit.

### NumPy Arrays

Pretty much the same as native Python, but with the `numpy.array` function wrapping it:

In [None]:
# 0-D tensor (scalar)
t_0d_np = np.array(4, dtype=np.int32)

# 1-D tensor (vector)
t_1d_np = np.array([1, 2, 3], dtype=np.int64)

# 2-D tensor (matrix)
t_2d_np = np.array([[1, 2],
                    [3, 4],
                    [5, 6]],
                   dtype=np.float32)

# 3-D tensor
t_3d_np = np.array([[[0, 0], [0, 1], [0, 2]],
                    [[1, 0], [1, 1], [1, 2]],
                    [[2, 0], [2, 1], [2, 2]]],
                   dtype=np.int32)

numpy_defined_tensors = [t_0d_np, t_1d_np, t_2d_np, t_3d_np]

for ndt in numpy_defined_tensors:
    print(tf.constant(ndt))

Tensor("Const_4:0", shape=(), dtype=int32)
Tensor("Const_5:0", shape=(3,), dtype=int64)
Tensor("Const_6:0", shape=(3, 2), dtype=float32)
Tensor("Const_7:0", shape=(3, 3, 2), dtype=int32)


### Data Types

In general, using `np.array` (or `np.asarray`) is the recommended way of defining values for tensors by hand in TensorFlow. The primary reason for this is that you can specify the exact data type (`dtype`) you'd like the values to be represented with. For example, there's no way to specify a 32-bit integer vs a 64-bit integer with native Python. TensorFlow is tightly integrated with NumPy, and most TensorFlow data types have a corresponding NumPy `dtype`:

TensorFlow type | Equivalent NumPy type | Description
--- | --- | ---
`tf.float32` | `np.float32` | 32 bit floating point.
`tf.float64` | `np.float64` | 64 bit floating point.
`tf.int8` | `np.int8` | 8 bit signed integer.
`tf.int16` | `np.int16` | 16 bit signed integer.
`tf.int32` | `np.int32` | 32 bit signed integer.
`tf.int64` | `np.int64` | 64 bit signed integer.
`tf.uint8` | `np.uint8` | 8 bit unsigned integer.
`tf.string` | N/A | String type, as byte array
`tf.bool` | `np.bool_` | Boolean.
`tf.complex64` | `np.complex64` | Complex number made of two 32 bit floating point numbers: real and imaginary parts.
`tf.qint8` | N/A | 8 bit signed integer used in quantized Ops.
`tf.qint32` | N/A | 32 bit signed integer used in quantized Ops.
`tf.quint8` | N/A | 8 bit unsigned integer used in quantized Ops.

Slightly modified version of [this table](https://www.tensorflow.org/versions/master/resources/dims_types.html#data_types).

In [None]:
# Just to show that they are equivalent
(tf.float32 == np.float32 and
 tf.float64 == np.float64 and
 tf.int8 == np.int8 and
 tf.int16 == np.int16 and
 tf.int32 == np.int32 and
 tf.int64 == np.int64 and
 tf.uint8 == np.uint8 and
 tf.bool == bool and
 tf.complex64 == np.complex64)

True
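
To make the point about explicit data types concrete, here is a small NumPy-only sketch. The variable names are illustrative, and the inferred default integer type is deliberately left unstated because it depends on the platform and NumPy version.

```python
import numpy as np

# Without an explicit dtype, NumPy chooses a default integer type,
# which can differ between platforms
inferred = np.array([1, 2, 3])

# With an explicit dtype, the representation is fixed
as_int32 = np.array([1, 2, 3], dtype=np.int32)
as_float32 = np.asarray([1, 2, 3], dtype=np.float32)

print(inferred.dtype)                          # platform-dependent
print(as_int32.dtype, as_int32.itemsize)       # int32 4
print(as_float32.dtype, as_float32.itemsize)   # float32 4
```

`itemsize` is the number of bytes per element, so both 32-bit types report 4.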

The primary exception, where you should _not_ use `np.array()`, is when defining a `Tensor` of strings. When using strings, just use standard Python lists. It's best practice to include the `b` prefix in front of strings to explicitly define the strings as byte-arrays:

In [None]:
tf_string_tensor = [b"first", b"second", b"third"]

### Tensor Shapes

A common term in TensorFlow is a `Tensor` object's "shape". A shape value is a list or tuple containing an ordered set of integers. The _i_-th  element in the list describes the length of the _i_-th dimension in the tensor, while the number of elements in the list defines the dimensionality of the tensor. Here are some examples:

In [None]:
# Shapes corresponding to scalars
# Note that either lists or tuples can be used
s_0d_list = []
s_0d_tuple = ()

# Shape corresponding to a vector of length 3
s_1d = [3]

# Shape corresponding to a 2-by-3 matrix
s_2d = (2, 3)

# Shape corresponding to a 4-by-4-by-4 cube tensor
s_3d = [4, 4, 4]
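
To connect shapes back to the nested lists used earlier, the sketch below walks down the first element of each nesting level and records its length. `nested_shape` is a hypothetical helper written for this illustration, not a TensorFlow or NumPy function, and it assumes the nesting is regular (every sub-list at a given level has the same length).

```python
def nested_shape(value):
    """Return the shape of a regularly nested list as a tuple."""
    shape = []
    # Each level of nesting contributes one dimension
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        if len(value) == 0:
            break
        value = value[0]
    return tuple(shape)

print(nested_shape(4))                          # ()
print(nested_shape([1, 2, 3]))                  # (3,)
print(nested_shape([[1, 2], [3, 4], [5, 6]]))   # (3, 2)
```

The length of the returned tuple is the dimensionality of the tensor, and the _i_-th entry is the length of the _i_-th dimension.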

You can use the `tf.shape` Operation to get the shape value of `Tensor` objects:

In [None]:
# note, np.ndarray.reshape takes its args "flattened" out
#       (not in a list or tuple)
arr = np.arange(24).reshape(2,3,4)
print("In NumPy:", arr.shape,arr,sep="\n")

# tf.shape creates an Operation that returns a Tensor
#          the returned Tensor is the shape of arr
shape_op = tf.shape(arr)
print("In TensorFlow:", shape_op, sep="\n")

shape = tf.Session().run(shape_op)
print("Shape of tensor: " + str(shape))

In NumPy:
(2, 3, 4)
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
In TensorFlow:
Tensor("Shape:0", shape=(3,), dtype=int32)
Shape of tensor: [2 3 4]


As mentioned, defining an `Operation` doesn't *execute* it.  To execute it, we have to run it.  We do that with a helper `tf.Session`.  More to come below!  Quick quiz: is `shape_op` a 3D tensor or 1D tensor?

## TensorFlow `Operation` Objects

TensorFlow `Operation` objects (commonly abbreviated as "Ops" in the TensorFlow documentation) are nodes that perform computation on or with `Tensor` objects. They take as input zero or more `Tensor` objects (or objects that can be converted into tensors; see the previous section), and output zero or more tensors. These outputs can then either be returned to the client or passed on to further Operations. Operations are the fundamental building blocks of any TensorFlow graph: their calculations represent nodes, and data flowing from one to the next represents edges.

We've already seen a few `Operation` examples. `tf.add` and `tf.multiply` are classic ones: each takes in two tensors and outputs one. When given non-scalar values, they do addition/multiplication element-wise.  `tf.constant` and `tf.shape` are also `Operation`s.

In [None]:
# Initialize some arrays to be fed into tf.add as Tensors
a = np.array([1, 2], dtype=np.int32)
b = np.array([3, 4], dtype=np.int32)

# tf.add creates an "add" Operation and places it in the graph
# The variable c will be a handle to the output of the operation
# This output can be passed on to other Operations!
c = tf.add(a, b)

The important thing to remember is that Operations do not execute when created - that's the reason `tf.add([1, 2],[3, 4])` doesn't return the value `[4, 6]` immediately. It must be passed into a `Session.run()` method, which we'll cover in more detail below.

In [None]:
result = tf.Session().run(c)
print(result)

[4 6]


In [None]:
# or, since we're in a notebook, we can just evaluate the result
# (without explicitly needing to print it out)
# note that the returned value is a NumPy array
tf.Session().run(c)

array([4, 6], dtype=int32)

The majority of the TensorFlow API is Operations.  In addition to Operation-specific inputs, each Operation can take in a `name` parameter, which can help identify Operations in TensorBoard and other tools.

In [None]:
c = tf.add(a, b, name="my_add_operation")

Getting into the habit of adding names to your Operations now will save you headaches later on.

## TensorFlow `Graph` Objects

When TensorFlow is imported into Python, it automatically creates a `Graph` object and makes it the default graph. You can create more graphs as well:

In [None]:
# Create a new graph - constructor takes no parameters
new_graph = tf.Graph()

However, operations (such as `tf.add` and `tf.multiply`) are added to the default graph when created. To add operations to your new graph, use a `with` statement along with the graph's `as_default()` method. This makes that graph the default while inside of the `with` block:

In [None]:
with new_graph.as_default():
    a = tf.add(3, 4)
    b = tf.multiply(a, 2)

The default graph, other than being set to the default, is no different than any other `Graph`. If you need to get a handle to the default graph, use the `tf.get_default_graph` function:

default_graph = tf.get_default_graph()

Note: `get_default_graph()` will return whatever graph is set to the default, so if you are inside of a `with g.as_default()` block, `get_default_graph()` will return `g`:

In [None]:
with new_graph.as_default():
    print(new_graph is tf.get_default_graph())

True
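
One way to check which graph an operation ended up in is the `graph` attribute of the tensor it returns. The sketch below uses two fresh graphs with illustrative names (`graph_one`, `graph_two`) that do not appear elsewhere in this notebook.

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

graph_one = tf.Graph()
graph_two = tf.Graph()

# Each operation is added to whichever graph is the default
# at the moment it is created
with graph_one.as_default():
    x = tf.add(3, 4, name="x")

with graph_two.as_default():
    y = tf.multiply(3, 4, name="y")

print(x.graph is graph_one)  # True
print(y.graph is graph_two)  # True
print(x.graph is y.graph)    # False
```

Tensors from different graphs cannot be combined in a single operation, which is one reason most scripts stick to a single graph.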


*Most TensorFlow models will not require more than one graph per script.*  So, often, you can simply use the default, implicit graph and be done. However, you may find multiple `Graph` instances useful when defining two independent models side-by-side. Additionally, there are mechanisms to export and import external models and load them in as `Graph` objects, which can allow you to feed the output of existing models into your new model (or vice versa). We won't be able to demonstrate these now, but see the TensorFlow API for more info:
  * [`Graph.as_graph_def`](https://www.tensorflow.org/versions/master/api_docs/python/framework.html#Graph.as_graph_def)
  * [`tf.import_graph_def`](https://www.tensorflow.org/versions/master/api_docs/python/framework.html#import_graph_def)

## TensorFlow `Session`

### Creating Sessions

As we saw earlier, `Session` objects are used to launch and execute graphs. Earlier, we created a session using its default constructor, but it has three optional parameters:

* `target` specifies the execution engine to use. By default it is the empty string, which causes the Session to use the standard local execution context. Typically, this parameter is only used when using TensorFlow in a distributed setting
* `graph` specifies which `Graph` object the session should run. The default value is `None`, which causes the `Session` to load in the default graph. Sessions only manage one graph at a time, so executing more than one graph will require more than one session
* `config` allows users to specify advanced options to configure the session. We won't cover this today, but some things that are available are: limiting the number of CPUs/GPUs used, logging options, and changing optimization of the graph

In [None]:
# A session with the default graph launched
# Equivalent to: tf.Session(graph=tf.get_default_graph())
sess_default = tf.Session()

# A session with new_graph launched
new_graph = tf.Graph()
sess_new = tf.Session(graph=new_graph)

### Running Sessions

The most important method of a `Session` is `run()`. Earlier in this notebook, we saw basic usage of its two primary parameters: `fetches` and `feed_dict`.

##### Retrieving information: `fetches`

`fetches` expects a list of `Tensor` and/or `Operation` handles (or just a single `Tensor`/`Operation`). The list specifies what computations we would like TensorFlow to run, as well as what we'd like `run()` to output:

In [None]:
# equiv:  tf.Session().run(tf.add(3,2))
sess_default.run(tf.add(3,2))

5

TensorFlow will only perform calculations necessary to compute the values specified in `fetches`, so it won't waste time if you only need to run a small part of a large, complicated graph.
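
As a small illustration of passing a list to `fetches`, the sketch below fetches one handle and then two handles at once. It uses its own graph and illustrative names (`total`, `product`) so that it stands apart from the cells in this notebook.

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

graph = tf.Graph()
with graph.as_default():
    a = tf.constant(3, name="a")
    b = tf.constant(4, name="b")
    total = tf.add(a, b, name="total")
    product = tf.multiply(a, b, name="product")

with tf.Session(graph=graph) as sess:
    # A single handle returns a single value;
    # only the nodes needed for `total` are evaluated
    only_total = sess.run(total)

    # A list of handles returns a list of values in the same order
    total_val, product_val = sess.run([total, product])

print(only_total)              # 7
print(total_val, product_val)  # 7 12
```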

##### Sending information: `feed_dict`

`feed_dict` is an optional parameter to `run`, but it becomes *required* when placeholder nodes are included. We saw it used to feed input data to placeholders, but `feed_dict` can actually send values to any node. The keys to the dictionary should be handles to `Tensor` objects (usually outputs of Operations), and the values should be replacement data:

In [None]:
# Create Operations, Tensors, etc (using the default graph)
a = tf.add(3, 4)
b = tf.multiply(a, 5)

# Define a dictionary that says to replace the value of `a` with 15
replacers = {a: 15}

# Run the session without feed_dict
# Prints (3 + 4) * 5 = 35
print(sess_default.run(b))

# Run the session, passing in `replacers` as the value to `feed_dict`
# Prints 15 * 5 = 75 instead of 7 * 5 = 35
print(sess_default.run(b, feed_dict=replacers))

35
75


When using placeholders, TensorFlow insists that any calls to `Session.run()` include `feed_dict` values for all placeholders that the fetched values depend on:

In [None]:
a = tf.placeholder(tf.int32, name="my_placeholder")
b = tf.add(a, 3)

# This raises an error (with a LONG error message, so we
# use try-except to catch it and print out just a portion)
try:
    sess_default.run(b)
except tf.errors.InvalidArgumentError as e:
    print("The error:\n", e.message)

The error:
 Graph execution error:

Detected at node 'my_placeholder' defined at (most recent call last):
    File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    File "/usr/local/lib/python3.10/dist-packages/colab_kernel_launcher.py", line 37, in <module>
    File "/usr/local/lib/python3.10/dist-packages/traitlets/config/application.py", line 992, in launch_instance
    File "/usr/local/lib/python3.10/dist-packages/ipykernel/kernelapp.py", line 619, in start
    File "/usr/local/lib/python3.10/dist-packages/tornado/platform/asyncio.py", line 195, in start
    File "/usr/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
    File "/usr/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
    File "/usr/lib/python3.10/asyncio/events.py", line 80, in _run
    File "/usr/local/lib/python3.10/dist-packages/tornado/ioloop.py", line 685, in <lambda>
    File "/usr/local/lib/python3.10

In [None]:
a = tf.placeholder(tf.int32, name="my_placeholder")
b = tf.add(a, 3)

# Create feed dictionary
feed_dict = {a: 8}

# Now it works!
print(sess_default.run(b, feed_dict=feed_dict))

11


In [None]:
# Closing out the Sessions we opened up
sess_default.close()
sess_new.close()

## TensorFlow `Variable` Objects

The last fundamental TensorFlow class is the `Variable`. A TensorFlow `Variable` has persistent state across multiple calls to `Session.run()`, which means that learned parameters in machine learning models are Variables. We can create a Variable with a starting value of 0 like so:

In [None]:
my_var = tf.Variable(0, name="my_var")

However, even though the object has been created, the value of the `Variable` has to be initialized separately with either `tf.variables_initializer()` or, more commonly, `tf.global_variables_initializer()`. Both return Operations, and remember that Operations must be passed into `Session.run()` to be executed:

In [None]:
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

Having value initialization separated from object creation allows us to reinitialize the variable later if we'd like.

*Note: `tf.global_variables_initializer()` used to be named `tf.initialize_all_variables()`, and `tf.variables_initializer()` used to be called `tf.initialize_variables()`.* These were renamed just before version 1.0.0 of TensorFlow, so if you follow older tutorials, you may need to update these functions.

Now that the Variable is initialized, we can tweak its value! Let's do some basic incrementing with the `Variable.assign()` method:

In [None]:
increment = my_var.assign(my_var + 1)

for i in range(10):
    print(sess.run(increment), end=" ")

1 2 3 4 5 6 7 8 9 10 

You may notice that if you run the previous code multiple times in the notebook (i.e., rerun the prior cell a few times in a row), the value persists and continues to climb. The Variable's state is maintained by the Session object, and the state will persist unless the session is closed, the Variable is re-initialized, or a new value is assigned to the Variable.

In [None]:
# Re-initialize variables
sess.run(init)

# Start incrementing, beginning from 0 again
for i in range(10):
    print(sess.run(increment), end=" ")

1 2 3 4 5 6 7 8 9 10 
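
`tf.variables_initializer()` takes a list of Variables, so it can reset some Variables while leaving others untouched. The sketch below is an illustration under assumed names (`counter_a`, `counter_b`, and so on are not part of this notebook) and uses its own graph.

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

graph = tf.Graph()
with graph.as_default():
    counter_a = tf.Variable(0, name="counter_a")
    counter_b = tf.Variable(0, name="counter_b")
    bump_a = counter_a.assign(counter_a + 1)
    bump_b = counter_b.assign(counter_b + 1)

    # Initializes every Variable in the graph
    init_all = tf.global_variables_initializer()
    # Re-initializes only counter_a
    reset_a = tf.variables_initializer([counter_a])

with tf.Session(graph=graph) as sess:
    sess.run(init_all)
    sess.run([bump_a, bump_b])   # both counters are now 1
    sess.run([bump_a, bump_b])   # both counters are now 2
    sess.run(reset_a)            # counter_a back to 0, counter_b unchanged
    a_val, b_val = sess.run([counter_a, counter_b])

print(a_val, b_val)  # 0 2
```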

##### Trainable Variables

There are several optional parameters in the `Variable` constructor, but one to pay close attention to is `trainable`. It takes in a boolean value, which defaults to `True`, and specifies to TensorFlow whether the built-in optimization functions (which we will cover in a separate notebook) should affect this `Variable`. **If a `Variable` in your model should _not_ be adjusted during gradient descent, make sure to set its `trainable` parameter to `False`.**
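
A quick way to see the effect of `trainable` is to compare `tf.trainable_variables()` with `tf.global_variables()`. This is a minimal sketch with illustrative Variable names (`weights`, `step_counter`), built in its own graph.

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

graph = tf.Graph()
with graph.as_default():
    # Trainable by default
    weights = tf.Variable(0.5, name="weights")
    # Bookkeeping value that optimizers should leave alone
    step_counter = tf.Variable(0, trainable=False, name="step_counter")

    trainable_names = [v.name for v in tf.trainable_variables()]
    all_names = [v.name for v in tf.global_variables()]

print(trainable_names)  # ['weights:0']
print(all_names)        # ['weights:0', 'step_counter:0']
```

Both Variables still need to be initialized and both keep state across `Session.run()` calls; only `weights` would be picked up by an optimizer by default.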

# Exercise 1

##### Part 1
Create a TensorFlow `Graph` that is based on the following image:

<img src="https://i.ibb.co/HhVsjfq/04.png" width=70%/>

[Use TensorFlow's API guide for math operations](https://www.tensorflow.org/api_guides/python/math_ops) to find Operations you don't know.  You'll want to look in the section called "Reduction".  If you are confused about how to give inputs to a reduction Operation, don't get too fancy.  Remember, a Python list (of Python lists/scalars *or* of `Tensor`s) can be used to create another input `Tensor`.

##### Part 2

Once you've created the graph, use a `Session` to run the graph and confirm the following input/output pairs:

|In | Out|
|---|----|
|1, 2, 3| 14|
|-1, -2, 3| 2|
|123, 456, 789| 44669304 |

##### Part 3
Finally, use `tf.summary.FileWriter` to output the `Graph` to disk and double check that your model resembles the above image in TensorBoard (the image reads from left to right, while TensorBoard displays graphs from bottom to top).

## Solution 1

##### Part 1

In [None]:
new_graph = tf.Graph()
with new_graph.as_default():
    in_1 = tf.placeholder(tf.int32, name="in_1")
    in_2 = tf.placeholder(tf.int32, name="in_2")
    in_3 = tf.placeholder(tf.int32, name="in_3")

    mul_1 = tf.multiply(in_1, in_2, name="mul_1")
    mul_2 = tf.multiply(in_2, in_3, name="mul_2")

    stacked_prod_input = tf.stack([in_1, in_2, in_3])
    prod_  = tf.reduce_prod(stacked_prod_input, name="prod")
    #prod_  = tf.reduce_prod([in_1, in_2, in_3], name="prod")

    stacked_sum_input = tf.stack([mul_1, prod_, mul_2])
    sum_  = tf.reduce_sum(stacked_sum_input, name="sum")

##### Part 2

In [None]:
ng_sess = tf.Session(graph=new_graph)
for inputs in [(1, 2, 3),
               (-1, -2, 3),
               (123, 456, 789)]:
    input_values = dict(zip([in_1, in_2, in_3], inputs))
    output_values = ng_sess.run(fetches=sum_, feed_dict=input_values)
    print(inputs, "--->", output_values)

# Close the session now that we're done with it
ng_sess.close()

(1, 2, 3) ---> 14
(-1, -2, 3) ---> 2
(123, 456, 789) ---> 44669304
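
##### Part 3

One possible sketch for Part 3 follows. It rebuilds the Part 1 graph so the block is self-contained, and it writes to a temporary directory created with `tempfile.mkdtemp()`; that location is an assumption for illustration, and any log directory of your choosing would do.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Rebuild the graph from Part 1
graph = tf.Graph()
with graph.as_default():
    in_1 = tf.placeholder(tf.int32, name="in_1")
    in_2 = tf.placeholder(tf.int32, name="in_2")
    in_3 = tf.placeholder(tf.int32, name="in_3")

    mul_1 = tf.multiply(in_1, in_2, name="mul_1")
    mul_2 = tf.multiply(in_2, in_3, name="mul_2")
    prod_ = tf.reduce_prod(tf.stack([in_1, in_2, in_3]), name="prod")
    sum_ = tf.reduce_sum(tf.stack([mul_1, prod_, mul_2]), name="sum")

# Hypothetical log location; replace with a directory of your choice
log_dir = tempfile.mkdtemp()

# Passing the graph to FileWriter writes its definition to an event file
writer = tf.summary.FileWriter(log_dir, graph=graph)
writer.close()

event_files = os.listdir(log_dir)
print(log_dir)
print(event_files)
```

To view the result, point TensorBoard at the printed directory, for example with `tensorboard --logdir` followed by that path, and open the Graphs tab.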
