# TensorFlow Fundamentals

Learning Objectives:

* Gain experience with low-level TensorFlow operations
* Learn to use GradientTape to calculate partial derivatives and perform gradient descent
* Learn about the tf.data.Dataset class, including batching

## Calculating Gradients

In a previous exercise, we practiced calculating partial derivatives on the following example:

$$ f(x,y) = \sqrt{x^2 + y^2}$$

$$\frac{\partial f}{\partial x} = \frac{x}{\sqrt{x^2 + y^2}}$$

$$\frac{\partial f}{\partial y} = \frac{y}{\sqrt{x^2 + y^2}}$$
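
As a brief reminder of where these expressions come from, the partial derivative with respect to $x$ can be obtained by applying the chain rule to $(x^2 + y^2)^{1/2}$ while treating $y$ as a constant (the $y$ case is symmetric):

$$\frac{\partial f}{\partial x} = \frac{1}{2}\left(x^2 + y^2\right)^{-1/2} \cdot 2x = \frac{x}{\sqrt{x^2 + y^2}}$$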

### Question
Take a second to calculate the following by hand:

*   $\displaystyle f(3, 4) = ??$
    

*   $ \displaystyle \frac{\partial f(3, 4)}{\partial x} = ??$
    
   
*   $ \displaystyle \frac{\partial f(3, 4)}{\partial y} = ??$
   


### Answers:
* 
* 
* 

At its core, TensorFlow is a library for representing mathematical operations as computational graphs and automating the computation of partial derivatives.  We can use TensorFlow to write NumPy-style mathematical operations:


In [1]:
import tensorflow as tf
import numpy as np

def f(x, y):
    return tf.sqrt(x**2 + y**2)
    
x = tf.Variable(3, dtype=tf.float32)
y = tf.Variable(4, dtype=tf.float32)

print(f(x, y))

tf.Tensor(5.0, shape=(), dtype=float32)


More interestingly, we can use a `GradientTape` to record mathematical operations for automatic differentiation:

In [2]:
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    tape.watch(y)
    fxy = f(x, y)
    
df_dx = tape.gradient(fxy, x)
df_dy = tape.gradient(fxy, y)

print("f(3, 4) = {:.5}".format(fxy))
print("df(3, 4)/dx = {:.5}".format(df_dx))
print("df(3, 4)/dy = {:.5}".format(df_dy))

f(3, 4) = 5.0
df(3, 4)/dx = 0.6
df(3, 4)/dy = 0.8
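
As a variation on the cell above, the following minimal sketch requests both partial derivatives in a single `tape.gradient` call by passing a list of sources. It relies on two assumptions worth checking against your TensorFlow version: trainable `tf.Variable` objects are watched by the tape automatically, so no explicit `tape.watch` is used, and a non-persistent tape is enough because `gradient` is called only once. The point (5, 12) is chosen here purely for illustration.

```python
import tensorflow as tf

def f(x, y):
    return tf.sqrt(x**2 + y**2)

# Illustrative point, different from the one used in the exercise.
x = tf.Variable(5.0)
y = tf.Variable(12.0)

# Trainable variables are recorded automatically by the tape.
with tf.GradientTape() as tape:
    fxy = f(x, y)

# One call returns the gradient with respect to each source in the list.
df_dx, df_dy = tape.gradient(fxy, [x, y])

print("f(5, 12) =", fxy.numpy())
print("df/dx =", df_dx.numpy())
print("df/dy =", df_dy.numpy())
```

The printed values can be compared against the analytic formulas $x / \sqrt{x^2 + y^2}$ and $y / \sqrt{x^2 + y^2}$.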


Once we have the partial derivatives, we can minimize our function using gradient descent.  Use the cell below to find the x and y that minimize $ f(x,y) = \sqrt{x^2 + y^2}$.  Adjust the learning rate and the number of iterations (*not* the starting x and y values) until the code converges to something close to the minimum value for the function.

In [3]:
learning_rate = .01
iterations = 10

x = tf.Variable(3, dtype=tf.float32)
y = tf.Variable(4, dtype=tf.float32)

for iteration in range(iterations):
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(x)
        tape.watch(y)
        fxy = f(x, y)
    
    print('current "loss": {}'.format(fxy))
    
    df_dx = tape.gradient(fxy, x)
    df_dy = tape.gradient(fxy, y)
    
    x = x - learning_rate * df_dx
    y = y - learning_rate * df_dy
    

print("\nx: {}".format(x.numpy()))
print("y: {}".format(y.numpy()))
    

current "loss": 5.0
current "loss": 4.990000247955322
current "loss": 4.980000019073486
current "loss": 4.970000267028809
current "loss": 4.960000038146973
current "loss": 4.950000286102295
current "loss": 4.940000057220459
current "loss": 4.930000305175781
current "loss": 4.9200005531311035
current "loss": 4.910000801086426

x: 2.939999580383301
y: 3.9200010299682617
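
Note that the update `x = x - learning_rate * df_dx` in the loop above rebinds `x` to an ordinary tensor, which is why the explicit `tape.watch` calls are needed. One possible alternative, sketched here with the same starting values and the same (deliberately untuned) learning rate and iteration count, keeps `x` and `y` as `tf.Variable` objects and updates them in place with `assign_sub`:

```python
import tensorflow as tf

def f(x, y):
    return tf.sqrt(x**2 + y**2)

learning_rate = 0.01
iterations = 10

x = tf.Variable(3.0)
y = tf.Variable(4.0)

for _ in range(iterations):
    # Variables are watched automatically, so no tape.watch is needed.
    with tf.GradientTape() as tape:
        fxy = f(x, y)

    df_dx, df_dy = tape.gradient(fxy, [x, y])

    # In-place update: x and y remain tf.Variable objects.
    x.assign_sub(learning_rate * df_dx)
    y.assign_sub(learning_rate * df_dy)

print("x: {}".format(x.numpy()))
print("y: {}".format(y.numpy()))
```

With these settings the result should match the output above up to floating-point rounding; the learning rate and iteration count still need to be adjusted to approach the minimum.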


### Questions:

* What learning rate and iteration count did you settle on? 
* Where does this function have its minimum? (Note that this is a case where we don't *need* to use gradient descent to find the solution. You should be able to predict the minimum value without executing the code above.)

### Answers:
* 
* 

## Datasets

In machine learning, training data is often too large to fit in memory on a single machine.  We may also want to pre-process the data as it is loaded.  The `tf.data.Dataset` class provides a standard interface for feeding data to a machine learning model.  `Dataset` objects are Python iterables, so they can be looped over directly with `for`.

We can create a `Dataset` from a NumPy array using the `from_tensor_slices` method:


In [4]:

# Generate 6 random two-dimensional elements as column vectors:

features = np.round(np.random.random((6, 2, 1)), 2)
print("Numpy array of data:\n")
print(features)
  
# Build a dataset:

dataset = tf.data.Dataset.from_tensor_slices(features)

# iterate over the elements in the dataset:

print("\nIterate over the corresponding Dataset:\n")
for element in dataset:
    print(element)

Numpy array of data:

[[[1.  ]
  [0.77]]

 [[0.89]
  [0.87]]

 [[0.03]
  [0.6 ]]

 [[0.56]
  [0.72]]

 [[0.  ]
  [0.48]]

 [[0.2 ]
  [0.82]]]

Iterate over the corresponding Dataset:

tf.Tensor(
[[1.  ]
 [0.77]], shape=(2, 1), dtype=float64)
tf.Tensor(
[[0.89]
 [0.87]], shape=(2, 1), dtype=float64)
tf.Tensor(
[[0.03]
 [0.6 ]], shape=(2, 1), dtype=float64)
tf.Tensor(
[[0.56]
 [0.72]], shape=(2, 1), dtype=float64)
tf.Tensor(
[[0.  ]
 [0.48]], shape=(2, 1), dtype=float64)
tf.Tensor(
[[0.2 ]
 [0.82]], shape=(2, 1), dtype=float64)
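
The pre-processing mentioned at the start of this section can be sketched with `Dataset.map`, which applies a function to each element as it is read. The scaling step and the small fixed array below are hypothetical stand-ins chosen for illustration, not part of the exercise data:

```python
import numpy as np
import tensorflow as tf

# Small fixed array so the result is easy to check by eye.
features = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
dataset = tf.data.Dataset.from_tensor_slices(features)

# Hypothetical pre-processing step: scale every element by 0.5.
# map() builds a new Dataset; the original is left unchanged.
scaled = dataset.map(lambda element: element * 0.5)

results = [element.numpy().tolist() for element in scaled]
print(results)
```

Because `map` returns a new `Dataset`, further steps such as batching can be chained onto `scaled` in the same way.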


## Batches

It is usually more efficient to process data in *batches* than one element at a time. Here is an example of TensorFlow code that multiplies each element in our dataset by an appropriately sized weight vector and sums the results.  In this example each element is processed individually.

In [5]:
total = tf.Variable(np.zeros((1,1)))
weights = tf.Variable(np.random.random((2,1)))

for element in dataset:
    total = total + tf.matmul(tf.transpose(weights), element)
    print("Total so far: {}".format(total))

print("\nFinal Total: {}".format(total))

Total so far: [[1.18588664]]
Total so far: [[2.30891774]]
Total so far: [[2.55561414]]
Total so far: [[3.32539864]]
Total so far: [[3.50105733]]
Total so far: [[3.9819611]]

Final Total: [[3.9819611]]


Instead of processing one data element per iteration, we can batch the dataset and process multiple elements per iteration.  Many TensorFlow operators, including `tf.matmul`, are "batch-aware" and will recognize that the first dimension corresponds to the batch.  Let's look at a batched version of our dataset:

In [6]:
dataset_batched = dataset.batch(2)
for batch in dataset_batched:
    print("Shape: {}\n".format(batch.shape))
    print("Elements:\n {}\n".format(batch))

Shape: (2, 2, 1)

Elements:
 [[[1.  ]
  [0.77]]

 [[0.89]
  [0.87]]]

Shape: (2, 2, 1)

Elements:
 [[[0.03]
  [0.6 ]]

 [[0.56]
  [0.72]]]

Shape: (2, 2, 1)

Elements:
 [[[0.  ]
  [0.48]]

 [[0.2 ]
  [0.82]]]
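
The batch size of 2 used above divides the six elements evenly. As a side note, the sketch below shows what appears to happen when it does not: by default the final batch is simply smaller, and `batch` accepts a `drop_remainder` argument to discard it instead. The array here is a deterministic stand-in for the random `features` used earlier.

```python
import numpy as np
import tensorflow as tf

# Six elements of shape (2, 1), filled with 0..11 for reproducibility.
features = np.arange(12, dtype=np.float64).reshape((6, 2, 1))
dataset = tf.data.Dataset.from_tensor_slices(features)

# Batch size 4 does not divide 6: the last batch holds the 2 leftovers.
shapes = [batch.shape.as_list() for batch in dataset.batch(4)]
print(shapes)

# drop_remainder=True discards the incomplete final batch.
shapes_dropped = [
    batch.shape.as_list() for batch in dataset.batch(4, drop_remainder=True)
]
print(shapes_dropped)
```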



In [7]:
total = tf.Variable(np.zeros((1, 1)))

for batch in dataset_batched:
    batch_of_products = tf.matmul(tf.transpose(weights), batch)
    total = total + tf.reduce_sum(batch_of_products)
    print("Total so far: {}".format(total))

print("\nFinal Total: {}".format(total))

Total so far: [[2.30891774]]
Total so far: [[3.32539864]]
Total so far: [[3.9819611]]

Final Total: [[3.9819611]]
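
To make the "batch-aware" behavior more concrete, here is a small self-contained sketch with fixed (non-random) data, so the numbers can be checked by hand. It assumes, as the cells above suggest, that `tf.matmul` treats the leading dimension of the right-hand operand as a batch dimension: a `(1, 2)` matrix times a `(2, 2, 1)` batch yields a `(2, 1, 1)` result, one product per element. The weight values are arbitrary choices for illustration.

```python
import numpy as np
import tensorflow as tf

# Deterministic stand-ins for the random data and weights used above.
features = np.arange(12, dtype=np.float64).reshape((6, 2, 1))
weights = tf.constant([[0.5], [2.0]], dtype=tf.float64)
dataset = tf.data.Dataset.from_tensor_slices(features)

# One element at a time: each product has shape (1, 1).
per_element_total = 0.0
for element in dataset:
    product = tf.matmul(tf.transpose(weights), element)
    per_element_total += float(product[0, 0])

# Two elements at a time: each product has shape (2, 1, 1).
batched_total = 0.0
for batch in dataset.batch(2):
    products = tf.matmul(tf.transpose(weights), batch)
    product_shape = products.shape.as_list()
    batched_total += float(tf.reduce_sum(products))

print("Shape of one batched product:", product_shape)
print("Per-element total:", per_element_total)
print("Batched total:", batched_total)
```

Both loops should arrive at the same total; the batched version simply does so in fewer iterations.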
