In [1]:
import tensorflow as tf
print(f"TensorFlow version: {tf.__version__}")

TensorFlow version: 2.2.0


## GradientTape
`tf.GradientTape` records operations for automatic differentiation. With it, TensorFlow can compute the derivative of a function with respect to any of its parameters.
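
As a minimal sketch of the idea (the function $y = x^2$ and the value $x = 3$ are chosen here purely for illustration and are not part of the example that follows), the tape records the operations applied to a `tf.Variable` and then returns $dy/dx$:

```python
import tensorflow as tf

x = tf.Variable(3.0)  # illustrative value, not taken from the example

# Operations on trainable variables inside the context are recorded
with tf.GradientTape() as tape:
    y = x ** 2

dy_dx = tape.gradient(y, x)  # analytically, dy/dx = 2x
print(dy_dx.numpy())         # 6.0
```

Trainable `tf.Variable` objects are watched automatically, which is why no extra bookkeeping is needed here.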



### 1 Example
Consider a dataset $S = \{(x_i, y_i),\ i = 1, \dots, m\}$ with $x_i \in \mathbb{R}$ and $y_i \in \mathbb{R}$, and a mapping $f: x \to y$ described by $y = w_0 + w_1 x$, where $w_0, w_1 \in \mathbb{R}$ are the two parameters that determine the line that best fits the data. To find the solution with an open-form (iterative) algorithm, we define the loss as:

$$J = \frac{1}{m} \sum_{i=1}^m (\hat{y}_i - y_i)^2$$

where $\hat{y}_i = w_0 + w_1 x_i$ is the model's prediction for $x_i$.

When $J(w_0, w_1)$ is near zero, the proposed line fits the dataset and models the relation between $x_i$ and $y_i$ accurately. The best line, with parameters $(w_0^*, w_1^*)$, attains the minimum value of the error function $J(w_0, w_1)$:

$$(w_0^*, w_1^*) = \arg\min_{w_0, w_1} \; J(w_0, w_1)$$

At this minimum the gradient of $J$ vanishes:

$$\nabla J(w_0, w_1) = 0$$
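
For this particular model the zero-gradient condition can also be solved directly, which gives a reference value that an iterative method should reproduce. The sketch below uses NumPy's least-squares solver on $y = 2x + 10$ for $x = 0, \dots, 9$ (the same values as the training set of this example); the variable names are illustrative.

```python
import numpy as np

x = np.arange(10, dtype=np.float64)
y = 2 * x + 10

# Design matrix: a column of ones for w0 and a column of x for w1
A = np.stack([np.ones_like(x), x], axis=1)

# Least-squares solution, i.e. the point where the gradient of J is zero
(w0_star, w1_star), *_ = np.linalg.lstsq(A, y, rcond=None)
print(w0_star, w1_star)  # approximately 10 and 2
```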

Since

$$J = \frac{1}{m} \sum_{i=1}^m (w_0 + w_1 x_i - y_i)^2$$

the partial derivatives are

$$\frac{\partial J}{\partial w_1} = \frac{2}{m} \sum_{i=1}^m (w_0 + w_1 x_i - y_i)\, x_i$$

$$\frac{\partial J}{\partial w_0} = \frac{2}{m} \sum_{i=1}^m (w_0 + w_1 x_i - y_i)$$
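
One way to gain confidence in these two expressions is to compare them against central finite differences. The following is a rough NumPy sketch under the assumption that the data are $y = 2x + 10$ for $x = 0, \dots, 9$; the helper names `J`, `analytic_grad` and `numeric_grad` are hypothetical.

```python
import numpy as np

x = np.arange(10, dtype=np.float64)
y = 2 * x + 10

def J(w0, w1):
    """Mean squared error of the line w0 + w1 * x."""
    return np.mean((w0 + w1 * x - y) ** 2)

def analytic_grad(w0, w1):
    """Partial derivatives (dJ/dw0, dJ/dw1) from the formulas above."""
    r = w0 + w1 * x - y
    return 2 * np.mean(r), 2 * np.mean(r * x)

def numeric_grad(w0, w1, eps=1e-6):
    """Central finite-difference approximation of the same derivatives."""
    d0 = (J(w0 + eps, w1) - J(w0 - eps, w1)) / (2 * eps)
    d1 = (J(w0, w1 + eps) - J(w0, w1 - eps)) / (2 * eps)
    return d0, d1

print(analytic_grad(0.0, 0.0))
print(numeric_grad(0.0, 0.0))
```

At $w_0 = w_1 = 0$ the analytic values are $-38$ and $-204$, and the finite-difference estimates should agree up to rounding.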

To solve this simple linear regression with TensorFlow, we only need to write the forward pass; TensorFlow handles backpropagation for us.

In [28]:
### Training and test sets ###
X_train = tf.constant(range(10), dtype=tf.float32)     # Rank-1 tensor with the values 0-9
Y_train = 2 * X_train + 10                             # Rank-1 tensor mapped by the function 2x + 10

X_test = tf.constant(range(10, 20), dtype=tf.float32)  # Rank-1 tensor with the values 10-19
Y_test = 2 * X_test + 10                               # Rank-1 tensor mapped by the function 2x + 10

### GET TO KNOW DATA ###
print("TRAIN SET:")
print(f"X:{X_train}")
print(f"Y:{Y_train}\n")

print("TEST SET:")
print(f"X:{X_test}")
print(f"Y:{Y_test}")

TRAIN SET:
X:[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
Y:[10. 12. 14. 16. 18. 20. 22. 24. 26. 28.]

TEST SET:
X:[10. 11. 12. 13. 14. 15. 16. 17. 18. 19.]
Y:[30. 32. 34. 36. 38. 40. 42. 44. 46. 48.]


In [34]:
def loss_mse(X, Y, w0, w1):
    """Forward pass of the one-layer linear model, returning the MSE loss."""
    Y_hat = w1 * X + w0            # Forward propagation
    errors = (Y_hat - Y)**2        # Squared error per sample
    error = tf.reduce_mean(errors) # Average the squared errors into a scalar (rank-0 tensor)
    return error

def compute_gradients(X, Y, w0, w1):
    """Record loss_mse on a tape and return its gradients w.r.t. w0 and w1."""
    ### Use a context manager to obtain the tape ###
    with tf.GradientTape() as tape:
        loss = loss_mse(X, Y, w0, w1)         # Operations performed here are recorded

    return tape.gradient(loss, [w0, w1])      # Differentiate the loss w.r.t. the listed variables

def gradient_validation(X, Y, w0, w1):
    """Compute the analytical gradients by hand, only to validate the tape's result."""
    x = X.numpy()
    y = Y.numpy()
    w_0 = w0.numpy()
    w_1 = w1.numpy()
    suma_w0=0
    suma_w1=0
    for xi,yi in zip(x,y):
        suma_w0 += (w_0 + w_1*xi - yi)
        suma_w1 += (w_0 + w_1*xi - yi)*xi
    
    suma_w0 = (2*suma_w0)/len(x)
    suma_w1 = (2*suma_w1)/len(x)
    
    return suma_w0,suma_w1


### QUICK TEST ###

### Init Weights ###
w0 = tf.Variable(0.0)
w1 = tf.Variable(0.0)

### Obtain  Gradient ###
dw0, dw1 = compute_gradients(X_train, Y_train, w0, w1)
dw_0,dw_1 = gradient_validation(X_train, Y_train, w0, w1)

print(f"dw0: {dw0.numpy()}, Validation: {dw_0}")
print(f"dw1: {dw1.numpy()}, Validation: {dw_1}")

dw0: -38.0, Validation: -38.0
dw1: -204.0, Validation: -204.0


Now that we know how to compute backpropagation with TensorFlow, life is much easier. Let's apply this to an open-form (iterative) linear regression:

In [39]:
### Hyper Parameters ###
STEPS = 1000
LEARNING_RATE = .02

### Initialize Weights ###
w0 = tf.Variable(0.0)
w1 = tf.Variable(0.0)

### Linear Regression ###

for step in range(0, STEPS + 1):
    
    ### Forward pass under the GradientTape; returns the gradients ###
    dw0, dw1 = compute_gradients(X_train, Y_train, w0, w1)
    
    ### GRADIENT DESCENT ###
    w0.assign_sub(dw0 * LEARNING_RATE) # w0 = w0 - LEARNING_RATE * dw0
    w1.assign_sub(dw1 * LEARNING_RATE) # w1 = w1 - LEARNING_RATE * dw1

    if step % 200 == 0:
        loss = loss_mse(X_train, Y_train, w0, w1)
        print(f"STEP {step} - loss: {loss}")
        
### PRINT LEARNED WEIGHTS ###
print(f"\nw0: {w0.numpy()}, w1: {w1.numpy()}")

### LOSS ON TRAIN SET ###
loss = loss_mse(X_train, Y_train, w0, w1)
print(f"Train set loss: {loss.numpy()}")

STEP 0 - loss: 35.70719528198242
STEP 200 - loss: 0.26831889152526855
STEP 400 - loss: 0.0028539239428937435
STEP 600 - loss: 3.0356444767676294e-05
STEP 800 - loss: 3.2238213520940917e-07
STEP 1000 - loss: 3.6101481803996194e-09

w0: 9.99988842010498, w1: 2.0000178813934326
Train set loss: 3.6101481803996194e-09
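
The test set defined earlier is never evaluated in the run above. As a rough sketch, the same MSE can be computed on it with plain NumPy, assuming the weights printed above and the test data $y = 2x + 10$ for $x = 10, \dots, 19$; the variable names `x_test`, `y_test` and `test_loss` are illustrative.

```python
import numpy as np

# Weights as reported in the training output above
w0, w1 = 9.99988842010498, 2.0000178813934326

# Test set: x = 10..19, y = 2x + 10
x_test = np.arange(10, 20, dtype=np.float64)
y_test = 2 * x_test + 10

# Same mean squared error as loss_mse, evaluated on unseen inputs
test_loss = np.mean((w0 + w1 * x_test - y_test) ** 2)
print(f"Test set loss: {test_loss}")
```

Inside the notebook itself, the equivalent call would presumably be `loss_mse(X_test, Y_test, w0, w1)` with the trained variables.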
