# Hackathon #1

Written by Eleanor Quint

In [1]:
# We'll start with our library imports...
from __future__ import print_function

import numpy as np       # to use numpy arrays
import tensorflow as tf  # to specify and run computation graphs

#### Tensor

The basic unit of data in TensorFlow is the [Tensor](https://www.tensorflow.org/api_docs/python/tf/Tensor). A tensor is a multi-dimensional array of values, specialized for numerical computation, with a single underlying data type (think `float` or `int`). We say that a tensor's _rank_ is its number of dimensions. Here are some examples of tensors.

In [3]:
# a rank 0 tensor; a scalar with shape ()
# pass the dtype by keyword: the second positional argument of tf.Variable is `trainable`
its_complicated = tf.Variable(12.3 - 4.85j, dtype=tf.complex128)
print(its_complicated)

# a rank 1 tensor; a vector with shape (5,), meaning it's just a plain ol' array
# notice that we've given a name to this variable
first_primes = tf.Variable([2, 3, 5, 7, 11], dtype=tf.int32, name='primes')
print(first_primes)

# a rank 2 tensor; a matrix with shape [2, 2]
# notice that the dtype is inferred when we don't specify it
my_identity = tf.Variable([[1, 0],[0, 1]])
print(my_identity)
print("We can retrieve a numpy array from TensorFlow as:", my_identity.numpy(), "is a", type(my_identity.numpy()))

# a rank 4 tensor with shape [10, 299, 299, 3]
blank_image = tf.zeros([10, 299, 299, 3])
print("tf.shape returns a Tensor:", tf.shape(blank_image))
# strictly speaking, .shape is a tf.TensorShape, which prints and indexes like a tuple
print("while .shape returns a tuple:", blank_image.shape)

<tf.Variable 'Variable:0' shape=() dtype=complex128, numpy=(12.3-4.85j)>
<tf.Variable 'primes:0' shape=(5,) dtype=int32, numpy=array([ 2,  3,  5,  7, 11], dtype=int32)>
<tf.Variable 'Variable:0' shape=(2, 2) dtype=int32, numpy=
array([[1, 0],
       [0, 1]], dtype=int32)>
We can retrieve a numpy array from TensorFlow as: [[1 0]
 [0 1]] is a <class 'numpy.ndarray'>
tf.shape returns a Tensor: tf.Tensor([ 10 299 299   3], shape=(4,), dtype=int32)
while .shape returns a tuple: (10, 299, 299, 3)
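
As a small sketch of the rank definition above, the snippet below reads the rank of a few tensors in two ways: from the static shape with `len(t.shape)` and with the `tf.rank` operation. The three example tensors and their names are made up for illustration and are not part of the original notebook.

```python
import tensorflow as tf

# Hypothetical example tensors, chosen only for illustration
examples = {
    "scalar": tf.constant(3.0),               # shape ()
    "vector": tf.constant([2, 3, 5, 7, 11]),  # shape (5,)
    "images": tf.zeros([10, 299, 299, 3]),    # shape (10, 299, 299, 3)
}

static_ranks = {}
dynamic_ranks = {}
for name, t in examples.items():
    static_ranks[name] = len(t.shape)             # plain Python int from the static shape
    dynamic_ranks[name] = int(tf.rank(t).numpy()) # same number, computed as a tensor
    print(name, "has rank", static_ranks[name], "and dtype", t.dtype.name)
```

Both approaches should agree for tensors like these, whose shapes are fully known.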


#### Tensor Shape and Broadcasting

The shape of a tensor can be checked by calling the [tf.shape](https://www.tensorflow.org/api_docs/python/tf/shape) operation or by reading its `.shape` attribute. (Note too that the rank of a tensor `x` can be checked with `len(x.shape)`.) Tensors can be reshaped with [tf.reshape](https://www.tensorflow.org/api_docs/python/tf/reshape), provided the total number of elements stays the same. For example:

In [4]:
print(tf.range(10))
print(tf.reshape(tf.range(10), (2,5)))  # re-arrange into two rows
print(tf.reshape(tf.range(10), (1,10))) # add a dimension

tf.Tensor([0 1 2 3 4 5 6 7 8 9], shape=(10,), dtype=int32)
tf.Tensor(
[[0 1 2 3 4]
 [5 6 7 8 9]], shape=(2, 5), dtype=int32)
tf.Tensor([[0 1 2 3 4 5 6 7 8 9]], shape=(1, 10), dtype=int32)
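
A few more reshaping variations, as a hedged sketch rather than part of the original notebook: a `-1` entry lets `tf.reshape` infer one dimension, `tf.expand_dims` adds a dimension of size 1, and a reshape that would change the number of elements is rejected. The variable names here are illustrative.

```python
import tensorflow as tf

values = tf.range(12)

# -1 asks TensorFlow to infer that dimension from the element count
three_rows = tf.reshape(values, (3, -1))   # inferred shape (3, 4)
flat = tf.reshape(three_rows, (-1,))       # back to shape (12,)

# tf.expand_dims is another way to add a dimension of size 1
column = tf.expand_dims(values, axis=1)    # shape (12, 1)

# A reshape must keep the number of elements; 12 values cannot fill a (5, 3) tensor
try:
    tf.reshape(values, (5, 3))
    reshape_failed = False
except (tf.errors.InvalidArgumentError, ValueError) as err:
    reshape_failed = True
    print("Reshape rejected with", type(err).__name__)

print(three_rows.shape, flat.shape, column.shape)
```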


The shape of a tensor determines which operations are valid on it. TensorFlow uses the same operational semantics and broadcasting rules as numpy. Operations are generally element-wise, as illustrated by the following multiplication, which calculates squares.

In [5]:
print("Shapes", tf.range(10).shape, "and", tf.range(10).shape, "gives", (tf.range(10) * tf.range(10)))

Shapes (10,) and (10,) gives tf.Tensor([ 0  1  4  9 16 25 36 49 64 81], shape=(10,), dtype=int32)


Scalars with shape `()` can always be broadcast to operate with anything:

In [6]:
print("Shapes", tf.range(10).shape, "and", tf.convert_to_tensor(2).shape, "gives", (tf.range(10) - tf.convert_to_tensor(2)))

Shapes (10,) and () gives tf.Tensor([-2 -1  0  1  2  3  4  5  6  7], shape=(10,), dtype=int32)


Similarly, if one tensor has size 1 in a dimension where the other does not, the size-1 dimension is broadcast to match. Shapes are compared starting from the last dimension, and a tensor with fewer dimensions is treated as if its shape had leading 1s, which is what allows the last example below to work.

In [7]:
# these operations will fail if uncommented
# tf.range(10) * tf.range(20)
# tf.ones([2,10]) * tf.ones([3,10])

# note that tf.ones([a,b]) == tf.reshape(tf.ones(a*b), [a,b])
print("This example broadcasts in the first two dimensions to get shape:", (tf.ones([3,1,10]) * tf.ones([1,3,10])).shape)
print("This one works because the first has implicit shape [1,10], giving shape:", (tf.range(10) * tf.ones([2,10], dtype=tf.int32)).shape)

This example broadcasts in the first two dimensions to get shape: (3, 3, 10)
This one works because the first has implicit shape [1,10], giving shape: (2, 10)
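
The broadcasting rule described above can be written out in a few lines of plain Python. This is only an illustrative sketch: `broadcast_shape` is a hypothetical helper written for this note, not a TensorFlow function, and it works on shape tuples rather than on tensors.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the shape two tensors broadcast to, or raise ValueError if they cannot."""
    result = []
    # Compare dimensions from the last one backwards; a missing leading dimension counts as 1
    for dim_a, dim_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"cannot broadcast {tuple(shape_a)} with {tuple(shape_b)}")
    return tuple(reversed(result))


print(broadcast_shape((3, 1, 10), (1, 3, 10)))  # (3, 3, 10)
print(broadcast_shape((10,), (2, 10)))          # (2, 10)
print(broadcast_shape((10,), ()))               # (10,)
```

Calling it with the shapes from the two commented-out lines, `(10,)` with `(20,)` and `(2, 10)` with `(3, 10)`, raises `ValueError`, which mirrors why those operations fail.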


#### Tensor data type

Broadcasting gives a good amount of flexibility when working with shapes, but TensorFlow will never implicitly change data types, which leads to easily overlooked errors like the one below. Changing the data type explicitly is easy with `tf.cast`.

In [8]:
# This won't work, even though the shapes line up: tf.range(10) * tf.ones([2,10])
# Why?
print("First type is", tf.range(10).dtype, "and second is", tf.ones([2,10]).dtype)
print("This works though!", tf.range(10) * tf.cast(tf.ones([2,10]), tf.int32))

First type is <dtype: 'int32'> and second is <dtype: 'float32'>
This works though! tf.Tensor(
[[0 1 2 3 4 5 6 7 8 9]
 [0 1 2 3 4 5 6 7 8 9]], shape=(2, 10), dtype=int32)
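
The cast can also go the other way. The sketch below, which is an addition and not part of the original notebook, casts the integers up to `float32` instead, and shows that casting floats down to integers discards the fractional part. The exception types caught are the ones eager TensorFlow 2 is assumed to raise for mismatched dtypes.

```python
import tensorflow as tf

ints = tf.range(10)         # dtype int32
floats = tf.ones([2, 10])   # dtype float32

# Mixing dtypes raises an error rather than silently converting
try:
    ints * floats
except (tf.errors.InvalidArgumentError, TypeError) as err:
    print("Mixed dtypes failed with", type(err).__name__)

# Casting the integers up to float32 also makes the types agree
as_float = tf.cast(ints, tf.float32) * floats
print(as_float.dtype, as_float.shape)

# Casting float to int drops the fractional part, so information can be lost
truncated = tf.cast(tf.constant([1.9, -1.9]), tf.int32)
print(truncated.numpy())
```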


#### Trainable variables

In machine learning, we're interested in models that are parameterized by trainable variables. We can create a variable with `tf.Variable` by providing its initial value.

In [9]:
print(tf.Variable(tf.random.normal([10])))

<tf.Variable 'Variable:0' shape=(10,) dtype=float32, numpy=
array([-0.68911904,  0.16275883,  1.7171981 , -2.5942533 ,  1.6408705 ,
        0.44160518,  0.75587684,  0.48746487, -1.0254815 ,  0.86240095],
      dtype=float32)>
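
Unlike a plain tensor, a variable can be updated in place, which is what training relies on. The following is a minimal sketch of that idea using `assign` and `assign_sub`. The names `weights` and `frozen` and the values are made up for illustration.

```python
import tensorflow as tf

weights = tf.Variable(tf.zeros([3]), name="weights")
print("trainable by default:", weights.trainable)

# Overwrite the stored value in place
weights.assign([1.0, 2.0, 3.0])

# Subtract in place, which is the shape of a simple gradient descent step
weights.assign_sub(0.1 * tf.ones([3]))
print(weights.numpy())

# A variable can also be excluded from training
frozen = tf.Variable(tf.ones([3]), trainable=False)
print("frozen is trainable:", frozen.trainable)
```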


Next, we're going to take a big step and use gradient descent to solve a system of linear equations, `Ax=b`. We'll generate fixed values for `A` and `b`, and make `x` a variable we can learn. Then, we'll calculate an error function (the `difference_sq` line below), and use the gradient of the error with respect to `x` to update `x` so that the error is smaller on the next iteration. We can do this for all the entries of the `x` vector simultaneously.

In [13]:
learning_rate = 0.05
num_iterations = 200

# the optimizer allows us to apply gradients to update variables
optimizer = tf.keras.optimizers.Adam(learning_rate)

# Create a fixed matrix, A
A = tf.random.normal([4,4])
# Create x using an arbitrary initial value
x = tf.Variable(tf.ones([4, 1]))
# Create a fixed vector b
b = tf.random.normal([4, 1])

# We want Ax - b = 0, so we'll try to minimize the squared difference
for step in range(num_iterations):
    # the tape records the operations on x so that gradients can be computed afterwards
    with tf.GradientTape() as tape:
        # Calculate A*x
        product = tf.matmul(A, x)
        # find the value we want to minimize
        # what happens if we don't use the square here?
        difference_sq = tf.math.square(product - b)
    print("The error is", tf.norm(tf.math.sqrt(difference_sq)).numpy())
    # calculate the gradient; a non-scalar target is summed over its elements
    grad = tape.gradient(difference_sq, [x])
    print(grad)
    # update x
    optimizer.apply_gradients(zip(grad, [x]))

The error is 3.84468
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[12.733934 ],
       [ 2.4185116],
       [-7.675736 ],
       [11.431174 ]], dtype=float32)>]
The error is 3.6311133
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[11.998866 ],
       [ 1.7371895],
       [-5.78738  ],
       [10.080473 ]], dtype=float32)>]
The error is 3.4393377
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[11.269282 ],
       [ 1.0660903],
       [-3.9241402],
       [ 8.7442045]], dtype=float32)>]
The error is 3.2727468
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[10.551348 ],
       [ 0.4180411],
       [-2.1169147],
       [ 7.439481 ]], dtype=float32)>]
The error is 3.1329815
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[ 9.854049  ],
       [-0.18711853],
       [-0.41302288],
       [ 6.1919928 ]], dtype=float32)>]
The error is 3.0183275
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[ 9.189345  ],
       [-0.72271305],
       [ 1.1
...

[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[0.44740483],
       [0.05611607],
       [0.00762007],
       [0.09128919]], dtype=float32)>]
The error is 0.15498172
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[ 0.42730123],
       [ 0.05545563],
       [-0.01298101],
       [ 0.09634441]], dtype=float32)>]
The error is 0.14473423
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[ 0.40495524],
       [ 0.05256458],
       [-0.02525693],
       [ 0.09542507]], dtype=float32)>]
The error is 0.13468656
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[ 0.3805812 ],
       [ 0.04755135],
       [-0.02957624],
       [ 0.08900549]], dtype=float32)>]
The error is 0.124845155
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[ 0.35452023],
       [ 0.04072264],
       [-0.02683532],
       [ 0.07786606]], dtype=float32)>]
The error is 0.11525736
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[ 0.3272119 ],
       [ 0.03254063],
       [-0.0183
...

The error is 0.0019182764
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[-5.8451146e-03],
       [ 3.9255712e-05],
       [ 1.4342391e-04],
       [-1.0393125e-03]], dtype=float32)>]
The error is 0.001729621
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[-0.00521682],
       [ 0.00031611],
       [-0.00044485],
       [-0.00078616]], dtype=float32)>]
The error is 0.0015522913
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[-0.00458737],
       [ 0.00055302],
       [-0.00099092],
       [-0.0005281 ]], dtype=float32)>]
The error is 0.0013862127
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[-0.00397653],
       [ 0.00073181],
       [-0.00143973],
       [-0.00030083]], dtype=float32)>]
The error is 0.0012288175
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[-0.0033994 ],
       [ 0.00084038],
       [-0.001749  ],
       [-0.00013104]], dtype=float32)>]
The error is 0.0010767997
[<tf.Tensor: shape=(4, 1), dtype=float32, numpy=
array([[-
...

We use `.numpy()` to get the value of the tensor for a cleaner output message. If the number of iterations is large enough, we will eventually learn a vector for `x` which approximately satisfies the system of equations.
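
To make the gradient step less of a black box, here is a sketch that compares what `tf.GradientTape` returns with the gradient worked out by hand. For the summed squared error, the hand-derived gradient with respect to `x` is `2 * A^T (Ax - b)`. The seed and the names `residual`, `auto_grad`, and `manual_grad` are assumptions made for this example, and the squares are summed explicitly here, whereas the loop above lets `tape.gradient` do the summing.

```python
import tensorflow as tf

tf.random.set_seed(0)  # arbitrary seed, used only for repeatability
A = tf.random.normal([4, 4])
b = tf.random.normal([4, 1])
x = tf.Variable(tf.ones([4, 1]))

with tf.GradientTape() as tape:
    residual = tf.matmul(A, x) - b
    # sum the squared differences into a single scalar error
    loss = tf.reduce_sum(tf.math.square(residual))

# gradient from automatic differentiation
auto_grad = tape.gradient(loss, x)

# hand-derived gradient of the summed squared error: 2 * A^T (Ax - b)
manual_grad = 2.0 * tf.matmul(A, residual, transpose_a=True)

max_gap = float(tf.reduce_max(tf.abs(auto_grad - manual_grad)).numpy())
print("largest difference between the two gradients:", max_gap)
```

The two gradients should agree up to floating-point rounding, which suggests the optimizer is simply following the slope of the squared error.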

### Homework

Your homework this week is to specify an equation that you will solve with gradient descent (as above). Then, play around with the learning rate and number of update iterations to get an intuitive understanding of how they affect your solver. Write up a paragraph or two describing your equation, how the learning rate and number of iterations gave a better or worse solution, and your intuition for why.