# SIT744 Practical 2: Introduction to TensorFlow


*Prof. Antonio Robles-Kelly*




<div class="alert alert-info">
We suggest that you run this notebook using Google Colab.
</div>

- Go through [this tutorial](https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/quickstart/beginner.ipynb) of TensorFlow using Keras in Google Colab. *You do not need to understand everything at this stage; just get an idea of how Keras works.*
- Read through [this blog](https://blog.tensorflow.org/2019/09/tensorflow-20-is-now-available.html) to understand how Keras and TensorBoard fit in the TensorFlow 2.0 ecosystem. *Again, don't worry if you don't understand everything mentioned there.*




## Task 1. TensorFlow




TensorFlow contains the following key components:

- A NumPy-like data model with GPU support.
- A computation model based on the computation graph.
- A library to compute gradients and perform gradient descent.

### Loading TensorFlow 2.0 in Colab
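We use TensorFlow 2.x in this unit. It is much more user-friendly than TensorFlow 1.x.

Current Google Colab runtimes come with TensorFlow 2.x preinstalled, so it is enough to import it and check the version as below. The commented-out `%tensorflow_version 2.x` magic was only needed on older Colab runtimes that also offered TensorFlow 1.x.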

In [9]:
#%tensorflow_version 2.x
import tensorflow as tf
print(tf.__version__)

2.16.2


### Create Tensors from NumPy

Tensors can be created from NumPy arrays and vice versa.

Which of the variables below contain tensors?


In [14]:
import numpy as np

a = np.ones(5)
print(type(tf.constant(a)))
print(type(tf.Variable(a)))



<class 'tensorflow.python.framework.ops.EagerTensor'>
<class 'tensorflow.python.ops.resource_variable_ops.ResourceVariable'>
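
The cell above covers the NumPy-to-tensor direction. For the "vice versa" direction, here is a minimal sketch; the names `t`, `back` and `also_back` are illustrative and not part of the practical.

```python
import numpy as np
import tensorflow as tf

t = tf.constant(np.ones(5))    # NumPy array -> tensor (float64 is preserved)

back = t.numpy()               # tensor -> NumPy array via the .numpy() method
also_back = np.asarray(t)      # NumPy can also convert an eager tensor itself

print(type(back), back.dtype)
print(type(also_back))
```

Note that the NumPy default dtype `float64` carries over to the tensor, whereas TensorFlow's own default for Python floats is `float32`.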


Many operations for ndarrays have counterparts for tensors.

In [4]:
b = tf.constant(a)  # tensor version of a, used in the cells below
print(a + 3)  # NumPy
print(b + 3)  # TensorFlow

[4. 4. 4. 4. 4.]
tf.Tensor([4. 4. 4. 4. 4.], shape=(5,), dtype=float64)


In [4]:
print(np.square(a + 1)) # NumPy
print(tf.square(b + 1)) # TensorFlow

[4. 4. 4. 4. 4.]
tf.Tensor([4. 4. 4. 4. 4.], shape=(5,), dtype=float64)
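
A few more NumPy/TensorFlow pairs are sketched below. The array `m` is made-up illustrative data. The main thing to notice is that the names sometimes differ (for example `np.sum` versus `tf.reduce_sum`).

```python
import numpy as np
import tensorflow as tf

m = np.arange(6, dtype=np.float64).reshape(2, 3)   # illustrative data
t = tf.constant(m)

# Each line computes the same quantity, first with NumPy, then with TensorFlow.
print(np.sum(m, axis=0), tf.reduce_sum(t, axis=0).numpy())
print(np.mean(m), tf.reduce_mean(t).numpy())
print((m @ m.T).shape, tf.matmul(t, tf.transpose(t)).shape)
```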


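However, for computational efficiency, TensorFlow avoids automatic type conversions: combining tensors of different dtypes (for example `int32` and `float64`) raises an error instead of silently converting one of them. Most deep learning computations use floating-point tensors, so integer tensors usually need an explicit `tf.cast` before they can be mixed with floats.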

In [20]:
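d = tf.constant(3)          # int32 scalar
c = tf.constant([1, 2, 3])  # int32 vector; a new name so that b is not overwritten
d + c                       # both are int32, so this works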


<tf.Tensor: shape=(3,), dtype=int32, numpy=array([4, 5, 6], dtype=int32)>

In [16]:
d

<tf.Tensor: shape=(), dtype=int32, numpy=3>

In [17]:
# This works: d is cast from int32 to float64 before it is added to a float64 tensor
b + tf.cast(d, tf.float64)

<tf.Tensor: shape=(5,), dtype=float64, numpy=array([4., 4., 4., 4., 4.])>
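
To see the missing automatic conversion directly, the sketch below adds an `int32` tensor to a `float64` tensor. It assumes TensorFlow's default eager behaviour (NumPy-style type promotion not enabled); the names `floats`, `three` and `mixed_ok` are illustrative.

```python
import tensorflow as tf

floats = tf.constant([1.0, 1.0, 1.0], dtype=tf.float64)
three = tf.constant(3)                           # dtype int32

try:
    floats + three                               # float64 + int32, no implicit cast
    mixed_ok = True
except (tf.errors.InvalidArgumentError, TypeError) as err:
    mixed_ok = False
    print("Mixed dtypes rejected:", type(err).__name__)

result = floats + tf.cast(three, floats.dtype)   # explicit cast succeeds
print(result)
```

Casting to `floats.dtype` rather than a hard-coded dtype keeps the code correct if the dtype of `floats` changes later.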

### Why Tensor? Why GPU?

A tensor is similar to a NumPy ndarray, but with GPU support. Let's first make sure that a GPU is enabled.

In [5]:
print(tf.test.gpu_device_name())
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
print("These are:", tf.config.list_physical_devices('GPU'))

/device:GPU:0
Num GPUs Available:  1
These are: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


2024-07-14 16:15:38.428213: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-07-14 16:15:38.428248: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)


Now compare the running time with GPU and without GPU.

In [8]:
%%time
with tf.device('/cpu:0'):
    random_image = tf.random.normal((10000, 10000, 3))
    tf.math.reduce_sum(random_image)


tf.Tensor(-13995.58, shape=(), dtype=float32)
CPU times: user 5.81 s, sys: 3.09 s, total: 8.9 s
Wall time: 2.33 s


In [7]:
%%time
with tf.device('/device:GPU:0'):
    random_image = tf.random.normal((10000, 10000, 3))
    tf.math.reduce_sum(random_image)

CPU times: user 48.1 ms, sys: 56.4 ms, total: 104 ms
Wall time: 114 ms
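
Outside a notebook the `%%time` magic is not available. The sketch below is one possible alternative that uses `time.perf_counter`. The helper `time_reduce_sum` and the smaller tensor shape are assumptions made for illustration, and the GPU is only timed if TensorFlow reports one.

```python
import time

import tensorflow as tf


def time_reduce_sum(device_name, shape=(2000, 2000, 3)):
    """Time random-tensor creation plus a sum on one device (hypothetical helper)."""
    start = time.perf_counter()
    with tf.device(device_name):
        total = tf.math.reduce_sum(tf.random.normal(shape))
    total.numpy()  # copy the result to host memory, so the computation has finished
    return time.perf_counter() - start


devices = ["/cpu:0"]
if tf.config.list_physical_devices("GPU"):
    devices.append("/device:GPU:0")

for name in devices:
    time_reduce_sum(name)                        # warm-up run (device initialisation)
    print(name, f"{time_reduce_sum(name):.4f} s")
```

The warm-up call is there because the first use of a device often includes one-off setup cost that would otherwise distort the comparison.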


## Task 2. TensorBoard

TensorBoard is a convenient visualisation tool that comes with TensorFlow. You have read in [this blog post](https://blog.tensorflow.org/2019/09/tensorflow-20-is-now-available.html) that TensorBoard is the default analysis tool in TensorFlow 2.x. You can use it to visualise computation graphs, training metrics, and much more.

Run through [this tutorial](https://colab.research.google.com/github/tensorflow/tensorboard/blob/master/docs/get_started.ipynb) in Google Colab. If you don't understand everything, it is fine, as we will revisit TensorBoard with Keras later on.



### Input for TensorBoard

Here is how TensorBoard expects the input data to be organised.

#### logdir
TensorBoard looks into a log folder *logdir*, which contains **summary data** to visualise.

#### runs

You may organise the summary data from different executions of your model into subfolders under *logdir*. These subfolders will become different **runs** in TensorBoard.


#### event files

You use TensorFlow's `tf.summary` API (or indirectly via Keras callbacks) to generate log files, which are called **event files**, in *logdir*. These files will automatically have "tfevents" in their file names. Each file contains records called *summaries*.

#### tags

You add tags to a summary by passing a `name` argument in `tf.summary` calls (see examples below). In TensorBoard, these tags allow you to filter the data to be visualised.


#### Two ways to generate TensorBoard logs

As mentioned above, there are two ways to generate TensorBoard logs: through the high-level Keras callbacks, or by using the *tf.summary* API directly. We will introduce Keras later in this unit. For now, let's focus on *tf.summary*.
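
The layout described above (logdir, runs, event files, tags) can be sketched with *tf.summary* as follows. The temporary folder, the run names `run_a` and `run_b`, and the tag `demo_tag` are all made up for illustration; this is a sketch, not the folder structure used in the rest of the practical.

```python
import os
import tempfile

import tensorflow as tf

demo_logdir = tempfile.mkdtemp()                 # stands in for the logdir

for run_name in ["run_a", "run_b"]:              # one subfolder per run
    writer = tf.summary.create_file_writer(os.path.join(demo_logdir, run_name))
    with writer.as_default():
        # "demo_tag" is the tag; each call adds one summary to the event file.
        tf.summary.scalar(name="demo_tag", data=1.0, step=0)
    writer.close()

# Walk the folder tree: logdir / run / event file.
event_files = []
for folder, _, files in os.walk(demo_logdir):
    for file_name in files:
        full_path = os.path.join(folder, file_name)
        event_files.append(os.path.relpath(full_path, demo_logdir))
print(sorted(event_files))
```

Pointing TensorBoard at `demo_logdir` should then list `run_a` and `run_b` as two runs that share the tag `demo_tag`.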



### Task 2.1 Create logdir

As mentioned above, it is common practice to keep logs from different training runs in separate folders so that we can compare the learning curves across runs. The code below shows a common trick: use the current date and time to generate a unique folder for the current training run.

In [37]:
from datetime import datetime
import os
root_logdir = "logs"
run_id = datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = os.path.join(root_logdir, run_id)


Then we use that folder for logging.

In [45]:
# Clear any logs from previous runs
import shutil

#!rm -rf ./logs/

logs_dir_path = './logs/'
if os.path.isdir(logs_dir_path):
  shutil.rmtree(logs_dir_path)




In [46]:
file_writer = tf.summary.create_file_writer(logdir)

In [47]:

# from google.colab import drive
# drive.mount('/content/drive')

### Visualise computation graph

To visualise TensorFlow computation graphs in TensorBoard, you need two functions: `tf.summary.trace_on()` and `tf.summary.trace_export()`.

In [51]:
# The function to be traced.
@tf.function
def my_prod(x, y):
    return tf.matmul(x, y)

def my_min(B):
    return tf.reduce_min(B)

def my_max(B):
    return tf.reduce_max(B)

# Sample data for your function.
x = tf.random.uniform((3, 3))
y = tf.random.uniform((3, 3))

# Bracket the function call with
# tf.summary.trace_on() and tf.summary.trace_export().
tf.summary.trace_on(graph=True)

# Call only one tf.function when tracing.
z = my_prod(x, y)
with file_writer.as_default():
  tf.summary.trace_export(
      name="my_func_trace",
      step=0,
      profiler_outdir=logdir)
tf.summary.trace_off()

A = my_prod(x, y)
A = (A-my_min(A))/(my_max(A)-my_min(A))
A = tf.expand_dims(A, 2)
A = tf.expand_dims(A, 0)
# Using the file writer, log the reshaped image.
with file_writer.as_default():
  tf.summary.image("my_matrix_multiplication", A, step=0)


#### Show log in TensorBoard

To load TensorBoard in Jupyter, follow the example below.


In [49]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


In [50]:
# %reload_ext tensorboard
%tensorboard --logdir logs

Reusing TensorBoard on port 6006 (pid 73651), started 1 day, 21:36:42 ago. (Use '!kill 73651' to kill it.)

You should be able to see a **GRAPHS** tab and there you can find the computation graph.
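
If the **GRAPHS** tab is not available, the same graph can also be inspected in code. The sketch below is an alternative to TensorBoard rather than part of its workflow: it traces a function with the same body as `my_prod` (renamed `demo_prod` here so that it does not interfere with the cells above) and lists the operation types in the resulting graph.

```python
import tensorflow as tf


@tf.function
def demo_prod(x, y):                             # same body as my_prod above
    return tf.matmul(x, y)


x = tf.random.uniform((3, 3))
y = tf.random.uniform((3, 3))

# Trace the function for these inputs and list the operations in its graph.
concrete = demo_prod.get_concrete_function(x, y)
op_types = [op.type for op in concrete.graph.get_operations()]
print(op_types)
```

The printed list should contain a `MatMul` operation, alongside placeholder operations for the two inputs.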

**Exercise**: Follow the example above to display the computation graph of the function $z = 3 x^2 + 2xy$. (*Hint: You may need a different tag.*)

### Log (training) metrics

Besides visualising the computation graph, TensorBoard can also be used to log training metrics.

In [None]:
with file_writer.as_default():
  for step in range(200):
    tf.summary.scalar(name="my_metric", data=0.5, step=step)
    file_writer.flush()
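
To check what was actually written, the scalars can be read back from the event file. The sketch below logs a value that changes with the step and then reads it back with `tf.compat.v1.train.summary_iterator`. The folder `decay_run`, the tag `decaying_metric` and the dictionary `logged` are illustrative names, and reading event files this way is a debugging aid rather than the usual TensorBoard workflow.

```python
import glob
import os
import tempfile

import tensorflow as tf

# Hypothetical log folder, kept separate from the practical's "logs" folder.
demo_logdir = os.path.join(tempfile.mkdtemp(), "decay_run")
demo_writer = tf.summary.create_file_writer(demo_logdir)

with demo_writer.as_default():
    for step in range(5):
        # A value that changes with the step, unlike the constant 0.5 above.
        tf.summary.scalar(name="decaying_metric", data=0.5 ** step, step=step)
demo_writer.flush()

# Read the logged scalars back from the event file, keyed by step.
event_file = glob.glob(os.path.join(demo_logdir, "*tfevents*"))[0]
logged = {}
for event in tf.compat.v1.train.summary_iterator(event_file):
    for value in event.summary.value:
        logged[event.step] = float(tf.make_ndarray(value.tensor))
print(logged)
```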

**Exercise**:
1. Follow the example above to log the value of the sine function from 0 to 100, then display the values in TensorBoard. (*Hint: use a different tag.*)
2. Create another run and plot the cosine function instead. (*Hint: use a different summary writer.*)

## Task 3. AutoDiff

Computing gradients is a core requirement for training deep learning models. In TensorFlow, this is done automatically by the software. If you use Keras, in most cases you specify an optimiser when calling `compile` and don't need to access the gradients directly. However, to access the computed gradients yourself, you can follow the example below.

In [55]:
x = tf.Variable(4.0)

def func(x):
  return x**2

with tf.GradientTape() as t:
  y = func(x)
  loss = 2*y

# loss = 2 * x**2, so d(loss)/dx = 4 * x = 16 at x = 4
dloss_dx = t.gradient(loss, x)
print(dloss_dx)


tf.Tensor(16.0, shape=(), dtype=float32)
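
Gradients from `tf.GradientTape` are what gradient descent consumes. As a minimal sketch, the loop below minimises the made-up function $(w - 3)^2$ by hand. The variable `w`, the starting point, the learning rate and the number of steps are all illustrative choices, and in practice a Keras optimiser would normally perform the update step.

```python
import tensorflow as tf

w = tf.Variable(0.0)                             # illustrative starting point
learning_rate = 0.1                              # illustrative step size

for _ in range(50):
    with tf.GradientTape() as tape:
        loss = (w - 3.0) ** 2                    # minimised at w = 3
    grad = tape.gradient(loss, w)                # d(loss)/dw = 2 * (w - 3)
    w.assign_sub(learning_rate * grad)           # gradient descent update

print(w.numpy())
```

Each update shrinks the distance to the minimum by a factor of 0.8, so after 50 steps `w` should be very close to 3.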


**Exercise**: Follow the example above to compute the gradient of the function $z(x,y) = 3 x^2 + 2xy$.

## Additional resources

- TensorFlow guide on [Examining the TensorFlow Graph](https://www.tensorflow.org/tensorboard/graphs)
- TensorFlow tutorial on [Automatic differentiation and gradient tape](https://www.tensorflow.org/tutorials/customization/autodiff)
- [Calculus on Computational Graphs: Backpropagation](https://colah.github.io/posts/2015-08-Backprop/)