# 1. Getting Started with Tensorflow

| Date | User | Change Type | Remarks |  
| ---- | ---- | ----------- | ------- |
| 03/09/2024   | Martin | Created   | Started chapter 1 | 
| 09/09/2024   | Martin | Update   | To page 55 - activation functions | 
| 08/10/2024   | Martin | Update   | Activation functions and sources for datasets | 

# Content

* [Introduction](#introduction)
* [Variables and Tensors](#variables-and-tensors)
* [Activation Functions](#activation-functions)
* [Working with Data Sources](#working-with-data-sources)
* [Tensorflow Under the hood](#tensorflow-under-the-hood)

In [4]:
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
os.environ["GRPC_VERBOSITY"] = "ERROR"
os.environ["GLOG_minloglevel"] = "2"
import tensorflow as tf

# Introduction

General workflow of Tensorflow:

1. Import or generate datasets
2. Transform and normalize data
3. Partition datasets into training, test and validation sets
4. Set algorithm parameters (hyperparameters)
5. Initialize variables
6. Define the model structure
7. Declare loss function
8. Initialise and train the model
9. Evaluate the model
10. Tune hyper parameters
11. Deploy/ predict new outcomes

In [13]:
# 1.
import tensorflow as tf
import tensorflow_datasets as tfds
import numpy as np

data = tfds.load("iris", split='train')

# 4.
epochs = 1000
batch_size = 32
input_size = 4
output_size = 3
learning_rate = 0.001

# 5.
weights = tf.Variable(tf.random.normal(
  shape=(input_size, output_size),
  dtype=tf.float32
))
biases = tf.Variable(tf.random.normal(
  shape=(output_size,),
  dtype=tf.float32
))

# 8.
optimizer = tf.optimizers.SGD(learning_rate)

for _ in range(epochs):
  # 2.
  for batch in data.batch(batch_size, drop_remainder=True):
    labels = tf.one_hot(batch['label'], 3)
    X = batch['features']
    X = (X - np.mean(X) / np.std(X))

    with tf.GradientTape() as tape:
      # 6.
      logits = tf.add(tf.matmul(X, weights), biases) # logistic regression (actually is linear regression first)

      # 7.
      loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels, logits)
      )

    # 8.
    gradients = tape.gradient(loss, [weights, biases])
    optimizer.apply_gradients(zip(gradients, [weights, biases]))

print(f"final loss is: {loss.numpy():.3f}")
preds = tf.math.argmax(tf.add(tf.matmul(X, weights), biases), axis=1)
ground_truth = tf.math.argmax(labels, axis=1)
for y_true, y_pred in zip(ground_truth.numpy(), preds.numpy()):
  print(f"real label: {y_true} | fitted: {y_pred}")

final loss is: 0.313
real label: 0 | fitted: 0
real label: 1 | fitted: 1
real label: 1 | fitted: 2
real label: 2 | fitted: 1
real label: 2 | fitted: 2
real label: 2 | fitted: 2
real label: 0 | fitted: 0
real label: 2 | fitted: 2
real label: 1 | fitted: 1
real label: 2 | fitted: 2
real label: 1 | fitted: 2
real label: 0 | fitted: 0
real label: 1 | fitted: 1
real label: 0 | fitted: 0
real label: 2 | fitted: 2
real label: 2 | fitted: 2
real label: 0 | fitted: 0
real label: 2 | fitted: 2
real label: 0 | fitted: 0
real label: 1 | fitted: 1
real label: 2 | fitted: 2
real label: 0 | fitted: 0
real label: 2 | fitted: 2
real label: 1 | fitted: 1
real label: 0 | fitted: 0
real label: 0 | fitted: 0
real label: 2 | fitted: 2
real label: 0 | fitted: 0
real label: 1 | fitted: 2
real label: 2 | fitted: 2
real label: 0 | fitted: 0
real label: 2 | fitted: 2


Tensorflow computes the changes by creating a computational graph which tracks the steps taken for all operations. Graphs do not use recursion.

Tensorflow keeps track of all variables in the computational graph and computes the gradients to minimize the loss

# Variables and Tensors

All variables are stored as tensors. Even single number/ digits are stored as zero-dimensional tensors

In [1]:
import tensorflow as tf

2024-09-09 14:51:18.603813: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-09-09 14:51:18.739735: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-09 14:51:18.802387: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-09 14:51:18.817166: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-09 14:51:18.916843: I tensorflow/core/platform/cpu_feature_guar

In [5]:
row_dim, col_dim = 3, 3
ones_tsr = tf.ones([row_dim, col_dim])
ones_tsr

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]], dtype=float32)>

In [6]:
# or a filled tensor
tf.fill([row_dim, col_dim], value=42)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[42, 42, 42],
       [42, 42, 42],
       [42, 42, 42]], dtype=int32)>

In [8]:
# create tensors based on an existing shape
tf.zeros_like(ones_tsr)

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]], dtype=float32)>

In [9]:
# generate values from distributions
tf.random.uniform([row_dim, col_dim], minval=0, maxval=1)

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[0.9851533 , 0.11942112, 0.5387734 ],
       [0.4384067 , 0.9586129 , 0.19684386],
       [0.04166341, 0.10652184, 0.03117621]], dtype=float32)>

To initialize a tensor as a variable use `tf.Variable()`. Notice in the example below that the output is a Variable instead of tensor now

In [11]:
my_var = tf.Variable(tf.zeros([row_dim, col_dim]))
my_var

<tf.Variable 'Variable:0' shape=(3, 3) dtype=float32, numpy=
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]], dtype=float32)>

In [14]:
import numpy as np
# to convert into a tensor
np_arr = np.array([1, 2, 3])
l = [1, 2, 3]
np_arr, tf.convert_to_tensor(np_arr)

(array([1, 2, 3]),
 <tf.Tensor: shape=(3,), dtype=int64, numpy=array([1, 2, 3])>)

## Creating Matrices

pg 52-54 examples of elementwise operations on matrices

Any custom functions created must use the tensorflow API to be used in the computational graph

In [15]:
id_matrix = tf.linalg.diag([1.0, 1.0, 1.0])
A = tf.random.truncated_normal([2, 3])
B = tf.fill([2, 3], 5.0)
C = tf.random.uniform([3, 2])
D = tf.convert_to_tensor(np.array([[1., 2., 3.],
                                   [-3., -7., -1.],
                                   [0., 5., -2.]]),
                                   dtype=tf.float32)

In [16]:
# Operations
A + B

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[6.7755075, 5.012356 , 5.9779267],
       [4.1081805, 4.056792 , 5.404626 ]], dtype=float32)>

In [19]:
tf.matmul(B, id_matrix)
tf.multiply(D, id_matrix)
tf.transpose(D)

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[ 1., -3.,  0.],
       [ 2., -7.,  5.],
       [ 3., -1., -2.]], dtype=float32)>

In [22]:
# Linear algebra uses the .linalg methods
tf.linalg.det(D)
tf.linalg.inv(D)
tf.linalg.eigh(D)

(<tf.Tensor: shape=(3,), dtype=float32, numpy=array([-10.659076  ,  -0.22750677,   2.8865824 ], dtype=float32)>,
 <tf.Tensor: shape=(3, 3), dtype=float32, numpy=
 array([[ 0.21749546, -0.6325011 ,  0.7433963 ],
        [ 0.84526515, -0.25879973, -0.46749282],
        [-0.48808047, -0.7300446 , -0.47834337]], dtype=float32)>)

# Activation Functions 

Introduces non-linear operations into neural networks. Careful which activation function is used and where it's used.

* Adjusts weights and biases
* Non-linear operations on tensors

## A note about activation functions

The computational graph is limited by the output range of the activation function i.e if the output of the activation function is between 0 and 1, then the computational graph will have a range of [0, 1]

💡 Use activation functions that preserve the variance as much as possible

In [5]:
# Example of the ReLU activation function
print(tf.nn.relu([ -3., 3., 10 ]))

tf.Tensor([ 0.  3. 10.], shape=(3,), dtype=float32)


In [6]:
# Another implementation which caps the max value of the ReLU function
## called ReLU6 which caps it at 6
print(tf.nn.relu6([-3., 3., 10]))

tf.Tensor([0. 3. 6.], shape=(3,), dtype=float32)


The method above is _computationally faster_, prevents the _exploding and vanishing gradient_ problem

In [8]:
# Sigmoid (logistic) function
print(tf.nn.sigmoid([-1., 0., 1.]))

tf.Tensor([0.26894143 0.5        0.73105854], shape=(3,), dtype=float32)


Sigmoid function has a tendency to zero-out the backpropogation term during training

In [12]:
# Other activation functions
# Hyperbolic tangent function - similar to Sigmoid, but (-1, 1)
print(tf.nn.tanh([-1, 0., 1.]))

# Softplus function - smoother version if ReLU function
print(tf.nn.softplus([-1., 0., 1.]))

# Exponential Linear Unit (ELU) - bottom asymptote is -1, similar to Softplus
print(tf.nn.elu([-1., 0., 1.]))

tf.Tensor([-0.7615942  0.         0.7615942], shape=(3,), dtype=float32)
tf.Tensor([0.31326172 0.6931472  1.3132616 ], shape=(3,), dtype=float32)
tf.Tensor([-0.63212055  0.          1.        ], shape=(3,), dtype=float32)


## Custom activation functions

In [13]:
def swish(x: tf.Tensor):
  return x * tf.nn.sigmoid(x)

print(swish([-1., 0, 1.]))

tf.Tensor([-0.26894143  0.          0.73105854], shape=(3,), dtype=float32)


# Working with Data Sources

In [15]:
import tensorflow_datasets as tfds
# Iris dataset 
iris = tfds.load('iris', split='train')

# Birth weight data - contains measurements inclusing childbirth weight
birthdata_url = 'https://raw.githubusercontent.com/PacktPublishing/TensorFlow-2-Machine-Learning-Cookbook-Third-Edition/master/birthweight.dat' 
path = tf.keras.utils.get_file(birthdata_url.split('/')[-1], birthdata_url)

def map_line(x):
  return tf.strings.to_number(tf.strings.split(x))
birth_file = tf.data \
  .TextLineDataset(path) \
  .skip(1) \
  .map(map_line)

# Boston housing data - 506 observations of house worth
housing_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data'
path = tf.keras.utils.get_file(housing_url.split("/")[-1], housing_url)

housing = tf.data \
  .TextLineDataset(path) \
  .map(map_line)

# MNIST - handwriting data
mnist = tfds.load('mnist', split=None)
mnist_train = mnist['train']
mnist_test = mnist['test']

# Spam-ham text data - text message data on whether a message is spam or not
zip_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/00228/smsspamcollection.zip'
path = tf.keras.utils.get_file(zip_url.split("/")[-1], zip_url, extract=True)

path = path.replace("smsspamcollection.zip", "SMSSpamCollection")

def split_text(x):
    return tf.strings.split(x, sep='\t')

text_data = tf.data \
  .TextLineDataset(path) \
  .map(split_text)
            
# Moview review data - Classify whether movie is good or bad
movie_data_url = 'http://www.cs.cornell.edu/people/pabo/movie-review-data/rt-polaritydata.tar.gz'
path = tf.keras.utils.get_file(movie_data_url.split('/')[-1], movie_data_url, extract=True)
path = path.replace('.tar.gz', '')

with open('moview_reviews.txt', 'w') as review_file:
  for response, filename in enumerate(['/rt-polarity.neg', '/rt-polarity.pos']):
    with open(path+filename, 'r', encoding='utf-8', errors='ignore') as movie_file:
      for line in movie_file:
        review_file.write(str(response) + '\t' + line.encode('utf-8').decode())

movies = tf.data \
  .TextLineDataset('movie_reviews.txt') \
  .map(split_text)

# CIFAR-10 image data - labeled coloured images, 10 target classes
ds, info = tfds.load('cifar10', shuffle_files=True, with_info=True)
print(info)

cifar_train = ds['train']
cifar_test = ds['test']

# Shakespear text data - compiled work of Shakespear
shakespeare_url = 'https://raw.githubusercontent.com/PacktPublishing/TensorFlow-2-Machine-Learning-Cookbook-Third-Edition/master/shakespeare.txt'
path = tf.keras.utils.get_file(shakespeare_url.split("/")[-1], shakespeare_url)

shakespeare_text = tf.data \
  .TextLineDataset(path) \
  .map(split_text)

# English-German translation data - sentence-to-sentence translation from english to german
import os
import pandas as pd
from zipfile import ZipFile
from urllib.request import urlopen, Request

sentence_url = "https://www.manythings.org/anki/cmn-eng.zip"
r = Request(
  sentence_url,
  headers={
    'User-Agent': 'Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11'
  }
)
b2 = [z for z in sentence_url.split('/') if '.zip' in z][0]

with open(b2, "wb") as target:
  target.write(urlopen(r).read())

with ZipFile(b2) as z:
  chn = [line.split('\t')[:2] for line in z.open('cmn.txt').read().decode().split('\n')]

os.remove(b2)

with open("cmn.txt", "wb") as chn_file:
  for line in chn:
    data = ",".join(line) + "\n"
    chn_file.write(data.encode('utf-8'))
  
text_data = tf.data \
  .TextLineDataset("cmn.txt") \
  .map(split_text)


Downloading data from https://raw.githubusercontent.com/PacktPublishing/TensorFlow-2-Machine-Learning-Cookbook-Third-Edition/master/birthweight.dat
[1m4554/4554[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 0us/step
Downloading data from http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data
  49152/Unknown [1m0s[0m 4us/step[1mDownloading and preparing dataset 11.06 MiB (download: 11.06 MiB, generated: 21.00 MiB, total: 32.06 MiB) to /root/tensorflow_datasets/mnist/3.0.1...[0m


Dl Completed...:   0%|          | 0/5 [00:00<?, ? file/s]

[1mDataset mnist downloaded and prepared to /root/tensorflow_datasets/mnist/3.0.1. Subsequent calls will reuse this data.[0m
Downloading data from http://archive.ics.uci.edu/ml/machine-learning-databases/00228/smsspamcollection.zip
 106496/Unknown [1m0s[0m 4us/stepDownloading data from http://www.cs.cornell.edu/people/pabo/movie-review-data/rt-polaritydata.tar.gz
[1m487770/487770[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3us/step
[1mDownloading and preparing dataset 162.17 MiB (download: 162.17 MiB, generated: 132.40 MiB, total: 294.58 MiB) to /root/tensorflow_datasets/cifar10/3.0.2...[0m


Dl Completed...: 0 url [00:00, ? url/s]

Dl Size...: 0 MiB [00:00, ? MiB/s]

Extraction completed...: 0 file [00:00, ? file/s]

# Tensorflow Under the Hood

1. Tensorflow uses eager execution