<a href="https://colab.research.google.com/github/ryangprince/TensorFlow-Bootcamp/blob/main/00_tensorflow_fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

What are we going to do?

More specifically, we're going to cover:
* Introduction to tensors.
* Getting information from tensors.
* Manipulating tensors.
* Tensors & NumPy
* Using `@tf.function` (a way to speed up your regular Python functions)
* Using GPUs with TensorFlow (or TPUs)
* Exercises to try for yourself!

## Introduction to Tensors

In [None]:
# Import TensorFlow
import tensorflow as tf
print(tf.__version__)

2.17.1


In [None]:
# Create tensors with tf.constant()
scalar = tf.constant(7)
scalar

<tf.Tensor: shape=(), dtype=int32, numpy=7>

In [None]:
# Check the number of dimensions of a tensor (ndim stands for number of dimensions)
scalar.ndim

0

In [None]:
# Create a vector
vector = tf.constant([10, 10])
vector

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([10, 10], dtype=int32)>

In [None]:
# Check the dimensions of our vector
vector.ndim

1

In [None]:
# Create a matrix (a matrix has two dimensions)
matrix = tf.constant([[10, 7],
                      [7, 10]])
matrix

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 7, 10]], dtype=int32)>

In [None]:
matrix.ndim

2

In [None]:
# Create another matrix
another_matrix = tf.constant([[10., 7.],
                              [3., 2.],
                              [8., 9.]], dtype=tf.float16) # specify the data type with dtype parameter
another_matrix

<tf.Tensor: shape=(3, 2), dtype=float16, numpy=
array([[10.,  7.],
       [ 3.,  2.],
       [ 8.,  9.]], dtype=float16)>

In [None]:
# What's the number of dimensions of another_matrix?
another_matrix.ndim

2

In [None]:
# Let's create a tensor
tensor = tf.constant([[[1, 2, 3],
                       [4, 5, 6]],
                      [[7, 8, 9],
                       [10, 11, 12]],
                      [[13, 14, 15],
                       [16, 17, 18]]])
tensor

<tf.Tensor: shape=(3, 2, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]],

       [[13, 14, 15],
        [16, 17, 18]]], dtype=int32)>

In [None]:
tensor.ndim

3

What we've created so far:

* Scalar: a single number
* Vector: a number with direction (e.g. wind speed and direction)
* Matrix: a 2-dimensional array of numbers
* Tensor: an n-dimensional array of numbers (where n can be any number, a 0-dimensional tensor is a scalar, a 1-dimensional tensor is a vector)

### Creating tensors with `tf.Variable`

In [None]:
# Create the same tensor with tf.Variable() as above
changeable_tensor = tf.Variable([10, 7])
unchangeable_tensor = tf.constant([10, 7])
changeable_tensor, unchangeable_tensor

(<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([10,  7], dtype=int32)>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([10,  7], dtype=int32)>)

In [None]:
# Let's try to change one of the elements in our changeable tensor
changeable_tensor[0] = 7
changeable_tensor

TypeError: 'ResourceVariable' object does not support item assignment

In [None]:
# How about we try .assign()
changeable_tensor[0].assign(7)
changeable_tensor

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([7, 7], dtype=int32)>

In [None]:
# Let's try to change our unchangeable tensor
unchangeable_tensor[0].assign(7)
unchangeable_tensor

AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'assign'

🔑 **Note**: Rarely in practice will you need to decide whether to use `tf.constant` or `tf.Variable` to create tensors, as TensorFlow does this for you. However, if in doubt, use `tf.constant` and change it later if needed.
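As a minimal sketch of what "changeable" means in practice, the example below updates a `tf.Variable` in place. The name `weights` and its values are made up for illustration.

```python
import tensorflow as tf

# A variable holds state that can be updated in place (values are illustrative)
weights = tf.Variable([10, 7])

weights.assign([1, 2])       # replace every element
weights.assign_add([1, 1])   # add element-wise to the current values
weights[0].assign(5)         # update a single element through a slice

print(weights.numpy())
```

A tensor created with `tf.constant` has none of these methods, which is why the attempt to call `.assign()` on it fails.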

### Creating random tensors

Random tensors are tensors of some arbitrary size which contain random numbers.

In [None]:
# Create two random (but the same) tensors
random_1 = tf.random.Generator.from_seed(7) # set seed for reproducibility
random_1 = random_1.normal(shape=(3, 2))

random_2 = tf.random.Generator.from_seed(7)
random_2 = random_2.normal(shape=(3, 2))

# Are they equal?
random_1, random_2, random_1 == random_2

(<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-1.3240396 ,  0.28785667],
        [-0.8757901 , -0.08857018],
        [ 0.69211644,  0.84215707]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-1.3240396 ,  0.28785667],
        [-0.8757901 , -0.08857018],
        [ 0.69211644,  0.84215707]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[ True,  True],
        [ True,  True],
        [ True,  True]])>)

### Shuffle the order of elements in a tensor

In [None]:
# Shuffle a tensor (valuable for when you want to shuffle your data so the inherent order doesn't affect learning)
not_shuffled = tf.constant([[10, 7],
                            [3, 4],
                            [2, 5]])

# Shuffle our non-shuffled tensor
tf.random.shuffle(not_shuffled)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4],
       [ 2,  5]], dtype=int32)>

In [None]:
# Shuffle our non-shuffled tensor again, this time setting both the global and the operation-level seed
tf.random.set_seed(42)
tf.random.shuffle(not_shuffled, seed=42)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4],
       [ 2,  5]], dtype=int32)>

In [None]:
not_shuffled

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4],
       [ 2,  5]], dtype=int32)>

🛠 **Exercise:** Read through TensorFlow documentation on random seed generation: https://www.tensorflow.org/api_docs/python/tf/random/set_seed and practice writing 5 random tensors and shuffling them.

It looks like if we want our shuffled tensors to be in the same order, we've got to use the global level random seed as well as the operation level random seed:

> Rule 4: "If both the global and the operation seed are set: Both seeds are used in conjunction to determine the random sequence."
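A short sketch of this rule, assuming eager execution (the TensorFlow 2 default): resetting the global seed and reusing the same operation seed should reproduce the same shuffle. The names `data`, `first`, and `second` are illustrative.

```python
import tensorflow as tf

data = tf.constant([[10, 7],
                    [3, 4],
                    [2, 5]])

# First run: global seed plus operation-level seed
tf.random.set_seed(42)
first = tf.random.shuffle(data, seed=42)

# Second run: resetting the global seed also resets the op's internal counter
tf.random.set_seed(42)
second = tf.random.shuffle(data, seed=42)

# Check that both runs produced the same row order
print(bool(tf.reduce_all(first == second)))
```

Without the second `tf.random.set_seed(42)` call, the two shuffles are not guaranteed to match, because the operation advances its internal counter on every call.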

In [None]:
tf.random.set_seed(42)
not_shuffled_2 = tf.constant([[10, 7, 6],
                              [3, 4, 10],
                              [2, 5, 40]])

tf.random.shuffle(not_shuffled_2, seed=42)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[10,  7,  6],
       [ 3,  4, 10],
       [ 2,  5, 40]], dtype=int32)>

In [None]:
tf.random.set_seed(42)
not_shuffled_3 = tf.Variable([[1, 2, 3],
                              [4, 5, 6],
                              [7, 8, 9]])

tf.random.shuffle(not_shuffled_3, seed=42)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]], dtype=int32)>

In [None]:
tf.random.set_seed(50)
not_shuffled_4 = tf.constant([[[1, 2, 3],
                               [4, 5, 6],
                               [7, 8, 9]],
                              [[10, 11, 12],
                               [13, 14, 15],
                               [16, 17, 18]],
                              [[19, 20, 21],
                               [22, 23, 24],
                               [25, 26, 27]]])
tf.random.shuffle(not_shuffled_4, seed=50)

<tf.Tensor: shape=(3, 3, 3), dtype=int32, numpy=
array([[[19, 20, 21],
        [22, 23, 24],
        [25, 26, 27]],

       [[10, 11, 12],
        [13, 14, 15],
        [16, 17, 18]],

       [[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9]]], dtype=int32)>

### Other ways to make tensors

In [None]:
# Create a tensor of all ones
tf.ones([10, 7])

<tf.Tensor: shape=(10, 7), dtype=float32, numpy=
array([[1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1.]], dtype=float32)>

In [None]:
# Create a tensor of all zeroes
tf.zeros(shape=(3, 4))

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]], dtype=float32)>

### Turn NumPy arrays into tensors

The main difference between NumPy arrays and TensorFlow tensors is that tensors can be run on a GPU (much faster for numerical computing).

In [None]:
# You can also turn Numpy arrays into tensors
import numpy as np
numpy_A = np.arange(1, 25, dtype=np.int32) # create a NumPy array from 1 to 24 (the stop value 25 is excluded)
numpy_A

# X = tf.constant(some_matrix) # capital for matrix
# y = tf.constant(vector) # lower case for vector

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24], dtype=int32)

In [None]:
# Convert NumPy array to a tensor and change the shape of the tensor
A = tf.constant(numpy_A, shape=(3, 8))
B = tf.constant(numpy_A)
A, B

(<tf.Tensor: shape=(3, 8), dtype=int32, numpy=
 array([[ 1,  2,  3,  4,  5,  6,  7,  8],
        [ 9, 10, 11, 12, 13, 14, 15, 16],
        [17, 18, 19, 20, 21, 22, 23, 24]], dtype=int32)>,
 <tf.Tensor: shape=(24,), dtype=int32, numpy=
 array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24], dtype=int32)>)

🛠 **Exercise:** Create 5 NumPy arrays, convert them to tensors and then reshape them.

In [None]:
numpy_1 = np.arange(1, 31, dtype=np.int32)
tensor_1 = tf.constant(numpy_1, shape=(5, 2, 3))
numpy_1, tensor_1

(array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30], dtype=int32),
 <tf.Tensor: shape=(5, 2, 3), dtype=int32, numpy=
 array([[[ 1,  2,  3],
         [ 4,  5,  6]],
 
        [[ 7,  8,  9],
         [10, 11, 12]],
 
        [[13, 14, 15],
         [16, 17, 18]],
 
        [[19, 20, 21],
         [22, 23, 24]],
 
        [[25, 26, 27],
         [28, 29, 30]]], dtype=int32)>)

In [None]:
numpy_2 = np.arange(11, 101, dtype=np.int32)
tensor_2 = tf.constant(numpy_2, shape=(10, 9))
numpy_2, tensor_2

(array([ 11,  12,  13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,
         24,  25,  26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,
         37,  38,  39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,
         50,  51,  52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,
         63,  64,  65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,
         76,  77,  78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,
         89,  90,  91,  92,  93,  94,  95,  96,  97,  98,  99, 100],
       dtype=int32),
 <tf.Tensor: shape=(10, 9), dtype=int32, numpy=
 array([[ 11,  12,  13,  14,  15,  16,  17,  18,  19],
        [ 20,  21,  22,  23,  24,  25,  26,  27,  28],
        [ 29,  30,  31,  32,  33,  34,  35,  36,  37],
        [ 38,  39,  40,  41,  42,  43,  44,  45,  46],
        [ 47,  48,  49,  50,  51,  52,  53,  54,  55],
        [ 56,  57,  58,  59,  60,  61,  62,  63,  64],
        [ 65,  66,  67,  68,  69,  70,  71,  72,  73],
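        [ 74,  75,  76,  77,  78,  79,  80,  81,  82],
        [ 83,  84,  85,  86,  87,  88,  89,  90,  91],
        [ 92,  93,  94,  95,  96,  97,  98,  99, 100]], dtype=int32)>)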

In [None]:
numpy_3 = np.arange(1, 19, dtype=np.int32)
tensor_3 = tf.constant(numpy_3, shape=(3, 2, 3))
numpy_3, tensor_3

(array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18], dtype=int32),
 <tf.Tensor: shape=(3, 2, 3), dtype=int32, numpy=
 array([[[ 1,  2,  3],
         [ 4,  5,  6]],
 
        [[ 7,  8,  9],
         [10, 11, 12]],
 
        [[13, 14, 15],
         [16, 17, 18]]], dtype=int32)>)

In [None]:
numpy_4 = np.arange(1, 17, dtype=np.int32)
tensor_4 = tf.constant(numpy_4, shape=(2, 2, 2, 2))
numpy_4, tensor_4

(array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16],
       dtype=int32),
 <tf.Tensor: shape=(2, 2, 2, 2), dtype=int32, numpy=
 array([[[[ 1,  2],
          [ 3,  4]],
 
         [[ 5,  6],
          [ 7,  8]]],
 
 
        [[[ 9, 10],
          [11, 12]],
 
         [[13, 14],
          [15, 16]]]], dtype=int32)>)

In [None]:
# Try with tf.Variable
numpy_5 = np.arange(1, 26, dtype=np.int32)
# tensor_5 = tf.Variable(numpy_5, shape=(1, 5)) # fails: the shape argument of tf.Variable does not reshape the initial value, and (1, 5) cannot hold 25 elements anyway
tensor_5 = tf.constant(numpy_5, shape=(5, 5))
numpy_5, tensor_5

(array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24, 25], dtype=int32),
 <tf.Tensor: shape=(5, 5), dtype=int32, numpy=
 array([[ 1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10],
        [11, 12, 13, 14, 15],
        [16, 17, 18, 19, 20],
        [21, 22, 23, 24, 25]], dtype=int32)>)

In [None]:
# Try to create a tensor using tf.Variable AND change its shape using tf.reshape
numpy_6 = np.arange(1, 26, dtype=np.int32)
tensor_6 = tf.Variable(numpy_6)
tensor_6 = tf.reshape(tensor_6, [1,5])
numpy_6, tensor_6

# This fails because 25 values cannot fit into shape (1, 5); a shape such as (5, 5) or (1, 25) would work

InvalidArgumentError: {{function_node __wrapped__Reshape_device_/job:localhost/replica:0/task:0/device:CPU:0}} Input to reshape is a tensor with 25 values, but the requested shape has 5 [Op:Reshape]
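A reshape only succeeds when the target shape holds exactly as many elements as the input. Here is a minimal sketch using the same 25 values; the names `values`, `square`, and `row` are illustrative.

```python
import numpy as np
import tensorflow as tf

values = tf.Variable(np.arange(1, 26, dtype=np.int32))  # 25 elements

# 5 * 5 == 25, so this shape is valid
square = tf.reshape(values, [5, 5])

# -1 asks TensorFlow to infer that dimension from the element count (25 here)
row = tf.reshape(values, [1, -1])

print(square.shape, row.shape)
```

Note that `tf.reshape` returns a regular tensor rather than a `tf.Variable`, so the result is no longer assignable.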

### Getting information from tensors

When dealing with tensors you probably want to be aware of the following attributes.
* Shape: the length (number of elements) of each of the dimensions of a tensor.
> tensor.shape
* Rank: the number of tensor dimensions. A scalar has rank 0, a vector has rank 1, a matrix has rank 2, and a tensor has rank n.
> tensor.ndim
* Axis: a particular dimension of a tensor.
> tensor[0], tensor[:, 1]
* Size: the total number of items in the tensor.
> tf.size(tensor)

In [None]:
# Create a rank 4 tensor (4 dimensions)
rank_4_tensor = tf.zeros(shape=[2, 3, 4, 5])
rank_4_tensor

<tf.Tensor: shape=(2, 3, 4, 5), dtype=float32, numpy=
array([[[[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]],


       [[[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]]], dtype=float32)>

In [None]:
rank_4_tensor[0]

<tf.Tensor: shape=(3, 4, 5), dtype=float32, numpy=
array([[[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]]], dtype=float32)>

In [None]:
rank_4_tensor.shape, rank_4_tensor.ndim, tf.size(rank_4_tensor)

(TensorShape([2, 3, 4, 5]), 4, <tf.Tensor: shape=(), dtype=int32, numpy=120>)

In [None]:
# Get various attributes of our tensor
print('Datatype of every element:', rank_4_tensor.dtype)
print('Number of dimensions (rank):', rank_4_tensor.ndim)
print('Shape of tensor:', rank_4_tensor.shape)
print('Elements along the 0 axis:', rank_4_tensor.shape[0])
print('Elements along the last axis:', rank_4_tensor.shape[-1])
print('Total number of elements in our tensor:', tf.size(rank_4_tensor).numpy()) # .numpy() returns the numerical value only

Datatype of every element: <dtype: 'float32'>
Number of dimensions (rank): 4
Shape of tensor: (2, 3, 4, 5)
Elements along the 0 axis: 2
Elements along the last axis: 5
Total number of elements in our tensor: 120


### Indexing tensors

Tensors can be indexed just like Python lists.

In [None]:
# Get the first 2 elements of each dimension
rank_4_tensor[:2, :2, :2, :2]

<tf.Tensor: shape=(2, 2, 2, 2), dtype=float32, numpy=
array([[[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]],


       [[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]]], dtype=float32)>

In [None]:
rank_4_tensor.shape

TensorShape([2, 3, 4, 5])

In [None]:
# Get the first element from each dimension from each index except for the final one
rank_4_tensor[:1, :1, :1, :]

<tf.Tensor: shape=(1, 1, 1, 5), dtype=float32, numpy=array([[[[0., 0., 0., 0., 0.]]]], dtype=float32)>

In [None]:
# Create a rank 2 tensor (2 dimensions)
rank_2_tensor = tf.constant([[10, 7],
                             [3, 4]])
rank_2_tensor.shape, rank_2_tensor.ndim

(TensorShape([2, 2]), 2)

In [None]:
rank_2_tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4]], dtype=int32)>

In [None]:
# Get the last item of each row of our rank 2 tensor
rank_2_tensor[:, -1]

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([7, 4], dtype=int32)>

In [None]:
# Add an extra dimension to our rank 2 tensor
rank_3_tensor = rank_2_tensor[..., tf.newaxis]
rank_3_tensor

<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[10],
        [ 7]],

       [[ 3],
        [ 4]]], dtype=int32)>

In [None]:
# Alternative to tf.newaxis
tf.expand_dims(rank_2_tensor, axis=-1) # "-1" means expand the final axis

<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[10],
        [ 7]],

       [[ 3],
        [ 4]]], dtype=int32)>

In [None]:
# Expand the zero axis
tf.expand_dims(rank_2_tensor, axis=0) # expand the 0-axis

<tf.Tensor: shape=(1, 2, 2), dtype=int32, numpy=
array([[[10,  7],
        [ 3,  4]]], dtype=int32)>

### Manipulating tensors (tensor operations)

**Basic operations**

`+`, `-`, `*`, `/`

In [None]:
# You can add values to a tensor using the addition operator
tensor = tf.constant([[10, 7], [3, 4]])
tensor + 10

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[20, 17],
       [13, 14]], dtype=int32)>

In [None]:
# Original tensor is unchanged
tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4]], dtype=int32)>

In [None]:
# Multiplication also works
tensor * 10

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[100,  70],
       [ 30,  40]], dtype=int32)>

In [None]:
# Subtraction if you want
tensor - 10

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 0, -3],
       [-7, -6]], dtype=int32)>

In [None]:
# We can use the TensorFlow built-in function too
tf.multiply(tensor, 10)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[100,  70],
       [ 30,  40]], dtype=int32)>

In [None]:
# Use the TensorFlow built-in function for addition
tf.add(tensor, 10)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[20, 17],
       [13, 14]], dtype=int32)>

**Matrix multiplication**

In machine learning, matrix multiplication is one of the most common tensor operations.

There are two rules our tensors (or matrices) need to fulfill if we're going to matrix multiply them:

1. The inner dimensions must match
* The "inner dimensions" are the number of columns in the first matrix and the number of rows in the second matrix. They must be equal for the multiplication to be defined.
> tensorX.shape=(3, `3`) @ tensorY.shape=(`3`, 2) = tensor.shape=(3, 2)
2. The resulting matrix has the shape of the outer dimensions
* The "outer dimensions" are the number of rows in the first matrix and the number of columns in the second matrix. They give the shape of the resulting product matrix.
> tensorX.shape=(`2`, 3) @ tensorY.shape=(3, `2`) = tensor.shape=(2, 2)
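The two rules can be sketched in plain Python. The helper `matmul_output_shape` is hypothetical (it is not part of TensorFlow) and only handles 2-dimensional shapes.

```python
def matmul_output_shape(left_shape, right_shape):
    """Return the shape of left @ right for 2-D shapes, or raise if undefined."""
    left_rows, left_cols = left_shape
    right_rows, right_cols = right_shape

    # Rule 1: the inner dimensions must match
    if left_cols != right_rows:
        raise ValueError(
            f"inner dimensions differ: {left_cols} != {right_rows}"
        )

    # Rule 2: the result takes the outer dimensions
    return (left_rows, right_cols)


print(matmul_output_shape((3, 3), (3, 2)))  # (3, 2)
print(matmul_output_shape((2, 3), (3, 2)))  # (2, 2)

try:
    matmul_output_shape((3, 2), (3, 2))
except ValueError as error:
    print("cannot multiply:", error)
```

The last call fails because the first matrix has 2 columns but the second has 3 rows, which is the situation that transposing one of the matrices resolves.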

📖 **Resource:** Info and example of matrix multiplication: https://www.mathsisfun.com/algebra/matrix-multiplying.html

In [None]:
# Matrix multiplication in TensorFlow
print(tensor)
tf.matmul(tensor, tensor)

tf.Tensor(
[[10  7]
 [ 3  4]], shape=(2, 2), dtype=int32)


<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[121,  98],
       [ 42,  37]], dtype=int32)>

In [None]:
# Replicate matrix multiplication from http://matrixmultiplication.xyz/
left_tensor = tf.constant([[1, 2, 7],
                           [7, 2, 1],
                           [3, 3, 3]])

right_tensor = tf.constant([[2, 5],
                            [6, 7],
                            [1, 8]])

print(f'left_tensor shape: {left_tensor.shape}, right_tensor shape: {right_tensor.shape}')
tf.matmul(left_tensor, right_tensor)

left_tensor shape: (3, 3), right_tensor shape: (3, 2)


<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[21, 75],
       [27, 57],
       [27, 60]], dtype=int32)>

In [None]:
# Matrix multiplication with the Python operator "@"
tensor @ tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[121,  98],
       [ 42,  37]], dtype=int32)>

In [None]:
# Create a tensor of (3, 2)
X = tf.constant([[1, 2],
                 [3, 4],
                 [5, 6]])
# Create another (3, 2) tensor
Y = tf.constant([[7, 8],
                 [9, 10],
                 [11, 12]])

X, Y

(<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4],
        [5, 6]], dtype=int32)>,
 <tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[ 7,  8],
        [ 9, 10],
        [11, 12]], dtype=int32)>)

In [None]:
# Let's change the shape of Y
tf.reshape(Y, shape=(2, 3))

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[ 7,  8,  9],
       [10, 11, 12]], dtype=int32)>

In [None]:
# Try to matrix multiply X by reshaped Y
X @ tf.reshape(Y, shape=(2, 3))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]], dtype=int32)>

In [None]:
tf.matmul(X, tf.reshape(Y, shape=(2, 3)))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]], dtype=int32)>

In [None]:
# Try reshaping X instead of Y
tf.matmul(tf.reshape(X, shape=(2, 3)), Y)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 58,  64],
       [139, 154]], dtype=int32)>

In [None]:
# Can do the same with transpose
X, tf.transpose(X), tf.reshape(X, shape=(2, 3))

# Transpose flips the axes (rows become columns), while reshape keeps the elements in their original order and regroups them into the shape that you want

(<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4],
        [5, 6]], dtype=int32)>,
 <tf.Tensor: shape=(2, 3), dtype=int32, numpy=
 array([[1, 3, 5],
        [2, 4, 6]], dtype=int32)>,
 <tf.Tensor: shape=(2, 3), dtype=int32, numpy=
 array([[1, 2, 3],
        [4, 5, 6]], dtype=int32)>)

**The dot product**

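Matrix multiplication is often loosely referred to as the dot product, because each element of the result is the dot product of a row of the first matrix with a column of the second.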

You can perform matrix multiplication using:
* `tf.matmul()`
* `tf.tensordot()`

In [None]:
X, Y

(<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4],
        [5, 6]], dtype=int32)>,
 <tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[ 7,  8],
        [ 9, 10],
        [11, 12]], dtype=int32)>)

In [None]:
# Perform the dot product on X and Y (requires X or Y to be transposed)
tf.tensordot(tf.transpose(X), Y, axes=1)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]], dtype=int32)>

In [None]:
tf.transpose(X)

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 3, 5],
       [2, 4, 6]], dtype=int32)>

In [None]:
tf.reshape(X, shape=(2, 3))

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6]], dtype=int32)>

In [None]:
# Perform matrix multiplication between X and Y (transposed)
tf.matmul(X, tf.transpose(Y))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 23,  29,  35],
       [ 53,  67,  81],
       [ 83, 105, 127]], dtype=int32)>

In [None]:
# Perform matrix multiplication between X and Y (reshaped)
tf.matmul(X, tf.reshape(Y, shape=(2, 3)))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]], dtype=int32)>

In [None]:
# Check the values of Y, reshaped Y and transposed Y
print('Normal Y:')
print(Y, '\n')

print('Y reshaped to (2, 3):')
print(tf.reshape(Y, (2, 3)), '\n')

print('Y transposed:')
print(tf.transpose(Y))

Normal Y:
tf.Tensor(
[[ 7  8]
 [ 9 10]
 [11 12]], shape=(3, 2), dtype=int32) 

Y reshaped to (2, 3):
tf.Tensor(
[[ 7  8  9]
 [10 11 12]], shape=(2, 3), dtype=int32) 

Y transposed:
tf.Tensor(
[[ 7  9 11]
 [ 8 10 12]], shape=(2, 3), dtype=int32)


Generally, when performing matrix multiplication on two tensors and one of the axes doesn't line up, you will transpose (rather than reshape) one of the tensors to satisfy the matrix multiplication rules.
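As a small sketch of this point, `tf.matmul` also accepts `transpose_a` and `transpose_b` flags, so the transpose does not need to be written out separately. The variable names below are illustrative; `X` and `Y` repeat the values used earlier.

```python
import tensorflow as tf

X = tf.constant([[1, 2], [3, 4], [5, 6]])
Y = tf.constant([[7, 8], [9, 10], [11, 12]])

# Transposing Y explicitly and using the transpose_b flag give the same result
explicit = tf.matmul(X, tf.transpose(Y))
with_flag = tf.matmul(X, Y, transpose_b=True)

# Reshaping Y also satisfies the shape rules but pairs up different elements
reshaped = tf.matmul(X, tf.reshape(Y, (2, 3)))

print(bool(tf.reduce_all(explicit == with_flag)))  # True
print(bool(tf.reduce_all(explicit == reshaped)))   # False
```

Both approaches produce a valid `(3, 3)` result, but only the transpose keeps each row of `Y` intact, which is usually what the underlying maths requires.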

### Changing the datatype of a tensor

In [None]:
tf.__version__

'2.17.1'

In [None]:
# Create a new tensor with default datatype (float32)
B = tf.constant([1.7, 7.4])
B, B.dtype

(<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1.7, 7.4], dtype=float32)>,
 tf.float32)

In [None]:
C = tf.constant([7, 10])
C.dtype

tf.int32

In [None]:
# Change from float32 to float16 (reduced precision)
D = tf.cast(B, dtype=tf.float16)
D, D.dtype

(<tf.Tensor: shape=(2,), dtype=float16, numpy=array([1.7, 7.4], dtype=float16)>,
 tf.float16)

In [None]:
# Change from int32 to float32
E = tf.cast(C, dtype=tf.float32)
E, E.dtype

(<tf.Tensor: shape=(2,), dtype=float32, numpy=array([ 7., 10.], dtype=float32)>,
 tf.float32)

In [None]:
E_float16 = tf.cast(E, dtype=tf.float16)
E_float16

<tf.Tensor: shape=(2,), dtype=float16, numpy=array([ 7., 10.], dtype=float16)>
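Casting can also discard information. As a brief sketch, converting a float tensor to an integer type drops the fractional part of each value; the name `as_int` is illustrative.

```python
import tensorflow as tf

B = tf.constant([1.7, 7.4])

# Casting float32 to int32 drops the fractional part rather than rounding
as_int = tf.cast(B, dtype=tf.int32)

print(as_int.numpy())  # [1 7]
```

If rounding is what you want, apply `tf.round` before the cast.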

### Aggregating tensors

Aggregating tensors = condensing them from multiple values down to a smaller number of values.

In [None]:
# Create a new tensor D
D = tf.constant([-7, -10])
D

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([ -7, -10], dtype=int32)>

In [None]:
# Get the absolute values (take all the negative numbers in a tensor and turn them into positive numbers)
tf.abs(D)

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([ 7, 10], dtype=int32)>

Let's go through the following forms of aggregation:
* Get the minimum
* Get the maximum
* Get the mean of a tensor
* Get the sum of a tensor

TensorFlow uses the term "reduce" in its functions to indicate operations that aggregate values across one or more dimensions of a tensor, resulting in a tensor with a reduced number of dimensions.

In [None]:
# Practice, find the maximum, minimum, mean, and sum of a tensor
stats_tensor = tf.constant([[1, 2, 4],
                            [8, 16, 3],
                            [5, 10, 9]])
stats_tensor

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 1,  2,  4],
       [ 8, 16,  3],
       [ 5, 10,  9]], dtype=int32)>

In [None]:
# Find the maximum
tf.reduce_max(stats_tensor) # by default all dimensions are reduced (the max of every value); pass axis=1 to get the max of each row instead

<tf.Tensor: shape=(), dtype=int32, numpy=16>

In [None]:
# Find the minimum
tf.reduce_min(stats_tensor, keepdims=True) # keepdims=True retains the reduced dimensions with length 1, so the output keeps the rank of the input

<tf.Tensor: shape=(1, 1), dtype=int32, numpy=array([[1]], dtype=int32)>

In [None]:
# Find the mean (the input is int32, so the results are truncated to integers)
tf.reduce_mean(stats_tensor), tf.reduce_mean(stats_tensor, axis=1, keepdims=True)

(<tf.Tensor: shape=(), dtype=int32, numpy=6>,
 <tf.Tensor: shape=(3, 1), dtype=int32, numpy=
 array([[2],
        [9],
        [8]], dtype=int32)>)
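The overall mean of 6 is a truncated value: the elements sum to 58 and 58 / 9 is roughly 6.44. A minimal sketch of getting the fractional result by casting to a float type first (the names `int_mean` and `float_mean` are illustrative):

```python
import tensorflow as tf

stats_tensor = tf.constant([[1, 2, 4],
                            [8, 16, 3],
                            [5, 10, 9]])

# With an int32 input the mean is computed with integer division
int_mean = tf.reduce_mean(stats_tensor)

# Casting to float32 first keeps the fractional part
float_mean = tf.reduce_mean(tf.cast(stats_tensor, dtype=tf.float32))

print(int_mean.numpy(), float_mean.numpy())
```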

In [None]:
# Find the sum
tf.reduce_sum(stats_tensor), tf.reduce_sum(stats_tensor, axis=0, keepdims=True)

(<tf.Tensor: shape=(), dtype=int32, numpy=58>,
 <tf.Tensor: shape=(1, 3), dtype=int32, numpy=array([[14, 28, 16]], dtype=int32)>)

In [None]:
# Create a new tensor used in video for exercise
E = tf.constant(np.random.randint(0, 100, size=50))
E

<tf.Tensor: shape=(50,), dtype=int64, numpy=
array([65, 19, 25, 54, 77,  3, 35, 29, 93, 32, 71,  8, 54, 27, 16, 81, 24,
        3, 96, 59, 60, 46, 58, 57, 81, 47, 58, 91,  6, 97, 32, 24, 96, 21,
       40,  3, 21, 46, 12, 26, 66, 56, 33, 67, 88, 94, 84, 93, 77,  1])>

🛠 **Exercise:** With what we've just learned, find the variance and standard deviation of our `E` tensor using TensorFlow methods.

In [None]:
# Find the variance
E_float32 = tf.cast(E, dtype=tf.float32)
tf.math.reduce_variance(E_float32) # an int64 input causes an error, so the tensor is cast from int64 to float32 first

<tf.Tensor: shape=(), dtype=float32, numpy=882.47833>

In [None]:
# Can also use the tensorflow_probability library
import tensorflow_probability as tfp
tfp.stats.variance(E)

<tf.Tensor: shape=(), dtype=int64, numpy=882>

In [None]:
# Find the standard deviation
tf.math.reduce_std(E_float32)

<tf.Tensor: shape=(), dtype=float32, numpy=29.706537>
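To connect these functions back to the aggregation methods above, here is a sketch that computes the variance and standard deviation by hand with `tf.reduce_mean`. It uses a small made-up tensor `values` rather than `E`, so the numbers are easy to check.

```python
import tensorflow as tf

values = tf.constant([1., 2., 3., 4.])

# Variance is the mean of the squared deviations from the mean
mean = tf.reduce_mean(values)                        # 2.5
variance = tf.reduce_mean(tf.square(values - mean))  # 1.25

# Standard deviation is the square root of the variance
std = tf.sqrt(variance)

# The built-in functions should agree with the manual calculation
print(variance.numpy(), tf.math.reduce_variance(values).numpy())
print(std.numpy(), tf.math.reduce_std(values).numpy())
```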

So we can use either `tfp` (the separate `tensorflow_probability` library) or TensorFlow's built-in `tf.math` module. `tfp` is advantageous for probabilistic modeling and statistical analysis, while `tf.math` is sufficient for basic mathematical computations.

Choose `tfp` if you need to:
* Model uncertainty
* Perform Bayesian inference
* Work with probability distributions
* Implement advanced statistical methods

Stick with `tf.math` if you only need:
* Basic mathematical operations
* Deterministic computations
* Standard neural network architectures without probabilistic elements

### Find the positional maximum and minimum

The "positional maximum" of a tensor is the index (or set of indices) of the element with the highest value. It tells you where the maximum value is located within the tensor's structure, not the maximum value itself.

The "positional minimum" similarly returns the index (or indices) of the minimum value of a tensor.
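For tensors with more than one dimension, `tf.argmax` and `tf.argmin` take an `axis` argument that controls which dimension is searched. A short sketch with a made-up `scores` matrix:

```python
import tensorflow as tf

scores = tf.constant([[0.1, 0.7, 0.2],
                      [0.5, 0.3, 0.9]])

# axis=1: position of the largest value within each row
per_row = tf.argmax(scores, axis=1)

# axis=0: position of the largest value within each column
per_column = tf.argmax(scores, axis=0)

print(per_row.numpy())     # [1 2]
print(per_column.numpy())  # [1 0 1]
```

The `axis=1` form is the one typically used to turn a row of class probabilities into a predicted class index.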

In [None]:
# Positional maximum (exercise)
tf.argmax(E)

<tf.Tensor: shape=(), dtype=int64, numpy=29>

In [None]:
E[11]

<tf.Tensor: shape=(), dtype=int64, numpy=8>

In [None]:
# Positional minimum (exercise)
tf.argmin(E)

<tf.Tensor: shape=(), dtype=int64, numpy=49>

In [None]:
E[41]

<tf.Tensor: shape=(), dtype=int64, numpy=56>

In [None]:
# Create a new tensor for finding positional maximum and minimum (back to video)
tf.random.set_seed(42)
F = tf.random.uniform(shape=[50])
F

<tf.Tensor: shape=(50,), dtype=float32, numpy=
array([0.6645621 , 0.44100678, 0.3528825 , 0.46448255, 0.03366041,
       0.68467236, 0.74011743, 0.8724445 , 0.22632635, 0.22319686,
       0.3103881 , 0.7223358 , 0.13318717, 0.5480639 , 0.5746088 ,
       0.8996835 , 0.00946367, 0.5212307 , 0.6345445 , 0.1993283 ,
       0.72942245, 0.54583454, 0.10756552, 0.6767061 , 0.6602763 ,
       0.33695042, 0.60141766, 0.21062577, 0.8527372 , 0.44062173,
       0.9485276 , 0.23752594, 0.81179297, 0.5263394 , 0.494308  ,
       0.21612847, 0.8457197 , 0.8718841 , 0.3083862 , 0.6868038 ,
       0.23764038, 0.7817228 , 0.9671384 , 0.06870162, 0.79873943,
       0.66028714, 0.5871513 , 0.16461694, 0.7381023 , 0.32054043],
      dtype=float32)>

In [None]:
# Find the positional maximum
tf.argmax(F)

<tf.Tensor: shape=(), dtype=int64, numpy=42>

In [None]:
# Index on our largest value position
F[tf.argmax(F)]

<tf.Tensor: shape=(), dtype=float32, numpy=0.9671384>

In [None]:
# Find the max value of F
tf.reduce_max(F)

<tf.Tensor: shape=(), dtype=float32, numpy=0.9671384>

In [None]:
# Check for equality
F[tf.argmax(F)] == tf.reduce_max(F)

<tf.Tensor: shape=(), dtype=bool, numpy=True>

In [None]:
# Find the positional minimum
tf.argmin(F)

<tf.Tensor: shape=(), dtype=int64, numpy=16>

In [None]:
# Find the minimum using the positional minimum index
F[tf.argmin(F)]

<tf.Tensor: shape=(), dtype=float32, numpy=0.009463668>

### Squeezing a tensor (removing all single dimensions)

`tf.squeeze` removes all dimensions of size one from a tensor.

In [None]:
# Create a tensor to get started
tf.random.set_seed(42)
G = tf.constant(tf.random.uniform(shape=[50]), shape=(1, 1, 1, 1, 50))
G

<tf.Tensor: shape=(1, 1, 1, 1, 50), dtype=float32, numpy=
array([[[[[0.6645621 , 0.44100678, 0.3528825 , 0.46448255, 0.03366041,
           0.68467236, 0.74011743, 0.8724445 , 0.22632635, 0.22319686,
           0.3103881 , 0.7223358 , 0.13318717, 0.5480639 , 0.5746088 ,
           0.8996835 , 0.00946367, 0.5212307 , 0.6345445 , 0.1993283 ,
           0.72942245, 0.54583454, 0.10756552, 0.6767061 , 0.6602763 ,
           0.33695042, 0.60141766, 0.21062577, 0.8527372 , 0.44062173,
           0.9485276 , 0.23752594, 0.81179297, 0.5263394 , 0.494308  ,
           0.21612847, 0.8457197 , 0.8718841 , 0.3083862 , 0.6868038 ,
           0.23764038, 0.7817228 , 0.9671384 , 0.06870162, 0.79873943,
           0.66028714, 0.5871513 , 0.16461694, 0.7381023 , 0.32054043]]]]],
      dtype=float32)>

In [None]:
G.shape

TensorShape([1, 1, 1, 1, 50])

In [None]:
G_squeezed = tf.squeeze(G)
G_squeezed, G_squeezed.shape

(<tf.Tensor: shape=(50,), dtype=float32, numpy=
 array([0.6645621 , 0.44100678, 0.3528825 , 0.46448255, 0.03366041,
        0.68467236, 0.74011743, 0.8724445 , 0.22632635, 0.22319686,
        0.3103881 , 0.7223358 , 0.13318717, 0.5480639 , 0.5746088 ,
        0.8996835 , 0.00946367, 0.5212307 , 0.6345445 , 0.1993283 ,
        0.72942245, 0.54583454, 0.10756552, 0.6767061 , 0.6602763 ,
        0.33695042, 0.60141766, 0.21062577, 0.8527372 , 0.44062173,
        0.9485276 , 0.23752594, 0.81179297, 0.5263394 , 0.494308  ,
        0.21612847, 0.8457197 , 0.8718841 , 0.3083862 , 0.6868038 ,
        0.23764038, 0.7817228 , 0.9671384 , 0.06870162, 0.79873943,
        0.66028714, 0.5871513 , 0.16461694, 0.7381023 , 0.32054043],
       dtype=float32)>,
 TensorShape([50]))
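
`tf.squeeze` also accepts an `axis` argument for removing only selected size-one dimensions, and `tf.expand_dims` adds one back. A short sketch, assuming a made-up tensor of zeros called `x`:

```python
import tensorflow as tf

# Hypothetical tensor with two size-one dimensions
x = tf.zeros(shape=(1, 3, 1, 2))

# Remove every size-one dimension
all_squeezed = tf.squeeze(x)

# Remove only the first dimension and keep the other size-one dimension
first_only = tf.squeeze(x, axis=0)

# Add the removed dimension back at position 0
restored = tf.expand_dims(first_only, axis=0)

print(all_squeezed.shape, first_only.shape, restored.shape)
```

Restricting the squeeze to one axis can be useful when a size-one dimension carries meaning, such as a batch of one image.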

### One-hot encoding tensors

One-hot encoding is a common preprocessing technique for categorical data. Each category is represented as a binary vector in which “hot” (one) marks the category that is present and “cold” (zero) marks every category that is absent.

In [None]:
# Create a list of indices
some_list = [0, 1, 2, 3] # could be red, green, blue, purple

# One hot encode our list of indices
tf.one_hot(some_list, depth=4)

# Each row has length `depth` (4 here) and has a 1 at the position given by the corresponding index in the list
# So the first row is [1., 0., 0., 0.] because the first value in the list is index 0
# Likewise, the third row is [0., 0., 1., 0.] because the third value in the list is index 2

<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]], dtype=float32)>
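
To see what one-hot encoding does without TensorFlow, here is a rough NumPy sketch. It is not how `tf.one_hot` is implemented; it only reproduces the idea. The `colors` list and the `indices` values are assumptions chosen for illustration.

```python
import numpy as np

colors = ["red", "green", "blue", "purple"]   # hypothetical category names
indices = np.array([2, 0, 3])                 # one category index per sample

# Row i of the identity matrix is the one-hot vector for category i
one_hot = np.eye(len(colors), dtype=np.float32)[indices]

# argmax along each row recovers the original indices
recovered = one_hot.argmax(axis=1)

print(one_hot)
print([colors[i] for i in recovered])
```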

In [None]:
# Specify custom values for one hot encoding
tf.one_hot(some_list, depth=4, on_value='yo I love deep learning', off_value='I also like to dance')

<tf.Tensor: shape=(4, 4), dtype=string, numpy=
array([[b'yo I love deep learning', b'I also like to dance',
        b'I also like to dance', b'I also like to dance'],
       [b'I also like to dance', b'yo I love deep learning',
        b'I also like to dance', b'I also like to dance'],
       [b'I also like to dance', b'I also like to dance',
        b'yo I love deep learning', b'I also like to dance'],
       [b'I also like to dance', b'I also like to dance',
        b'I also like to dance', b'yo I love deep learning']],
      dtype=object)>

In [None]:
# Try different on values and off values
tf.one_hot(some_list, depth=4, on_value=100.0, off_value=50.0)

<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
array([[100.,  50.,  50.,  50.],
       [ 50., 100.,  50.,  50.],
       [ 50.,  50., 100.,  50.],
       [ 50.,  50.,  50., 100.]], dtype=float32)>

In [None]:
# Index first vector (row)
tf.one_hot(some_list, depth=4, on_value=100.0, off_value=50.0)[0]

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([100.,  50.,  50.,  50.], dtype=float32)>

In [None]:
# Find position of on values
for i in tf.one_hot(some_list, depth=4, on_value=100.0, off_value=50.0):
  print(tf.argmax(i))

tf.Tensor(0, shape=(), dtype=int64)
tf.Tensor(1, shape=(), dtype=int64)
tf.Tensor(2, shape=(), dtype=int64)
tf.Tensor(3, shape=(), dtype=int64)


In [None]:
# Change the depth parameter
my_some_list = [0, 1, 2, 3, 4, 5, 6, 7]
tf.one_hot(my_some_list, depth=16) # number of columns = depth

# depth sets the number of columns; what matters is each index value, not the length of the list
# If an index is greater than or equal to depth, its row contains only the off value
# If depth is greater than the largest index, the extra columns contain only the off value

<tf.Tensor: shape=(8, 16), dtype=float32, numpy=
array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.]],
      dtype=float32)>

📖 **Resource:** website with machine learning resources: https://machinelearningmastery.com/one-hot-encoding-understanding-the-hot-in-data/

### Squaring, log, square root

In [None]:
H = tf.range(1, 10)
H

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)>

In [None]:
# Square it
tf.square(H)

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([ 1,  4,  9, 16, 25, 36, 49, 64, 81], dtype=int32)>

In [None]:
# Find the square root
tf.sqrt(tf.cast(H, dtype=tf.float32)) # tf.sqrt requires a floating-point dtype; an int tensor raises an error


<tf.Tensor: shape=(9,), dtype=float32, numpy=
array([1.       , 1.4142135, 1.7320508, 2.       , 2.236068 , 2.4494898,
       2.6457512, 2.828427 , 3.       ], dtype=float32)>

In [None]:
# Find the log
tf.math.log(tf.cast(H, dtype=tf.float32)) # tf.math.log requires a floating-point dtype; an int tensor raises an error

<tf.Tensor: shape=(9,), dtype=float32, numpy=
array([0.       , 0.6931472, 1.0986123, 1.3862944, 1.609438 , 1.7917595,
       1.9459102, 2.0794415, 2.1972246], dtype=float32)>

🛠 **Exercise:** Find three other math operations in TensorFlow documentation and try executing them.

* `tf.math.approx_max_k`: returns, in an approximate manner, the k largest values of a tensor and their corresponding indices.
* `tf.math.approx_min_k`: returns, in an approximate manner, the k smallest values of a tensor and their corresponding indices. *Not working in this notebook: see the `NotFoundError` below, where the regular CPU and GPU kernels are registered only for `is_max_k=true`.*
* `tf.math.less_equal`: returns the truth value of (x <= y) element-wise.

In [None]:
# Create a new tensor for exercise
tf.random.set_seed(50)
G_exercise = tf.random.uniform(shape=[20])
G_exercise

<tf.Tensor: shape=(20,), dtype=float32, numpy=
array([0.94196117, 0.1697309 , 0.5064962 , 0.27598453, 0.899318  ,
       0.4486513 , 0.43584716, 0.71771336, 0.85751176, 0.12559032,
       0.0875535 , 0.85261035, 0.17349517, 0.31913245, 0.4845121 ,
       0.5540675 , 0.6620537 , 0.20845115, 0.43225408, 0.9061552 ],
      dtype=float32)>

In [None]:
# Find the k largest values and their given indices
tf.math.approx_max_k(G_exercise, 2)

ApproxTopK(values=<tf.Tensor: shape=(2,), dtype=float32, numpy=array([0.94196117, 0.9061552 ], dtype=float32)>, indices=<tf.Tensor: shape=(2,), dtype=int32, numpy=array([ 0, 19], dtype=int32)>)

In [None]:
# Find the k smallest values and their given indices
tf.math.approx_min_k(tf.cast(G_exercise, dtype=tf.float16), 2)

'''
The following workaround suggested by Colab also does not work.

@tf.function
def find_min_k(tensor, k):
  return tf.math.approx_min_k(tensor, k)

# Find the k smallest values and their given indices
min_values, min_indices = find_min_k(G_exercise, 2)
print(min_values, min_indices)
'''

NotFoundError: Could not find device for node: {{node ApproxTopK}} = ApproxTopK[T=DT_HALF, aggregate_to_topk=true, is_max_k=false, k=2, recall_target=0.95, reduction_dimension=-1, reduction_input_size_override=-1]
All kernels registered for op ApproxTopK:
  device='XLA_CPU_JIT'; T in [DT_FLOAT, DT_BFLOAT16, DT_HALF]
  device='XLA_GPU_JIT'; T in [DT_FLOAT, DT_BFLOAT16, DT_HALF]
  device='CPU'; T in [DT_DOUBLE]; reduction_dimension in [-1]; is_max_k in [true]
  device='CPU'; T in [DT_FLOAT]; reduction_dimension in [-1]; is_max_k in [true]
  device='CPU'; T in [DT_BFLOAT16]; reduction_dimension in [-1]; is_max_k in [true]
  device='CPU'; T in [DT_HALF]; reduction_dimension in [-1]; is_max_k in [true]
  device='CPU'; T in [DT_INT32]; reduction_dimension in [-1]; is_max_k in [true]
  device='CPU'; T in [DT_INT8]; reduction_dimension in [-1]; is_max_k in [true]
  device='CPU'; T in [DT_UINT8]; reduction_dimension in [-1]; is_max_k in [true]
  device='CPU'; T in [DT_INT16]; reduction_dimension in [-1]; is_max_k in [true]
  device='CPU'; T in [DT_UINT16]; reduction_dimension in [-1]; is_max_k in [true]
  device='CPU'; T in [DT_UINT32]; reduction_dimension in [-1]; is_max_k in [true]
  device='CPU'; T in [DT_INT64]; reduction_dimension in [-1]; is_max_k in [true]
  device='CPU'; T in [DT_UINT64]; reduction_dimension in [-1]; is_max_k in [true]
  device='GPU'; T in [DT_DOUBLE]; reduction_dimension in [-1]; is_max_k in [true]
  device='GPU'; T in [DT_FLOAT]; reduction_dimension in [-1]; is_max_k in [true]
  device='GPU'; T in [DT_BFLOAT16]; reduction_dimension in [-1]; is_max_k in [true]
  device='GPU'; T in [DT_HALF]; reduction_dimension in [-1]; is_max_k in [true]
 [Op:ApproxTopK] name: 
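
One way around the error is to swap in a different, exact operation: negate the tensor and take the largest values with `tf.math.top_k`. This is not a fix for `approx_min_k` itself, only a sketch of an alternative. The `values` tensor below is made-up example data.

```python
import tensorflow as tf

# Hypothetical example values
values = tf.constant([0.94, 0.17, 0.51, 0.09, 0.90])

# The k largest entries of -values are the k smallest entries of values
negated_top = tf.math.top_k(-values, k=2)

smallest_values = -negated_top.values     # flip the sign back
smallest_indices = negated_top.indices    # positions in the original tensor

print(smallest_values.numpy(), smallest_indices.numpy())
```

Because `tf.math.top_k` is exact, it may be slower than an approximate op on very large tensors, but for a tensor of 20 values the difference should not matter.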

In [None]:
# Find which elements of G_exercise are less than or equal to 0.5
G_exercise, tf.math.less_equal(G_exercise, [0.5])

(<tf.Tensor: shape=(20,), dtype=float32, numpy=
 array([0.94196117, 0.1697309 , 0.5064962 , 0.27598453, 0.899318  ,
        0.4486513 , 0.43584716, 0.71771336, 0.85751176, 0.12559032,
        0.0875535 , 0.85261035, 0.17349517, 0.31913245, 0.4845121 ,
        0.5540675 , 0.6620537 , 0.20845115, 0.43225408, 0.9061552 ],
       dtype=float32)>,
 <tf.Tensor: shape=(20,), dtype=bool, numpy=
 array([False,  True, False,  True, False,  True,  True, False, False,
         True,  True, False,  True,  True,  True, False, False,  True,
         True, False])>)

### Tensors and NumPy

TensorFlow interoperates closely with NumPy: tensors can be created from NumPy arrays and converted back to them.

In [None]:
# Create a tensor directly from a NumPy array
import numpy as np
J = tf.constant(np.array([3., 7., 10.]))
J

<tf.Tensor: shape=(3,), dtype=float64, numpy=array([ 3.,  7., 10.])>

In [None]:
# Convert our tensor back to a NumPy array
np.array(J), type(np.array(J))

(array([ 3.,  7., 10.]), numpy.ndarray)

In [None]:
# Convert tensor J to a NumPy array
J.numpy(), type(J.numpy())

(array([ 3.,  7., 10.]), numpy.ndarray)

In [None]:
# The default types of each are slightly different
numpy_J = tf.constant(np.array([3., 7., 10.]))
tensor_J = tf.constant([3., 7., 10.])
# Check the data types of each
numpy_J.dtype, tensor_J.dtype

# NumPy float arrays default to float64, while TensorFlow float tensors default to float32

(tf.float64, tf.float32)
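
If the float64 dtype inherited from NumPy is not wanted, the tensor can be created or cast as float32. A minimal sketch, with `numpy_values` as an assumed example array:

```python
import numpy as np
import tensorflow as tf

numpy_values = np.array([3., 7., 10.])   # float64 by default

# The tensor inherits float64 from the NumPy array
from_numpy = tf.constant(numpy_values)

# Option 1: request float32 when creating the tensor
as_float32 = tf.constant(numpy_values, dtype=tf.float32)

# Option 2: cast an existing tensor
cast_later = tf.cast(from_numpy, dtype=tf.float32)

print(from_numpy.dtype, as_float32.dtype, cast_later.dtype)
```

Keeping dtypes consistent matters because many TensorFlow ops refuse to mix float32 and float64 inputs.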

## 🛠 Exercises & 📖 Extra-curriculum

You can find solutions to the exercises in extras/solutions/.

Github link: https://github.com/mrdbourke/tensorflow-deep-learning#-00-tensorflow-fundamentals-exercises

### 🛠 00. TensorFlow Fundamentals Exercises

#### Exercise 1: Create a vector, scalar, matrix and tensor with values of your choosing using tf.constant().

In [None]:
# Create a vector, scalar, matrix and tensor with values of your choosing using tf.constant()
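exercise_scalar = tf.constant([3.]) # note: the brackets make this a rank-1 tensor of shape (1,); a true scalar would be tf.constant(3.)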
exercise_vector = tf.constant([0, 1, 2, 3, 4, 5])
exercise_matrix = tf.constant([[1, 2, 3],
                               [4, 5, 6],
                               [7, 8, 9]])
exercise_tensor = tf.constant([2, 34, 8, 76, 14, 34, 56, 23, 78, 3, 1, 76, 34, 56, 98, 17, 14, 37], shape=(1, 9, 2))

exercise_scalar, exercise_vector, exercise_matrix, exercise_tensor

(<tf.Tensor: shape=(1,), dtype=float32, numpy=array([3.], dtype=float32)>,
 <tf.Tensor: shape=(6,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5], dtype=int32)>,
 <tf.Tensor: shape=(3, 3), dtype=int32, numpy=
 array([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]], dtype=int32)>,
 <tf.Tensor: shape=(1, 9, 2), dtype=int32, numpy=
 array([[[ 2, 34],
         [ 8, 76],
         [14, 34],
         [56, 23],
         [78,  3],
         [ 1, 76],
         [34, 56],
         [98, 17],
         [14, 37]]], dtype=int32)>)

#### Exercise 2: Find the shape, rank and size of the tensors you created in 1.

In [None]:
# Find the shape, rank and size of the tensors you created in 1
print('exercise_scalar shape:', exercise_scalar.shape)
print('exercise_scalar rank (dimensions):', exercise_scalar.ndim)
print('exercise_scalar size:', tf.size(exercise_scalar.numpy())) # tf.size returns the number of elements as a tensor; it also accepts the tensor directly, so .numpy() is optional

print('\nexercise_vector shape:', exercise_vector.shape)
print('exercise_vector rank (dimensions):', exercise_vector.ndim)
print('exercise_vector size:', tf.size(exercise_vector.numpy()))

print('\nexercise_matrix shape:', exercise_matrix.shape)
print('exercise_matrix rank (dimensions):', exercise_matrix.ndim)
print('exercise_matrix size:', tf.size(exercise_matrix.numpy()))

print('\nexercise_tensor shape:', exercise_tensor.shape)
print('exercise_tensor rank (dimensions):', exercise_tensor.ndim)
print('exercise_tensor size:', tf.size(exercise_tensor.numpy()))

exercise_scalar shape: (1,)
exercise_scalar rank (dimensions): 1
exercise_scalar size: tf.Tensor(1, shape=(), dtype=int32)

exercise_vector shape: (6,)
exercise_vector rank (dimensions): 1
exercise_vector size: tf.Tensor(6, shape=(), dtype=int32)

exercise_matrix shape: (3, 3)
exercise_matrix rank (dimensions): 2
exercise_matrix size: tf.Tensor(9, shape=(), dtype=int32)

exercise_tensor shape: (1, 9, 2)
exercise_tensor rank (dimensions): 3
exercise_tensor size: tf.Tensor(18, shape=(), dtype=int32)
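
The output above reports rank 1 for `exercise_scalar` because `tf.constant([3.])` is a one-element vector. The sketch below contrasts it with a true scalar and shows how to get plain Python numbers out of `tf.rank` and `tf.size`. The tensor names are chosen for illustration only.

```python
import tensorflow as tf

true_scalar = tf.constant(3.)          # no brackets: rank 0, shape ()
one_element_vector = tf.constant([3.]) # brackets: rank 1, shape (1,)
rank_3 = tf.zeros(shape=(1, 9, 2))     # same shape as exercise_tensor

for name, t in [("true_scalar", true_scalar),
                ("one_element_vector", one_element_vector),
                ("rank_3", rank_3)]:
    # .numpy() turns the scalar tensors returned by tf.rank and tf.size into plain numbers
    print(name, t.shape, tf.rank(t).numpy(), tf.size(t).numpy())
```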


#### Exercise 3: Create two tensors containing random values between 0 and 1 with shape [5, 300].

In [None]:
# Create two tensors containing random values between 0 and 1 with shape [5, 300]

# first tensor
tf.random.set_seed(14)
generator_1 = tf.random.Generator.from_seed(14) # seeded generator for reproducibility
ex_random_tensor1 = generator_1.uniform(shape=(5, 300))
ex_random_tensor1

<tf.Tensor: shape=(5, 300), dtype=float32, numpy=
array([[0.6457157 , 0.16484237, 0.4484123 , ..., 0.89660215, 0.9323058 ,
        0.01358628],
       [0.7463044 , 0.3287711 , 0.89990544, ..., 0.40853846, 0.6575657 ,
        0.51386154],
       [0.18530178, 0.14207125, 0.03463948, ..., 0.23896313, 0.10053349,
        0.21235132],
       [0.8799336 , 0.6545799 , 0.3623494 , ..., 0.57640624, 0.27010262,
        0.34334147],
       [0.2932942 , 0.66050565, 0.62351525, ..., 0.8472601 , 0.87572944,
        0.37801242]], dtype=float32)>

In [None]:
# Check min and max values are between 0 and 1
tf.reduce_max(ex_random_tensor1), tf.reduce_min(ex_random_tensor1)

(<tf.Tensor: shape=(), dtype=float32, numpy=0.9981767>,
 <tf.Tensor: shape=(), dtype=float32, numpy=0.0003619194>)
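
A quick way to check that a seeded `tf.random.Generator` is reproducible is to build two generators from the same seed and compare their draws. This is a sketch; the names `gen_a` and `gen_b` are made up.

```python
import tensorflow as tf

# Two independent generators created from the same seed
gen_a = tf.random.Generator.from_seed(14)
gen_b = tf.random.Generator.from_seed(14)

first_a = gen_a.uniform(shape=(5, 300))
first_b = gen_b.uniform(shape=(5, 300))

# A second draw from the same generator continues the stream, so it differs
second_a = gen_a.uniform(shape=(5, 300))

same_seed_matches = bool(tf.reduce_all(first_a == first_b))
stream_advances = not bool(tf.reduce_all(first_a == second_a))
print(same_seed_matches, stream_advances)
```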

In [None]:
# second tensor
tf.random.set_seed(15)
generator_2 = tf.random.Generator.from_seed(15) # seeded generator for reproducibility
ex_random_tensor2 = generator_2.uniform(shape=(5, 300))
ex_random_tensor2

<tf.Tensor: shape=(5, 300), dtype=float32, numpy=
array([[0.816115  , 0.4129653 , 0.2632984 , ..., 0.3287711 , 0.89990544,
        0.80902195],
       [0.75954664, 0.20455527, 0.80811715, ..., 0.14207125, 0.03463948,
        0.20632088],
       [0.37123632, 0.9250525 , 0.91835976, ..., 0.6545799 , 0.3623494 ,
        0.11630154],
       [0.02891374, 0.87833166, 0.6165559 , ..., 0.66050565, 0.62351525,
        0.376606  ],
       [0.83196616, 0.86020887, 0.6980419 , ..., 0.65553236, 0.96398544,
        0.4293959 ]], dtype=float32)>

In [None]:
# Check min and max values are between 0 and 1
tf.reduce_max(ex_random_tensor2), tf.reduce_min(ex_random_tensor2)

(<tf.Tensor: shape=(), dtype=float32, numpy=0.9981767>,
 <tf.Tensor: shape=(), dtype=float32, numpy=0.0003619194>)

In [None]:
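# third tensor (built from NumPy random values)
tf.random.set_seed(16) # note: this does not seed NumPy's generator; use np.random.seed() for reproducible NumPy draws
ex_random_tensor3 = tf.constant(np.random.uniform(0.0, 1.0, size=1500), shape=(5, 300))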
ex_random_tensor3

<tf.Tensor: shape=(5, 300), dtype=float64, numpy=
array([[0.11207952, 0.28402688, 0.42875933, ..., 0.09918831, 0.09316682,
        0.69140493],
       [0.00143584, 0.16199708, 0.53267193, ..., 0.62370795, 0.16278485,
        0.00587014],
       [0.40430532, 0.92693388, 0.02882474, ..., 0.27297877, 0.21234375,
        0.01201918],
       [0.07835266, 0.43209584, 0.97844994, ..., 0.08054464, 0.99846651,
        0.904399  ],
       [0.95731477, 0.81367187, 0.03274807, ..., 0.20716142, 0.06783502,
        0.07616366]])>

In [None]:
tf.reduce_max(ex_random_tensor3), tf.reduce_min(ex_random_tensor3)

(<tf.Tensor: shape=(), dtype=float64, numpy=0.999634067471954>,
 <tf.Tensor: shape=(), dtype=float64, numpy=0.0003477792969935889>)

In [None]:
# Another way to do this
tf.random.set_seed(18)

review_tensor_1 = tf.random.uniform(shape=(5, 300))
review_tensor_2 = tf.random.uniform(shape=(5, 300))

review_tensor_1, review_tensor_1.shape, review_tensor_2, review_tensor_2.shape

(<tf.Tensor: shape=(5, 300), dtype=float32, numpy=
 array([[0.99482024, 0.423509  , 0.23757601, ..., 0.9008151 , 0.37804234,
         0.7523302 ],
        [0.71978104, 0.3487463 , 0.5362587 , ..., 0.502463  , 0.80428815,
         0.13641596],
        [0.6298022 , 0.44815707, 0.3548994 , ..., 0.9418329 , 0.85317564,
         0.19250667],
        [0.93190825, 0.9386103 , 0.9973055 , ..., 0.37180674, 0.6324371 ,
         0.6770823 ],
        [0.7946948 , 0.06712067, 0.9683863 , ..., 0.7479322 , 0.50673914,
         0.17488968]], dtype=float32)>,
 TensorShape([5, 300]),
 <tf.Tensor: shape=(5, 300), dtype=float32, numpy=
 array([[0.07182574, 0.89430106, 0.43877077, ..., 0.31279123, 0.26463306,
         0.8546907 ],
        [0.92182326, 0.9383422 , 0.00460994, ..., 0.37463796, 0.53461885,
         0.24716055],
        [0.7638447 , 0.4456128 , 0.8371905 , ..., 0.63126206, 0.5899426 ,
         0.1716013 ],
        [0.9511194 , 0.14007449, 0.16739333, ..., 0.5857521 , 0.3355477 ,
         0.210

#### Exercise 4: Multiply the two tensors you created in 3 using matrix multiplication.

In [None]:
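# Note: tf.multiply performs element-wise multiplication, not matrix multiplication
# A true matrix product of two [5, 300] tensors needs one of them transposed (see the tf.matmul cell in Exercise 5)
tf.multiply(ex_random_tensor1, ex_random_tensor2)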

<tf.Tensor: shape=(5, 300), dtype=float32, numpy=
array([[0.5269783 , 0.06807417, 0.11806624, ..., 0.2947769 , 0.83898705,
        0.0109916 ],
       [0.566853  , 0.06725187, 0.727229  , ..., 0.05804157, 0.02277773,
        0.10602037],
       [0.06879075, 0.13142337, 0.0318115 , ..., 0.15642045, 0.03642825,
        0.02469679],
       [0.02544217, 0.57493824, 0.22340867, ..., 0.38071957, 0.1684131 ,
        0.12930445],
       [0.24401084, 0.5681728 , 0.4352398 , ..., 0.55540645, 0.8441904 ,
        0.162317  ]], dtype=float32)>
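
The result above has shape (5, 300) because `tf.multiply` works element by element. The sketch below contrasts it with `tf.matmul` on two small hand-written matrices (made-up values), using the `transpose_b` argument so the inner dimensions line up.

```python
import tensorflow as tf

# Two hypothetical 2 x 3 matrices
a = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])
b = tf.constant([[1., 0., 1.],
                 [2., 1., 0.]])

# Element-wise product: same shape as the inputs, (2, 3)
elementwise = tf.multiply(a, b)

# Matrix product a @ b^T: (2, 3) x (3, 2) -> (2, 2)
matrix_product = tf.matmul(a, b, transpose_b=True)

print(elementwise.numpy())
print(matrix_product.numpy())
```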

#### Exercise 5: Multiply the two tensors you created in 3 using dot product.

In [None]:
# Multiply the two tensors you created in 3 using dot product
tf.tensordot(ex_random_tensor1, tf.transpose(ex_random_tensor2), axes=1)

<tf.Tensor: shape=(5, 5), dtype=float32, numpy=
array([[71.91752 , 75.44386 , 70.5459  , 71.81357 , 75.62357 ],
       [72.48187 , 79.91962 , 77.637665, 73.77139 , 77.765564],
       [72.287766, 75.62342 , 70.23537 , 70.703835, 74.09948 ],
       [69.014206, 73.79578 , 69.22356 , 71.17187 , 76.16797 ],
       [74.12229 , 75.7308  , 74.37825 , 75.789085, 76.8095  ]],
      dtype=float32)>

In [None]:
# Check dimensions
ex_random_tensor1.shape, tf.transpose(ex_random_tensor2).shape, tf.tensordot(ex_random_tensor1, tf.transpose(ex_random_tensor2), axes=1).shape

(TensorShape([5, 300]), TensorShape([300, 5]), TensorShape([5, 5]))

In [None]:
# Multiply the two tensors you created in 3 using matmul
tf.matmul(ex_random_tensor1, tf.transpose(ex_random_tensor2))

<tf.Tensor: shape=(5, 5), dtype=float32, numpy=
array([[71.91752 , 75.44386 , 70.5459  , 71.81357 , 75.62357 ],
       [72.48187 , 79.91962 , 77.637665, 73.77139 , 77.765564],
       [72.287766, 75.62342 , 70.23537 , 70.703835, 74.09948 ],
       [69.014206, 73.79578 , 69.22356 , 71.17187 , 76.16797 ],
       [74.12229 , 75.7308  , 74.37825 , 75.789085, 76.8095  ]],
      dtype=float32)>

In [None]:
# Check dimensions
ex_random_tensor1.shape, tf.transpose(ex_random_tensor2).shape, tf.matmul(ex_random_tensor1, tf.transpose(ex_random_tensor2)).shape

(TensorShape([5, 300]), TensorShape([300, 5]), TensorShape([5, 5]))

In [None]:
# Check that multiplied matrices are equal
tf.tensordot(ex_random_tensor1, tf.transpose(ex_random_tensor2), axes=1) == tf.matmul(ex_random_tensor1, tf.transpose(ex_random_tensor2))

<tf.Tensor: shape=(5, 5), dtype=bool, numpy=
array([[ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True]])>
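
Exact `==` on floating-point results happens to hold here, but two mathematically equal computations can differ by rounding. A more forgiving check compares within a tolerance. The sketch below uses freshly generated tensors and tolerance values picked for illustration, not the tensors from Exercise 3.

```python
import tensorflow as tf

gen = tf.random.Generator.from_seed(0)   # hypothetical seed
a = gen.uniform(shape=(5, 300))
b = gen.uniform(shape=(5, 300))

via_tensordot = tf.tensordot(a, tf.transpose(b), axes=1)
via_matmul = tf.matmul(a, b, transpose_b=True)

# Raises an error if the two results differ by more than the tolerance
tf.debugging.assert_near(via_tensordot, via_matmul, rtol=1e-4)

# The largest absolute difference, as a plain Python float
max_abs_diff = float(tf.reduce_max(tf.abs(via_tensordot - via_matmul)))
print(max_abs_diff)
```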

#### Exercise 6: Create a tensor with random values between 0 and 1 with shape [224, 224, 3].

In [None]:
# Create a tensor with random values between 0 and 1 with shape [224, 224, 3]
tf.random.set_seed(14)
generator_4 = tf.random.Generator.from_seed(14) # seeded generator for reproducibility
random_tensor_4 = generator_4.uniform(shape=(224, 224, 3))
random_tensor_4

<tf.Tensor: shape=(224, 224, 3), dtype=float32, numpy=
array([[[0.6457157 , 0.16484237, 0.4484123 ],
        [0.6057888 , 0.816115  , 0.4129653 ],
        [0.2632984 , 0.19087589, 0.7631861 ],
        ...,
        [0.46352363, 0.5257573 , 0.9382287 ],
        [0.6497872 , 0.18516552, 0.79773116],
        [0.36611784, 0.08979952, 0.4708724 ]],

       [[0.5742779 , 0.5244585 , 0.47529185],
        [0.44046998, 0.93211055, 0.47112358],
        [0.737151  , 0.2978208 , 0.35761952],
        ...,
        [0.68137383, 0.4698242 , 0.9976375 ],
        [0.27073598, 0.30773306, 0.90979517],
        [0.4774183 , 0.38195646, 0.50485706]],

       [[0.44287014, 0.7514242 , 0.1499486 ],
        [0.02811372, 0.24711013, 0.3171805 ],
        [0.2562983 , 0.98411655, 0.5639552 ],
        ...,
        [0.87723756, 0.30829275, 0.7989719 ],
        [0.60742605, 0.69676137, 0.83041453],
        [0.7720957 , 0.44206798, 0.53568316]],

       ...,

       [[0.33557928, 0.11696231, 0.12967658],
        [0.03

In [None]:
# Check min, max, and shape
tf.reduce_min(random_tensor_4), tf.reduce_max(random_tensor_4), random_tensor_4.shape

(<tf.Tensor: shape=(), dtype=float32, numpy=4.053116e-06>,
 <tf.Tensor: shape=(), dtype=float32, numpy=0.99998736>,
 TensorShape([224, 224, 3]))

In [None]:
# Another way to do this and manually set minimum and maximum values
big_tensor = tf.random.uniform(shape=[224, 224, 3], minval=0, maxval=1)
big_tensor

<tf.Tensor: shape=(224, 224, 3), dtype=float32, numpy=
array([[[4.9112380e-01, 5.3003132e-01, 1.4631307e-01],
        [6.6462088e-01, 6.0586429e-01, 8.2083035e-01],
        [7.9139292e-01, 3.8782442e-01, 9.9900615e-01],
        ...,
        [3.0100179e-01, 6.0639465e-01, 8.4932172e-01],
        [9.1465998e-01, 9.2156982e-01, 4.6522570e-01],
        [9.0860510e-01, 2.9255652e-01, 9.8321199e-01]],

       [[9.3751562e-01, 1.3177049e-01, 3.0410969e-01],
        [2.5001693e-01, 9.3303931e-01, 8.0606985e-01],
        [6.8146145e-01, 9.9024904e-01, 6.6689038e-01],
        ...,
        [8.5864568e-01, 8.7709463e-01, 6.4447522e-01],
        [4.4084299e-01, 8.6032951e-01, 3.9109659e-01],
        [1.6456926e-01, 9.6167183e-01, 4.9146652e-02]],

       [[3.9788377e-01, 2.7997780e-01, 3.1804836e-01],
        [2.0025587e-01, 3.7393117e-01, 4.2632401e-01],
        [9.8690438e-01, 3.6748266e-01, 3.6327207e-01],
        ...,
        [2.8211617e-01, 8.0663574e-01, 4.1089642e-01],
        [7.5058496e-01

#### Exercise 7: Find the min and max values of the tensor you created in 6 along the first axis.

In [None]:
# Find the min and max values of the tensor you created in 6
# Note: axis=1 reduces along the second axis; the first axis is axis=0 (both give shape (224, 3) for this tensor)
tf.reduce_min(random_tensor_4, axis=1), tf.reduce_max(random_tensor_4, axis=1)

(<tf.Tensor: shape=(224, 3), dtype=float32, numpy=
 array([[8.85725021e-03, 2.27296352e-03, 3.61919403e-04],
        [3.81457806e-03, 1.57868862e-03, 7.04526901e-03],
        [2.30479240e-03, 7.05158710e-03, 6.20603561e-04],
        [4.91738319e-04, 2.68554688e-03, 3.78012657e-04],
        [1.23119354e-03, 4.99725342e-04, 2.79754400e-02],
        [8.93831253e-03, 3.03101540e-03, 7.01427460e-04],
        [4.61709499e-03, 8.92996788e-04, 5.28657436e-03],
        [4.57584858e-03, 1.32584572e-03, 2.43222713e-03],
        [6.10423088e-03, 1.17567778e-02, 1.36065483e-03],
        [3.61275673e-03, 1.60312653e-03, 1.39328241e-02],
        [3.53908539e-03, 7.36081600e-03, 1.54817104e-03],
        [1.33935213e-02, 8.34465027e-04, 1.55812502e-02],
        [1.62267685e-03, 1.18041039e-03, 1.20317936e-03],
        [1.95264816e-04, 2.37429142e-03, 1.57500505e-02],
        [8.28146935e-03, 5.01728058e-03, 3.30805779e-04],
        [9.07468796e-03, 7.46309757e-03, 1.33943558e-03],
        [4.95231152e-
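
Because the first two dimensions of the tensor in Exercise 6 are both 224, the output shape alone does not reveal which axis was reduced. A small tensor with a different size on every axis makes the `axis` argument easier to follow. This is a sketch with made-up data:

```python
import tensorflow as tf

# Values 0..23 arranged with a distinct size on each axis
t = tf.reshape(tf.range(24), shape=(2, 3, 4))

# axis=0 is the first axis: the dimension of size 2 disappears
min_axis0 = tf.reduce_min(t, axis=0)
max_axis0 = tf.reduce_max(t, axis=0)

# axis=1 is the second axis: the dimension of size 3 disappears
min_axis1 = tf.reduce_min(t, axis=1)

print(min_axis0.shape, max_axis0.shape, min_axis1.shape)
```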

#### Exercise 8: Create a tensor with random values of shape [1, 224, 224, 3] then squeeze it to change the shape to [224, 224, 3]

In [None]:
# Create tensor and check min and max values
tf.random.set_seed(10)
generator_5 = tf.random.Generator.from_seed(10) # seeded generator for reproducibility
random_tensor_5 = generator_5.uniform(shape=(1, 224, 224, 3))
random_tensor_5, tf.reduce_min(random_tensor_5), tf.reduce_max(random_tensor_5), random_tensor_5.shape

(<tf.Tensor: shape=(1, 224, 224, 3), dtype=float32, numpy=
 array([[[[0.93598676, 0.6513264 , 0.31663585],
          [0.00111556, 0.9212191 , 0.3822806 ],
          [0.77246034, 0.91514194, 0.5751133 ],
          ...,
          [0.06638753, 0.52053475, 0.18713212],
          [0.29305923, 0.724267  , 0.06265461],
          [0.60910606, 0.07623863, 0.89656115]],
 
         [[0.72851205, 0.38249934, 0.5205425 ],
          [0.91310525, 0.5603143 , 0.72630703],
          [0.66004574, 0.46352363, 0.5257573 ],
          ...,
          [0.39209676, 0.6545948 , 0.9075004 ],
          [0.06859708, 0.56658983, 0.7013639 ],
          [0.55792916, 0.6484337 , 0.8147869 ]],
 
         [[0.9756906 , 0.8409076 , 0.6368543 ],
          [0.05900538, 0.8106538 , 0.6503265 ],
          [0.92341375, 0.68137383, 0.4698242 ],
          ...,
          [0.98975146, 0.7613524 , 0.9861349 ],
          [0.92030287, 0.8113171 , 0.26456964],
          [0.24923408, 0.74147856, 0.49428725]],
 
         ...,
 
       

In [None]:
# Squeeze tensor to change the shape to [224, 224, 3]
random_tensor_5_squeezed = tf.squeeze(random_tensor_5)
random_tensor_5_squeezed, random_tensor_5_squeezed.shape

(<tf.Tensor: shape=(224, 224, 3), dtype=float32, numpy=
 array([[[0.93598676, 0.6513264 , 0.31663585],
         [0.00111556, 0.9212191 , 0.3822806 ],
         [0.77246034, 0.91514194, 0.5751133 ],
         ...,
         [0.06638753, 0.52053475, 0.18713212],
         [0.29305923, 0.724267  , 0.06265461],
         [0.60910606, 0.07623863, 0.89656115]],
 
        [[0.72851205, 0.38249934, 0.5205425 ],
         [0.91310525, 0.5603143 , 0.72630703],
         [0.66004574, 0.46352363, 0.5257573 ],
         ...,
         [0.39209676, 0.6545948 , 0.9075004 ],
         [0.06859708, 0.56658983, 0.7013639 ],
         [0.55792916, 0.6484337 , 0.8147869 ]],
 
        [[0.9756906 , 0.8409076 , 0.6368543 ],
         [0.05900538, 0.8106538 , 0.6503265 ],
         [0.92341375, 0.68137383, 0.4698242 ],
         ...,
         [0.98975146, 0.7613524 , 0.9861349 ],
         [0.92030287, 0.8113171 , 0.26456964],
         [0.24923408, 0.74147856, 0.49428725]],
 
        ...,
 
        [[0.38056338, 0.43446696

#### Exercise 9: Create a tensor with shape [10] using your own choice of values, then find the index which has the maximum value.

In [None]:
# Create a tensor
exercise_numpy = np.arange(1, 11)
int_tensor_1 = tf.constant(exercise_numpy, shape=(10,))
exercise_numpy, int_tensor_1, int_tensor_1.shape

(array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]),
 <tf.Tensor: shape=(10,), dtype=int64, numpy=array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])>,
 TensorShape([10]))

In [None]:
# Find positional maximum value
tf.argmax(int_tensor_1)

<tf.Tensor: shape=(), dtype=int64, numpy=9>

In [None]:
# Check value of element at positional maximum index
int_tensor_1[tf.argmax(int_tensor_1)]

<tf.Tensor: shape=(), dtype=int64, numpy=10>

In [None]:
# Check for equality
int_tensor_1[tf.argmax(int_tensor_1)] == tf.reduce_max(int_tensor_1)

<tf.Tensor: shape=(), dtype=bool, numpy=True>

#### Exercise 10: One-hot encode the tensor you created in 9.

In [None]:
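# One hot encode tensor
# Note: the values run from 1 to 10, so with depth=10 the value 10 is out of range and its row contains only the off value
tf.one_hot(int_tensor_1, depth=10, on_value=100.0, off_value=50.0)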

<tf.Tensor: shape=(10, 10), dtype=float32, numpy=
array([[ 50., 100.,  50.,  50.,  50.,  50.,  50.,  50.,  50.,  50.],
       [ 50.,  50., 100.,  50.,  50.,  50.,  50.,  50.,  50.,  50.],
       [ 50.,  50.,  50., 100.,  50.,  50.,  50.,  50.,  50.,  50.],
       [ 50.,  50.,  50.,  50., 100.,  50.,  50.,  50.,  50.,  50.],
       [ 50.,  50.,  50.,  50.,  50., 100.,  50.,  50.,  50.,  50.],
       [ 50.,  50.,  50.,  50.,  50.,  50., 100.,  50.,  50.,  50.],
       [ 50.,  50.,  50.,  50.,  50.,  50.,  50., 100.,  50.,  50.],
       [ 50.,  50.,  50.,  50.,  50.,  50.,  50.,  50., 100.,  50.],
       [ 50.,  50.,  50.,  50.,  50.,  50.,  50.,  50.,  50., 100.],
       [ 50.,  50.,  50.,  50.,  50.,  50.,  50.,  50.,  50.,  50.]],
      dtype=float32)>
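
The last row above contains no on value because the index 10 falls outside `depth=10` (valid indices are 0 to 9). Two possible ways to give every value its own column are sketched below; `labels` is a stand-in for `int_tensor_1` with the same values 1 to 10.

```python
import tensorflow as tf

labels = tf.range(1, 11)   # values 1..10, mirroring int_tensor_1

# Option 1: shift the values so they run from 0 to 9
shifted = tf.one_hot(labels - 1, depth=10)

# Option 2: keep the values and widen depth so that index 10 is valid
widened = tf.one_hot(labels, depth=11)

# Every row should now contain exactly one on value
print(tf.reduce_sum(shifted, axis=1).numpy())
print(tf.reduce_sum(widened, axis=1).numpy())
```

With option 2 the first column (index 0) is never used, so option 1 is usually the tidier choice when the labels start at 1.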

### 📖 00. TensorFlow Fundamentals Extra-curriculum

#### Read through the list of [TensorFlow Python APIs](https://www.tensorflow.org/api_docs/python/tf/all_symbols), pick one we haven't gone through in this notebook, reverse engineer it (write out the documentation code for yourself) and figure out what it does.

* API: `tf.concat(values, axis, name='concat')`
* Link: https://www.tensorflow.org/api_docs/python/tf/concat

In [None]:
# Recreate the tf.concat function (a thin wrapper that validates shapes before calling tf.concat)
# Assumes a non-negative axis
def test_concat(tensors, axis=0):
  if not isinstance(tensors, list) or len(tensors) < 1:
    raise ValueError("Input must be a non-empty list of tensors")

  shapes = [t.shape for t in tensors]
  for i in range(1, len(shapes)):
    if len(shapes[i]) != len(shapes[0]):
      raise ValueError("All tensors must have the same rank")
    # Compare every dimension except the concat axis against the first tensor
    for j in range(len(shapes[i])):
      if j != axis and shapes[i][j] != shapes[0][j]:
        raise ValueError("Dimensions must match except for the concat axis")

  return tf.concat(tensors, axis)

testing = tf.constant([1, 2, 3])
testing1 = tf.constant([4, 5, 6, 7])

print(test_concat([testing, testing1], axis=0))

tf.Tensor([1 2 3 4 5 6 7], shape=(7,), dtype=int32)
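
With tensors of rank 2 or higher, the `axis` argument decides which dimension grows while all other dimensions must match. A short sketch with made-up matrices:

```python
import tensorflow as tf

top = tf.constant([[1, 2],
                   [3, 4]])        # shape (2, 2)
bottom = tf.constant([[5, 6]])     # shape (1, 2)
side = tf.constant([[7],
                    [8]])          # shape (2, 1)

# axis=0 stacks rows: the column counts must match
stacked_rows = tf.concat([top, bottom], axis=0)

# axis=1 stacks columns: the row counts must match
stacked_cols = tf.concat([top, side], axis=1)

print(stacked_rows.numpy())
print(stacked_cols.numpy())
```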


#### Go through the [TensorFlow 2.x quick start for beginners](https://www.tensorflow.org/tutorials/quickstart/beginner) tutorial (be sure to type out all of the code yourself, even if you don't understand it).

* Are there any functions we used in here that match what's used in there? Which are the same? Which haven't you seen before?

In [None]:
print("TensorFlow version:", tf.__version__)

TensorFlow version: 2.17.1


Load and prepare the MNIST dataset. The pixel values of the images range from 0 through 255. Scale these values to a range of 0 to 1 by dividing them by 255.0. This also converts the sample data from integers to floating-point numbers.

The MNIST database of handwritten digits has a training set of 60,000 examples, and a test set of 10,000 examples.

[MNIST](https://www.kaggle.com/datasets/hojjatk/mnist-dataset) Dataset

In [None]:
mnist = tf.keras.datasets.mnist

In [None]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step


Build a `tf.keras.Sequential` model:

`Sequential` is useful for stacking layers where each layer has one input tensor and one output tensor. Layers are functions with a known mathematical structure that can be reused and have trainable variables. Most TensorFlow models are composed of layers. This model uses the `Flatten`, `Dense`, and `Dropout` layers.

* **Flatten**: The `Flatten` layer in Keras reshapes input data into a one-dimensional array, allowing compatibility between convolutional layers and fully connected layers in neural networks.
* **Dense**: A `Dense` (fully connected) layer projects a vector of dimension d0 to a new dimension d1. It is commonly used after a feature extraction block (convolution, encoder or decoder, etc.) and as the output (final) layer.
> Dense implements the operation: output = activation(dot(input, kernel) + bias). [Link to Keras site](https://keras.io/api/layers/core_layers/dense/)
* **Dropout**: a regularization technique for neural network models. `Dropout` is a technique where randomly selected neurons are ignored during training. They are “dropped out” randomly. This means that their contribution to the activation of downstream neurons is temporally removed on the forward pass, and any weight updates are not applied to the neuron on the backward pass.
> Link to [Dropout Regularization in Deep Learning Models with Keras](https://machinelearningmastery.com/dropout-regularization-deep-learning-models-keras/)
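
The three layers described above can be sketched in plain NumPy. This is an illustrative approximation under stated assumptions, not the Keras implementation: the helper names (`flatten`, `dense`, `relu`, `dropout`) are hypothetical, the weights are random, and the dropout helper rescales the surviving activations by `1 / (1 - rate)`, which is how Keras documents its `Dropout` layer during training.

```python
import numpy as np

rng = np.random.default_rng(0)

def flatten(x):
    """Keep the batch axis and collapse every other axis into one."""
    return x.reshape(x.shape[0], -1)

def relu(x):
    return np.maximum(x, 0.0)

def dense(x, kernel, bias, activation=None):
    """output = activation(dot(input, kernel) + bias)"""
    out = x @ kernel + bias
    return activation(out) if activation is not None else out

def dropout(x, rate, rng):
    """Zero a random fraction `rate` of activations and rescale the rest."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

# Hypothetical batch of two 28x28 "images" with values in [0, 1).
batch = rng.random((2, 28, 28))

flat = flatten(batch)                              # shape (2, 784)
kernel = rng.normal(scale=0.05, size=(784, 128))   # random, untrained weights
bias = np.zeros(128)
hidden = dense(flat, kernel, bias, activation=relu)  # shape (2, 128)
dropped = dropout(hidden, rate=0.2, rng=rng)         # training-time behaviour

print(flat.shape, hidden.shape, dropped.shape)
```

At inference time dropout is a no-op, so only the training-time path is sketched here.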

In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])



For each example, the model returns a vector of logits or log-odds scores, one for each class.

In [None]:
predictions = model(x_train[:1]).numpy()
predictions

array([[-0.8770466 , -0.22437695, -0.46914947,  0.41602507, -0.19913813,
        -0.19627443, -0.40551683, -0.06523189,  0.29083237,  0.63028276]],
      dtype=float32)

The `tf.nn.softmax` function converts these logits to probabilities for each class.

* `Softmax` is a mathematical function that converts a vector of numbers into a vector of probabilities that sum to 1, where each probability is proportional to the exponential of the corresponding value in the vector.
> Link to [Softmax Activation Function with Python](https://machinelearningmastery.com/softmax-activation-function-with-python/)
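
To make the definition concrete, here is a minimal NumPy sketch of softmax applied to the logits printed above. The `softmax` helper is a hypothetical re-implementation for illustration; the notebook itself relies on `tf.nn.softmax`.

```python
import numpy as np

def softmax(logits):
    """Exponentiate each logit and normalize so each row sums to 1."""
    # Subtracting the row maximum leaves the result unchanged but avoids overflow.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

# Logits copied from the `predictions` output shown above.
logits = np.array([[-0.8770466, -0.22437695, -0.46914947, 0.41602507, -0.19913813,
                    -0.19627443, -0.40551683, -0.06523189, 0.29083237, 0.63028276]])

probs = softmax(logits)
print(probs.round(4))
print(probs.sum())             # sums to 1, up to floating-point rounding
print(probs.argmax(axis=-1))   # the largest logit gets the largest probability
```

The rounded values can be compared with the `tf.nn.softmax` output in this section.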

In [None]:
tf.nn.softmax(predictions).numpy()

array([[0.04237703, 0.08139193, 0.06372033, 0.1544203 , 0.0834723 ,
        0.0837117 , 0.0679068 , 0.09543268, 0.13624918, 0.19131778]],
      dtype=float32)

Define a loss function for training using `losses.SparseCategoricalCrossentropy`.

* Keras `losses` compute the quantity that a model should seek to minimize during training.
* **Cross-Entropy Loss Function**: Cross-entropy loss is also known as logarithmic loss, log loss, or logistic loss. The predicted probability for each class is compared with the actual class, and the loss penalizes the prediction based on how far it is from the actual expected value. The penalty is logarithmic: it is large when the predicted probability of the true class is close to 0 and tends to 0 as that probability approaches 1. A perfect model has a cross-entropy loss of 0.
* **Categorical Cross-Entropy and Sparse Categorical Cross-Entropy**: Both categorical cross-entropy and sparse categorical cross-entropy use the same loss function as defined above. The only difference between the two is how the labels are defined.
> **Categorical cross-entropy** is used when the labels are one-hot encoded, for example, [1,0,0], [0,1,0] and [0,0,1] for a 3-class classification problem.
> In **sparse categorical cross-entropy**, labels are integer encoded, for example, [0], [1] and [2] for a 3-class problem.
> Link to [Here is what you need to know about Sparse Categorical Cross Entropy in nutshell](https://vevesta.substack.com/p/here-is-what-you-need-to-know-about)
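
The sketch below illustrates the point made above: with the same logits, the sparse (integer label) and categorical (one-hot label) variants give the same loss. The function names and the example logits are hypothetical and written in NumPy for illustration; they mimic the `from_logits=True` behaviour and are not the Keras implementation.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log of the softmax probabilities (1-D input)."""
    shifted = logits - np.max(logits)
    return shifted - np.log(np.sum(np.exp(shifted)))

def sparse_categorical_crossentropy(label, logits):
    """`label` is an integer class index."""
    return -log_softmax(logits)[label]

def categorical_crossentropy(one_hot, logits):
    """`one_hot` is a one-hot encoded label vector."""
    return -np.sum(one_hot * log_softmax(logits))

logits = np.array([2.0, 0.5, -1.0])   # hypothetical logits for 3 classes
label = 0                             # integer-encoded label
one_hot = np.eye(3)[label]            # the same label, one-hot encoded

sparse_loss = sparse_categorical_crossentropy(label, logits)
dense_loss = categorical_crossentropy(one_hot, logits)
print(sparse_loss, dense_loss)        # the two values agree

# With 10 classes and equal logits, every class gets probability 1/10,
# so the loss is -log(1/10), roughly 2.3.
uniform_loss = sparse_categorical_crossentropy(3, np.zeros(10))
print(uniform_loss)
```

An untrained ten-class model usually starts near that uniform value, since its initial predictions are close to random.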

In [None]:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

In [None]:
loss_fn(y_train[:1], predictions).numpy()

2.4803767

Before you start training, configure and compile the model using Keras `Model.compile`. Set the `optimizer` to `adam`, set the `loss` to the `loss_fn` function you defined earlier, and specify a metric to be evaluated for the model by setting the `metrics` parameter to `accuracy`.

* `adam`: The Adam (Adaptive Moment Estimation) optimizer adapts the learning rate of each parameter to improve the training of deep neural networks.
  * **Adapts learning rates**: adjusts the effective learning rate for each parameter based on its gradient history, helping the network learn more efficiently.
  * **Uses momentum**: keeps a running average of past gradients, which smooths the update direction and helps navigate complex loss surfaces.
  * **Corrects for bias**: includes bias correction terms that compensate for the running averages starting at zero, which helps it perform well early in training.
  * **Adapts to gradient**: divides each step by the square root of a running average of squared gradients, so parameters with large or noisy gradients take smaller steps and parameters with small gradients take larger ones.
* `accuracy` measures how well a model predicts the correct outcome by dividing the number of correct predictions by the total number of predictions.
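
The Adam properties listed above can be seen in a single update step. The following is a minimal pure-Python sketch of the textbook Adam update for one scalar parameter; `adam_step` is a hypothetical helper, the default values mirror those documented for `tf.keras.optimizers.Adam`, and the toy objective f(x) = x² is an assumption chosen only because its gradient (2x) is easy to write down. Library implementations may arrange the arithmetic slightly differently.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999, eps=1e-7):
    """Apply one Adam update to a scalar parameter and return the new state."""
    m = beta_1 * m + (1 - beta_1) * grad          # momentum: running mean of gradients
    v = beta_2 * v + (1 - beta_2) * grad ** 2     # running mean of squared gradients
    m_hat = m / (1 - beta_1 ** t)                 # bias correction (m starts at 0)
    v_hat = v / (1 - beta_2 ** t)                 # bias correction (v starts at 0)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)  # per-parameter scaled step
    return param, m, v

# Toy problem (assumption): minimize f(x) = x**2, whose gradient is 2 * x.
x, m, v = 1.0, 0.0, 0.0
history = [x]
for t in range(1, 501):
    grad = 2 * x
    x, m, v = adam_step(x, grad, m, v, t)
    history.append(x)

first_step = history[0] - history[1]
print(first_step)    # close to the learning rate, thanks to bias correction
print(history[-1])   # x has moved toward the minimum at 0
```

Because of the bias correction, the very first step has a size close to `lr` regardless of the gradient's magnitude.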

In [None]:
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

Train and evaluate your model.

* Use the `Model.fit` method to adjust your model parameters and minimize the loss.
* The Keras `.fit()` method is used to train a model.
  * Adjusts model parameters: the primary purpose of `.fit()` is to adjust the model's internal parameters (weights and biases) to minimize the loss function and improve its predictive accuracy.
  * Iterates over data: iterates through the training data for a specified number of epochs (passes over the entire dataset). In each epoch, the data is divided into batches, and the model's parameters are updated based on the gradients calculated from each batch.
  * Tracks performance: tracks the model's performance during training by evaluating it on the validation data (if provided). This helps monitor overfitting and assess the model's generalization capabilities.
  * Returns a history object: returns a history object that contains information about the training process, such as the loss and metric values for each epoch, on both the training and validation sets.
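
The epoch and batch structure described above can be sketched without any framework. The snippet assumes a batch size of 32, which is the documented Keras default when `batch_size` is not passed to `Model.fit`; `iterate_minibatches` is a hypothetical helper, and the loop only counts updates instead of training anything.

```python
import math

def iterate_minibatches(num_samples, batch_size):
    """Yield (start, stop) index pairs that cover the dataset exactly once."""
    for start in range(0, num_samples, batch_size):
        yield start, min(start + batch_size, num_samples)

batch_size = 32                  # assumption: Keras default for Model.fit
num_train, num_test = 60_000, 10_000

train_steps = math.ceil(num_train / batch_size)
test_steps = math.ceil(num_test / batch_size)
print(train_steps, test_steps)   # steps per pass over each dataset

epochs = 5
updates = 0
for epoch in range(epochs):
    for start, stop in iterate_minibatches(num_train, batch_size):
        # A real loop would compute the loss on samples[start:stop],
        # take gradients, and let the optimizer update the weights here.
        updates += 1
print(updates)                   # total parameter updates over all epochs
```

The computed step counts correspond to the `1875/1875` and `313/313` counters in the training and evaluation logs in this section.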

In [None]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 9s 4ms/step - accuracy: 0.8595 - loss: 0.4813
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 6s 3ms/step - accuracy: 0.9542 - loss: 0.1562
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 8s 4ms/step - accuracy: 0.9658 - loss: 0.1128
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 9s 4ms/step - accuracy: 0.9731 - loss: 0.0888
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 8s 4ms/step - accuracy: 0.9739 - loss: 0.0801


<keras.src.callbacks.history.History at 0x7e0b34db8f90>

The `model.evaluate` method checks the model's performance, usually on a *validation set* or *test set*.

* The `verbose` argument controls the amount of output displayed during evaluation.
  * `verbose=0`: No output is displayed during evaluation (silent mode).
  * `verbose=1`: A progress bar is displayed, showing the progress of the evaluation process.
  * `verbose=2`: A single line of output is displayed, showing the loss and any metrics you've specified.

In [None]:
model.evaluate(x_test, y_test, verbose=2)

313/313 - 1s - 3ms/step - accuracy: 0.9753 - loss: 0.0789


[0.07890551537275314, 0.9753000140190125]

The image classifier is now trained to ~98% accuracy (0.9753) on this dataset.

If you want your model to return a probability, you can wrap the trained model, and attach the softmax to it:

In [None]:
probability_model = tf.keras.Sequential([
  model,
  tf.keras.layers.Softmax()
])

In [None]:
probability_model(x_test[:5])

<tf.Tensor: shape=(5, 10), dtype=float32, numpy=
array([[1.4707219e-07, 1.6401824e-08, 5.6507424e-06, 1.6328956e-04,
        4.7333342e-11, 6.7296838e-07, 5.5102927e-13, 9.9982589e-01,
        2.5402201e-07, 4.0905729e-06],
       [1.2968377e-08, 2.3084586e-03, 9.9768269e-01, 8.0520686e-06,
        2.5141649e-12, 3.0407901e-07, 3.5375993e-07, 6.5962808e-12,
        8.3211958e-08, 9.9173382e-14],
       [1.9834161e-08, 9.9929714e-01, 7.1342401e-05, 6.1357732e-06,
        2.2444752e-05, 4.3835257e-06, 3.3717159e-05, 3.0632631e-04,
        2.5809609e-04, 3.2196064e-07],
       [9.9968946e-01, 1.2203163e-08, 2.0163327e-05, 2.8757888e-06,
        9.0377343e-06, 3.3824697e-06, 6.2532657e-05, 2.7590240e-05,
        1.1542484e-07, 1.8489688e-04],
       [1.8808317e-06, 1.7034562e-09, 1.0265546e-06, 2.4659406e-07,
        9.9551302e-01, 2.6137926e-08, 1.8892902e-06, 2.0419879e-04,
        3.3259053e-07, 4.2772526e-03]], dtype=float32)>
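
Each row of a probability output like the one above can be turned into a class prediction by taking the index of its largest entry, and accuracy is then the fraction of predictions that match the labels. The sketch below uses small made-up arrays (`probs` and `labels` are hypothetical, with 3 classes instead of 10) so that the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical softmax outputs for four samples and three classes.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.2, 0.5, 0.3]])
labels = np.array([1, 0, 2, 2])         # hypothetical integer-encoded true classes

predicted = probs.argmax(axis=-1)       # most probable class per row
accuracy = np.mean(predicted == labels) # correct predictions / total predictions

print(predicted)
print(accuracy)
```

With real model output, the same `argmax` call can be applied to `probability_model(x_test[:5]).numpy()`.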