# Introduction to TensorFlow


## Introduction to Tensors

If you've ever used NumPy, [tensors](https://www.tensorflow.org/guide/tensor) are kind of like NumPy arrays (we'll see more on this later).

For the sake of this notebook and going forward, you can think of a tensor as a multi-dimensional numerical representation (also referred to as n-dimensional, where n can be any number) of something, where that something can be almost anything you can imagine:
* It could be numbers themselves (using tensors to represent the price of houses). 
* It could be an image (using tensors to represent the pixels of an image).
* It could be text (using tensors to represent words).
* Or it could be some other form of information (or data) you want to represent with numbers.

The main difference between tensors and NumPy arrays (also an n-dimensional array of numbers) is that tensors can be used on [GPUs (graphics processing units)](https://blogs.nvidia.com/blog/2009/12/16/whats-the-difference-between-a-cpu-and-a-gpu/) and [TPUs (tensor processing units)](https://en.wikipedia.org/wiki/Tensor_processing_unit).

The benefit of being able to run on GPUs and TPUs is faster computation. This means that if we want to find patterns in the numerical representations of our data, we can generally find them faster using GPUs and TPUs.

Okay, we've been talking enough about tensors, let's see them.

The first thing we'll do is import TensorFlow under the common alias `tf`.


### A scalar is known as a rank 0 tensor because it has no dimensions (it's just a number).
> 🔑 Note: The important point is knowing tensors can have an unlimited range of dimensions (the exact amount will depend on what data you're representing).

In [6]:
import tensorflow as tf
print(tf.__version__)

2.4.1
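Since the main benefit of tensors is that they can run on GPUs and TPUs, it can be useful to check which devices TensorFlow can see. The following is a minimal sketch, assuming TensorFlow 2.x is installed; an empty GPU list simply means computations will run on the CPU.

```python
import tensorflow as tf

# List the devices TensorFlow can see; an empty list means none of that type was detected.
gpus = tf.config.list_physical_devices("GPU")
cpus = tf.config.list_physical_devices("CPU")

print("GPUs found:", len(gpus))
print("CPUs found:", len(cpus))
```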


In [None]:
# create tensors with tf.constant()
scalar = tf.constant(7)
scalar

<tf.Tensor: shape=(), dtype=int32, numpy=7>

In [None]:
# check the number of dimensions in tensor
scalar.ndim

0

In [None]:
# create a vector
vector = tf.constant([10, 10])
vector

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([10, 10], dtype=int32)>

In [None]:
# check the dimensions of vector
vector.ndim

1

In [None]:
# create a matrix (has more than 1 dimension)
matrix = tf.constant([
    [10, 7],
    [7, 10]
])

matrix

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 7, 10]], dtype=int32)>

In [None]:
matrix.ndim

2

In [None]:
# create another matrix
another_matrix = tf.constant([
        [10, 7],
        [7, 10],
        [1.0, 9.0]],
        dtype=tf.float16
)

another_matrix

<tf.Tensor: shape=(3, 2), dtype=float16, numpy=
array([[10.,  7.],
       [ 7., 10.],
       [ 1.,  9.]], dtype=float16)>

In [None]:
another_matrix.ndim

2

In [None]:
# Let's create a tensor
tensor = tf.constant([[[1,2,3], [4,5,6]], [[7,8,9], [10, 11, 12]], [[13, 14, 15], [16, 17, 18]]])
tensor

<tf.Tensor: shape=(3, 2, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]],

       [[13, 14, 15],
        [16, 17, 18]]], dtype=int32)>

In [None]:
tensor.ndim

3

# What we have created so far:
- **scalar**: a single number.
- **vector**: a number with direction (e.g. wind speed with direction).
- **matrix**: a 2-dimensional array of numbers.
- **tensor**: an n-dimensional array of numbers (where n can be any number, a 0-dimension tensor is a scalar, a 1-dimension tensor is a vector).

-----


# Creating tensors with `tf.Variable()`

You can also create tensors using `tf.Variable()`, although you will rarely need to, because tensors are often created for you automatically when working with data.

The difference between `tf.Variable()` and `tf.constant()` is that tensors created with `tf.constant()` are immutable (they can't be changed, only used to create a new tensor), whereas tensors created with `tf.Variable()` are mutable (they can be changed).

In [None]:
# create the same tensor with tf.Variable() and tf.constant()
changable_tensor = tf.Variable([10, 7])
unchangable_tensor = tf.constant([10, 7])

changable_tensor, unchangable_tensor

(<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([10,  7], dtype=int32)>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([10,  7], dtype=int32)>)

In [None]:
# Let's try to change one of the elements in the changeable tensor
changable_tensor[0].assign(7)
changable_tensor

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([7, 7], dtype=int32)>

In [None]:
unchangable_tensor[0].assign(7) # raises an error: a constant tensor can't be changed in place
unchangable_tensor

AttributeError: ignored

Which one should you use? `tf.constant()` or `tf.Variable()`?

It will depend on what your problem requires. However, most of the time, TensorFlow will automatically choose for you (when loading data or modelling data).
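To make the difference concrete, here is a minimal sketch (the variable names are illustrative, not part of the notebook) that updates a `tf.Variable` in place and shows that a constant tensor has no `assign` method, so "changing" it means building a new tensor.

```python
import tensorflow as tf

variable_tensor = tf.Variable([10, 7])
constant_tensor = tf.constant([10, 7])

# Variables can be updated in place.
variable_tensor.assign([1, 2])       # replace every value
variable_tensor.assign_add([1, 1])   # add to the existing values
print(variable_tensor.numpy().tolist())  # [2, 3]

# Constant tensors have no assign method, so we build a new tensor instead.
print(hasattr(constant_tensor, "assign"))  # False
new_tensor = constant_tensor + 1
print(new_tensor.numpy().tolist())       # [11, 8]
print(constant_tensor.numpy().tolist())  # [10, 7] (unchanged)
```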

-----

# Creating Random Tensors
Random tensors are tensors of arbitrary size which contain random numbers.

In [None]:
# create two random tensors
random_1 = tf.random.Generator.from_seed(42) # set seed for reproducibility
random_1 = random_1.normal(shape=(3, 2))
random_1

<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[-0.7565803 , -0.06854702],
       [ 0.07595026, -1.2573844 ],
       [-0.23193763, -1.8107855 ]], dtype=float32)>

In [None]:
random_2 = tf.random.Generator.from_seed(42)
random_2 = random_2.normal(shape=(3, 2))
random_2

<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[-0.7565803 , -0.06854702],
       [ 0.07595026, -1.2573844 ],
       [-0.23193763, -1.8107855 ]], dtype=float32)>

In [None]:
# are they equal?
random_1 == random_2

<tf.Tensor: shape=(3, 2), dtype=bool, numpy=
array([[ True,  True],
       [ True,  True],
       [ True,  True]])>

-----

# Shuffling the order of elements in a tensor
- We shuffle the order of elements so that any inherent order in the data doesn't affect learning.
- **If we want to shuffle the data into the same order every time, we need to set both the operation-level seed and the global seed.**

### Global vs Operation Level Seed

`tf.random.set_seed(42)` sets the global seed, and the seed parameter in `tf.random.shuffle(seed=42)` sets the operation seed.

- This is helpful when we want to reproduce an experiment.

> "4. If both the global and the operation seed are set: Both seeds are used in conjunction to determine the random sequence."

### Wait, why would you want to do that?

Let's say you're working with 15,000 images of cats and dogs, where the first 10,000 images are of cats and the next 5,000 are of dogs. This order could affect how a neural network learns (it may overfit by learning the order of the data). Instead, it might be a good idea to shuffle your data.
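When data comes with labels, both need to be shuffled in the same order. One way to do that is sketched below with hypothetical toy data (feature values below 20 stand in for cats, the rest for dogs): shuffle a tensor of row indices once, then apply it to both tensors with `tf.gather()`.

```python
import tensorflow as tf

# Hypothetical toy data, sorted by class: label 0 = cat, label 1 = dog.
features = tf.constant([[10], [11], [12], [13], [20], [21]])
labels = tf.constant([0, 0, 0, 0, 1, 1])

# Shuffle the row indices once, then apply the same order to both tensors.
indices = tf.random.shuffle(tf.range(tf.shape(features)[0]))
shuffled_features = tf.gather(features, indices)
shuffled_labels = tf.gather(labels, indices)

# Every feature row still sits next to its own label.
print(shuffled_features[:, 0].numpy(), shuffled_labels.numpy())
```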

In [None]:
not_shuffle = tf.constant([[10, 7], [1, 2], [3, 4]])
not_shuffle.ndim

2

In [None]:
# shuffle our tensor
# note that shuffling reorders the elements along the first dimension only
after_shuffled = tf.random.shuffle(not_shuffle)
after_shuffled

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4],
       [ 1,  2]], dtype=int32)>

In [None]:
after_shuffled = tf.random.shuffle(not_shuffle, seed=42)
after_shuffled

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 1,  2],
       [ 3,  4]], dtype=int32)>

In [None]:
tf.random.set_seed(42)
after_shuffled = tf.random.shuffle(not_shuffle)
after_shuffled

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 1,  2],
       [ 3,  4],
       [10,  7]], dtype=int32)>

In [None]:
after_shuffled = tf.random.shuffle(not_shuffle)
after_shuffled

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4],
       [ 1,  2]], dtype=int32)>

In [None]:
# Shuffle in the same order every time

# Set the global random seed
tf.random.set_seed(42)

# Set the operation random seed
# tf.random.shuffle(not_shuffle)
tf.random.shuffle(not_shuffle, seed=42)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 1,  2],
       [ 3,  4]], dtype=int32)>

In [None]:
# Set the global random seed
tf.random.set_seed(42) # if you comment this out you'll get different results

# No operation-level seed is passed here, so only the global seed applies
tf.random.shuffle(not_shuffle)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 1,  2],
       [ 3,  4],
       [10,  7]], dtype=int32)>

## Exercise: create 5 random tensors and shuffle them
- With only an operation-level seed, the tensor is shuffled into a different order each time we re-run the line.
- After also setting the global seed in the same cell, the order stays the same no matter how many times we re-run the cell.

In [None]:
tf1 = tf.constant([[3, 4], [5, 6], [1,2]])
tf1

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[3, 4],
       [5, 6],
       [1, 2]], dtype=int32)>

#### operation level random seed

In [None]:
tf.random.shuffle(tf1, seed=42) # operation level random seed
# with only an operation-level seed, each re-run of this line gives a different order

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[3, 4],
       [5, 6],
       [1, 2]], dtype=int32)>

#### global level random seed

In [None]:
tf.random.set_seed(42) # global level random seed
tf.random.shuffle(tf1, seed=41)

# after setting the global seed, the order stays the same no matter how many times we re-run this cell

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[5, 6],
       [1, 2],
       [3, 4]], dtype=int32)>
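The seed behaviour described above can be checked programmatically. This is a small sketch in which `seeded_shuffle` is a hypothetical helper (not a TensorFlow function) that resets the global seed before shuffling with an operation-level seed.

```python
import tensorflow as tf

data = tf.constant([[3, 4], [5, 6], [1, 2]])

def seeded_shuffle(tensor):
    # Hypothetical helper: reset the global seed, then shuffle with an operation seed.
    tf.random.set_seed(42)
    return tf.random.shuffle(tensor, seed=42)

first = seeded_shuffle(data)
second = seeded_shuffle(data)

# Same global seed + same operation seed -> same order on every call.
print(bool(tf.reduce_all(first == second)))
```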

------

# Creating tensors with `tf.ones()`, `tf.zeros()` and from NumPy arrays



In [None]:
# create a tensor of ones
tf.ones(shape=(2,3))

<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[1., 1., 1.],
       [1., 1., 1.]], dtype=float32)>

In [None]:
# create a tensor of zeros
tf.zeros(shape=(3, 5))

<tf.Tensor: shape=(3, 5), dtype=float32, numpy=
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]], dtype=float32)>

In [None]:
tf.zeros([3, 5])

<tf.Tensor: shape=(3, 5), dtype=float32, numpy=
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]], dtype=float32)>

## Turn NumPy arrays into tensors
The main difference between NumPy arrays and TensorFlow tensors is that tensors can be run on a GPU for faster computation.

In [None]:
# you can also turn NumPy arrays into tensors
import numpy as np

numpy_A = np.arange(1, 25, dtype=np.int32)
numpy_A

# X = tf.constant(matrix) # capital for matrix or tensor
# y = tf.constant(vector) # non capital for vector

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24], dtype=int32)

In [None]:
A = tf.constant(numpy_A)
A

<tf.Tensor: shape=(24,), dtype=int32, numpy=
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24], dtype=int32)>

In [None]:
A = tf.constant(numpy_A, shape=(2, 3, 4))
B = tf.constant(numpy_A)

A, B

(<tf.Tensor: shape=(2, 3, 4), dtype=int32, numpy=
 array([[[ 1,  2,  3,  4],
         [ 5,  6,  7,  8],
         [ 9, 10, 11, 12]],
 
        [[13, 14, 15, 16],
         [17, 18, 19, 20],
         [21, 22, 23, 24]]], dtype=int32)>,
 <tf.Tensor: shape=(24,), dtype=int32, numpy=
 array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24], dtype=int32)>)

In [None]:
2 * 3 * 4

24

In [None]:
A = tf.constant(numpy_A, shape=(8, 3))
A

<tf.Tensor: shape=(8, 3), dtype=int32, numpy=
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12],
       [13, 14, 15],
       [16, 17, 18],
       [19, 20, 21],
       [22, 23, 24]], dtype=int32)>

In [None]:
A.ndim

2

-----

# Getting information from tensors (shape, rank, size) / Tensors Attributes

There will be times when we'll want to get different pieces of information from our tensors. In particular, we need to know the following tensor vocabulary:

- **Shape**: The length (number of elements) of each of the dimensions of a tensor. `tensor.shape`
- **Rank**: The number of tensor dimensions. A scalar has rank 0, a vector has rank 1, a matrix is rank 2, a tensor has rank n. `tensor.ndim`
- **Axis or Dimension**: A particular dimension of a tensor. `tensor[0], tensor[:, 1]`
- **Size**: The total number of items in the tensor. `tf.size(tensor)`

We'll use these especially when trying to line up the shapes of our data with the shapes of our model. For example, making sure the shape of our image tensors is the same as the shape of our model's input layer.
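As a rough illustration of lining shapes up, the sketch below compares the per-image shape of a batch against the shape a model might expect. Both `image_batch` and `expected_input_shape` are hypothetical values chosen for the example, not something defined elsewhere in this notebook.

```python
import tensorflow as tf

# Hypothetical values: a batch of 4 RGB images and the per-image shape a model expects.
image_batch = tf.zeros(shape=[4, 224, 224, 3])
expected_input_shape = [224, 224, 3]

# Drop the batch axis (axis 0) and compare the rest against what the model expects.
per_image_shape = image_batch.shape.as_list()[1:]
shapes_match = per_image_shape == expected_input_shape

print("Rank:", image_batch.ndim)
print("Per-image shape:", per_image_shape)
print("Shapes line up:", shapes_match)
```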

In [None]:
# Create a rank 4 tensor (4 dimensions)
rank4_tensor = tf.zeros(shape=[2, 3, 4, 5])
rank4_tensor

<tf.Tensor: shape=(2, 3, 4, 5), dtype=float32, numpy=
array([[[[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]],


       [[[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]]], dtype=float32)>

In [None]:
rank4_tensor[0]

<tf.Tensor: shape=(3, 4, 5), dtype=float32, numpy=
array([[[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]]], dtype=float32)>

In [None]:
rank4_tensor[1]

<tf.Tensor: shape=(3, 4, 5), dtype=float32, numpy=
array([[[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]]], dtype=float32)>

In [None]:
rank4_tensor.shape

TensorShape([2, 3, 4, 5])

In [None]:
rank4_tensor.ndim

4

In [None]:
tf.size(rank4_tensor)

<tf.Tensor: shape=(), dtype=int32, numpy=120>

In [None]:
2 * 3 * 4 * 5

120

## Get various attributes of tensors


In [None]:
# Get various attributes of tensors
print('DataType of every element: ', rank4_tensor.dtype)
print('Number of dimensions (rank): ', rank4_tensor.ndim)
print('Shape of tensor: ', rank4_tensor.shape)
print('Elements along the axis 0: ', rank4_tensor.shape[0]) # refer to the shape of the tensor and check the first index
print('Elements along the last axis: ', rank4_tensor.shape[-1])
print('Total elements in our tensor: ', tf.size(rank4_tensor))
print('Total elements in our tensor: ', tf.size(rank4_tensor).numpy()) # .numpy() converts the tensor to a NumPy value

DataType of every element:  <dtype: 'float32'>
Number of dimensions (rank):  4
Shape of tensor:  (2, 3, 4, 5)
Elements along the axis 0:  2
Elements along the last axis:  5
Total elements in our tensor:  tf.Tensor(120, shape=(), dtype=int32)
Total elements in our tensor:  120


----

# Indexing and Expanding Tensors

You can also index tensors just like Python lists.

In [None]:
somelist = [1, 2, 3, 4]
somelist[:2]

[1, 2]

In [None]:
# get the first 2 elements of each dimension
rank4_tensor[:2, :2, :2, :2]

<tf.Tensor: shape=(2, 2, 2, 2), dtype=float32, numpy=
array([[[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]],


       [[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]]], dtype=float32)>

In [None]:
rank4_tensor.shape

TensorShape([2, 3, 4, 5])

In [None]:
# get the first element from each dimension
rank4_tensor[:1, :1, :1, :1]

<tf.Tensor: shape=(1, 1, 1, 1), dtype=float32, numpy=array([[[[0.]]]], dtype=float32)>

In [None]:
rank4_tensor[:, :1, :1, :1]

<tf.Tensor: shape=(2, 1, 1, 1), dtype=float32, numpy=
array([[[[0.]]],


       [[[0.]]]], dtype=float32)>

In [None]:
rank4_tensor[:1, :, :1, :1]

<tf.Tensor: shape=(1, 3, 1, 1), dtype=float32, numpy=
array([[[[0.]],

        [[0.]],

        [[0.]]]], dtype=float32)>

In [None]:
rank4_tensor[:1, :1, :, :1]

<tf.Tensor: shape=(1, 1, 4, 1), dtype=float32, numpy=
array([[[[0.],
         [0.],
         [0.],
         [0.]]]], dtype=float32)>

In [None]:
rank4_tensor[:1, :1, :1, :]

<tf.Tensor: shape=(1, 1, 1, 5), dtype=float32, numpy=array([[[[0., 0., 0., 0., 0.]]]], dtype=float32)>

In [None]:
# create a rank 2 tensor (2 dimensions)
rank2_tensor = tf.constant([[10, 7],
                            [3, 4]])
rank2_tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4]], dtype=int32)>

In [None]:
rank2_tensor.shape, rank2_tensor.ndim

(TensorShape([2, 2]), 2)

In [None]:
# let's get last item of each row of our rank2 tensor
rank2_tensor[: , -1]

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([7, 4], dtype=int32)>

In [None]:
# slicing with -1: (instead of indexing with -1) keeps the last axis as a dimension of size 1
rank2_tensor[: , -1:]

<tf.Tensor: shape=(2, 1), dtype=int32, numpy=
array([[7],
       [4]], dtype=int32)>

## Add an extra dimension to our rank 2 tensor
`...` means "every existing axis".

There are 2 ways to do this:
+ `tf.newaxis`
+ `tf.expand_dims()`

In [None]:
# Add an extra dimension to our rank 2 tensor
rank3_tensor = rank2_tensor[..., tf.newaxis] # same as [:, :, tf.newaxis] for a rank 2 tensor
rank3_tensor

<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[10],
        [ 7]],

       [[ 3],
        [ 4]]], dtype=int32)>

In [None]:
# Alternative to tf.newaxis
tf.expand_dims(rank2_tensor, axis=-1) # -1 means expand along the last axis

# the shape goes from (2, 2) to (2, 2, 1): a new dimension of size 1 is added as the last axis

<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[10],
        [ 7]],

       [[ 3],
        [ 4]]], dtype=int32)>

In [None]:
tf.expand_dims(rank2_tensor, axis=0)

# the shape goes from (2, 2) to (1, 2, 2): a new dimension of size 1 is added as the first axis (axis 0)

<tf.Tensor: shape=(1, 2, 2), dtype=int32, numpy=
array([[[10,  7],
        [ 3,  4]]], dtype=int32)>

In [None]:
tf.expand_dims(rank2_tensor, axis=1)

<tf.Tensor: shape=(2, 1, 2), dtype=int32, numpy=
array([[[10,  7]],

       [[ 3,  4]]], dtype=int32)>


-----------

# Manipulating tensors with Basic Operations

Finding patterns in tensors (numerical representation of data) requires manipulating them.

Again, when building models in TensorFlow, much of this pattern discovery is done for you.

## Basic operations
You can perform many of the basic mathematical operations directly on tensors using Python operators such as `+`, `-`, `*` and `/`.

In [None]:
# we can add tensors 
tensor = tf.constant([[10, 7], [3, 4]])
tensor + 10

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[20, 17],
       [13, 14]], dtype=int32)>

In [None]:
tensor # original tensor is unchanged.

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4]], dtype=int32)>

In [None]:
tensor = tensor + 10
tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[20, 17],
       [13, 14]], dtype=int32)>

In [None]:
# multiplication
tensor * 2

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[40, 34],
       [26, 28]], dtype=int32)>

In [None]:
tensor - 10

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4]], dtype=int32)>

In [None]:
tensor / 7

<tf.Tensor: shape=(2, 2), dtype=float64, numpy=
array([[2.85714286, 2.42857143],
       [1.85714286, 2.        ]])>

In [None]:
# We can use TensorFlow's built-in functions too
tf.multiply(tensor, 10)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[200, 170],
       [130, 140]], dtype=int32)>

In [None]:
tf.divide(tensor, 2)

<tf.Tensor: shape=(2, 2), dtype=float64, numpy=
array([[10. ,  8.5],
       [ 6.5,  7. ]])>

In [None]:
tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[20, 17],
       [13, 14]], dtype=int32)>

In [None]:
tf.add(tensor, 5)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[25, 22],
       [18, 19]], dtype=int32)>

In [None]:
tf.subtract(tensor, 10)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4]], dtype=int32)>

--------

# Matrix multiplication

One of the most common operations in machine learning algorithms is [matrix multiplication](https://www.mathsisfun.com/algebra/matrix-multiplying.html).

TensorFlow implements this matrix multiplication functionality in the [`tf.matmul()`](https://www.tensorflow.org/api_docs/python/tf/linalg/matmul) function.

The main two rules for matrix multiplication to remember are:
1. The inner dimensions must match:
  * `(3, 5) @ (3, 5)` won't work
  * `(5, 3) @ (3, 5)` will work
  * `(3, 5) @ (5, 3)` will work
2. The resulting matrix has the shape of the outer dimensions:
 * `(5, 3) @ (3, 5)` -> `(5, 5)`
 * `(3, 5) @ (5, 3)` -> `(3, 3)`

> 🔑 **Note:** '`@`' in Python is the symbol for matrix multiplication.
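The shape rules can be checked on small tensors of ones. This is a minimal sketch: the result shape is (rows of the left matrix, columns of the right matrix), and a mismatch of the inner dimensions raises an error in eager mode.

```python
import tensorflow as tf

a = tf.ones(shape=(5, 3))
b = tf.ones(shape=(3, 5))

# Inner dimensions match (3 and 3), so the result takes the outer dimensions.
print((a @ b).shape)  # (5, 5)
print((b @ a).shape)  # (3, 3)

# Inner dimensions differ (3 and 5), so the multiplication fails.
# ValueError is caught as well in case the shapes are checked before execution.
try:
    a @ a
    mismatch_raised = False
except (tf.errors.InvalidArgumentError, ValueError):
    mismatch_raised = True
print(mismatch_raised)  # True
```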



In [None]:
# matrix multiplication in tensorflow
print(tensor)

tf.Tensor(
[[20 17]
 [13 14]], shape=(2, 2), dtype=int32)


In [None]:
tf.matmul(tensor, tensor)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[621, 578],
       [442, 417]], dtype=int32)>

In [None]:
tensor * tensor # element wise multiplication

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[400, 289],
       [169, 196]], dtype=int32)>

In [None]:
left_tensor = tf.constant([[1, 2, 5], [7, 2, 1], [3, 3, 3]])
right_tensor = tf.constant([[3, 5], [6, 7], [1, 8]])

left_tensor, right_tensor

(<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
 array([[1, 2, 5],
        [7, 2, 1],
        [3, 3, 3]], dtype=int32)>,
 <tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[3, 5],
        [6, 7],
        [1, 8]], dtype=int32)>)

In [None]:
tf.matmul(left_tensor, right_tensor)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[20, 59],
       [34, 57],
       [30, 60]], dtype=int32)>

## Matrix multiplication with the Python operator `@`

In [None]:
left_tensor @ right_tensor

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[20, 59],
       [34, 57],
       [30, 60]], dtype=int32)>

In [None]:
# create a tensor (3,2) 
X = tf.constant([[1,2], [3, 4], [5, 6]])

# create another (3, 2) tensor
Y = tf.constant([[7, 8], [9, 10], [11, 12]])

X, Y

(<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4],
        [5, 6]], dtype=int32)>, <tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[ 7,  8],
        [ 9, 10],
        [11, 12]], dtype=int32)>)

In [None]:
# try to matrix multiply tensors of same shape
X @ Y

InvalidArgumentError: ignored

In [None]:
tf.matmul(X, Y)

InvalidArgumentError: ignored

## Matrix multiplication (reshaping tensors)
Trying to matrix multiply two tensors with the shape `(3, 2)` raises an error because the inner dimensions don't match.

We need to either:
* Reshape X to `(2, 3)` so it's `(2, 3) @ (3, 2)`.
* Reshape Y to `(2, 3)` so it's `(3, 2) @ (2, 3)`.

We can do this with either:
* [`tf.reshape()`](https://www.tensorflow.org/api_docs/python/tf/reshape) - allows us to reshape a tensor into a defined shape.
* [`tf.transpose()`](https://www.tensorflow.org/api_docs/python/tf/transpose) - switches the dimensions of a given tensor.

![lining up dimensions for dot products](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-lining-up-dot-products.png)

Let's try `tf.reshape()` first.

In [None]:
X.shape, Y.shape

(TensorShape([3, 2]), TensorShape([3, 2]))

In [None]:
tf.reshape(X, shape=(2, 3))

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6]], dtype=int32)>

In [None]:
# change the shape of X
tf.matmul(tf.reshape(X, shape=(2, 3)), Y)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 58,  64],
       [139, 154]], dtype=int32)>

In [None]:
# change the shape of Y
tf.matmul(X, tf.reshape(Y, shape=(2, 3)))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]], dtype=int32)>

In [None]:
X

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int32)>

## Matrix multiplication with Transpose

Notice the difference in the resulting shapes when transposing `X` or reshaping `Y`.

This is because the result of a matrix multiplication takes the outer dimensions of its inputs (the 2nd rule mentioned above):
 * `(2, 3) @ (3, 2)` -> `(2, 2)` done with `tf.matmul(tf.transpose(X), Y)`
 * `(3, 2) @ (2, 3)` -> `(3, 3)` done with `X @ tf.reshape(Y, shape=(2, 3))`

This kind of data manipulation is a reminder: you'll spend a lot of your time in machine learning and working with neural networks reshaping data (in the form of tensors) to prepare it to be used with various operations (such as feeding it to a model).


In [None]:
# we can do the same with transpose
# but note that transpose and reshape give different results
tf.transpose(X), tf.reshape(X, shape=(2, 3))

(<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
 array([[1, 3, 5],
        [2, 4, 6]], dtype=int32)>,
 <tf.Tensor: shape=(2, 3), dtype=int32, numpy=
 array([[1, 2, 3],
        [4, 5, 6]], dtype=int32)>)

In [None]:
# try matrix multiplication with Transpose
tf.matmul(tf.transpose(X), Y)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]], dtype=int32)>

## The dot product

Multiplying matrices by each other is also referred to as the dot product.

You can perform the `tf.matmul()` operation using [`tf.tensordot()`](https://www.tensorflow.org/api_docs/python/tf/tensordot). 
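As a small worked example (with tensors `u`, `v`, `m` and `n` introduced only for illustration), `tf.tensordot()` with `axes=1` gives the familiar dot product for two vectors and matches `tf.matmul()` for two matrices whose inner dimensions line up.

```python
import tensorflow as tf

u = tf.constant([1, 2, 3])
v = tf.constant([4, 5, 6])

# For two vectors, axes=1 multiplies matching elements and sums them: 1*4 + 2*5 + 3*6.
dot = tf.tensordot(u, v, axes=1)
print(int(dot))  # 32

# For two matrices with matching inner dimensions, axes=1 matches tf.matmul.
m = tf.constant([[1, 2], [3, 4]])
n = tf.constant([[5, 6], [7, 8]])
same_as_matmul = tf.reduce_all(tf.tensordot(m, n, axes=1) == tf.matmul(m, n))
print(bool(same_as_matmul))  # True
```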


In [None]:
X, Y

(<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4],
        [5, 6]], dtype=int32)>, <tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[ 7,  8],
        [ 9, 10],
        [11, 12]], dtype=int32)>)

In [None]:
# perform dot product on X and Y (requires X or Y to be transposed)
tf.tensordot(tf.transpose(X), Y, axes=1)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]], dtype=int32)>

In [None]:
# perform matrix multiplication between X and Y (transpose)
tf.matmul(X, tf.transpose(Y))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 23,  29,  35],
       [ 53,  67,  81],
       [ 83, 105, 127]], dtype=int32)>

In [None]:
# perform matrix multiplication between X and Y (reshape)
tf.matmul(X, tf.reshape(Y, shape=(2,3)))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]], dtype=int32)>

## Check the value of Y, reshape Y and transposed Y

In [None]:
# Check the value of Y, reshape Y and transposed Y
print('Normal Y: ', Y)
print('\n Y reshaped: ', tf.reshape(Y, shape=(2, 3)))
print('\n Y transposed: ', tf.transpose(Y))

Normal Y:  tf.Tensor(
[[ 7  8]
 [ 9 10]
 [11 12]], shape=(3, 2), dtype=int32)

 Y reshaped:  tf.Tensor(
[[ 7  8  9]
 [10 11 12]], shape=(2, 3), dtype=int32)

 Y transposed:  tf.Tensor(
[[ 7  9 11]
 [ 8 10 12]], shape=(2, 3), dtype=int32)


As you can see, the outputs of `tf.reshape()` and `tf.transpose()` when called on `Y`, even though they have the same shape, are different.

This can be explained by the default behaviour of each method:
* [`tf.reshape()`](https://www.tensorflow.org/api_docs/python/tf/reshape) - changes the shape of the given tensor (first) and then inserts the values in the order they appear (in our case, 7, 8, 9, 10, 11, 12).
* [`tf.transpose()`](https://www.tensorflow.org/api_docs/python/tf/transpose) - swaps the order of the axes. By default the order of the axes is reversed (so for a matrix, rows become columns), but the order can be changed using the [`perm` parameter](https://www.tensorflow.org/api_docs/python/tf/transpose).


So which should you use?

Again, most of the time these operations (when they need to be run, such as during the training of a neural network) will be implemented for you.

But generally, whenever performing a matrix multiplication and the shapes of two matrices don't line up, you will transpose (not reshape) one of them in order to line them up.
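Because transposing before a matrix multiplication is so common, `tf.matmul()` also accepts `transpose_a` and `transpose_b` arguments. The sketch below uses `left` and `right` (names introduced here, holding the same values as `X` and `Y`) to show that both approaches give the same result.

```python
import tensorflow as tf

left = tf.constant([[1, 2], [3, 4], [5, 6]])       # shape (3, 2)
right = tf.constant([[7, 8], [9, 10], [11, 12]])   # shape (3, 2)

# Transpose explicitly before multiplying ...
explicit = tf.matmul(left, tf.transpose(right))
# ... or let tf.matmul transpose the second argument.
builtin = tf.matmul(left, right, transpose_b=True)

print(explicit.shape)                            # (3, 3)
print(bool(tf.reduce_all(explicit == builtin)))  # True
```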

## Matrix multiplication tidbits
* If we transposed `Y`, it would be represented as $\mathbf{Y}^\mathsf{T}$ (note the capital T for transpose).
* Get an illustrative view of matrix multiplication [by Math is Fun](https://www.mathsisfun.com/algebra/matrix-multiplying.html).
* Try a hands-on demo of matrix multiplication: http://matrixmultiplication.xyz/ (shown below).

![visual demo of matrix multiplication](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-matrix-multiply-crop.gif)


------



# Changing the datatype of a tensor

Sometimes we want to alter the default datatype of a tensor.

This is common when we want to compute using less precision (e.g. 16-bit floating point numbers vs. 32-bit floating point numbers).

Computing with less precision is useful on devices with less computing capacity such as mobile devices (because the fewer bits, the less space the computations require).

We can change the datatype of a tensor using [`tf.cast()`](https://www.tensorflow.org/api_docs/python/tf/cast).

In [None]:
tf.__version__

'2.4.1'

In [None]:
# create a new tensor with default datatype (float32)
B = tf.constant([1.7, 7.4])
B.dtype

tf.float32

In [None]:
C = tf.constant([7, 10])
C.dtype

tf.int32

In [None]:
# Change from float32 to float16 (reduced precision)
D = tf.cast(B, dtype=tf.float16)
D.dtype

tf.float16

In [None]:
# change from int32 to float32
E = tf.cast(C, dtype=tf.float32)
E.dtype

tf.float32

In [None]:
E_float16 = tf.cast(E, dtype=tf.float16)
E_float16.dtype

tf.float16
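To see what casting costs and saves, here is a short sketch (the names `values`, `half` and `whole` are illustrative) that compares the number of bytes each element takes and shows that casting floats to an integer type drops the fractional part.

```python
import tensorflow as tf

values = tf.constant([1.7, 7.4])            # float32 by default
half = tf.cast(values, dtype=tf.float16)    # 16-bit floats
whole = tf.cast(values, dtype=tf.int32)     # integers: the fractional part is dropped

# Bytes used per element for each dtype.
print(values.dtype.size, half.dtype.size)   # 4 2
print(whole.numpy().tolist())               # [1, 7]
```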

----

# Getting the absolute value
Sometimes we want the absolute values (every value made non-negative) of the elements in a tensor.

To do so, we can use [`tf.abs()`](https://www.tensorflow.org/api_docs/python/tf/math/abs).

In [None]:
test = tf.constant([[-7, -9]])
test

<tf.Tensor: shape=(1, 2), dtype=int32, numpy=array([[-7, -9]], dtype=int32)>

In [None]:
tf.abs(test) # Get the absolute values

<tf.Tensor: shape=(1, 2), dtype=int32, numpy=array([[7, 9]], dtype=int32)>

----

# Aggregating tensors

### Finding the min, max, mean, sum (aggregation)

We can quickly aggregate (perform a calculation on a whole tensor) tensors to find things like the minimum value, maximum value, mean and sum of all the elements.

To do so, aggregation methods typically have the syntax `reduce_[action]()`, such as:
* [`tf.reduce_min()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_min) - find the minimum value in a tensor.
* [`tf.reduce_max()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_max) - find the maximum value in a tensor (helpful for when you want to find the highest prediction probability).
* [`tf.reduce_mean()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_mean) - find the mean of all elements in a tensor.
* [`tf.reduce_sum()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_sum) - find the sum of all elements in a tensor.
* **Note:** typically, each of these is under the `math` module, e.g. `tf.math.reduce_min()` but we can use the alias `tf.reduce_min()`.

Let's see them in action.


**Aggregating tensors**: condensing them from multiple values down to a smaller number of values.
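The `reduce_*` functions also accept an `axis` argument, which condenses a tensor along one dimension instead of all of them. A minimal sketch with a hypothetical 2 x 3 tensor named `grid`:

```python
import tensorflow as tf

grid = tf.constant([[1, 2, 3],
                    [4, 5, 6]])

total = tf.reduce_sum(grid)                 # all elements -> a single scalar
column_sums = tf.reduce_sum(grid, axis=0)   # collapse the rows -> one value per column
row_sums = tf.reduce_sum(grid, axis=1)      # collapse the columns -> one value per row

print(int(total))                     # 21
print(column_sums.numpy().tolist())   # [5, 7, 9]
print(row_sums.numpy().tolist())      # [6, 15]
```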

In [None]:
import numpy as np

In [None]:
# create a random tensor with values between 0 and 100
sample = tf.constant(np.random.randint(0, 100, size=50))
sample

<tf.Tensor: shape=(50,), dtype=int64, numpy=
array([48, 24, 67,  6, 55, 34, 38, 33, 98, 37,  3, 17, 45, 28, 85, 30, 39,
       51, 99, 36,  4, 38,  6, 97, 89, 38, 39, 88,  5, 91, 30, 27, 96, 99,
       84, 72, 51, 52,  4, 90, 98,  8, 25, 92,  3, 46, 15,  0, 86, 38])>

In [None]:
# find the minimum
tf.reduce_min(sample)

<tf.Tensor: shape=(), dtype=int64, numpy=0>

In [None]:
# find the maximum
tf.reduce_max(sample)

<tf.Tensor: shape=(), dtype=int64, numpy=99>

In [None]:
# find the mean
tf.reduce_mean(sample)

<tf.Tensor: shape=(), dtype=int64, numpy=47>

In [None]:
# find the sum
tf.reduce_sum(sample)

<tf.Tensor: shape=(), dtype=int64, numpy=2384>

In [None]:
import tensorflow_probability as tfp

In [None]:
# find the variance
tfp.stats.variance(sample)

<tf.Tensor: shape=(), dtype=int64, numpy=1039>

In [None]:
tf.math.reduce_variance(tf.cast(sample, dtype=tf.float32))

<tf.Tensor: shape=(), dtype=float32, numpy=1039.3776>

In [None]:
# find the standard deviation
tf.math.reduce_std(tf.cast(sample, dtype=tf.float32))

<tf.Tensor: shape=(), dtype=float32, numpy=32.23938>


---------

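We can also find the standard deviation ([`tf.math.reduce_std()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_std)) and variance ([`tf.math.reduce_variance()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_variance)) of elements in a tensor using similar methods. Unlike the functions above, these two are only available under the `math` module, and they require a float tensor (hence the `tf.cast()` calls above).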
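
One detail worth noting from the outputs above: `tf.reduce_mean()` keeps the dtype of its input, so the mean of an integer tensor is itself an integer (47 rather than 47.68 for `sample`). The following is a minimal sketch of that behaviour, using a small made-up tensor (`values` is not part of the notebook) instead of the random one above.

```python
import tensorflow as tf

# A small integer tensor (made-up values, chosen so the mean is not a whole number)
values = tf.constant([1, 2, 4])

# The mean of an integer tensor stays an integer: 7 / 3 is truncated to 2
int_mean = tf.reduce_mean(values)

# Cast to float first to keep the fractional part (roughly 2.33)
float_mean = tf.reduce_mean(tf.cast(values, dtype=tf.float32))

print(int_mean.numpy(), float_mean.numpy())
```

If the fractional part matters, casting before aggregating is the safer habit.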

# Finding the positional maximum and minimum

How about finding the position in a tensor where the maximum value occurs?

This is helpful when you want to line up your labels (say `['Green', 'Blue', 'Red']`) with your prediction probabilities tensor (e.g. `[0.98, 0.01, 0.01]`).

In this case, the predicted label (the one with the highest prediction probability) would be `'Green'`.

We can find the position of the maximum (and, if required, the minimum) with the following:
* [`tf.argmax()`](https://www.tensorflow.org/api_docs/python/tf/math/argmax) - find the position of the maximum element in a given tensor.
* [`tf.argmin()`](https://www.tensorflow.org/api_docs/python/tf/math/argmin) - find the position of the minimum element in a given tensor.

In [None]:
# create a new tensor for finding positional minimum and maximum
tf.random.set_seed(42)
F = tf.random.uniform(shape=[50])
F

<tf.Tensor: shape=(50,), dtype=float32, numpy=
array([0.6645621 , 0.44100678, 0.3528825 , 0.46448255, 0.03366041,
       0.68467236, 0.74011743, 0.8724445 , 0.22632635, 0.22319686,
       0.3103881 , 0.7223358 , 0.13318717, 0.5480639 , 0.5746088 ,
       0.8996835 , 0.00946367, 0.5212307 , 0.6345445 , 0.1993283 ,
       0.72942245, 0.54583454, 0.10756552, 0.6767061 , 0.6602763 ,
       0.33695042, 0.60141766, 0.21062577, 0.8527372 , 0.44062173,
       0.9485276 , 0.23752594, 0.81179297, 0.5263394 , 0.494308  ,
       0.21612847, 0.8457197 , 0.8718841 , 0.3083862 , 0.6868038 ,
       0.23764038, 0.7817228 , 0.9671384 , 0.06870162, 0.79873943,
       0.66028714, 0.5871513 , 0.16461694, 0.7381023 , 0.32054043],
      dtype=float32)>

In [None]:
# Find the positional maximum (the index where the maximum value is located)
tf.argmax(F)

<tf.Tensor: shape=(), dtype=int64, numpy=42>

In [None]:
np.argmax(F)

42

In [None]:
# Index F at the position of its largest value
# to get the largest value itself
F[tf.argmax(F)]

<tf.Tensor: shape=(), dtype=float32, numpy=0.9671384>

In [None]:
# Find the maximum value using reduce_max
tf.reduce_max(F)

<tf.Tensor: shape=(), dtype=float32, numpy=0.9671384>

Both methods return the same value.

In [None]:
# Check for equality
# As the two values are equal, the assertion passes without an error
assert F[tf.argmax(F)] == tf.reduce_max(F)

In [None]:
F[tf.argmax(F)] == tf.reduce_max(F)

<tf.Tensor: shape=(), dtype=bool, numpy=True>

In [None]:
# find the minimum position
tf.argmin(F)

<tf.Tensor: shape=(), dtype=int64, numpy=16>

In [None]:
# find the minimum value
F[tf.argmin(F)]

<tf.Tensor: shape=(), dtype=float32, numpy=0.009463668>

In [None]:
# Find the maximum element position of F
print(f"The maximum value of F is at position: {tf.argmax(F).numpy()}") 
print(f"The maximum value of F is: {tf.reduce_max(F).numpy()}") 
print(f"Using tf.argmax() to index F, the maximum value of F is: {F[tf.argmax(F)].numpy()}")
print(f"Are the two max values the same (they should be)? {F[tf.argmax(F)].numpy() == tf.reduce_max(F).numpy()}")

The maximum value of F is at position: 42
The maximum value of F is: 0.967138409614563
Using tf.argmax() to index F, the maximum value of F is: 0.967138409614563
Are the two max values the same (they should be)? True


-------

# Squeezing a tensor (removing all single dimensions)

If we need to remove single dimensions from a tensor (dimensions with size 1), we can use `tf.squeeze()`.

* [`tf.squeeze()`](https://www.tensorflow.org/api_docs/python/tf/squeeze) - remove all dimensions of 1 from a tensor.

Example: the tensor in the following example has many dimensions of size 1, so we can squeeze it down to a single dimension.


In [None]:
# Create a rank 5 (5 dimensions) tensor of 50 numbers between 0 and 1
tf.random.set_seed(42)
G = tf.constant(tf.random.uniform(shape=[50]), shape=(1, 1, 1, 1, 50))
G

<tf.Tensor: shape=(1, 1, 1, 1, 50), dtype=float32, numpy=
array([[[[[0.6645621 , 0.44100678, 0.3528825 , 0.46448255, 0.03366041,
           0.68467236, 0.74011743, 0.8724445 , 0.22632635, 0.22319686,
           0.3103881 , 0.7223358 , 0.13318717, 0.5480639 , 0.5746088 ,
           0.8996835 , 0.00946367, 0.5212307 , 0.6345445 , 0.1993283 ,
           0.72942245, 0.54583454, 0.10756552, 0.6767061 , 0.6602763 ,
           0.33695042, 0.60141766, 0.21062577, 0.8527372 , 0.44062173,
           0.9485276 , 0.23752594, 0.81179297, 0.5263394 , 0.494308  ,
           0.21612847, 0.8457197 , 0.8718841 , 0.3083862 , 0.6868038 ,
           0.23764038, 0.7817228 , 0.9671384 , 0.06870162, 0.79873943,
           0.66028714, 0.5871513 , 0.16461694, 0.7381023 , 0.32054043]]]]],
      dtype=float32)>

In [None]:
G.shape

TensorShape([1, 1, 1, 1, 50])

In [None]:
# Squeeze tensor G (remove all dimensions of size 1)
G_squeezed = tf.squeeze(G)
G_squeezed, G_squeezed.shape

(<tf.Tensor: shape=(50,), dtype=float32, numpy=
 array([0.6645621 , 0.44100678, 0.3528825 , 0.46448255, 0.03366041,
        0.68467236, 0.74011743, 0.8724445 , 0.22632635, 0.22319686,
        0.3103881 , 0.7223358 , 0.13318717, 0.5480639 , 0.5746088 ,
        0.8996835 , 0.00946367, 0.5212307 , 0.6345445 , 0.1993283 ,
        0.72942245, 0.54583454, 0.10756552, 0.6767061 , 0.6602763 ,
        0.33695042, 0.60141766, 0.21062577, 0.8527372 , 0.44062173,
        0.9485276 , 0.23752594, 0.81179297, 0.5263394 , 0.494308  ,
        0.21612847, 0.8457197 , 0.8718841 , 0.3083862 , 0.6868038 ,
        0.23764038, 0.7817228 , 0.9671384 , 0.06870162, 0.79873943,
        0.66028714, 0.5871513 , 0.16461694, 0.7381023 , 0.32054043],
       dtype=float32)>, TensorShape([50]))
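
By default `tf.squeeze()` removes every size-1 dimension. As a small sketch (the shapes and names here are made up for illustration), its optional `axis` argument restricts the squeeze to specific dimensions, which can help when only one of them, such as a leading batch dimension, should go.

```python
import tensorflow as tf

# A made-up tensor with a size-1 dimension at both ends
x = tf.zeros(shape=(1, 224, 224, 1))

# Without axis, every size-1 dimension is removed
all_squeezed = tf.squeeze(x)

# With axis=0, only the first dimension is removed
first_squeezed = tf.squeeze(x, axis=0)

print(all_squeezed.shape, first_squeezed.shape)
```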

---

# One-hot encoding

If we have a tensor of indices and would like to one-hot encode it, we can use [`tf.one_hot()`](https://www.tensorflow.org/api_docs/python/tf/one_hot).

We should also specify the `depth` parameter (the number of classes, which sets the length of each one-hot vector).

In [None]:
# create a list of indices
some_list = [0, 1, 2, 3] 

# one hot encode our list of indices
tf.one_hot(some_list, depth=4)

<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]], dtype=float32)>
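
To get a feel for what `depth` does, the sketch below (with made-up indices) encodes an index that is too large for the chosen `depth`. Such an index does not raise an error; its row is simply filled with the off value.

```python
import tensorflow as tf

# Made-up indices: 5 does not fit into a depth of 3
indices = [0, 2, 5]

# Rows for 0 and 2 get a single 1; the row for 5 is all zeros
encoded = tf.one_hot(indices, depth=3)

print(encoded.numpy())
```

This is worth keeping in mind when the values being encoded don't start at 0: the `depth` needs to be larger than the largest index.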

### We can also specify values for `on_value` and `off_value` instead of the default `1` and `0`.

In [None]:
# specify custom values for one hot encoding
tf.one_hot(some_list, depth=4, on_value='Yaeee', off_value="Nayyy")

<tf.Tensor: shape=(4, 4), dtype=string, numpy=
array([[b'Yaeee', b'Nayyy', b'Nayyy', b'Nayyy'],
       [b'Nayyy', b'Yaeee', b'Nayyy', b'Nayyy'],
       [b'Nayyy', b'Nayyy', b'Yaeee', b'Nayyy'],
       [b'Nayyy', b'Nayyy', b'Nayyy', b'Yaeee']], dtype=object)>

-----

# Squaring, log, square root

Many other common mathematical operations you'd like to perform at some stage probably already exist.

Let's take a look at:
* [`tf.square()`](https://www.tensorflow.org/api_docs/python/tf/math/square) - get the square of every value in a tensor. 
* [`tf.sqrt()`](https://www.tensorflow.org/api_docs/python/tf/math/sqrt) - get the square root of every value in a tensor (**note:** the elements need to be floats or this will error).
* [`tf.math.log()`](https://www.tensorflow.org/api_docs/python/tf/math/log) - get the natural log of every value in a tensor (elements need to be floats).

In [None]:
# create a new tensor
H = tf.range(1, 10)
H

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)>

In [None]:
# Find the square
tf.square(H)

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([ 1,  4,  9, 16, 25, 36, 49, 64, 81], dtype=int32)>

In [None]:
# Find the square root (the method requires non-integer values)
tf.sqrt(tf.cast(H, dtype=tf.float16))

<tf.Tensor: shape=(9,), dtype=float16, numpy=
array([1.   , 1.414, 1.732, 2.   , 2.236, 2.45 , 2.646, 2.828, 3.   ],
      dtype=float16)>

In [None]:
# Find the log (the method requires non-integer values)
tf.math.log(tf.cast(H, dtype=tf.float32))

<tf.Tensor: shape=(9,), dtype=float32, numpy=
array([0.       , 0.6931472, 1.0986123, 1.3862944, 1.609438 , 1.7917595,
       1.9459102, 2.0794415, 2.1972246], dtype=float32)>
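
Since `tf.sqrt()` and `tf.math.log()` need float inputs, one way to avoid repeating the cast is a small wrapper. The helper below (`sqrt_any_dtype`) is hypothetical, not part of TensorFlow; it is only a sketch of the cast-then-compute pattern used in the cells above.

```python
import tensorflow as tf

def sqrt_any_dtype(tensor):
    """Hypothetical helper: cast non-float tensors to float32 before tf.sqrt()."""
    if not tensor.dtype.is_floating:
        tensor = tf.cast(tensor, dtype=tf.float32)
    return tf.sqrt(tensor)

# An integer tensor would make tf.sqrt() fail if passed in directly
int_tensor = tf.constant([4, 9, 16])
roots = sqrt_any_dtype(int_tensor)

print(roots.numpy())
```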

--------

# Manipulating `tf.Variable` tensors

Tensors created with `tf.Variable()` can be changed in place using methods such as:

* [`.assign()`](https://www.tensorflow.org/api_docs/python/tf/Variable#assign) - assign new values to a variable tensor.
* [`.assign_add()`](https://www.tensorflow.org/api_docs/python/tf/Variable#assign_add) - add to the existing values of a variable tensor and reassign the result.
* [`.assign_sub()`](https://www.tensorflow.org/api_docs/python/tf/Variable#assign_sub) - subtract from the existing values of a variable tensor and reassign the result.

In [None]:
# Create a variable tensor
I = tf.Variable(np.arange(0, 5))
I

<tf.Variable 'Variable:0' shape=(5,) dtype=int64, numpy=array([0, 1, 2, 3, 4])>

In [None]:
# Assign the final value a new value of 50
I.assign([0, 1, 2, 3, 50])

<tf.Variable 'Variable:0' shape=(5,) dtype=int64, numpy=array([ 0,  1,  2,  3, 50])>

In [None]:
# The change happens in place (the last value is now 50, not 4)
I

<tf.Variable 'Variable:0' shape=(5,) dtype=int64, numpy=array([ 0,  1,  2,  3, 50])>

In [None]:
# Add 10 to every element in I
I.assign_add([10, 10, 10, 10, 10])

<tf.Variable 'UnreadVariable' shape=(5,) dtype=int64, numpy=array([10, 11, 12, 13, 60])>

In [None]:
# Again, the change happens in place
I

<tf.Variable 'Variable:0' shape=(5,) dtype=int64, numpy=array([10, 11, 12, 13, 60])>

In [None]:
# Subtract 2 from every element in I
I.assign_sub([2, 2, 2, 2, 2])

<tf.Variable 'UnreadVariable' shape=(5,) dtype=int64, numpy=array([ 8,  9, 10, 11, 58])>

In [None]:
I

<tf.Variable 'Variable:0' shape=(5,) dtype=int64, numpy=array([ 8,  9, 10, 11, 58])>
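
The cells above pass a full list of values to `.assign()`. As a minimal sketch (the names `v` and `c` are made up), a single position of a variable can also be updated by indexing first, whereas a tensor created with `tf.constant()` has no `.assign()` method at all.

```python
import tensorflow as tf

# A variable tensor can be changed in place
v = tf.Variable([0, 1, 2, 3, 4])

# Index the variable, then assign: only the last element changes
v[4].assign(50)
print(v.numpy())

# A constant tensor offers no assign method, so it cannot be changed in place
c = tf.constant([0, 1, 2, 3, 4])
print(hasattr(c, "assign"))
```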

-----

# Tensors and NumPy

We've seen some examples of tensors interacting with NumPy arrays, such as using NumPy arrays to create tensors.

Tensors can also be converted to NumPy arrays using:

* `np.array()` - pass a tensor to convert to an ndarray (NumPy's main datatype).
* `tensor.numpy()` - call on a tensor to convert to an ndarray.

Doing this is helpful as it makes tensors iterable as well as allows us to use any of NumPy's methods on them.

In [None]:
# Create a tensor from a NumPy array
J = tf.constant(np.array([3., 7., 10.]))
J

<tf.Tensor: shape=(3,), dtype=float64, numpy=array([ 3.,  7., 10.])>

In [None]:
# Convert tensor J to NumPy with np.array()
np.array(J), type(np.array(J))

(array([ 3.,  7., 10.]), numpy.ndarray)

In [None]:
# Convert tensor J to NumPy with .numpy()
J.numpy(), type(J.numpy())

(array([ 3.,  7., 10.]), numpy.ndarray)

In [None]:
J = tf.constant([3.])
J

<tf.Tensor: shape=(1,), dtype=float32, numpy=array([3.], dtype=float32)>

In [None]:
J.numpy()[0]

3.0

**By default, tensors created from Python floats have `dtype=float32`, whereas NumPy arrays of floats have `dtype=float64`.**

This is because neural networks (which are usually built with TensorFlow) can generally work very well with less precision (32-bit rather than 64-bit).

In [None]:
# Create one tensor from a NumPy array and one from a Python list
numpy_J = tf.constant(np.array([3., 7., 10.])) # will be float64 (due to NumPy)
tensor_J = tf.constant([3., 7., 10.]) # will be float32 (due to being TensorFlow default)

# check the datatype of each one
numpy_J.dtype, tensor_J.dtype

(tf.float64, tf.float32)
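
If a tensor built from a NumPy array should follow the TensorFlow default instead, it can be cast after creation. The following is a small sketch with made-up variable names, reusing the values from the cell above.

```python
import numpy as np
import tensorflow as tf

# NumPy floats default to float64, and tf.constant() keeps that dtype
array = np.array([3., 7., 10.])
tensor_f64 = tf.constant(array)

# Cast down to float32 to match the TensorFlow default
tensor_f32 = tf.cast(tensor_f64, dtype=tf.float32)

# Converting back to NumPy preserves whatever dtype the tensor has
back_to_numpy = tensor_f32.numpy()

print(tensor_f64.dtype, tensor_f32.dtype, back_to_numpy.dtype)
```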

------

# Checking for access to GPUs

We've mentioned GPUs plenty of times throughout this notebook.

So how do we check if we've got one available?

We can check if we've got access to a GPU using [`tf.config.list_physical_devices()`](https://www.tensorflow.org/guide/gpu).

In [None]:
tf.config.list_physical_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

In [None]:
tf.config.list_physical_devices('GPU')

[]

In [None]:
tf.config.list_physical_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

In [None]:
tf.config.list_physical_devices('GPU')

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

In [None]:
# Find information about the GPU using `nvidia-smi`
!nvidia-smi

Tue Mar 23 15:38:51 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.56       Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   52C    P8    10W /  70W |      3MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

> 🔑 **Note:** If you have access to a GPU, TensorFlow will automatically use it whenever possible.
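
Automatic placement is usually all that's needed, but the device check above can also drive an explicit choice. The sketch below is one possible pattern, not a required step: it picks a device name based on `tf.config.list_physical_devices('GPU')` and runs a small made-up matrix multiplication there with `tf.device()`.

```python
import tensorflow as tf

# Fall back to the CPU when no GPU is visible to TensorFlow
gpus = tf.config.list_physical_devices("GPU")
device_name = "/GPU:0" if gpus else "/CPU:0"

# Operations created inside this block are placed on the chosen device
with tf.device(device_name):
    a = tf.constant([[1., 2.], [3., 4.]])
    b = tf.constant([[1., 0.], [0., 1.]])
    result = tf.matmul(a, b)

# result.device reports where the result tensor ended up
print(device_name, result.device)
```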


---------


# Using `@tf.function`

In your TensorFlow adventures, you might come across Python functions which have the decorator [`@tf.function`](https://www.tensorflow.org/api_docs/python/tf/function).

If you aren't sure what Python decorators do, [read RealPython's guide on them](https://realpython.com/primer-on-python-decorators/).

But in short, decorators modify a function in one way or another.

In the case of `@tf.function`, the decorator turns a Python function into a callable TensorFlow graph. That's a fancy way of saying that if you've written your own Python function and you decorate it with `@tf.function`, then when you export your code (to potentially run on another device), TensorFlow will attempt to convert it into a fast(er) version of itself (by making it part of a computation graph).

For more on this, read the [Better performance with tf.function](https://www.tensorflow.org/guide/function) guide.

In [None]:
import numpy as np

In [None]:
# create a simple function
def function(x,y):
  return x+y

x = tf.constant(np.arange(0, 10))
y = tf.constant(np.arange(10, 20))
function(x, y)

<tf.Tensor: shape=(10,), dtype=int64, numpy=array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28])>

In [None]:
# Create the same function and decorate it with tf.function
@tf.function
def tf_function(x,y):
  return x+y

tf_function(x, y)

<tf.Tensor: shape=(10,), dtype=int64, numpy=array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28])>

If you noticed no difference between the above two functions (the decorated one and the non-decorated one) you'd be right.

Much of the difference happens behind the scenes, one of the main ones being potential code speed-ups where possible.
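
One way to glimpse what happens behind the scenes is to put a plain Python side effect inside a decorated function. In the sketch below (the names `trace_log` and `traced_add` are made up), the Python body runs while TensorFlow traces the function into a graph; a second call with the same kind of inputs reuses that graph, so the side effect does not happen again.

```python
import tensorflow as tf

trace_log = []  # records each time the Python body actually runs

@tf.function
def traced_add(x, y):
    # Python side effect: only executed while TensorFlow traces the function
    trace_log.append("traced")
    return x + y

a = tf.constant([1, 2, 3])
b = tf.constant([10, 20, 30])

first = traced_add(a, b)   # first call: traces the function and builds a graph
second = traced_add(a, b)  # same input dtype and shape: the existing graph is reused

print(len(trace_log))
```

Calling the function with inputs of a different dtype or shape would typically trigger another trace.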

--------

# Exercises

1. Create a vector, scalar, matrix and tensor with values of your choosing using `tf.constant()`.
2. Find the shape, rank and size of the tensors you created in 1.
3. Create two tensors containing random values between 0 and 1 with shape `[5, 300]`.
4. Multiply the two tensors you created in 3 using matrix multiplication.
5. Multiply the two tensors you created in 3 using dot product.
6. Create a tensor with random values between 0 and 1 with shape `[224, 224, 3]`.
7. Find the min and max values of the tensor you created in 6.
8. Create a tensor with random values of shape `[1, 224, 224, 3]` then squeeze it to change the shape to `[224, 224, 3]`.
9. Create a tensor with shape `[10]` using your own choice of values, then find the index which has the maximum value.
10. One-hot encode the tensor you created in 9.

In [5]:
import numpy as np

In [8]:
# 1. Create a vector, scalar, matrix and tensor with values of your choosing using tf.constant().
scalar = tf.constant(12)
vector = tf.constant([1, 2])
matrix = tf.constant([[1, 2], [3, 4]])
tensor = tf.random.Generator.from_seed(42)
tensor = tensor.normal(shape=(3, 2))

scalar, vector, matrix, tensor

(<tf.Tensor: shape=(), dtype=int32, numpy=12>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 2], dtype=int32)>,
 <tf.Tensor: shape=(2, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4]], dtype=int32)>,
 <tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193765, -1.8107855 ]], dtype=float32)>)

In [10]:
# 2. Find the shape, rank and size of the tensors you created in 1.
print('Shape of tensor: ', tensor.shape)
print('Rank (dimensions) of tensor: ', tensor.ndim)
print('Size of tensor: ', tf.size(tensor))

Shape of tensor:  (3, 2)
Rank (dimensions) of tensor:  2
Size of tensor:  tf.Tensor(6, shape=(), dtype=int32)


In [14]:
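# 3. Create two tensors containing random values between 0 and 1 with shape [5, 300].
# Note: normal() samples from a normal distribution, so these values are not limited to the range 0 to 1
tensor1 = tf.random.Generator.from_seed(42)
tensor1 = tensor1.normal(shape=(5, 300))

# Note: np.arange(1, 51, 1500) yields just [1], so tf.constant() fills the whole tensor with 1s (not random values)
tensor2 = tf.constant(np.arange(1, 51, 1500), shape=(5, 300))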

In [16]:
tensor1.shape, tensor2.shape

(TensorShape([5, 300]), TensorShape([5, 300]))
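
For values that actually fall between 0 and 1, a uniform distribution is one option. The following is a sketch of an alternative answer to exercise 3, assuming the generator's `uniform()` method with its default range; the names `gen`, `random_1` and `random_2` are made up.

```python
import tensorflow as tf

# One seeded generator; each call to uniform() draws fresh values
gen = tf.random.Generator.from_seed(42)

# Float tensors drawn from a uniform distribution default to the range [0, 1)
random_1 = gen.uniform(shape=(5, 300))
random_2 = gen.uniform(shape=(5, 300))

print(random_1.shape, random_2.shape)
print(tf.reduce_min(random_1).numpy(), tf.reduce_max(random_1).numpy())
```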

In [18]:
tf.reshape(tensor2, shape=(300, 5))

<tf.Tensor: shape=(300, 5), dtype=int64, numpy=
array([[1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       ...,
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1]])>

In [19]:
# 4. Multiply the two tensors you created in 3 using matrix multiplication.
tf.matmul(tensor1, tf.cast(tf.reshape(tensor2, shape=(300, 5)), dtype=tf.float32))

<tf.Tensor: shape=(5, 5), dtype=float32, numpy=
array([[-18.116056  , -18.116056  , -18.116056  , -18.116056  ,
        -18.116056  ],
       [-16.531065  , -16.531065  , -16.531065  , -16.531065  ,
        -16.531065  ],
       [  6.35018   ,   6.35018   ,   6.35018   ,   6.35018   ,
          6.35018   ],
       [-24.067091  , -24.067091  , -24.067091  , -24.067091  ,
        -24.067091  ],
       [  0.62193453,   0.62193453,   0.62193453,   0.62193453,
          0.62193453]], dtype=float32)>

In [23]:
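# 5. Multiply the two tensors you created in 3 using dot product.
# Note: axes=0 computes an outer product (hence the rank 4 result below); axes=1 would contract the shared dimension like a dot product
tf.tensordot(tensor1, tf.cast(tf.transpose(tensor2), dtype=tf.float32), axes=0)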

<tf.Tensor: shape=(5, 300, 300, 5), dtype=float32, numpy=
array([[[[-0.7565803 , -0.7565803 , -0.7565803 , -0.7565803 ,
          -0.7565803 ],
         [-0.7565803 , -0.7565803 , -0.7565803 , -0.7565803 ,
          -0.7565803 ],
         [-0.7565803 , -0.7565803 , -0.7565803 , -0.7565803 ,
          -0.7565803 ],
         ...,
         [-0.7565803 , -0.7565803 , -0.7565803 , -0.7565803 ,
          -0.7565803 ],
         [-0.7565803 , -0.7565803 , -0.7565803 , -0.7565803 ,
          -0.7565803 ],
         [-0.7565803 , -0.7565803 , -0.7565803 , -0.7565803 ,
          -0.7565803 ]],

        [[-0.06854702, -0.06854702, -0.06854702, -0.06854702,
          -0.06854702],
         [-0.06854702, -0.06854702, -0.06854702, -0.06854702,
          -0.06854702],
         [-0.06854702, -0.06854702, -0.06854702, -0.06854702,
          -0.06854702],
         ...,
         [-0.06854702, -0.06854702, -0.06854702, -0.06854702,
          -0.06854702],
         [-0.06854702, -0.06854702, -0.06854702, -0.
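
The rank 4 output above comes from `axes=0`. As a sketch of how the `axes` argument changes the result (using made-up uniform tensors `a` and `b` rather than `tensor1` and `tensor2`), `axes=1` contracts the shared dimension of size 300 and gives the same result as `tf.matmul()`.

```python
import tensorflow as tf

gen = tf.random.Generator.from_seed(42)
a = gen.uniform(shape=(5, 300))
b = gen.uniform(shape=(5, 300))

# Transpose b so the inner dimensions line up: (5, 300) x (300, 5)
b_t = tf.transpose(b)

matmul_result = tf.matmul(a, b_t)             # shape (5, 5)
dot_result = tf.tensordot(a, b_t, axes=1)     # shape (5, 5), same values as matmul
outer_result = tf.tensordot(a, b_t, axes=0)   # shape (5, 300, 300, 5), an outer product

print(matmul_result.shape, dot_result.shape, outer_result.shape)
```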

In [27]:
# 6. Create a tensor with random values between 0 and 1 with shape [224, 224, 3].
tensor3 = tf.random.Generator.from_seed(42)
tensor3 = tensor3.uniform(shape=[224, 224, 3])
tensor3.shape

TensorShape([224, 224, 3])

In [33]:
224*224*3

150528

In [36]:
# 7. Find the min and max values of the tensor you created in 6.
tensor3_1D = tf.reshape(tensor3, shape=[150528])
tensor3_1D.shape

TensorShape([150528])

In [37]:
print('Minimum value: ', tensor3_1D[tf.argmin(tensor3_1D)])
print('Maximum value: ', tensor3_1D[tf.argmax(tensor3_1D)])

Minimum value:  tf.Tensor(4.053116e-06, shape=(), dtype=float32)
Maximum value:  tf.Tensor(0.99998736, shape=(), dtype=float32)
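
Reshaping to 1D and indexing with `tf.argmin()` and `tf.argmax()` works, but the aggregation functions from earlier in the notebook reduce over every dimension by default, so a shorter route is possible. A minimal sketch (the name `image_like` is made up, and the values will differ from `tensor3`):

```python
import tensorflow as tf

# A made-up tensor with the same shape as in exercise 6
image_like = tf.random.uniform(shape=[224, 224, 3])

# No reshape needed: both functions reduce across all dimensions
min_value = tf.reduce_min(image_like)
max_value = tf.reduce_max(image_like)

print(min_value.numpy(), max_value.numpy())
```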


In [38]:
# 8. Create a tensor with random values of shape [1, 224, 224, 3] then squeeze it to change the shape to [224, 224, 3].
tensor4 = tf.random.uniform(shape=[1, 224, 224, 3])
tensor4.shape

TensorShape([1, 224, 224, 3])

In [39]:
squeezed_tensor4 = tf.squeeze(tensor4)
squeezed_tensor4.shape

TensorShape([224, 224, 3])

In [50]:
# 9. Create a tensor with shape [10] using your own choice of values, then find the index which has the maximum value.
tensor5 = tf.random.Generator.from_seed(42)
tensor5 = tensor5.uniform(shape=[10])
tensor5

<tf.Tensor: shape=(10,), dtype=float32, numpy=
array([0.7493447 , 0.73561966, 0.45230794, 0.49039817, 0.1889317 ,
       0.52027524, 0.8736881 , 0.46921718, 0.63932586, 0.6467117 ],
      dtype=float32)>

In [51]:
print('Index of maximum value: ', tf.argmax(tensor5))
print('Maximum value: ', tf.reduce_max(tensor5))

Index of maximum value:  tf.Tensor(6, shape=(), dtype=int64)
Maximum value:  tf.Tensor(0.8736881, shape=(), dtype=float32)


In [52]:
tensor5 = tf.constant(np.arange(1, 11))
tensor5

<tf.Tensor: shape=(10,), dtype=int64, numpy=array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])>

In [53]:
print('Index of maximum value: ', tf.argmax(tensor5))
print('Maximum value: ', tf.reduce_max(tensor5))

Index of maximum value:  tf.Tensor(9, shape=(), dtype=int64)
Maximum value:  tf.Tensor(10, shape=(), dtype=int64)


In [54]:
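# 10. One-hot encode the tensor you created in 9.
# tensor5 holds the values 1 to 10, so depth must be at least 11 for every value to get its own column
tf.one_hot(tensor5, depth=11)

<tf.Tensor: shape=(10, 11), dtype=float32, numpy=
array([[0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]], dtype=float32)>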

----------

## Extensions

* Read through the [list of TensorFlow Python APIs](https://www.tensorflow.org/api_docs/python/), pick one we haven't gone through in this notebook, reverse engineer it (write out the documentation code for yourself) and figure out what it does.
* Try to create a series of tensor functions to calculate your most recent grocery bill (it's okay if you don't use the names of the items, just the price in numerical form).
  * How would you calculate your grocery bill for the month and for the year using tensors?
* Go through the [TensorFlow 2.x quick start for beginners](https://www.tensorflow.org/tutorials/quickstart/beginner) tutorial (be sure to type out all of the code yourself, even if you don't understand it).
  * Are there any functions we used in here that match what's used in there? Which are the same? Which haven't you seen before?
* Watch the video ["What's a tensor?"](https://www.youtube.com/watch?v=f5liqUk0ZTw) - a great visual introduction to many of the concepts we've covered in this notebook.

------

In [63]:
# quick start Tutorial

import tensorflow as tf

In [56]:
# load and prepare the MNIST dataset
mnist = tf.keras.datasets.mnist

In [57]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train.shape, x_test.shape, y_train.shape, y_test.shape

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


((60000, 28, 28), (10000, 28, 28), (60000,), (10000,))

In [58]:
# scale the data
x_train = x_train / 255.0
x_test = x_test / 255.0

In [64]:
# Build the tf.keras.Sequential model by stacking layers. Choose an optimizer and loss function for training:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense, Dropout

In [65]:
model = Sequential()

model.add(Flatten(input_shape=(28, 28)))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))

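# Note: MNIST has 10 classes, so Dense(10) would be enough; with 18 units the last 8 outputs never match a label
model.add(Dense(18))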

In [66]:
# For each example the model returns a vector of "logits" or "log-odds" scores, one for each class.
predictions = model(x_train[:1]).numpy()
predictions

array([[-0.41044477,  0.18888104,  0.35889456, -0.01425153,  0.4098451 ,
         0.39063883, -0.15033387, -0.5571083 ,  0.25724584, -0.18785247,
         0.10071382,  0.33244976,  0.03001545, -0.33659142, -0.08517634,
        -0.03940674,  0.16222379,  0.16948895]], dtype=float32)

In [67]:
# The tf.nn.softmax function converts these logits to "probabilities" for each class:
tf.nn.softmax(predictions).numpy()

array([[0.03434531, 0.06253906, 0.07412885, 0.05104251, 0.07800363,
        0.07651976, 0.04454841, 0.02966008, 0.06696407, 0.04290798,
        0.05726125, 0.07219423, 0.05335276, 0.03697784, 0.04754772,
        0.04977454, 0.06089396, 0.06133798]], dtype=float32)

**Note: It is possible to bake this tf.nn.softmax in as the activation function for the last layer of the network. While this can make the model output more directly interpretable, this approach is discouraged as it's impossible to provide an exact and numerically stable loss calculation for all models when using a softmax output.**
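
For a single well-behaved example, the two approaches give the same loss; the note above is about numerical stability in more extreme cases. The sketch below uses made-up logits for a 3-class problem (not the MNIST model) to compare a loss computed directly from logits with one computed from softmax probabilities.

```python
import tensorflow as tf

# Made-up logits for one example with three classes; the true class is index 0
logits = tf.constant([[2.0, 1.0, 0.1]])
label = tf.constant([0])

# Option 1: hand raw logits to the loss and let it apply softmax internally
loss_from_logits = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)(label, logits)

# Option 2: apply softmax first, then tell the loss it receives probabilities
probabilities = tf.nn.softmax(logits)
loss_from_probs = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)(label, probabilities)

print(loss_from_logits.numpy(), loss_from_probs.numpy())
```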

In [68]:
# The losses.SparseCategoricalCrossentropy loss takes a vector of logits and a True index and returns a scalar loss for each example.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

In [69]:
loss_fn(y_train[:1], predictions).numpy()

2.5702062

In [70]:
model.compile(
    optimizer='adam',
    loss=loss_fn,
    metrics=['accuracy']
)

In [72]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f4e16144c10>

In [73]:
# The Model.evaluate method checks the model's performance, usually on a "Validation-set" or "Test-set".
model.evaluate(x_test, y_test, verbose=2)

313/313 - 0s - loss: 0.0750 - accuracy: 0.9753


[0.07501718401908875, 0.9753000140190125]

The image classifier now reaches about 97.5% accuracy on the test set.

In [74]:
# If you want your model to return a probability, you can wrap the trained model, and attach the softmax to it:
probability_model = tf.keras.Sequential([
    model,
    tf.keras.layers.Softmax()
])

In [76]:
probability_model(x_test[:5])

<tf.Tensor: shape=(5, 18), dtype=float32, numpy=
array([[1.91499812e-08, 2.06802325e-10, 1.05917785e-07, 1.69238883e-05,
        2.16143128e-11, 4.65180570e-08, 4.74791430e-13, 9.99982357e-01,
        1.65933596e-08, 6.16974319e-07, 4.23938398e-14, 6.51485809e-16,
        1.22118180e-15, 1.58834062e-15, 1.38049027e-15, 3.55220546e-14,
        2.88999442e-14, 9.57161545e-15],
       [1.05828057e-08, 7.38314338e-05, 9.99909759e-01, 5.80282745e-07,
        3.52090288e-16, 1.58822586e-05, 7.18132442e-09, 4.35263465e-14,
        5.93383476e-09, 1.38985747e-13, 1.07795160e-16, 1.00090864e-17,
        4.41524129e-16, 2.61814606e-19, 1.87340626e-17, 5.39844662e-18,
        3.67509464e-17, 1.99869368e-17],
       [4.08835774e-08, 9.97047603e-01, 1.32533512e-03, 6.51346272e-06,
        3.83499864e-04, 2.53563157e-05, 1.60766976e-05, 1.08737533e-03,
        1.05113235e-04, 2.93860421e-06, 1.39729854e-08, 4.33005134e-08,
        7.04784497e-08, 1.39453205e-08, 1.80528517e-08, 9.12380855e-08,
     