## What is TensorFlow?

[TensorFlow](https://www.tensorflow.org/) is an open-source end-to-end machine learning library for preprocessing data, modelling data and serving models (getting them into the hands of others).

## Why use TensorFlow?

Rather than building machine learning and deep learning models from scratch, it's more likely you'll use a library such as TensorFlow. This is because it contains many of the most common machine learning functions you'll want to use.

## What we're going to cover

TensorFlow is vast. But the main premise is simple: turn data into numbers (tensors) and build machine learning algorithms to find patterns in them.

In this notebook we cover some of the most fundamental TensorFlow operations, more specifically:
* Introduction to tensors (creating tensors)
* Getting information from tensors (tensor attributes)
* Manipulating tensors (tensor operations)
* Tensors and NumPy
* Using @tf.function (a way to speed up your regular Python functions)
* Using GPUs with TensorFlow

Things to note:
* Many of the conventions here will happen automatically behind the scenes (when you build a model) but it's worth knowing so if you see any of these things, you know what's happening.
* For any TensorFlow function you see, it's important to be able to check it out in the documentation, for example, going to the Python API docs for all functions and searching for what you need: https://www.tensorflow.org/api_docs/python/ 


In [3]:
# Install TensorFlow:
# https://www.tensorflow.org/install
!pip install tensorflow --user



In [1]:
# Import TensorFlow and check its version:
import tensorflow as tf
tf.__version__

'2.16.1'

### Creating Tensors with `tf.constant()`

You usually won't create tensors yourself. This is because TensorFlow has built-in modules (such as [`tf.io`](https://www.tensorflow.org/api_docs/python/tf/io) and [`tf.data`](https://www.tensorflow.org/guide/data)) which can read your data sources and automatically convert them to tensors; later on, neural network models will process these tensors for us.

But for now, because we're getting familiar with tensors themselves and how to manipulate them, we'll see how we can create them ourselves.

We'll begin by using [`tf.constant()`](https://www.tensorflow.org/api_docs/python/tf/constant).

In [2]:
# Creating a Scalar (Rank 0 Tensor):

scalar = tf.constant(10)
scalar

<tf.Tensor: shape=(), dtype=int32, numpy=10>

In [3]:
# Check the number of dimensions of a tensor (ndim stands for number of dimensions):
scalar.ndim

0

A scalar is known as a rank 0 tensor because it has no dimensions (it's just a number).

> 🔑 **Note:** For now, we don't need to know too much about the different ranks of tensors (but we will see more on this later). The important point is knowing tensors can have an unlimited range of dimensions (the exact amount will depend on what data you're representing).

In [4]:
# Creating a Vector (1 Dimension):

vector = tf.constant([10,10])
vector

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([10, 10])>

In [5]:
# Checking Number of Dimensions:

vector.ndim

1

In [6]:
# Creating a Matrix (2 Dimensions):

matrix1 = tf.constant([[1,2],
                      [3,4]])
matrix1

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4]])>

In [7]:
matrix1.ndim

2

By default, TensorFlow creates tensors with either an `int32` or `float32` datatype.

This is known as [32-bit precision](https://en.wikipedia.org/wiki/Precision_(computer_science)) (the higher the precision, the more precise the number, the more space it takes up on your computer).

In [8]:
# Another Matrix, defining datatype explicitly:

matrix2 = tf.constant([[1.2,2.1],
                      [2.3,3.2],
                      [3.4,4.3]], dtype= tf.float16)

In [9]:
matrix2

<tf.Tensor: shape=(3, 2), dtype=float16, numpy=
array([[1.2, 2.1],
       [2.3, 3.2],
       [3.4, 4.3]], dtype=float16)>

In [10]:
matrix2.ndim

2

In [11]:
# Creating a Tensor (More than 2 Dimensions. Although technically, all of the
# above items are also Tensors):

tensor = tf.constant([[[1,2,3],
                      [4,5,6],
                      [7,8,9]],
                      
                     [[10,11,12],
                     [13,14,15],
                     [16,17,18]],
                     
                     [[19,20,21],
                     [22,23,24],
                     [25,26,27]]])

In [12]:
tensor

<tf.Tensor: shape=(3, 3, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9]],

       [[10, 11, 12],
        [13, 14, 15],
        [16, 17, 18]],

       [[19, 20, 21],
        [22, 23, 24],
        [25, 26, 27]]])>

In [13]:
tensor.ndim

3

This is known as a rank 3 tensor (3 dimensions). However, a tensor can have an arbitrary (unlimited) number of dimensions.

For example, you might turn a series of images into tensors with shape (224, 224, 3, 32), where:
* 224, 224 (the first 2 dimensions) are the height and width of the images in pixels.
* 3 is the number of colour channels of the image (red, green, blue).
* 32 is the batch size (the number of images a neural network sees at any one time).
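To make the image example concrete, here is a minimal sketch that builds a placeholder tensor with that shape and inspects it. The name `image_batch` and the all-zeros values are assumptions for illustration only; real image data would come from your dataset.

```python
import tensorflow as tf

# Hypothetical batch laid out as in the text:
# (height, width, colour channels, batch size). Zeros stand in for pixel values.
image_batch = tf.zeros(shape=(224, 224, 3, 32))

print("Shape:", image_batch.shape)
print("Number of dimensions (rank):", image_batch.ndim)
print("Total number of values:", tf.size(image_batch).numpy())  # 224 * 224 * 3 * 32
```

Note that many TensorFlow APIs expect the batch dimension first, e.g. `(32, 224, 224, 3)`, so check the documentation of whatever layer or function you pass image tensors to.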

All of the above variables we've created are actually tensors. But you may also hear them referred to by their different names (the ones we gave them):
* **scalar**: a single number.
* **vector**: a number with direction (e.g. wind speed with direction).
* **matrix**: a 2-dimensional array of numbers.
* **tensor**: an n-dimensional array of numbers (where n can be any number, a 0-dimension tensor is a scalar, a 1-dimension tensor is a vector). 

To add to the confusion, the terms matrix and tensor are often used interchangeably.

Going forward since we're using TensorFlow, everything we refer to and use will be tensors.

![difference between scalar, vector, matrix, tensor](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-scalar-vector-matrix-tensor.png)

### Creating Tensors with `tf.Variable()`

You can also create tensors using [`tf.Variable()`](https://www.tensorflow.org/api_docs/python/tf/Variable), although you'll rarely need to, because tensors are often created for you automatically when working with data.

The difference between `tf.Variable()` and `tf.constant()` is that tensors created with `tf.constant()` are immutable (can't be changed, can only be used to create a new tensor), whereas tensors created with `tf.Variable()` are mutable (can be changed).

In [16]:
# Creating the same tensor with tf.Variable and tf.constant:
changeable_tensor = tf.Variable([10,7])
unchangeable_tensor = tf.constant([10,7])
print(changeable_tensor)
print(unchangeable_tensor)

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([10,  7])>
tf.Tensor([10  7], shape=(2,), dtype=int32)


In [17]:
# Trying to assign new value in Variable Tensor:
changeable_tensor[0]

<tf.Tensor: shape=(), dtype=int32, numpy=10>

In [18]:
changeable_tensor[0] = 7

TypeError: 'ResourceVariable' object does not support item assignment

In [19]:
changeable_tensor[0].assign(7)

<tf.Variable 'UnreadVariable' shape=(2,) dtype=int32, numpy=array([7, 7])>

In [20]:
changeable_tensor

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([7, 7])>

In [21]:
# Trying to assign a new value to a constant tensor:
unchangeable_tensor[0].assign(7)

AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'assign'

Which one should you use? `tf.constant()` or `tf.Variable()`?

It will depend on what your problem requires. However, most of the time, TensorFlow will automatically choose for you (when loading data or modelling data).
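As a rough illustration of when mutability matters, the sketch below treats a `tf.Variable` as a set of model weights that gets nudged in place, while the values used to update it live in a `tf.constant`. The names `weights`, `gradient` and `learning_rate`, and their values, are made up for this example; in practice an optimizer does this step for you.

```python
import tensorflow as tf

# Hypothetical weights a model might learn (mutable, so a Variable).
weights = tf.Variable([0.5, -0.3])

# Hypothetical gradient and learning rate (fixed inputs, so constants are fine).
gradient = tf.constant([0.1, 0.2])
learning_rate = 0.1

# assign_sub() subtracts in place: weights <- weights - learning_rate * gradient
weights.assign_sub(learning_rate * gradient)

print(weights.numpy())  # roughly [0.49, -0.32]
```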

### Creating random tensors

Random tensors are tensors of some arbitrary size which contain random numbers.

Why would you want to create random tensors? 

This is what neural networks use to initialize their weights (patterns) that they're trying to learn in the data.

For example, the process of a neural network learning often involves taking a random n-dimensional array of numbers and refining them until they represent some kind of pattern (a compressed way to represent the original data).

**How a network learns**
![how a network learns](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-how-a-network-learns.png)
*A network learns by starting with random patterns (1) then going through demonstrative examples of data (2) whilst trying to update its random patterns to represent the examples (3).*

We can create random tensors by using the [`tf.random.Generator`](https://www.tensorflow.org/guide/random_numbers#the_tfrandomgenerator_class) class.

In [23]:
# Creating two random but same Tensors by using Seed:

rndm_tensor1 = tf.random.Generator.from_seed(42)
rndm_tensor1 = rndm_tensor1.uniform(shape= (3,3))

rndm_tensor2 = tf.random.Generator.from_seed(42)
rndm_tensor2 = rndm_tensor2.uniform((3,3))

In [24]:
print(rndm_tensor1)
print(rndm_tensor2)

tf.Tensor(
[[0.7493447  0.73561966 0.45230794]
 [0.49039817 0.1889317  0.52027524]
 [0.8736881  0.46921718 0.63932586]], shape=(3, 3), dtype=float32)
tf.Tensor(
[[0.7493447  0.73561966 0.45230794]
 [0.49039817 0.1889317  0.52027524]
 [0.8736881  0.46921718 0.63932586]], shape=(3, 3), dtype=float32)


The random tensors we've made are actually [pseudorandom numbers](https://www.computerhope.com/jargon/p/pseudo-random.htm) (they appear as random, but really aren't).

If we set a seed we'll get the same random numbers (if you've ever used NumPy, this is similar to `np.random.seed(42)`). 

Setting the seed says, "hey, create some random numbers, but flavour them with X" (X is the seed).

What do you think will happen when we change the seed?

In [25]:
rndm_tensor3 = tf.random.Generator.from_seed(42)
rndm_tensor3 = rndm_tensor3.normal(shape= (3,3))

rndm_tensor4 = tf.random.Generator.from_seed(7)
rndm_tensor4 = rndm_tensor4.normal(shape= (3,3))

In [26]:
print(rndm_tensor3)
print(rndm_tensor4)

tf.Tensor(
[[-0.7565803  -0.06854702  0.07595026]
 [-1.2573844  -0.23193763 -1.8107855 ]
 [ 0.09988727 -0.50998646 -0.7535805 ]], shape=(3, 3), dtype=float32)
tf.Tensor(
[[-1.3240396   0.28785667 -0.8757901 ]
 [-0.08857018  0.69211644  0.84215707]
 [-0.06378496  0.92800784 -0.6039789 ]], shape=(3, 3), dtype=float32)


### Shuffling a Tensor

Let's say you're working with 15,000 images of cats and dogs, and the first 10,000 images are of cats and the next 5,000 are of dogs. This order could affect how a neural network learns (it may overfit by learning the order of the data), so it might be a good idea to shuffle your data.
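One thing to watch when shuffling is that images and their labels must stay paired. A minimal sketch of one way to do that is to shuffle a tensor of indices once and use it to reorder both tensors with `tf.gather()`. The tiny `images` and `labels` tensors here are stand-ins invented for the example.

```python
import tensorflow as tf

# Toy stand-ins: 6 "images" (just ID numbers here) with cats first, then dogs.
images = tf.constant([10, 11, 12, 13, 20, 21])
labels = tf.constant(["cat", "cat", "cat", "cat", "dog", "dog"])

# Shuffle the indices once, then apply the same order to both tensors.
indices = tf.random.shuffle(tf.range(6))
shuffled_images = tf.gather(images, indices)
shuffled_labels = tf.gather(labels, indices)

print(shuffled_images.numpy())
print(shuffled_labels.numpy())
```

When you load data with `tf.data`, shuffling is usually handled for you, so treat this as an illustration of the idea rather than a recommended pipeline.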

In [27]:
# Shuffle a Tensor:

tensor1 = tf.constant([[1,2,3],
                      [4,5,6],
                      [7,8,9]])
print(tensor1)

tf.Tensor(
[[1 2 3]
 [4 5 6]
 [7 8 9]], shape=(3, 3), dtype=int32)


In [32]:
tf.random.shuffle(tensor1)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [7, 8, 9],
       [4, 5, 6]])>

In [33]:
# Setting a seed to try to get the same random order every time:
tf.random.shuffle(tensor1, seed= 42)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[7, 8, 9],
       [4, 5, 6],
       [1, 2, 3]])>

In [35]:
tf.random.shuffle(tensor1, seed= 42)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])>

In [37]:
tf.random.shuffle(tensor1, seed= 42)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[4, 5, 6],
       [1, 2, 3],
       [7, 8, 9]])>

Wait... why didn't the numbers come out the same?

It's due to rule #4 of the [`tf.random.set_seed()`](https://www.tensorflow.org/api_docs/python/tf/random/set_seed) documentation.

> "4. If both the global and the operation seed are set: Both seeds are used in conjunction to determine the random sequence."

`tf.random.set_seed(42)` sets the global seed, and the `seed` parameter in `tf.random.shuffle(seed=42)` sets the operation seed.

As the documentation puts it, "Operations that rely on a random seed actually derive it from two seeds: the global and operation-level seeds. This sets the global seed." So far we've only set the operation seed, so to get the same order every time we need to set both.

In [48]:
# Setting the Global Seed:
tf.random.set_seed(42)
tf.random.shuffle(tensor1, seed= 42)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])>

In [51]:
tf.random.set_seed(42)
tf.random.shuffle(tensor1)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[4, 5, 6],
       [7, 8, 9],
       [1, 2, 3]])>
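To check the reproducibility claim directly, here is a small sketch that resets the global seed before each shuffle and compares the two results. The variable names `first` and `second` are just for this example.

```python
import tensorflow as tf

x = tf.constant([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# Set the global seed AND the operation seed before the first shuffle...
tf.random.set_seed(42)
first = tf.random.shuffle(x, seed=42)

# ...then reset the global seed and repeat exactly the same call.
tf.random.set_seed(42)
second = tf.random.shuffle(x, seed=42)

# Both shuffles should now produce the same row order.
print(bool(tf.reduce_all(first == second)))
```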

### Other ways to make tensors

Though you might rarely use these (remember, many tensor operations are done behind the scenes for you), you can use [`tf.ones()`](https://www.tensorflow.org/api_docs/python/tf/ones) to create a tensor of all ones and [`tf.zeros()`](https://www.tensorflow.org/api_docs/python/tf/zeros) to create a tensor of all zeros.

In [52]:
ones = tf.ones(shape= [8,8], dtype= tf.int32)
ones

<tf.Tensor: shape=(8, 8), dtype=int32, numpy=
array([[1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1]])>

In [53]:
zeros = tf.zeros(shape= [8,8], dtype= tf.float32)
zeros

<tf.Tensor: shape=(8, 8), dtype=float32, numpy=
array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)>

We can also turn NumPy arrays into tensors.

Remember, the main difference between tensors and NumPy arrays is that tensors can be run on GPUs.

> 🔑 **Note:** A matrix or tensor is typically represented by a capital letter (e.g. `X` or `A`) whereas a vector is typically represented by a lowercase letter (e.g. `y` or `b`).

In [57]:
import numpy as np

array = np.arange(0,20)
array

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

In [58]:
Mat1 = tf.constant(array)
Mat1

<tf.Tensor: shape=(20,), dtype=int32, numpy=
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])>

In [59]:
Mat2 = tf.constant(array, shape= (2,5,2))
Mat2

<tf.Tensor: shape=(2, 5, 2), dtype=int32, numpy=
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7],
        [ 8,  9]],

       [[10, 11],
        [12, 13],
        [14, 15],
        [16, 17],
        [18, 19]]])>
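The conversion also works in the other direction. As a short sketch (the names `tensor_from_array` and `back_to_numpy` are made up here), you can turn a tensor back into a NumPy array with its `.numpy()` method. Note that the new shape has to hold the same number of elements as the original array (2 * 5 * 2 = 20).

```python
import numpy as np
import tensorflow as tf

array = np.arange(0, 20)

# NumPy array -> tensor, reshaped on the way in (2 * 5 * 2 = 20 elements).
tensor_from_array = tf.constant(array, shape=(2, 5, 2))

# Tensor -> NumPy array.
back_to_numpy = tensor_from_array.numpy()

print(type(back_to_numpy), back_to_numpy.shape)
```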

## Getting information from tensors (shape, rank, size)

There will be times when you'll want to get different pieces of information from your tensors. In particular, you should know the following tensor vocabulary:
* **Shape:** The length (number of elements) of each of the dimensions of a tensor.
* **Rank:** The number of tensor dimensions. A scalar has rank 0, a vector has rank 1, a matrix has rank 2, a tensor has rank n.
* **Axis** or **Dimension:** A particular dimension of a tensor.
* **Size:** The total number of items in the tensor.

You'll use these especially when you're trying to line up the shapes of your data to the shapes of your model. For example, making sure the shape of your image tensors is the same as the shape of your model's input layer.

We've already seen one of these before using the `ndim` attribute. Let's see the rest.

In [60]:
# Creating a rank 4 array (note: np.ones returns a NumPy array, not a tf.Tensor):
rank_4_tensor = np.ones(shape= [2,3,4,5])
rank_4_tensor

array([[[[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]],

        [[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]],

        [[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]]],


       [[[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]],

        [[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]],

        [[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]]]])

In [61]:
# Get various attributes of tensor
print("Datatype of Tensor:", rank_4_tensor.dtype)
print("Number of dimensions (rank):", rank_4_tensor.ndim)
print("Shape of tensor:", rank_4_tensor.shape)
print("Elements along axis 0 of tensor:", rank_4_tensor.shape[0])
print("Elements along last axis of tensor:", rank_4_tensor.shape[-1])
print("Total number of elements (2*3*4*5):", tf.size(rank_4_tensor))
print("Total number of elements (2*3*4*5):", tf.size(rank_4_tensor).numpy()) # .numpy() converts to NumPy array

Datatype of Tensor: float64
Number of dimensions (rank): 4
Shape of tensor: (2, 3, 4, 5)
Elements along axis 0 of tensor: 2
Elements along last axis of tensor: 5
Total number of elements (2*3*4*5): tf.Tensor(120, shape=(), dtype=int32)
Total number of elements (2*3*4*5): 120
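To connect these attributes to the idea of lining up data with a model, the sketch below reads the same information from a `tf.Tensor` and compares the per-image shape against an expected input shape. Both `expected_input_shape` and the batch of zeros are hypothetical values chosen for the example, not something taken from a real model.

```python
import tensorflow as tf

# Hypothetical input shape a model might expect for one 28x28 grayscale image.
expected_input_shape = (28, 28, 1)

# A made-up batch of 8 images, with the batch dimension first.
image_batch = tf.zeros(shape=(8, 28, 28, 1))

print("Rank:", image_batch.ndim)
print("Batch size (axis 0):", image_batch.shape[0])
print("Total elements:", tf.size(image_batch).numpy())

# Everything after the batch axis should match what the model expects.
per_image_shape = tuple(image_batch.shape[1:].as_list())
print("Shapes line up:", per_image_shape == expected_input_shape)
```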


### Tensor indexing

Tensors can be indexed just like Python lists and NumPy arrays.

In [62]:
rank_4_tensor

array([[[[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]],

        [[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]],

        [[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]]],


       [[[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]],

        [[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]],

        [[1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.],
         [1., 1., 1., 1., 1.]]]])

In [63]:
# Getting the first 2 elements of each axis:
rank_4_tensor[:2,:2,:2,:2]

array([[[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]],


       [[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]]])

In [65]:
# Get a single element from each dimension except one:
print(rank_4_tensor[:1,:1,:1,:])
print("\n")
print(rank_4_tensor[:1,:1,:,:1])
print("\n")

print(rank_4_tensor[:1,:,:1,:1])
print("\n")

print(rank_4_tensor[:,:1,:1,:1])

[[[[1. 1. 1. 1. 1.]]]]


[[[[1.]
   [1.]
   [1.]
   [1.]]]]


[[[[1.]]

  [[1.]]

  [[1.]]]]


[[[[1.]]]


 [[[1.]]]]


In [68]:
# Expanding Axis or Adding a New Axis:

rank_2_tensor = tf.constant([[1,2,3],
                           [4,5,6]])
rank_2_tensor

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 2, 3],
       [4, 5, 6]])>

In [69]:
# Adding an Axis at last Place:
rank_2_tensor[:, :, tf.newaxis]

<tf.Tensor: shape=(2, 3, 1), dtype=int32, numpy=
array([[[1],
        [2],
        [3]],

       [[4],
        [5],
        [6]]])>

In [70]:
rank_2_tensor[:, tf.newaxis, :]

<tf.Tensor: shape=(2, 1, 3), dtype=int32, numpy=
array([[[1, 2, 3]],

       [[4, 5, 6]]])>

In [71]:
rank_2_tensor[tf.newaxis, : ,:]

<tf.Tensor: shape=(1, 2, 3), dtype=int32, numpy=
array([[[1, 2, 3],
        [4, 5, 6]]])>

In [72]:
# Adding an Axis using tf.expand_dims():
tf.expand_dims(rank_2_tensor, axis= -1)

<tf.Tensor: shape=(2, 3, 1), dtype=int32, numpy=
array([[[1],
        [2],
        [3]],

       [[4],
        [5],
        [6]]])>

In [73]:
tf.expand_dims(rank_2_tensor, axis= 0)

<tf.Tensor: shape=(1, 2, 3), dtype=int32, numpy=
array([[[1, 2, 3],
        [4, 5, 6]]])>

In [74]:
tf.expand_dims(rank_2_tensor, axis= 1)

<tf.Tensor: shape=(2, 1, 3), dtype=int32, numpy=
array([[[1, 2, 3]],

       [[4, 5, 6]]])>

In [75]:
tf.expand_dims(rank_2_tensor, axis= 2)

<tf.Tensor: shape=(2, 3, 1), dtype=int32, numpy=
array([[[1],
        [2],
        [3]],

       [[4],
        [5],
        [6]]])>
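Adding an axis can also be undone. As a small sketch, `tf.squeeze()` removes axes of size 1, which reverses `tf.expand_dims()`. The example also uses `...` (Ellipsis) as a shorthand for "all the existing axes". The names `expanded`, `with_ellipsis` and `squeezed` are just for illustration.

```python
import tensorflow as tf

rank_2_tensor = tf.constant([[1, 2, 3],
                             [4, 5, 6]])

# Two equivalent ways to add a new last axis: shape (2, 3) -> (2, 3, 1).
expanded = tf.expand_dims(rank_2_tensor, axis=-1)
with_ellipsis = rank_2_tensor[..., tf.newaxis]

# tf.squeeze() removes the size-1 axis again: shape (2, 3, 1) -> (2, 3).
squeezed = tf.squeeze(expanded, axis=-1)

print(expanded.shape, with_ellipsis.shape, squeezed.shape)
```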

### Matrix multiplication

One of the most common operations in machine learning algorithms is [matrix multiplication](https://www.mathsisfun.com/algebra/matrix-multiplying.html).

TensorFlow implements this matrix multiplication functionality in the [`tf.matmul()`](https://www.tensorflow.org/api_docs/python/tf/linalg/matmul) function.

The main two rules for matrix multiplication to remember are:
1. The inner dimensions must match:
   * `(3, 5) @ (3, 5)` won't work
   * `(5, 3) @ (3, 5)` will work
   * `(3, 5) @ (5, 3)` will work
2. The resulting matrix has the shape of the outer dimensions:
   * `(5, 3) @ (3, 5)` -> `(5, 5)`
   * `(3, 5) @ (5, 3)` -> `(3, 3)`

> 🔑 **Note:** '`@`' in Python is the symbol for matrix multiplication.
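The two rules can be written down as a few lines of plain Python. This is only a sketch for 2-dimensional shapes, and `matmul_result_shape` is a helper invented for this example (it is not part of TensorFlow).

```python
def matmul_result_shape(a_shape, b_shape):
    """Return the shape of a @ b for two 2-D shapes, or raise if they don't line up."""
    rows_a, cols_a = a_shape
    rows_b, cols_b = b_shape
    # Rule 1: the inner dimensions must match.
    if cols_a != rows_b:
        raise ValueError(f"Inner dimensions don't match: {a_shape} @ {b_shape}")
    # Rule 2: the result takes the shape of the outer dimensions.
    return (rows_a, cols_b)


print(matmul_result_shape((5, 3), (3, 5)))  # (5, 5)
print(matmul_result_shape((3, 5), (5, 3)))  # (3, 3)

try:
    matmul_result_shape((3, 5), (3, 5))
except ValueError as error:
    print(error)
```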

In [77]:
# Matrix multiplication using TensorFlow:
tensor = tf.constant([[1,2],
                     [3,4]])
tf.matmul(tensor, tensor)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 7, 10],
       [15, 22]])>

In [79]:
# Matrix Multiplication Using Python Operator "@":

tensor @ tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 7, 10],
       [15, 22]])>

In [82]:
# Multiplication Between Mismatched Shapes Tensors:

X = tf.constant([[1,2],
                [3,4],
                [5,6]])

Y = tf.constant([[7,8],
                [9,10],
                [11,12]])

X, Y

(<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4],
        [5, 6]])>,
 <tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[ 7,  8],
        [ 9, 10],
        [11, 12]])>)

In [83]:
tf.matmul(X, Y)

InvalidArgumentError: {{function_node __wrapped__MatMul_device_/job:localhost/replica:0/task:0/device:CPU:0}} Matrix size-incompatible: In[0]: [3,2], In[1]: [3,2] [Op:MatMul] name: 

Trying to matrix multiply two tensors with the shape `(3, 2)` errors because the inner dimensions don't match.

We need to either:
* Reshape X to `(2, 3)` so it's `(2, 3) @ (3, 2)`.
* Reshape Y to `(2, 3)` so it's `(3, 2) @ (2, 3)`.

We can do this with either:
* [`tf.reshape()`](https://www.tensorflow.org/api_docs/python/tf/reshape) - allows us to reshape a tensor into a defined shape.
* [`tf.transpose()`](https://www.tensorflow.org/api_docs/python/tf/transpose) - switches the dimensions of a given tensor.

![lining up dimensions for dot products](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-lining-up-dot-products.png)

Let's try `tf.reshape()` first.

In [84]:
tf.reshape(Y, shape= (2,3))

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[ 7,  8,  9],
       [10, 11, 12]])>

In [85]:
tf.matmul(X, tf.reshape(Y, shape= (2,3)))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]])>

In [87]:
# Trying Same by Transposing X:
X, tf.transpose(X)

(<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4],
        [5, 6]])>,
 <tf.Tensor: shape=(2, 3), dtype=int32, numpy=
 array([[1, 3, 5],
        [2, 4, 6]])>)

In [89]:
tf.matmul(tf.transpose(X), Y)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]])>

In [90]:
# We can also transpose one or both matrices using arguments in tf.matmul():

tf.matmul(X, Y, transpose_a= True, transpose_b= False)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]])>

Notice the difference in the resulting shapes when transposing `X` or reshaping `Y`.

This is because of the 2nd rule mentioned above:
 * `(2, 3) @ (3, 2)` -> `(2, 2)` done with `tf.matmul(tf.transpose(X), Y)`
 * `(3, 2) @ (2, 3)` -> `(3, 3)` done with `X @ tf.reshape(Y, shape=(2, 3))`

This kind of data manipulation is a reminder: you'll spend a lot of your time in machine learning and working with neural networks reshaping data (in the form of tensors) to prepare it to be used with various operations (such as feeding it to a model).

### The dot product

Multiplying matrices by each other is also referred to as the dot product.

You can perform the `tf.matmul()` operation using [`tf.tensordot()`](https://www.tensorflow.org/api_docs/python/tf/tensordot). 

In [91]:
tf.tensordot(tf.transpose(X), Y, axes= 1)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]])>
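For two vectors, the dot product is simply the sum of the element-wise products. A minimal sketch, with `a` and `b` as made-up example vectors, comparing `tf.tensordot()` with the calculation done by hand:

```python
import tensorflow as tf

a = tf.constant([1, 2, 3])
b = tf.constant([4, 5, 6])

# Dot product via tensordot: contracts the single axis of each vector.
dot = tf.tensordot(a, b, axes=1)

# The same thing by hand: 1*4 + 2*5 + 3*6
manual = tf.reduce_sum(a * b)

print(dot.numpy(), manual.numpy())  # 32 32
```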

You might notice that although using both `reshape` and `transpose` work, you get different results when using each.

Let's see an example, first with `tf.transpose()` then with `tf.reshape()`.

In [92]:
# Perform Matrix Multiplication Using tf.transpose on Y:
tf.matmul(X, tf.transpose(Y))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 23,  29,  35],
       [ 53,  67,  81],
       [ 83, 105, 127]])>

In [93]:
# Perform Matrix Multiplication Using tf.reshape on Y:
tf.matmul(X, tf.reshape(Y, shape=(2,3)))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]])>

Hmm... they result in different values.

This is strange because, when dealing with `Y` (a `(3, 2)` matrix), reshaping it to `(2, 3)` and transposing it result in the same shape.

In [94]:
# Checking Shapes of Y, Y Transposed and Y Reshaped:
Y.shape, tf.transpose(Y).shape, tf.reshape(Y, shape=(2,3)).shape

(TensorShape([3, 2]), TensorShape([2, 3]), TensorShape([2, 3]))

But calling `tf.reshape()` and `tf.transpose()` on `Y` doesn't necessarily result in the same values.

In [95]:
# Checking Value of Y, Y Transposed and Y Reshaped:
print("Y:")
print(Y, "\n")

print("Y Transposed:")
print(tf.transpose(Y), "\n")

print("Y Reshaped:")
print(tf.reshape(Y, shape=(2,3)))

Y:
tf.Tensor(
[[ 7  8]
 [ 9 10]
 [11 12]], shape=(3, 2), dtype=int32) 

Y Transposed:
tf.Tensor(
[[ 7  9 11]
 [ 8 10 12]], shape=(2, 3), dtype=int32) 

Y Reshaped:
tf.Tensor(
[[ 7  8  9]
 [10 11 12]], shape=(2, 3), dtype=int32)


As you can see, the outputs of `tf.reshape()` and `tf.transpose()` when called on `Y`, even though they have the same shape, are different.

This can be explained by the default behaviour of each method:
* [`tf.reshape()`](https://www.tensorflow.org/api_docs/python/tf/reshape) - changes the shape of the given tensor first and then fills in the values in the order they appear (in our case, 7, 8, 9, 10, 11, 12).
* [`tf.transpose()`](https://www.tensorflow.org/api_docs/python/tf/transpose) - swaps the order of the axes. By default the order of the axes is reversed (so the last axis becomes the first), but the order can be changed using the [`perm` parameter](https://www.tensorflow.org/api_docs/python/tf/transpose).
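The difference is easy to see with plain Python lists. The sketch below mimics (it does not reproduce TensorFlow's actual implementation) what each operation does to `Y`: transposing turns columns into rows, while reshaping reads the values in order and cuts them into new rows.

```python
Y = [[7, 8],
     [9, 10],
     [11, 12]]

# Transpose: the element at (row i, column j) moves to (row j, column i),
# so each column of Y becomes a row.
transposed = [list(column) for column in zip(*Y)]

# Reshape to (2, 3): read the values in order, then cut them into rows of 3.
flat = [value for row in Y for value in row]
reshaped = [flat[i:i + 3] for i in range(0, len(flat), 3)]

print(transposed)  # [[7, 9, 11], [8, 10, 12]]
print(reshaped)    # [[7, 8, 9], [10, 11, 12]]
```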

So which should you use?

Again, most of the time these operations will be implemented for you when they need to be run (such as during the training of a neural network).

But generally, whenever performing a matrix multiplication and the shapes of two matrices don't line up, you will transpose (not reshape) one of them in order to line them up.

### Matrix multiplication tidbits
* If we transposed `Y`, it would be represented as $\mathbf{Y}^\mathsf{T}$ (note the capital T for transpose).
* Get an illustrative view of matrix multiplication [by Math is Fun](https://www.mathsisfun.com/algebra/matrix-multiplying.html).
* Try a hands-on demo of matrix multiplication: http://matrixmultiplication.xyz/ (shown below).

![visual demo of matrix multiplication](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-matrix-multiply-crop.gif)

### Changing the datatype of a Tensor

Sometimes you'll want to alter the default datatype of your tensor. 

This is common when you want to compute using less precision (e.g. 16-bit floating point numbers vs. 32-bit floating point numbers). 

Computing with less precision is useful on devices with less computing capacity such as mobile devices (because the fewer bits, the less space the computations require).

You can change the datatype of a tensor using [`tf.cast()`](https://www.tensorflow.org/api_docs/python/tf/cast).

In [105]:
# Creating Tensors with Default DataTypes (int32 and float32):

ten1 = tf.constant([1,2])
ten2 = tf.constant([1.1,2.2])

ten1, ten2

(<tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 2])>,
 <tf.Tensor: shape=(2,), dtype=float32, numpy=array([1.1, 2.2], dtype=float32)>)

In [106]:
# Changing from int32 to int16 (Reduced Precision):
tf.cast(ten1, dtype= tf.int16)

<tf.Tensor: shape=(2,), dtype=int16, numpy=array([1, 2], dtype=int16)>

In [107]:
# Changing from float32 to float16 (Reduced Precision):
tf.cast(ten2, dtype= tf.float16)

<tf.Tensor: shape=(2,), dtype=float16, numpy=array([1.1, 2.2], dtype=float16)>

In [108]:
# Changing from int32 to float32:
tf.cast(ten1, dtype= tf.float32)

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1., 2.], dtype=float32)>

In [109]:
# Changing from float32 to int32:
tf.cast(ten2, dtype= tf.int32)

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 2])>
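Two details are worth seeing for yourself: casting floats to integers drops the fractional part rather than rounding to the nearest whole number, and lower-precision datatypes use fewer bytes per element. A short sketch (the example values are arbitrary):

```python
import tensorflow as tf

floats = tf.constant([1.9, -1.9, 2.2])

# Casting float -> int discards the fractional part (truncates toward zero).
ints = tf.cast(floats, dtype=tf.int32)
print(ints.numpy())  # [ 1 -1  2]

# Each dtype reports how many bytes one element takes up.
print("float32 bytes per element:", tf.float32.size)  # 4
print("float16 bytes per element:", tf.float16.size)  # 2
```

If you want rounding rather than truncation, round the values first (for example with `tf.round()`) and then cast.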

### Finding the min, max, mean, sum (aggregation)

You can quickly aggregate (perform a calculation on a whole tensor) tensors to find things like the minimum value, maximum value, mean and sum of all the elements.

To do so, aggregation methods typically have the syntax `reduce_[action]()`, such as:
* [`tf.reduce_min()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_min) - find the minimum value in a tensor.
* [`tf.reduce_max()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_max) - find the maximum value in a tensor (helpful for when you want to find the highest prediction probability).
* [`tf.reduce_mean()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_mean) - find the mean of all elements in a tensor.
* [`tf.reduce_sum()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_sum) - find the sum of all elements in a tensor.
* **Note:** typically, each of these is under the `math` module, e.g. `tf.math.reduce_min()` but you can use the alias `tf.reduce_min()`.

In [111]:
tf.random.set_seed(42)
agg_ten = tf.random.uniform(shape= [2,5,10], minval=0, maxval=100, seed= 42)
agg_ten

<tf.Tensor: shape=(2, 5, 10), dtype=float32, numpy=
array([[[4.1630280e+01, 2.6858162e+01, 4.7968315e+01, 3.6457134e+01,
         9.5471146e+01, 9.4186462e+01, 6.1483395e+01, 3.5842144e+01,
         5.9360241e+01, 2.1551096e+01],
        [7.7451706e+00, 5.7921314e+01, 2.9180395e+01, 2.6718033e+01,
         3.7012459e+01, 7.1610329e+01, 4.5877766e+01, 1.1764563e+01,
         2.1073711e+01, 5.4419731e+01],
        [9.8980690e+01, 3.8395859e+01, 4.6835661e+00, 8.7184616e+01,
         2.5881708e+01, 8.7313499e+01, 6.4698433e+01, 4.1981232e+01,
         2.4148273e+01, 9.5500584e+00],
        [9.8208191e+01, 1.5702081e+01, 2.9976822e+01, 3.6795307e+01,
         9.4537163e+01, 1.1056781e+01, 5.2287628e+01, 8.3054413e+01,
         2.0720959e-01, 9.5940338e+01],
        [8.5630020e+01, 3.9444969e+01, 2.2028875e+01, 6.7066071e+01,
         1.8757463e+00, 4.8057056e+01, 5.9534538e+01, 6.8473289e+01,
         1.8988943e+01, 1.2489867e+01]],

       [[3.3931625e+01, 5.4095234e+01, 5.8585655e+01, 9.

In [112]:
# Minimum:
tf.reduce_min(agg_ten)

<tf.Tensor: shape=(), dtype=float32, numpy=0.08791685>

In [113]:
# Maximum:
tf.reduce_max(agg_ten)

<tf.Tensor: shape=(), dtype=float32, numpy=98.98069>

In [114]:
# Mean:
tf.reduce_mean(agg_ten)

<tf.Tensor: shape=(), dtype=float32, numpy=46.776207>

In [115]:
# Sum:
tf.reduce_sum(agg_ten)

<tf.Tensor: shape=(), dtype=float32, numpy=4677.6206>

In [117]:
# Standard Deviation:
tf.math.reduce_std(agg_ten)

<tf.Tensor: shape=(), dtype=float32, numpy=28.15716>

In [118]:
# Variance:
tf.math.reduce_variance(agg_ten)

<tf.Tensor: shape=(), dtype=float32, numpy=792.8256>
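The calls above reduce the whole tensor to a single number. The `reduce_*` functions also accept an `axis` argument to aggregate along one dimension only. A small sketch with a made-up `(2, 3)` tensor:

```python
import tensorflow as tf

x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])

print(tf.reduce_sum(x).numpy())           # 21.0 (all elements)
print(tf.reduce_sum(x, axis=0).numpy())   # [5. 7. 9.] (down the columns)
print(tf.reduce_sum(x, axis=1).numpy())   # [ 6. 15.] (across each row)
print(tf.reduce_max(x, axis=1).numpy())   # [3. 6.] (largest value in each row)
print(tf.reduce_mean(x, axis=0).numpy())  # [2.5 3.5 4.5] (mean of each column)
```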

### Finding the position of maximum and minimum

How about finding the position in a tensor where the maximum value occurs?

This is helpful when you want to line up your labels (say `['Green', 'Blue', 'Red']`) with your prediction probabilities tensor (e.g. `[0.98, 0.01, 0.01]`).

In this case, the predicted label (the one with the highest prediction probability) would be `'Green'`.

You can find the position of the maximum (and, if required, the minimum) with the following:
* [`tf.argmax()`](https://www.tensorflow.org/api_docs/python/tf/math/argmax) - find the position of the maximum element in a given tensor.
* [`tf.argmin()`](https://www.tensorflow.org/api_docs/python/tf/math/argmin) - find the position of the minimum element in a given tensor.
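The label lookup described above can be sketched as follows. The names `labels`, `pred_probs`, `pred_index` and `pred_label` are illustrative only; the values are the ones used in the text:

```python
import tensorflow as tf

# Labels and prediction probabilities from the example in the text
labels = ["Green", "Blue", "Red"]
pred_probs = tf.constant([0.98, 0.01, 0.01])

# tf.argmax returns a scalar int64 tensor; convert it to a Python int
# so it can index into a regular Python list
pred_index = int(tf.argmax(pred_probs).numpy())
pred_label = labels[pred_index]

print(pred_index, pred_label)
```

Because the highest probability sits at index 0, the predicted label is `'Green'`.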

In [121]:
arg_ten = tf.random.normal(shape= [50])
arg_ten

<tf.Tensor: shape=(50,), dtype=float32, numpy=
array([ 8.4224582e-02, -8.6090374e-01,  3.7812304e-01, -5.1962738e-03,
       -4.9453196e-01,  6.1781919e-01, -3.3082047e-01, -1.3840806e-03,
       -4.2373410e-01, -1.3872087e+00, -1.5488191e+00, -5.3198391e-01,
       -4.4756433e-01, -2.0115814e+00, -5.7926011e-01,  5.7938927e-01,
        1.3041967e+00,  6.7720258e-01, -7.4587613e-01,  1.0378964e+00,
        1.3820479e+00,  1.4319171e+00, -3.7643117e-01,  9.8158473e-01,
       -2.3597862e-01, -3.3763260e-01, -8.9593250e-01,  4.2754072e-01,
       -3.8105518e-01,  4.7006992e-01,  3.5413779e-02, -2.9272759e+00,
       -9.6707004e-01, -4.1402709e-01, -4.0137586e-01,  6.2328768e-01,
       -9.3648863e-01,  9.5449388e-01,  4.9025390e-01, -9.9804842e-01,
       -1.1686406e+00, -6.7897290e-01,  1.7331039e+00,  7.8643018e-01,
        9.2237018e-02,  2.2711790e-01, -9.1896117e-02,  1.1224977e+00,
       -9.1732341e-01,  8.0541009e-01], dtype=float32)>

In [122]:
# Index of Maximum Value inside a Tensor:
tf.argmax(arg_ten)

<tf.Tensor: shape=(), dtype=int64, numpy=42>

In [123]:
arg_ten[42]

<tf.Tensor: shape=(), dtype=float32, numpy=1.7331039>

In [124]:
# Index of Minimum Value inside a Tensor:
tf.argmin(arg_ten)

<tf.Tensor: shape=(), dtype=int64, numpy=31>

In [125]:
arg_ten[31]

<tf.Tensor: shape=(), dtype=float32, numpy=-2.927276>

In [126]:
random_ten = tf.random.normal([2,5])
random_ten

<tf.Tensor: shape=(2, 5), dtype=float32, numpy=
array([[-0.55909735, -0.5347214 ,  2.3730333 , -1.5725931 ,  0.8055056 ],
       [-0.83387697,  0.30611223,  2.2660494 ,  0.2856414 , -1.5536156 ]],
      dtype=float32)>

In [130]:
tf.argmax(random_ten, axis= 0)

<tf.Tensor: shape=(5,), dtype=int64, numpy=array([0, 1, 0, 1, 0], dtype=int64)>

In [131]:
tf.argmin(random_ten, axis= 1)

<tf.Tensor: shape=(2,), dtype=int64, numpy=array([3, 4], dtype=int64)>

### Squeezing a tensor (removing all single dimensions)

If you need to remove single-dimensions from a tensor (dimensions with size 1), you can use `tf.squeeze()`.

* [`tf.squeeze()`](https://www.tensorflow.org/api_docs/python/tf/squeeze) - remove all dimensions of size 1 from a tensor.

In [132]:
sqz_tensor = tf.random.uniform(shape= [1,1,1,1,50])
sqz_tensor

<tf.Tensor: shape=(1, 1, 1, 1, 50), dtype=float32, numpy=
array([[[[[0.7402308 , 0.33938193, 0.5692506 , 0.44811392, 0.29285502,
           0.4260056 , 0.62890387, 0.691061  , 0.30925727, 0.89236605,
           0.66396606, 0.30541587, 0.8724164 , 0.1025728 , 0.56819403,
           0.25427842, 0.7253866 , 0.4770788 , 0.46289814, 0.88944995,
           0.6792555 , 0.09752727, 0.01609659, 0.4876021 , 0.5832968 ,
           0.41212583, 0.731905  , 0.93418944, 0.5298122 , 0.9664817 ,
           0.88391197, 0.10578597, 0.44439578, 0.7851516 , 0.47332513,
           0.89893615, 0.04290593, 0.8717004 , 0.6068529 , 0.12963045,
           0.4527359 , 0.24573493, 0.34777248, 0.582147  , 0.82298195,
           0.82862926, 0.877372  , 0.5319803 , 0.03594303, 0.03986669]]]]],
      dtype=float32)>

In [133]:
sqz_tensor.shape

TensorShape([1, 1, 1, 1, 50])

In [134]:
tf.squeeze(sqz_tensor)

<tf.Tensor: shape=(50,), dtype=float32, numpy=
array([0.7402308 , 0.33938193, 0.5692506 , 0.44811392, 0.29285502,
       0.4260056 , 0.62890387, 0.691061  , 0.30925727, 0.89236605,
       0.66396606, 0.30541587, 0.8724164 , 0.1025728 , 0.56819403,
       0.25427842, 0.7253866 , 0.4770788 , 0.46289814, 0.88944995,
       0.6792555 , 0.09752727, 0.01609659, 0.4876021 , 0.5832968 ,
       0.41212583, 0.731905  , 0.93418944, 0.5298122 , 0.9664817 ,
       0.88391197, 0.10578597, 0.44439578, 0.7851516 , 0.47332513,
       0.89893615, 0.04290593, 0.8717004 , 0.6068529 , 0.12963045,
       0.4527359 , 0.24573493, 0.34777248, 0.582147  , 0.82298195,
       0.82862926, 0.877372  , 0.5319803 , 0.03594303, 0.03986669],
      dtype=float32)>
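By default `tf.squeeze()` removes every size-1 dimension. If you only want to drop some of them, a minimal sketch (using a small hypothetical tensor `t`) is to pass the `axis` argument. `tf.expand_dims()` goes the other way and adds a size-1 dimension back:

```python
import tensorflow as tf

# Hypothetical tensor with two size-1 dimensions (axes 0 and 2)
t = tf.zeros(shape=[1, 3, 1, 2])

# Remove every size-1 dimension
all_squeezed = tf.squeeze(t)

# Remove only the first dimension, leave the other size-1 dimension alone
first_only = tf.squeeze(t, axis=0)

# Add a size-1 dimension back at the front
restored = tf.expand_dims(all_squeezed, axis=0)

print(all_squeezed.shape, first_only.shape, restored.shape)
```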

### One-hot encoding

If you have a tensor of indices and would like to one-hot encode it, you can use [`tf.one_hot()`](https://www.tensorflow.org/api_docs/python/tf/one_hot).

You should also specify the `depth` parameter (the number of classes, which sets the length of each one-hot vector).

In [146]:
python_list = [0,1,2,3]

In [147]:
tf.one_hot(python_list, depth=4)

<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]], dtype=float32)>

In [148]:
tf.one_hot(python_list, depth= 4, on_value= "On", off_value= "Off")

<tf.Tensor: shape=(4, 4), dtype=string, numpy=
array([[b'On', b'Off', b'Off', b'Off'],
       [b'Off', b'On', b'Off', b'Off'],
       [b'Off', b'Off', b'On', b'Off'],
       [b'Off', b'Off', b'Off', b'On']], dtype=object)>
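To see why `depth` matters, here is a small sketch (variable names are illustrative). An index that is greater than or equal to `depth` gets a row made entirely of the off value, and `tf.argmax()` along the last axis recovers the original indices from a correctly sized encoding:

```python
import tensorflow as tf

indices = [0, 1, 2, 3]

# depth matches the number of classes: every index gets its own column
encoded = tf.one_hot(indices, depth=4)

# argmax along axis 1 turns each one-hot row back into its index
decoded = tf.argmax(encoded, axis=1)

# depth is too small: index 3 is out of range, so its row is all zeros
too_shallow = tf.one_hot(indices, depth=3)

print(decoded.numpy())
print(too_shallow.numpy())
```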

## Using `@tf.function`

In your TensorFlow adventures, you might come across Python functions which have the decorator [`@tf.function`](https://www.tensorflow.org/api_docs/python/tf/function).

If you aren't sure what Python decorators do, [read RealPython's guide on them](https://realpython.com/primer-on-python-decorators/).

But in short, decorators modify a function in one way or another.

In the case of the `@tf.function` decorator, it turns a Python function into a callable TensorFlow graph. In other words, if you've written your own Python function and decorated it with `@tf.function`, TensorFlow will attempt to convert it into a fast(er) version of itself the first time you call it (by making it part of a computation graph). That graph can also be exported with your code (to potentially run on another device).

For more on this, read the [Better performance with tf.function](https://www.tensorflow.org/guide/function) guide.

In [150]:
import numpy as np

# Simple Python Function:
def func(x, y):
    return x ** 2 + y

x = tf.constant(np.arange(0,10))
y = tf.constant(np.arange(10,20))

x,y, func(x, y)

(<tf.Tensor: shape=(10,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])>,
 <tf.Tensor: shape=(10,), dtype=int32, numpy=array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])>,
 <tf.Tensor: shape=(10,), dtype=int32, numpy=array([ 10,  12,  16,  22,  30,  40,  52,  66,  82, 100])>)

In [151]:
# Same Function Decorated with @tf.function:

@tf.function
def func(x, y):
    return x ** 2 + y

func(x, y)

<tf.Tensor: shape=(10,), dtype=int32, numpy=array([ 10,  12,  16,  22,  30,  40,  52,  66,  82, 100])>

If you noticed no difference between the above two functions (the decorated one and the non-decorated one) you'd be right.

Much of the difference happens behind the scenes. One of the main differences is a potential code speed-up, where TensorFlow can find one.
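One way to peek behind the scenes is to count how often the Python body of a decorated function actually runs. The sketch below is illustrative only (`trace_count` and `square_plus` are made-up names): plain Python code inside a `@tf.function` runs while TensorFlow traces the function into a graph, and later calls with inputs of the same dtype and shape reuse that graph instead of running the Python code again.

```python
import tensorflow as tf

trace_count = 0  # counts how many times the Python body is executed

@tf.function
def square_plus(x, y):
    global trace_count
    trace_count += 1  # plain Python side effect: only runs during tracing
    return x ** 2 + y

a = tf.constant([1, 2, 3])
b = tf.constant([10, 20, 30])

first = square_plus(a, b)       # first call: the function is traced into a graph
second = square_plus(a + 1, b)  # same dtype and shape: the graph is reused

print(first.numpy(), second.numpy(), trace_count)
```

The results are the same as the plain Python version would give, but the Python body only ran once across the two calls.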

## Finding access to GPUs

We've mentioned GPUs plenty of times throughout this notebook.

So how do you check if you've got one available?

You can check if you've got access to a GPU using [`tf.config.list_physical_devices()`](https://www.tensorflow.org/guide/gpu).

In [155]:
tf.config.list_physical_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
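The output above lists only a CPU, so this machine has no GPU visible to TensorFlow. As a small sketch, you can also pass a device type to `tf.config.list_physical_devices()` to ask for GPUs specifically and branch on the result (the printed messages here are just illustrative):

```python
import tensorflow as tf

# Ask only for GPU devices; the list is empty when none are available
gpus = tf.config.list_physical_devices("GPU")

if gpus:
    print(f"Found {len(gpus)} GPU(s):")
    for gpu in gpus:
        print(" ", gpu.name)
else:
    print("No GPU found, TensorFlow will run on the CPU.")
```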