**Welcome to TensorFlow: A Beginner's Guide**

**What is TensorFlow?**

TensorFlow is an open-source library for building, training, and running machine learning models. Think of it as a toolbox: it bundles many ready-made functions, so we can assemble and train models without writing every piece ourselves.

**Why Use TensorFlow?**

Instead of building everything from scratch, we can rely on TensorFlow's implementations of common machine learning functions, so we don't have to start from zero.

**What We're Going to Cover**

TensorFlow is a big topic, but don't worry, we're going to take it one step at a time. We'll cover the basics of TensorFlow, including:

* What are tensors and how do we create them?
* How do we get information from tensors?
* How do we manipulate tensors?
* How do tensors work with NumPy?
* How do we use the `@tf.function` decorator to make our code faster?
* How do we use GPUs with TensorFlow?
* Some exercises to try on your own

**Important Notes**

* Many of the things we'll cover will happen automatically when we build a model, but it's good to know what's going on behind the scenes.
* If you see a TensorFlow function that you don't understand, you can always look it up in the documentation, which works like a manual explaining how everything works. You can find it here:
     https://www.tensorflow.org/api_docs/python/

In [1]:
import tensorflow as tf
print(tf.__version__)

2.10.1


# Tensors

A **tensor** is a mathematical object that represents a multi-dimensional array of numerical values.

Think of a tensor like a container that holds a set of values, similar to a matrix or an array. However, tensors can have more than two dimensions, unlike matrices, which are limited to two dimensions.

Here's a simple analogy to help you understand tensors:

* A scalar is a single value, like a number (e.g., 5).
* A vector is a 1-dimensional array of values, like a list of numbers (e.g., [1, 2, 3, 4, 5]).
* A matrix is a 2-dimensional array of values, like a table of numbers (e.g., [[1, 2], [3, 4]]).
* A tensor is a multi-dimensional array of values, like a cube or a higher-dimensional structure of numbers (e.g., [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]).

Tensors are used to represent complex data structures in machine learning and deep learning, such as:

* Images: A color image can be represented as a 3-dimensional tensor, with dimensions for height, width, and color channels (e.g., RGB).
* Audio: A spectrogram of a sound wave can be represented as a 2-dimensional tensor, with dimensions for time and frequency.
* Text: A batch of sentences can be represented as a 3-dimensional tensor, with dimensions for sentences, words, and the numbers that encode each word's meaning.

In TensorFlow, tensors are the fundamental data structure used to represent inputs, outputs, and intermediate results of machine learning models. TensorFlow provides various operations and functions to manipulate and transform tensors, which enables the creation of complex machine learning models.

In [2]:
scalar = tf.constant(7)
scalar

<tf.Tensor: shape=(), dtype=int32, numpy=7>

In [3]:
# Check the number of dimensions
scalar.ndim

0

In [4]:
# Create a vector
vector = tf.constant([10,10])
vector

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([10, 10])>

In [5]:
# Check the number of dimensions
vector.ndim

1

In [6]:
# Create a matrix
matrix = tf.constant([[10,7],[7,10]])
matrix

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 7, 10]])>

In [7]:
matrix.ndim

2

In [8]:
# How about a tensor? (more than 2 dimensions, although, all of the above items are also technically tensors)
tensor = tf.constant([[[1, 2, 3],
                       [4, 5, 6]],
                      [[7, 8, 9],
                       [10, 11, 12]],
                      [[13, 14, 15],
                       [16, 17, 18]]])
tensor

<tf.Tensor: shape=(3, 2, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]],

       [[13, 14, 15],
        [16, 17, 18]]])>

In [9]:
tensor.ndim

3

Let's break down an example of a tensor with shape (224, 224, 3, 32):

* 224 and 224 are the height and width of an image in pixels.
* 3 is the number of color channels (red, green, and blue).
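* 32 is the batch size, or the number of images processed at once. (This example puts the batch last; TensorFlow's image layers usually expect it first, as in `(32, 224, 224, 3)`.)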
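
As a quick check of that breakdown, the sketch below builds a tensor of zeros with the same shape and reads each dimension back. The variable name `image_batch` is only illustrative; it isn't used elsewhere in this notebook.

```python
import tensorflow as tf

# A stand-in for 32 RGB images of 224x224 pixels (illustrative name, zeros instead of real pixels)
image_batch = tf.zeros(shape=(224, 224, 3, 32))

# as_list() turns the TensorShape into plain Python integers
height, width, channels, batch_size = image_batch.shape.as_list()

print("Number of dimensions:", image_batch.ndim)
print("Height, width, channels, batch size:", height, width, channels, batch_size)
```

Reading the shape back like this is a handy habit whenever you're unsure which axis holds which piece of information.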

**Scalars, Vectors, Matrices, and Tensors**

These terms are often used to describe tensors with specific numbers of dimensions:

* **Scalar**: a single number.
* **Vector**: a 1-dimensional array of numbers, often described as a number with direction (e.g. wind speed and direction).
* **Matrix**: a 2-dimensional array of numbers.
* **Tensor**: an n-dimensional array of numbers (where n can be any number).

Note that the terms "matrix" and "tensor" are often used interchangeably.

From now on, we'll refer to everything as tensors, since we're using TensorFlow.

If you want to learn more about the mathematical differences between scalars, vectors, and matrices, check out the [visual algebra post by Math is Fun](https://www.mathsisfun.com/algebra/scalar-vector-matrix.html).

![difference between scalar, vector, matrix, tensor](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-scalar-vector-matrix-tensor.png)

**Creating Tensors with `tf.Variable()`**

You can create tensors using `tf.Variable()`. This is not usually necessary, as tensors are often created automatically when working with data.

**The Difference Between `tf.Variable()` and `tf.constant()`**

There are two ways to create tensors: `tf.Variable()` and `tf.constant()`. The main difference is:

* **Immutable**: Tensors created with `tf.constant()` cannot be changed. You can only use them to create a new tensor.
* **Mutable**: Tensors created with `tf.Variable()` can be changed.

In [10]:
# Mutable = m , NON-Mutable = nm
m = tf.Variable([10,7])
nm = tf.constant([10,7])
m , nm

(<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([10,  7])>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([10,  7])>)

Changing an element of a `tf.Variable()` tensor requires the `assign()` method.

In [11]:
# requires the .assign() method
m[0].assign(1)
m

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([1, 7])>
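
To see the other side of that difference, here is a minimal sketch of what happens when you try to change a `tf.constant()` tensor. The names `changeable` and `unchangeable` are made up for this example, and the exact error type is what I'd expect in eager mode rather than something the notebook above demonstrates.

```python
import tensorflow as tf

changeable = tf.Variable([10, 7])
unchangeable = tf.constant([10, 7])

# Variables are updated through assign(), either as a whole or one element at a time
changeable.assign([1, 7])
changeable[1].assign(2)

# Tensors created with tf.constant() have no assign() method at all
constant_has_assign = hasattr(unchangeable, "assign")

# Plain item assignment with "=" is rejected as well
try:
    unchangeable[0] = 1
    item_assignment_failed = False
except TypeError:
    item_assignment_failed = True

print(changeable.numpy(), constant_has_assign, item_assignment_failed)
```

If you need a changed version of a constant tensor, build a new tensor from it instead.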

## Creating Random Tensors

### What are Random Tensors?

Random tensors are tensors of arbitrary size that contain random numbers. These tensors are used in various applications, particularly in neural networks to initialize weights and patterns that need to be learned from data.

### Why Do We Need Random Tensors?

Neural networks use random tensors to initialize their weights, which are then refined through training to represent patterns in the data. In other words, training takes a random n-dimensional array of numbers and repeatedly adjusts it until it captures patterns in the original data.

### How Do Neural Networks Learn?

The learning process in neural networks involves the following steps:

1. **Initialization**: The network starts with random patterns.
2. **Training**: The network is trained on demonstrative examples of data.
3. **Refining**: The network updates its random patterns to represent the examples.

### Creating Random Tensors with TensorFlow

TensorFlow provides the `tf.random.Generator` class to create random tensors. This class maintains an internal state that is updated every time random numbers are generated. You can create a `tf.random.Generator` object by manually creating an instance of the class or by using `tf.random.get_global_generator()` to get the default global generator.

Here's an example of creating a random tensor using `tf.random.Generator`:

```
g = tf.random.Generator.from_seed(1)
print(g.normal(shape=[2, 3]))
```

This will generate a random tensor of shape (2, 3) with random numbers.

### Key Points to Remember

- `tf.random.Generator` maintains an internal state that is updated every time random numbers are generated.
- You can create a `tf.random.Generator` object manually or use `tf.random.get_global_generator()` to get the default global generator.
- The `tf.random.Generator` class is used to create independent random-number streams.
- It interacts with `tf.function` and distribution strategies in specific ways.
- You can save and restore generators using `tf.train.Checkpoint`.
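
The first key point, the internal state, is easiest to see by drawing from the same generator twice. This is a small sketch with illustrative variable names (`first_draw`, `second_draw`, `replayed_draw`), not part of the original notebook.

```python
import tensorflow as tf

generator = tf.random.Generator.from_seed(42)

# Each call advances the generator's internal state, so the two draws differ
first_draw = generator.normal(shape=(3, 2))
second_draw = generator.normal(shape=(3, 2))

# A fresh generator with the same seed starts from the same state again
replayed_draw = tf.random.Generator.from_seed(42).normal(shape=(3, 2))

print(bool(tf.reduce_all(first_draw == second_draw)))    # are the first two draws identical?
print(bool(tf.reduce_all(first_draw == replayed_draw)))  # does the re-seeded draw match the first?
```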


In [12]:
# Create two random (but the same) tensors
random_1 = tf.random.Generator.from_seed(42) # set the seed for reproducibility
random_1 = random_1.normal(shape=(3, 2)) # create tensor from a normal distribution 
random_2 = tf.random.Generator.from_seed(42)
random_2 = random_2.normal(shape=(3, 2))

# Are they equal?
random_1, random_2, random_1 == random_2


(<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193765, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193765, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[ True,  True],
        [ True,  True],
        [ True,  True]])>)

In [13]:
# Create two random (and different) tensors
random_3 = tf.random.Generator.from_seed(42)
random_3 = random_3.normal(shape=(3, 2))
random_4 = tf.random.Generator.from_seed(11)
random_4 = random_4.normal(shape=(3, 2))

# Check the tensors and see if they are equal
random_3, random_4, random_1 == random_3, random_3 == random_4

(<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193765, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[ 0.2730574 , -0.29925638],
        [-0.3652325 ,  0.61883307],
        [-1.0130816 ,  0.2829171 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[ True,  True],
        [ True,  True],
        [ True,  True]])>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[False, False],
        [False, False],
        [False, False]])>)

What if you wanted to shuffle the order of a tensor?

Wait, why would you want to do that?

Let's say you're working with 15,000 images of cats and dogs, where the first 10,000 images are of cats and the next 5,000 are of dogs. This order could affect how a neural network learns (it may overfit by learning the order of the data), so it might be a good idea to move your data around.

In [14]:
# Shuffle a tensor (valuable for when you want to shuffle your data)
not_shuffled = tf.constant([[10, 7],
                            [3, 4],
                            [2, 5]])
# Gets different results each time
tf.random.shuffle(not_shuffled)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 3,  4],
       [ 2,  5],
       [10,  7]])>

Wait... why do the numbers come out in a different order each time, and why doesn't passing `seed` alone pin the order down?

It's due to rule #4 of the [`tf.random.set_seed()`](https://www.tensorflow.org/api_docs/python/tf/random/set_seed) documentation.

> "4. If both the global and the operation seed are set: Both seeds are used in conjunction to determine the random sequence."

`tf.random.set_seed(42)` sets the global seed, and the `seed` parameter in `tf.random.shuffle(seed=42)` sets the operation seed.

As the documentation puts it, "Operations that rely on a random seed actually derive it from two seeds: the global and operation-level seeds. This sets the global seed." So to get the same order every time a cell runs, set the global seed as well as the operation seed.


In [15]:
not_shuffled

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4],
       [ 2,  5]])>

In [16]:
# Shuffle in the same order every time using the seed parameter (won't actually be the same)
tf.random.shuffle(not_shuffled, seed=42)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 2,  5],
       [ 3,  4],
       [10,  7]])>
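
To get a shuffle that really does repeat, one option is to set the global seed right before each call and pass the operation seed as well. The sketch below assumes eager execution (the TensorFlow 2 default); `first_run` and `second_run` are illustrative names.

```python
import tensorflow as tf

not_shuffled = tf.constant([[10, 7],
                            [3, 4],
                            [2, 5]])

# Global seed + operation seed, set immediately before the shuffle
tf.random.set_seed(42)
first_run = tf.random.shuffle(not_shuffled, seed=42)

# Resetting the global seed restarts the random sequence
tf.random.set_seed(42)
second_run = tf.random.shuffle(not_shuffled, seed=42)

same_order = bool(tf.reduce_all(first_run == second_run))
print(same_order)
```

Note that shuffling only reorders the rows (the first axis); the values inside each row stay together.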

--------------------------------- Exercise -------------------------------

In [17]:
rd1 = tf.constant([[1,1,1],[2,2,2],[3,3,3]])
rd2 = tf.constant([[1,0,0],[0,1,0],[0,0,1]])
rd3 = tf.constant([[2,3,5],[7,11,13],[17,19,23]])
rd1 , rd2, rd3 

(<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
 array([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])>,
 <tf.Tensor: shape=(3, 3), dtype=int32, numpy=
 array([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]])>,
 <tf.Tensor: shape=(3, 3), dtype=int32, numpy=
 array([[ 2,  3,  5],
        [ 7, 11, 13],
        [17, 19, 23]])>)

In [18]:
tf.random.shuffle(rd1, seed=23)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[3, 3, 3],
       [1, 1, 1],
       [2, 2, 2]])>

-------------------------------------------------------------------------------

In [19]:
tf.random.set_seed(42)
tf.random.shuffle(rd2)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[0, 1, 0],
       [0, 0, 1],
       [1, 0, 0]])>

### Other ways to make tensors

Though you might rarely use these (remember, many tensor operations are done behind the scenes for you), you can use [`tf.ones()`](https://www.tensorflow.org/api_docs/python/tf/ones) to create a tensor of all ones and [`tf.zeros()`](https://www.tensorflow.org/api_docs/python/tf/zeros) to create a tensor of all zeros.

In [20]:
# Make a tensor of all ones
tf.ones(shape=(3, 3))

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]], dtype=float32)>

In [21]:
tf.zeros(shape=(3,3))

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]], dtype=float32)>

You can also turn NumPy arrays into tensors.

Remember, the main difference between tensors and NumPy arrays is that tensors can be run on GPUs.

> 🔑 **Note:** A matrix or tensor is typically represented by a capital letter (e.g. `X` or `A`) whereas a vector is typically represented by a lowercase letter (e.g. `y` or `b`).

In [22]:
# Numpy arrays into tensors
import numpy as np
# Numpy array = NA
NA = np.arange(1,25, dtype= np.int32)
NA

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24])

In [23]:
# numpy to tensor
NtT = tf.constant(NA,shape=(4,3,2))
NtT

<tf.Tensor: shape=(4, 3, 2), dtype=int32, numpy=
array([[[ 1,  2],
        [ 3,  4],
        [ 5,  6]],

       [[ 7,  8],
        [ 9, 10],
        [11, 12]],

       [[13, 14],
        [15, 16],
        [17, 18]],

       [[19, 20],
        [21, 22],
        [23, 24]]])>

In [24]:
NtT.ndim

3
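
The conversion also works in the other direction. Here is a minimal sketch of a round trip between NumPy and TensorFlow; the variable names are illustrative and the `(2, 3, 4)` shape is just another arrangement of the same 24 numbers.

```python
import numpy as np
import tensorflow as tf

array = np.arange(1, 25, dtype=np.int32)

# NumPy -> tensor (the total number of elements must match the requested shape)
as_tensor = tf.constant(array, shape=(2, 3, 4))

# Tensor -> NumPy, two equivalent ways
back_with_method = as_tensor.numpy()
back_with_numpy = np.array(as_tensor)

print(type(back_with_method), back_with_method.shape)
print(np.array_equal(back_with_method, back_with_numpy))
```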

**Getting Information from Tensors: Shape, Rank, Size, and More**

When working with tensors, it's essential to understand how to extract valuable information from them. Here are some key concepts to grasp:

* **Shape**: The number of elements in each dimension of a tensor. Think of it as the length of each dimension.
* **Rank**: The number of dimensions in a tensor. For example, a scalar has a rank of 0, a vector has a rank of 1, a matrix has a rank of 2, and so on.
* **Axis** or **Dimension**: A specific dimension of a tensor. You can think of it as a particular direction or feature of the tensor.
* **Size**: The total number of elements in a tensor. It's the product of the lengths of all its dimensions.

Understanding these concepts is crucial when working with tensors, especially when aligning the shapes of your data with the shapes of your model. For instance, you might need to ensure that the shape of your image tensors matches the shape of your model's input layer.


In [25]:
# Create a rank-4 tensor (4 dimensions)
r4 = tf.zeros(shape=[2,5,3,3])
r4

<tf.Tensor: shape=(2, 5, 3, 3), dtype=float32, numpy=
array([[[[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]]],


       [[[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]]]], dtype=float32)>

In [26]:
r4[0]

<tf.Tensor: shape=(5, 3, 3), dtype=float32, numpy=
array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]], dtype=float32)>

In [27]:
r4.shape , r4.ndim , tf.size(r4)

(TensorShape([2, 5, 3, 3]), 4, <tf.Tensor: shape=(), dtype=int32, numpy=90>)

In [28]:
dim5 = tf.random.Generator.from_seed(42)
dim5 = dim5.normal(shape = (1,1,1,1))
dim5

<tf.Tensor: shape=(1, 1, 1, 1), dtype=float32, numpy=array([[[[-0.7565803]]]], dtype=float32)>

In [29]:
dim5.ndim, dim5.shape, tf.size(dim5)

(4, TensorShape([1, 1, 1, 1]), <tf.Tensor: shape=(), dtype=int32, numpy=1>)

In [30]:
# Get various attributes of our tensor
print("Datatype of every element:", r4.dtype)
print("Number of dimensions:", r4.ndim)
print("Shape of tensor:", r4.shape)
print("Elements along axis 0 of tensor:", r4.shape[0])
print("Elements along the last axis of tensor:", r4.shape[-1])
print("Total number of elements:", tf.size(r4))
print("Total number of elements:", tf.size(r4).numpy())


Datatype of every element: <dtype: 'float32'>
Number of dimensions: 4
Shape of tensor: (2, 5, 3, 3)
Elements along axis 0 of tensor: 2
Elements along the last axis of tensor: 3
Total number of elements: tf.Tensor(90, shape=(), dtype=int32)
Total number of elements: 90


## Indexing tensors
You can also index tensors just like Python lists.

In [31]:
random_list = [1,2,3,4]
random_list[:2]

[1, 2]

In [32]:
r4[:2,:2,:2,:2 ]

<tf.Tensor: shape=(2, 2, 2, 2), dtype=float32, numpy=
array([[[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]],


       [[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]]], dtype=float32)>

In [33]:
r4[:1,:1,:1,:]

<tf.Tensor: shape=(1, 1, 1, 3), dtype=float32, numpy=array([[[[0., 0., 0.]]]], dtype=float32)>

In [34]:
# r2 = tf.random.Generator.from_seed(42)
# r2= r2.normal(shape= (3,3))
r2 = tf.constant([[1,2],[3,4]])
r2

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4]])>

In [35]:
r2[:,-1]

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([2, 4])>

In [36]:
# Add an extra dimension to our rank-2 tensor
# "..." is shorthand for [:, :] - it includes all existing dimensions before the new one
# tf.newaxis creates a new axis/dimension at that position
r3t = r2[...,tf.newaxis]
r3t

<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[1],
        [2]],

       [[3],
        [4]]])>

In [37]:
# alternative to tf.newaxis
tf.expand_dims(r2, axis=-1) # "-1" means add the new axis at the end


<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[1],
        [2]],

       [[3],
        [4]]])>

In [38]:
tf.expand_dims(r2, axis=1)

<tf.Tensor: shape=(2, 1, 2), dtype=int32, numpy=
array([[[1, 2]],

       [[3, 4]]])>

In [39]:
r2

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4]])>

### Manipulating tensors

Here are the main points:

**Basic Operations on Tensors**

* You can perform basic math operations on tensors using Python operators like `+`, `-`, and `*`.
* These operations create a new tensor with the result, leaving the original tensor unchanged.
* You can also use TensorFlow functions like `tf.multiply()`, which perform the same operations as the Python operators.

**Basic Operations on Tensors for Beginners**

**What are Tensors?**
A tensor is a multi-dimensional array of numbers. A 2-dimensional tensor, like the one used below, looks like a table with rows and columns.

**Adding Values**
You can add a number to a tensor using the `+` operator.
```
tensor = tf.constant([[10, 7], [3, 4]])
tensor + 10
```
This will add 10 to each number in the tensor, resulting in:
```
[[20, 17],
 [13, 14]]
```
**Important:** The original tensor remains unchanged. The operation creates a new tensor with the result.

**Original Tensor Unchanged**
```
tensor
```
This will still show the original tensor:
```
[[10, 7],
 [3, 4]]
```
**Multiplication (Element-wise)**
You can multiply a tensor by a number using the `*` operator.
```
tensor * 10
```
This will multiply each number in the tensor by 10, resulting in:
```
[[100, 70],
 [30, 40]]
```
**Subtraction**
You can subtract a number from a tensor using the `-` operator.
```
tensor - 10
```
This will subtract 10 from each number in the tensor, resulting in:
```
[[0, -3],
 [-7, -6]]
```
**Using TensorFlow Functions**
TensorFlow has its own functions for performing operations, like `tf.multiply()`.
```
tf.multiply(tensor, 10)
```
This will give the same result as the `*` operator:
```
[[100, 70],
 [30, 40]]
```
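**Why Use TensorFlow Functions?**
The Python operators on tensors call these same TensorFlow functions, so the results are identical. The explicit function form is handy when you want to pass extra arguments such as `name`, and it makes it obvious that the operation runs inside TensorFlow.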

**Original Tensor Still Unchanged**
No matter what operations you perform, the original tensor remains unchanged.
```
tensor
```
This will still show the original tensor:
```
[[10, 7],
 [3, 4]]
```
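
To tie the operators and the functions together, the sketch below runs both forms on the same tensor and compares them. `tf.add()` and `tf.subtract()` are the function counterparts of `+` and `-`; the result variable names are illustrative.

```python
import tensorflow as tf

tensor = tf.constant([[10, 7], [3, 4]])

# Function form of each operation
added = tf.add(tensor, 10)
subtracted = tf.subtract(tensor, 10)
multiplied = tf.multiply(tensor, 10)

# Compare against the operator form, element by element
print(bool(tf.reduce_all(added == tensor + 10)))
print(bool(tf.reduce_all(subtracted == tensor - 10)))
print(bool(tf.reduce_all(multiplied == tensor * 10)))

# The original tensor is untouched by all of the above
print(tensor.numpy())
```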

Trying to matrix multiply two tensors that both have the shape `(3, 2)`, such as the `X` and `Y` defined below, raises an error because the inner dimensions don't match.

We need to either:
* Reshape X to `(2, 3)` so it's `(2, 3) @ (3, 2)`.
* Reshape Y to `(2, 3)` so it's `(3, 2) @ (2, 3)`.

We can do this with either:
* [`tf.reshape()`](https://www.tensorflow.org/api_docs/python/tf/reshape) - allows us to reshape a tensor into a defined shape.
* [`tf.transpose()`](https://www.tensorflow.org/api_docs/python/tf/transpose) - switches the dimensions of a given tensor.
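
Before fixing the shapes, it can help to see the failure itself. The sketch below uses a hypothetical helper, `can_matmul()`, that I've made up to check the inner dimensions; it is not a TensorFlow function. The exact exception type may vary between TensorFlow versions and execution modes, so the example catches both of the likely ones.

```python
import tensorflow as tf

X = tf.constant([[1, 2], [3, 4], [5, 6]])
Y = tf.constant([[7, 8], [9, 10], [11, 12]])

def can_matmul(a, b):
    """Hypothetical helper: True when the inner dimensions of two rank-2 tensors match."""
    return a.shape[-1] == b.shape[0]

print(can_matmul(X, Y))                # (3, 2) @ (3, 2): inner dimensions 2 and 3
print(can_matmul(tf.transpose(X), Y))  # (2, 3) @ (3, 2): inner dimensions 3 and 3

# Multiplying the mismatched shapes raises an error
try:
    tf.matmul(X, Y)
    matmul_failed = False
except (tf.errors.InvalidArgumentError, ValueError) as error:
    matmul_failed = True
    print(type(error).__name__)
```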

![lining up dimensions for dot products](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-lining-up-dot-products.png)

Let's try `tf.reshape()` first.

In [40]:
# Create (3, 2) tensor
X = tf.constant([[1, 2],
                 [3, 4],
                 [5, 6]])

# Create another (3, 2) tensor
Y = tf.constant([[7, 8],
                 [9, 10],
                 [11, 12]])
X, Y

(<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4],
        [5, 6]])>,
 <tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[ 7,  8],
        [ 9, 10],
        [11, 12]])>)

In [41]:
# Changing the shape 
tf.reshape(Y, shape= (2,3))

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[ 7,  8,  9],
       [10, 11, 12]])>

In [42]:
X @ tf.reshape(Y, shape =(2,3))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]])>

In [43]:
# Transposing Y via parameters also gives a (3, 3) result, but with different values than reshaping
tf.matmul(a=X, b=Y, transpose_a=False, transpose_b=True)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 23,  29,  35],
       [ 53,  67,  81],
       [ 83, 105, 127]])>

It worked. Let's try the same with a transposed `X`, this time using [`tf.transpose()`](https://www.tensorflow.org/api_docs/python/tf/transpose) and `tf.matmul()`.

In [44]:
# Example of transpose (3, 2) -> (2, 3)
tf.transpose(X)

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 3, 5],
       [2, 4, 6]])>

In [45]:
# Try matrix multiplication 
tf.matmul(tf.transpose(X), Y)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]])>

In [46]:
# You can achieve the same result with parameters
tf.matmul(a=X, b=Y, transpose_a=True, transpose_b=False)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]])>

Notice the difference in the resulting shapes when transposing `X` or reshaping `Y`.

This is because the result of a matrix multiplication takes the shape of the outer dimensions:
 * `(3, 2) @ (2, 3)` -> `(3, 3)` done with `X @ tf.reshape(Y, shape=(2, 3))` 
 * `(2, 3) @ (3, 2)` -> `(2, 2)` done with `tf.matmul(tf.transpose(X), Y)`

This kind of data manipulation is a reminder: you'll spend a lot of your time in machine learning and working with neural networks reshaping data (in the form of tensors) to prepare it to be used with various operations (such as feeding it to a model).

### The dot product

Multiplying matrices by each other is also referred to as the dot product.

You can perform the `tf.matmul()` operation using [`tf.tensordot()`](https://www.tensorflow.org/api_docs/python/tf/tensordot). 

In [47]:
# Perform the dot product on X and Y (requires X to be transposed)
tf.tensordot(tf.transpose(X), Y, axes=1)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]])>

You might notice that although both `reshape` and `transpose` work, you get different results from each.

Let's see an example, first with `tf.transpose()` then with `tf.reshape()`.

In [48]:
# Perform matrix multiplication between X and Y (transposed)
tf.matmul(X, tf.transpose(Y))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 23,  29,  35],
       [ 53,  67,  81],
       [ 83, 105, 127]])>

In [49]:
# Perform matrix multiplication between X and Y (reshaped)
tf.matmul(X, tf.reshape(Y, (2, 3)))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]])>

Hmm... they result in different values.

Which is strange because when dealing with `Y` (a `(3, 2)` matrix), reshaping it to `(2, 3)` and transposing it result in the same shape.

In [50]:
# Check shapes of Y, reshaped Y and transposed Y
Y.shape, tf.reshape(Y, (2, 3)).shape, tf.transpose(Y).shape

(TensorShape([3, 2]), TensorShape([2, 3]), TensorShape([2, 3]))

But calling `tf.reshape()` and `tf.transpose()` on `Y` doesn't necessarily result in the same values.

In [51]:
# Check values of Y, reshaped Y and transposed Y
print("Normal Y:")
print(Y, "\n") # "\n" for newline

print("Y reshaped to (2, 3):")
print(tf.reshape(Y, (2, 3)), "\n")

print("Y transposed:")
print(tf.transpose(Y))

Normal Y:
tf.Tensor(
[[ 7  8]
 [ 9 10]
 [11 12]], shape=(3, 2), dtype=int32) 

Y reshaped to (2, 3):
tf.Tensor(
[[ 7  8  9]
 [10 11 12]], shape=(2, 3), dtype=int32) 

Y transposed:
tf.Tensor(
[[ 7  9 11]
 [ 8 10 12]], shape=(2, 3), dtype=int32)


So which should you use?

Again, most of the time these operations will be implemented for you when they need to be run (such as during the training of a neural network).

But generally, whenever performing a matrix multiplication and the shapes of two matrices don't line up, you will transpose (not reshape) one of them in order to line them up.
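
One way to see why transposing is the safer choice: with a transpose, every entry of the result is still the dot product of a row of `X` with a row of `Y`, whereas a reshape regroups `Y`'s numbers into rows that never existed. The sketch below checks a single entry by hand; the variable names are illustrative.

```python
import tensorflow as tf

X = tf.constant([[1, 2], [3, 4], [5, 6]])
Y = tf.constant([[7, 8], [9, 10], [11, 12]])

with_transpose = tf.matmul(X, tf.transpose(Y))
with_reshape = tf.matmul(X, tf.reshape(Y, (2, 3)))

# Entry [0, 1] of X @ Y^T is row 0 of X dotted with row 1 of Y: 1*9 + 2*10
manual_entry = tf.reduce_sum(X[0] * Y[1])

print(with_transpose[0, 1].numpy(), manual_entry.numpy())
print(with_reshape[0, 1].numpy())  # mixes values from different rows of Y
```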

### Matrix multiplication tidbits
* If we transposed `Y`, it would be represented as $\mathbf{Y}^\mathsf{T}$ (note the capital T for transpose).
* Get an illustrative view of matrix multiplication [by Math is Fun](https://www.mathsisfun.com/algebra/matrix-multiplying.html).
* Try a hands-on demo of matrix multiplication: http://matrixmultiplication.xyz/ (shown below).

![visual demo of matrix multiplication](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-matrix-multiply-crop.gif)

### Changing the datatype of a tensor

Sometimes you'll want to alter the default datatype of your tensor. 

This is common when you want to compute using less precision (e.g. 16-bit floating point numbers vs. 32-bit floating point numbers). 

Computing with less precision is useful on devices with less computing capacity such as mobile devices (because the fewer bits, the less space the computations require).

You can change the datatype of a tensor using [`tf.cast()`](https://www.tensorflow.org/api_docs/python/tf/cast).

In [52]:
# Create a new tensor with default datatype (float32)
B = tf.constant([1.7, 7.4])

# Create a new tensor with default datatype (int32)
C = tf.constant([1, 7])
B, C

(<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1.7, 7.4], dtype=float32)>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 7])>)

In [53]:
# Change from int32 to float32
C = tf.cast(C, dtype=tf.float32)
C

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1., 7.], dtype=float32)>

In [54]:
D = tf.cast(B, dtype =tf.float16)
D , D.dtype

(<tf.Tensor: shape=(2,), dtype=float16, numpy=array([1.7, 7.4], dtype=float16)>,
 tf.float16)

In [55]:
E = tf.cast(C,dtype=tf.float32)
E, E.dtype

(<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1., 7.], dtype=float32)>,
 tf.float32)
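
Casting changes more than the label on the tensor. As a rough illustration, the sketch below shows the per-element storage size of two float types and what happens to the fractional part when casting floats to integers. The variable names `half` and `truncated` are illustrative.

```python
import tensorflow as tf

B = tf.constant([1.7, 7.4])              # float32 by default

half = tf.cast(B, dtype=tf.float16)      # same values, stored in half the bytes
truncated = tf.cast(B, dtype=tf.int32)   # the fractional part is dropped

# DType.size is the number of bytes used per element
print(tf.float32.size, tf.float16.size)
print(half.dtype, truncated.numpy())
```

Keep in mind that a lower-precision type can represent fewer distinct values, so very small differences between numbers may be lost.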

### Getting the absolute value
Sometimes you'll want the absolute values (every value made non-negative) of the elements in your tensors.

To do so, you can use [`tf.abs()`](https://www.tensorflow.org/api_docs/python/tf/math/abs).

In [56]:
# Create tensor with negative values
D = tf.constant([-7, -10])
D

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([ -7, -10])>

In [57]:
# Get the absolute values
tf.abs(D)

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([ 7, 10])>

### Finding the min, max, mean, sum (aggregation)

You can quickly aggregate (perform a calculation on a whole tensor) tensors to find things like the minimum value, maximum value, mean and sum of all the elements.

To do so, aggregation methods typically have the syntax `reduce_[action]()`, such as:
* [`tf.reduce_min()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_min) - find the minimum value in a tensor.
* [`tf.reduce_max()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_max) - find the maximum value in a tensor (helpful for when you want to find the highest prediction probability).
* [`tf.reduce_mean()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_mean) - find the mean of all elements in a tensor.
* [`tf.reduce_sum()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_sum) - find the sum of all elements in a tensor.
* **Note:** typically, each of these is under the `math` module, e.g. `tf.math.reduce_min()` but you can use the alias `tf.reduce_min()`.

Let's see them in action.

In [58]:
# Create a random tensor

E = tf.constant(np.random.randint(0,100,size=50))
E

<tf.Tensor: shape=(50,), dtype=int32, numpy=
array([19, 98,  7, 67, 19,  9, 60, 50, 80, 30, 28, 18,  6, 95, 67, 13, 75,
       12, 17, 27, 64, 42, 23,  8,  1, 30, 65, 19,  4, 76, 46, 93,  9, 48,
       16, 30,  9, 22, 65, 57, 78, 55, 20, 43, 98, 85, 11, 30, 95, 93])>

In [59]:
tf.size(E), E.shape, E.ndim       

(<tf.Tensor: shape=(), dtype=int32, numpy=50>, TensorShape([50]), 1)

In [60]:
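# Find the minimum, maximum, mean and sum
# NumPy uses np.min() and friends; the TensorFlow equivalents are below
# Note: the mean of an int32 tensor stays int32, so 2132 / 50 is reported as 42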
tf.reduce_min(E), tf.reduce_max(E), tf.reduce_mean(E), tf.reduce_sum(E)

(<tf.Tensor: shape=(), dtype=int32, numpy=1>,
 <tf.Tensor: shape=(), dtype=int32, numpy=98>,
 <tf.Tensor: shape=(), dtype=int32, numpy=42>,
 <tf.Tensor: shape=(), dtype=int32, numpy=2132>)

In [61]:
# Did not work: these functions need a float tensor (E is int32), and axis=1 does not exist on a 1-D tensor
# tf.math.reduce_variance(E)
# tf.math.reduce_std(E, axis=1)

In [62]:
# To find the variance with TensorFlow Probability instead
# import tensorflow_probability as tfp
# tfp.stats.variance(E)

In [63]:
# tfp.stats.stddev(tf.cast(E, dtype=float))

You can also find the standard deviation ([`tf.math.reduce_std()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_std)) and variance ([`tf.math.reduce_variance()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_variance)) of the elements in a tensor using similar methods. Both expect a float tensor, so cast integer tensors with `tf.cast()` first.
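
Here is a minimal sketch of computing the variance and standard deviation on a small tensor, cast to `float32` first because that is the input type I'd expect these functions to accept. The four values are made up so the results are easy to check by hand.

```python
import tensorflow as tf

values = tf.constant([1, 2, 3, 4])             # int32 by default
as_float = tf.cast(values, dtype=tf.float32)

int_mean = tf.reduce_mean(values)              # stays int32, so the fraction is dropped
float_mean = tf.reduce_mean(as_float)

variance = tf.math.reduce_variance(as_float)   # population variance
std = tf.math.reduce_std(as_float)             # square root of the variance

print(int_mean.numpy(), float_mean.numpy())
print(variance.numpy(), std.numpy())
```

The integer mean is a good reminder to check the dtype of a tensor before aggregating it.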

### Finding the positional maximum and minimum

How about finding the position in a tensor where the maximum value occurs?

This is helpful when you want to line up your labels (say `['Green', 'Blue', 'Red']`) with your prediction probabilities tensor (e.g. `[0.98, 0.01, 0.01]`).

In this case, the predicted label (the one with the highest prediction probability) would be `'Green'`.

You can do the same for the minimum (if required) with the following:
* [`tf.argmax()`](https://www.tensorflow.org/api_docs/python/tf/math/argmax) - find the position of the maximum element in a given tensor.
* [`tf.argmin()`](https://www.tensorflow.org/api_docs/python/tf/math/argmin) - find the position of the minimum element in a given tensor.
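
The label example above can be sketched in a few lines. The labels and probabilities are the ones from the text; `predicted_index` and `predicted_label` are illustrative names.

```python
import tensorflow as tf

labels = ["Green", "Blue", "Red"]
prediction_probabilities = tf.constant([0.98, 0.01, 0.01])

# Position of the highest probability, converted to a NumPy integer for list indexing
predicted_index = tf.argmax(prediction_probabilities).numpy()
predicted_label = labels[predicted_index]

print(predicted_index, predicted_label)
```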

In [64]:
tf.random.set_seed(42)
F = tf.random.uniform(shape=[50])
F

<tf.Tensor: shape=(50,), dtype=float32, numpy=
array([0.6645621 , 0.44100678, 0.3528825 , 0.46448255, 0.03366041,
       0.68467236, 0.74011743, 0.8724445 , 0.22632635, 0.22319686,
       0.3103881 , 0.7223358 , 0.13318717, 0.5480639 , 0.5746088 ,
       0.8996835 , 0.00946367, 0.5212307 , 0.6345445 , 0.1993283 ,
       0.72942245, 0.54583454, 0.10756552, 0.6767061 , 0.6602763 ,
       0.33695042, 0.60141766, 0.21062577, 0.8527372 , 0.44062173,
       0.9485276 , 0.23752594, 0.81179297, 0.5263394 , 0.494308  ,
       0.21612847, 0.8457197 , 0.8718841 , 0.3083862 , 0.6868038 ,
       0.23764038, 0.7817228 , 0.9671384 , 0.06870162, 0.79873943,
       0.66028714, 0.5871513 , 0.16461694, 0.7381023 , 0.32054043],
      dtype=float32)>

In [65]:
# Find the maximum element position of F
tf.argmax(F)

<tf.Tensor: shape=(), dtype=int64, numpy=42>

In [66]:
F[tf.argmax(F)]

<tf.Tensor: shape=(), dtype=float32, numpy=0.9671384>

In [67]:
# Find the minimum element position of F
tf.argmin(F)

<tf.Tensor: shape=(), dtype=int64, numpy=16>

In [68]:
F[tf.argmin(F)]

<tf.Tensor: shape=(), dtype=float32, numpy=0.009463668>

In [69]:
# Find the maximum element position of F
print(f"The maximum value of F is at position: {tf.argmax(F).numpy()}") 
print(f"The maximum value of F is: {tf.reduce_max(F).numpy()}") 
print(f"Using tf.argmax() to index F, the maximum value of F is: {F[tf.argmax(F)].numpy()}")
print(f"Are the two max values the same (they should be)? {F[tf.argmax(F)].numpy() == tf.reduce_max(F).numpy()}")

The maximum value of F is at position: 42
The maximum value of F is: 0.967138409614563
Using tf.argmax() to index F, the maximum value of F is: 0.967138409614563
Are the two max values the same (they should be)? True


### Squeezing a tensor (removing all single dimensions)

If you need to remove single-dimensions from a tensor (dimensions with size 1), you can use `tf.squeeze()`.

* [`tf.squeeze()`](https://www.tensorflow.org/api_docs/python/tf/squeeze) - remove all dimensions of size 1 from a tensor.


In [70]:
tf.random.set_seed(42)
G = tf.constant(tf.random.uniform(shape=[50]),shape=(1,1,1,1,50))
G

<tf.Tensor: shape=(1, 1, 1, 1, 50), dtype=float32, numpy=
array([[[[[0.6645621 , 0.44100678, 0.3528825 , 0.46448255, 0.03366041,
           0.68467236, 0.74011743, 0.8724445 , 0.22632635, 0.22319686,
           0.3103881 , 0.7223358 , 0.13318717, 0.5480639 , 0.5746088 ,
           0.8996835 , 0.00946367, 0.5212307 , 0.6345445 , 0.1993283 ,
           0.72942245, 0.54583454, 0.10756552, 0.6767061 , 0.6602763 ,
           0.33695042, 0.60141766, 0.21062577, 0.8527372 , 0.44062173,
           0.9485276 , 0.23752594, 0.81179297, 0.5263394 , 0.494308  ,
           0.21612847, 0.8457197 , 0.8718841 , 0.3083862 , 0.6868038 ,
           0.23764038, 0.7817228 , 0.9671384 , 0.06870162, 0.79873943,
           0.66028714, 0.5871513 , 0.16461694, 0.7381023 , 0.32054043]]]]],
      dtype=float32)>

In [71]:
G.shape

TensorShape([1, 1, 1, 1, 50])

In [72]:
Gsq = tf.squeeze(G)
Gsq, Gsq.shape

(<tf.Tensor: shape=(50,), dtype=float32, numpy=
 array([0.6645621 , 0.44100678, 0.3528825 , 0.46448255, 0.03366041,
        0.68467236, 0.74011743, 0.8724445 , 0.22632635, 0.22319686,
        0.3103881 , 0.7223358 , 0.13318717, 0.5480639 , 0.5746088 ,
        0.8996835 , 0.00946367, 0.5212307 , 0.6345445 , 0.1993283 ,
        0.72942245, 0.54583454, 0.10756552, 0.6767061 , 0.6602763 ,
        0.33695042, 0.60141766, 0.21062577, 0.8527372 , 0.44062173,
        0.9485276 , 0.23752594, 0.81179297, 0.5263394 , 0.494308  ,
        0.21612847, 0.8457197 , 0.8718841 , 0.3083862 , 0.6868038 ,
        0.23764038, 0.7817228 , 0.9671384 , 0.06870162, 0.79873943,
        0.66028714, 0.5871513 , 0.16461694, 0.7381023 , 0.32054043],
       dtype=float32)>,
 TensorShape([50]))
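By default `tf.squeeze()` removes every size-1 dimension. If you only want to drop some of them, it also accepts an `axis` argument, and `tf.expand_dims()` adds a size-1 dimension back. A small sketch, assuming a zero-filled tensor (`G_demo`, a hypothetical stand-in with the same shape as `G`):

```python
import tensorflow as tf

# Hypothetical stand-in with the same shape as G
G_demo = tf.zeros(shape=(1, 1, 1, 1, 50))

# Remove only the first two size-1 dimensions
partly_squeezed = tf.squeeze(G_demo, axis=[0, 1])

# Remove every size-1 dimension, then add one back at the front
restored = tf.expand_dims(tf.squeeze(G_demo), axis=0)

print(partly_squeezed.shape, restored.shape)
```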

### One-hot encoding

If you have a tensor of indices and would like to one-hot encode it, you can use [`tf.one_hot()`](https://www.tensorflow.org/api_docs/python/tf/one_hot).

You should also specify the `depth` parameter (the number of classes, which sets the length of each one-hot vector).

In [73]:
# Create a list of indices
some_list = [0, 1, 2, 3]

# One hot encode them
tf.one_hot(some_list, depth=4)

<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]], dtype=float32)>
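To see what `depth` controls, here is a short sketch that encodes the same indices with a depth that is too small and one that is larger than needed. The variable names are illustrative only:

```python
import tensorflow as tf

indices = [0, 1, 2, 3]

# depth=3 only has room for indices 0, 1 and 2,
# so the row for index 3 is filled with the off value (all zeros)
too_shallow = tf.one_hot(indices, depth=3)

# depth=6 leaves two columns that are never switched on
deeper = tf.one_hot(indices, depth=6)

print(too_shallow.numpy())
print(deeper.shape)
```

In practice `depth` is usually set to the number of classes in your dataset so that every index gets its own column.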

In [74]:
# Specify custom values for on and off encoding
tf.one_hot(some_list, depth=4, on_value="Yassss", off_value="Pufff")

<tf.Tensor: shape=(4, 4), dtype=string, numpy=
array([[b'Yassss', b'Pufff', b'Pufff', b'Pufff'],
       [b'Pufff', b'Yassss', b'Pufff', b'Pufff'],
       [b'Pufff', b'Pufff', b'Yassss', b'Pufff'],
       [b'Pufff', b'Pufff', b'Pufff', b'Yassss']], dtype=object)>

In the output, the `b` prefix before the strings (e.g., `b'Yassss'`) indicates **byte strings**: sequences of raw bytes rather than Unicode text. The prefix is Python's notation for byte strings and is not part of the string itself. It appears because TensorFlow stores `tf.string` values as bytes. To display them as regular strings, convert the NumPy array with `.astype(str)`:

In [75]:
output = tf.one_hot(some_list, depth=4, on_value="Yassss", off_value="Pufff")
print(output.numpy().astype(str))

[['Yassss' 'Pufff' 'Pufff' 'Pufff']
 ['Pufff' 'Yassss' 'Pufff' 'Pufff']
 ['Pufff' 'Pufff' 'Yassss' 'Pufff']
 ['Pufff' 'Pufff' 'Pufff' 'Yassss']]


### Squaring, log, square root

Most other common mathematical operations you're likely to need already exist in TensorFlow.

Let's take a look at:
* [`tf.square()`](https://www.tensorflow.org/api_docs/python/tf/math/square) - get the square of every value in a tensor.
* [`tf.sqrt()`](https://www.tensorflow.org/api_docs/python/tf/math/sqrt) - get the square root of every value in a tensor (**note:** the elements need to be floats or this will error).
* [`tf.math.log()`](https://www.tensorflow.org/api_docs/python/tf/math/log) - get the natural log of every value in a tensor (the elements need to be floats).

In [76]:
# Create a new tensor
H = tf.constant(np.arange(1, 10))
H

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([1, 2, 3, 4, 5, 6, 7, 8, 9])>

In [77]:
# Square it
tf.square(H)

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([ 1,  4,  9, 16, 25, 36, 49, 64, 81])>

In [78]:
# Find the square root (will error on an integer tensor; the input must be a float)
#tf.sqrt(H)

In [79]:
# Change H to float32
H = tf.cast(H, dtype=tf.float32)
H

<tf.Tensor: shape=(9,), dtype=float32, numpy=array([1., 2., 3., 4., 5., 6., 7., 8., 9.], dtype=float32)>

In [80]:
# Find the square root
tf.sqrt(H)

<tf.Tensor: shape=(9,), dtype=float32, numpy=
array([1.       , 1.4142135, 1.7320508, 2.       , 2.2360678, 2.4494896,
       2.6457512, 2.828427 , 3.       ], dtype=float32)>

In [81]:
# Find the log (input also needs to be float)
tf.math.log(H)

<tf.Tensor: shape=(9,), dtype=float32, numpy=
array([0.       , 0.6931472, 1.0986123, 1.3862944, 1.609438 , 1.7917595,
       1.9459102, 2.0794415, 2.1972246], dtype=float32)>

### Manipulating `tf.Variable` tensors

Tensors created with `tf.Variable()` can be changed in place using methods such as:

* [`.assign()`](https://www.tensorflow.org/api_docs/python/tf/Variable#assign) - replace the values of a variable tensor with new values of the same shape.
* [`.assign_add()`](https://www.tensorflow.org/api_docs/python/tf/Variable#assign_add) - add to the existing values of a variable tensor and store the result back in it.


In [82]:
# Create a variable tensor
I = tf.Variable(np.arange(0, 5))
I

<tf.Variable 'Variable:0' shape=(5,) dtype=int32, numpy=array([0, 1, 2, 3, 4])>

In [83]:
# Overwrite the variable with new values (only the final value changes, from 4 to 50)
I.assign([0, 1, 2, 3, 50])

<tf.Variable 'UnreadVariable' shape=(5,) dtype=int32, numpy=array([ 0,  1,  2,  3, 50])>

In [84]:
# The change happens in place (the last value is now 50, not 4)
I

<tf.Variable 'Variable:0' shape=(5,) dtype=int32, numpy=array([ 0,  1,  2,  3, 50])>

In [85]:
# Add 10 to every element in I
I.assign_add([10, 10, 10, 10, 10])

<tf.Variable 'UnreadVariable' shape=(5,) dtype=int32, numpy=array([10, 11, 12, 13, 60])>

In [86]:
# Again, the change happens in place
I

<tf.Variable 'Variable:0' shape=(5,) dtype=int32, numpy=array([10, 11, 12, 13, 60])>
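The cells above pass a full set of values to `.assign()`. If you only want to change one element, a variable can also be indexed before calling `.assign()`. A minimal sketch, assuming a fresh variable `v` (a hypothetical name) rather than `I`:

```python
import tensorflow as tf

# A fresh variable tensor (int32)
v = tf.Variable([0, 1, 2, 3, 4])

# Update only the element at index 4, leaving the rest untouched
v[4].assign(50)

# Add 10 to every element using a tensor of matching shape
v.assign_add(tf.fill([5], 10))

print(v.numpy())
```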

## Tensors and NumPy

We've seen some examples of tensors interacting with NumPy arrays, such as using NumPy arrays to create tensors.

Tensors can also be converted to NumPy arrays using:

* `np.array()` - pass a tensor to convert to an ndarray (NumPy's main datatype).
* `tensor.numpy()` - call on a tensor to convert to an ndarray.

Doing this is helpful because it lets us pass tensors to code that expects NumPy arrays and use any of NumPy's methods on them.

In [87]:
# Create a tensor from a NumPy array
J = tf.constant(np.array([3., 7., 10.]))
J

<tf.Tensor: shape=(3,), dtype=float64, numpy=array([ 3.,  7., 10.])>

In [88]:
# Convert tensor J to NumPy with np.array()
np.array(J), type(np.array(J))

(array([ 3.,  7., 10.]), numpy.ndarray)

In [89]:
# Convert tensor J to NumPy with .numpy()
J.numpy(), type(J.numpy())

(array([ 3.,  7., 10.]), numpy.ndarray)

By default, tensors created from Python floats have `dtype=float32`, whereas NumPy arrays of floats have `dtype=float64`.

This is because neural networks (which TensorFlow is usually used to build) generally work very well with less precision (32-bit rather than 64-bit).

In [90]:
# Create one tensor from a NumPy array and one from a Python list
numpy_J = tf.constant(np.array([3., 7., 10.])) # will be float64 (due to NumPy)
tensor_J = tf.constant([3., 7., 10.]) # will be float32 (due to being TensorFlow default)
numpy_J.dtype, tensor_J.dtype

(tf.float64, tf.float32)
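If you want a tensor built from a NumPy array to use TensorFlow's usual `float32`, you can convert it either before or after creating the tensor. A short sketch of both options (the variable names are illustrative only):

```python
import numpy as np
import tensorflow as tf

array_64 = np.array([3., 7., 10.])  # NumPy defaults to float64

# Option 1: convert the NumPy array before creating the tensor
from_float32_array = tf.constant(array_64.astype(np.float32))

# Option 2: create the tensor first, then cast it
as_is = tf.constant(array_64)
after_cast = tf.cast(as_is, dtype=tf.float32)

print(as_is.dtype, from_float32_array.dtype, after_cast.dtype)
```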

## Using `@tf.function`

In your TensorFlow adventures, you might come across Python functions which have the decorator [`@tf.function`](https://www.tensorflow.org/api_docs/python/tf/function).

If you aren't sure what Python decorators do, [read RealPython's guide on them](https://realpython.com/primer-on-python-decorators/).

But in short, decorators modify a function in one way or another.

In the `@tf.function` decorator case, it turns a Python function into a callable TensorFlow graph. In other words, if you decorate your own Python function with `@tf.function`, TensorFlow traces it the first time it is called and attempts to convert it into a fast(er) version of itself (by making it part of a computation graph), which can also be exported to run on another device.

For more on this, read the [Better performance with tf.function](https://www.tensorflow.org/guide/function) guide.
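As a rough illustration, the sketch below defines the same small function twice, once as plain Python and once decorated with `@tf.function`. The function names and inputs are made up for this example. Both versions should return the same values; the decorated one just runs as a graph.

```python
import tensorflow as tf

# A plain Python function that runs eagerly
def plain_function(x, y):
    return x ** 2 + y

# The same function, traced into a TensorFlow graph on its first call
@tf.function
def graph_function(x, y):
    return x ** 2 + y

x = tf.constant([0, 1, 2, 3, 4])
y = tf.constant([10, 10, 10, 10, 10])

eager_result = plain_function(x, y)
graph_result = graph_function(x, y)

print(eager_result.numpy(), graph_result.numpy())
```

For a function this small you are unlikely to notice a speed difference; the benefit tends to show up with larger computations that are called many times.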

## Using GPUs with TensorFlow

In [91]:
tf.config.list_physical_devices()

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

In [92]:
tf.__version__

'2.10.1'

In [93]:
tf.config.list_physical_devices('GPU')

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

In [94]:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

Num GPUs Available:  1


If you've got access to a GPU, `tf.config.list_physical_devices('GPU')` should output something like:

`[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]`

If you're using an NVIDIA GPU in a notebook, you can also find information about it using `!nvidia-smi`.
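To check where a particular tensor lives, you can inspect its `.device` attribute. The sketch below is one way to do this, with illustrative variable names. It also pins a tensor to the CPU with `tf.device()`, which should work whether or not a GPU is present:

```python
import tensorflow as tf

# List the GPUs TensorFlow can see (an empty list means CPU only)
gpus = tf.config.list_physical_devices("GPU")
print(f"GPUs found: {len(gpus)}")

# TensorFlow chooses the device for this tensor automatically
x = tf.constant([1.0, 2.0, 3.0])
print(f"x lives on: {x.device}")

# Explicitly place a tensor on the CPU
with tf.device("/CPU:0"):
    on_cpu = tf.constant([1.0, 2.0, 3.0])
print(f"on_cpu lives on: {on_cpu.device}")
```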
