**Welcome to TensorFlow: A Beginner's Guide**

**What is TensorFlow?**

TensorFlow is an open-source library for building and using machine learning models. You can think of it as a toolbox: it provides many ready-made functions for building and training models, so we don't have to write each piece ourselves.

**Why Use TensorFlow?**

Instead of building everything from scratch, we can rely on the common machine learning functions that TensorFlow already provides.

**What We're Going to Cover**

TensorFlow is a big topic, but don't worry, we're going to take it one step at a time. We'll cover the basics of TensorFlow, including:

* What are tensors and how do we create them?
* How do we get information from tensors?
* How do we manipulate tensors?
* How do tensors work with NumPy?
* How do we use a special function called `@tf.function` to make our code faster?
* How do we use GPUs with TensorFlow?
* Some exercises to try on your own

**Important Notes**

* Many of the things we'll cover will happen automatically when we build a model, but it's good to know what's going on behind the scenes.
* If you see a TensorFlow function that you don't understand, you can always look it up in the documentation, which explains how everything works: https://www.tensorflow.org/api_docs/python/

In [103]:
import tensorflow as tf
print(tf.__version__)

2.16.1


# Tensors

A **tensor** is a mathematical object that represents a multi-dimensional array of numerical values.

Think of a tensor like a container that holds a set of values, similar to a matrix or an array. However, tensors can have more than two dimensions, unlike matrices, which are limited to two dimensions.

Here's a simple analogy to help you understand tensors:

* A scalar is a single value, like a number (e.g., 5).
* A vector is a 1-dimensional array of values, like a list of numbers (e.g., [1, 2, 3, 4, 5]).
* A matrix is a 2-dimensional array of values, like a table of numbers (e.g., [[1, 2], [3, 4]]).
* A tensor is a multi-dimensional array of values, like a cube or a higher-dimensional structure of numbers (e.g., [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]).

Tensors are used to represent complex data structures in machine learning and deep learning, such as:

* Images: A color image can be represented as a 3-dimensional tensor, with dimensions for height, width, and color channels (e.g., RGB).
* Audio: A sound wave can be represented as a 2-dimensional tensor, with dimensions for time and frequency.
* Text: A sentence or a document can be represented as a tensor of numbers, for example with dimensions for sentences, word positions, and the numerical features that encode each word.

In TensorFlow, tensors are the fundamental data structure used to represent inputs, outputs, and intermediate results of machine learning models. TensorFlow provides various operations and functions to manipulate and transform tensors, which enables the creation of complex machine learning models.

In [104]:
scalar = tf.constant(7)
scalar

<tf.Tensor: shape=(), dtype=int32, numpy=7>

In [105]:
# Check the number of dimensions
scalar.ndim

0

In [106]:
# Create a vector
vector = tf.constant([10,10])
vector

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([10, 10])>

In [107]:
# Check the number of dimensions
vector.ndim

1

In [108]:
# Create a matrix
matrix = tf.constant([[10,7],[7,10]])
matrix

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 7, 10]])>

In [109]:
matrix.ndim

2

In [110]:
# How about a tensor? (more than 2 dimensions, although, all of the above items are also technically tensors)
tensor = tf.constant([[[1, 2, 3],
                       [4, 5, 6]],
                      [[7, 8, 9],
                       [10, 11, 12]],
                      [[13, 14, 15],
                       [16, 17, 18]]])
tensor

<tf.Tensor: shape=(3, 2, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]],

       [[13, 14, 15],
        [16, 17, 18]]])>

In [111]:
tensor.ndim

3

Let's break down an example of a tensor with shape (224, 224, 3, 32):

* 224 and 224 are the height and width of an image in pixels.
* 3 is the number of color channels (red, green, and blue).
* 32 is the batch size, or the number of images processed at once.
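
As a rough sketch of the shape described above, the following creates a placeholder batch filled with zeros (real pixel data is assumed to come from elsewhere) and reads back each dimension. The variable names are illustrative only.

```python
import tensorflow as tf

# Placeholder "batch of images" filled with zeros, laid out as
# (height, width, colour channels, batch size) to match the breakdown above
image_batch = tf.zeros(shape=(224, 224, 3, 32))

# Unpack the shape into one value per dimension
height, width, channels, batch_size = image_batch.shape
print(height, width, channels, batch_size)

# Four dimensions in total
print(image_batch.ndim)
```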

**Scalars, Vectors, Matrices, and Tensors**

These terms are often used to describe tensors with specific numbers of dimensions:

* **Scalar**: a single number.
* **Vector**: a number with direction (e.g. wind speed and direction), stored as a 1-dimensional array of numbers.
* **Matrix**: a 2-dimensional array of numbers.
* **Tensor**: an n-dimensional array of numbers (where n can be any number).

Note that the terms "matrix" and "tensor" are often used interchangeably.

From now on, we'll refer to everything as tensors, since we're using TensorFlow.

If you want to learn more about the mathematical differences between scalars, vectors, and matrices, check out the [visual algebra post by Math is Fun](https://www.mathsisfun.com/algebra/scalar-vector-matrix.html).

![difference between scalar, vector, matrix, tensor](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-scalar-vector-matrix-tensor.png)

**Creating Tensors with `tf.Variable()`**

You can create tensors using `tf.Variable()`. This is not usually necessary, as tensors are often created automatically when working with data.

**The Difference Between `tf.Variable()` and `tf.constant()`**

There are two ways to create tensors: `tf.Variable()` and `tf.constant()`. The main difference is:

* **Immutable**: Tensors created with `tf.constant()` cannot be changed. You can only use them to create a new tensor.
* **Mutable**: Tensors created with `tf.Variable()` can be changed.

In [112]:
# Mutable = m , NON-Mutable = nm
m = tf.Variable([10,7])
nm = tf.constant([10,7])
m , nm

(<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([10,  7])>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([10,  7])>)

Changing an element of a `tf.Variable()` tensor requires the `assign()` method.

In [113]:
# requires the .assign() method
m[0].assign(1)
m

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([1, 7])>
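
To make the contrast concrete, here is a minimal sketch that updates a variable in place and then attempts the same on a constant. It assumes eager execution (the TensorFlow 2 default); the `constant_changed` flag is a name introduced only for this illustration.

```python
import tensorflow as tf

m = tf.Variable([10, 7])
nm = tf.constant([10, 7])

# Variables can be updated in place
m.assign([1, 2])       # replace all values
m.assign_add([1, 1])   # add to the existing values

# Constants offer no in-place update: item assignment raises a TypeError
try:
    nm[0] = 1
    constant_changed = True
except TypeError:
    constant_changed = False

print(m.numpy(), nm.numpy(), constant_changed)
```

To "change" a constant you create a new tensor from it instead, for example `nm + 1`.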

## Creating Random Tensors

### What are Random Tensors?

Random tensors are tensors of arbitrary size that contain random numbers. These tensors are used in various applications, particularly in neural networks to initialize weights and patterns that need to be learned from data.

### Why Do We Need Random Tensors?

Neural networks use random tensors to initialize their weights, which are then refined through training to represent patterns in the data. This process involves taking a random n-dimensional array of numbers and updating them to represent a compressed version of the original data.

### How Do Neural Networks Learn?

The learning process in neural networks involves the following steps:

1. **Initialization**: The network starts with random patterns.
2. **Training**: The network is trained on demonstrative examples of data.
3. **Refining**: The network updates its random patterns to represent the examples.
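
The three steps above can be sketched on a toy problem. This is only an illustration under stated assumptions: the data (`y = 2 * x`), the learning rate, the number of steps, and the use of `tf.GradientTape` for the update are choices made for this example and are not part of the notebook.

```python
import tensorflow as tf

# Hypothetical toy data: the pattern to learn is y = 2 * x
x = tf.constant([[1.0], [2.0], [3.0], [4.0]])
y = 2.0 * x

# 1. Initialization: start from a random weight
generator = tf.random.Generator.from_seed(42)
w = tf.Variable(generator.normal(shape=(1, 1)))

def loss_fn():
    # Mean squared error between predictions (x @ w) and targets
    return tf.reduce_mean(tf.square(tf.matmul(x, w) - y))

initial_loss = loss_fn().numpy()

# 2. and 3. Training and refining: repeatedly nudge the weight
# in the direction that reduces the error on the examples
for _ in range(50):
    with tf.GradientTape() as tape:
        loss = loss_fn()
    gradient = tape.gradient(loss, w)
    w.assign_sub(0.01 * gradient)

final_loss = loss_fn().numpy()
print(initial_loss, final_loss, w.numpy())
```

The weight starts as a random number and ends close to 2, the value that describes the examples.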

### Creating Random Tensors with TensorFlow

TensorFlow provides the `tf.random.Generator` class to create random tensors. This class maintains an internal state that is updated every time random numbers are generated. You can create a `tf.random.Generator` object by manually creating an instance of the class or by using `tf.random.get_global_generator()` to get the default global generator.

Here's an example of creating a random tensor using `tf.random.Generator`:

```
g = tf.random.Generator.from_seed(1)
print(g.normal(shape=[2, 3]))
```

This will generate a random tensor of shape (2, 3) with random numbers.

### Key Points to Remember

- `tf.random.Generator` maintains an internal state that is updated every time random numbers are generated.
- You can create a `tf.random.Generator` object manually or use `tf.random.get_global_generator()` to get the default global generator.
- The `tf.random.Generator` class is used to create independent random-number streams.
- Generators follow specific rules when used inside `tf.function` or with distribution strategies; check the `tf.random.Generator` documentation before combining them.
- You can save and restore generators using `tf.train.Checkpoint`.
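
A short sketch of the internal-state behaviour described above: two draws from the same generator differ because the state advances, while a fresh generator with the same seed replays the first draw. Variable names are illustrative.

```python
import tensorflow as tf

g1 = tf.random.Generator.from_seed(1)
first = g1.normal(shape=[2, 3])
second = g1.normal(shape=[2, 3])   # state has advanced since the first draw

g2 = tf.random.Generator.from_seed(1)
replay = g2.normal(shape=[2, 3])   # new generator, same seed, same starting state

print(tf.reduce_all(first == replay).numpy())
print(tf.reduce_all(first == second).numpy())

# The default global generator can be used the same way
global_generator = tf.random.get_global_generator()
sample = global_generator.normal(shape=[2, 3])
print(sample.shape)
```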


In [114]:
# Create two random (but the same) tensors
random_1 = tf.random.Generator.from_seed(42) # set the seed for reproducibility
random_1 = random_1.normal(shape=(3, 2)) # create tensor from a normal distribution 
random_2 = tf.random.Generator.from_seed(42)
random_2 = random_2.normal(shape=(3, 2))

# Are they equal?
random_1, random_2, random_1 == random_2


(<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193763, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193763, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[ True,  True],
        [ True,  True],
        [ True,  True]])>)

In [115]:
# Create two random (and different) tensors
random_3 = tf.random.Generator.from_seed(42)
random_3 = random_3.normal(shape=(3, 2))
random_4 = tf.random.Generator.from_seed(11)
random_4 = random_4.normal(shape=(3, 2))

# Check the tensors and see if they are equal
random_3, random_4, random_1 == random_3, random_3 == random_4

(<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193763, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[ 0.27305737, -0.29925638],
        [-0.3652325 ,  0.61883307],
        [-1.0130816 ,  0.28291714]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[ True,  True],
        [ True,  True],
        [ True,  True]])>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[False, False],
        [False, False],
        [False, False]])>)

What if you wanted to shuffle the order of a tensor?

Wait, why would you want to do that?

Let's say you are working with 15,000 images of cats and dogs, where the first 10,000 images are of cats and the next 5,000 are of dogs. This order could affect how a neural network learns (it may overfit by learning the order of the data), so it might be a good idea to shuffle your data.

In [116]:
# Shuffle a tensor (valuable for when you want to shuffle your data)
not_shuffled = tf.constant([[10, 7],
                            [3, 4],
                            [2, 5]])
# Gets different results each time
tf.random.shuffle(not_shuffled)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 2,  5],
       [10,  7],
       [ 3,  4]])>

Wait... why does the order change every time the cell is run?

Because no seed was set. Random operations in TensorFlow derive their seed from two values, as described in the [`tf.random.set_seed()`](https://www.tensorflow.org/api_docs/python/tf/random/set_seed) documentation:

> "Operations that rely on a random seed actually derive it from two seeds: the global and operation-level seeds. This sets the global seed."

`tf.random.set_seed(42)` sets the global seed, and the `seed` parameter in `tf.random.shuffle(seed=42)` sets the operation seed.

To get the same order every time, set both, which is rule #4 of the documentation:

> "4. If both the global and the operation seed are set: Both seeds are used in conjunction to determine the random sequence."


In [117]:
not_shuffled

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4],
       [ 2,  5]])>

In [118]:
# Shuffle using only the operation-level seed (without a global seed, the order won't actually be the same every time)
tf.random.shuffle(not_shuffled, seed=42)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4],
       [ 2,  5]])>
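
As a sketch of setting both seeds, the following resets the global seed before each shuffle and passes the same operation seed. It assumes eager execution, where calling `tf.random.set_seed()` again restarts the random sequence.

```python
import tensorflow as tf

not_shuffled = tf.constant([[10, 7],
                            [3, 4],
                            [2, 5]])

# Set the global seed and the operation seed together
tf.random.set_seed(42)
first = tf.random.shuffle(not_shuffled, seed=42)

# Resetting the global seed restarts the sequence, so the shuffle repeats
tf.random.set_seed(42)
second = tf.random.shuffle(not_shuffled, seed=42)

print(first.numpy())
print(tf.reduce_all(first == second).numpy())
```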

--------------------------------- Exercise -------------------------------

In [119]:
rd1 = tf.constant([[1,1,1],[2,2,2],[3,3,3]])
rd2 = tf.constant([[1,0,0],[0,1,0],[0,0,1]])
rd3 = tf.constant([[2,3,5],[7,11,13],[17,19,23]])
rd1 , rd2, rd3 

(<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
 array([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])>,
 <tf.Tensor: shape=(3, 3), dtype=int32, numpy=
 array([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1]])>,
 <tf.Tensor: shape=(3, 3), dtype=int32, numpy=
 array([[ 2,  3,  5],
        [ 7, 11, 13],
        [17, 19, 23]])>)

In [120]:
tf.random.shuffle(rd1, seed=23)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[3, 3, 3],
       [1, 1, 1],
       [2, 2, 2]])>

-------------------------------------------------------------------------------

In [121]:
tf.random.set_seed(42)
tf.random.shuffle(rd2)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[0, 1, 0],
       [0, 0, 1],
       [1, 0, 0]])>

### Other ways to make tensors

Though you might rarely use these (remember, many tensor operations are done behind the scenes for you), you can use [`tf.ones()`](https://www.tensorflow.org/api_docs/python/tf/ones) to create a tensor of all ones and [`tf.zeros()`](https://www.tensorflow.org/api_docs/python/tf/zeros) to create a tensor of all zeros.

In [122]:
# Make a tensor of all ones
tf.ones(shape=(3, 3))

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]], dtype=float32)>

In [123]:
tf.zeros(shape=(3,3))

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]], dtype=float32)>

You can also turn NumPy arrays into tensors.

Remember, the main difference between tensors and NumPy arrays is that tensors can be run on GPUs.

> 🔑 **Note:** A matrix or tensor is typically represented by a capital letter (e.g. `X` or `A`) whereas a vector is typically represented by a lowercase letter (e.g. `y` or `b`).

In [124]:
# Numpy arrays into tensors
import numpy as np
# Numpy array = NA
NA = np.arange(1,25, dtype= np.int32)
NA

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24])

In [125]:
# numpy to tensor
NtT = tf.constant(NA,shape=(4,3,2))
NtT

<tf.Tensor: shape=(4, 3, 2), dtype=int32, numpy=
array([[[ 1,  2],
        [ 3,  4],
        [ 5,  6]],

       [[ 7,  8],
        [ 9, 10],
        [11, 12]],

       [[13, 14],
        [15, 16],
        [17, 18]],

       [[19, 20],
        [21, 22],
        [23, 24]]])>

In [126]:
NtT.ndim

3
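
The conversion also works in the other direction. A minimal sketch, reusing the names from the cells above, that turns the tensor back into a NumPy array with `.numpy()` or `np.array()` and confirms the values survive the round trip:

```python
import numpy as np
import tensorflow as tf

NA = np.arange(1, 25, dtype=np.int32)
NtT = tf.constant(NA, shape=(4, 3, 2))   # 4 * 3 * 2 = 24 elements

# Tensor back to NumPy, two equivalent ways
back = NtT.numpy()
also_back = np.array(NtT)

print(type(back), back.shape)

# Flattening recovers the original 1-dimensional array
print(np.array_equal(back.reshape(-1), NA))
```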

**Getting Information from Tensors: Shape, Rank, Size, and More**

When working with tensors, it's essential to understand how to extract valuable information from them. Here are some key concepts to grasp:

* **Shape**: The number of elements in each dimension of a tensor. Think of it as the length of each dimension.
* **Rank**: The number of dimensions in a tensor. For example, a scalar has a rank of 0, a vector has a rank of 1, a matrix has a rank of 2, and so on.
* **Axis** or **Dimension**: A specific dimension of a tensor. You can think of it as a particular direction or feature of the tensor.
* **Size**: The total number of elements in a tensor. It's the product of the lengths of all its dimensions.

Understanding these concepts is crucial when working with tensors, especially when aligning the shapes of your data with the shapes of your model. For instance, you might need to ensure that the shape of your image tensors matches the shape of your model's input layer.


In [127]:
# Create a rank-4 tensor (4 dimensions)
r4 = tf.zeros(shape=[2,5,3,3])
r4

<tf.Tensor: shape=(2, 5, 3, 3), dtype=float32, numpy=
array([[[[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]]],


       [[[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]]]], dtype=float32)>

In [128]:
r4[0]

<tf.Tensor: shape=(5, 3, 3), dtype=float32, numpy=
array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]], dtype=float32)>

In [129]:
r4.shape , r4.ndim , tf.size(r4)

(TensorShape([2, 5, 3, 3]), 4, <tf.Tensor: shape=(), dtype=int32, numpy=90>)

In [130]:
# A rank-4 tensor holding a single random value
rank4_random = tf.random.Generator.from_seed(42)
rank4_random = rank4_random.normal(shape = (1,1,1,1))
rank4_random

<tf.Tensor: shape=(1, 1, 1, 1), dtype=float32, numpy=array([[[[-0.7565803]]]], dtype=float32)>

In [131]:
rank4_random.ndim, rank4_random.shape, tf.size(rank4_random)

(4, TensorShape([1, 1, 1, 1]), <tf.Tensor: shape=(), dtype=int32, numpy=1>)

In [132]:
# Get various attributes of our tensor
print("Datatype of every element:",r4.dtype)
print("Number of dimensions:",r4.ndim)
print("Shape of tensor:",r4.shape)
print("Elements along axis 0 of tensor:",r4.shape[0])
print("Elements along the last axis of tensor:",r4.shape[-1])
print("Total number of elements:",tf.size(r4))
print("Total number of elements:",tf.size(r4).numpy())


Datatype of every element: <dtype: 'float32'>
Number of dimensions: 4
Shape of tensor: (2, 5, 3, 3)
Elements along axis 0 of tensor: 2
Elements along the last axis of tensor: 3
Total number of elements: tf.Tensor(90, shape=(), dtype=int32)
Total number of elements: 90
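
The definitions of rank and size can be checked directly. This sketch uses `tf.rank()` and `TensorShape.num_elements()` alongside `tf.size()` to show that the size is the product of the lengths of all dimensions.

```python
import tensorflow as tf

r4 = tf.zeros(shape=[2, 5, 3, 3])

rank = tf.rank(r4).numpy()          # number of dimensions, same value as r4.ndim
size = tf.size(r4).numpy()          # total number of elements
product = r4.shape.num_elements()   # 2 * 5 * 3 * 3

print(rank, size, product)
```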


## Indexing tensors
You can also index tensors just like Python lists.

In [133]:
random_list = [1,2,3,4]
random_list[:2]

[1, 2]

In [134]:
r4[:2,:2,:2,:2 ]

<tf.Tensor: shape=(2, 2, 2, 2), dtype=float32, numpy=
array([[[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]],


       [[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]]], dtype=float32)>

In [135]:
r4[:1,:1,:1,:]

<tf.Tensor: shape=(1, 1, 1, 3), dtype=float32, numpy=array([[[[0., 0., 0.]]]], dtype=float32)>

In [136]:
# r2 = tf.random.Generator.from_seed(42)
# r2= r2.normal(shape= (3,3))
r2 = tf.constant([[1,2],[3,4]])
r2

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4]])>

In [137]:
r2[:,-1]

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([2, 4])>

In [138]:
# Add an extra dimension to our rank-2 tensor
# "..." is shorthand for "all existing dimensions" (here equivalent to [:, :, tf.newaxis])
# tf.newaxis creates a new axis/dimension at the position where it appears
r3t = r2[...,tf.newaxis]
r3t

<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[1],
        [2]],

       [[3],
        [4]]])>

In [139]:
# alternative to tf.newaxis
tf.expand_dims(r2, axis=-1) # "-1" means add the new axis at the end


<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[1],
        [2]],

       [[3],
        [4]]])>

In [140]:
tf.expand_dims(r2, axis=1)

<tf.Tensor: shape=(2, 1, 2), dtype=int32, numpy=
array([[[1, 2]],

       [[3, 4]]])>
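
The two approaches are interchangeable. A small sketch that compares them for both positions shown above (the variable names are illustrative):

```python
import tensorflow as tf

r2 = tf.constant([[1, 2], [3, 4]])

# New axis at the end: (2, 2) -> (2, 2, 1)
end_a = r2[..., tf.newaxis]
end_b = tf.expand_dims(r2, axis=-1)

# New axis in the middle: (2, 2) -> (2, 1, 2)
mid_a = r2[:, tf.newaxis, :]
mid_b = tf.expand_dims(r2, axis=1)

print(end_a.shape, mid_a.shape)
print(tf.reduce_all(end_a == end_b).numpy(), tf.reduce_all(mid_a == mid_b).numpy())
```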

In [141]:
r2

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4]])>

### Manipulating tensors

Here are the main points:

**Basic Operations on Tensors**

* You can perform basic math operations on tensors using Python operators like `+`, `-`, and `*`.
* These operations create a new tensor with the result, leaving the original tensor unchanged.
* You can also use the equivalent TensorFlow functions, such as `tf.multiply()`. On tensors, the Python operators call the same TensorFlow operations, so both forms give the same result.

**Adding Values**
You can add a number to a tensor using the `+` operator.
```
tensor = tf.constant([[10, 7], [3, 4]])
tensor + 10
```
This will add 10 to each number in the tensor, resulting in:
```
[[20, 17],
 [13, 14]]
```
**Important:** The original tensor remains unchanged. The operation creates a new tensor with the result.

**Original Tensor Unchanged**
```
tensor
```
This will still show the original tensor:
```
[[10, 7],
 [3, 4]]
```
**Multiplication (Element-wise)**
You can multiply a tensor by a number using the `*` operator.
```
tensor * 10
```
This will multiply each number in the tensor by 10, resulting in:
```
[[100, 70],
 [30, 40]]
```
**Subtraction**
You can subtract a number from a tensor using the `-` operator.
```
tensor - 10
```
This will subtract 10 from each number in the tensor, resulting in:
```
[[0, -3],
 [-7, -6]]
```
**Using TensorFlow Functions**
TensorFlow has its own functions for performing operations, like `tf.multiply()`.
```
tf.multiply(tensor, 10)
```
This will give the same result as the `*` operator:
```
[[100, 70],
 [30, 40]]
```
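**Why Use TensorFlow Functions?**
On tensors, the Python operators call the same TensorFlow operations, so the choice is mostly about readability. The function form makes it explicit that a TensorFlow operation is being run and accepts extra arguments such as `name`.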

**Original Tensor Still Unchanged**
No matter what operations you perform, the original tensor remains unchanged.
```
tensor
```
This will still show the original tensor:
```
[[10, 7],
 [3, 4]]
```

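Matrix multiplication (with `tf.matmul()` or the `@` operator) requires the inner dimensions of the two tensors to match. Trying to matrix multiply two tensors with the shape `(3, 2)`, such as the `X` and `Y` created below, errors because the inner dimensions (`2` and `3`) don't match.

We need to either:
* Reshape X to `(2, 3)` so it's `(2, 3) @ (3, 2)`.
* Reshape Y to `(2, 3)` so it's `(3, 2) @ (2, 3)`.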

We can do this with either:
* [`tf.reshape()`](https://www.tensorflow.org/api_docs/python/tf/reshape) - allows us to reshape a tensor into a defined shape.
* [`tf.transpose()`](https://www.tensorflow.org/api_docs/python/tf/transpose) - switches the dimensions of a given tensor.
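
Before fixing the shapes, it can help to see the failure itself. This sketch assumes eager execution, where a shape mismatch in a matrix multiplication surfaces as an exception; the `multiplied` flag is a name used only for this illustration.

```python
import tensorflow as tf

X = tf.constant([[1, 2], [3, 4], [5, 6]])       # shape (3, 2)
Y = tf.constant([[7, 8], [9, 10], [11, 12]])    # shape (3, 2)

# (3, 2) @ (3, 2): the inner dimensions are 2 and 3, so this fails
try:
    X @ Y
    multiplied = True
except (tf.errors.InvalidArgumentError, ValueError) as error:
    multiplied = False
    print("Matrix multiplication failed:", type(error).__name__)
```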

![lining up dimensions for dot products](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-lining-up-dot-products.png)

Let's try `tf.reshape()` first.

In [142]:
# Create (3, 2) tensor
X = tf.constant([[1, 2],
                 [3, 4],
                 [5, 6]])

# Create another (3, 2) tensor
Y = tf.constant([[7, 8],
                 [9, 10],
                 [11, 12]])
X, Y

(<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4],
        [5, 6]])>,
 <tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[ 7,  8],
        [ 9, 10],
        [11, 12]])>)

In [143]:
# Changing the shape 
tf.reshape(Y, shape= (2,3))

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[ 7,  8,  9],
       [10, 11, 12]])>

In [144]:
X @ tf.reshape(Y, shape =(2,3))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]])>

In [145]:
# Transposing Y through a parameter also gives a (3, 3) result, but with different values than the reshape above
tf.matmul(a=X, b=Y, transpose_a=False, transpose_b=True)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 23,  29,  35],
       [ 53,  67,  81],
       [ 83, 105, 127]])>

It worked. Let's try the same with a transposed `X`, this time using [`tf.transpose()`](https://www.tensorflow.org/api_docs/python/tf/transpose) and `tf.matmul()`.

In [146]:
# Example of transpose (3, 2) -> (2, 3)
tf.transpose(X)

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 3, 5],
       [2, 4, 6]])>

In [147]:
# Try matrix multiplication 
tf.matmul(tf.transpose(X), Y)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]])>

In [148]:
# You can achieve the same result with parameters
tf.matmul(a=X, b=Y, transpose_a=True, transpose_b=False)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]])>

Notice the difference in the resulting shapes when transposing `X` or reshaping `Y`.

This is because the result of a matrix multiplication takes the shape of the outer dimensions:
 * `(3, 2) @ (2, 3)` -> `(3, 3)` done with `X @ tf.reshape(Y, shape=(2, 3))` 
 * `(2, 3) @ (3, 2)` -> `(2, 2)` done with `tf.matmul(tf.transpose(X), Y)`
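
The shape rules can be written down as a small helper. `matmul_shape` is a hypothetical function introduced only for this illustration (it is not part of TensorFlow) and handles the 2-dimensional case only.

```python
def matmul_shape(a_shape, b_shape):
    """Return the result shape of a 2-D matrix multiplication, or raise if invalid."""
    rows_a, cols_a = a_shape
    rows_b, cols_b = b_shape
    # The inner dimensions must match
    if cols_a != rows_b:
        raise ValueError(f"inner dimensions differ: {cols_a} vs {rows_b}")
    # The result takes the shape of the outer dimensions
    return (rows_a, cols_b)

print(matmul_shape((3, 2), (2, 3)))  # (3, 3)
print(matmul_shape((2, 3), (3, 2)))  # (2, 2)
```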

This kind of data manipulation is a reminder: you'll spend a lot of your time in machine learning and working with neural networks reshaping data (in the form of tensors) to prepare it to be used with various operations (such as feeding it to a model).

### The dot product

Multiplying matrices by each other is also referred to as the dot product.

You can perform the `tf.matmul()` operation using [`tf.tensordot()`](https://www.tensorflow.org/api_docs/python/tf/tensordot). 

In [149]:
# Perform the dot product on X and Y (requires X to be transposed)
tf.tensordot(tf.transpose(X), Y, axes=1)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]])>

You might notice that although both `reshape` and `transpose` work, you get different results when using each.

Let's see an example, first with `tf.transpose()` then with `tf.reshape()`.

In [150]:
# Perform matrix multiplication between X and Y (transposed)
tf.matmul(X, tf.transpose(Y))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 23,  29,  35],
       [ 53,  67,  81],
       [ 83, 105, 127]])>

In [151]:
# Perform matrix multiplication between X and Y (reshaped)
tf.matmul(X, tf.reshape(Y, (2, 3)))

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]])>

Hmm... they result in different values.

Which is strange because when dealing with `Y` (a `(3, 2)` matrix), reshaping it to `(2, 3)` and transposing it result in the same shape.

In [152]:
# Check shapes of Y, reshaped Y and transposed Y
Y.shape, tf.reshape(Y, (2, 3)).shape, tf.transpose(Y).shape

(TensorShape([3, 2]), TensorShape([2, 3]), TensorShape([2, 3]))

But calling `tf.reshape()` and `tf.transpose()` on `Y` doesn't necessarily result in the same values.

In [153]:
# Check values of Y, reshaped Y and transposed Y
print("Normal Y:")
print(Y, "\n") # "\n" for newline

print("Y reshaped to (2, 3):")
print(tf.reshape(Y, (2, 3)), "\n")

print("Y transposed:")
print(tf.transpose(Y))

Normal Y:
tf.Tensor(
[[ 7  8]
 [ 9 10]
 [11 12]], shape=(3, 2), dtype=int32) 

Y reshaped to (2, 3):
tf.Tensor(
[[ 7  8  9]
 [10 11 12]], shape=(2, 3), dtype=int32) 

Y transposed:
tf.Tensor(
[[ 7  9 11]
 [ 8 10 12]], shape=(2, 3), dtype=int32)


So which should you use?

Again, most of the time these operations (when they need to be run, such as during the training of a neural network) will be implemented for you.

But generally, whenever performing a matrix multiplication and the shapes of two matrices don't line up, you will transpose (not reshape) one of them in order to line them up.
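
One way to see why transposing is the safer choice is sketched below: a transpose swaps the axes, so the element at row `i`, column `j` moves to row `j`, column `i`, whereas a reshape keeps the elements in their original reading order and only changes where the rows break. The variable names are illustrative.

```python
import tensorflow as tf

Y = tf.constant([[7, 8], [9, 10], [11, 12]])
Y_transposed = tf.transpose(Y)
Y_reshaped = tf.reshape(Y, (2, 3))

# Transpose: element [i, j] of Y ends up at [j, i]
transpose_swaps_axes = all(
    int(Y_transposed[j, i]) == int(Y[i, j])
    for i in range(3) for j in range(2)
)

# Reshape: flattening gives the same sequence of values as flattening Y
reshape_keeps_order = bool(
    tf.reduce_all(tf.reshape(Y_reshaped, [-1]) == tf.reshape(Y, [-1]))
)

print(transpose_swaps_axes, reshape_keeps_order)
```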

### Matrix multiplication tidbits
* If we transposed `Y`, it would be represented as $\mathbf{Y}^\mathsf{T}$ (note the capital T for transpose).
* Get an illustrative view of matrix multiplication [by Math is Fun](https://www.mathsisfun.com/algebra/matrix-multiplying.html).
* Try a hands-on demo of matrix multiplication: http://matrixmultiplication.xyz/ (shown below).

![visual demo of matrix multiplication](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-matrix-multiply-crop.gif)

### Changing the datatype of a tensor

Sometimes you'll want to alter the default datatype of your tensor. 

This is common when you want to compute using less precision (e.g. 16-bit floating point numbers vs. 32-bit floating point numbers). 

Computing with less precision is useful on devices with less computing capacity such as mobile devices (because the fewer bits, the less space the computations require).

You can change the datatype of a tensor using [`tf.cast()`](https://www.tensorflow.org/api_docs/python/tf/cast).

In [154]:
# Create a new tensor with default datatype (float32)
B = tf.constant([1.7, 7.4])

# Create a new tensor with default datatype (int32)
C = tf.constant([1, 7])
B, C

(<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1.7, 7.4], dtype=float32)>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 7])>)

In [155]:
# Change from int32 to float32
C = tf.cast(C, dtype=tf.float32)
C

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1., 7.], dtype=float32)>
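
The reduced-precision case mentioned earlier can be sketched the same way. This example casts a `float32` tensor to `float16` and compares the bytes used per element, then casts to `int32` to show that the fractional part is dropped. The variable names are illustrative.

```python
import tensorflow as tf

B = tf.constant([1.7, 7.4])   # float32 by default

# Reduce precision from 32-bit to 16-bit floats
B_half = tf.cast(B, dtype=tf.float16)
print(B_half.dtype, B.dtype.size, B_half.dtype.size)   # bytes per element: 4 vs 2

# Casting floats to integers truncates the fractional part
B_int = tf.cast(B, dtype=tf.int32)
print(B_int.numpy())
```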