
# 00. Getting started with TensorFlow: A guide to the fundamentals

## What is TensorFlow?

[TensorFlow](https://www.tensorflow.org/) is an open-source end-to-end machine learning library for preprocessing data, modelling data and serving models (getting them into the hands of others).

## Why use TensorFlow?

Rather than building machine learning and deep learning models from scratch, it's more likely you'll use a library such as TensorFlow. This is because it contains many of the most common machine learning functions you'll want to use.

## What we're going to cover

TensorFlow is vast. But the main premise is simple: turn data into numbers (tensors) and build machine learning algorithms to find patterns in them.

In this notebook we cover some of the most fundamental TensorFlow operations, more specifically:
* Introduction to tensors (creating tensors)
* Getting information from tensors (tensor attributes)
* Manipulating tensors (tensor operations)
* Tensors and NumPy
* Using @tf.function (a way to speed up your regular Python functions)
* Using GPUs with TensorFlow
* Exercises to try

Things to note:
* Many of the conventions here happen automatically behind the scenes (when you build a model), but they're worth knowing so that when you see them, you know what's happening.
* For any TensorFlow function you see, it's important to be able to check it out in the documentation, for example, going to the Python API docs for all functions and searching for what you need: https://www.tensorflow.org/api_docs/python/ (don't worry if this seems overwhelming at first, with enough practice, you'll get used to navigating the documentation).



## Introduction to Tensors

If you've ever used NumPy, [tensors](https://www.tensorflow.org/guide/tensor) are kind of like NumPy arrays (we'll see more on this later).

For the sake of this notebook and going forward, you can think of a tensor as a multi-dimensional numerical representation (also referred to as n-dimensional, where n can be any number) of something, where that something can be almost anything you can imagine:
* It could be numbers themselves (using tensors to represent the price of houses). 
* It could be an image (using tensors to represent the pixels of an image).
* It could be text (using tensors to represent words).
* Or it could be some other form of information (or data) you want to represent with numbers.

The main difference between tensors and NumPy arrays (also an n-dimensional array of numbers) is that tensors can be used on [GPUs (graphics processing units)](https://blogs.nvidia.com/blog/2009/12/16/whats-the-difference-between-a-cpu-and-a-gpu/) and [TPUs (tensor processing units)](https://en.wikipedia.org/wiki/Tensor_processing_unit).

The benefit of being able to run on GPUs and TPUs is faster computation. This means that if we want to find patterns in the numerical representations of our data, we can generally find them faster using GPUs and TPUs.

Okay, we've talked enough about tensors. Let's see them.

The first thing we'll do is import TensorFlow under the common alias `tf`.

In [1]:
# Import libraries
import os

# Set the environment variable for hiding warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

In [2]:
import tensorflow as tf
from tensorflow.keras import backend as K

# Clear any previous session
K.clear_session()

# Check if GPU is available
gpus = tf.config.experimental.list_physical_devices('GPU')
try:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)
finally:
    # Find the version number (should be 2.x+)
    print(f"TensorFlow version: {tf.__version__}")

TensorFlow version: 2.12.0


### Creating Tensors with `tf.constant()`

As mentioned before, you usually won't create tensors yourself. This is because TensorFlow has built-in modules (such as [`tf.io`](https://www.tensorflow.org/api_docs/python/tf/io) and [`tf.data`](https://www.tensorflow.org/guide/data)) which can read your data sources and automatically convert them to tensors. Later on, neural network models will process these for us.

But for now, because we're getting familiar with tensors themselves and how to manipulate them, we'll see how we can create them ourselves.

We'll begin by using [`tf.constant()`](https://www.tensorflow.org/api_docs/python/tf/constant).

In [3]:
# Let's see the documentation for tf.constant
help(tf.constant)

Help on function constant in module tensorflow.python.framework.constant_op:

constant(value, dtype=None, shape=None, name='Const')
    Creates a constant tensor from a tensor-like object.
    
    Note: All eager `tf.Tensor` values are immutable (in contrast to
    `tf.Variable`). There is nothing especially _constant_ about the value
    returned from `tf.constant`. This function is not fundamentally different from
    `tf.convert_to_tensor`. The name `tf.constant` comes from the `value` being
    embedded in a `Const` node in the `tf.Graph`. `tf.constant` is useful
    for asserting that the value can be embedded that way.
    
    If the argument `dtype` is not specified, then the type is inferred from
    the type of `value`.
    
    >>> # Constant 1-D Tensor from a python list.
    >>> tf.constant([1, 2, 3, 4, 5, 6])
    <tf.Tensor: shape=(6,), dtype=int32,
        numpy=array([1, 2, 3, 4, 5, 6], dtype=int32)>
    >>> # Or a numpy array
    >>> a = np.array([[1, 2, 3], [4, 5, 6]

In [4]:
# Create a scalar (rank 0 tensor)
scalar = tf.constant(value=7, 
                     dtype=None, 
                     shape=None, 
                     name='scalar')
scalar

<tf.Tensor: shape=(), dtype=int32, numpy=7>

A scalar is known as a rank 0 tensor because it has no dimensions (it's just a number).

> 🔑 **Note:** For now, you don't need to know too much about the different ranks of tensors (but we will see more on this later). The important point is knowing tensors can have any number of dimensions (the exact number will depend on what data you're representing).

In [5]:
type(scalar)

tensorflow.python.framework.ops.EagerTensor

In [6]:
# Check the methods available for `scalar`
for e in dir(scalar):
    if not e.startswith("_"):
        print(e)

OVERLOADABLE_OPERATORS
backing_device
consumers
cpu
device
dtype
eval
experimental_ref
get_shape
gpu
graph
is_packed
name
ndim
numpy
op
ref
set_shape
shape
value_index


In [7]:
# Check the number of dimensions of a tensor 
# ndim stands for number of dimensions
scalar.ndim

0

In [8]:
# Create a vector (1 dimension)
vector = tf.constant(value=[10, 10], 
                     dtype=None, 
                     shape=None, 
                     name='vector')
vector

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([10, 10], dtype=int32)>

In [9]:
type(vector)

tensorflow.python.framework.ops.EagerTensor

In [10]:
# Check the number of dimensions of our vector tensor
vector.ndim

1

In [11]:
# Create a matrix (2 dimensions)
matrix = tf.constant(value=[[10, 7],
                      [7, 10]], 
                      dtype=None, 
                      shape=None, 
                      name='matrix')
matrix

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 7, 10]], dtype=int32)>

In [12]:
matrix.ndim

2

By default, TensorFlow creates tensors with either an `int32` or `float32` datatype.

This is known as [32-bit precision](https://en.wikipedia.org/wiki/Precision_(computer_science)) (the higher the number of bits, the more precise the value, and the more space it takes up on your computer).

In [13]:
# Create another matrix and define the datatype
# Specify the datatype with 'dtype'
another_matrix = tf.constant(value=[[10., 7.],
                              [3., 2.],
                              [8., 9.]], 
                              dtype=tf.float16, 
                              shape=None, 
                              name='another_matrix') 
another_matrix

<tf.Tensor: shape=(3, 2), dtype=float16, numpy=
array([[10.,  7.],
       [ 3.,  2.],
       [ 8.,  9.]], dtype=float16)>

In [14]:
# Even though another_matrix contains more numbers, its dimensions stay the same
another_matrix.ndim

2
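To make the point about default datatypes and precision concrete, here is a minimal sketch (the variable names are made up for illustration). It checks which datatype TensorFlow infers and compares how many bytes each element takes up at 32-bit and 16-bit precision.

```python
import tensorflow as tf

# Whole numbers are inferred as int32, decimal numbers as float32
int_tensor = tf.constant([1, 2, 3])
float_tensor = tf.constant([1., 2., 3.])
print(int_tensor.dtype, float_tensor.dtype)

# The same values stored at 16-bit precision
half_tensor = tf.constant([1., 2., 3.], dtype=tf.float16)

# itemsize is the number of bytes used per element of the underlying NumPy array
print(float_tensor.numpy().itemsize, half_tensor.numpy().itemsize)
```

Lower precision saves memory, at the cost of representing each number less exactly.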

In [15]:
# How about a tensor? (more than 2 dimensions, although, all of the above items are also technically tensors)
tensor = tf.constant(value=[[[1, 2, 3],
                       [4, 5, 6]],
                      [[7, 8, 9],
                       [10, 11, 12]],
                      [[13, 14, 15],
                       [16, 17, 18]]], 
                       dtype=None, 
                       shape=None, 
                       name='tensor')
tensor

<tf.Tensor: shape=(3, 2, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]],

       [[13, 14, 15],
        [16, 17, 18]]], dtype=int32)>

In [16]:
tensor.ndim

3

This is known as a rank 3 tensor (3 dimensions). However, a tensor can have an arbitrary (unlimited) number of dimensions.

For example, you might turn a series of images into tensors with shape (224, 224, 3, 32), where:
* 224, 224 (the first 2 dimensions) are the height and width of the images in pixels.
* 3 is the number of colour channels of the image (red, green, blue).
* 32 is the batch size (the number of images a neural network sees at any one time).
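As a rough sketch of the image example above, the cell below builds a stand-in batch of blank images with that shape and reads each dimension back out. The blank `images` tensor is hypothetical (real images would come from your data). Note that many TensorFlow and Keras image layers expect the batch dimension first, so the sketch also shows one way to reorder the dimensions with `tf.transpose()`.

```python
import tensorflow as tf

# Hypothetical batch of blank images laid out as (height, width, colour channels, batch size)
images = tf.zeros(shape=(224, 224, 3, 32), dtype=tf.uint8)
height, width, channels, batch_size = images.shape
print(height, width, channels, batch_size)

# Move the batch dimension to the front: (batch size, height, width, colour channels)
batch_first = tf.transpose(images, perm=[3, 0, 1, 2])
print(batch_first.shape)
```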

All of the above variables we've created are actually tensors. But you may also hear them referred to by their different names (the ones we gave them):
* **scalar**: a single number.
* **vector**: a number with direction (e.g. wind speed with direction).
* **matrix**: a 2-dimensional array of numbers.
* **tensor**: an n-dimensional array of numbers (where n can be any number, a 0-dimension tensor is a scalar, a 1-dimension tensor is a vector). 

To add to the confusion, the terms matrix and tensor are often used interchangeably.

Going forward since we're using TensorFlow, everything we refer to and use will be tensors.
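A quick way to convince yourself of this is to check each kind of value side by side. The sketch below (the `examples` dictionary is just for illustration) confirms that a scalar, vector, matrix and higher-rank tensor are all `tf.Tensor` objects and differ only in their number of dimensions.

```python
import tensorflow as tf

# One example of each name, smallest rank first
examples = {
    "scalar": tf.constant(7),
    "vector": tf.constant([10, 10]),
    "matrix": tf.constant([[10, 7], [7, 10]]),
    "tensor": tf.constant([[[1, 2, 3], [4, 5, 6]]]),
}

for name, value in examples.items():
    # Every value is a tf.Tensor, only ndim and shape change
    print(f"{name}: is tensor={isinstance(value, tf.Tensor)}, ndim={value.ndim}, shape={value.shape}")
```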

For more on the mathematical difference between scalars, vectors and matrices see the [visual algebra post by Math is Fun](https://www.mathsisfun.com/algebra/scalar-vector-matrix.html).

![difference between scalar, vector, matrix, tensor](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-scalar-vector-matrix-tensor.png)

### Creating Tensors with `tf.Variable()`

You can also create tensors using [`tf.Variable()`](https://www.tensorflow.org/api_docs/python/tf/Variable), although you'll rarely need to because, when working with data, tensors are often created for you automatically.

The difference between `tf.Variable()` and `tf.constant()` is that tensors created with `tf.constant()` are immutable (can't be changed, can only be used to create a new tensor), whereas tensors created with `tf.Variable()` are mutable (can be changed).

In [17]:
help(tf.Variable)

Help on class Variable in module tensorflow.python.ops.variables:

class Variable(tensorflow.python.trackable.base.Trackable)
 |  Variable(*args, **kwargs)
 |  
 |  See the [variable guide](https://tensorflow.org/guide/variable).
 |  
 |  A variable maintains shared, persistent state manipulated by a program.
 |  
 |  The `Variable()` constructor requires an initial value for the variable, which
 |  can be a `Tensor` of any type and shape. This initial value defines the type
 |  and shape of the variable. After construction, the type and shape of the
 |  variable are fixed. The value can be changed using one of the assign methods.
 |  
 |  >>> v = tf.Variable(1.)
 |  >>> v.assign(2.)
 |  <tf.Variable ... shape=() dtype=float32, numpy=2.0>
 |  >>> v.assign_add(0.5)
 |  <tf.Variable ... shape=() dtype=float32, numpy=2.5>
 |  
 |  The `shape` argument to `Variable`'s constructor allows you to construct a
 |  variable with a less defined shape than its `initial_value`:
 |  
 |  >>> v = tf.

In [18]:
# Create the same tensor with tf.Variable() and tf.constant()
changeable_tensor = tf.Variable(initial_value=[10, 7], 
                                trainable=None, 
                                validate_shape=True, 
                                caching_device=None, 
                                name='changeable_tensor', 
                                variable_def=None, 
                                dtype=None, 
                                import_scope=None, 
                                constraint=None, 
                                synchronization=tf.VariableSynchronization.AUTO, 
                                aggregation=tf.VariableAggregation.NONE)

unchangeable_tensor = tf.constant(value=[10, 7], 
                                  dtype=None, 
                                  shape=None, 
                                  name='unchangeable_tensor')

changeable_tensor, unchangeable_tensor

(<tf.Variable 'changeable_tensor:0' shape=(2,) dtype=int32, numpy=array([10,  7], dtype=int32)>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([10,  7], dtype=int32)>)

Now let's try to change one of the elements of the changeable tensor.

In [19]:
# Will error (requires the .assign() method)
try:
    changeable_tensor[0] = 7
except TypeError as e:
    print("Cannot modify a tf.Variable without using the .assign() method")
finally:
    changeable_tensor

Cannot modify a tf.Variable without using the .assign() method


Changing an element of a `tf.Variable()` tensor requires the `assign()` method.

In [20]:
help(changeable_tensor.assign)

Help on method assign in module tensorflow.python.ops.resource_variable_ops:

assign(value, use_locking=None, name=None, read_value=True) method of tensorflow.python.ops.resource_variable_ops.ResourceVariable instance
    Assigns a new value to this variable.
    
    Args:
      value: A `Tensor`. The new value for this variable.
      use_locking: If `True`, use locking during the assignment.
      name: The name to use for the assignment.
      read_value: A `bool`. Whether to read and return the new value of the
        variable or not.
    
    Returns:
      If `read_value` is `True`, this method will return the new value of the
      variable after the assignment has completed. Otherwise, when in graph mode
      it will return the `Operation` that does the assignment, and when in eager
      mode it will return `None`.



In [21]:
# Won't error
changeable_tensor[0].assign(7, 
                            # use_locking=False, 
                            # read_value=True,
                            name=None)
                            
changeable_tensor

<tf.Variable 'changeable_tensor:0' shape=(2,) dtype=int32, numpy=array([7, 7], dtype=int32)>
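`assign()` isn't limited to single elements. As a small sketch (using a fresh variable called `changeable` so the tensors above stay untouched), you can replace every value at once, or use `assign_add()`, which appears in the `tf.Variable` help output above, to add to the current values in place.

```python
import tensorflow as tf

changeable = tf.Variable([10, 7])

# Replace every value at once (the new value must keep the same shape and dtype)
changeable.assign([7, 7])

# Add to the current values in place
changeable.assign_add([1, 2])
print(changeable.numpy())  # [8 9]
```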

Now let's try to change a value in a `tf.constant()` tensor.

In [22]:
# Will error (can't change tf.constant())
try:
    unchangeable_tensor[0].assign(7, 
                              # use_locking=False,
                              # read_value=True,
                              name=None)
except AttributeError as e:
    print("Cannot modify a tf.constant()!")
finally:
    unchangeable_tensor

Cannot modify a tf.constant()!


Which one should you use? `tf.constant()` or `tf.Variable()`?

It will depend on what your problem requires. However, most of the time, TensorFlow will automatically choose for you (when loading data or modelling data).

### Creating random tensors

Random tensors are tensors of some arbitrary size which contain random numbers.

Why would you want to create random tensors? 

Neural networks use random tensors to initialize the weights (patterns) that they're trying to learn from the data.

For example, the process of a neural network learning often involves taking a random n-dimensional array of numbers and refining them until they represent some kind of pattern (a compressed way to represent the original data).

**How a network learns**
![how a network learns](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-how-a-network-learns.png)
*A network learns by starting with random patterns (1) then going through demonstrative examples of data (2) whilst trying to update its random patterns to represent the examples (3).*
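To make the three steps in the caption less abstract, here is a deliberately tiny sketch. It is not a real neural network: it is a single weight, toy data following `y = 3 * x`, and a learning rate of `0.01`, all chosen purely for illustration. It uses `tf.GradientTape` (which this notebook doesn't cover) and the same `tf.random.Generator` class used in the cells that follow.

```python
import tensorflow as tf

# Toy data where the pattern to learn is y = 3 * x
x = tf.constant([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

# (1) Start with a random weight
generator = tf.random.Generator.from_seed(42)
weight = tf.Variable(generator.normal(shape=(1,)))
initial_weight = float(weight.numpy()[0])

for _ in range(100):
    # (2) Go through the examples and measure how wrong the current weight is
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean((weight * x - y) ** 2)
    # (3) Update the weight in the direction that reduces the error
    gradient = tape.gradient(loss, weight)
    weight.assign_sub(0.01 * gradient)

learned_weight = float(weight.numpy()[0])
print(f"initial weight: {initial_weight:.3f}, learned weight: {learned_weight:.3f}")
```

The weight starts as a random number and ends up close to 3, the pattern hidden in the data.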

We can create random tensors by using the [`tf.random.Generator`](https://www.tensorflow.org/guide/random_numbers#the_tfrandomgenerator_class) class.

In [23]:
help(tf.random.Generator.from_seed)

Help on method from_seed in module tensorflow.python.ops.stateful_random_ops:

from_seed(seed, alg=None) method of builtins.type instance
    Creates a generator from a seed.
    
    A seed is a 1024-bit unsigned integer represented either as a Python
    integer or a vector of integers. Seeds shorter than 1024-bit will be
    padded. The padding, the internal structure of a seed and the way a seed
    is converted to a state are all opaque (unspecified). The only semantics
    specification of seeds is that two different seeds are likely to produce
    two independent generators (but no guarantee).
    
    Args:
      seed: the seed for the RNG.
      alg: (optional) the RNG algorithm. If None, it will be auto-selected. See
        `__init__` for its possible values.
    
    Returns:
      The new generator.



In [24]:
# Create two random (but the same) tensors
# Set the seed for reproducibility
random_generator_1 = tf.random.Generator.from_seed(seed=42, alg=None)

In [25]:
type(random_generator_1)

tensorflow.python.ops.stateful_random_ops.Generator

In [26]:
# Check the methods available for `random_generator_1`
for e in dir(random_generator_1):
    if not e.startswith("_"):
        print(e)

algorithm
binomial
from_key_counter
from_non_deterministic_state
from_seed
from_state
key
make_seeds
normal
reset
reset_from_key_counter
reset_from_seed
skip
split
state
truncated_normal
uniform
uniform_full_int


In [27]:
help(random_generator_1.normal)

Help on method normal in module tensorflow.python.ops.stateful_random_ops:

normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, name=None) method of tensorflow.python.ops.stateful_random_ops.Generator instance
    Outputs random values from a normal distribution.
    
    Args:
      shape: A 1-D integer Tensor or Python array. The shape of the output
        tensor.
      mean: A 0-D Tensor or Python value of type `dtype`. The mean of the normal
        distribution.
      stddev: A 0-D Tensor or Python value of type `dtype`. The standard
        deviation of the normal distribution.
      dtype: The type of the output.
      name: A name for the operation (optional).
    
    Returns:
      A tensor of the specified shape filled with random normal values.



In [28]:
# Create tensor of size 3 x 2 from a normal distribution
random_1 = random_generator_1.normal(shape=(3, 2), 
                                     mean=0.0, 
                                     stddev=1.0, 
                                     dtype=tf.float32, 
                                     name='random_1')  

random_generator_2 = tf.random.Generator.from_seed(seed=42, alg=None)
random_2 = random_generator_2.normal(shape=(3, 2), 
                                     mean=0.0, 
                                     stddev=1.0, 
                                     dtype=tf.float32, 
                                     name='random_2')

# Are they equal?
random_1, random_2, random_1 == random_2

(<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193765, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193765, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[ True,  True],
        [ True,  True],
        [ True,  True]])>)

The random tensors we've made are actually [pseudorandom numbers](https://www.computerhope.com/jargon/p/pseudo-random.htm) (they appear as random, but really aren't).

If we set a seed we'll get the same random numbers (if you've ever used NumPy, this is similar to `np.random.seed(42)`). 

Setting the seed says, "hey, create some random numbers, but flavour them with X" (X is the seed).

What do you think will happen when we change the seed?

In [29]:
# Create two random (and different) tensors
random_generator_3 = tf.random.Generator.from_seed(seed=42, alg=None)
random_3 = random_generator_3.normal(shape=(3, 2), 
                                     mean=0.0, 
                                     stddev=1.0, 
                                     dtype=tf.float32, 
                                     name='random_3')

random_generator_4 = tf.random.Generator.from_seed(seed=11, alg=None)
random_4 = random_generator_4.normal(shape=(3, 2), 
                                     mean=0.0, 
                                     stddev=1.0, 
                                     dtype=tf.float32, 
                                     name='random_4')

# Check the tensors and see if they are equal
random_3, random_4, random_1 == random_3, random_3 == random_4

(<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193765, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[ 0.2730574 , -0.29925638],
        [-0.3652325 ,  0.61883307],
        [-1.0130816 ,  0.2829171 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[ True,  True],
        [ True,  True],
        [ True,  True]])>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[False, False],
        [False, False],
        [False, False]])>)

What if you wanted to shuffle the order of a tensor?

Wait, why would you want to do that?

Let's say you're working with 15,000 images of cats and dogs, where the first 10,000 images are of cats and the next 5,000 are of dogs. This order could affect how a neural network learns (it may overfit by learning the order of the data). Instead, it might be a good idea to move your data around.

In [30]:
help(tf.random.shuffle)

Help on function random_shuffle in module tensorflow.python.ops.random_ops:

random_shuffle(value, seed=None, name=None)
    Randomly shuffles a tensor along its first dimension.
    
    The tensor is shuffled along dimension 0, such that each `value[j]` is mapped
    to one and only one `output[i]`. For example, a mapping that might occur for a
    3x2 tensor is:
    
    ```python
    [[1, 2],       [[5, 6],
     [3, 4],  ==>   [1, 2],
     [5, 6]]        [3, 4]]
    ```
    
    Args:
      value: A Tensor to be shuffled.
      seed: A Python integer. Used to create a random seed for the distribution.
        See
        `tf.random.set_seed`
        for behavior.
      name: A name for the operation (optional).
    
    Returns:
      A tensor of same shape and type as `value`, shuffled along its first
      dimension.



In [31]:
# Shuffle a tensor (valuable for when you want to shuffle your data)
not_shuffled = tf.constant(value=[[10, 7],
                            [3, 4],
                            [2, 5]], 
                            dtype=None, 
                            shape=None, 
                            name='not_shuffled')

# Gets different results each time
tf.random.shuffle(value=not_shuffled, 
                  seed=None, 
                  name=None)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 2,  5],
       [ 3,  4]], dtype=int32)>

In [32]:
# Try to shuffle in the same order every time using the seed parameter (won't actually be the same)
tf.random.shuffle(value=not_shuffled, 
                  seed=42,
                  name=None)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 2,  5],
       [ 3,  4],
       [10,  7]], dtype=int32)>

Wait... why didn't the numbers come out the same?

It's due to rule #4 of the [`tf.random.set_seed()`](https://www.tensorflow.org/api_docs/python/tf/random/set_seed) documentation.

> "4. If both the global and the operation seed are set: Both seeds are used in conjunction to determine the random sequence."

`tf.random.set_seed(42)` sets the global seed, and the `seed` parameter in `tf.random.shuffle(seed=42)` sets the operation seed.

Both matter because, as the documentation puts it, "Operations that rely on a random seed actually derive it from two seeds: the global and operation-level seeds. This sets the global seed."


In [33]:
# Shuffle in the same order every time

# Set the global random seed
tf.random.set_seed(seed=42)

# Set the operation random seed
tf.random.shuffle(value=not_shuffled, 
                  seed=42,
                  name=None)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4],
       [ 2,  5]], dtype=int32)>
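To check that this really is repeatable, the sketch below wraps both steps in a small helper (`seeded_shuffle` is a made-up name, not a TensorFlow function) and calls it twice. It assumes eager execution, which is the default in TensorFlow 2.x.

```python
import tensorflow as tf

not_shuffled = tf.constant([[10, 7], [3, 4], [2, 5]])

def seeded_shuffle(tensor):
    # Hypothetical helper: reset the global seed, then shuffle with an operation seed
    tf.random.set_seed(42)
    return tf.random.shuffle(tensor, seed=42)

first = seeded_shuffle(not_shuffled)
second = seeded_shuffle(not_shuffled)

# Both calls should produce the same row order
print(tf.reduce_all(first == second).numpy())
```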

In [34]:
help(tf.random.set_seed)

Help on function set_seed in module tensorflow.python.framework.random_seed:

set_seed(seed)
    Sets the global random seed.
    
    Operations that rely on a random seed actually derive it from two seeds:
    the global and operation-level seeds. This sets the global seed.
    
    Its interactions with operation-level seeds is as follows:
    
      1. If neither the global seed nor the operation seed is set: A randomly
        picked seed is used for this op.
      2. If the global seed is set, but the operation seed is not:
        The system deterministically picks an operation seed in conjunction with
        the global seed so that it gets a unique random sequence. Within the
        same version of tensorflow and user code, this sequence is deterministic.
        However across different versions, this sequence might change. If the
        code depends on particular seeds to work, specify both global
        and operation-level seeds explicitly.
      3. If the operation seed

In [35]:
# Set the global random seed
tf.random.set_seed(seed=42)

# Leave the operation random seed unset (seed=None)
tf.random.shuffle(value=not_shuffled, 
                  name=None,
                  seed=None)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 3,  4],
       [ 2,  5],
       [10,  7]], dtype=int32)>
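In the cats and dogs scenario you'd normally have labels as well as images, and the two need to stay lined up after shuffling. One possible approach, sketched below with a made-up `features` and `labels` pair, is to shuffle the row indices once and then apply the same order to both tensors with `tf.gather()`.

```python
import tensorflow as tf

# Hypothetical dataset: each row of `features` belongs to the label at the same index
features = tf.constant([[10, 7], [3, 4], [2, 5]])
labels = tf.constant([0, 1, 2])

# Shuffle the row indices once...
indices = tf.random.shuffle(tf.range(features.shape[0]))

# ...then apply the same order to both tensors
shuffled_features = tf.gather(features, indices)
shuffled_labels = tf.gather(labels, indices)

print(shuffled_features.numpy(), shuffled_labels.numpy())
```

Calling `tf.random.shuffle()` separately on each tensor could put them in different orders, which would pair images with the wrong labels.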

### Other ways to make tensors

Though you might rarely use these (remember, many tensor operations are done behind the scenes for you), you can use [`tf.ones()`](https://www.tensorflow.org/api_docs/python/tf/ones) to create a tensor of all ones and [`tf.zeros()`](https://www.tensorflow.org/api_docs/python/tf/zeros) to create a tensor of all zeros.

In [36]:
help(tf.ones)

Help on function ones in module tensorflow.python.ops.array_ops:

ones(shape, dtype=tf.float32, name=None)
    Creates a tensor with all elements set to one (1).
    
    See also `tf.ones_like`, `tf.zeros`, `tf.fill`, `tf.eye`.
    
    This operation returns a tensor of type `dtype` with shape `shape` and
    all elements set to one.
    
    >>> tf.ones([3, 4], tf.int32)
    <tf.Tensor: shape=(3, 4), dtype=int32, numpy=
    array([[1, 1, 1, 1],
           [1, 1, 1, 1],
           [1, 1, 1, 1]], dtype=int32)>
    
    Args:
      shape: A `list` of integers, a `tuple` of integers, or
        a 1-D `Tensor` of type `int32`.
      dtype: Optional DType of an element in the resulting `Tensor`. Default is
        `tf.float32`.
      name: Optional string. A name for the operation.
    
    Returns:
      A `Tensor` with all elements set to one (1).



In [37]:
# Make a tensor of all ones
tf.ones(shape=(3, 2), 
        dtype=tf.float32, 
        name=None)

<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[1., 1.],
       [1., 1.],
       [1., 1.]], dtype=float32)>

In [38]:
help(tf.zeros)

Help on function zeros in module tensorflow.python.ops.array_ops:

zeros(shape, dtype=tf.float32, name=None)
    Creates a tensor with all elements set to zero.
    
    See also `tf.zeros_like`, `tf.ones`, `tf.fill`, `tf.eye`.
    
    This operation returns a tensor of type `dtype` with shape `shape` and
    all elements set to zero.
    
    >>> tf.zeros([3, 4], tf.int32)
    <tf.Tensor: shape=(3, 4), dtype=int32, numpy=
    array([[0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]], dtype=int32)>
    
    Args:
      shape: A `list` of integers, a `tuple` of integers, or
        a 1-D `Tensor` of type `int32`.
      dtype: The DType of an element in the resulting `Tensor`.
      name: Optional string. A name for the operation.
    
    Returns:
      A `Tensor` with all elements set to zero.



In [39]:
# Make a tensor of all zeros
tf.zeros(shape=(3, 2), 
         dtype=tf.float32, 
         name=None)

<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[0., 0.],
       [0., 0.],
       [0., 0.]], dtype=float32)>
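The help output above also points to `tf.ones_like`, `tf.zeros_like` and `tf.fill`. As a brief sketch (the `template` tensor is only for illustration), the `*_like` functions copy the shape and datatype of an existing tensor, while `tf.fill()` creates a tensor filled with any single value.

```python
import tensorflow as tf

template = tf.constant([[10, 7], [3, 4], [2, 5]])

# Same shape and dtype as `template`, filled with ones or zeros
ones = tf.ones_like(template)
zeros = tf.zeros_like(template)

# A 3 x 2 tensor where every element is 7
sevens = tf.fill(dims=[3, 2], value=7)

print(ones.shape, ones.dtype)
print(sevens.numpy())
```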

You can also turn NumPy arrays into tensors.

Remember, the main difference between tensors and NumPy arrays is that tensors can be run on GPUs.

> 🔑 **Note:** A matrix or tensor is typically represented by a capital letter (e.g. `X` or `A`) whereas a vector is typically represented by a lowercase letter (e.g. `y` or `b`).

In [40]:
import numpy as np

# Create a NumPy array of the integers 1 to 24 (stop=25 is excluded)
numpy_A = np.arange(start=1, 
                    stop=25, 
                    dtype=np.int32) 

# NOTE: The shape total (2*4*3) has to match the number of elements in the array
A = tf.constant(value=numpy_A,  
                shape=[2, 4, 3], 
                dtype=tf.int32, 
                name='A')
 
numpy_A, A

(array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24], dtype=int32),
 <tf.Tensor: shape=(2, 4, 3), dtype=int32, numpy=
 array([[[ 1,  2,  3],
         [ 4,  5,  6],
         [ 7,  8,  9],
         [10, 11, 12]],
 
        [[13, 14, 15],
         [16, 17, 18],
         [19, 20, 21],
         [22, 23, 24]]], dtype=int32)>)
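The shape note in the cell above can be checked before creating the tensor, and the conversion also works in the other direction. A minimal sketch, where `target_shape` and `back_to_numpy` are illustrative names:

```python
import numpy as np
import tensorflow as tf

numpy_A = np.arange(1, 25, dtype=np.int32)
target_shape = [2, 4, 3]

# The shape total (2*4*3 = 24) should equal the number of elements in the array
print(int(np.prod(target_shape)), numpy_A.size)

# NumPy array -> tensor
A = tf.constant(numpy_A, shape=target_shape)

# Tensor -> NumPy array
back_to_numpy = A.numpy()
print(type(back_to_numpy), back_to_numpy.shape)
```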

In [41]:
help(np.arange)

Help on built-in function arange in module numpy:

arange(...)
    arange([start,] stop[, step,], dtype=None, *, like=None)
    
    Return evenly spaced values within a given interval.
    
    ``arange`` can be called with a varying number of positional arguments:
    
    * ``arange(stop)``: Values are generated within the half-open interval
      ``[0, stop)`` (in other words, the interval including `start` but
      excluding `stop`).
    * ``arange(start, stop)``: Values are generated within the half-open
      interval ``[start, stop)``.
    * ``arange(start, stop, step)`` Values are generated within the half-open
      interval ``[start, stop)``, with spacing between values given by
      ``step``.
    
    For integer arguments the function is roughly equivalent to the Python
    built-in :py:class:`range`, but returns an ndarray rather than a ``range``
    instance.
    
    When using a non-integer step, such as 0.1, it is often better to use
    `numpy.linspace`.
    
    


## Getting information from tensors (shape, rank, size)

There will be times when you'll want to get different pieces of information from your tensors, in particular, you should know the following tensor vocabulary:
* **Shape:** The length (number of elements) of each of the dimensions of a tensor.
* **Rank:** The number of tensor dimensions. A scalar has rank 0, a vector has rank 1, a matrix has rank 2, a tensor has rank n.
* **Axis** or **Dimension:** A particular dimension of a tensor.
* **Size:** The total number of items in the tensor.

You'll use these especially when you're trying to line up the shapes of your data with the shapes of your model. For example, making sure the shape of your image tensors matches the shape of your model's input layer.

We've already seen one of these before using the `ndim` attribute. Let's see the rest.

In [42]:
# Create a rank 4 tensor (4 dimensions)
rank_4_tensor = tf.zeros(shape=[2, 3, 4, 5], 
                         dtype=tf.float32, 
                         name='rank_4_tensor')
rank_4_tensor

<tf.Tensor: shape=(2, 3, 4, 5), dtype=float32, numpy=
array([[[[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]],


       [[[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]]], dtype=float32)>

In [43]:
help(tf.size)

Help on function size_v2 in module tensorflow.python.ops.array_ops:

size_v2(input, out_type=tf.int32, name=None)
    Returns the size of a tensor.
    
    See also `tf.shape`.
    
    Returns a 0-D `Tensor` representing the number of elements in `input`
    of type `out_type`. Defaults to tf.int32.
    
    For example:
    
    >>> t = tf.constant([[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]])
    >>> tf.size(t)
    <tf.Tensor: shape=(), dtype=int32, numpy=12>
    
    Args:
      input: A `Tensor` or `SparseTensor`.
      name: A name for the operation (optional).
      out_type: (Optional) The specified non-quantized numeric output type of the
        operation. Defaults to `tf.int32`.
    
    Returns:
      A `Tensor` of type `out_type`. Defaults to `tf.int32`.
    
    @compatibility(numpy)
    Equivalent to np.size()
    @end_compatibility



In [44]:
rank_4_tensor.shape, rank_4_tensor.ndim, tf.size(input=rank_4_tensor, out_type=tf.int32, name=None)

(TensorShape([2, 3, 4, 5]), 4, <tf.Tensor: shape=(), dtype=int32, numpy=120>)

In [45]:
# Get various attributes of tensor
print("Datatype of every element:", rank_4_tensor.dtype)
print("Number of dimensions (rank):", rank_4_tensor.ndim)
print("Shape of tensor:", rank_4_tensor.shape)
print("Elements along axis 0 of tensor:", rank_4_tensor.shape[0])
print("Elements along last axis of tensor:", rank_4_tensor.shape[-1])
# NOTE: .numpy() converts to NumPy array
print("Total number of elements (2*3*4*5):", tf.size(input=rank_4_tensor, out_type=tf.int32, name=None).numpy()) 

Datatype of every element: <dtype: 'float32'>
Number of dimensions (rank): 4
Shape of tensor: (2, 3, 4, 5)
Elements along axis 0 of tensor: 2
Elements along last axis of tensor: 5
Total number of elements (2*3*4*5): 120


You can also index tensors just like Python lists.

In [46]:
# Get the first 2 items of each dimension
rank_4_tensor[:2, :2, :2, :2]

<tf.Tensor: shape=(2, 2, 2, 2), dtype=float32, numpy=
array([[[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]],


       [[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]]], dtype=float32)>

In [47]:
# Get the dimension from each index except for the final one
rank_4_tensor[:1, :1, :1, :]

<tf.Tensor: shape=(1, 1, 1, 5), dtype=float32, numpy=array([[[[0., 0., 0., 0., 0.]]]], dtype=float32)>

In [48]:
# Create a rank 2 tensor (2 dimensions)
rank_2_tensor = tf.constant(value=[[10, 7],
                             [3, 4]], 
                             dtype=tf.float32, 
                             shape=None,
                             name='rank_2_tensor')

# Get the last item of each row
rank_2_tensor[:, -1]

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([7., 4.], dtype=float32)>
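One detail worth knowing when indexing: a slice (such as `:1`) keeps the axis it's applied to, whereas a plain integer index (such as `0`) removes that axis. A short sketch, using a throwaway tensor `t` (an illustrative name, not one used elsewhere in this notebook):

```python
import tensorflow as tf

t = tf.zeros(shape=[2, 3, 4, 5])

sliced = t[:1]         # a slice keeps axis 0 (now with length 1)
indexed = t[0]         # an integer index drops axis 0
last_only = t[..., 0]  # an integer index on the last axis drops the last axis

print(sliced.shape)     # (1, 3, 4, 5)
print(indexed.shape)    # (3, 4, 5)
print(last_only.shape)  # (2, 3, 4)
```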

You can also add dimensions to your tensor whilst keeping the same information present using `tf.newaxis`. 

In [49]:
help(tf.newaxis)

Help on NoneType object:

class NoneType(object)
 |  Methods defined here:
 |  
 |  __bool__(self, /)
 |      True if self else False
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.



In [50]:
# Add an extra dimension (to the end)
# NOTE: "..." (Python's Ellipsis) means "all of the remaining axes", so the new axis goes last
rank_3_tensor = rank_2_tensor[..., tf.newaxis] 
# shape (2, 2), shape (2, 2, 1)
rank_2_tensor, rank_3_tensor 

(<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
 array([[10.,  7.],
        [ 3.,  4.]], dtype=float32)>,
 <tf.Tensor: shape=(2, 2, 1), dtype=float32, numpy=
 array([[[10.],
         [ 7.]],
 
        [[ 3.],
         [ 4.]]], dtype=float32)>)

You can achieve the same using [`tf.expand_dims()`](https://www.tensorflow.org/api_docs/python/tf/expand_dims).

In [51]:
help(tf.expand_dims)

Help on function expand_dims_v2 in module tensorflow.python.ops.array_ops:

expand_dims_v2(input, axis, name=None)
    Returns a tensor with a length 1 axis inserted at index `axis`.
    
    Given a tensor `input`, this operation inserts a dimension of length 1 at the
    dimension index `axis` of `input`'s shape. The dimension index follows Python
    indexing rules: It's zero-based, a negative index it is counted backward
    from the end.
    
    This operation is useful to:
    
    * Add an outer "batch" dimension to a single element.
    * Align axes for broadcasting.
    * To add an inner vector length axis to a tensor of scalars.
    
    For example:
    
    If you have a single image of shape `[height, width, channels]`:
    
    >>> image = tf.zeros([10,10,3])
    
    You can add an outer `batch` axis by passing `axis=0`:
    
    >>> tf.expand_dims(image, axis=0).shape.as_list()
    [1, 10, 10, 3]
    
    The new axis location matches Python `list.insert(axis, 1)`:
   

In [52]:
# NOTE: "-1" means last axis
tf.expand_dims(input=rank_2_tensor, 
               axis=-1, 
               name=None) 

<tf.Tensor: shape=(2, 2, 1), dtype=float32, numpy=
array([[[10.],
        [ 7.]],

       [[ 3.],
        [ 4.]]], dtype=float32)>
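The new axis doesn't have to go at the end. As a small sketch (the names `front`, `middle` and `front_fn` are only illustrative), here's the same idea with the axis inserted in other positions. It also shows why `help(tf.newaxis)` describes `NoneType`: `tf.newaxis` is simply `None`.

```python
import tensorflow as tf

t = tf.constant([[10., 7.],
                 [3., 4.]])  # shape (2, 2)

# tf.newaxis is an alias for None
print(tf.newaxis is None)

front = t[tf.newaxis, ...]            # new axis at the start
middle = t[:, tf.newaxis, :]          # new axis in the middle
front_fn = tf.expand_dims(t, axis=0)  # function equivalent of `front`

print(front.shape, middle.shape, front_fn.shape)
```

Adding an axis at the front like this is a common way to turn a single sample into a batch of one.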

## Manipulating tensors (tensor operations)

Finding patterns in tensors (numerical representation of data) requires manipulating them.

Again, when building models in TensorFlow, much of this pattern discovery is done for you.

### Basic operations

You can perform many of the basic mathematical operations directly on tensors using Python operators such as `+`, `-` and `*`.

In [53]:
# You can add values to a tensor using the addition operator
tensor = tf.constant(value=[[10, 7], [3, 4]], 
                     dtype=tf.float32, 
                     shape=None,
                     name='tensor')
tensor + 10

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[20., 17.],
       [13., 14.]], dtype=float32)>

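The original tensor is unchanged: `tensor + 10` returns a new tensor rather than modifying `tensor` in place (tensors created with `tf.constant()` are immutable, and we didn't assign the result back to `tensor`).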

In [54]:
# Original tensor unchanged
tensor

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[10.,  7.],
       [ 3.,  4.]], dtype=float32)>

Other operators also work.

In [55]:
# Multiplication (known as element-wise multiplication)
tensor * 10

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[100.,  70.],
       [ 30.,  40.]], dtype=float32)>

In [56]:
# Subtraction
tensor - 10

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[ 0., -3.],
       [-7., -6.]], dtype=float32)>

You can also use the equivalent TensorFlow function. Using the TensorFlow function (where possible) has the advantage of being sped up later down the line when running as part of a [TensorFlow graph](https://www.tensorflow.org/tensorboard/graphs).

In [57]:
help(tf.multiply)

Help on function multiply in module tensorflow.python.ops.math_ops:

multiply(x, y, name=None)
    Returns an element-wise x * y.
    
    For example:
    
    >>> x = tf.constant(([1, 2, 3, 4]))
    >>> tf.math.multiply(x, x)
    <tf.Tensor: shape=(4,), dtype=..., numpy=array([ 1,  4,  9, 16], dtype=int32)>
    
    Since `tf.math.multiply` will convert its arguments to `Tensor`s, you can also
    pass in non-`Tensor` arguments:
    
    >>> tf.math.multiply(7,6)
    <tf.Tensor: shape=(), dtype=int32, numpy=42>
    
    If `x.shape` is not the same as `y.shape`, they will be broadcast to a
    compatible shape. (More about broadcasting
    [here](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).)
    
    For example:
    
    >>> x = tf.ones([1, 2]);
    >>> y = tf.ones([2, 1]);
    >>> x * y  # Taking advantage of operator overriding
    <tf.Tensor: shape=(2, 2), dtype=float32, numpy=
    array([[1., 1.],
         [1., 1.]], dtype=float32)>
    
    The reduction ver

In [58]:
# Use the tensorflow function equivalent of the '*' (multiply) operator
tf.multiply(x=tensor, 
            y=10, 
            name=None)

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[100.,  70.],
       [ 30.,  40.]], dtype=float32)>

In [59]:
# The original tensor is still unchanged
tensor

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[10.,  7.],
       [ 3.,  4.]], dtype=float32)>
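To convince yourself the operators and their TensorFlow function counterparts agree, you could compare them element by element. A minimal sketch (the `pairs` and `results` dictionaries are just helpers for this example):

```python
import tensorflow as tf

tensor = tf.constant([[10., 7.],
                      [3., 4.]])

# Each pair holds (result via Python operator, result via TensorFlow function)
pairs = {
    "add": (tensor + 10, tf.add(tensor, 10)),
    "subtract": (tensor - 10, tf.subtract(tensor, 10)),
    "multiply": (tensor * 10, tf.multiply(tensor, 10)),
}

results = {}
for name, (via_operator, via_function) in pairs.items():
    # True only if every element of the two results matches
    all_equal = tf.reduce_all(tf.equal(via_operator, via_function))
    results[name] = bool(all_equal.numpy())

print(results)
```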

### Matrix multiplication

One of the most common operations in machine learning algorithms is [matrix multiplication](https://www.mathsisfun.com/algebra/matrix-multiplying.html).

TensorFlow implements this matrix multiplication functionality in the [`tf.matmul()`](https://www.tensorflow.org/api_docs/python/tf/linalg/matmul) function.

The main two rules for matrix multiplication to remember are:
1. The inner dimensions must match:
  * `(3, 5) @ (3, 5)` won't work
  * `(5, 3) @ (3, 5)` will work
  * `(3, 5) @ (5, 3)` will work
2. The resulting matrix has the shape of the outer dimensions:
 * `(5, 3) @ (3, 5)` -> `(5, 5)`
 * `(3, 5) @ (5, 3)` -> `(3, 3)`

> 🔑 **Note:** '`@`' in Python is the symbol for matrix multiplication.
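The two rules above can be written down as a few lines of plain Python. This is only a sketch for 2-D shapes, and `matmul_output_shape` is a hypothetical helper (not a TensorFlow function):

```python
def matmul_output_shape(a_shape, b_shape):
    """Return the shape of a 2-D matrix product, or raise ValueError."""
    a_rows, a_cols = a_shape
    b_rows, b_cols = b_shape
    # Rule 1: the inner dimensions must match
    if a_cols != b_rows:
        raise ValueError(f"Inner dimensions differ: {a_shape} @ {b_shape}")
    # Rule 2: the result takes the shape of the outer dimensions
    return (a_rows, b_cols)

print(matmul_output_shape((5, 3), (3, 5)))  # (5, 5)
print(matmul_output_shape((3, 5), (5, 3)))  # (3, 3)

try:
    matmul_output_shape((3, 5), (3, 5))
except ValueError as error:
    print(error)
```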

In [60]:
help(tf.matmul)

Help on function matmul in module tensorflow.python.ops.math_ops:

matmul(a, b, transpose_a=False, transpose_b=False, adjoint_a=False, adjoint_b=False, a_is_sparse=False, b_is_sparse=False, output_type=None, name=None)
    Multiplies matrix `a` by matrix `b`, producing `a` * `b`.
    
    The inputs must, following any transpositions, be tensors of rank >= 2
    where the inner 2 dimensions specify valid matrix multiplication dimensions,
    and any further outer dimensions specify matching batch size.
    
    Both matrices must be of the same type. The supported types are:
    `bfloat16`, `float16`, `float32`, `float64`, `int32`, `int64`,
    `complex64`, `complex128`.
    
    Either matrix can be transposed or adjointed (conjugated and transposed) on
    the fly by setting one of the corresponding flag to `True`. These are `False`
    by default.
    
    If one or both of the matrices contain a lot of zeros, a more efficient
    multiplication algorithm can be used by setting the 

In [61]:
# Matrix multiplication in TensorFlow
print(tensor)
tf.matmul(a=tensor, 
          b=tensor, 
          transpose_a=False, 
          transpose_b=False, 
          adjoint_a=False, 
          adjoint_b=False, 
          a_is_sparse=False, 
          b_is_sparse=False, 
          name=None)

tf.Tensor(
[[10.  7.]
 [ 3.  4.]], shape=(2, 2), dtype=float32)


<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[121.,  98.],
       [ 42.,  37.]], dtype=float32)>

In [62]:
# Matrix multiplication with Python operator '@'
tensor @ tensor

<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[121.,  98.],
       [ 42.,  37.]], dtype=float32)>

Both of these examples work because our `tensor` variable is of shape `(2, 2)`, so multiplying it by itself gives `(2, 2) @ (2, 2)` and the inner dimensions match.

What if we created some tensors which had mismatched shapes?

In [63]:
# Create (3, 2) tensor
X = tf.constant(value=[[1, 2],
                 [3, 4],
                 [5, 6]], 
                 dtype=None, 
                 shape=None,
                 name='X')

# Create another (3, 2) tensor
Y = tf.constant(value=[[7, 8],
                 [9, 10],
                 [11, 12]], 
                 dtype=None, 
                 shape=None,
                 name='Y')
X, Y

(<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[1, 2],
        [3, 4],
        [5, 6]], dtype=int32)>,
 <tf.Tensor: shape=(3, 2), dtype=int32, numpy=
 array([[ 7,  8],
        [ 9, 10],
        [11, 12]], dtype=int32)>)

In [64]:
# Try to matrix multiply them (will error)
from tensorflow.errors import InvalidArgumentError

try:
    X @ Y
except InvalidArgumentError as e:
    print(f"Error: {e}")

Error: {{function_node __wrapped__MatMul_device_/job:localhost/replica:0/task:0/device:CPU:0}} Matrix size-incompatible: In[0]: [3,2], In[1]: [3,2] [Op:MatMul]


Trying to matrix multiply two tensors with the shape `(3, 2)` errors because the inner dimensions don't match (`2` vs `3`).

We need to either:
* Change the shape of `X` to `(2, 3)` so it's `(2, 3) @ (3, 2)`.
* Change the shape of `Y` to `(2, 3)` so it's `(3, 2) @ (2, 3)`.

We can do this with either:
* [`tf.reshape()`](https://www.tensorflow.org/api_docs/python/tf/reshape) - allows us to reshape a tensor into a defined shape.
* [`tf.transpose()`](https://www.tensorflow.org/api_docs/python/tf/transpose) - switches the dimensions of a given tensor.

![lining up dimensions for dot products](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-lining-up-dot-products.png)

Let's try `tf.reshape()` first.

In [65]:
help(tf.reshape)

Help on function reshape in module tensorflow.python.ops.array_ops:

reshape(tensor, shape, name=None)
    Reshapes a tensor.
    
    Given `tensor`, this operation returns a new `tf.Tensor` that has the same
    values as `tensor` in the same order, except with a new shape given by
    `shape`.
    
    >>> t1 = [[1, 2, 3],
    ...       [4, 5, 6]]
    >>> print(tf.shape(t1).numpy())
    [2 3]
    >>> t2 = tf.reshape(t1, [6])
    >>> t2
    <tf.Tensor: shape=(6,), dtype=int32,
      numpy=array([1, 2, 3, 4, 5, 6], dtype=int32)>
    >>> tf.reshape(t2, [3, 2])
    <tf.Tensor: shape=(3, 2), dtype=int32, numpy=
      array([[1, 2],
             [3, 4],
             [5, 6]], dtype=int32)>
    
    The `tf.reshape` does not change the order of or the total number of elements
    in the tensor, and so it can reuse the underlying data buffer. This makes it
    a fast operation independent of how big of a tensor it is operating on.
    
    >>> tf.reshape([1, 2, 3], [2, 2])
    Traceback (mos

In [66]:
# Example of reshape (3, 2) -> (2, 3)
tf.reshape(tensor=Y, 
           shape=(2, 3), 
           name=None)

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[ 7,  8,  9],
       [10, 11, 12]], dtype=int32)>

In [67]:
# Try matrix multiplication with reshaped Y
X @ tf.reshape(tensor=Y, shape=(2, 3), name=None)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]], dtype=int32)>

It worked. Let's try the same with `X`, except this time we'll transpose it (rather than reshape it) using [`tf.transpose()`](https://www.tensorflow.org/api_docs/python/tf/transpose) and multiply with `tf.matmul()`.

In [68]:
help(tf.transpose)

Help on function transpose_v2 in module tensorflow.python.ops.array_ops:

transpose_v2(a, perm=None, conjugate=False, name='transpose')
    Transposes `a`, where `a` is a Tensor.
    
    Permutes the dimensions according to the value of `perm`.
    
    The returned tensor's dimension `i` will correspond to the input dimension
    `perm[i]`. If `perm` is not given, it is set to (n-1...0), where n is the rank
    of the input tensor. Hence, by default, this operation performs a regular
    matrix transpose on 2-D input Tensors.
    
    If conjugate is `True` and `a.dtype` is either `complex64` or `complex128`
    then the values of `a` are conjugated and transposed.
    
    @compatibility(numpy)
    In `numpy` transposes are memory-efficient constant time operations as they
    simply return a new view of the same data with adjusted `strides`.
    
    TensorFlow does not support strides, so `transpose` returns a new tensor with
    the items permuted.
    @end_compatibility
    
   

In [69]:
# Example of transpose (3, 2) -> (2, 3)
tf.transpose(a=X, 
             perm=None, 
             conjugate=False, 
             name=None)

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 3, 5],
       [2, 4, 6]], dtype=int32)>

In [70]:
# Try matrix multiplication 
tf.matmul(a=tf.transpose(a=X, perm=None, conjugate=False, name=None), 
          b=Y, 
          transpose_a=False, 
          transpose_b=False, 
          adjoint_a=False, 
          adjoint_b=False, 
          a_is_sparse=False, 
          b_is_sparse=False, 
          name=None)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]], dtype=int32)>

In [71]:
# You can achieve the same result with parameters
tf.matmul(a=X, 
          b=Y, 
          transpose_a=True, 
          transpose_b=False, 
          adjoint_a=False, 
          adjoint_b=False, 
          a_is_sparse=False, 
          b_is_sparse=False, 
          name=None)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]], dtype=int32)>

Notice the difference in the resulting shapes when transposing `X` or reshaping `Y`.

This is because of the 2nd rule mentioned above:
 * `(3, 2) @ (2, 3)` -> `(3, 3)` done with `X @ tf.reshape(Y, shape=(2, 3))` 
 * `(2, 3) @ (3, 2)` -> `(2, 2)` done with `tf.matmul(tf.transpose(X), Y)`

This kind of data manipulation is a reminder: you'll spend a lot of your time in machine learning and working with neural networks reshaping data (in the form of tensors) to prepare it to be used with various operations (such as feeding it to a model).

### The dot product

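Multiplying matrices by each other is also often referred to as the dot product, because each element of the result is the dot product of a row of the first matrix and a column of the second.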

You can perform the `tf.matmul()` operation using [`tf.tensordot()`](https://www.tensorflow.org/api_docs/python/tf/tensordot). 

In [72]:
help(tf.tensordot)

Help on function tensordot in module tensorflow.python.ops.math_ops:

tensordot(a, b, axes, name=None)
    Tensor contraction of a and b along specified axes and outer product.
    
    Tensordot (also known as tensor contraction) sums the product of elements
    from `a` and `b` over the indices specified by `axes`.
    
    This operation corresponds to `numpy.tensordot(a, b, axes)`.
    
    Example 1: When `a` and `b` are matrices (order 2), the case `axes=1`
    is equivalent to matrix multiplication.
    
    Example 2: When `a` and `b` are matrices (order 2), the case
    `axes = [[1], [0]]` is equivalent to matrix multiplication.
    
    Example 3: When `a` and `b` are matrices (order 2), the case `axes=0` gives
    the outer product, a tensor of order 4.
    
    Example 4: Suppose that \\(a_{ijk}\\) and \\(b_{lmn}\\) represent two
    tensors of order 3. Then, `contract(a, b, [[0], [2]])` is the order 4 tensor
    \\(c_{jklm}\\) whose entry
    corresponding to the indices \

In [73]:
# Perform the dot product on X and Y (requires X to be transposed)
tf.tensordot(a=tf.transpose(a=X, perm=None, conjugate=False, name=None), 
             b=Y, 
             axes=1, 
             name=None)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 89,  98],
       [116, 128]], dtype=int32)>
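To see where a single value in that result comes from, here's a small sketch that computes the top-left entry (`89`) on its own. It recreates `X` and `Y` so it can run by itself, and the names `first_column_of_X`, `first_column_of_Y` and `vector_dot` are only illustrative.

```python
import tensorflow as tf

X = tf.constant([[1, 2], [3, 4], [5, 6]])
Y = tf.constant([[7, 8], [9, 10], [11, 12]])

# The first row of transpose(X) is the first column of X
first_column_of_X = X[:, 0]  # [1, 3, 5]
first_column_of_Y = Y[:, 0]  # [7, 9, 11]

# Dot product of two vectors: 1*7 + 3*9 + 5*11
vector_dot = tf.tensordot(first_column_of_X, first_column_of_Y, axes=1)

# The same number is the top-left entry of transpose(X) @ Y
matrix_product = tf.matmul(X, Y, transpose_a=True)

print(vector_dot.numpy())            # 89
print(matrix_product[0, 0].numpy())  # 89
```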

You might notice that although both `reshape` and `transpose` work, you get different results when using each.

Let's see an example, first with `tf.transpose()` then with `tf.reshape()`.

In [74]:
# Perform matrix multiplication between X and Y (transposed)
tf.matmul(a=X, 
          b=tf.transpose(a=Y, perm=None, conjugate=False, name=None), 
          transpose_a=False, 
          transpose_b=False, 
          adjoint_a=False, 
          adjoint_b=False, 
          a_is_sparse=False, 
          b_is_sparse=False, 
          name=None)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 23,  29,  35],
       [ 53,  67,  81],
       [ 83, 105, 127]], dtype=int32)>

In [75]:
# Perform matrix multiplication between X and Y (reshaped)
tf.matmul(a=X, 
          b=tf.reshape(tensor=Y, shape=(2, 3), name=None), 
          transpose_a=False, 
          transpose_b=False, 
          adjoint_a=False, 
          adjoint_b=False, 
          a_is_sparse=False, 
          b_is_sparse=False, 
          name=None)

<tf.Tensor: shape=(3, 3), dtype=int32, numpy=
array([[ 27,  30,  33],
       [ 61,  68,  75],
       [ 95, 106, 117]], dtype=int32)>

Hmm... they result in different values.

Which is strange because when dealing with `Y` (a `(3, 2)` matrix), reshaping it to `(2, 3)` and transposing it result in the same shape.

In [76]:
# Check shapes of Y, reshaped Y and transposed Y
Y.shape, tf.reshape(tensor=Y, shape=(2, 3), name=None).shape, tf.transpose(a=Y, perm=None, conjugate=False, name=None).shape

(TensorShape([3, 2]), TensorShape([2, 3]), TensorShape([2, 3]))

But calling `tf.reshape()` and `tf.transpose()` on `Y` doesn't necessarily result in the same values.

In [77]:
# Check values of Y, reshaped Y and transposed Y
print("Normal Y:")
print(Y, end="\n") # "\n" for newline

print("Y reshaped to (2, 3):")
print(tf.reshape(tensor=Y, shape=(2, 3), name=None), end="\n")

print("Y transposed:")
print(tf.transpose(a=Y, perm=None, conjugate=False, name=None))

Normal Y:
tf.Tensor(
[[ 7  8]
 [ 9 10]
 [11 12]], shape=(3, 2), dtype=int32)
Y reshaped to (2, 3):
tf.Tensor(
[[ 7  8  9]
 [10 11 12]], shape=(2, 3), dtype=int32)
Y transposed:
tf.Tensor(
[[ 7  9 11]
 [ 8 10 12]], shape=(2, 3), dtype=int32)


As you can see, the outputs of `tf.reshape()` and `tf.transpose()` when called on `Y`, even though they have the same shape, are different.

This can be explained by the default behaviour of each function:
* [`tf.reshape()`](https://www.tensorflow.org/api_docs/python/tf/reshape) - keeps the values in their original (row-by-row) order and regroups them into the new shape (in our case, 7, 8, 9, 10, 11, 12 are read off in order and split into two rows of three).
* [`tf.transpose()`](https://www.tensorflow.org/api_docs/python/tf/transpose) - swaps the axes of the tensor, so rows become columns. By default it reverses the order of all the axes, however the order can be changed using the [`perm` parameter](https://www.tensorflow.org/api_docs/python/tf/transpose).

So which should you use?

Again, most of the time these operations will be implemented for you when they need to be run (such as during the training of a neural network).

But generally, whenever performing a matrix multiplication and the shapes of two matrices don't line up, you will transpose (not reshape) one of them in order to line them up.
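If the difference still feels abstract, it can help to mimic both operations on a plain nested list. This is only a rough sketch of the idea in pure Python (it is not how TensorFlow implements either function):

```python
Y = [[7, 8],
     [9, 10],
     [11, 12]]  # shape (3, 2)

# "Reshape" to (2, 3): read the values off in row-by-row order, then regroup them
flat = [value for row in Y for value in row]
reshaped = [flat[i:i + 3] for i in range(0, len(flat), 3)]

# "Transpose": turn each column into a row
transposed = [list(column) for column in zip(*Y)]

print(reshaped)    # [[7, 8, 9], [10, 11, 12]]
print(transposed)  # [[7, 9, 11], [8, 10, 12]]
```

Transposing keeps each original row together as a column, which is why it's usually the right choice for lining up a matrix multiplication.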

### Matrix multiplication tidbits
* If we transposed `Y`, it would be represented as $\mathbf{Y}^\mathsf{T}$ (note the capital T for transpose).
* Get an illustrative view of matrix multiplication [by Math is Fun](https://www.mathsisfun.com/algebra/matrix-multiplying.html).
* Try a hands-on demo of matrix multiplication: http://matrixmultiplication.xyz/ (shown below).

![visual demo of matrix multiplication](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/00-matrix-multiply-crop.gif)

### Changing the datatype of a tensor

Sometimes you'll want to alter the default datatype of your tensor. 

This is common when you want to compute using less precision (e.g. 16-bit floating point numbers vs. 32-bit floating point numbers). 

Computing with less precision is useful on devices with less computing capacity such as mobile devices (because the fewer bits, the less space the computations require).

You can change the datatype of a tensor using [`tf.cast()`](https://www.tensorflow.org/api_docs/python/tf/cast).

In [78]:
help(tf.cast)

Help on function cast in module tensorflow.python.ops.math_ops:

cast(x, dtype, name=None)
    Casts a tensor to a new type.
    
    The operation casts `x` (in case of `Tensor`) or `x.values`
    (in case of `SparseTensor` or `IndexedSlices`) to `dtype`.
    
    For example:
    
    >>> x = tf.constant([1.8, 2.2], dtype=tf.float32)
    >>> tf.cast(x, tf.int32)
    <tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 2], dtype=int32)>
    
    Notice `tf.cast` has an alias `tf.dtypes.cast`:
    
    >>> x = tf.constant([1.8, 2.2], dtype=tf.float32)
    >>> tf.dtypes.cast(x, tf.int32)
    <tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 2], dtype=int32)>
    
    The operation supports data types (for `x` and `dtype`) of
    `uint8`, `uint16`, `uint32`, `uint64`, `int8`, `int16`, `int32`, `int64`,
    `float16`, `float32`, `float64`, `complex64`, `complex128`, `bfloat16`.
    In case of casting from complex types (`complex64`, `complex128`) to real
    types, only the real part o

In [79]:
# Create a new tensor with default datatype (float32)
B = tf.constant(value=[1.7, 7.4], 
                dtype=tf.float32, 
                shape=None,
                name='B')

# Create a new tensor with default datatype (int32)
C = tf.constant(value=[1, 7], 
                dtype=tf.int32,
                shape=None, 
                name='C')
B, C

(<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1.7, 7.4], dtype=float32)>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 7], dtype=int32)>)

In [80]:
# Change from float32 to float16 (reduced precision)
B = tf.cast(x=B, 
            dtype=tf.float16, 
            name=None)
B

<tf.Tensor: shape=(2,), dtype=float16, numpy=array([1.7, 7.4], dtype=float16)>

In [81]:
# Change from int32 to float32
C = tf.cast(x=C, 
            dtype=tf.float32, 
            name=None)
C

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1., 7.], dtype=float32)>
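Casting isn't free of side effects, so it's worth a quick check of what changes. The sketch below (with an illustrative `round_trip` variable) shows the storage size of each datatype, that casting down to `float16` and back doesn't recover the exact `float32` values, and that casting floats to integers drops the fractional part.

```python
import tensorflow as tf

B = tf.constant([1.7, 7.4], dtype=tf.float32)

# Number of bytes used per element by each datatype
print(tf.float32.size, tf.float16.size)  # 4 2

# Cast down to float16 and back up to float32, then compare with the original
round_trip = tf.cast(tf.cast(B, tf.float16), tf.float32)
print((B - round_trip).numpy())  # small non-zero differences

# Casting floats to integers drops the fractional part
truncated = tf.cast(tf.constant([1.8, 2.2]), tf.int32)
print(truncated.numpy())
```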

### Getting the absolute value
Sometimes you'll want the absolute values (the magnitude of each value, with any negative sign removed) of elements in your tensors.

To do so, you can use [`tf.abs()`](https://www.tensorflow.org/api_docs/python/tf/math/abs).

In [82]:
help(tf.abs)

Help on function abs in module tensorflow.python.ops.math_ops:

abs(x, name=None)
    Computes the absolute value of a tensor.
    
    Given a tensor of integer or floating-point values, this operation returns a
    tensor of the same type, where each element contains the absolute value of the
    corresponding element in the input.
    
    Given a tensor `x` of complex numbers, this operation returns a tensor of type
    `float32` or `float64` that is the absolute value of each element in `x`. For
    a complex number \\(a + bj\\), its absolute value is computed as
    \\(\sqrt{a^2 + b^2}\\).
    
    For example:
    
    >>> # real number
    >>> x = tf.constant([-2.25, 3.25])
    >>> tf.abs(x)
    <tf.Tensor: shape=(2,), dtype=float32,
    numpy=array([2.25, 3.25], dtype=float32)>
    
    >>> # complex number
    >>> x = tf.constant([[-2.25 + 4.75j], [-3.25 + 5.75j]])
    >>> tf.abs(x)
    <tf.Tensor: shape=(2, 1), dtype=float64, numpy=
    array([[5.25594901],
           [6.604

In [83]:
# Create tensor with negative values
D = tf.constant(value=[-7, -10], 
                dtype=tf.float32, 
                shape=None,
                name='D')
D

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([ -7., -10.], dtype=float32)>

In [84]:
# Get the absolute values
tf.abs(x=D, name=None)

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([ 7., 10.], dtype=float32)>

### Finding the min, max, mean, sum (aggregation)

You can quickly aggregate tensors (perform a calculation across a whole tensor) to find things like the minimum value, maximum value, mean and sum of all the elements.

To do so, aggregation functions typically have the syntax `reduce_[action]()`, such as:
* [`tf.reduce_min()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_min) - find the minimum value in a tensor.
* [`tf.reduce_max()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_max) - find the maximum value in a tensor (helpful for when you want to find the highest prediction probability).
* [`tf.reduce_mean()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_mean) - find the mean of all elements in a tensor.
* [`tf.reduce_sum()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_sum) - find the sum of all elements in a tensor.
* **Note:** typically, each of these is under the `math` module, e.g. `tf.math.reduce_min()` but you can use the alias `tf.reduce_min()`.

Let's see them in action.

In [85]:
# Create a tensor with 50 random integers from 0 to 99 (`high` is exclusive)
E = tf.constant(value=np.random.randint(low=0, high=100, size=50), 
                dtype=None, 
                shape=None,
                name='E')
E

<tf.Tensor: shape=(50,), dtype=int64, numpy=
array([64, 10, 45, 26,  7, 27, 32, 90,  1, 95, 71, 68,  8,  4, 27, 84, 46,
       43, 78, 20, 72, 93, 81, 99, 99, 90, 57, 54, 90, 73, 96, 55, 56,  2,
       80, 31, 79, 68,  2, 96, 57,  0, 96, 77,  2, 40, 81, 93, 35, 51])>

In [86]:
help(tf.reduce_min)

Help on function reduce_min in module tensorflow.python.ops.math_ops:

reduce_min(input_tensor, axis=None, keepdims=False, name=None)
    Computes the `tf.math.minimum` of elements across dimensions of a tensor.
    
    This is the reduction operation for the elementwise `tf.math.minimum` op.
    
    Reduces `input_tensor` along the dimensions given in `axis`.
    Unless `keepdims` is true, the rank of the tensor is reduced by 1 for each
    of the entries in `axis`, which must be unique. If `keepdims` is true, the
    reduced dimensions are retained with length 1.
    
    If `axis` is None, all dimensions are reduced, and a
    tensor with a single element is returned.
    
    For example:
    
    >>> a = tf.constant([
    ...   [[1, 2], [3, 4]],
    ...   [[1, 2], [3, 4]]
    ... ])
    >>> tf.reduce_min(a)
    <tf.Tensor: shape=(), dtype=int32, numpy=1>
    
    Choosing a specific axis returns minimum element in the given axis:
    
    >>> b = tf.constant([[1, 2, 3], [4, 5, 6

In [87]:
# Find the minimum
tf.reduce_min(input_tensor=E, 
              axis=None, 
              keepdims=False, 
              name=None)

<tf.Tensor: shape=(), dtype=int64, numpy=0>

In [88]:
help(tf.reduce_max)

Help on function reduce_max in module tensorflow.python.ops.math_ops:

reduce_max(input_tensor, axis=None, keepdims=False, name=None)
    Computes `tf.math.maximum` of elements across dimensions of a tensor.
    
    This is the reduction operation for the elementwise `tf.math.maximum` op.
    
    Reduces `input_tensor` along the dimensions given in `axis`.
    Unless `keepdims` is true, the rank of the tensor is reduced by 1 for each
    of the entries in `axis`, which must be unique. If `keepdims` is true, the
    reduced dimensions are retained with length 1.
    
    If `axis` is None, all dimensions are reduced, and a
    tensor with a single element is returned.
    
    Usage example:
    
      >>> x = tf.constant([5, 1, 2, 4])
      >>> tf.reduce_max(x)
      <tf.Tensor: shape=(), dtype=int32, numpy=5>
      >>> x = tf.constant([-5, -1, -2, -4])
      >>> tf.reduce_max(x)
      <tf.Tensor: shape=(), dtype=int32, numpy=-1>
      >>> x = tf.constant([4, float('nan')])
      >>>

In [89]:
# Find the maximum
tf.reduce_max(input_tensor=E, 
              axis=None, 
              keepdims=False, 
              name=None)

<tf.Tensor: shape=(), dtype=int64, numpy=99>

In [90]:
help(tf.reduce_mean)

Help on function reduce_mean in module tensorflow.python.ops.math_ops:

reduce_mean(input_tensor, axis=None, keepdims=False, name=None)
    Computes the mean of elements across dimensions of a tensor.
    
    Reduces `input_tensor` along the dimensions given in `axis` by computing the
    mean of elements across the dimensions in `axis`.
    Unless `keepdims` is true, the rank of the tensor is reduced by 1 for each
    of the entries in `axis`, which must be unique. If `keepdims` is true, the
    reduced dimensions are retained with length 1.
    
    If `axis` is None, all dimensions are reduced, and a tensor with a single
    element is returned.
    
    For example:
    
    >>> x = tf.constant([[1., 1.], [2., 2.]])
    >>> tf.reduce_mean(x)
    <tf.Tensor: shape=(), dtype=float32, numpy=1.5>
    >>> tf.reduce_mean(x, 0)
    <tf.Tensor: shape=(2,), dtype=float32, numpy=array([1.5, 1.5], dtype=float32)>
    >>> tf.reduce_mean(x, 1)
    <tf.Tensor: shape=(2,), dtype=float32, numpy=a

In [91]:
# Find the mean
tf.reduce_mean(input_tensor=E, 
               axis=None, 
               keepdims=False, 
               name=None)

<tf.Tensor: shape=(), dtype=int64, numpy=55>
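Notice the mean above comes back as a whole number with `dtype=int64`. `tf.reduce_mean()` keeps the dtype of its input, so the mean of an integer tensor loses its fractional part. Here's a minimal sketch of the difference, using a small made-up tensor called `ints` (an assumption for illustration, it isn't used elsewhere in this notebook):

```python
import tensorflow as tf

# Hypothetical integer tensor whose true mean is 0.5
ints = tf.constant([1, 0, 1, 0])

# The result keeps the integer dtype, so the fractional part is lost
int_mean = tf.reduce_mean(ints)

# Casting to float first keeps the fractional part
float_mean = tf.reduce_mean(tf.cast(ints, dtype=tf.float32))

print(int_mean.numpy(), float_mean.numpy())  # 0 0.5
```

If you need an exact mean of integer data, casting with `tf.cast()` before reducing is one simple option.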

In [92]:
help(tf.reduce_sum)

Help on function reduce_sum in module tensorflow.python.ops.math_ops:

reduce_sum(input_tensor, axis=None, keepdims=False, name=None)
    Computes the sum of elements across dimensions of a tensor.
    
    This is the reduction operation for the elementwise `tf.math.add` op.
    
    Reduces `input_tensor` along the dimensions given in `axis`.
    Unless `keepdims` is true, the rank of the tensor is reduced by 1 for each
    of the entries in `axis`, which must be unique. If `keepdims` is true, the
    reduced dimensions are retained with length 1.
    
    If `axis` is None, all dimensions are reduced, and a
    tensor with a single element is returned.
    
    For example:
    
      >>> # x has a shape of (2, 3) (two rows and three columns):
      >>> x = tf.constant([[1, 1, 1], [1, 1, 1]])
      >>> x.numpy()
      array([[1, 1, 1],
             [1, 1, 1]], dtype=int32)
      >>> # sum all the elements
      >>> # 1 + 1 + 1 + 1 + 1+ 1 = 6
      >>> tf.reduce_sum(x).numpy()
      

In [93]:
# Find the sum
tf.reduce_sum(input_tensor=E, 
              axis=None, 
              keepdims=False, 
              name=None)

<tf.Tensor: shape=(), dtype=int64, numpy=2751>
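The `axis` and `keepdims` parameters described in the help text above are easier to see on a small example. The sketch below uses a made-up 2x3 tensor `x` (an assumption for illustration, not one of the notebook's tensors) and reduces it in three different ways:

```python
import tensorflow as tf

# Hypothetical tensor with 2 rows and 3 columns
x = tf.constant([[1, 2, 3],
                 [4, 5, 6]])

# axis=None (default): reduce every dimension to a single value
total = tf.reduce_sum(x)

# axis=0: collapse the rows, leaving one sum per column
column_sums = tf.reduce_sum(x, axis=0)

# axis=1 with keepdims=True: one sum per row, reduced dimension kept with length 1
row_sums = tf.reduce_sum(x, axis=1, keepdims=True)

print(total.numpy())        # 21
print(column_sums.numpy())  # [5 7 9]
print(row_sums.shape)       # (2, 1)
```

The same `axis` and `keepdims` arguments work for `tf.reduce_min()`, `tf.reduce_max()` and `tf.reduce_mean()`.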

You can also find the standard deviation ([`tf.math.reduce_std()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_std)) and variance ([`tf.math.reduce_variance()`](https://www.tensorflow.org/api_docs/python/tf/math/reduce_variance)) of elements in a tensor using similar methods (**note:** both live in the `tf.math` module and expect float or complex inputs, so cast integer tensors with `tf.cast()` first).
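As a rough sketch of the standard deviation and variance functions, the example below uses a small made-up integer tensor `values` (an assumption for illustration) and casts it to `float32` before reducing, since these two functions expect float or complex inputs:

```python
import tensorflow as tf

# Hypothetical integer tensor with mean 5
values = tf.constant([2, 4, 4, 4, 5, 5, 7, 9])

# Cast to float first, then reduce
values_float = tf.cast(values, dtype=tf.float32)
variance = tf.math.reduce_variance(values_float)
std = tf.math.reduce_std(values_float)

print(variance.numpy(), std.numpy())  # 4.0 2.0
```

Both functions accept the same `axis` and `keepdims` arguments as the other reduction functions.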

### Finding the positional maximum and minimum

How about finding the position in a tensor where the maximum value occurs?

This is helpful when you want to line up your labels (say `['Green', 'Blue', 'Red']`) with your prediction probabilities tensor (e.g. `[0.98, 0.01, 0.01]`).

In this case, the predicted label (the one with the highest prediction probability) would be `'Green'`.

You can find the position of the maximum (or, if required, the minimum) with the following:
* [`tf.argmax()`](https://www.tensorflow.org/api_docs/python/tf/math/argmax) - find the position of the maximum element in a given tensor.
* [`tf.argmin()`](https://www.tensorflow.org/api_docs/python/tf/math/argmin) - find the position of the minimum element in a given tensor.
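The label example above can be sketched in a few lines. The names `labels` and `pred_probs` are assumptions for illustration, with the values taken from the text:

```python
import tensorflow as tf

# Labels and prediction probabilities from the example in the text
labels = ['Green', 'Blue', 'Red']
pred_probs = tf.constant([0.98, 0.01, 0.01])

# Position of the highest probability
pred_index = tf.argmax(pred_probs).numpy()

# Use the position to look up the matching label
predicted_label = labels[pred_index]
print(predicted_label)  # Green
```

Swapping `tf.argmax()` for `tf.argmin()` would return the position of the least likely label instead.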

In [94]:
# Create a tensor with 50 values between 0 and 1
F = tf.constant(value=np.random.random(size=50), 
                dtype=None, 
                shape=None,
                name='F')
F

<tf.Tensor: shape=(50,), dtype=float64, numpy=
array([0.46703722, 0.30769016, 0.49587373, 0.33098161, 0.86840723,
       0.01577022, 0.21208407, 0.14066191, 0.60157283, 0.95039157,
       0.16855741, 0.62989022, 0.65358194, 0.96354637, 0.11921644,
       0.30905916, 0.55257737, 0.23852791, 0.23095991, 0.95104391,
       0.81443307, 0.60805415, 0.17777842, 0.47697977, 0.36780903,
       0.58869448, 0.84112553, 0.18924058, 0.56969802, 0.89971207,
       0.39067203, 0.10505641, 0.42397971, 0.87468515, 0.64692981,
       0.42156974, 0.12415688, 0.39777026, 0.16953324, 0.17180182,
       0.97093791, 0.53140518, 0.70695798, 0.36147223, 0.86551351,
       0.17990564, 0.83544696, 0.99521514, 0.33659175, 0.25469952])>

In [95]:
help(tf.argmax)

Help on function argmax_v2 in module tensorflow.python.ops.math_ops:

argmax_v2(input, axis=None, output_type=tf.int64, name=None)
    Returns the index with the largest value across axes of a tensor.
    
    In case of identity returns the smallest index.
    
    For example:
    
    >>> A = tf.constant([2, 20, 30, 3, 6])
    >>> tf.math.argmax(A)  # A[2] is maximum in tensor A
    <tf.Tensor: shape=(), dtype=int64, numpy=2>
    >>> B = tf.constant([[2, 20, 30, 3, 6], [3, 11, 16, 1, 8],
    ...                  [14, 45, 23, 5, 27]])
    >>> tf.math.argmax(B, 0)
    <tf.Tensor: shape=(5,), dtype=int64, numpy=array([2, 2, 0, 2, 2])>
    >>> tf.math.argmax(B, 1)
    <tf.Tensor: shape=(3,), dtype=int64, numpy=array([2, 2, 1])>
    >>> C = tf.constant([0, 0, 0, 0])
    >>> tf.math.argmax(C) # Returns smallest index in case of ties
    <tf.Tensor: shape=(), dtype=int64, numpy=0>
    
    Args:
      input: A `Tensor`.
      axis: An integer, the axis to reduce across. Default to 0.
     

In [96]:
# Find the maximum element position of F
tf.argmax(input=F, 
          axis=None, 
          output_type=tf.int64, 
          name=None)

<tf.Tensor: shape=(), dtype=int64, numpy=47>

In [97]:
# Find the minimum element position of F
tf.argmin(input=F, 
          axis=None, 
          output_type=tf.int64, 
          name=None)

<tf.Tensor: shape=(), dtype=int64, numpy=5>

In [98]:
# Find the maximum element position of F
print(f"The maximum value of F is at position: {tf.argmax(input=F, axis=None, output_type=None, name=None).numpy()}") 
print(f"The maximum value of F is: {tf.reduce_max(input_tensor=F, axis=None, keepdims=False, name=None).numpy()}") 
print(f"Using tf.argmax() to index F, the maximum value of F is: {F[tf.argmax(input=F, axis=None, output_type=None, name=None)].numpy()}")
print(f"Are the two max values the same (they should be)? {F[tf.argmax(input=F, axis=None, output_type=None, name=None)].numpy() == tf.reduce_max(input_tensor=F, axis=None, keepdims=False, name=None).numpy()}")

The maximum value of F is at position: 47
The maximum value of F is: 0.9952151418596201
Using tf.argmax() to index F, the maximum value of F is: 0.9952151418596201
Are the two max values the same (they should be)? True


### Squeezing a tensor (removing all single dimensions)

If you need to remove single-dimensions from a tensor (dimensions with size 1), you can use `tf.squeeze()`.

* [`tf.squeeze()`](https://www.tensorflow.org/api_docs/python/tf/squeeze) - remove all dimensions of size 1 from a tensor.


In [99]:
help(tf.squeeze)

Help on function squeeze_v2 in module tensorflow.python.ops.array_ops:

squeeze_v2(input, axis=None, name=None)
    Removes dimensions of size 1 from the shape of a tensor.
    
    Given a tensor `input`, this operation returns a tensor of the same type with
    all dimensions of size 1 removed. If you don't want to remove all size 1
    dimensions, you can remove specific size 1 dimensions by specifying
    `axis`.
    
    For example:
    
    ```python
    # 't' is a tensor of shape [1, 2, 1, 3, 1, 1]
    tf.shape(tf.squeeze(t))  # [2, 3]
    ```
    
    Or, to remove specific size 1 dimensions:
    
    ```python
    # 't' is a tensor of shape [1, 2, 1, 3, 1, 1]
    tf.shape(tf.squeeze(t, [2, 4]))  # [1, 2, 3, 1]
    ```
    
    Unlike the older op `tf.compat.v1.squeeze`, this op does not accept a
    deprecated `squeeze_dims` argument.
    
    Note: if `input` is a `tf.RaggedTensor`, then this operation takes `O(N)`
    time, where `N` is the number of elements in the squeeze

In [100]:
# Create a rank 5 (5 dimensions) tensor of 50 numbers between 0 and 100
G = tf.constant(value=np.random.randint(0, 100, 50), 
                shape=(1, 1, 1, 1, 50), 
                dtype=None, 
                name='G')
G.shape, G.ndim

(TensorShape([1, 1, 1, 1, 50]), 5)

In [101]:
# Squeeze tensor G (remove all 1 dimensions)
G_squeezed = tf.squeeze(input=G, 
                        axis=None, 
                        name='G_squeezed')
G_squeezed.shape, G_squeezed.ndim

(TensorShape([50]), 1)
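If you only want to remove some of the size 1 dimensions, you can pass `axis`, as the help text mentions. A minimal sketch with a made-up tensor `t` (an assumption for illustration), also showing `tf.expand_dims()` as one way to add a size 1 dimension back:

```python
import tensorflow as tf

# Hypothetical tensor with two size 1 dimensions
t = tf.zeros(shape=(1, 2, 1, 3))

# Remove every size 1 dimension
all_squeezed = tf.squeeze(t)

# Remove only the first dimension
first_only = tf.squeeze(t, axis=0)

# Add a size 1 dimension back at the front
restored = tf.expand_dims(all_squeezed, axis=0)

print(all_squeezed.shape, first_only.shape, restored.shape)  # (2, 3) (2, 1, 3) (1, 2, 3)
```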

### One-hot encoding

If you have a tensor of indices and would like to one-hot encode it, you can use [`tf.one_hot()`](https://www.tensorflow.org/api_docs/python/tf/one_hot).

You also need to specify the `depth` parameter (the length of each one-hot vector, usually the number of classes you're encoding).

In [102]:
help(tf.one_hot)

Help on function one_hot in module tensorflow.python.ops.array_ops:

one_hot(indices, depth, on_value=None, off_value=None, axis=None, dtype=None, name=None)
    Returns a one-hot tensor.
    
    See also `tf.fill`, `tf.eye`.
    
    The locations represented by indices in `indices` take value `on_value`,
    while all other locations take value `off_value`.
    
    `on_value` and `off_value` must have matching data types. If `dtype` is also
    provided, they must be the same data type as specified by `dtype`.
    
    If `on_value` is not provided, it will default to the value `1` with type
    `dtype`
    
    If `off_value` is not provided, it will default to the value `0` with type
    `dtype`
    
    If the input `indices` is rank `N`, the output will have rank `N+1`. The
    new axis is created at dimension `axis` (default: the new axis is appended
    at the end).
    
    If `indices` is a scalar the output shape will be a vector of length `depth`
    
    If `indices` is 

In [103]:
# Create a list of indices
some_list = [0, 1, 2, 3]

# One hot encode them
tf.one_hot(indices=some_list, 
           depth=4, 
           on_value=None, 
           off_value=None, 
           axis=None, 
           dtype=None, 
           name=None)

<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]], dtype=float32)>
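One thing worth knowing about `depth`: any index outside the range `0` to `depth - 1` is encoded as a row of all `off_value` (all zeros by default). A small sketch, using made-up indices (an assumption for illustration):

```python
import tensorflow as tf

# The index 5 is outside the valid range for depth=3 (valid indices: 0, 1, 2)
encoded = tf.one_hot(indices=[0, 1, 5], depth=3)

print(encoded.numpy())
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 0.]]
```

If you see an all-zero row after one-hot encoding, checking `depth` against your largest index is a good first step.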

You can also specify values for `on_value` and `off_value` instead of the defaults `1` and `0`.

In [104]:
# Specify custom values for on and off encoding
tf.one_hot(indices=some_list, 
           depth=4, 
           on_value="We're live!", 
           off_value="Offline", 
           dtype=tf.string, 
           name=None)

<tf.Tensor: shape=(4, 4), dtype=string, numpy=
array([[b"We're live!", b'Offline', b'Offline', b'Offline'],
       [b'Offline', b"We're live!", b'Offline', b'Offline'],
       [b'Offline', b'Offline', b"We're live!", b'Offline'],
       [b'Offline', b'Offline', b'Offline', b"We're live!"]], dtype=object)>

### Squaring, log, square root

Many other common mathematical operations you might want to perform at some stage probably already exist in TensorFlow.

Let's take a look at:
* [`tf.square()`](https://www.tensorflow.org/api_docs/python/tf/math/square) - get the square of every value in a tensor. 
* [`tf.sqrt()`](https://www.tensorflow.org/api_docs/python/tf/math/sqrt) - get the square root of every value in a tensor (**note:** the elements need to be floats or this will error).
* [`tf.math.log()`](https://www.tensorflow.org/api_docs/python/tf/math/log) - get the natural log of every value in a tensor (elements need to be floats).

In [105]:
# Create a new tensor
H = tf.constant(value=np.arange(start=1, stop=10), 
                dtype=None, 
                shape=None,
                name='H')
H

<tf.Tensor: shape=(9,), dtype=int64, numpy=array([1, 2, 3, 4, 5, 6, 7, 8, 9])>

In [106]:
help(tf.square)

Help on function square in module tensorflow.python.ops.gen_math_ops:

square(x, name=None)
    Computes square of x element-wise.
    
    I.e., \\(y = x * x = x^2\\).
    
    >>> tf.math.square([-2., 0., 3.])
    <tf.Tensor: shape=(3,), dtype=float32, numpy=array([4., 0., 9.], dtype=float32)>
    
    Args:
      x: A `Tensor`. Must be one of the following types: `bfloat16`, `half`, `float32`, `float64`, `int8`, `int16`, `int32`, `int64`, `uint8`, `uint16`, `uint32`, `uint64`, `complex64`, `complex128`.
      name: A name for the operation (optional).
    
    Returns:
      A `Tensor`. Has the same type as `x`.
    
      If `x` is a `SparseTensor`, returns
      `SparseTensor(x.indices, tf.math.square(x.values, ...), x.dense_shape)`



In [107]:
# Square it
tf.square(x=H, name=None)

<tf.Tensor: shape=(9,), dtype=int64, numpy=array([ 1,  4,  9, 16, 25, 36, 49, 64, 81])>

In [108]:
help(tf.sqrt)

Help on function sqrt in module tensorflow.python.ops.math_ops:

sqrt(x, name=None)
    Computes element-wise square root of the input tensor.
    
    Note: This operation does not support integer types.
    
    >>> x = tf.constant([[4.0], [16.0]])
    >>> tf.sqrt(x)
    <tf.Tensor: shape=(2, 1), dtype=float32, numpy=
      array([[2.],
             [4.]], dtype=float32)>
    >>> y = tf.constant([[-4.0], [16.0]])
    >>> tf.sqrt(y)
    <tf.Tensor: shape=(2, 1), dtype=float32, numpy=
      array([[nan],
             [ 4.]], dtype=float32)>
    >>> z = tf.constant([[-1.0], [16.0]], dtype=tf.complex128)
    >>> tf.sqrt(z)
    <tf.Tensor: shape=(2, 1), dtype=complex128, numpy=
      array([[0.0+1.j],
             [4.0+0.j]])>
    
    Note: In order to support complex type, please provide an input tensor
    of `complex64` or `complex128`.
    
    Args:
      x: A `tf.Tensor` of type `bfloat16`, `half`, `float32`, `float64`,
        `complex64`, `complex128`
      name: A name for the o

In [110]:
# Find the square root (will error), needs to be non-integer
try:
    tf.sqrt(x=H, name=None)
except tf.errors.InvalidArgumentError as e:
    print(f"Error: {e}")

Error: Value for attr 'T' of int64 is not in the list of allowed values: bfloat16, half, float, double, complex64, complex128
	; NodeDef: {{node Sqrt}}; Op<name=Sqrt; signature=x:T -> y:T; attr=T:type,allowed=[DT_BFLOAT16, DT_HALF, DT_FLOAT, DT_DOUBLE, DT_COMPLEX64, DT_COMPLEX128]> [Op:Sqrt]


In [111]:
# Change H to float32
H = tf.cast(x=H, dtype=tf.float32, name='H')
H

<tf.Tensor: shape=(9,), dtype=float32, numpy=array([1., 2., 3., 4., 5., 6., 7., 8., 9.], dtype=float32)>

In [112]:
# Find the square root
tf.sqrt(x=H, name=None)

<tf.Tensor: shape=(9,), dtype=float32, numpy=
array([1.       , 1.4142135, 1.7320508, 2.       , 2.2360678, 2.4494896,
       2.6457512, 2.828427 , 3.       ], dtype=float32)>

In [113]:
help(tf.math.log)

Help on function log in module tensorflow.python.ops.gen_math_ops:

log(x, name=None)
    Computes natural logarithm of x element-wise.
    
    I.e., \\(y = \log_e x\\).
    
    Example:
    >>> x = tf.constant([0, 0.5, 1, 5])
    >>> tf.math.log(x)
    <tf.Tensor: shape=(4,), dtype=float32, numpy=array([      -inf, -0.6931472,  0.       ,  1.609438 ], dtype=float32)>
    
    See: https://en.wikipedia.org/wiki/Logarithm
    
    Args:
      x: A `Tensor`. Must be one of the following types: `bfloat16`, `half`, `float32`, `float64`, `complex64`, `complex128`.
      name: A name for the operation (optional).
    
    Returns:
      A `Tensor`. Has the same type as `x`.



In [114]:
# Find the log (input also needs to be float)
tf.math.log(x=H, name=None)

<tf.Tensor: shape=(9,), dtype=float32, numpy=
array([0.       , 0.6931472, 1.0986123, 1.3862944, 1.609438 , 1.7917595,
       1.9459102, 2.0794415, 2.1972246], dtype=float32)>

### Manipulating `tf.Variable` tensors

Tensors created with `tf.Variable()` can be changed in place using methods such as:

* [`.assign()`](https://www.tensorflow.org/api_docs/python/tf/Variable#assign) - replace the values of a variable tensor with new values (of the same shape and dtype).
* [`.assign_add()`](https://www.tensorflow.org/api_docs/python/tf/Variable#assign_add) - add to the existing values of a variable tensor and store the result back in it.


In [115]:
help(tf.Variable)

Help on class Variable in module tensorflow.python.ops.variables:

class Variable(tensorflow.python.trackable.base.Trackable)
 |  Variable(*args, **kwargs)
 |  
 |  See the [variable guide](https://tensorflow.org/guide/variable).
 |  
 |  A variable maintains shared, persistent state manipulated by a program.
 |  
 |  The `Variable()` constructor requires an initial value for the variable, which
 |  can be a `Tensor` of any type and shape. This initial value defines the type
 |  and shape of the variable. After construction, the type and shape of the
 |  variable are fixed. The value can be changed using one of the assign methods.
 |  
 |  >>> v = tf.Variable(1.)
 |  >>> v.assign(2.)
 |  <tf.Variable ... shape=() dtype=float32, numpy=2.0>
 |  >>> v.assign_add(0.5)
 |  <tf.Variable ... shape=() dtype=float32, numpy=2.5>
 |  
 |  The `shape` argument to `Variable`'s constructor allows you to construct a
 |  variable with a less defined shape than its `initial_value`:
 |  
 |  >>> v = tf.

In [116]:
# Create a variable tensor
I = tf.Variable(initial_value=np.arange(start=0, stop=5), 
                trainable=True, 
                validate_shape=True, 
                caching_device=None, 
                name='I', 
                variable_def=None, 
                dtype=tf.int32, 
                import_scope=None,
                constraint=None,
                synchronization=tf.VariableSynchronization.AUTO,
                experimental_enable_variable_lifting=True)
I

<tf.Variable 'I:0' shape=(5,) dtype=int32, numpy=array([0, 1, 2, 3, 4], dtype=int32)>

In [117]:
help(I.assign)

Help on method assign in module tensorflow.python.ops.resource_variable_ops:

assign(value, use_locking=None, name=None, read_value=True) method of tensorflow.python.ops.resource_variable_ops.ResourceVariable instance
    Assigns a new value to this variable.
    
    Args:
      value: A `Tensor`. The new value for this variable.
      use_locking: If `True`, use locking during the assignment.
      name: The name to use for the assignment.
      read_value: A `bool`. Whether to read and return the new value of the
        variable or not.
    
    Returns:
      If `read_value` is `True`, this method will return the new value of the
      variable after the assignment has completed. Otherwise, when in graph mode
      it will return the `Operation` that does the assignment, and when in eager
      mode it will return `None`.



In [118]:
# Assign new values to I (the final value changes from 4 to 50)
I.assign(value=[0, 1, 2, 3, 50], 
         use_locking=None, 
         name=None, 
         read_value=True)

<tf.Variable 'UnreadVariable' shape=(5,) dtype=int32, numpy=array([ 0,  1,  2,  3, 50], dtype=int32)>

In [119]:
# The change happens in place (the last value is now 50, not 4)
I

<tf.Variable 'I:0' shape=(5,) dtype=int32, numpy=array([ 0,  1,  2,  3, 50], dtype=int32)>

In [120]:
help(I.assign_add)

Help on method assign_add in module tensorflow.python.ops.resource_variable_ops:

assign_add(delta, use_locking=None, name=None, read_value=True) method of tensorflow.python.ops.resource_variable_ops.ResourceVariable instance
    Adds a value to this variable.
    
    Args:
      delta: A `Tensor`. The value to add to this variable.
      use_locking: If `True`, use locking during the operation.
      name: The name to use for the operation.
      read_value: A `bool`. Whether to read and return the new value of the
        variable or not.
    
    Returns:
      If `read_value` is `True`, this method will return the new value of the
      variable after the assignment has completed. Otherwise, when in graph mode
      it will return the `Operation` that does the assignment, and when in eager
      mode it will return `None`.



In [121]:
# Add 10 to every element in I
I.assign_add(delta=[10, 10, 10, 10, 10], 
             use_locking=None, 
             name=None, 
             read_value=True)

<tf.Variable 'UnreadVariable' shape=(5,) dtype=int32, numpy=array([10, 11, 12, 13, 60], dtype=int32)>

In [122]:
# Again, the change happens in place
I

<tf.Variable 'I:0' shape=(5,) dtype=int32, numpy=array([10, 11, 12, 13, 60], dtype=int32)>
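The examples above pass a full set of values each time. If you only want to change one element, one option is to index the variable first and call `.assign()` on the slice. The sketch below uses a separate made-up variable `V` (an assumption for illustration, so `I` is left untouched) and also shows `.assign_sub()`, the subtraction counterpart of `.assign_add()`:

```python
import tensorflow as tf

# Hypothetical variable tensor, separate from I
V = tf.Variable([0, 1, 2, 3, 4], dtype=tf.int32)

# Change only the last element by assigning to a slice
V[4].assign(50)
after_slice_assign = V.numpy().tolist()

# Subtract 1 from every element in place
V.assign_sub([1, 1, 1, 1, 1])
after_assign_sub = V.numpy().tolist()

print(after_slice_assign)  # [0, 1, 2, 3, 50]
print(after_assign_sub)    # [-1, 0, 1, 2, 49]
```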

## Tensors and NumPy

We've seen some examples of tensors interacting with NumPy arrays, such as using NumPy arrays to create tensors.

Tensors can also be converted to NumPy arrays using:

* `np.array()` - pass a tensor to convert to an ndarray (NumPy's main datatype).
* `tensor.numpy()` - call on a tensor to convert to an ndarray.

Doing this is helpful because it gives you a regular NumPy array, which means you can use any of NumPy's functions and methods on the values.

In [123]:
# Create a tensor from a NumPy array
J = tf.constant(value=np.array([3., 7., 10.]), 
                dtype=None, 
                shape=None,
                name='J')
J

<tf.Tensor: shape=(3,), dtype=float64, numpy=array([ 3.,  7., 10.])>

In [124]:
# Convert tensor J to NumPy with np.array()
np.array(J), type(np.array(J))

(array([ 3.,  7., 10.]), numpy.ndarray)

In [125]:
help(J.numpy)

Help on method numpy in module tensorflow.python.framework.ops:

numpy() method of tensorflow.python.framework.ops.EagerTensor instance
    Copy of the contents of this Tensor into a NumPy array or scalar.
    
    Unlike NumPy arrays, Tensors are immutable, so this method has to copy
    the contents to ensure safety. Use `memoryview` to get a readonly
    view of the contents without doing a copy:
    
    >>> t = tf.constant([42])
    >>> np.array(memoryview(t))
    array([42], dtype=int32)
    
    Note that `memoryview` is only zero-copy for Tensors on CPU. If a Tensor
    is on GPU, it will have to be transferred to CPU first in order for
    `memoryview` to work.
    
    Returns:
      A NumPy array of the same shape and dtype or a NumPy scalar, if this
      Tensor has rank 0.
    
    Raises:
      ValueError: If the dtype of this Tensor does not have a compatible
        NumPy dtype.



In [126]:
# Convert tensor J to NumPy with .numpy()
J.numpy(), type(J.numpy())

(array([ 3.,  7., 10.]), numpy.ndarray)

By default, tensors created from Python floats have `dtype=float32`, whereas NumPy arrays created from floats have `dtype=float64`.

This is because neural networks (which are usually built with TensorFlow) can generally work very well with less precision (32-bit rather than 64-bit).

In [127]:
# Create one tensor from a NumPy array and one from a Python list
# NOTE: Will be float64 (due to NumPy)
numpy_J = tf.constant(value=np.array([3., 7., 10.]), dtype=None, shape=None, name='numpy_J')
# NOTE: Will be float32 (TensorFlow's default for Python floats)
tensor_J = tf.constant(value=[3., 7., 10.], dtype=None, shape=None, name=None)
numpy_J.dtype, tensor_J.dtype

(tf.float64, tf.float32)
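If you'd like a tensor created from a NumPy array to use `float32` instead, two simple options are passing `dtype` when creating it or casting afterwards. A minimal sketch (the names `arr`, `from_dtype` and `from_cast` are assumptions for illustration):

```python
import numpy as np
import tensorflow as tf

# NumPy float arrays are float64 by default
arr = np.array([3., 7., 10.])

# Option 1: set the dtype when creating the tensor
from_dtype = tf.constant(arr, dtype=tf.float32)

# Option 2: create the tensor first, then cast it
from_cast = tf.cast(tf.constant(arr), dtype=tf.float32)

print(arr.dtype, from_dtype.dtype, from_cast.dtype)
```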

## Using `@tf.function`

In your TensorFlow adventures, you might come across Python functions which have the decorator [`@tf.function`](https://www.tensorflow.org/api_docs/python/tf/function).

If you aren't sure what Python decorators do, [read RealPython's guide on them](https://realpython.com/primer-on-python-decorators/).

But in short, decorators modify a function in one way or another.

In the `@tf.function` decorator case, it turns a Python function into a callable TensorFlow graph. Which is a fancy way of saying, if you've written your own Python function and you decorate it with `@tf.function`, TensorFlow will attempt to convert it into a fast(er) version of itself (by making it part of a computation graph). A graph like this can also be exported to potentially run on another device.

For more on this, read the [Better performance with tf.function](https://www.tensorflow.org/guide/function) guide.

In [128]:
# Create a simple function
def function(x, y):
  return x ** 2 + y

x = tf.constant(value=np.arange(0, 10), dtype=None, shape=None, name='x')
y = tf.constant(value=np.arange(10, 20), dtype=None, shape=None, name='y')
function(x, y)

<tf.Tensor: shape=(10,), dtype=int64, numpy=array([ 10,  12,  16,  22,  30,  40,  52,  66,  82, 100])>

In [129]:
# Create the same function and decorate it with tf.function
@tf.function
def tf_function(x, y):
  return x ** 2 + y

tf_function(x, y)

<tf.Tensor: shape=(10,), dtype=int64, numpy=array([ 10,  12,  16,  22,  30,  40,  52,  66,  82, 100])>

If you noticed no difference between the above two functions (the decorated one and the non-decorated one) you'd be right.

Much of the difference happens behind the scenes, one of the main ones being potential code speed-ups.
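One way to peek behind the scenes is to put a plain Python side effect inside a decorated function. Python code like this runs while TensorFlow traces the function into a graph, not every time the graph is executed. The sketch below is an illustration under that assumption, with made-up names (`trace_log`, `traced_function`):

```python
import tensorflow as tf

# Plain Python list used to record when the Python body actually runs
trace_log = []

@tf.function
def traced_function(x, y):
  trace_log.append("traced")  # Python side effect: runs during tracing only
  return x ** 2 + y

a = tf.constant([0, 1, 2])
b = tf.constant([10, 11, 12])

first_result = traced_function(a, b)
traces_after_first_call = len(trace_log)

# Same input shapes and dtypes: the existing graph is reused
second_result = traced_function(a, b)

print(first_result.numpy())                       # [10 12 16]
print(len(trace_log) == traces_after_first_call)  # True
```

The second call returns the same values but doesn't add to `trace_log`, which suggests the Python body wasn't run again and the graph was reused.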

## Finding access to GPUs

We've mentioned GPUs plenty of times throughout this notebook.

So how do you check if you've got one available?

You can check if you've got access to a GPU using [`tf.config.list_physical_devices()`](https://www.tensorflow.org/guide/gpu).

In [130]:
print(tf.config.list_physical_devices('GPU'))

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:3', device_type='GPU')]


If the above outputs an empty list (or nothing), it means you don't have access to a GPU (or at least TensorFlow can't find it).

If you're running in Google Colab, you can access a GPU by going to *Runtime -> Change Runtime Type -> Select GPU* (**note:** after doing this your notebook will restart and any variables you've saved will be lost).

Once you've changed your runtime type, run the cell below.

In [None]:
# import tensorflow as tf
# print(tf.config.list_physical_devices('GPU'))

If you've got access to a GPU, the cell above should output something like:

`[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]`

You can also find information about your GPU using `!nvidia-smi`.

In [131]:
!nvidia-smi

Tue Apr  4 15:40:57 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.89.02    Driver Version: 525.89.02    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA GeForce ...  Off  | 00000000:05:00.0  On |                  N/A |
| 41%   72C    P2    70W / 250W |   1298MiB / 11264MiB |      8%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:06:00.0 Off |  

> 🔑 **Note:** If you have access to a GPU, TensorFlow will automatically use it whenever possible.
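If you'd like to see (or control) where a tensor ends up, one option is `tf.device()` together with a tensor's `.device` attribute. The sketch below is a minimal example, not a required step. It falls back to the CPU when no GPU is found, and the names `device_name` and `result` are assumptions for illustration:

```python
import tensorflow as tf

# Pick the first GPU if TensorFlow can see one, otherwise use the CPU
gpus = tf.config.list_physical_devices('GPU')
device_name = '/GPU:0' if gpus else '/CPU:0'

# Operations inside this block are placed on the chosen device
with tf.device(device_name):
  x = tf.random.uniform(shape=(3, 3))
  result = tf.matmul(x, x)

# The device string shows where the result tensor lives
print(result.device)
```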

## 🛠 Exercises

1. Create a vector, scalar, matrix and tensor with values of your choosing using `tf.constant()`.
2. Find the shape, rank and size of the tensors you created in 1.
3. Create two tensors containing random values between 0 and 1 with shape `[5, 300]`.
4. Multiply the two tensors you created in 3 using matrix multiplication.
5. Multiply the two tensors you created in 3 using dot product.
6. Create a tensor with random values between 0 and 1 with shape `[224, 224, 3]`.
7. Find the min and max values of the tensor you created in 6.
8. Create a tensor with random values of shape `[1, 224, 224, 3]` then squeeze it to change the shape to `[224, 224, 3]`.
9. Create a tensor with shape `[10]` using your own choice of values, then find the index which has the maximum value.
10. One-hot encode the tensor you created in 9.

In [132]:
# 1: Create a scalar, vector, matrix and tensor with tf.constant
my_scalar = tf.constant(value=1, dtype=None, shape=None, name='my_scalar')
my_vector = tf.constant(value=[1, 2, 3], dtype=None, shape=None, name='my_vector')
my_matrix = tf.constant(value=[[1, 2, 3], [4, 5, 6]], dtype=None, shape=None, name='my_matrix')
my_tensor = tf.constant(value=tf.random.normal(shape=(3, 3, 3)), dtype=None, shape=None, name='my_tensor')

In [133]:
print("my_scalar:", end="\n")
print(my_scalar)
print("my_vector:", end="\n")
print(my_vector)
print("my_matrix:", end="\n")
print(my_matrix)
print("my_tensor:", end="\n")
print(my_tensor)

my_scalar:
tf.Tensor(1, shape=(), dtype=int32)
my_vector:
tf.Tensor([1 2 3], shape=(3,), dtype=int32)
my_matrix:
tf.Tensor(
[[1 2 3]
 [4 5 6]], shape=(2, 3), dtype=int32)
my_tensor:
tf.Tensor(
[[[ 8.4224582e-02 -8.6090374e-01  3.7812304e-01]
  [-5.1962738e-03 -4.9453196e-01  6.1781919e-01]
  [-3.3082047e-01 -1.3840806e-03 -4.2373410e-01]]

 [[-1.3872087e+00 -1.5488191e+00 -5.3198391e-01]
  [-4.4756433e-01 -2.0115814e+00 -5.7926011e-01]
  [ 5.7938927e-01  1.3041967e+00  6.7720258e-01]]

 [[-7.4587613e-01  1.0378964e+00  1.3820479e+00]
  [ 1.4319172e+00 -3.7643117e-01  9.8158473e-01]
  [-2.3597860e-01 -3.3763257e-01 -8.9593250e-01]]], shape=(3, 3, 3), dtype=float32)


In [134]:
# 2: Find the shape, rank and size of the tensors you created in 1.
print(f"name: my_scalar, shape: {my_scalar.shape}, rank: {tf.rank(input=my_scalar, name=None)}, size: {tf.size(input=my_scalar, name=None, out_type=tf.int32)}")
print(f"name: my_vector, shape: {my_vector.shape}, rank: {tf.rank(input=my_vector, name=None)}, size: {tf.size(input=my_vector, name=None, out_type=tf.int32)}")
print(f"name: my_matrix, shape: {my_matrix.shape}, rank: {tf.rank(input=my_matrix, name=None)}, size: {tf.size(input=my_matrix, name=None, out_type=tf.int32)}")
print(f"name: my_tensor, shape: {my_tensor.shape}, rank: {tf.rank(input=my_tensor, name=None)}, size: {tf.size(input=my_tensor, name=None, out_type=tf.int32)}")

name: my_scalar, shape: (), rank: 0, size: 1
name: my_vector, shape: (3,), rank: 1, size: 3
name: my_matrix, shape: (2, 3), rank: 2, size: 6
name: my_tensor, shape: (3, 3, 3), rank: 3, size: 27


In [135]:
# 3: Create two tensors containing random values between 0 and 1 with shape [5, 300]
random_gen = tf.random.Generator.from_seed(seed=42)
tensor_1 = random_gen.uniform(shape=(5, 300), minval=0, maxval=1, dtype=tf.float32, name='tensor_1')
tensor_2 = random_gen.uniform(shape=(5, 300), minval=0, maxval=1, dtype=tf.float32, name='tensor_2')
print("tensor_1:", end="\n")
print(tensor_1)
print("tensor_2:", end="\n")
print(tensor_2)

tensor_1:
tf.Tensor(
[[0.7493447  0.73561966 0.45230794 ... 0.5816356  0.5627874  0.7491298 ]
 [0.6438937  0.6938418  0.04408407 ... 0.04825139 0.5099728  0.26470542]
 [0.21373153 0.6683699  0.78474844 ... 0.19658887 0.22030771 0.3766911 ]
 [0.68190825 0.29304636 0.5415933  ... 0.37111604 0.76053166 0.7538099 ]
 [0.8011551  0.48830473 0.13867617 ... 0.20301867 0.8378159  0.19984365]], shape=(5, 300), dtype=float32)
tensor_2:
tf.Tensor(
[[0.6932683  0.79913497 0.381521   ... 0.7023829  0.9509897  0.2274468 ]
 [0.03644454 0.6579336  0.27609527 ... 0.6680317  0.42852902 0.17845905]
 [0.28606403 0.3063997  0.25755703 ... 0.6562083  0.7018373  0.76578414]
 [0.71172917 0.7854034  0.6951294  ... 0.03848457 0.6082274  0.23704243]
 [0.40945458 0.49275637 0.46033657 ... 0.06898165 0.74632454 0.56117034]], shape=(5, 300), dtype=float32)


In [136]:
# 4: Multiply the two tensors you created in 3 using matrix multiplication
tf.matmul(a=tensor_1, b=tensor_2, transpose_a=False, transpose_b=True)

<tf.Tensor: shape=(5, 5), dtype=float32, numpy=
array([[76.06833 , 73.937325, 79.66014 , 72.811935, 80.26073 ],
       [73.61879 , 69.42697 , 73.68848 , 67.24058 , 72.295715],
       [72.67236 , 73.270164, 73.09715 , 70.96236 , 72.99701 ],
       [75.63144 , 73.94716 , 75.9094  , 72.79953 , 75.467705],
       [76.59262 , 73.65219 , 77.345085, 68.49897 , 77.61454 ]],
      dtype=float32)>

In [137]:
# 5: Multiply the two tensors you created in 3 using dot product
tf.tensordot(a=tensor_1, 
             b=tf.transpose(a=tensor_2, perm=None, conjugate=False, name=None),
             axes=1,
             name=None)

<tf.Tensor: shape=(5, 5), dtype=float32, numpy=
array([[76.06833 , 73.937325, 79.66014 , 72.811935, 80.26073 ],
       [73.61879 , 69.42697 , 73.68848 , 67.24058 , 72.295715],
       [72.67236 , 73.270164, 73.09715 , 70.96236 , 72.99701 ],
       [75.63144 , 73.94716 , 75.9094  , 72.79953 , 75.467705],
       [76.59262 , 73.65219 , 77.345085, 68.49897 , 77.61454 ]],
      dtype=float32)>
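The outputs of exercises 4 and 5 look identical. If you'd like to check that in code rather than by eye, one option is to compare the largest absolute difference between the two results. The sketch below recreates two random tensors with its own generator (`gen`, `a` and `b` are assumptions for illustration, separate from `tensor_1` and `tensor_2`):

```python
import tensorflow as tf

# Separate generator so the notebook's random_gen state isn't affected
gen = tf.random.Generator.from_seed(42)
a = gen.uniform(shape=(5, 300))
b = gen.uniform(shape=(5, 300))

# Same multiplication, written two ways
via_matmul = tf.matmul(a, b, transpose_b=True)
via_tensordot = tf.tensordot(a, tf.transpose(b), axes=1)

# Largest element-wise difference (should be tiny, allowing for float rounding)
max_diff = tf.reduce_max(tf.abs(via_matmul - via_tensordot))
print(via_matmul.shape, max_diff.numpy())
```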

In [138]:
# 6: Create a tensor with random values between 0 and 1 with shape `[224, 224, 3]`.
tensor_3 = random_gen.uniform(shape=(224, 224, 3), minval=0, maxval=1, dtype=tf.float32, name='tensor_3')
print("tensor_3:", end="\n")
print(tensor_3.shape)

tensor_3:
(224, 224, 3)


In [139]:
# 7. Find the min and max values of the tensor you created in 6.
print(f"min: {tf.reduce_min(input_tensor=tensor_3, axis=None, keepdims=False, name=None)}\nmax: {tf.reduce_max(input_tensor=tensor_3, axis=None, keepdims=False, name=None)}")

min: 2.9802322387695312e-06
max: 0.9999908208847046


In [140]:
# 8. Create a tensor with random values of shape `[1, 224, 224, 3]` then squeeze it to change the shape to `[224, 224, 3]`.
tensor_4 = random_gen.uniform(shape=(1, 224, 224, 3), minval=0, maxval=1, dtype=tf.float32, name='tensor_4')
tensor_4_squeezed = tf.squeeze(input=tensor_4, axis=None, name='tensor_4_squeezed')
print(f"tensor_4 shape: {tensor_4.shape}")
print(f"tensor_4_squeezed shape: {tensor_4_squeezed.shape}")

tensor_4 shape: (1, 224, 224, 3)
tensor_4_squeezed shape: (224, 224, 3)


In [141]:
# 9. Create a tensor with shape `[10]` using your own choice of values, then find the index which has the maximum value.
tensor_5 = tf.constant(value=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=None, shape=[10], name='tensor_5')
print(f"Maximum value index: {tf.argmax(input=tensor_5, axis=None, output_type=tf.int64, name=None)}")

Maximum value index: 9


In [142]:
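# 10. One-hot encode the tensor you created in 9.
# Note: one-hot indices are 0-based, so with depth=10 the value 10 is out of range and its row is all zeros.
tf.one_hot(indices=tensor_5, depth=10, on_value=None, off_value=None, axis=None, dtype=None, name=None)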

<tf.Tensor: shape=(10, 10), dtype=float32, numpy=
array([[0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)>
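
In the output above, the row for the value `10` contains no `1`, because `tf.one_hot` reads each value as a 0-based index and only indices `0` to `depth - 1` get a hot entry. If every value should keep its own hot entry, two possible adjustments are sketched below. The names `values`, `shifted` and `widened` are hypothetical, and which option fits depends on what the values represent.

```python
import tensorflow as tf

# Same values as tensor_5 in exercise 9
values = tf.constant([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Option A: treat the values as 1-based labels and shift them to 0-based indices
shifted = tf.one_hot(values - 1, depth=10)

# Option B: keep the values as they are and make depth one larger than the max value
depth = int(tf.reduce_max(values)) + 1
widened = tf.one_hot(values, depth=depth)

print(shifted.shape)  # (10, 10)
print(widened.shape)  # (10, 11)
```

Option A keeps the output square, while Option B leaves column `0` unused but preserves the original values as indices.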

In [143]:
import gc

gc.collect()

505

## 📖 Extra-curriculum

* Read through the [list of TensorFlow Python APIs](https://www.tensorflow.org/api_docs/python/), pick one we haven't gone through in this notebook, reverse engineer it (write out the documentation code for yourself) and figure out what it does.
* Try to create a series of tensor functions to calculate your most recent grocery bill (it's okay if you don't use the names of the items, just the price in numerical form).
  * How would you calculate your grocery bill for the month and for the year using tensors?
* Go through the [TensorFlow 2.x quick start for beginners](https://www.tensorflow.org/tutorials/quickstart/beginner) tutorial (be sure to type out all of the code yourself, even if you don't understand it).
  * Are there any functions we used in here that match what's used in there? Which are the same? Which haven't you seen before?
* Watch the video ["What's a tensor?"](https://www.youtube.com/watch?v=f5liqUk0ZTw) - a great visual introduction to many of the concepts we've covered in this notebook.
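
For the grocery bill exercise above, one possible starting point is sketched below. The prices, quantities, and the assumption of one identical shop per week (4 per month, 52 per year) are all hypothetical, so swap in your own numbers and try other approaches too.

```python
import tensorflow as tf

# Hypothetical item prices and how many of each item were bought
prices = tf.constant([2.50, 4.00, 1.25, 6.00])
quantities = tf.constant([2.0, 1.0, 4.0, 1.0])

# Element-wise multiply, then sum to get the total for one shop
weekly_bill = tf.reduce_sum(prices * quantities)

# Assumption: the same shop is repeated every week
monthly_bill = weekly_bill * 4
yearly_bill = weekly_bill * 52

print(f"week: {float(weekly_bill)}, month: {float(monthly_bill)}, year: {float(yearly_bill)}")
```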