# <h1 style="text-align:center;color:DodgerBlue;"> Introduction to Tensors </h1> 

#### In TensorFlow, a tensor is a multi-dimensional array or a list of numbers, which can be used to represent data in a variety of formats, such as images, audio, video, and text. Tensors are the fundamental building blocks of TensorFlow computations.

##### Tensors have several important properties, including their data type, shape, and rank:

- Data type: Each element of a tensor has a data type, such as float32, int32, or bool. TensorFlow supports a wide range of data types, and the data type of a tensor is determined when the tensor is created.

- Shape: The shape of a tensor defines the number of dimensions and the size of each dimension. For example, a 3x4 matrix has a shape of (3, 4), while a 3x4x2 tensor has a shape of (3, 4, 2). The shape of a tensor can be changed using functions like tf.reshape().

- Rank: The rank of a tensor is the number of dimensions it has. For example, a scalar (single number) has a rank of 0, a vector has a rank of 1, a matrix has a rank of 2, and so on.

Tensors in TensorFlow can be created using various functions, including tf.constant(), tf.Variable(), and the functions in the tf.random module. (tf.placeholder() belongs to TensorFlow 1.x; in TensorFlow 2 it is only available as tf.compat.v1.placeholder().) Once created, tensors can be manipulated and transformed using a wide range of TensorFlow operations, such as element-wise operations, matrix operations, and convolutional operations.

In [1]:
#Importing tensorflow
import tensorflow as tf

In [2]:
print(tf.__version__)

2.10.0


## Creating Tensors with tf.constant()
<p> In general, you usually won't create tensors yourself. This is because TensorFlow has built-in modules (such as tf.io and tf.data) which can read your data sources and automatically convert them to tensors; later on, neural network models will process these for us.

But for now, because we're getting familiar with tensors themselves and how to manipulate them, we'll see how we can create them ourselves.</p>

<p> We'll begin by using tf.constant().</p>

In [4]:
# Create a scalar (rank 0 tensor)
scalar = tf.constant(10)
scalar

<tf.Tensor: shape=(), dtype=int32, numpy=10>

A scalar is known as a rank 0 tensor because it has no dimensions (it's just a number).

In [5]:
# Check the number of dimensions of a tensor (ndim stands for number of dimensions)
scalar.ndim

0

In [7]:
# Create a vector (1 dimension)
vector = tf.constant([10, 10])
vector

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([10, 10])>

In [8]:
# Check the number of dimensions of our vector tensor
vector.ndim

1

In [9]:
# Create a matrix (2 dimensions)
matrix = tf.constant([[10, 7],
                      [7, 10]])
matrix

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 7, 10]])>

In [10]:
matrix.ndim

2

By default, TensorFlow creates tensors with either an int32 or float32 datatype.

This is known as [32-bit precision](https://en.wikipedia.org/wiki/Precision_(computer_science)) (the higher the number, the more precisely a value can be stored, and the more space it takes up on your computer).

In [11]:
# Create another matrix and define the datatype
another_matrix = tf.constant([[10., 7.],
                              [3., 2.],
                              [8., 9.]], dtype=tf.float16) # specify the datatype with 'dtype'
another_matrix

<tf.Tensor: shape=(3, 2), dtype=float16, numpy=
array([[10.,  7.],
       [ 3.,  2.],
       [ 8.,  9.]], dtype=float16)>

In [12]:
# Even though another_matrix contains more numbers, its number of dimensions stays the same
another_matrix.ndim

2

In [13]:
# How about a tensor? (more than 2 dimensions, although, all of the above items are also technically tensors)
tensor = tf.constant([[[1, 2, 3],
                       [4, 5, 6]],
                      [[7, 8, 9],
                       [10, 11, 12]],
                      [[13, 14, 15],
                       [16, 17, 18]]])
tensor
     

<tf.Tensor: shape=(3, 2, 3), dtype=int32, numpy=
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]],

       [[13, 14, 15],
        [16, 17, 18]]])>

In [14]:
tensor.ndim

3

<p>This is known as a rank 3 tensor (3 dimensions); however, a tensor can have an arbitrary number of dimensions.</p>

<p> For example, you might turn a series of images into tensors with shape (224, 224, 3, 32), where: </p>
<ul>
<li>224, 224 (the first 2 dimensions) are the height and width of the images in pixels.</li>
<li>3 is the number of colour channels of the image (red, green, blue).</li>
<li>32 is the batch size (the number of images a neural network sees at any one time). Note that Keras layers conventionally expect the batch dimension first, e.g. (32, 224, 224, 3).</li>
</ul>
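
<p>As a rough sketch of the image example above, the snippet below builds a stand-in batch of zeros with that shape and reads each dimension back from its shape. The name image_batch is made up for illustration, and the values are zeros rather than real image data.</p>

```python
import tensorflow as tf

# Hypothetical stand-in for a batch of images: all zeros, shape taken from the text above
image_batch = tf.zeros(shape=(224, 224, 3, 32))

# as_list() returns the shape as a plain Python list of ints
height, width, channels, batch_size = image_batch.shape.as_list()

print(image_batch.ndim)                      # number of dimensions (rank)
print(height, width, channels, batch_size)   # size of each dimension
```
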
<p>All of the above variables we've created are actually tensors. But you may also hear them referred to by their different names (the ones we gave them):</p>

<ul>
<li> scalar: a single number.</li> 
<li> vector: a number with direction (e.g. wind speed with direction).</li> 
<li> matrix: a 2-dimensional array of numbers.</li> 
<li> tensor: an n-dimensional array of numbers (where n can be any number; a 0-dimension tensor is a scalar, a 1-dimension tensor is a vector).</li> 
</ul>
<p>To add to the confusion, the terms matrix and tensor are often used interchangeably.</p>

<p> Going forward, since we're using TensorFlow, everything we refer to and use will be tensors.</p>

## Creating Tensors with tf.Variable()

<p>The difference between tf.Variable() and tf.constant() is that tensors created with tf.constant() are immutable (they can't be changed, only used to create a new tensor), whereas tensors created with tf.Variable() are mutable (they can be changed).</p>

In [18]:
# Create the same tensor with tf.Variable() and tf.constant()
changeable_tensor = tf.Variable([10, 7])
unchangeable_tensor = tf.constant([10, 7])
changeable_tensor, unchangeable_tensor

(<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([10,  7])>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([10,  7])>)

Now let's try to change one of the elements of the changeable tensor.

In [19]:
# .assign() changes an element of a tf.Variable in place
changeable_tensor[0].assign(7)
changeable_tensor

<tf.Variable 'Variable:0' shape=(2,) dtype=int32, numpy=array([7, 7])>

In [21]:
# Will error (can't change tf.constant())
unchangeable_tensor[0].assign(7)
unchangeable_tensor

AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'assign'

Which one should you use? tf.constant() or tf.Variable()?

It will depend on what your problem requires. However, most of the time, TensorFlow will automatically choose for you (when loading data or modelling data).
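
One way to picture the choice: values that need to be updated in place (such as model weights during training) fit tf.Variable(), while fixed values fit tf.constant(). The sketch below is illustrative only; `weights` and `inputs` are made-up names, not part of any TensorFlow API.

```python
import tensorflow as tf

weights = tf.Variable([10, 7])   # mutable: can be updated in place
inputs = tf.constant([1, 2])     # immutable: fixed values

weights.assign([3, 4])           # overwrite every element
weights.assign_add(inputs)       # in-place addition: [3 + 1, 4 + 2]

print(weights.numpy())            # [4 6]
print(hasattr(inputs, "assign"))  # False: constant tensors have no assign method
```
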

## Creating random tensors
Random tensors are tensors of some arbitrary size which contain random numbers.

Why would you want to create random tensors?

Neural networks use random tensors to initialize the weights (patterns) that they're trying to learn from the data.

For example, the process of a neural network learning often involves taking a random n-dimensional array of numbers and refining them until they represent some kind of pattern (a compressed way to represent the original data).


We can create random tensors by using the tf.random.Generator class.

In [22]:
# Create two random (but the same) tensors
random_1 = tf.random.Generator.from_seed(42) # set the seed for reproducibility
random_1 = random_1.normal(shape=(3, 2)) # create tensor from a normal distribution 
random_2 = tf.random.Generator.from_seed(42)
random_2 = random_2.normal(shape=(3, 2))

# Are they equal?
random_1, random_2, random_1 == random_2
     

(<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193765, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193765, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[ True,  True],
        [ True,  True],
        [ True,  True]])>)

The random tensors we've made are actually pseudorandom numbers (they appear as random, but really aren't).

If we set a seed we'll get the same random numbers (if you've ever used NumPy, this is similar to np.random.seed(42)).

Setting the seed says, "hey, create some random numbers, but flavour them with X" (X is the seed).

What do you think will happen when we change the seed?

In [23]:
# Create two random (and different) tensors
random_3 = tf.random.Generator.from_seed(42)
random_3 = random_3.normal(shape=(3, 2))
random_4 = tf.random.Generator.from_seed(11)
random_4 = random_4.normal(shape=(3, 2))

# Check the tensors and see if they are equal
random_3, random_4, random_1 == random_3, random_3 == random_4
     

(<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[-0.7565803 , -0.06854702],
        [ 0.07595026, -1.2573844 ],
        [-0.23193765, -1.8107855 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=float32, numpy=
 array([[ 0.2730574 , -0.29925638],
        [-0.3652325 ,  0.61883307],
        [-1.0130816 ,  0.2829171 ]], dtype=float32)>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[ True,  True],
        [ True,  True],
        [ True,  True]])>,
 <tf.Tensor: shape=(3, 2), dtype=bool, numpy=
 array([[False, False],
        [False, False],
        [False, False]])>)

What if you wanted to shuffle the order of a tensor?

Wait, why would you want to do that?

Let's say you're working with 15,000 images of cats and dogs, where the first 10,000 images are of cats and the next 5,000 are of dogs. This order could affect how a neural network learns (it may overfit by learning the order of the data), so it might be a good idea to shuffle your data.

In [24]:
# Shuffle a tensor (valuable for when you want to shuffle your data)
not_shuffled = tf.constant([[10, 7],
                            [3, 4],
                            [2, 5]])
# Gets different results each time
tf.random.shuffle(not_shuffled)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 2,  5],
       [ 3,  4]])>

In [25]:
# Try to shuffle in the same order every time using only the seed parameter (won't actually be the same on repeated calls)
tf.random.shuffle(not_shuffled, seed=42)

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 2,  5],
       [ 3,  4],
       [10,  7]])>

In [26]:
# Shuffle in the same order every time

# Set the global random seed
tf.random.set_seed(42)

# Set the operation random seed
tf.random.shuffle(not_shuffled, seed=42)   

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4],
       [ 2,  5]])>

In [30]:
# Set the global random seed
tf.random.set_seed(42) # if you comment this out you'll get different results

# Set the operation random seed
tf.random.shuffle(not_shuffled, seed=21)
     

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[ 3,  4],
       [ 2,  5],
       [10,  7]])>
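
tf.random.shuffle() shuffles along the first dimension only, so each row stays intact. When data and labels live in separate tensors, one common approach (a sketch, not the only way) is to shuffle a tensor of row indices once and apply it to both tensors with tf.gather(). The `features` and `labels` values below are invented for illustration.

```python
import tensorflow as tf

features = tf.constant([[10, 7], [3, 4], [2, 5]])
labels = tf.constant([0, 1, 2])  # label i belongs to row i of features

# Shuffle the row indices once, then reorder both tensors with the same indices
indices = tf.random.shuffle(tf.range(tf.shape(features)[0]))
shuffled_features = tf.gather(features, indices)
shuffled_labels = tf.gather(labels, indices)

print(shuffled_features.numpy())
print(shuffled_labels.numpy())
```

Because both tensors are reordered with the same indices, every row keeps its original label.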

## Other ways to make tensors
Though you might rarely use these (remember, many tensor operations are done behind the scenes for you), you can use tf.ones() to create a tensor of all ones and tf.zeros() to create a tensor of all zeros.

In [31]:
# Make a tensor of all ones
tf.ones(shape=(3, 2, 10, 20)) #4D TENSOR

<tf.Tensor: shape=(3, 2, 10, 20), dtype=float32, numpy=
array([[[[1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.],
         ...,
         [1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.]],

        [[1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.],
         ...,
         [1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.]]],


       [[[1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.],
         ...,
         [1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.]],

        [[1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.],
         [1., 1., 1., ..., 1., 1., 1.],
         ...,
         [1., 1., 1., ..., 1., 1., 1.],


In [32]:
# Make a tensor of all zeros
tf.zeros(shape=(3, 2))


<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[0., 0.],
       [0., 0.],
       [0., 0.]], dtype=float32)>

## You can also turn NumPy arrays into tensors

Remember, the main difference between tensors and NumPy arrays is that tensors can be run on GPUs.

In [33]:
import numpy as np
numpy_A = np.arange(1, 25, dtype=np.int32) # create a NumPy array with the values 1 to 24 (25 is excluded)
A = tf.constant(numpy_A,  
                shape=[2, 4, 3]) # note: the shape total (2*4*3) has to match the number of elements in the array
numpy_A, A

(array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24]),
 <tf.Tensor: shape=(2, 4, 3), dtype=int32, numpy=
 array([[[ 1,  2,  3],
         [ 4,  5,  6],
         [ 7,  8,  9],
         [10, 11, 12]],
 
        [[13, 14, 15],
         [16, 17, 18],
         [19, 20, 21],
         [22, 23, 24]]])>)

## Getting information from tensors (shape, rank, size)
There will be times when you'll want to get different pieces of information from your tensors. In particular, you should know the following tensor vocabulary:
<ul>
<li>Shape: The length (number of elements) of each of the dimensions of a tensor.
<li>Rank: The number of tensor dimensions. A scalar has rank 0, a vector has rank 1, a matrix is rank 2, a tensor has rank n.
<li>Axis or Dimension: A particular dimension of a tensor.
<li>Size: The total number of items in the tensor.
</ul>
You'll use these especially when you're trying to line up the shapes of your data with the shapes of your model, for example, making sure the shape of your image tensors matches the shape of your model's input layer.

We've already seen one of these before using the ndim attribute. Let's see the rest.

In [34]:
# Create a rank 4 tensor (4 dimensions)
rank_4_tensor = tf.zeros([2, 3, 4, 5])
rank_4_tensor

<tf.Tensor: shape=(2, 3, 4, 5), dtype=float32, numpy=
array([[[[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]],


       [[[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]],

        [[0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.],
         [0., 0., 0., 0., 0.]]]], dtype=float32)>

In [35]:
rank_4_tensor.shape, rank_4_tensor.ndim, tf.size(rank_4_tensor)

(TensorShape([2, 3, 4, 5]), 4, <tf.Tensor: shape=(), dtype=int32, numpy=120>)

In [36]:
# Get various attributes of tensor
print("Datatype of every element:", rank_4_tensor.dtype)
print("Number of dimensions (rank):", rank_4_tensor.ndim)
print("Shape of tensor:", rank_4_tensor.shape)
print("Elements along axis 0 of tensor:", rank_4_tensor.shape[0])
print("Elements along last axis of tensor:", rank_4_tensor.shape[-1])
print("Total number of elements (2*3*4*5):", tf.size(rank_4_tensor).numpy()) # .numpy() converts the scalar tensor to a NumPy value

Datatype of every element: <dtype: 'float32'>
Number of dimensions (rank): 4
Shape of tensor: (2, 3, 4, 5)
Elements along axis 0 of tensor: 2
Elements along last axis of tensor: 5
Total number of elements (2*3*4*5): 120


In [37]:
# Get the first 2 items of each dimension
rank_4_tensor[:2, :2, :2, :2]

<tf.Tensor: shape=(2, 2, 2, 2), dtype=float32, numpy=
array([[[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]],


       [[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]]], dtype=float32)>

In [38]:
# Get the first item of every dimension except the final one (keep the whole last axis)
rank_4_tensor[:1, :1, :1, :]

<tf.Tensor: shape=(1, 1, 1, 5), dtype=float32, numpy=array([[[[0., 0., 0., 0., 0.]]]], dtype=float32)>

In [39]:
# Create a rank 2 tensor (2 dimensions)
rank_2_tensor = tf.constant([[10, 7],
                             [3, 4]])

# Get the last item of each row
rank_2_tensor[:, -1]    

<tf.Tensor: shape=(2,), dtype=int32, numpy=array([7, 4])>

In [40]:
# Add an extra dimension (to the end)
rank_3_tensor = rank_2_tensor[..., tf.newaxis] # in Python "..." means "all dimensions prior to"
rank_2_tensor, rank_3_tensor # shape (2, 2), shape (2, 2, 1)


(<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
 array([[10,  7],
        [ 3,  4]])>,
 <tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
 array([[[10],
         [ 7]],
 
        [[ 3],
         [ 4]]])>)

You can achieve the same using tf.expand_dims().

In [43]:
tf.expand_dims(rank_2_tensor, axis=-1) # "-1" means last axis

<tf.Tensor: shape=(2, 2, 1), dtype=int32, numpy=
array([[[10],
        [ 7]],

       [[ 3],
        [ 4]]])>

## Manipulating tensors (tensor operations)
Finding patterns in tensors (numerical representations of data) requires manipulating them.

Again, when building models in TensorFlow, much of this pattern discovery is done for you.

### Basic operations
You can perform many of the basic mathematical operations directly on tensors using Python operators such as +, - and *.

In [45]:
# You can add values to a tensor using the addition operator
tensor = tf.constant([[10, 7], [3, 4]])
tensor + 10

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[20, 17],
       [13, 14]])>

The original tensor is unchanged: arithmetic operators return a new tensor rather than modifying the existing one (and tensors created with tf.constant() are immutable anyway).

In [46]:
# Original tensor unchanged
tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4]])>

In [47]:
# Multiplication (known as element-wise multiplication)
tensor * 10

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[100,  70],
       [ 30,  40]])>

In [49]:
# subtract
tensor - 10

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[ 0, -3],
       [-7, -6]])>

You can also use the equivalent TensorFlow function (the `*` operator on tensors calls the same underlying operation). TensorFlow operations can be sped up later down the line when they run as part of a TensorFlow graph.

In [50]:
# Use the tensorflow function equivalent of the '*' (multiply) operator
tf.multiply(tensor, 10)

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[100,  70],
       [ 30,  40]])>

In [51]:
# The original tensor is still unchanged
tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[10,  7],
       [ 3,  4]])>
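
To illustrate the graph remark above: wrapping a Python function in tf.function lets TensorFlow trace its operations into a graph. This is a minimal sketch; `scale_by_ten` is a made-up helper, and whether a graph is actually faster depends on the workload.

```python
import tensorflow as tf

@tf.function  # trace the function into a TensorFlow graph on its first call
def scale_by_ten(t):
    return tf.multiply(t, 10)

tensor = tf.constant([[10, 7], [3, 4]])
result = scale_by_ten(tensor)
print(result.numpy().tolist())  # [[100, 70], [30, 40]]
```
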

## Matrix multiplication
One of the most common operations in machine learning algorithms is matrix multiplication.

TensorFlow implements this matrix multiplication functionality in the tf.matmul() function.

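The two main rules for matrix multiplication to remember are:
<ul>
<li>
The inner dimensions must match: (3, 5) @ (5, 3) works, but (3, 5) @ (3, 5) does not.
<li>
The resulting matrix has the shape of the outer dimensions: (5, 3) @ (3, 5) gives (5, 5), and (3, 5) @ (5, 3) gives (3, 3).
</ul>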
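
The two rules can be checked directly on tensor shapes. The sketch below uses made-up tensors `X` and `Y` filled with ones; the exact exception type raised for mismatched shapes is an assumption, so the example catches both likely candidates.

```python
import tensorflow as tf

X = tf.ones(shape=(3, 2))
Y = tf.ones(shape=(2, 4))

# Inner dimensions match (2 and 2), so the result takes the outer dimensions: (3, 4)
print(tf.matmul(X, Y).shape)

# Inner dimensions differ (2 and 3), so the multiplication is rejected
try:
    tf.matmul(X, X)
    failed = False
except (tf.errors.InvalidArgumentError, ValueError) as err:
    failed = True
    print("Shapes (3, 2) and (3, 2) cannot be multiplied:", type(err).__name__)
```
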

In [53]:
# Matrix multiplication in TensorFlow
print(tensor)
tf.matmul(tensor, tensor)

tf.Tensor(
[[10  7]
 [ 3  4]], shape=(2, 2), dtype=int32)


<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[121,  98],
       [ 42,  37]])>

In [55]:
# Matrix multiplication with Python operator '@'
tensor @ tensor

<tf.Tensor: shape=(2, 2), dtype=int32, numpy=
array([[121,  98],
       [ 42,  37]])>

## Reshaping
<ul>
<li>
tf.reshape() - allows us to reshape a tensor into a defined shape.
<li>
tf.transpose() - switches the dimensions of a given tensor.
</ul>

In [58]:
x = tf.constant([[1,3], 
                 [1,4],
                 [2,3]])
x

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[1, 3],
       [1, 4],
       [2, 3]])>

In [60]:
tf.reshape(x,shape=(1,2,3))

<tf.Tensor: shape=(1, 2, 3), dtype=int32, numpy=
array([[[1, 3, 1],
        [4, 2, 3]]])>

In [61]:
tf.reshape(x,shape=(1,3,2))

<tf.Tensor: shape=(1, 3, 2), dtype=int32, numpy=
array([[[1, 3],
        [1, 4],
        [2, 3]]])>

In [62]:
tf.reshape(x,shape=(1,1,6))

<tf.Tensor: shape=(1, 1, 6), dtype=int32, numpy=array([[[1, 3, 1, 4, 2, 3]]])>

In [64]:
# transpose (3,2) >> (2,3)
tf.transpose(x)

<tf.Tensor: shape=(2, 3), dtype=int32, numpy=
array([[1, 1, 2],
       [3, 4, 3]])>
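
Reshaping and transposing can produce the same shape but different contents, which is easy to miss. The comparison below is a small sketch reusing the same values as `x` above: tf.reshape() keeps the elements in their original reading order and regroups them, while tf.transpose() swaps rows and columns.

```python
import tensorflow as tf

x = tf.constant([[1, 3],
                 [1, 4],
                 [2, 3]])

reshaped = tf.reshape(x, shape=(2, 3))  # same element order, regrouped into rows of 3
transposed = tf.transpose(x)            # rows become columns

print(reshaped.numpy().tolist())    # [[1, 3, 1], [4, 2, 3]]
print(transposed.numpy().tolist())  # [[1, 1, 2], [3, 4, 3]]
```
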

In [65]:
# transpose in matmul
a = tf.random.normal(shape=(3,5))
b = tf.random.normal(shape=(4,5))

tf.matmul(a, b, transpose_a=False, transpose_b=True)

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 2.0620718 ,  4.389797  ,  4.01282   , -2.121596  ],
       [ 0.444628  , -1.5793631 , -1.9213755 , -1.9311769 ],
       [ 0.1918219 ,  0.8305587 ,  6.957326  , -0.83640313]],
      dtype=float32)>

## The dot product

Multiplying matrices by each other is also referred to as the dot product.

You can perform the tf.matmul() operation using tf.tensordot().

- The dot product is a special case of matrix multiplication where both matrices are vectors (i.e., one-dimensional arrays). The dot product of two vectors can be computed as the sum of the element-wise products of the vectors.

- Matrix multiplication, on the other hand, is a more general operation that can be applied to matrices of any shape. It involves multiplying the rows of the first matrix by the columns of the second matrix, and summing the products.

- One key difference between dot product and matrix multiplication is the output shape. The dot product of two vectors results in a scalar, whereas the result of matrix multiplication is a matrix whose shape depends on the shape of the input matrices.

- Another difference is that matrix multiplication is not commutative, whereas the dot product is. In other words, if you have two matrices A and B, the result of A @ B may not be the same as the result of B @ A, whereas the dot product of two vectors a and b is the same as the dot product of b and a.

- In deep learning and machine learning, matrix multiplication is commonly used to represent linear transformations between layers of a neural network. The dot product is used in various operations, such as computing the similarity between two vectors or in attention mechanisms.


- **tf.matmul** and **tf.tensordot** are both TensorFlow functions for performing tensor multiplication, but they differ in how they perform the multiplication.

- tf.matmul is used for matrix multiplication between two tensors, and it only works on tensors with rank >= 2. It performs matrix multiplication according to the standard mathematical rules, which involve multiplying the rows of the first matrix by the columns of the second matrix. The two input tensors must have compatible shapes for matrix multiplication (i.e., the inner dimensions must match).

Here's an example of using tf.matmul:

In [66]:
import tensorflow as tf

# create two matrices
a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])

# compute matrix multiplication using tf.matmul
c = tf.matmul(a, b)

print(c.numpy())  # [[19 22]
                  #  [43 50]]


[[19 22]
 [43 50]]


tf.tensordot, on the other hand, is a more general function that can be used for various types of tensor multiplication, including matrix multiplication. It allows you to specify which axes to multiply along and how to combine the remaining axes. The two input tensors can have any rank.

Here's an example of using tf.tensordot for matrix multiplication:

In [67]:
import tensorflow as tf

# create two matrices
a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])

# compute matrix multiplication using tf.tensordot
c = tf.tensordot(a, b, axes=[[1], [0]])

print(c.numpy())  # [[19 22]
                  #  [43 50]]


[[19 22]
 [43 50]]


In this example, we specify that we want to multiply along the second axis of a (running across each row) and the first axis of b (running down each column), which is the row-by-column rule of matrix multiplication. The resulting tensor keeps the remaining axes of the two inputs (the first axis of a and the second axis of b), which is what we want for matrix multiplication.
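
The vector dot product and the commutativity point from earlier in this section can be sketched the same way. The vectors `u` and `v` below are made-up values; `axes=1` tells tf.tensordot() to sum over the last axis of the first input and the first axis of the second.

```python
import tensorflow as tf

u = tf.constant([1., 2., 3.])
v = tf.constant([4., 5., 6.])

# Vector dot product: 1*4 + 2*5 + 3*6 = 32, and the order does not matter
dot_uv = tf.tensordot(u, v, axes=1)
dot_vu = tf.tensordot(v, u, axes=1)
print(dot_uv.numpy(), dot_vu.numpy(), dot_uv.shape)  # 32.0 32.0 ()

# Matrix multiplication: the order does matter
a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])
print(tf.matmul(a, b).numpy().tolist())  # [[19, 22], [43, 50]]
print(tf.matmul(b, a).numpy().tolist())  # [[23, 34], [31, 46]]
```
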

## Changing the datatype of a tensor
Sometimes you'll want to change the datatype of your tensor away from the default.

This is common when you want to compute using less precision (e.g. 16-bit floating point numbers vs. 32-bit floating point numbers).

Computing with less precision is useful on devices with less computing capacity such as mobile devices (because the fewer bits, the less space the computations require).

You can change the datatype of a tensor using tf.cast().

In [68]:
# Create a new tensor with default datatype (float32)
B = tf.constant([1.7, 7.4])

# Create a new tensor with default datatype (int32)
C = tf.constant([1, 7])
B, C

(<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1.7, 7.4], dtype=float32)>,
 <tf.Tensor: shape=(2,), dtype=int32, numpy=array([1, 7])>)

In [69]:
# Change from float32 to float16 (reduced precision)
B = tf.cast(B, dtype=tf.float16)
B

<tf.Tensor: shape=(2,), dtype=float16, numpy=array([1.7, 7.4], dtype=float16)>

In [70]:
# Change from int32 to float32
C = tf.cast(C, dtype=tf.float32)
C

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1., 7.], dtype=float32)>
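
Casting can lose information, which is worth checking before lowering precision. The sketch below uses made-up values to show two cases: casting floats to integers drops the fractional part, and casting to float16 keeps fewer significant digits.

```python
import tensorflow as tf

B = tf.constant([1.7, 7.4])

# float32 -> int32 drops the fractional part
as_int = tf.cast(B, dtype=tf.int32)
print(as_int.numpy())  # [1 7]

# float32 -> float16 keeps fewer significant digits
big = tf.constant([1000.123], dtype=tf.float32)
as_half = tf.cast(big, dtype=tf.float16)
print(big.numpy(), as_half.numpy())
```
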

## Getting the absolute value
Sometimes you'll want the absolute values (every value made non-negative) of the elements in your tensors.

To do so, you can use tf.abs().

In [73]:
# Create tensor with negative values
D = tf.constant([-7.5, -10])
D

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([ -7.5, -10. ], dtype=float32)>

In [74]:
# Get the absolute values
tf.abs(D)

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([ 7.5, 10. ], dtype=float32)>

## Finding the min, max, mean, sum (aggregation)
You can quickly aggregate (perform a calculation on a whole tensor) tensors to find things like the minimum value, maximum value, mean and sum of all the elements.

To do so, aggregation functions typically have the syntax reduce_[action](), such as:
<ul>
<li>tf.reduce_min() - find the minimum value in a tensor.
<li>tf.reduce_max() - find the maximum value in a tensor (helpful for when you want to find the highest prediction probability).
<li>tf.reduce_mean() - find the mean of all elements in a tensor.
<li>tf.reduce_sum() - find the sum of all elements in a tensor.
</ul>
Note: typically, each of these is under the math module, e.g. tf.math.reduce_min(), but you can use the alias tf.reduce_min().

Let's see them in action.

In [75]:
# Create a tensor with 50 random values between 0 and 100
E = tf.constant(np.random.randint(low=0, high=100, size=50))
E

<tf.Tensor: shape=(50,), dtype=int32, numpy=
array([16, 46, 19, 38, 49, 10, 33, 26, 24, 17, 14, 28, 39, 15, 67, 29, 40,
       24, 82, 75, 87, 75, 85, 33, 95, 43, 45, 91, 79,  5, 71, 67, 72,  5,
       21, 68, 36, 83, 13, 90, 97,  8, 13,  9, 32, 56, 88, 50, 76, 54])>

In [76]:
# Find the minimum
tf.reduce_min(E)

<tf.Tensor: shape=(), dtype=int32, numpy=5>

In [77]:
# Find the maximum
tf.reduce_max(E)

<tf.Tensor: shape=(), dtype=int32, numpy=97>

In [78]:
# Find the mean
tf.reduce_mean(E)

<tf.Tensor: shape=(), dtype=int32, numpy=46>

In [79]:
# Find the sum
tf.reduce_sum(E)   

<tf.Tensor: shape=(), dtype=int32, numpy=2338>

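You can also find the standard deviation (tf.math.reduce_std()) and variance (tf.math.reduce_variance()) of elements in a tensor using similar functions. Unlike the aggregations above, these two have no top-level tf.reduce_* alias and they require floating-point input.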
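
Note that tf.reduce_mean() keeps the input's datatype, so on an integer tensor the mean is truncated to an integer (which is why the mean of E above is 46 rather than 46.76). The standard deviation and variance functions live in tf.math and expect floating-point (or complex) input. A short sketch with made-up values:

```python
import tensorflow as tf

E = tf.constant([1, 2, 3, 5])  # int32 values, the sum is 11

# On an integer tensor the mean uses integer division: 11 // 4
print(tf.reduce_mean(E).numpy())  # 2

# Cast to float first to get the exact mean, variance and standard deviation
E_float = tf.cast(E, dtype=tf.float32)
print(tf.reduce_mean(E_float).numpy())  # 2.75
print(tf.math.reduce_variance(E_float).numpy())
print(tf.math.reduce_std(E_float).numpy())
```
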


## Finding the positional maximum and minimum
How about finding the position in a tensor where the maximum value occurs?

This is helpful when you want to line up your labels (say ['Green', 'Blue', 'Red']) with your prediction probabilities tensor (e.g. [0.98, 0.01, 0.01]).

In this case, the predicted label (the one with the highest prediction probability) would be 'Green'.

You can do the same for the minimum (if required) with the following:
<ul>
<li>tf.argmax() - find the position of the maximum element in a given tensor.
<li>tf.argmin() - find the position of the minimum element in a given tensor.
</ul>

In [84]:
# Create a tensor with 50 values between 0 and 1
F = tf.constant(np.random.random((50)))
F

<tf.Tensor: shape=(50,), dtype=float64, numpy=
array([0.43256084, 0.58404309, 0.8520027 , 0.52542086, 0.24696952,
       0.19221685, 0.26257971, 0.20144448, 0.13381178, 0.56722572,
       0.97093021, 0.15858283, 0.05143708, 0.73613208, 0.68998962,
       0.36672821, 0.64479611, 0.02219078, 0.76275184, 0.33243255,
       0.82880131, 0.93376971, 0.80755642, 0.84896572, 0.2924872 ,
       0.09784432, 0.35448187, 0.31792348, 0.57771887, 0.00904141,
       0.09170314, 0.60717953, 0.60001214, 0.3526583 , 0.6320709 ,
       0.79486236, 0.37648065, 0.27198851, 0.48593869, 0.51071111,
       0.39142415, 0.31623211, 0.494381  , 0.2140529 , 0.85118019,
       0.41643056, 0.15296566, 0.15743329, 0.2458069 , 0.61907841])>

In [85]:
# Find the maximum element position of F
tf.argmax(F)

<tf.Tensor: shape=(), dtype=int64, numpy=10>

In [86]:
# Find the minimum element position of F
tf.argmin(F)

<tf.Tensor: shape=(), dtype=int64, numpy=29>

In [87]:
# Find the maximum element position of F
print(f"The maximum value of F is at position: {tf.argmax(F).numpy()}") 
print(f"The maximum value of F is: {tf.reduce_max(F).numpy()}") 
print(f"Using tf.argmax() to index F, the maximum value of F is: {F[tf.argmax(F)].numpy()}")
print(f"Are the two max values the same (they should be)? {F[tf.argmax(F)].numpy() == tf.reduce_max(F).numpy()}")

The maximum value of F is at position: 10
The maximum value of F is: 0.9709302080226296
Using tf.argmax() to index F, the maximum value of F is: 0.9709302080226296
Are the two max values the same (they should be)? True
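
The label example from the start of this section can be sketched as follows. The `labels` list and `pred_probs` tensor reuse the values from the text; the variable names themselves are made up for illustration.

```python
import tensorflow as tf

labels = ["Green", "Blue", "Red"]
pred_probs = tf.constant([0.98, 0.01, 0.01])

best_index = tf.argmax(pred_probs).numpy()  # position of the highest probability
predicted_label = labels[best_index]        # use that position to look up the label

print(best_index, predicted_label)  # 0 Green
```
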


## Squeezing a tensor (removing all single dimensions)
If you need to remove single-dimensions from a tensor (dimensions with size 1), you can use tf.squeeze().

tf.squeeze() - remove all dimensions of 1 from a tensor.

In [88]:
# Create a rank 5 (5 dimensions) tensor of 50 numbers between 0 and 100
G = tf.constant(np.random.randint(0, 100, 50), shape=(1, 1, 1, 1, 50))
G.shape, G.ndim

(TensorShape([1, 1, 1, 1, 50]), 5)

In [89]:
# Squeeze tensor G (remove all 1 dimensions)
G_squeezed = tf.squeeze(G)
G_squeezed.shape, G_squeezed.ndim

(TensorShape([50]), 1)

## One-hot encoding
If you have a tensor of indices and would like to one-hot encode it, you can use tf.one_hot().

You must also specify the depth parameter (the number of classes, which becomes the length of each one-hot vector).

In [90]:
# Create a list of indices
some_list = [0, 1, 2, 3]

# One hot encode them
tf.one_hot(some_list, depth=4)
     

<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]], dtype=float32)>

You can also specify values for on_value and off_value instead of the defaults of 1 and 0.

In [91]:
# Specify custom values for on and off encoding
tf.one_hot(some_list, depth=4, on_value="alive", off_value="dead")

<tf.Tensor: shape=(4, 4), dtype=string, numpy=
array([[b'alive', b'dead', b'dead', b'dead'],
       [b'dead', b'alive', b'dead', b'dead'],
       [b'dead', b'dead', b'alive', b'dead'],
       [b'dead', b'dead', b'dead', b'alive']], dtype=object)>
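
The depth parameter controls how many columns each encoded row has. As a small sketch with the same `some_list` as above, choosing a depth smaller than the largest index leaves that index with no column of its own, so its row is all zeros:

```python
import tensorflow as tf

some_list = [0, 1, 2, 3]

# depth=3 gives each row 3 columns; index 3 is out of range, so its row is all zeros
encoded = tf.one_hot(some_list, depth=3)
print(encoded.numpy())
```
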

## Squaring, log, square root

Most other common mathematical operations you might want to perform at some stage probably exist in TensorFlow too.

Let's take a look at:
<ul>
<li>tf.square() - get the square of every value in a tensor.
<li>tf.sqrt() - get the square root of every value in a tensor (note: the elements need to be floats or this will error).
<li>tf.math.log() - get the natural log of every value in a tensor (elements need to be floats).
</ul>

In [92]:
# Create a new tensor
H = tf.constant(np.arange(1, 10))
H

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([1, 2, 3, 4, 5, 6, 7, 8, 9])>

In [93]:
# Square it
tf.square(H)

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([ 1,  4,  9, 16, 25, 36, 49, 64, 81])>

In [94]:
# Find the square root (will error because H is int32; the elements need to be floats)
tf.sqrt(H)

InvalidArgumentError: Value for attr 'T' of int32 is not in the list of allowed values: bfloat16, half, float, double, complex64, complex128
	; NodeDef: {{node Sqrt}}; Op<name=Sqrt; signature=x:T -> y:T; attr=T:type,allowed=[DT_BFLOAT16, DT_HALF, DT_FLOAT, DT_DOUBLE, DT_COMPLEX64, DT_COMPLEX128]> [Op:Sqrt]

In [96]:
# Change H to float32
H = tf.cast(H, dtype=tf.float32)
H  

<tf.Tensor: shape=(9,), dtype=float32, numpy=array([1., 2., 3., 4., 5., 6., 7., 8., 9.], dtype=float32)>

In [98]:
# Find the square root (works now that H is float32)
tf.sqrt(H)

<tf.Tensor: shape=(9,), dtype=float32, numpy=
array([1.       , 1.4142135, 1.7320508, 2.       , 2.2360678, 2.4494896,
       2.6457512, 2.828427 , 3.       ], dtype=float32)>

In [97]:
# Find the log (input also needs to be float)
tf.math.log(H)    

<tf.Tensor: shape=(9,), dtype=float32, numpy=
array([0.       , 0.6931472, 1.0986123, 1.3862944, 1.609438 , 1.7917595,
       1.9459102, 2.0794415, 2.1972246], dtype=float32)>
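To tie the steps above together, here's a small sketch that casts an integer tensor to float32 and then checks that the operations undo each other (squaring the square root, and exponentiating the log). The variable names are hypothetical, and `tf.exp()` is used here only as the inverse of `tf.math.log()`.

```python
import numpy as np
import tensorflow as tf

H_int = tf.constant(np.arange(1, 10))         # integer tensor
H_float = tf.cast(H_int, dtype=tf.float32)    # cast before sqrt/log

roots = tf.sqrt(H_float)
logs = tf.math.log(H_float)

# Undo each operation to get (approximately) the original values back
from_roots = tf.square(roots)
from_logs = tf.exp(logs)

print(from_roots.numpy())
print(from_logs.numpy())
```

Because float32 has limited precision, the recovered values may differ from the originals in the last few decimal places.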

## Tensors and NumPy
We've seen some examples of tensors interacting with NumPy arrays, such as using NumPy arrays to create tensors.

Tensors can also be converted to NumPy arrays using:
<ul>
<li>np.array() - pass a tensor to convert to an ndarray (NumPy's main datatype).
<li>tensor.numpy() - call on a tensor to convert to an ndarray.
</ul>
Doing this is helpful because it lets us iterate over the values as ordinary NumPy data and use any of NumPy's methods on them.

In [99]:
# Create a tensor from a NumPy array
J = tf.constant(np.array([3., 7., 10.]))
J

<tf.Tensor: shape=(3,), dtype=float64, numpy=array([ 3.,  7., 10.])>

In [100]:
# Convert tensor J to NumPy with np.array()
np.array(J), type(np.array(J))

(array([ 3.,  7., 10.]), numpy.ndarray)

In [101]:
# Convert tensor J to NumPy with .numpy()
J.numpy(), type(J.numpy())

(array([ 3.,  7., 10.]), numpy.ndarray)

By default, tensors created from Python floats have dtype=float32, whereas NumPy arrays of floats have dtype=float64.

This is because neural networks (which are usually built with TensorFlow) can generally work very well with less precision (32-bit rather than 64-bit).

In [103]:
# Create a tensor from a NumPy array and from a Python list
numpy_J = tf.constant(np.array([3., 7., 10.])) # will be float64 (due to NumPy)
tensor_J = tf.constant([3., 7., 10.]) # will be float32 (due to being TensorFlow default)
numpy_J.dtype, tensor_J.dtype  

(tf.float64, tf.float32)
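If you'd like a tensor built from a NumPy array to use float32, you can convert the dtype on either side. The sketch below shows two options (the variable names are hypothetical): converting the array with NumPy's `astype()` before creating the tensor, or casting the tensor afterwards with `tf.cast()`.

```python
import numpy as np
import tensorflow as tf

array = np.array([3., 7., 10.])  # float64 by default in NumPy

# Option 1: convert in NumPy first
from_float32_array = tf.constant(array.astype(np.float32))

# Option 2: create the tensor, then cast it in TensorFlow
cast_after = tf.cast(tf.constant(array), dtype=tf.float32)

# Converting back to NumPy keeps the tensor's dtype
back_to_numpy = cast_after.numpy()

print(from_float32_array.dtype, cast_after.dtype, back_to_numpy.dtype)
```

Keeping dtypes consistent matters because many TensorFlow operations will error if you mix float32 and float64 tensors.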

## Using @tf.function
In your TensorFlow adventures, you might come across Python functions which have the decorator @tf.function.

In short, decorators modify a function in one way or another.

In the case of the @tf.function decorator, it turns a Python function into a callable TensorFlow graph. In other words, if you've written your own Python function and you decorate it with @tf.function, TensorFlow will attempt to convert it into a fast(er) version of itself by making it part of a computation graph. This also helps when you export your code (to potentially run on another device).

For more on this, read the Better performance with tf.function guide.

In [104]:
# Create a simple function
def function(x, y):
  return x ** 2 + y

x = tf.constant(np.arange(0, 10))
y = tf.constant(np.arange(10, 20))
function(x, y)

<tf.Tensor: shape=(10,), dtype=int32, numpy=array([ 10,  12,  16,  22,  30,  40,  52,  66,  82, 100])>

In [105]:
# Create the same function and decorate it with tf.function
@tf.function
def tf_function(x, y):
  return x ** 2 + y

tf_function(x, y)

<tf.Tensor: shape=(10,), dtype=int32, numpy=array([ 10,  12,  16,  22,  30,  40,  52,  66,  82, 100])>

If you noticed no difference between the above two functions (the decorated one and the non-decorated one) you'd be right.

Much of the difference happens behind the scenes, one of the main ones being potential code speed-ups where possible.
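One way to peek behind the scenes is to put a plain Python side effect inside a decorated function. The sketch below is only an illustration: `trace_log` and `squared_plus` are hypothetical names. The Python body runs while TensorFlow traces the function into a graph, and later calls with the same input dtype and shape reuse that graph instead of running the Python code again.

```python
import tensorflow as tf

trace_log = []  # hypothetical helper: records each time the Python body runs

@tf.function
def squared_plus(x, y):
    trace_log.append("traced")  # Python side effect, only runs during tracing
    return x ** 2 + y

a = tf.constant([1, 2, 3])
b = tf.constant([10, 20, 30])

first = squared_plus(a, b)
second = squared_plus(a + 1, b)  # same dtype and shape, so the graph is reused

print(first.numpy(), second.numpy(), len(trace_log))
```

Even though the function was called twice, the Python body only ran once. The second call went straight to the already-built graph, which is where the potential speed-ups come from.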

## Finding access to GPUs
We've mentioned GPUs plenty of times throughout this notebook.

So how do you check if you've got one available?

You can check if you've got access to a GPU using tf.config.list_physical_devices().

In [106]:
print(tf.config.list_physical_devices('GPU'))

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


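If the above outputs an empty list (or nothing), it means you don't have access to a GPU (or at least TensorFlow can't find it).

If you're using an NVIDIA GPU, you can find out which one you have (and how much of its memory is in use) with nvidia-smi.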

In [107]:
!nvidia-smi

Fri Apr 28 05:26:44 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 531.41                 Driver Version: 531.41       CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                      TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA GeForce GTX 1650 Ti    WDDM | 00000000:01:00.0 Off |                  N/A |
| N/A   51C    P8                3W /  N/A|   2476MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
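Building on the device check above, here's a minimal sketch of picking a device based on what `tf.config.list_physical_devices()` returns. The `device_name` variable and the fallback logic are assumptions for illustration. TensorFlow will normally place operations on a GPU for you when one is available, so this is mostly useful when you want explicit control.

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")

# Fall back to the CPU if TensorFlow can't find a GPU
device_name = "/GPU:0" if gpus else "/CPU:0"

# Run a small computation on the chosen device
with tf.device(device_name):
    result = tf.reduce_sum(tf.ones((2, 3)))

print(len(gpus), device_name, result.numpy())
```

The same code runs on machines with or without a GPU, which is handy when sharing notebooks.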
                                                                    