**Creating tensors**
Tensors are the fundamental building block of machine learning.

Their job is to represent data in a numerical way.

Documentation: https://pytorch.org/docs/stable/tensors.html


In [5]:
import torch

# Scalar
scalar = torch.tensor(7)
scalar


tensor(7)

In [4]:
scalar.ndim

0

What if we wanted to retrieve the number from the tensor?

As in, turn it from torch.Tensor to a Python integer?

To do so, we can use the item() method.

In [5]:
scalar.item()

7
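As a small aside, `item()` only works on tensors that hold exactly one element. The sketch below (variable names such as `vector_values` are illustrative, not from the notebook) shows `tolist()` as one possible way to convert larger tensors to plain Python values.

```python
import torch

scalar = torch.tensor(7)
vector = torch.tensor([7, 7])

# item() returns a plain Python number for a single-element tensor
scalar_value = scalar.item()
print(type(scalar_value), scalar_value)

# For tensors with more than one element, item() raises an error
# (the exception type has varied across PyTorch versions, so catch both)
try:
    vector.item()
    item_failed = False
except (RuntimeError, ValueError) as error:
    item_failed = True
    print(f"item() failed: {error}")

# tolist() converts a tensor of any shape into nested Python lists
vector_values = vector.tolist()
print(vector_values)
```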

In [7]:
# Vector
vector = torch.tensor([7, 7])
# Check the number of dimensions of vector
vector.ndim


1

In [8]:
# Check shape of vector
vector.shape

torch.Size([2])

In [9]:
# Matrix
MATRIX = torch.tensor([[7, 8], 
                       [9, 10]])
MATRIX
# Check number of dimensions
#MATRIX.ndim

tensor([[ 7,  8],
        [ 9, 10]])

In [11]:
#MATRIX.shape

In [13]:
# Tensor
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
TENSOR
#TENSOR.ndim
# Check shape of TENSOR
#TENSOR.shape

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

![image.png](attachment:image.png)

In [15]:
# Create a random tensor of size (3, 4)
random_tensor = torch.rand(size=(3, 4))
random_tensor, 
#random_tensor.dtype

(tensor([[0.5608, 0.8214, 0.0660, 0.3444],
         [0.8853, 0.6883, 0.3517, 0.6094],
         [0.4315, 0.8742, 0.1701, 0.0757]]),)

For example, a random tensor in the common image shape of [224, 224, 3] ([height, width, color_channels]) can represent an image.


In [7]:
random_image_size_tensor = torch.rand(size=(224, 224, 3))
random_image_size_tensor, 
#random_image_size_tensor.ndim

(tensor([[[0.7395, 0.9769, 0.9269],
          [0.2547, 0.0650, 0.1380],
          [0.6989, 0.3342, 0.7010],
          ...,
          [0.2595, 0.3053, 0.7220],
          [0.9202, 0.6624, 0.8070],
          [0.7343, 0.3835, 0.6708]],
 
         [[0.9675, 0.6572, 0.2313],
          [0.6191, 0.9749, 0.1855],
          [0.8814, 0.4898, 0.0592],
          ...,
          [0.4718, 0.5347, 0.5764],
          [0.1984, 0.7946, 0.4984],
          [0.1337, 0.0788, 0.8051]],
 
         [[0.9649, 0.6731, 0.6636],
          [0.4646, 0.4863, 0.8434],
          [0.0721, 0.1569, 0.9641],
          ...,
          [0.9877, 0.0229, 0.9622],
          [0.3896, 0.4723, 0.9866],
          [0.4069, 0.7986, 0.6498]],
 
         ...,
 
         [[0.2809, 0.7262, 0.5796],
          [0.1890, 0.0400, 0.5373],
          [0.4468, 0.9347, 0.6918],
          ...,
          [0.9699, 0.3231, 0.5619],
          [0.0098, 0.4598, 0.8293],
          [0.8136, 0.4976, 0.7101]],
 
         [[0.8323, 0.7111, 0.6224],
          [0

In [None]:
# Create a tensor of all zeros and another of all ones and print datatype
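The cell above is left empty in the notebook. One possible way to fill it in, assuming a shape of (3, 4) to match the earlier random tensor (the shape and variable names are illustrative choices), is sketched here:

```python
import torch

# Tensor of all zeros with shape (3, 4)
zeros = torch.zeros(size=(3, 4))

# Tensor of all ones with the same shape
ones = torch.ones(size=(3, 4))

# Both use the default floating-point dtype (torch.float32 unless changed)
print(zeros, zeros.dtype)
print(ones, ones.dtype)
```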

# Creating a range and tensors like
Sometimes you might want a range of numbers, such as 1 to 10 or 0 to 100.

You can use torch.arange(start, end, step) to do so.

Where:

start = start of range (e.g. 0)
end = end of range, not included in the result (e.g. 10)
step = the difference between consecutive values (e.g. 1)
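Because `end` is exclusive, it is easy to end up one value short. A minimal sketch of the three parameters in action (the variable names are illustrative):

```python
import torch

# end is exclusive, so this yields 0 through 9
zero_to_nine = torch.arange(start=0, end=10, step=1)

# To include 10, end has to be greater than 10
zero_to_ten = torch.arange(start=0, end=11, step=1)

# step sets the increment between consecutive values
evens = torch.arange(start=0, end=11, step=2)

print(zero_to_nine)
print(zero_to_ten)
print(evens)
```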

In [8]:
# Use torch.arange(); torch.range() is deprecated
zero_to_ten_deprecated = torch.range(0, 10) # Note: this may return an error in the future

# Create a range of values 0 to 10 (end is exclusive, so use end=11)
#zero_to_ten = torch.arange(start=0, end=11, step=1)
#zero_to_ten
# Create the even values from 0 to 10
zero_2_ten = torch.arange(start=0, end=11, step=2)
zero_2_ten


tensor([ 0,  2,  4,  6,  8, 10])

In [10]:
# torch.rand_like() creates a random tensor with the same shape as its input,
# but the input must be floating point: zero_2_ten is int64 (Long), so this raises an error
rand_like_tensor = torch.rand_like(input=zero_2_ten)
rand_like_tensor




RuntimeError: "check_uniform_bounds" not implemented for 'Long'

In [13]:
# Casting the input to float first makes torch.rand_like() work
rand_like_tensor = torch.rand_like(input=zero_2_ten.float()) # will have same shape
rand_like_tensor




tensor([0.5234, 0.7313, 0.1277, 0.8907, 0.6102, 0.8699])
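For comparison, here is a short sketch of the related `torch.zeros_like()` function, which also accepts integer inputs, alongside an alternative to the `.float()` cast that passes `dtype` directly to `torch.rand_like()`. The variable names are illustrative.

```python
import torch

zero_2_ten = torch.arange(start=0, end=11, step=2)

# zeros_like copies the shape and dtype of its input (here int64),
# so no cast is needed
zeros_like_tensor = torch.zeros_like(input=zero_2_ten)

# rand_like needs a floating-point dtype; passing dtype= overrides
# the input's int64 dtype without a separate cast
rand_like_tensor = torch.rand_like(zero_2_ten, dtype=torch.float32)

print(zeros_like_tensor, zeros_like_tensor.dtype)
print(rand_like_tensor.shape, rand_like_tensor.dtype)
```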

# Tensor datatypes
There are many different tensor datatypes available in PyTorch.

Some are specific for CPU and some are better for GPU.
Generally if you see torch.cuda anywhere, the tensor is being used for GPU (since Nvidia GPUs use a computing toolkit called CUDA).

The most common type (and generally the default) is torch.float32 or torch.float.

This is referred to as "32-bit floating point".

But there's also 16-bit floating point (torch.float16 or torch.half) and 64-bit floating point (torch.float64 or torch.double).

In [28]:
# Default datatype for floating-point tensors is float32
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None, # defaults to None, which infers the dtype from the data (torch.float32 for Python floats)
                               device=None, # defaults to None, which uses the current default device (CPU unless changed)
                               requires_grad=False) # if True, operations performed on the tensor are recorded 

float_32_tensor.shape, float_32_tensor.dtype, float_32_tensor.device

(torch.Size([3]), torch.float32, device(type='cpu'))

In [29]:
float_16_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=torch.float16) # torch.half would also work

float_16_tensor.dtype

torch.float16
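Datatypes can also be changed after a tensor is created. The following sketch uses `Tensor.to()` for the conversion and shows what happens when two dtypes meet in one operation. Variable names other than `float_32_tensor` and `float_16_tensor` are illustrative.

```python
import torch

float_32_tensor = torch.tensor([3.0, 6.0, 9.0])
int_tensor = torch.tensor([3, 6, 9])  # Python integers are stored as torch.int64

# .to() returns a new tensor; the original keeps its dtype
float_16_tensor = float_32_tensor.to(torch.float16)
float_64_tensor = float_32_tensor.to(torch.float64)

# Mixing dtypes in one operation promotes the result to the wider type
product = float_16_tensor * float_32_tensor

print(int_tensor.dtype, float_16_tensor.dtype, float_64_tensor.dtype)
print(product.dtype)
```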

# Manipulating tensors (tensor operations)
These operations are often a wonderful dance between:
Addition
Subtraction
Multiplication (element-wise)
Division
Matrix multiplication

In [30]:
# Create a tensor of values and add a number to it
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [31]:
# Multiply it by 10
tensor * 10

tensor([10, 20, 30])

In [None]:
# Tensors don't change unless reassigned
tensor

In [None]:
# Subtract and reassign
tensor = tensor - 10
tensor

In [None]:
# Add and reassign
tensor = tensor + 10
tensor

In [None]:
# Can also use torch functions
torch.multiply(tensor, 10)

In [None]:
# Original tensor is still unchanged 
tensor

In [32]:
# Element-wise multiplication (each element multiplies its equivalent, index 0->0, 1->1, 2->2)
print(tensor, "*", tensor)
print("Equals:", tensor * tensor)

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


# Matrix multiplication (is all you need)
The inner dimensions must match:
(3, 2) * (3, 2) won't work
(2, 3) * (3, 2) will work
(3, 2) * (2, 3) will work
The resulting matrix has the shape of the outer dimensions:
(2, 3) * (3, 2) -> (2, 2)
(3, 2) * (2, 3) -> (3, 3)
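The two rules above can be checked directly. This is a minimal sketch using random tensors (the names `a`, `b` and `mismatch_raised` are illustrative) and the `@` operator as shorthand for `torch.matmul`:

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

# Inner dimensions match (3 and 3): result takes the outer dimensions (2, 2)
print((a @ b).shape)

# Inner dimensions match (2 and 2): result takes the outer dimensions (3, 3)
print((b @ a).shape)

# (3, 2) @ (3, 2): inner dimensions are 2 and 3, so this raises an error
try:
    torch.matmul(b, b)
    mismatch_raised = False
except RuntimeError as error:
    mismatch_raised = True
    print(f"Shape error: {error}")
```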

In [15]:
import torch
tensor = torch.tensor([1, 2, 3])
tensor.shape

torch.Size([3])

The difference between element-wise multiplication and matrix multiplication is the addition of values.

For our tensor variable with values [1, 2, 3]:
![image.png](attachment:image.png)
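Written out for `[1, 2, 3]`, the difference can be sketched as follows: element-wise multiplication keeps the three products separate, while matrix multiplication also sums them. The names `elementwise` and `dot` are illustrative.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# Element-wise: [1*1, 2*2, 3*3]
elementwise = tensor * tensor

# Matrix multiplication of two 1-D tensors: 1*1 + 2*2 + 3*3
dot = torch.matmul(tensor, tensor)

print(elementwise)  # tensor([1, 4, 9])
print(dot)          # tensor(14)
```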

In [None]:
# Element-wise matrix multiplication
tensor * tensor

In [None]:
# Matrix multiplication
torch.matmul(tensor, tensor)
# Can also use the "@" symbol for matrix multiplication, though not recommended
#tensor @ tensor

# %%time
In IPython and Jupyter notebooks, %%time is a cell magic command that measures the execution time of a whole cell. It must be the first line of the cell. It reports two types of time:

Wall time: This is the real-world time it takes for the code to run from start to finish, also known as elapsed time.
CPU time: The time the CPU actually spends processing the code (which excludes time spent waiting for I/O operations or time spent in other processes).

In [16]:
%%time
# Matrix multiplication by hand 
# (avoid doing operations with for loops at all cost, they are computationally expensive)
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
value

CPU times: total: 0 ns
Wall time: 5.36 ms


tensor(14)

In [34]:
%%time
torch.matmul(tensor, tensor)

CPU times: total: 0 ns
Wall time: 7.73 ms


tensor(14)
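With only three elements, both timings above are dominated by overhead, so they say little about relative speed. The sketch below repeats the comparison on a larger tensor and uses `time.perf_counter()` so it also runs outside a notebook. The tensor size of 1000 and the variable names are assumptions chosen for illustration, and the measured times will vary by machine.

```python
import time
import torch

torch.manual_seed(0)
big_tensor = torch.rand(1000)

# Multiply and accumulate one element at a time in a Python loop
start = time.perf_counter()
loop_value = torch.tensor(0.0)
for i in range(len(big_tensor)):
    loop_value += big_tensor[i] * big_tensor[i]
loop_seconds = time.perf_counter() - start

# Same computation as a single torch.matmul call
start = time.perf_counter()
matmul_value = torch.matmul(big_tensor, big_tensor)
matmul_seconds = time.perf_counter() - start

print(f"Loop:   {loop_value.item():.4f} in {loop_seconds:.6f} s")
print(f"matmul: {matmul_value.item():.4f} in {matmul_seconds:.6f} s")
```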

# One of the most common errors in deep learning (shape errors)

In [28]:
# Shapes need to line up: the inner dimensions must match
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11], 
                         [9, 12]], dtype=torch.float32)
# Both tensors have shape (3, 2), so torch.matmul(tensor_A, tensor_B) would error
# Transposing tensor_B gives shape (2, 3), which makes the inner dimensions match
tensor_B_Transpose = tensor_B.T

torch.matmul(tensor_A, tensor_B_Transpose) # works: (3, 2) @ (2, 3) -> (3, 3)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

# Transpose

In [37]:
# View tensor_A and tensor_B
print(tensor_A)
print(tensor_B)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7., 10.],
        [ 8., 11.],
        [ 9., 12.]])


In [38]:
# View tensor_A and tensor_B.T
print(tensor_A)
print(tensor_B.T)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])


In [39]:
# The operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} * {tensor_B.T.shape} <- inner dimensions match\n")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output) 
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) * torch.Size([2, 3]) <- inner dimensions match

Output:

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

Output shape: torch.Size([3, 3])


In [40]:
# torch.mm does the same multiplication for 2-D matrices (unlike matmul, it does not broadcast)
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

# Neural networks are full of matrix multiplications and dot products.
The torch.nn.Linear() module (we'll see this in action later on), also known as a feed-forward layer or fully connected layer, implements a matrix multiplication between an input x and a weights matrix A.
![image.png](attachment:image.png)
x is the input to the layer (deep learning is a stack of layers like torch.nn.Linear() and others on top of each other).
A is the weights matrix created by the layer. It starts out as random numbers that get adjusted as a neural network learns to better represent patterns in the data (notice the "T", that's because the weights matrix gets transposed).
Note: You might also often see W or another letter like X used to showcase the weights matrix.
b is the bias term used to slightly offset the weights and inputs.
y is the output (a manipulation of the input in the hope of discovering patterns in it).
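To make the roles of x, A, b and y concrete, the sketch below reproduces the layer's output by hand with `torch.matmul`, using the layer's `weight` and `bias` attributes. The input values copy `tensor_A` from earlier, and the names `layer_output` and `manual_output` are illustrative.

```python
import torch

torch.manual_seed(42)
linear = torch.nn.Linear(in_features=2, out_features=6)

# Same values as tensor_A: 3 samples with 2 features each
x = torch.tensor([[1., 2.],
                  [3., 4.],
                  [5., 6.]])

# Output computed by the layer
layer_output = linear(x)

# Same computation by hand: y = x @ A.T + b
# linear.weight has shape (out_features, in_features) = (6, 2),
# so it is transposed to (2, 6) before multiplying
manual_output = torch.matmul(x, linear.weight.T) + linear.bias

print(linear.weight.shape, linear.bias.shape)
print(layer_output.shape)
print(torch.allclose(layer_output, manual_output, atol=1e-5))
```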

In [42]:
# Since the linear layer starts with a random weights matrix, let's make it reproducible (more on this later)
torch.manual_seed(42)
# This uses matrix multiplication
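# in_features must match the last dimension of the input (2 for tensor_A);
# a mismatched value such as in_features=3 produces the shape error shown below
linear = torch.nn.Linear(in_features=2,
                         out_features=6) # out_features = size of the last dimension of the output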
x = tensor_A
output = linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x6)