In [1]:
import torch
torch.__version__

'2.0.0+cu118'

# Creating tensors
A scalar is a single number; in tensor terms, it is a zero-dimensional tensor.



In [2]:
#scalar 
scalar  = torch.tensor(3)
scalar.ndim

0

In [3]:
# Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [4]:
#retrieve the number from the tensor
scalar.item()

7

# Vectors
A vector is a one-dimensional tensor, but it can contain many numbers.
For example, [3, 2] could describe [bedrooms, bathrooms] in your house, and [3, 2, 2] could describe [bedrooms, bathrooms, car_parks].

In [5]:
#vector 
vector = torch.tensor([7,7])
# size is a method, so it has to be called; it returns the same value as .shape
vector.size(), vector.dtype, vector.ndim, vector.shape

(torch.Size([2]), torch.int64, 1, torch.Size([2]))

In [7]:
# let's create a matrix from two row lists
a = [7, 8]
b = [9, 10]
matrix = torch.tensor([a, b])
# or create it directly from a nested list
matrix1 = torch.tensor([[7, 8], [9, 10]])
matrix, matrix1

(tensor([[ 7,  8],
         [ 9, 10]]),
 tensor([[ 7,  8],
         [ 9, 10]]))

In [9]:
matrix.shape

torch.Size([2, 2])

In [8]:
# size is a method, so it has to be called; it returns the same value as .shape
matrix.shape, matrix.size(), matrix.dtype, matrix.ndim

(torch.Size([2, 2]), torch.Size([2, 2]), torch.int64, 2)

## Creating tensors

In [14]:
#tensor
tensor = torch.tensor([[[1,2,3], [4,5,6],[7,8,9]]])
tensor

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [15]:
# let's look at the attributes of the tensor (size is a method, so it has to be called)
tensor.shape, tensor.ndim, tensor.size(), type(tensor)

(torch.Size([1, 3, 3]), 3, torch.Size([1, 3, 3]), torch.Tensor)
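
To tie the scalar, vector, matrix and tensor cells together, here is a minimal sketch that prints `ndim` and `shape` side by side. The dictionary name `examples` is only an illustration; the values are the ones used in the cells above.

```python
import torch

# Illustrative tensors of increasing dimensionality (same values as the cells above).
examples = {
    "scalar": torch.tensor(7),
    "vector": torch.tensor([7, 7]),
    "matrix": torch.tensor([[7, 8], [9, 10]]),
    "tensor": torch.tensor([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]),
}

for name, t in examples.items():
    # ndim counts the axes; shape lists the length of each axis.
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
```

A handy rule of thumb: `ndim` equals the number of opening square brackets at the start of the printed tensor.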

Machine learning models such as neural networks manipulate tensors and seek patterns within them.
As data scientists, we can define how a machine learning model starts (initialization), looks at data (representation), and updates (optimization) its random numbers.

## Creating random tensors

In [16]:
#creating random tensors
random_tensor = torch.rand(size=(3,4))
random_tensor, random_tensor.shape, random_tensor.dtype

(tensor([[0.8629, 0.5938, 0.2233, 0.5777],
         [0.0574, 0.2485, 0.5512, 0.7176],
         [0.8925, 0.6530, 0.3150, 0.0309]]),
 torch.Size([3, 4]),
 torch.float32)

The beauty of torch.rand() is that we can adjust the size to be whatever we want.

In [17]:
random_image_size_tensor = torch.rand(size=(224,224,3))
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)
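
As a quick sanity check on what `torch.rand()` produces, the sketch below confirms that the values lie in the interval [0, 1) and counts the elements. Reading the shape as (height, width, colour channels) is an assumption based on the variable name above, and `image_like` is a name made up for this example.

```python
import torch

# Assumed layout for illustration: (height, width, colour_channels).
image_like = torch.rand(size=(224, 224, 3))

# torch.rand samples uniformly from the interval [0, 1).
in_range = bool(image_like.min() >= 0) and bool(image_like.max() < 1)
print(in_range)

# numel() is the total number of elements: 224 * 224 * 3.
print(image_like.numel())
```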

## Zeros and ones tensors
Sometimes you'll just want to fill tensors with zeros or ones.


In [18]:
#create a tensor of all zeros
zeros = torch.zeros(size=(3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [19]:
#create a tensor of all ones
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
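
One place where zeros and ones are useful is masking. The sketch below is only an illustration of that idea: multiplying element-wise by zeros wipes the values out, while multiplying by ones leaves them unchanged. The names `values`, `masked` and `unchanged` are made up for this example.

```python
import torch

values = torch.rand(size=(3, 4))
zeros = torch.zeros(size=(3, 4))
ones = torch.ones(size=(3, 4))

# Element-wise multiplication by zero masks every value out.
masked = values * zeros
# Element-wise multiplication by one keeps every value as it was.
unchanged = values * ones

print(torch.equal(masked, zeros))
print(torch.equal(unchanged, values))
```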

Sometimes you want to create a range of numbers, such as 1 to 20. For this you can use the torch.arange(start, end, step) function. Note that end is exclusive.

In [20]:
#create a range of values between 0 and 10
zero_to_ten = torch.arange(0,10,1)
zero_to_ten

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [22]:
# create a range of values from 0 up to (but not including) 10, in steps of 2
zero_to_ten = torch.arange(start=0,end=10,step=2)
zero_to_ten

tensor([0, 2, 4, 6, 8])
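
Because `end` is exclusive, getting the numbers 1 to 20 mentioned above means passing `end=21`. A minimal sketch (the variable name `one_to_twenty` is just for illustration):

```python
import torch

# end is exclusive, so end=21 is needed for the last value to be 20.
one_to_twenty = torch.arange(start=1, end=21, step=1)

first = one_to_twenty[0].item()
last = one_to_twenty[-1].item()
print(first, last, len(one_to_twenty))
```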

In [23]:
# Default datatype for floating-point tensors is float32
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None, # defaults to None, which infers the dtype from the data (torch.float32 for Python floats)
                               device=None, # defaults to None, which uses the current default device (CPU unless changed)
                               requires_grad=False) # if True, operations performed on the tensor are recorded for gradient computation

float_32_tensor.shape, float_32_tensor.dtype, float_32_tensor.device

(torch.Size([3]), torch.float32, device(type='cpu'))

In [24]:
float_16_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=torch.float16) # torch.half would also work

float_16_tensor.dtype

torch.float16
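
The dtype is inferred from the data when none is given, and an existing tensor can be converted afterwards. The sketch below illustrates both, along with what happens when two dtypes meet in one operation. The variable names are made up for this example.

```python
import torch

int_tensor = torch.tensor([3, 6, 9])          # Python ints are stored as int64
float_tensor = torch.tensor([3.0, 6.0, 9.0])  # Python floats are stored as float32

# Convert an existing tensor to another dtype with .to() (or .type()).
half_tensor = float_tensor.to(torch.float16)

# Mixing dtypes in one operation promotes the result; here int64 * float32 -> float32.
product = int_tensor * float_tensor

print(int_tensor.dtype, float_tensor.dtype, half_tensor.dtype, product.dtype)
```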

## Getting information from tensors
Once you create a tensor, you might want to get some information from it, such as:
- shape: what shape is the tensor?
- dtype: what datatype are the elements within the tensor stored in?
- device: what device is the tensor stored on? (usually GPU or CPU)

In [25]:
some_tensor = torch.rand(3,4)
some_tensor, some_tensor.shape, some_tensor.dtype, some_tensor.device

(tensor([[0.9696, 0.0815, 0.6653, 0.7652],
         [0.6458, 0.6294, 0.4224, 0.8588],
         [0.2069, 0.9628, 0.5189, 0.7421]]),
 torch.Size([3, 4]),
 torch.float32,
 device(type='cpu'))
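
Since these three attributes are checked again and again while debugging, it can be convenient to wrap them in a small helper. `describe` below is a hypothetical helper written for this notebook, not a PyTorch function.

```python
import torch

def describe(t: torch.Tensor) -> str:
    """Summarise the three attributes that most often cause errors."""
    return f"shape={tuple(t.shape)}, dtype={t.dtype}, device={t.device}"

some_tensor = torch.rand(3, 4)
summary = describe(some_tensor)
print(summary)
```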

## Manipulating Tensors
In deep learning, data (images, text, video, audio, protein structures, etc.) gets represented as tensors.
A model learns by investigating those tensors and performing a series of operations on them to create a representation of the patterns in the input data. These operations are often a combination of:
- Addition
- Subtraction
- Multiplication (element-wise)
- Division
- Matrix multiplication

## Basic Operations on tensors

In [26]:
tensor = torch.tensor([1,2,3])
tensor + 5

tensor([6, 7, 8])

In [27]:
#multiply it by 5
tensor * 5

tensor([ 5, 10, 15])

In [28]:
tensor - 10

tensor([-9, -8, -7])

In [29]:
tensor

tensor([1, 2, 3])

In [30]:
# add and reassign
tensor = tensor + 10
tensor

tensor([11, 12, 13])

In [31]:
tensor

tensor([11, 12, 13])

PyTorch also has built-in functions like torch.mul() and torch.add() to perform basic operations.

In [32]:
torch.mul(tensor, 10)

tensor([110, 120, 130])

In [33]:
print("Equals: ", tensor * tensor)

Equals:  tensor([121, 144, 169])
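
The function forms and the operator forms should give identical results. The sketch below checks this for the four basic operations, using a fresh `tensor` so that it does not depend on the reassignment above.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# Each built-in function mirrors a Python operator.
same_add = torch.equal(torch.add(tensor, 10), tensor + 10)
same_sub = torch.equal(torch.sub(tensor, 10), tensor - 10)
same_mul = torch.equal(torch.mul(tensor, 10), tensor * 10)
# Division of an integer tensor returns a float tensor in both forms.
same_div = torch.equal(torch.div(tensor, 2), tensor / 2)

print(same_add, same_sub, same_mul, same_div)
```

The operator forms are usually easier to read, so they tend to be the more common choice.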


## Matrix multiplication (is all you need)
The main two rules for matrix multiplication to remember are:

1. The inner dimensions must match:
- (3, 2) @ (3, 2) won't work
- (2, 3) @ (3, 2) will work

2. The resulting matrix has the shape of the outer dimensions:
- (2, 3) @ (3, 2) -> (2, 2)
- (3, 2) @ (2, 3) -> (3, 3) 
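
The two rules can be sketched as a small shape check for 2-D matrices. `matmul_output_shape` is a hypothetical helper written for this notebook (it is not part of PyTorch), and its result is compared against what `torch.matmul()` actually returns.

```python
import torch

def matmul_output_shape(shape_a, shape_b):
    """Apply the two rules to a pair of 2-D shapes."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    # Rule 1: the inner dimensions must match.
    if cols_a != rows_b:
        raise ValueError(f"inner dimensions differ: {cols_a} vs {rows_b}")
    # Rule 2: the result takes the outer dimensions.
    return (rows_a, cols_b)

predicted = matmul_output_shape((2, 3), (3, 2))
actual = tuple(torch.matmul(torch.rand(2, 3), torch.rand(3, 2)).shape)
print(predicted, actual)
```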

In [34]:
tensor = torch.tensor([1,2,3])
tensor.shape

torch.Size([3])

In [37]:
# Element-wise multiplication (not matrix multiplication)
tensor * tensor # [1, 2, 3] * [1, 2, 3] = [1*1, 2*2, 3*3] = [1, 4, 9]

tensor([1, 4, 9])

In [38]:
# Matrix multiplication (for two 1-D tensors this is the dot product: 1*1 + 2*2 + 3*3 = 14)
torch.matmul(tensor, tensor)

tensor(14)

In [39]:
# you can also use @ to perform matrix multiplication
tensor @ tensor

tensor(14)
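
To see where the 14 comes from, here is a sketch that computes the same dot product by hand with a Python loop. This is only for illustration; the built-in `torch.matmul()` is the one to use in practice, since it is vectorized and much faster on large tensors.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# Multiply matching elements and add up the products: 1*1 + 2*2 + 3*3.
total = 0
for i in range(len(tensor)):
    total += tensor[i] * tensor[i]

print(total)
print(torch.matmul(tensor, tensor))
```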

In [41]:
tensor_A = torch.tensor([[1,2], [3,4],[5,6]], dtype=torch.float32)
tensor_B = torch.tensor([[7,10],[8,11],[9,12]], dtype=torch.float32)

torch.matmul(tensor_A, tensor_B)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

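From the above we see that we cannot perform matrix multiplication on tensors whose inner dimensions do not match. Both tensors have shape (3, 2), so the 2 columns of tensor_A do not line up with the 3 rows of tensor_B.

To resolve this, we have to transpose tensor_B.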

In [42]:
tensor_A.shape, tensor_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

Let's transpose B by using tensor_B.T

In [43]:
tensor_B.T.shape

torch.Size([2, 3])

In [44]:
# The operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} * {tensor_B.T.shape} <- inner dimensions match\n")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output) 
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) * torch.Size([2, 3]) <- inner dimensions match

Output:

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

Output shape: torch.Size([3, 3])


In [45]:
# torch.mm also performs matrix multiplication, but only for 2-D tensors (no broadcasting)
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])


Neural networks are full of matrix multiplications and dot products.

The torch.nn.Linear() module (we'll see this in action later on), also known as a feed-forward layer or fully connected layer, implements a matrix multiplication between an input x and a weights matrix A.

$$
y = x \cdot A^T + b
$$

Where:

- x is the input to the layer (deep learning is a stack of layers like torch.nn.Linear() and others on top of each other).
- A is the weights matrix created by the layer. It starts out as random numbers that get adjusted as the neural network learns to better represent patterns in the data (notice the "T": the weights matrix gets transposed).
  - Note: You might also often see W or another letter like X used to denote the weights matrix.
- b is the bias term used to slightly offset the weights and inputs.
- y is the output (a manipulation of the input in the hopes to discover patterns in it).
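
A minimal sketch of that formula in action: the output of a `torch.nn.Linear()` layer is compared with the same computation written out by hand from the layer's `weight` and `bias` attributes. The layer sizes (2 input features, 6 output features) and the seed are arbitrary choices for this example.

```python
import torch

torch.manual_seed(42)  # arbitrary seed so the random weights are repeatable

# The layer creates A with shape (out_features, in_features) and b with shape (out_features,).
linear = torch.nn.Linear(in_features=2, out_features=6)

x = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])  # shape (3, 2)

y_layer = linear(x)                           # what the layer computes
y_manual = x @ linear.weight.T + linear.bias  # y = x . A^T + b written by hand

print(y_layer.shape)
print(torch.allclose(y_layer, y_manual))
```

The transpose is needed because the weights are stored as (6, 2), so the inner dimensions only match as (3, 2) @ (2, 6).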

## Finding the min, max, mean, sum etc of a tensor

In [46]:
#create a tensor
x = torch.arange(0,100,10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [50]:
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
print(f"Mean: {x.type(torch.float32).mean()}") # mean() needs a float dtype and x is int64, so convert first
print(f"Sum: {x.sum()}")

Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450


In [51]:
torch.min(x), torch.max(x), torch.sum(x), torch.mean(x.type(torch.float32))

(tensor(0), tensor(90), tensor(450), tensor(45.))
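
The conversion to `torch.float32` before calling `mean()` is there because, at least in the PyTorch version used in this notebook, `mean()` raises an error on integer tensors. The sketch below shows the failure and the fix; the flag `failed` is only there to make the behaviour visible.

```python
import torch

x = torch.arange(0, 100, 10)  # dtype is int64

failed = False
try:
    x.mean()  # integer tensors are not accepted by mean()
except RuntimeError as err:
    failed = True
    print(f"mean() failed: {err}")

# Converting to a floating-point dtype first makes it work.
mean_value = x.type(torch.float32).mean()
print(mean_value)
```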

In [56]:
# Positional min/max
# we can find the index of a tensor where the max or minimum occurs with torch.argmax() or torch.argmin() respectively
tensor = torch.arange(10,100,10)
tensor, tensor.argmax(), tensor.argmin()

(tensor([10, 20, 30, 40, 50, 60, 70, 80, 90]), tensor(8), tensor(0))
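
The index returned by `argmax()` or `argmin()` can be used to look the value itself back up. A minimal sketch:

```python
import torch

tensor = torch.arange(10, 100, 10)

max_index = tensor.argmax()
min_index = tensor.argmin()

# Indexing with the returned position gives back the max and min values.
max_value = tensor[max_index].item()
min_value = tensor[min_index].item()
print(max_index.item(), max_value)
print(min_index.item(), min_value)
```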

In [57]:
# Create a tensor
import torch
x = torch.arange(1., 8.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7.]), torch.Size([7]))

In [None]:
# Add an extra dimension
x_reshaped = x.reshape(1, 7)
x_reshaped, x_reshaped.shape

In [59]:
# Change view (keeps same data as original but changes view)
# See more: https://stackoverflow.com/a/54507446/7900723
z = x.view(1, 7)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

In [60]:
# Changing z changes x
z[:, 0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7.]]), tensor([5., 2., 3., 4., 5., 6., 7.]))
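
If you want the new shape without this shared-memory behaviour, one option is to copy the data first with `clone()`. The sketch below contrasts the two; the name `c` is made up for this example.

```python
import torch

x = torch.arange(1., 8.)

z = x.view(1, 7)          # a view: shares memory with x
c = x.clone().view(1, 7)  # a copy: has its own memory

# A view points at the same underlying storage as the original.
shares_memory = z.data_ptr() == x.data_ptr()
print(shares_memory)

z[:, 0] = 5  # writing through the view also changes x, but not the copy
print(x[0].item(), c[0, 0].item())
```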