# Introduction to Torch Tensors

- To do machine learning, real-world objects must be converted into numbers.
- These numbers are usually grouped into structures, so we need to store them together in a particular order.

## Tensors
A tensor is a multidimensional array.
- It is the basic building block of most deep learning frameworks.

Tensors can have any number of dimensions, from 0 (a scalar) upward.

In [None]:
import torch

# scalars
t = torch.tensor(3)
t, t.shape

In [None]:
# 1D tensor

t = torch.tensor([2, 3, 4])
t, t.shape

In [None]:
t = torch.tensor([2, 3, 4, 5, 6, 7, 8])
t, t.shape

In [None]:
# 2D tensor

t = torch.tensor(
    [
        [2, 3, 4], 
        [6, 7, 8]
    ])
t, t.shape

In [None]:
# Higher dimensional tensors
t = torch.randn((3, 4, 5))
t, t.shape

In [None]:
# Higher dimensional tensors
t = torch.randn((2, 3, 4, 5))
t, t.shape

Tensors are homogeneous: every element has the same data type. This allows a very compact representation in memory and very fast operations.
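
As a rough illustration of the compact representation, the memory used by the elements is just the number of elements times the size of one element. This is a minimal sketch; the example values and variable names are illustrative only.

```python
import torch

# Every element of a tensor has the same fixed-size dtype.
t = torch.tensor([[2, 3, 4], [6, 7, 8]], dtype=torch.int64)

bytes_per_element = t.element_size()          # size in bytes of one int64 element
total_bytes = t.numel() * bytes_per_element   # 6 elements in total

# Converting to a smaller dtype shrinks the footprint accordingly.
t_small = t.to(torch.int8)
small_bytes = t_small.numel() * t_small.element_size()

print(bytes_per_element, total_bytes, small_bytes)  # 8 48 6
```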

## Indexing tensors

In [None]:
t = torch.tensor([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10]
    ])
t.shape

In [None]:
t[0]

In [None]:
t[1]

In [None]:
t[:,0]

In [None]:
t[:,1]

In [None]:
t[:,1:3]

In [None]:
t[1,::2]

You can declare the data type (`dtype`) of a tensor explicitly. If you don't, Torch infers it from the values.

In [None]:
torch.tensor([2, 3, 4]).dtype

In [None]:
torch.tensor([2.5, 3, 4]).dtype

In [None]:
torch.tensor([2, 3, 4], dtype=torch.float16)

In [None]:
# you can convert the tensor type using the to() method
torch.tensor([2, 3, 4]).to(torch.double)

There are many methods for operating on tensors. Some examples:

In [None]:
a = torch.arange(0, 12)
a

In [None]:
a.view((6, 2))

In [None]:
a.view((2, 6))

In [None]:
t = a.view((-1, 4))
t

In [None]:
t.transpose(0, 1)

In [None]:
display(t)
display(t.unsqueeze(0))
t.shape, t.unsqueeze(0).shape

In [None]:
display(t)
display(t.unsqueeze(1))
t.shape, t.unsqueeze(1).shape

In [None]:
display(t)
display(t.unsqueeze(2))
t.shape, t.unsqueeze(2).shape

In [None]:
display(t)
display(t.unsqueeze(0))
display(t.unsqueeze(0).squeeze(0))

Most of these operations return a view over the same underlying data instead of copying it, so they are very fast.
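
One way to check this is to compare where the data lives and how it is stepped through. The sketch below uses a fresh tensor with illustrative variable names: `view` and `transpose` only change the shape and strides, and a write through the view shows up in the original.

```python
import torch

a = torch.arange(0, 12)
v = a.view((3, 4))        # reshaped view, no copy
tr = v.transpose(0, 1)    # transposed view, no copy

# All three tensors start at the same memory address.
same_storage = a.data_ptr() == v.data_ptr() == tr.data_ptr()

# Only the strides (steps between elements along each dimension) differ.
v_stride = v.stride()     # (4, 1)
tr_stride = tr.stride()   # (1, 4)

# Writing through the view changes the original tensor as well.
v[0, 0] = 100
first_value = a[0].item()

print(same_storage, v_stride, tr_stride, first_value)
```

Because the transposed tensor is no longer laid out contiguously, `tr.is_contiguous()` returns `False`; calling `.contiguous()` makes an actual copy when one is needed.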

# Real-world data representation using tensors

## Tabular data

In [None]:
! head data/heart.csv

In [None]:
import pandas as pd

# Load the CSV file into a DataFrame
df = pd.read_csv('data/heart.csv')

df = pd.get_dummies(df).astype(float)

# Convert the DataFrame to a PyTorch tensor
data_tensor = torch.tensor(df.values, dtype=torch.float32)

data_tensor.shape, data_tensor
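
To see what `pd.get_dummies` does before the conversion, here is a small sketch on a made-up table. The column names and values are hypothetical and are not taken from `data/heart.csv`. Numeric columns are kept as they are, and each text column is expanded into one 0/1 column per category.

```python
import pandas as pd
import torch

# Hypothetical toy table: one numeric column and one text column.
toy = pd.DataFrame({
    "age": [63, 41, 57],
    "chest_pain": ["typical", "atypical", "typical"],
})

# One-hot encode the text column, then make everything float.
encoded = pd.get_dummies(toy).astype(float)

# Convert the all-numeric DataFrame to a tensor of shape (rows, columns).
toy_tensor = torch.tensor(encoded.values, dtype=torch.float32)

print(list(encoded.columns))
print(toy_tensor.shape)
```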


## Images

In [None]:
import cv2
import matplotlib.pyplot as plt

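# OpenCV loads images in BGR order; convert to RGB so matplotlib shows the right colors
image = cv2.cvtColor(cv2.imread('images/noidea.jpg'), cv2.COLOR_BGR2RGB)
plt.imshow(image)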

In [None]:
image.shape

In [None]:
# Select a single channel (index 0 on the last axis); the result is a 2D array
channel_0 = image[:,:,0]
channel_0.shape

In [None]:
plt.imshow(channel_0, cmap='gray')

In [None]:
plt.imshow(image[:,:,1], cmap='gray')

In [None]:
plt.imshow(image[:,:,2], cmap='gray')

In PyTorch, the convention is to store the channels first, then the rows, and then the columns: (C, H, W).

In [None]:
img = torch.from_numpy(image)
out = img.permute(2, 0, 1)
out.shape

Note that this new representation cannot be visualized directly, because matplotlib expects the channels in the last dimension.

In [None]:
plt.imshow(out.numpy())
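
To plot a channels-first tensor, the channels have to be moved back to the last axis first. This is a minimal sketch that uses a small random stand-in tensor (not the image loaded above) so it runs on its own.

```python
import torch

# Stand-in for an image in channels-first layout (C, H, W); values are random.
chw = torch.randint(0, 256, (3, 4, 6), dtype=torch.uint8)

# Move the channels back to the last axis: (C, H, W) -> (H, W, C).
hwc = chw.permute(1, 2, 0)

print(hwc.shape)  # torch.Size([4, 6, 3])
# hwc.numpy() now has the layout that plt.imshow expects for color images.
```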

What if I want to store a batch of images?
- Add another dimension, at the front, that indexes the images in the batch.

In [None]:
batch = out.unsqueeze(0)
batch.shape
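
`unsqueeze(0)` builds a batch with a single image. With several images of the same size, `torch.stack` creates the new batch dimension directly. The sketch below uses constant-valued stand-in tensors in place of real images.

```python
import torch

# Three stand-in images, each in (C, H, W) layout with identical shapes.
images = [
    torch.zeros(3, 4, 6),
    torch.ones(3, 4, 6),
    torch.full((3, 4, 6), 2.0),
]

# Stack along a new leading dimension: the result is (N, C, H, W).
stacked = torch.stack(images, dim=0)

print(stacked.shape)  # torch.Size([3, 3, 4, 6])
```

`torch.stack` requires all images to have the same shape, so images of different sizes would have to be resized or padded first.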

## Storing videos in tensors

In [None]:
def read_video_frames(video_path, k):
    # Open the video file
    cap = cv2.VideoCapture(video_path)
    
    # Check if the video opened successfully
    if not cap.isOpened():
        raise IOError("Could not open video file")
    
    frames = []
    frame_count = 0
    
    while frame_count < k:
        ret, frame = cap.read()
        if not ret:
            break  # Break if there are no more frames to read
        
        # Convert frame from BGR (OpenCV format) to RGB
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        
        # Append the frame to the list
        frames.append(frame)
        frame_count += 1
    
    # Release the video capture object
    cap.release()
    
    # Stack the frames into a 4D tensor (N, H, W, C), cast to float, and permute to (N, C, H, W)
    frames_tensor = torch.stack([torch.from_numpy(f) for f in frames]).float().permute(0, 3, 1, 2)
    
    return frames_tensor

# Example usage
video_path = 'images/funny.mp4'
k = 10
frames_tensor = read_video_frames(video_path, k)


In [None]:
frames_tensor.shape

You need to master Torch tensor transformations. Some examples:

In [None]:
# Normalize all the values in the images (first, look at the raw values of one channel)
frames_tensor[0,1]

In [None]:
normalized = frames_tensor / 255.0
normalized[0,1]

In [None]:
# transform all the frames to grayscale
frames_tensor.shape

In [None]:
frames_tensor.mean(dim=1).shape

In [None]:
frames_tensor.mean(dim=1).unsqueeze(1).shape
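
The `mean` followed by `unsqueeze` can also be written in one step with `keepdim=True`, which keeps the reduced dimension with size 1. This is a sketch on a random stand-in tensor with the same (N, C, H, W) layout, not on the video frames themselves.

```python
import torch

# Stand-in for a batch of frames in (N, C, H, W) layout.
frames = torch.rand(10, 3, 4, 6)

# Average over the channel dimension, then restore it with size 1.
gray_a = frames.mean(dim=1).unsqueeze(1)

# Same result in one call: keepdim=True keeps the channel dimension.
gray_b = frames.mean(dim=1, keepdim=True)

print(gray_a.shape, gray_b.shape)
```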

In [None]:
# In the normalized tensor, convert all the values below a threshold to zero
normalized[0,1]

In [None]:
threshold = 0.3
normalized[normalized < threshold] = 0
normalized[0,1]
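
The masked assignment above modifies `normalized` in place. If the original values are still needed, `torch.where` builds a new tensor instead. This is a minimal sketch with made-up values.

```python
import torch

values = torch.tensor([0.10, 0.50, 0.25, 0.90])
threshold = 0.3

# Where the condition holds take 0, otherwise keep the original value.
clipped = torch.where(values < threshold, torch.zeros_like(values), values)

print(values)   # unchanged
print(clipped)  # entries below the threshold replaced by zero
```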

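**Note:** Conventions are really important. Always know which dimension order (for example, channels first or channels last) your data and your libraries expect.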