# Torch Fundamentals

## Table of Contents

[Torch Fundamentals](#torch-fundamentals)

- [Torch Primitive Comps](#torch-primitive-comps)

- [Tensor Reshaping](#tensor-reshaping)

- [Torch Datasets](#torch-datasets)

- [Torch Neural Layers](#torch-neural-layers)

In [2]:
import torch

## Torch Primitive Comps

In [3]:
a: torch.Tensor = torch.tensor([[2, 3], [4, 5]], dtype=torch.int32)
b: torch.Tensor = torch.tensor([[7, 8], [9, 10]], dtype=torch.int32)

c: torch.Tensor = torch.tensor([[2], [3]], dtype=torch.int32)

torch.add(a, b)

tensor([[ 9, 11],
        [13, 15]], dtype=torch.int32)

In [4]:
torch.mul(a, b)

tensor([[14, 24],
        [36, 50]], dtype=torch.int32)

In [5]:
torch.matmul(a, c)

tensor([[13],
        [23]], dtype=torch.int32)

In [6]:
torch.add(a, 5)

tensor([[ 7,  8],
        [ 9, 10]], dtype=torch.int32)

In [7]:
torch.add(a, c)

tensor([[4, 5],
        [7, 8]], dtype=torch.int32)

In [8]:
torch.mul(a, c)

tensor([[ 4,  6],
        [12, 15]], dtype=torch.int32)
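
The last two cells rely on broadcasting: `c` has shape `[2, 1]`, so its single column is repeated to match the `[2, 2]` shape of `a`. The sketch below makes that explicit with `expand` and contrasts element-wise `torch.mul` with `torch.matmul`. The name `c_expanded` is introduced here only for illustration.

```python
import torch

a = torch.tensor([[2, 3], [4, 5]], dtype=torch.int32)
c = torch.tensor([[2], [3]], dtype=torch.int32)

# Broadcasting treats c (shape [2, 1]) as if its column were repeated to shape [2, 2].
c_expanded = c.expand(2, 2)  # tensor([[2, 2], [3, 3]])

# Adding or multiplying with c gives the same result as using the expanded tensor.
assert torch.equal(torch.add(a, c), torch.add(a, c_expanded))
assert torch.equal(torch.mul(a, c), torch.mul(a, c_expanded))

# torch.mul is element-wise, torch.matmul is a matrix product, so the shapes differ.
elementwise = torch.mul(a, c)        # shape [2, 2]
matrix_product = torch.matmul(a, c)  # shape [2, 1]
print(elementwise.shape, matrix_product.shape)
```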

In [9]:
puli: torch.Tensor = torch.tensor([[2, 3], [4, 5]], dtype=torch.int32)

In [10]:
puli

tensor([[2, 3],
        [4, 5]], dtype=torch.int32)

## Tensor Reshaping

In [11]:
orig: torch.Tensor = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int32)
orig # shape [2, 3]

tensor([[1, 2, 3],
        [4, 5, 6]], dtype=torch.int32)

In [12]:
reshaped: torch.Tensor = orig.view(3, 2)
reshaped

tensor([[1, 2],
        [3, 4],
        [5, 6]], dtype=torch.int32)

In [13]:
flattened: torch.Tensor = orig.view(-1) # shape [6], a 1-D tensor
flattened

tensor([1, 2, 3, 4, 5, 6], dtype=torch.int32)

In [14]:
flattened.shape

torch.Size([6])

```python
orig.view(-1, 1) # same as orig.view(6, 1): shape [6, 1], a column vector
orig.view(1, 6) # shape [1, 6], a row vector
```

In [15]:
flat: torch.Tensor = orig.view(6, -1)
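
Two properties of `view` are easy to miss in the cells above: the result shares memory with the original tensor, and `view` only works when the memory layout allows it. The following is a small sketch of both. The transposed tensor and the use of `reshape` as a fallback are additions for illustration, not part of the original cells.

```python
import torch

orig = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.int32)

flat = orig.view(6, -1)  # -1 is inferred as 1, so the shape is [6, 1]
print(flat.shape)        # torch.Size([6, 1])

# A view shares storage with the original: a write through the view shows up in orig.
flat_view = orig.view(-1)
flat_view[0] = 100
print(orig[0, 0].item())  # 100

# A transposed tensor is not contiguous, so view(-1) raises a RuntimeError.
transposed = orig.t()
try:
    transposed.view(-1)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape falls back to copying the data when a view is not possible.
reshaped = transposed.reshape(-1)
print(view_failed, reshaped.tolist())
```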

## Torch Datasets

This section defines datasets built from PyTorch tensors. PyTorch calls such a dataset a TensorDataset. In this lesson, you will convert an array into a tensor, create a TensorDataset, use DataLoader to divide the dataset into batches, and iterate through the batches.

### What is a TensorDataset?

As you might already know, PyTorch's primary unit of data storage is a tensor. But what if you have more than one tensor of data and need to keep them together? That's when TensorDataset comes into play.

A TensorDataset is a dataset that wraps multiple tensors. Each sample is a tuple with one entry per wrapped tensor, obtained by indexing every tensor along its first dimension, so all wrapped tensors must have the same size in that dimension. In simpler terms, it is a way to keep your input and output tensors organized together. The wrapped tensors may have different dtypes, for example floating-point inputs and integer labels.

While it’s not always necessary to use TensorDataset, it can be very convenient, especially if you want to use a DataLoader for batching and shuffling your data. Fetching a sample is just an indexing operation on tensors that are already in memory, which keeps access fast as long as the data fits in memory.

In [16]:
import numpy as np

X: np.ndarray = np.array([[1., 2.], [2., 1.], [3., 4.], [4., 3.]])

y: np.ndarray = np.array([0, 1, 0, 1])

X_tensor: torch.Tensor = torch.tensor(X, dtype=torch.float32)
y_tensor: torch.Tensor = torch.tensor(y, dtype=torch.int32)

We first defined the features and labels as NumPy arrays and then converted them into PyTorch tensors.

The conversion code is straightforward: the torch.tensor function transforms our NumPy arrays into tensors, and the dtype parameter lets us specify floating-point values for the features and integers for the labels.

Now we can build the TensorDataset. Its inputs are the tensors we created above, which it wraps together into a single dataset.

In [17]:
from torch.utils.data import TensorDataset, DataLoader

dataset: TensorDataset = TensorDataset(X_tensor, y_tensor)

for i in range(len(dataset)):
    X_sample, y_sample = dataset[i]
    print(f"X[{i}]: {X_sample}, y[{i}]: {y_sample}")

X[0]: tensor([1., 2.]), y[0]: 0
X[1]: tensor([2., 1.]), y[1]: 1
X[2]: tensor([3., 4.]), y[2]: 0
X[3]: tensor([4., 3.]), y[3]: 1
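
To make the wrapping behaviour concrete, here is a minimal sketch of a class that indexes several tensors in step. `PairedTensors` is a hypothetical name used only for this illustration; it is not PyTorch's own class, although it is checked against `TensorDataset` at the end.

```python
import torch
from torch.utils.data import Dataset, TensorDataset


class PairedTensors(Dataset):
    """Hypothetical stand-in that mimics how TensorDataset indexes its tensors."""

    def __init__(self, *tensors: torch.Tensor) -> None:
        # Every tensor must hold the same number of samples along dimension 0.
        assert all(t.size(0) == tensors[0].size(0) for t in tensors)
        self.tensors = tensors

    def __getitem__(self, index: int) -> tuple:
        # One sample is a tuple holding the index-th entry of each tensor.
        return tuple(t[index] for t in self.tensors)

    def __len__(self) -> int:
        return self.tensors[0].size(0)


X_tensor = torch.tensor([[1., 2.], [2., 1.], [3., 4.], [4., 3.]])
y_tensor = torch.tensor([0, 1, 0, 1], dtype=torch.int32)

sketch = PairedTensors(X_tensor, y_tensor)
reference = TensorDataset(X_tensor, y_tensor)

# Both datasets return the same (features, label) pair for every index.
for i in range(len(reference)):
    assert torch.equal(sketch[i][0], reference[i][0])
    assert torch.equal(sketch[i][1], reference[i][1])
print(len(sketch), sketch[2])
```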


Next, to help manage large datasets and make iterating over batches easier, PyTorch provides a tool named DataLoader. Iterating over either the TensorDataset or a DataLoader built on it yields (data, target) tuples: single samples from the dataset, whole batches from the loader.

DataLoader takes a dataset and further parameters such as batch_size, which sets the number of samples per batch, and shuffle, which reshuffles the data at the start of every epoch when set to True.

Using a TensorDataset with DataLoader is convenient because inputs and targets are batched together and stay aligned.

In [18]:
dataloader: DataLoader = DataLoader(dataset, batch_size=2, shuffle=True)

for (batch_X, batch_y) in dataloader:
    print(f"Batch X: {batch_X}, Batch y: {batch_y}")

Batch X: tensor([[3., 4.],
        [1., 2.]]), Batch y: tensor([0, 0], dtype=torch.int32)
Batch X: tensor([[2., 1.],
        [4., 3.]]), Batch y: tensor([1, 1], dtype=torch.int32)


We used the DataLoader to iterate through our dataset in batches. Mini-batch iteration is fundamental to training machine learning models: parameter updates are computed on a few samples at a time, and the model only has to process one batch at once, which makes working with larger datasets practical.

This output illustrates how DataLoader allows us to shuffle and batch our data efficiently. Due to the shuffling, the presented batches and their order might vary each time the code is executed. This is beneficial for model generalization during training.
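
Because shuffled batches differ between runs, it can help to fix the shuffle order while debugging. The sketch below passes a seeded `torch.Generator` to `DataLoader`. The helper `epoch_order` and the seed value `0` are assumptions made for this example.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

X_tensor = torch.tensor([[1., 2.], [2., 1.], [3., 4.], [4., 3.]])
y_tensor = torch.tensor([0, 1, 0, 1], dtype=torch.int32)
dataset = TensorDataset(X_tensor, y_tensor)


def epoch_order(seed: int) -> list:
    """Return the rows of X in the order one shuffled epoch visits them."""
    generator = torch.Generator()
    generator.manual_seed(seed)  # the seed value itself is arbitrary
    loader = DataLoader(dataset, batch_size=2, shuffle=True, generator=generator)
    rows = []
    for batch_X, _ in loader:
        rows.extend(batch_X.tolist())
    return rows


first = epoch_order(0)
second = epoch_order(0)
print(first == second)                             # True: same seed, same order
print(sorted(first) == sorted(X_tensor.tolist()))  # True: every sample appears once
```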

You should now have a good understanding of how to define PyTorch tensors and of the convenience of TensorDataset, especially when paired with DataLoader. We also looked at iterating over a DataLoader.
These dataset tasks come up routinely in a Machine Learning Engineer's daily work, so it pays to be fluent in them.

In [19]:
X: np.ndarray = np.array([[1., 2.], [2., 1.], [3., 4.], [4., 3.]])
y: np.ndarray = np.array([0, 1, 0, 1])

X_tensor: torch.Tensor = torch.tensor(X, dtype=torch.float32)
y_tensor: torch.Tensor = torch.tensor(y, dtype=torch.int32)

dataset: TensorDataset = TensorDataset(X_tensor, y_tensor)

dataloader: DataLoader = DataLoader(dataset, batch_size=2, shuffle=True)

for (batch_X, batch_y) in dataloader:
    print(f"Batch X: {batch_X}, Batch y: {batch_y}")


Batch X: tensor([[1., 2.],
        [3., 4.]]), Batch y: tensor([0, 0], dtype=torch.int32)
Batch X: tensor([[4., 3.],
        [2., 1.]]), Batch y: tensor([1, 1], dtype=torch.int32)


If you set the batch size to 3 here, the 4 examples are split into one batch of 3 and a final batch of only 1.
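
The uneven split can be checked by collecting the batch sizes. The sketch also shows `drop_last=True`, a `DataLoader` option not used in the cells above, which discards the incomplete final batch.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

X_tensor = torch.tensor([[1., 2.], [2., 1.], [3., 4.], [4., 3.]])
y_tensor = torch.tensor([0, 1, 0, 1], dtype=torch.int32)
dataset = TensorDataset(X_tensor, y_tensor)

# Four samples with batch_size=3: one full batch and one leftover sample.
loader = DataLoader(dataset, batch_size=3, shuffle=False)
sizes = [len(batch_y) for _, batch_y in loader]
print(sizes)  # [3, 1]

# drop_last=True discards the incomplete final batch.
loader_drop = DataLoader(dataset, batch_size=3, shuffle=False, drop_last=True)
sizes_drop = [len(batch_y) for _, batch_y in loader_drop]
print(sizes_drop)  # [3]
```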

## Torch Neural Layers

We continue exploring tensor processing in PyTorch by discussing and implementing two crucial concepts: Linear Layers and Activation Functions.

When working with tensors in neural networks, it is essential to understand that they are processed through various layers. A layer in a neural network refers to a collection of neurons (nodes) operating together at the same depth level within the network. PyTorch provides us with the `torch.nn` module, an easy and powerful tool for creating and organizing these layers.

A vital part of most neural networks is the linear layer, which performs a linear transformation on its input data. A linear layer operates via the formula:

$y = Wx + b$

Where $y$ is the output, $W$ is the weight matrix, $x$ is the input vector, and $b$ is the bias vector. The weight matrix scales and mixes the input features, and the bias vector then shifts the result. In PyTorch, `nn.Linear` stores $W$ with shape `[out_features, in_features]` and treats each row of the input as one sample, so it computes $y = xW^T + b$. A linear layer can also change the size of the feature dimension: by specifying the number of input and output features, you control the last dimension of the output tensor. This lets each layer deliver outputs whose size matches what the next layer in the network expects.

In [20]:
import torch.nn as nn

inp: torch.Tensor = torch.tensor([[1., 2.]], dtype=torch.float32)
layer: nn.Linear = nn.Linear(in_features=2, out_features=3)

out: torch.Tensor = layer(inp)

In [21]:
print(f"Input: {inp}")
print(f"Layer: {layer}")
print(f"Output: {out}")

Input: tensor([[1., 2.]])
Layer: Linear(in_features=2, out_features=3, bias=True)
Output: tensor([[ 1.4698,  0.2164, -1.3393]], grad_fn=<AddmmBackward0>)


The output tensor displayed above results from passing the input tensor through the linear layer. This layer applies a weighted sum of the input tensor values and adds a bias term to each output. The weights and biases are initialized randomly, so the exact output can vary. The `grad_fn=<AddmmBackward0>` in the output means that PyTorch is keeping track of this operation, which will help compute the gradients automatically during model training.
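
The linear-layer formula can be checked by recomputing the output from the layer's own parameters. This is a sketch under the assumption of an arbitrary seed, so the numbers will not match the output printed above. The names `manual` and `batch` are introduced only for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed so the sketch is repeatable

layer = nn.Linear(in_features=2, out_features=3)
inp = torch.tensor([[1., 2.]], dtype=torch.float32)

# nn.Linear stores weight with shape [out_features, in_features] and bias with shape [out_features].
print(layer.weight.shape)  # torch.Size([3, 2])
print(layer.bias.shape)    # torch.Size([3])

# Recompute the output by hand: each input row is multiplied by W^T, then b is added.
manual = inp @ layer.weight.T + layer.bias
assert torch.allclose(layer(inp), manual)

# Only the last dimension changes, so a batch of 5 samples maps from [5, 2] to [5, 3].
batch = torch.zeros(5, 2)
print(layer(batch).shape)  # torch.Size([5, 3])
```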

Next, we cover activation functions. Activation functions introduce non-linearity into the model, enabling it to handle more complex patterns in the data. Two commonly used activation functions are ReLU (Rectified Linear Unit) and Sigmoid.

Mathematically, ReLU is represented as:

$f(x) = \max(0, x)$

Where $x$ is the input to the function. The ReLU function ensures that positive input values remain unchanged, while negative ones are transformed to zero, creating a non-linear transformation.

The Sigmoid function, on the other hand, is represented as:

$\sigma(x) = \frac{1}{1 + e^{-x}}$

Where $x$ is the input. The Sigmoid function squashes the input value to lie between 0 and 1, which can be useful for binary classification tasks. In practice, however, ReLU is often preferred over Sigmoid for hidden layers because it is cheap to compute and does not saturate for positive inputs, which helps gradients flow during training.

In [34]:
def ReLU(x: torch.Tensor) -> torch.Tensor:
    return torch.maximum(torch.zeros_like(x), x)

In [41]:
ReLU(layer(inp))

tensor([[1.4698, 0.2164, 0.0000]], grad_fn=<MaximumBackward0>)

We can also create a ReLU activation in PyTorch with the `nn.ReLU` module from `torch.nn` and apply it to the output tensor of our linear layer. The result matches the output of the hand-written implementation above.

The output tensor after activation demonstrates the effect of the ReLU function: negative values become zero, while positive values are unchanged. This introduces non-linearity into the model, which is crucial for handling more complex patterns in the data. The `grad_fn=<ReluBackward0>` shows that the ReLU operation is also being tracked for automatic differentiation during training.


In [42]:
relu = nn.ReLU()

output_relu: torch.Tensor = relu(out)
print(f"Output ReLU: {output_relu}")

Output ReLU: tensor([[1.4698, 0.2164, 0.0000]], grad_fn=<ReluBackward0>)
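
The `grad_fn` entry means gradients can flow back through the activation. As a small sketch with a hand-picked input (the values `-1.0` and `2.0` are arbitrary), the gradient of ReLU is 0 where the input was negative and 1 where it was positive:

```python
import torch

# requires_grad=True asks autograd to track operations on x.
x = torch.tensor([-1.0, 2.0], requires_grad=True)

y = torch.relu(x)
print(y.grad_fn is not None)  # True: the ReLU operation was recorded

# Summing gives a scalar, so backward() can be called without arguments.
y.sum().backward()
print(x.grad)  # tensor([0., 1.])
```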


Similarly, a Sigmoid activation can be written by hand or created with the `nn.Sigmoid` module from `torch.nn`; both versions are shown below.

In [44]:
def sigm(x: torch.Tensor) -> torch.Tensor:
    return 1 / (1 + torch.exp(-x))

In [45]:
sigm(layer(inp))

tensor([[0.8130, 0.5539, 0.2076]], grad_fn=<MulBackward0>)

In [47]:
sigmoid = nn.Sigmoid()

output_sigmoid: torch.Tensor = sigmoid(out)
print(f"Output Sigmoid: {output_sigmoid}")

Output Sigmoid: tensor([[0.8130, 0.5539, 0.2076]], grad_fn=<SigmoidBackward0>)


The output tensor after activation shows the effect of the Sigmoid function. It squashes the input values to lie between 0 and 1. This can be particularly useful in scenarios where you want to interpret the output as probabilities. The `grad_fn=<SigmoidBackward0>` indicates that the Sigmoid operation is tracked for automatic differentiation during training.
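
The hand-written and built-in versions can be compared on a fixed input. This is a minimal sketch with arbitrarily chosen input values. `torch.sigmoid` is the functional counterpart of `nn.Sigmoid` and is an addition to the calls used above.

```python
import torch
import torch.nn as nn


def sigm(x: torch.Tensor) -> torch.Tensor:
    """Hand-written sigmoid, as defined earlier in this section."""
    return 1 / (1 + torch.exp(-x))


x = torch.tensor([[-2.0, 0.0, 3.0]])  # arbitrary example values

builtin = nn.Sigmoid()(x)      # module form
functional = torch.sigmoid(x)  # functional form

# All three agree up to floating-point tolerance.
assert torch.allclose(sigm(x), builtin)
assert torch.allclose(builtin, functional)

# Every output lies strictly between 0 and 1, and an input of 0 maps to 0.5.
print(bool(((builtin > 0) & (builtin < 1)).all()))  # True
print(builtin[0, 1].item())                         # 0.5
```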

This section explored tensor processing through Linear Layers and Activation Functions in PyTorch, and showed how to combine the two to transform an input tensor. These concepts are the building blocks for more complex neural network architectures in PyTorch. The final example below applies both activations to a layer that reduces the feature dimension.

In [50]:
inp: torch.Tensor = torch.tensor([[1., 2., 3.]], dtype=torch.float32) # example: reducing the feature dimension from 3 to 2

layer: nn.Linear = nn.Linear(in_features=3, out_features=2)

relu: nn.ReLU = nn.ReLU()
sigmoid: nn.Sigmoid = nn.Sigmoid()

print(inp)
print(layer(inp))
print(relu(layer(inp)))
print(sigmoid(layer(inp)))

tensor([[1., 2., 3.]])
tensor([[1.8252, 0.9872]], grad_fn=<AddmmBackward0>)
tensor([[1.8252, 0.9872]], grad_fn=<ReluBackward0>)
tensor([[0.8612, 0.7285]], grad_fn=<SigmoidBackward0>)
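
Layers and activations like these are usually chained. One possible way to do that is `nn.Sequential`, which is not used elsewhere in this notebook. In this sketch the hidden size of 4, the seed, and the layer ordering are assumptions chosen purely for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed so the sketch is repeatable

# Linear -> ReLU -> Linear -> Sigmoid; the hidden size of 4 is an arbitrary choice.
model = nn.Sequential(
    nn.Linear(in_features=3, out_features=4),
    nn.ReLU(),
    nn.Linear(in_features=4, out_features=1),
    nn.Sigmoid(),
)

inp = torch.tensor([[1., 2., 3.]], dtype=torch.float32)
out = model(inp)

# The final Sigmoid keeps the single output value between 0 and 1.
print(out.shape)  # torch.Size([1, 1])
print(0.0 < out.item() < 1.0)
```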
