# PyTorch

PyTorch is a machine learning library that simplifies the creation and training of neural networks, widely used for its flexibility and ease of use. It supports dynamic computation graphs, enabling intuitive model design and experimentation.



## 1. Tensors

Everything in PyTorch is based on Tensor operations. A Tensor is a multi-dimensional matrix containing elements of a single data type:


In [None]:
import torch

# torch.empty(size): uninitialized (contains whatever values were already in memory)
x = torch.empty(1) # scalar
print("empty(1):", x)
x = torch.empty(3) # vector
print("empty(3):",x)
x = torch.empty(2, 3) # matrix
print("empty(2,3):",x)
x = torch.empty(2, 2, 3) # tensor, 3 dimensions
#x = torch.empty(2,2,2,3) # tensor, 4 dimensions
print("empty(2, 2, 3):",x)

# torch.rand(size): uniform random numbers in [0, 1)
x = torch.rand(5, 3)
print("rand(5,3):", x)

# torch.zeros(size), fill with 0
# torch.ones(size), fill with 1
x = torch.zeros(5, 3)
print("zeros(5,3):", x)

empty(1): tensor([1.3452e-43])
empty(3): tensor([ 1.4418e-15,  3.2863e-41, -8.0485e-07])
empty(2,3): tensor([[ 1.4423e-15,  3.2863e-41, -8.0485e-07],
        [ 4.4679e-41,  8.9683e-44,  0.0000e+00]])
empty(2, 2, 3): tensor([[[7.0065e-45, 0.0000e+00, 0.0000e+00],
         [0.0000e+00, 0.0000e+00, 0.0000e+00]],

        [[0.0000e+00, 0.0000e+00, 0.0000e+00],
         [0.0000e+00, 1.4013e-45, 0.0000e+00]]])
rand(5,3): tensor([[0.3082, 0.5944, 0.8576],
        [0.5026, 0.5358, 0.8352],
        [0.5365, 0.0569, 0.4760],
        [0.3217, 0.8649, 0.5425],
        [0.7246, 0.9956, 0.6134]])
zeros(5,3): tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])


In [None]:
# check size
print("size", x.size())  # x.size(0)
print("shape", x.shape)  # x.shape[0]

size torch.Size([5, 3])
shape torch.Size([5, 3])


In [None]:


# check data type
print(x.dtype)

# specify types, float32 default
x = torch.zeros(5, 3, dtype=torch.float16)
print(x)

# check type
print(x.dtype)

torch.float32
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]], dtype=torch.float16)
torch.float16


In [None]:
# A tensor is a fundamental data structure in PyTorch and other deep learning frameworks.
# It is a multi-dimensional array, similar to a NumPy array, but with additional capabilities designed for deep learning.

# construct from data
x = torch.tensor([5.5, 3])
print(x, x.dtype)

tensor([5.5000, 3.0000]) torch.float32


In [None]:
# requires_grad argument
# This tells PyTorch that it will need to calculate the gradients for this tensor
# later in your optimization steps
# i.e. this is a variable in your model that you want to optimize
x = torch.tensor([5.5, 3], requires_grad=True)
print(x)

tensor([5.5000, 3.0000], requires_grad=True)


#### Operations with Tensors

In [None]:
# Operations
x = torch.ones(2, 2)
y = torch.rand(2, 2)

# elementwise addition
z = x + y
# torch.add(x,y)

# in-place addition: every method with a trailing underscore is an in-place operation,
# i.e. it modifies the tensor it is called on
# y.add_(x)  # would modify y in place (left commented out here)


print(x)
print(y)
print(z)

tensor([[1., 1.],
        [1., 1.]])
tensor([[0.5600, 0.5209],
        [0.2001, 0.3453]])
tensor([[1.5600, 1.5209],
        [1.2001, 1.3453]])
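
The trailing-underscore convention mentioned in the cell above can be sketched as follows. This is a minimal illustration rather than part of the original notebook; the tensors `a`, `b`, and `c` are names chosen here for the example.

```python
import torch

a = torch.ones(2, 2)
b = torch.full((2, 2), 0.5)

# Out-of-place: returns a new tensor, `b` is left unchanged
c = b.add(a)

# In-place: the trailing underscore means `b` itself is modified
b.add_(a)

print(c)
print(b)
print(torch.equal(b, c))  # both now hold 1.5 everywhere
```

In-place operations save memory, but they overwrite the old values, so they are best avoided on tensors whose previous contents are still needed.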


In [None]:
# subtraction
z = x - y
z = torch.sub(x, y)

# multiplication
z = x * y
z = torch.mul(x,y)

# division
z = x / y
z = torch.div(x,y)

In [None]:
# Slicing
x = torch.rand(5,3)
print(x)
print("x[:, 0]", x[:, 0]) # all rows, column 0
print("x[1, :]", x[1, :]) # row 1, all columns
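print("x[1, 1]", x[1,1]) # element at 1, 1

# get the plain Python value with .item(); works only for tensors with exactly one element
print("x[1,1].item()", x[1,1].item())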



tensor([[0.0181, 0.1100, 0.5256],
        [0.2587, 0.3269, 0.4605],
        [0.7458, 0.0070, 0.4519],
        [0.3945, 0.8181, 0.2563],
        [0.0352, 0.1937, 0.8751]])
x[:, 0] tensor([0.0181, 0.2587, 0.7458, 0.3945, 0.0352])
x[1, :] tensor([0.2587, 0.3269, 0.4605])
x[1, 1] tensor(0.3269)
x[1,1].item() 0.32685965299606323


In [None]:
# Reshape with Tensor.view()
x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)  # the size -1 is inferred from other dimensions

# Print sizes
print(x.size(), y.size(), z.size())


torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])
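
One detail worth knowing about `view()`: the returned tensor shares its underlying data with the original, so a write through one is visible through the other. A minimal sketch, with tensor names and values chosen here for illustration:

```python
import torch

x = torch.zeros(2, 3)
y = x.view(6)        # same data, different shape
z = x.view(-1, 2)    # -1 is inferred as 3, since 6 / 2 = 3

y[0] = 42.0          # write through the view ...
print(x[0, 0])       # ... and the original sees the change: tensor(42.)
print(z.shape)       # torch.Size([3, 2])
```

If an independent copy is needed instead, calling `.clone()` on the tensor before reshaping is one option.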


## 2. Autograd

In PyTorch, autograd (automatic differentiation) helps us figure out how changing the inputs of our mathematical operations affects the final result, that is, it computes the gradient of the result with respect to those inputs.

Autograd in PyTorch is like an automatic assistant that keeps track of how your model's performance changes with each tweak (parameter update), helping you efficiently improve your model over time.

When you create a PyTorch tensor and set requires_grad=True, it's like telling PyTorch, "Hey, keep an eye on this tensor. I might want to know how much it contributes to the final result later."
This is crucial in machine learning. When you're training a model, you want to know how much each input (like a parameter or feature) affects the output (like the prediction). Setting requires_grad=True helps PyTorch automatically track these relationships.

To enable this tracking, set `requires_grad=True` when you create the tensor.
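
A minimal sketch of what this tracking looks like in practice. The tensor values and the function `z = sum(x^2)` are assumptions chosen for illustration, picked so the gradient is easy to verify by hand.

```python
import torch

# Track operations on x so gradients can be computed later
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Build a small computation: y = x^2, then reduce to a scalar
y = x ** 2
z = y.sum()

# Backward pass: computes dz/dx and stores it in x.grad
z.backward()

print(x.grad)  # dz/dx = 2 * x -> tensor([2., 4., 6.])
```

Note that `backward()` is called on a scalar here; this mirrors the usual training setup, where the loss is a single number.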

## 3. Model, Loss & Optimizer


1. **Forward Pass:**
   - Imagine you're teaching a model. During the forward pass, the model makes predictions based on the input data. These predictions might be good or bad.
2. **Loss Calculation:**
   - You compare the model's predictions to the actual (true) values. This difference is the "error" or "loss." The goal is to minimize this loss.
3. **Backward Pass (Backpropagation):**
   - Backpropagation is like giving feedback to the model. It asks, "For each parameter in the model, how much did it contribute to the error?"
   - **Gradients are calculated.** Gradients tell us the direction and magnitude of change needed in each parameter to reduce the error.
4. **Update Parameters:**
   - With the gradients in hand, you update the model's parameters to reduce the error. This step is where learning happens.
   - The model is now a bit better at making predictions.
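
The four steps above can be sketched by hand with a single trainable weight, before any `nn.Module` or optimizer is involved. This is a minimal illustration under stated assumptions: the toy data for `f(x) = 2 * x`, the starting value `w = 0`, and the learning rate are all chosen here for the example.

```python
import torch

# Toy data for f(x) = 2 * x (assumed here for illustration)
X = torch.tensor([1.0, 2.0, 3.0, 4.0])
Y = torch.tensor([2.0, 4.0, 6.0, 8.0])

# Single trainable parameter, started at 0
w = torch.tensor(0.0, requires_grad=True)
learning_rate = 0.01

for epoch in range(100):
    # 1. Forward pass
    y_pred = w * X
    # 2. Loss calculation (mean squared error)
    loss = ((y_pred - Y) ** 2).mean()
    # 3. Backward pass: fills w.grad with d(loss)/dw
    loss.backward()
    # 4. Update parameters; no_grad keeps the update out of the graph
    with torch.no_grad():
        w -= learning_rate * w.grad
    # Reset the gradient, since backward() accumulates into w.grad
    w.grad.zero_()

print(f"w = {w.item():.3f}")  # should end up close to 2
```

An optimizer such as `torch.optim.SGD` packages step 4 and the gradient reset into `optimizer.step()` and `optimizer.zero_grad()`.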


A typical PyTorch pipeline looks like this:

1. Design model (input, output, forward pass with different layers)
2. Construct loss and optimizer
3. Training loop:
  - Forward = compute prediction and loss
  - Backward = compute gradients
  - Update weights

In [None]:
import torch
import torch.nn as nn

# Linear regression
# f = w * x
# here : f = 2 * x

# 0) Training samples, watch the shape!
X = torch.tensor([[1], [2], [3], [4], [5], [6], [7], [8]], dtype=torch.float32)
Y = torch.tensor([[2], [4], [6], [8], [10], [12], [14], [16]], dtype=torch.float32)

n_samples, n_features = X.shape
print(f'n_samples = {n_samples}, n_features = {n_features}')

# 0) create a test sample
X_test = torch.tensor([5], dtype=torch.float32)

In [None]:
# 1) Design Model, the model has to implement the forward pass!

# Here we could simply use a built-in model from PyTorch
# model = nn.Linear(input_size, output_size)

# A model class (here LinearRegression) always inherits from nn.Module.
class LinearRegression(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(LinearRegression, self).__init__()

        # define the layers used by the model;
        # linear regression needs just one layer, nn.Linear
        self.lin = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        # the forward pass applies the linear layer to the input
        return self.lin(x)


input_size, output_size = n_features, n_features

model = LinearRegression(input_size, output_size)

print(f'Prediction before training: f({X_test.item()}) = {model(X_test).item():.3f}')

# 2) Define loss and optimizer
learning_rate = 0.01
n_epochs = 100

loss = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

# 3) Training loop

# An epoch is one complete pass through the entire training dataset during the training of a model.
for epoch in range(n_epochs):
    # predict = forward pass with our model
    y_predicted = model(X)

    # loss: nn.MSELoss expects (prediction, target)
    l = loss(y_predicted, Y)

    # calculate gradients = backward pass
    l.backward()

    # update weights: the optimizer step updates the model parameters
    optimizer.step()

    # zero the gradients after updating, since backward() accumulates them
    optimizer.zero_grad()

    if (epoch+1) % 10 == 0:
        w, b = model.parameters() # unpack parameters
        print('epoch ', epoch+1, ': w = ', w[0][0].item(), ' loss = ', l.item())

print(f'Prediction after training: f({X_test.item()}) = {model(X_test).item():.3f}')
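
As the comment in the cell above notes, the custom class could be replaced by the built-in `nn.Linear`. A minimal sketch of the same training run under that assumption; the fixed seed and the `initial_loss` bookkeeping are additions made here only so the effect of training is easy to check.

```python
import torch
import torch.nn as nn

# Same training data as above: f(x) = 2 * x
X = torch.tensor([[1], [2], [3], [4], [5], [6], [7], [8]], dtype=torch.float32)
Y = 2 * X

torch.manual_seed(0)  # assumed seed, only for reproducible initial weights
model = nn.Linear(in_features=1, out_features=1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

initial_loss = loss_fn(model(X), Y).item()

for epoch in range(100):
    y_predicted = model(X)           # forward pass
    loss = loss_fn(y_predicted, Y)   # loss
    optimizer.zero_grad()            # clear old gradients
    loss.backward()                  # backward pass
    optimizer.step()                 # update weights

print(f"initial loss = {initial_loss:.4f}, final loss = {loss.item():.4f}")
print(f"w = {model.weight.item():.3f}, b = {model.bias.item():.3f}")
```

Writing a custom `nn.Module` becomes worthwhile once the model has more than one layer, as in the next section.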

## 4. First Neural Net

This section implements a simple neural network for binary classification using PyTorch.

- **Neural Network Architecture:**
  - The code defines a neural network class (`SimpleNN`) with one input layer, one hidden layer with ReLU activation, and one output layer.
  - The neural network is designed for binary classification with two input features.
- **Dataset Creation:**
  - A small dataset (`X`) with input features and corresponding labels (`Y`) is created.
- **Model Initialization:**
  - An instance of the `SimpleNN` model is created with specified input, hidden, and output sizes.
  - The model will learn to map input features to binary labels.
- **Loss Function and Optimizer:**
  - The Binary Cross Entropy with Logits loss (`nn.BCEWithLogitsLoss()`) is used, suitable for binary classification problems. It combines a sigmoid with binary cross entropy, so the network itself outputs raw logits.
  - The Adam optimizer (`optim.Adam`) is employed to optimize the model's parameters during training.
- **Training Loop:**
  - The model is trained for a specified number of epochs.
  - In each epoch, predictions are made using the current model, and the loss is calculated by comparing predictions to actual labels.
  - `loss.backward()` then performs the backward pass to compute gradients, and the optimizer uses those gradients to update the model parameters.
- **Testing the Trained Model:**
  - After training, the model is evaluated on `test_data`, which in this example contains the same four inputs used for training.
  - Predictions are made, and the results are printed, showing the input features and corresponding model predictions.
- **Activation Function:**
  - The ReLU activation function is used in the hidden layer to introduce non-linearity.
- **Printed Outputs:**
  - The code prints the loss during training at intervals of 100 epochs.
  - It also prints the predictions on the test data after training.

In summary, the code demonstrates the complete lifecycle of a simple neural network: from defining the architecture to training and testing for binary classification. The model is trained to make predictions on the XOR dataset (a classic binary classification problem).
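
To make the role of `nn.BCEWithLogitsLoss()` more concrete, the sketch below compares it with applying a sigmoid by hand and then using plain `nn.BCELoss()`. The logits and targets are hypothetical values chosen for illustration, not outputs of the model in this notebook.

```python
import torch
import torch.nn as nn

# Hypothetical raw model outputs (logits) and targets, chosen for illustration
logits = torch.tensor([[-2.0], [0.5], [3.0], [-0.5]])
targets = torch.tensor([[0.0], [1.0], [1.0], [0.0]])

# Option 1: the loss applies the sigmoid internally
loss_with_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Option 2: apply the sigmoid yourself, then use plain binary cross entropy
loss_manual = nn.BCELoss()(torch.sigmoid(logits), targets)

print(loss_with_logits.item(), loss_manual.item())
print(torch.allclose(loss_with_logits, loss_manual))
```

This is why the network's `forward` has no sigmoid at the end, and why a sigmoid is applied separately when the predictions are printed.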







In [2]:
# Import necessary libraries
import torch
import torch.nn as nn
import torch.optim as optim

# Define a simple neural network class
class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleNN, self).__init__()
        # Define the first fully connected layer
        self.fc1 = nn.Linear(input_size, hidden_size)
        # Define the Rectified Linear Unit (ReLU) activation function
        self.relu = nn.ReLU()
        # Define the second fully connected layer
        self.fc2 = nn.Linear(hidden_size, output_size)

    # Define the forward pass of the network
    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Create a dataset with input features (X) and corresponding labels (Y)
X = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], dtype=torch.float32)
Y = torch.tensor([[0.0], [1.0], [1.0], [0.0]], dtype=torch.float32)

# Initialize the neural network with specified input, hidden, and output sizes
input_size = 2
hidden_size = 3
output_size = 1
model = SimpleNN(input_size, hidden_size, output_size)

# Define the loss function (Binary Cross Entropy with Logits) and the optimizer (Adam)
criterion = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Training loop
n_epochs = 1000
for epoch in range(n_epochs):
    # Forward pass: make predictions using the neural network
    predictions = model(X)
    # Compute the loss by comparing predictions to actual labels
    loss = criterion(predictions, Y)

    # Backward pass: compute gradients and update model parameters
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print the loss every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f'Epoch {epoch + 1}/{n_epochs}, Loss: {loss.item():.4f}')

# Test the trained model
with torch.no_grad():
    # Create a test dataset (here the same four XOR inputs used for training)
    test_data = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], dtype=torch.float32)
    # Make predictions on the test data; the model returns raw logits,
    # so sigmoid is applied below to turn them into probabilities
    predictions = model(test_data)

    print("\nPredictions:")
    for i in range(len(test_data)):
        # Print input features and corresponding predictions
        print(f"Input: {test_data[i].numpy()}, Prediction: {torch.sigmoid(predictions[i]).item():.4f}")


Epoch 100/1000, Loss: 0.4384
Epoch 200/1000, Loss: 0.1615
Epoch 300/1000, Loss: 0.0614
Epoch 400/1000, Loss: 0.0323
Epoch 500/1000, Loss: 0.0202
Epoch 600/1000, Loss: 0.0140
Epoch 700/1000, Loss: 0.0103
Epoch 800/1000, Loss: 0.0079
Epoch 900/1000, Loss: 0.0063
Epoch 1000/1000, Loss: 0.0052

Predictions:
Input: [0. 0.], Prediction: 0.0025
Input: [0. 1.], Prediction: 0.9855
Input: [1. 0.], Prediction: 0.9985
Input: [1. 1.], Prediction: 0.0020
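
The printed values are probabilities, not class labels. One way to turn them into hard 0/1 predictions is to threshold them, sketched below using the four probabilities from the output above. The 0.5 cutoff is a common default and an assumption here, not something the original code specifies.

```python
import torch

# Probabilities copied from the printed predictions above
probs = torch.tensor([0.0025, 0.9855, 0.9985, 0.0020])

# Threshold at 0.5 (assumed cutoff) to get hard 0/1 labels
labels = (probs >= 0.5).float()

print(labels)  # tensor([0., 1., 1., 0.])
```

The resulting labels match the XOR targets `[0, 1, 1, 0]` defined in `Y`.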
