## Arewa Data Science Academy
### Deep Learning Cohort 1.0

#### Name: Ibrahim Ismaila
#### Email: ibrahim5322022@hotmail.com
#### Title: Week 2 Solution

### Exercises Solution

In [None]:
# Exercise 1a - Solution
import torch  # needed here: this cell runs before any other import of torch

a = torch.tensor([1, 2])

In [None]:
1. Create a straight line dataset using the linear regression formula (weight * X + bias).
   - Set weight=0.3 and bias=0.9; there should be at least 100 datapoints in total.
   - Split the data into 80% training and 20% testing.
   - Plot the training and testing data so it becomes visual.
2. Build a PyTorch model by subclassing nn.Module.
   - Inside should be a randomly initialized nn.Parameter() with requires_grad=True, one for weights and one for bias.
   - Implement the forward() method to compute the linear regression function you used to create the dataset in 1.
   - Once you've constructed the model, make an instance of it and check its state_dict().
   - Note: If you'd like to use nn.Linear() instead of nn.Parameter() you can.
3. Create a loss function and optimizer using nn.L1Loss() and torch.optim.SGD(params, lr) respectively.
   - Set the learning rate of the optimizer to 0.01; the parameters to optimize should be the model parameters from the model you created in 2.
   - Write a training loop to perform the appropriate training steps for 300 epochs.
   - The training loop should test the model on the test dataset every 20 epochs.
4. Make predictions with the trained model on the test data.
   - Visualize these predictions against the original training and testing data (note: you may need to make sure the predictions are not on the GPU if you want to use non-CUDA-enabled libraries such as matplotlib to plot).
5. Save your trained model's state_dict() to file.
   - Create a new instance of your model class you made in 2. and load in the state_dict() you just saved to it.
   - Perform predictions on your test data with the loaded model and confirm they match the original model predictions from 4.
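The note about keeping predictions off the GPU before plotting can be illustrated with a minimal sketch. The `predictions` tensor here is a stand-in created for illustration, not the output of the model in this notebook, and the `device` selection is an assumption about how the notebook might be run.

```python
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for model output: a small tensor created on the chosen device.
predictions = torch.linspace(0.9, 1.2, steps=5, device=device)

# matplotlib and NumPy work on host memory, so detach from the autograd graph,
# copy to the CPU, and only then convert to a NumPy array.
predictions_np = predictions.detach().cpu().numpy()
print(type(predictions_np), predictions_np.shape)
```

On a CPU-only machine the `.cpu()` call is a no-op, so the same line should work in both environments.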

In [None]:
# Linear Regression with PyTorch


In [None]:
## Create a Straight Line Dataset

import numpy as np
import matplotlib.pyplot as plt
import torch
from sklearn.model_selection import train_test_split

# Create dataset
weight = 0.3
bias = 0.9
X = np.linspace(0, 1, 100)
y = weight * X + bias + np.random.randn(*X.shape) * 0.05  # Adding some noise

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Plot the dataset
plt.scatter(X_train, y_train, label="Training data")
plt.scatter(X_test, y_test, label="Testing data")
plt.legend()
plt.show()

In [None]:
# Build a PyTorch Model
import torch.nn as nn

class LinearRegressionModel(nn.Module):
    def __init__(self):
        super(LinearRegressionModel, self).__init__()
        self.weight = nn.Parameter(torch.randn(1), requires_grad=True)
        self.bias = nn.Parameter(torch.randn(1), requires_grad=True)

    def forward(self, x):
        return self.weight * x + self.bias

# Instantiate the model
model = LinearRegressionModel()
print(model.state_dict())


In [None]:
# Create a Loss Function and Optimizer
loss_fn = nn.L1Loss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

In [None]:
# Training Loop
# Convert data to tensors
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train, dtype=torch.float32)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test, dtype=torch.float32)

# Training loop
epochs = 300
for epoch in range(epochs):
    model.train()

    # Forward pass
    y_pred = model(X_train_tensor)
    loss = loss_fn(y_pred, y_train_tensor)

    # Backward pass
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Test model every 20 epochs
    if (epoch + 1) % 20 == 0:
        model.eval()
        with torch.no_grad():
            test_pred = model(X_test_tensor)
            test_loss = loss_fn(test_pred, y_test_tensor)
            print(f"Epoch {epoch+1}, Train Loss: {loss.item()}, Test Loss: {test_loss.item()}")

In [None]:
# Make Predictions and Visualize
# Make predictions
model.eval()
with torch.no_grad():
    predictions = model(X_test_tensor)

# Plot predictions
plt.scatter(X_train, y_train, label="Training data")
plt.scatter(X_test, y_test, label="Testing data")
plt.scatter(X_test, predictions.numpy(), label="Predictions", marker='x')
plt.legend()
plt.show()

In [None]:
# Save and Load Model
# Save the model state_dict
torch.save(model.state_dict(), "linear_model.pth")

# Load the model
new_model = LinearRegressionModel()
new_model.load_state_dict(torch.load("linear_model.pth"))
new_model.eval()

# Confirm predictions match
with torch.no_grad():
    new_predictions = new_model(X_test_tensor)
    assert torch.allclose(predictions, new_predictions)
    print("Loaded model predictions match original model predictions.")

In [None]:
# Synchronous Machine Dataset

In [None]:
# Load and Convert Data
import zipfile
import pandas as pd
import torch

# Download dataset in colab env
!wget https://archive.ics.uci.edu/static/public/607/synchronous+machine+data+set.zip -O data.zip

# Unzip data
with zipfile.ZipFile("data.zip", 'r') as my_zip:
    my_zip.extractall()

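# Read the CSV into a pandas DataFrame.
# Fields are separated by ";" and "," is assumed to be the decimal mark
# (e.g. "0,66" means 0.66), so parse it with decimal="," rather than thousands=",".
dataset_name = "synchronous machine.csv"
data = pd.read_csv(dataset_name, delimiter=";", decimal=",")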
numpy_data = data.values

# Convert to PyTorch tensor
original_data_tensor = torch.tensor(numpy_data, dtype=torch.float32)


In [None]:
# Tensor Manipulation
# (a) Print size
print("Size of dataset:", original_data_tensor.size())

# (b) Create my_pi_tensor
my_pi_tensor = torch.full_like(original_data_tensor, 3.142, device=torch.device("cuda" if torch.cuda.is_available() else "cpu"))
print("First 13 rows of my_pi_tensor:", my_pi_tensor[:13])
print("Tensor device:", my_pi_tensor.device)
print("Tensor datatype:", my_pi_tensor.dtype)

# (c) Fifth-root of the sum of all values in my_pi_tensor
fifth_root = torch.sum(my_pi_tensor).pow(1/5)
print("Fifth-root of the sum:", fifth_root)

# (d) Create my_data_tensor
my_data_tensor = torch.cat((original_data_tensor[:100], original_data_tensor[-100:]), dim=0)
print("Size of my_data_tensor:", my_data_tensor.size())

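# (e) Create features and target tensors
# Look up the columns by name instead of by position. This assumes the CSV
# header contains "dIf" and "If"; check data.columns if the lookup fails.
dif_idx = data.columns.get_loc("dIf")
if_idx = data.columns.get_loc("If")
features = my_data_tensor[:, dif_idx]  # dIf column
target = my_data_tensor[:, if_idx]     # If column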

# (f) Split data into training and testing sets
train_size = int(0.75 * len(my_data_tensor))
train_features, test_features = features[:train_size], features[train_size:]
train_target, test_target = target[:train_size], target[train_size:]
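The split in (f) takes the first 75% of rows in order, so the test set comes entirely from the tail of the data. If a shuffled split is preferred, one possible approach is to permute the row indices first. The sketch below uses a random stand-in tensor (`example_data`) rather than the real dataset, and the seed value is arbitrary.

```python
import torch

torch.manual_seed(42)  # fixed seed so the split is reproducible

# Stand-in for my_data_tensor: 200 rows, 5 columns of random values.
example_data = torch.rand(200, 5)

# Shuffle the row indices, then take the first 75% for training.
perm = torch.randperm(example_data.size(0))
train_size = int(0.75 * example_data.size(0))
train_idx, test_idx = perm[:train_size], perm[train_size:]

train_rows = example_data[train_idx]
test_rows = example_data[test_idx]
print(train_rows.shape, test_rows.shape)
```

Indexing the features and target with the same `train_idx` and `test_idx` keeps each feature aligned with its target.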

In [None]:
# Define Linear Model
class SynchronousMachineLinearModel(nn.Module):
    def __init__(self):
        super(SynchronousMachineLinearModel, self).__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)

In [None]:
# Train Model
# Initialize the model
model = SynchronousMachineLinearModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Training loop
epochs = 100
train_losses, test_losses = [], []

for epoch in range(epochs):
    model.train()

    # Forward pass
    y_pred = model(train_features.unsqueeze(1))
    train_loss = loss_fn(y_pred, train_target.unsqueeze(1))

    # Backward pass
    optimizer.zero_grad()
    train_loss.backward()
    optimizer.step()

    # Evaluate on test set
    model.eval()
    with torch.no_grad():
        test_pred = model(test_features.unsqueeze(1))
        test_loss = loss_fn(test_pred, test_target.unsqueeze(1))

    train_losses.append(train_loss.item())
    test_losses.append(test_loss.item())

    print(f"Epoch {epoch+1}, Train Loss: {train_loss.item()}, Test Loss: {test_loss.item()}")

# Plot Loss against Epoch
plt.plot(range(epochs), train_losses, label="Train Loss")
plt.plot(range(epochs), test_losses, label="Test Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.show()

# Comment on Results

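The model is a single-feature linear regression trained with SGD (lr=0.01) for 100 epochs on 200 rows, so there are several straightforward ways to try to improve it: tune the learning rate, train for more epochs, or use more of the available columns as input features. Preprocessing steps such as standardizing the inputs and shuffling the rows before splitting may also lead to better predictions. A more complex architecture is an option, but it is worth confirming first that the linear model has actually converged by checking whether the train and test loss curves have flattened.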
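As a rough illustration of learning-rate tuning, the sketch below trains the same kind of `nn.Linear(1, 1)` model with a few learning rates and compares the final test loss. It uses synthetic data (y = 2x + 1 plus noise) as a stand-in for the real dataset, and the helper `final_test_loss` and the candidate rates are hypothetical choices made for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in data: y = 2x + 1 plus a little noise.
X = torch.rand(200, 1)
y = 2.0 * X + 1.0 + 0.05 * torch.randn(200, 1)
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

def final_test_loss(lr, epochs=200):
    """Train a fresh nn.Linear(1, 1) with SGD and return its test MSE."""
    torch.manual_seed(0)  # same initial weights for every learning rate
    model = nn.Linear(1, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(X_train), y_train)
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return loss_fn(model(X_test), y_test).item()

# Compare a few candidate learning rates on the held-out data.
results = {lr: final_test_loss(lr) for lr in (0.001, 0.01, 0.1)}
for lr, loss in results.items():
    print(f"lr={lr}: test MSE={loss:.4f}")
```

Resetting the seed inside the helper gives every run the same starting weights, so differences in the final loss come from the learning rate alone.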

