# <font color='pickle'>**REFACTOR -nn_module--Lecture_2_2 LR**

# <font color='pickle'>**Install/import libraries**

In [None]:
# Install wandb and update it to the latest version
!pip install wandb --upgrade -q



In [None]:
# Importing PyTorch Library
import torch
import torch.nn as nn
import random
import wandb

In [None]:
# To get deterministic results
torch.manual_seed(456)
random.seed(123)

# <font color='pickle'>**Generating a Dataset**

We will generate a dummy dataset with 1000 observations and 2 features.
The features are sampled from a standard normal distribution.

Let the true parameter values be w = [3, -4.5] and b = 5.2.

`y = Xw.T + b + noise`

We will further assume that the noise is normally distributed with mean 0 and standard deviation 0.01.


In [None]:
def generate_dataset(w, b, num): 

    """
    Generate a synthetic linear regression dataset.

    Parameters:
    w: true weights, shape (1, num_features)
    b: true bias
    num: number of observations

    Returns: a TensorDataset holding the features X and the labels y
    """
    
    # Generate X values from the standard normal distribution
    X = torch.normal(0, 1, (num, len(w.T)))

    # Generate y values: y = Xw.T + b
    y = torch.mm(X, w.T) + b

    # Add noise to the labels
    y += torch.normal(0, 0.01, y.shape)

    dataset = torch.utils.data.TensorDataset(X, y)

    # Returning the dataset generated
    return dataset

In [None]:
# Initializing actual weight and bias values
w_true = torch.Tensor([3, -4.5]).view(1,-1)
b_true = 5.2

# Calling the generate_dataset function to create a dummy dataset
train_dataset = generate_dataset(w_true, b_true, 1000)


In [None]:
train_dataset[:][0].shape

torch.Size([1000, 2])

In [None]:
train_dataset[:][1].shape

torch.Size([1000, 1])

# <font color='pickle'>**DataLoaders**

To train our model, we will draw mini-batches from the dataset and use each one to update the model.

To simplify this process, we will use PyTorch's `DataLoader`, which shuffles the dataset and serves it in mini-batches.

In [None]:
batch_size = 15
read_data = torch.utils.data.DataLoader(train_dataset,
                                        batch_size= batch_size,
                                        shuffle = True)

The batch size is set to 15 above. Let's check the shapes of the features and labels in a single mini-batch for a better understanding.

In [None]:
# Iterate over the read_data DataLoader and print the shapes of one mini-batch
for X, y in read_data:
    print(X.shape)
    print(y.shape)
    break

torch.Size([15, 2])
torch.Size([15, 1])
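
As a rough illustration of how a `DataLoader` splits 1000 observations into batches of 15, the sketch below uses random stand-in tensors with the same shapes as `train_dataset` (the values are arbitrary, and the names `loader` and `batch_sizes` are only for this example). Because 1000 is not a multiple of 15, the last mini-batch of each epoch is smaller than the rest.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data with the same shapes as train_dataset (values are arbitrary)
X = torch.randn(1000, 2)
y = torch.randn(1000, 1)
dataset = TensorDataset(X, y)

batch_size = 15
loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

# len(loader) is the number of mini-batches per epoch: ceil(1000 / 15)
num_batches = len(loader)

# Collect the size of every mini-batch in one pass over the data
batch_sizes = [len(xb) for xb, _ in loader]

print(num_batches)       # 67
print(batch_sizes[-1])   # 10, the leftover observations
```

If equal-sized batches are required, `DataLoader` also accepts `drop_last=True`, which discards the final partial batch.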


# <font color='pickle'>**Linear Regression Model**

In [None]:
model = nn.Sequential(torch.nn.Linear(in_features=2, out_features=1))
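
To make the model definition concrete, here is a minimal sketch (with a random input batch and an arbitrary seed, both assumptions for illustration) showing that the `nn.Linear` layer computes `X @ weight.T + bias`, the same form as the data-generating equation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility of this sketch

model = nn.Sequential(nn.Linear(in_features=2, out_features=1))
X = torch.randn(15, 2)  # one stand-in mini-batch

# The single layer inside the Sequential container
layer = model[0]

# nn.Linear stores weight with shape (out_features, in_features)
manual = X @ layer.weight.T + layer.bias   # shape (15, 1)

same = torch.allclose(model(X), manual, atol=1e-6)
print(layer.weight.shape, layer.bias.shape, same)
```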

# <font color='pickle'>**Loss Function**

In [None]:
mse_loss = torch.nn.MSELoss(reduction='mean')
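
As a quick check of what `reduction='mean'` does, the sketch below compares `MSELoss` with the mean of squared differences computed by hand. The three prediction/label pairs are made up for the example.

```python
import torch

# Made-up predictions and labels, shaped like a tiny mini-batch
ypred = torch.tensor([[1.0], [2.0], [4.0]])
y = torch.tensor([[1.5], [2.0], [2.0]])

mse_loss = torch.nn.MSELoss(reduction='mean')
builtin = mse_loss(ypred, y)

# Squared differences are 0.25, 0 and 4, so the mean is 4.25 / 3
manual = ((ypred - y) ** 2).mean()

print(builtin.item(), manual.item())
```

With `reduction='sum'` the squared differences would be added without dividing by the number of elements, which changes the effective scale of the gradients.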

# <font color='pickle'>**Optimization Algorithm**

In [None]:
# weight update step
optimizer = torch.optim.SGD(model.parameters(), lr = 0.005)        
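
For plain SGD with no momentum or weight decay (as configured here), `optimizer.step()` should amount to `param = param - lr * param.grad`. The sketch below checks this on a random stand-in batch. The seed and data are assumptions for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for this sketch

model = nn.Sequential(nn.Linear(2, 1))
lr = 0.005
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

# Random stand-in mini-batch
X = torch.randn(15, 2)
y = torch.randn(15, 1)

loss = nn.MSELoss()(model(X), y)
optimizer.zero_grad()
loss.backward()

# Expected weights after one plain SGD update, computed by hand
w_before = model[0].weight.detach().clone()
expected_w = w_before - lr * model[0].weight.grad

optimizer.step()

matches = torch.allclose(model[0].weight.detach(), expected_w, atol=1e-7)
print(matches)
```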

# <font color='pickle'>**wandb login**

In [None]:
# Login to W&B
wandb.login()
wandb.init(name = "L_2_refactor1", project = 'dl22_l2')


ERROR:wandb.jupyter:Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.

wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc
wandb: Currently logged in as: hsingh-utd. Use `wandb login --relogin` to force relogin


# <font color='pickle'>**Model Training**

**Model Training** involves the following steps:

- Step 0: Randomly initialize parameters / weights
- Step 1: Compute model's predictions - forward pass
- Step 2: Compute loss
- Step 3: Compute the gradients
- Step 4: Update the parameters
- Step 5: Repeat steps 1 - 4

Model training is repeating this process over and over, for many **epochs**.
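
Steps 1 to 3 can be sketched on a single random mini-batch as follows. The sketch also compares the gradients that autograd produces with the closed-form gradients of the mean squared error for a linear model. The data, seed, and the names `w_ok` and `b_ok` are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for this sketch

# Random stand-in mini-batch
X = torch.randn(15, 2)
y = torch.randn(15, 1)

model = nn.Sequential(nn.Linear(2, 1))

ypred = model(X)                   # step 1: forward pass
loss = nn.MSELoss()(ypred, y)      # step 2: compute the loss
loss.backward()                    # step 3: compute gradients with autograd

# Closed-form gradients of mean((ypred - y)^2) for ypred = X w.T + b
with torch.no_grad():
    grad_w = 2 * X.T.mm(ypred - y) / len(y)    # shape (2, 1)
    grad_b = 2 * (ypred - y).sum() / len(y)    # scalar

# The layer stores weight as (1, 2), so transpose grad_w before comparing
w_ok = torch.allclose(model[0].weight.grad, grad_w.T, atol=1e-6)
b_ok = torch.allclose(model[0].bias.grad, grad_b.reshape(1), atol=1e-6)
print(w_ok, b_ok)
```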

## <font color = 'pickle'> **Initialize Model Parameters**

In [None]:
def init_weights(layer):
  if isinstance(layer, nn.Linear):
      torch.nn.init.normal_(layer.weight, mean=0, std=0.01)
      torch.nn.init.zeros_(layer.bias)
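
A small sketch of how an initializer like this is used: `model.apply(fn)` calls `fn` on every submodule and then on the model itself, so the `isinstance` check ensures only `nn.Linear` layers are touched. The seed and the loose 0.1 bound on the weights are assumptions for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for this sketch

def init_weights(layer):
    # Only Linear layers have the weight/bias pair we want to reset
    if isinstance(layer, nn.Linear):
        torch.nn.init.normal_(layer.weight, mean=0, std=0.01)
        torch.nn.init.zeros_(layer.bias)

model = nn.Sequential(nn.Linear(2, 1))
model.apply(init_weights)

# Weights are drawn from N(0, 0.01^2), so they should be close to zero
max_abs_weight = model[0].weight.abs().max().item()
bias_is_zero = bool((model[0].bias == 0).all())
print(max_abs_weight, bias_is_zero)
```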

## <font color = 'pickle'> **Training Loop**

In [None]:
# Initialize the model parameters and set the number of epochs
model.apply(init_weights)
epochs = 10

# Iterate over the whole dataset
for epoch in range(epochs):
    
    # Iterate over mini batch
    for X, y in read_data:

        # step 1 :forward pass - compute predictions
        ypred = model(X)

        # step 2: Calculate minibatch loss
        batch_loss = mse_loss(ypred, y)
        

        # step 3: Compute gradients of the loss with respect to weights and bias.
        # backward() computes the equivalent of:
        # grad_w = 2 *X.T.mm(ypred-y)/len(y)
        # grad_b = 2 *(ypred-y).sum()/len(y)

        optimizer.zero_grad()
        batch_loss.backward()

        # step 4: Update the parameters with the optimizer, using their gradients
        optimizer.step()

    
    # Calculate and print the loss on the full training set after each epoch
    # (no gradients are needed for this evaluation)
    with torch.no_grad():
        train_l = mse_loss(model(train_dataset[:][0]), train_dataset[:][1])

    # We can observe the epoch vs loss curve in W&B
    wandb.log({"/Loss_2_2": train_l.item()})

    print(f'epoch {epoch + 1}, loss {train_l.item():f}')

epoch 1, loss 14.910106
epoch 2, loss 4.350232
epoch 3, loss 1.279587
epoch 4, loss 0.376210
epoch 5, loss 0.111061
epoch 6, loss 0.032932
epoch 7, loss 0.009770
epoch 8, loss 0.002961
epoch 9, loss 0.000961
epoch 10, loss 0.000363


We can observe that the loss decreases with each epoch, so our linear regression model fits the training data increasingly well.

Since we generated the dataset ourselves, we know the true values of the weights and bias, so we can check the error in the estimates of both.

In [None]:
# Print the error in the estimated weights and bias
print(f'Error in estimating w: {w_true - model[0].weight.data}')
print(f'Error in estimating b: {b_true - model[0].bias.data}')
print(f'estimated value of w: {model[0].weight.data}')
print(f'estimated value of b: {model[0].bias.data}')

Error in estimating w: tensor([[ 0.0079, -0.0121]])
Error in estimating b: tensor([0.0089])
estimated value of w: tensor([[ 2.9921, -4.4879]])
estimated value of b: tensor([5.1911])


# <font color = 'pickle'>**Save and Load Model**

In [None]:
model.state_dict()

OrderedDict([('0.weight', tensor([[ 2.9921, -4.4879]])),
             ('0.bias', tensor([5.1911]))])

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
from pathlib import Path

In [None]:
save_model = Path('/content/drive/MyDrive/data/models/dl_fall_2022')

In [None]:
# parents=True also creates any missing intermediate folders
save_model.mkdir(parents=True, exist_ok=True)

In [None]:
model_file = save_model/'2_2_LR_refactor.pt'

In [None]:
torch.save(model.state_dict(), model_file)

In [None]:
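# Create a fresh model with the same architecture and load the saved parameters into it
# (model1 = model would only create a second name for the already trained model)
model1 = nn.Sequential(torch.nn.Linear(in_features=2, out_features=1))
model1.load_state_dict(torch.load(model_file))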

<All keys matched successfully>

In [None]:
model1.state_dict()

OrderedDict([('0.weight', tensor([[ 2.9921, -4.4879]])),
             ('0.bias', tensor([5.1911]))])
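
The save/load round trip can also be tried without Google Drive. The sketch below is a minimal version that writes to a temporary directory and loads the parameters into a separately constructed model. The file name `lr_sketch.pt` and the untrained stand-in model are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for this sketch

# Stand-in for the trained model (here just randomly initialized)
trained = nn.Sequential(nn.Linear(2, 1))

with tempfile.TemporaryDirectory() as tmp:
    model_file = Path(tmp) / 'lr_sketch.pt'   # hypothetical file name

    # Save only the parameters, not the whole model object
    torch.save(trained.state_dict(), model_file)

    # A new model with the same architecture but different initial parameters
    restored = nn.Sequential(nn.Linear(2, 1))
    restored.load_state_dict(torch.load(model_file))

# After loading, both models hold identical parameters
same_w = torch.equal(trained[0].weight, restored[0].weight)
same_b = torch.equal(trained[0].bias, restored[0].bias)
print(same_w, same_b)
```

Saving the `state_dict` rather than the model object means the architecture has to be rebuilt in code before loading, which is why `restored` is constructed first.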

In [None]:
wandb.finish()


Run history:
/Loss_2_2 █▃▂▁▁▁▁▁▁▁

Run summary:
/Loss_2_2 0.00036
