
### OOP in PyTorch
> Object-oriented programming (OOP) is a programming paradigm that bundles data and the methods that operate on that data into a single unit, the class. Below is an example of a simple class.

In [None]:
class Account:
  def __init__(self,bal):
    self.bal = bal

  def deposit(self,dep):
    self.bal+= dep


In [None]:
acc = Account(100)
acc.deposit(20)
acc.bal

120

In [None]:
import pandas as pd
import torch
import torch.nn as nn
from torch.utils.data import Dataset

In [None]:
class WaterDataset(Dataset):
  def __init__(self,csv_path):
    super().__init__()
    data = pd.read_csv(csv_path)
    self.data = data.to_numpy()

  def __len__(self):
    return self.data.shape[0]

  def __getitem__(self,idx):
    features = self.data[idx,:-1]
    labels = self.data[idx,-1]
    return features,labels

In [None]:
water_data = WaterDataset('https://assets.datacamp.com/production/repositories/6193/datasets/fca5067912db2b2346f568ce806915450fe56b99/water_potability.csv')

In [None]:
from torch.utils.data import DataLoader

In [None]:
dataloader = DataLoader(water_data,batch_size=2,shuffle=True)

In [None]:
next(iter(dataloader))

[tensor([[0.6673, 0.4107, 0.1506, 0.6020, 0.6513, 0.7163, 0.5880, 0.8377, 0.6253],
         [0.5011, 0.4301, 0.4543, 0.5195, 0.5226, 0.2205, 0.5727, 0.5249, 0.3902]],
        dtype=torch.float64),
 tensor([0., 1.], dtype=torch.float64)]

In [None]:
class Net(nn.Module):
  def __init__(self):
    super().__init__()
    # 9 input features -> 16 -> 8 -> 1 output
    self.fc1 = nn.Linear(9,16)
    self.fc2 = nn.Linear(16,8)
    self.fc3 = nn.Linear(8,1)

  def forward(self,x):
    x = nn.functional.relu(self.fc1(x))
    x = nn.functional.relu(self.fc2(x))
    # Sigmoid turns the final output into a probability for BCELoss
    x = nn.functional.sigmoid(self.fc3(x))
    return x

In [None]:
net = Net()
from torch.optim import Adam

In [None]:
crit = nn.BCELoss()
optimizer = Adam(net.parameters(),lr=0.001)

In [None]:
def train_model(dataset,net,epochs=5):
  crit = nn.BCELoss()
  optimizer = Adam(net.parameters(),lr=0.001)
  # Hold out 20% for validation and train only on the remaining 80%
  trainset,valset = torch.utils.data.random_split(dataset,[0.8,0.2])
  trainload = DataLoader(trainset,batch_size=3,shuffle=True)
  for epoch in range(epochs):
    for features,labels in trainload:
      optimizer.zero_grad()
      # The data is float64; cast to float32 to match the model weights
      preds = net(features.float())
      loss = crit(preds,labels.float().reshape(-1,1))
      loss.backward()
      optimizer.step()


In [None]:
from torch.utils.data import random_split

In [None]:
feat,labels=next(iter(dataloader))

In [None]:
pred= net(feat.float())

In [None]:
labels=labels.float()

In [None]:
crit(pred,labels.reshape(-1,1))

tensor(0.6968, grad_fn=<BinaryCrossEntropyBackward0>)

In [None]:
pred.shape

torch.Size([2, 1])

In [None]:
labels.reshape(-1,1).shape

torch.Size([2, 1])

In [None]:
!pip install torchmetrics


In [None]:
from torchmetrics import Accuracy

In [None]:
acc =Accuracy(task='binary')

In [None]:
import torch

In [None]:
crit = nn.BCELoss()
optimizer = Adam(net.parameters(),lr=0.001)
trainset,valset = random_split(water_data,[0.8,0.2])
trainload = DataLoader(trainset,batch_size=3,shuffle=True)
valload = DataLoader(valset,batch_size=3,shuffle=True)
for epoch in range(10):
  for features,labels in trainload:
    optimizer.zero_grad()
    preds = net(features.float())
    labels = labels.float()
    loss = crit(preds,labels.reshape(-1,1))
    loss.backward()
    optimizer.step()

  # Switch to evaluation mode for the validation pass
  net.eval()
  # Clear the metric state so each epoch reports its own accuracy
  acc.reset()

  with torch.no_grad():
    for features,labels in valload:
      output = net(features.float())
      preds = (output>=0.5).float()
      acc(preds,labels.reshape(-1,1))

  # The accuracy printed here is measured on the held-out validation split
  print("Test Accuracy {}: ".format(epoch),acc.compute())
  net.train()


Test Accuracy 0:  tensor(0.6194)
Test Accuracy 1:  tensor(0.6194)
Test Accuracy 2:  tensor(0.6194)
Test Accuracy 3:  tensor(0.6194)
Test Accuracy 4:  tensor(0.6194)
Test Accuracy 5:  tensor(0.6194)
Test Accuracy 6:  tensor(0.6194)
Test Accuracy 7:  tensor(0.6194)
Test Accuracy 8:  tensor(0.6194)
Test Accuracy 9:  tensor(0.6194)


In [None]:
net.fc1.weight.dtype
net.fc2.weight.dtype

torch.float32

In [None]:
features.dtype

torch.float64

### Vanishing and Exploding gradients
* Neural networks can suffer from unstable gradients during training. In some cases the gradients vanish: they shrink as they are propagated backwards, so the earlier layers receive almost no updates.
* In other cases the opposite happens and the gradients explode, growing at each layer until training becomes unstable.
> To address this problem we use a three-step approach:
> * **Proper weight initialization.** Good initialization keeps the variance of a layer's outputs equal to the variance of its inputs. The variance of the gradients before and after a layer should also be the same. With ReLU activations, use *He (Kaiming)* initialization.
> * **Appropriate activation functions.** ReLU has a dying neuron problem: a unit whose input stays negative outputs zero and stops learning. The ELU activation function avoids this to a large extent.
> * **Batch normalization.** Very similar to standardization, but applied between layers.
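
To see why the scale of the initial weights matters, here is a minimal pure-Python sketch (it is not the notebook's PyTorch code). It pushes a random vector through a stack of ReLU layers and compares He-scaled weights, drawn with standard deviation `sqrt(2 / fan_in)`, against weights drawn with a fixed small standard deviation. The helper names, the width of 128, the depth of 10 and the value 0.01 are assumptions chosen only for illustration.

```python
import math
import random


def relu_layer(x, fan_out, weight_std, rng):
    """One dense layer with N(0, weight_std^2) weights, followed by ReLU."""
    out = []
    for _ in range(fan_out):
        z = sum(rng.gauss(0.0, weight_std) * xi for xi in x)
        out.append(max(z, 0.0))
    return out


def mean_square(x):
    """Average squared activation, used here as a measure of signal scale."""
    return sum(v * v for v in x) / len(x)


def signal_after(depth, width, std_fn, seed=0):
    """Mean squared activation after `depth` ReLU layers of size `width`."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    for _ in range(depth):
        # std_fn receives fan_in (the layer's input size) and returns the weight std
        x = relu_layer(x, width, std_fn(width), rng)
    return mean_square(x)


# He scaling: std = sqrt(2 / fan_in)
he_signal = signal_after(10, 128, lambda fan_in: math.sqrt(2.0 / fan_in))
# Assumed "too small" alternative: a fixed std of 0.01
tiny_signal = signal_after(10, 128, lambda fan_in: 0.01)

print("He-scaled weights:", he_signal)
print("std = 0.01 weights:", tiny_signal)
```

With He scaling the signal should stay at roughly its starting scale after ten layers, while the small fixed scale drives it towards zero. Gradients flowing backwards are multiplied by the same weights, so they tend to shrink in a similar way.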

In [None]:
import torch.nn.init as init
import torch.nn as nn
from torch.utils.data import Dataset,DataLoader
from torch.optim import Adam

In [None]:
class WaterData(Dataset):
  def __init__(self,path):
    super().__init__()
    data = pd.read_csv(path)
    self.y = data['Potability'].to_numpy()
    self.X = data.drop(columns='Potability').to_numpy()

  def __len__(self):
    return len(self.y)

  def __getitem__(self,idx):
    features = self.X[idx]
    labels = self.y[idx]
    return features,labels

In [None]:
class Model(nn.Module):
  def __init__(self):
    super(Model,self).__init__()
    self.fc1 = nn.Linear(9,12)
    self.bn1 = nn.BatchNorm1d(12)
    self.fc2 = nn.Linear(12,10)
    self.bn2 = nn.BatchNorm1d(10)
    self.fc3 = nn.Linear(10,1)

    init.kaiming_uniform_(self.fc1.weight)
    init.kaiming_uniform_(self.fc2.weight)
    init.kaiming_uniform_(self.fc3.weight,nonlinearity="sigmoid")

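  def forward(self,x):
    # First hidden layer: linear -> batch norm -> ReLU
    x = self.fc1(x)
    x = self.bn1(x)
    x = nn.functional.relu(x)
    # Second hidden layer; bn2 is defined above but left disabled here
    x = self.fc2(x)
    x = nn.functional.relu(x)
    #x = self.bn2(x)
    x = self.fc3(x)
    # Sigmoid turns the final output into a probability for BCELoss
    x = nn.functional.sigmoid(x)
    return x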
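
As a rough illustration of what a batch normalization layer such as `bn1` does in training mode, the sketch below standardizes each feature across a batch in plain Python. It is a simplification under stated assumptions: the learnable scale and shift of `nn.BatchNorm1d` and its running statistics are left out, and the function name and sample values are hypothetical.

```python
import math


def batch_norm(batch, eps=1e-5):
    """Standardize each feature (column) across the batch (rows)."""
    n = len(batch)
    n_features = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(n_features)]
    # Biased variance (divide by n) is used for the normalization step
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n
        for j in range(n_features)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(n_features)]
        for row in batch
    ]


# Two features on very different scales (made-up numbers)
activations = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
normalized = batch_norm(activations)
for row in normalized:
    print(row)
```

After normalization both columns have mean close to 0 and variance close to 1, so the next layer sees inputs on a comparable scale regardless of how the previous layer's outputs were spread.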

In [None]:
from torch.utils.data import random_split

In [None]:
net = Model()

In [None]:
water = WaterData('https://assets.datacamp.com/production/repositories/6193/datasets/fca5067912db2b2346f568ce806915450fe56b99/water_potability.csv')

In [None]:
trainset,valset = random_split(water,[0.8,0.2])

In [None]:
trainload = DataLoader(trainset,batch_size=5,shuffle=True)
valload = DataLoader(valset,batch_size=5,shuffle=True)

In [None]:
feat,label=next(iter(trainload))

In [None]:
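# Sanity check: repeatedly fit the single batch drawn above; the loss should fall steadily
crit = nn.BCELoss()
optimizer = Adam(net.parameters(),lr=0.001)
feat,label = feat.float(),label.float().reshape(-1,1)
for i in range(200):
  optimizer.zero_grad()
  preds = net(feat)
  loss = crit(preds,label)
  print(loss.item())
  loss.backward()
  optimizer.step()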

0.021751023828983307
0.021609120070934296
0.021468661725521088
0.021329661831259727
0.02119210734963417
0.021055951714515686
0.020921174436807632
0.020787760615348816
0.02065572515130043
0.020525000989437103
0.020395588129758835
0.020267488434910774
0.020140668377280235
0.020015103742480278
0.019890785217285156
0.01976771280169487
0.019645828753709793
0.01952514424920082
0.019405653700232506
0.019287321716547012
0.0191701240837574
0.019054073840379715
0.018939144909381866
0.018825318664312363
0.01871257834136486
0.018600905314087868
0.018490303307771683
0.018380772322416306
0.018272235989570618
0.0181647390127182
0.018058253452181816
0.017952758818864822
0.01784825325012207
0.01774471625685692
0.017642129212617874
0.017540495842695236
0.01743979938328266
0.01734001748263836
0.017241140827536583
0.017143193632364273
0.017046108841896057
0.016949914395809174
0.016854584217071533
0.01676008850336075
0.016666464507579803
0.016573671251535416
0.01648169942200184
0.016390550881624222
0.01630

In [None]:
crit=nn.BCELoss()

In [None]:
crit = nn.BCELoss()
optimizer = Adam(net.parameters(),lr=0.001)
for epoch in range(10):
  for features,labels in trainload:
    optimizer.zero_grad()
    preds = net(features.float())
    labels = labels.float()
    loss = crit(preds,labels.reshape(-1,1))
    loss.backward()
    optimizer.step()

  # Switch to evaluation mode so batch norm uses its running statistics
  net.eval()
  # Clear the metric state so each epoch reports its own accuracy
  acc.reset()

  with torch.no_grad():
    for features,labels in valload:
      output = net(features.float())
      preds = (output>=0.5).float()
      acc(preds,labels.reshape(-1,1))

  # The accuracy printed here is measured on the held-out validation split
  print("Test Accuracy {}: ".format(epoch),acc.compute())
  net.train()

Test Accuracy 0:  tensor(0.6172)
Test Accuracy 1:  tensor(0.6180)
Test Accuracy 2:  tensor(0.6183)
Test Accuracy 3:  tensor(0.6189)
Test Accuracy 4:  tensor(0.6195)
Test Accuracy 5:  tensor(0.6195)
Test Accuracy 6:  tensor(0.6198)
Test Accuracy 7:  tensor(0.6205)
Test Accuracy 8:  tensor(0.6214)
Test Accuracy 9:  tensor(0.6222)


In [None]:
data = pd.read_csv('https://assets.datacamp.com/production/repositories/6193/datasets/fca5067912db2b2346f568ce806915450fe56b99/water_potability.csv')

In [None]:
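# Class balance of the target: share of each Potability value
data['Potability'].value_counts(normalize=True)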

Potability
0    0.596718
1    0.403282
Name: proportion, dtype: float64

### Deep Learning with image data
* Digital images are made up of pixels ("picture elements"). Each pixel is a tiny square of a single color.
* In a grayscale image each pixel is one number between 0 and 255, covering every shade of grey from black (0) to white (255).
* In a color image each pixel is described by a set of three numbers giving the intensity of red, green and blue (RGB).
* Folder structure for image processing:
  * Train folder
    * category 1 folder
    * category 2 folder
    * category n folder
  * Test folder
    * category 1 folder
    * category 2 folder
    * category n folder
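
The folder layout above is what lets a loader infer labels from directory names. The sketch below mimics that idea in plain Python: it maps each category folder, in sorted order, to an integer label. It is an illustration, not torchvision's implementation, and the category names, file names and helper functions are hypothetical.

```python
import tempfile
from pathlib import Path


def find_classes(root):
    """Map each category sub-folder of `root` to an integer label, in sorted order."""
    names = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    return {name: idx for idx, name in enumerate(names)}


def list_samples(root):
    """Return (file path, label) pairs for every file inside the category folders."""
    samples = []
    for name, idx in find_classes(root).items():
        for file in sorted((Path(root) / name).iterdir()):
            if file.is_file():
                samples.append((file, idx))
    return samples


# Build a throwaway train folder with two made-up categories
with tempfile.TemporaryDirectory() as tmp:
    train = Path(tmp) / "train"
    for category, files in {"dog": ["d1.jpg"], "cat": ["c1.jpg", "c2.jpg"]}.items():
        (train / category).mkdir(parents=True)
        for f in files:
            (train / category / f).touch()

    class_to_idx = find_classes(train)
    labels = [label for _, label in list_samples(train)]

print(class_to_idx)
print(labels)
```

Because the label comes from the folder name, the train and test folders need the same set of category folders for the labels to line up.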

### Convolutional neural networks
* A 256×256 grayscale image has 65,536 pixels, so a linear layer would have over 65K model inputs.
* If the next layer has 1,000 neurons, the number of parameters quickly explodes to over 65 million.
* If the image is in color, you could end up with almost 200 million (65M × 3) parameters.
* Linear layers are also poor at recognizing spatial patterns.
* In a convolutional layer the parameters are collected in small grids called filters.
* Sliding a filter over the image (a convolution) creates a feature map.
* A feature map preserves spatial patterns, and a convolutional layer uses far fewer parameters than a linear layer.
* We can use several filters, each producing its own feature map, and apply an activation to each feature map.
* A CNN has
  * Feature extractor: convolution, activation, pooling
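
The parameter counts above, and the way a filter produces a feature map, can be checked with a small pure-Python sketch. The function names, the 3×3 filter size, the 4×4 image and the edge filter are assumptions for illustration. As in deep learning libraries, the "convolution" here slides the filter without flipping it.

```python
def linear_params(n_inputs, n_neurons):
    """Weights plus one bias per neuron in a fully connected layer."""
    return n_inputs * n_neurons + n_neurons


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


gray_params = linear_params(256 * 256, 1000)       # grayscale image into 1,000 neurons
color_params = linear_params(256 * 256 * 3, 1000)  # three color channels
filter_params = 3 * 3 + 1                          # one 3x3 filter plus its bias

# A 4x4 image with a vertical edge between columns 1 and 2
image = [[0, 0, 1, 1]] * 4
edge_filter = [[-1, 1], [-1, 1]]
feature_map = conv2d_valid(image, edge_filter)

print(gray_params, color_params, filter_params)
for row in feature_map:
    print(row)
```

The feature map is non-zero only where the edge sits, which is the sense in which it preserves spatial patterns, and the filter's parameter count does not depend on the image size.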

In [None]:
from torchvision.datasets import ImageFolder
from torchvision import transforms

In [None]:
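# Convert each image to a tensor and resize it to 128x128
train_transforms = transforms.Compose([transforms.ToTensor(),transforms.Resize((128,128))])
# `path` must point to the train folder laid out as described above; it is not defined in this notebook
datatrain = ImageFolder(path,transform=train_transforms)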

NameError: name 'path' is not defined