<a href="https://colab.research.google.com/github/FrodoBaggins87/Machine_Learning/blob/main/Classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Generate Toy Dataset


In [None]:
import sklearn
from sklearn.datasets import make_circles

#choose number of samples
n_samples=1000

#create circles
x,y=make_circles(n_samples,
                 noise=0.02,
                 random_state=66)

In [None]:
len(x), len(y)

In [None]:
print(x[:5])
print(y[:5])

In [None]:
import pandas as pd
circle_data=pd.DataFrame({"x_1":x[:,0],
                         "x_2":x[:,1],
                         "label":y})
circle_data.head()

In [None]:
#Visualizing data
import matplotlib.pyplot as plt
plt.scatter(x=x[:,0],
            y=x[:,1],
            c=y, #colour decided by the value in y
            cmap=plt.cm.RdYlBu)#cmap sets the colour map applied to the values of y

Check input and output shapes


In [None]:
x.shape,y.shape

In [None]:
print("sample x value", x[0])
print("sample y value", y[0])

Turn data into tensors


In [None]:
import torch
torch.__version__

In [None]:
type(x)

In [None]:
#Turn into tensors
x_tensor=torch.from_numpy(x).type(torch.float)
y_tensor=torch.from_numpy(y).type(torch.float)#torch.float is an alias for torch.float32
type(x_tensor),type(y_tensor),x_tensor.dtype,y_tensor.dtype
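
The explicit cast is needed because NumPy arrays of floats are float64 by default and torch.from_numpy keeps that dtype, while nn.Linear weights are float32 by default. A minimal sketch of the dtype change, using a hypothetical array x_np in place of the make_circles output:

```python
import numpy as np
import torch

# Hypothetical stand-in for the NumPy array returned by make_circles
x_np = np.random.rand(4, 2)        # NumPy creates float64 by default

x_t64 = torch.from_numpy(x_np)     # dtype is preserved: torch.float64
x_t32 = x_t64.type(torch.float)    # torch.float is an alias for torch.float32

print(x_np.dtype, x_t64.dtype, x_t32.dtype)
```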

In [None]:
#split into training and test datasets
from sklearn.model_selection import train_test_split

x_train,x_test,y_train,y_test= train_test_split(x,
                                                y,
                                                test_size=0.2) #test set is 20% of the whole dataset

In [None]:
len(x_train),len(x_test),len(y_train),len(y_test)

Building a model


In [None]:
import torch
from torch import nn
#make device agnostic code
device= "cuda" if torch.cuda.is_available() else "cpu"
device

Now, the following steps will be done:
1. Make a subclass of nn.Module
2. Create 2 nn.Linear layers that are capable of handling shapes of our data
3. Define a forward() method based on forward computation required
4. Make an instance of the subclass and send to target device

In [None]:
x_train.shape, y_train.shape #side note: shape is an attribute of ndarray, not a method, which is why there is no ()

In [None]:
class Circle_Model(nn.Module):
  def __init__(self):
    super().__init__()
    #in_features and out_features are chosen based on the shapes of x_train and y_train
    self.layer_1=nn.Linear(in_features=2, out_features=5)#takes 2 features (the coordinates) and outputs the number of features set by out_features
    self.layer_2=nn.Linear(in_features=5, out_features=1)#takes the 5 features produced by the first layer and outputs 1 feature

  #define forward pass (must be a method of the class, not nested inside __init__)
  def forward(self,x):
    return self.layer_2(self.layer_1(x)) #x goes through the first layer, then the second layer, which gives the output

#make an instance of our model
model_0=Circle_Model().to(device)
model_0

In [None]:
device

In [None]:
#parameters() returns an iterator over the model's parameters; next() fetches the first one,
#and .device (an attribute of the parameter tensor, not of nn.Module) reports where that parameter is stored
next(model_0.parameters()).device

In [None]:
type(model_0.parameters())

In [None]:
#replicating the model using nn.Sequential

model_0=nn.Sequential(
    nn.Linear(in_features=2, out_features=5),
    nn.Linear(in_features=5, out_features=1)
).to(device)

model_0 #TODO: figure out how to change the dtype of a model's weights

#this builds essentially the same neural network as the nn.Module subclass above
#nn.Sequential creates all the layers at once and chains them in order, instead of defining each layer as an attribute and writing forward() by hand
#use it when the network is simple; if the network is heavily customized, subclass nn.Module as above

In [None]:
#make untrained predictions
with torch.inference_mode():
  untrained_pred=model_0(torch.from_numpy(x_test.astype('float32')).to(device)) #input must be on the same device as the model
print("shape of untrained predictions:", untrained_pred.shape)
print("shape of test sample", x_test.shape)
print("first 10 predictions", torch.round(untrained_pred[:10]))
print("first 10 y_test", y_test[:10])


Setting up Loss Function and Optimizer:
Which ones to use?
1. For Regression: MAE or MSE is generally used.
2. For Classification: Binary Cross Entropy (two classes) or Categorical Cross Entropy (more than two classes) is generally used.
3. Optimizer: SGD and Adam are the most commonly used.
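
For binary classification, PyTorch offers nn.BCELoss (expects probabilities) and nn.BCEWithLogitsLoss (expects raw logits and applies the sigmoid internally). The sketch below, on made-up logits and targets, suggests the two give the same value when the sigmoid is applied by hand for nn.BCELoss; the fused version is generally described as more numerically stable.

```python
import torch
from torch import nn

torch.manual_seed(0)
logits = torch.randn(8)                       # hypothetical raw model outputs
targets = torch.randint(0, 2, (8,)).float()   # hypothetical binary labels as floats

# Sigmoid applied inside the loss
loss_with_logits = nn.BCEWithLogitsLoss()(logits, targets)
# Sigmoid applied manually, then plain BCE on probabilities
loss_plain = nn.BCELoss()(torch.sigmoid(logits), targets)

print(loss_with_logits.item(), loss_plain.item())
```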

In [None]:
#use BCE with the sigmoid built in, so the model should output raw logits
loss_fn=nn.BCEWithLogitsLoss()#see its documentation
optimizer=torch.optim.SGD(params=model_0.parameters(),
                          lr=0.1)

In [None]:
#calculate accuracy
def accuracy_calc(y_true, y_pred):
  #torch.eq builds a boolean tensor that is True wherever the two values match,
  #sum() counts the matches and returns a tensor, item() converts that tensor to a Python number
  correct=torch.eq(y_true,y_pred).sum().item()
  acc=(correct/len(y_pred))*100#as a percentage
  return acc
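
A small usage sketch of the accuracy function on made-up labels (the function is repeated here so the cell runs on its own). It also illustrates why predictions of shape [N, 1] should be squeezed first: torch.eq broadcasts an [N] tensor against an [N, 1] tensor into an [N, N] result, which would inflate the count.

```python
import torch

def accuracy_calc(y_true, y_pred):
    correct = torch.eq(y_true, y_pred).sum().item()
    return (correct / len(y_pred)) * 100

# Hypothetical labels and predictions: 3 of 4 match
y_true = torch.tensor([1., 0., 1., 1.])
y_pred = torch.tensor([1., 0., 0., 1.])
acc = accuracy_calc(y_true, y_pred)
print(acc)

# Shape pitfall: comparing [4] against [4, 1] broadcasts to [4, 4]
y_pred_col = y_pred.unsqueeze(1)
broadcast_shape = torch.eq(y_true, y_pred_col).shape
print(broadcast_shape)
```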

Train Model:
1. Forward Pass
2. Calculate Loss
3. Zero the Optimizer's Gradients
4. Backpropagation
5. Gradient Descent (optimizer step)
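
The five steps above can be sketched as a single loop. This is a minimal illustration under stated assumptions, not the notebook's final training code: x_demo and y_demo are hypothetical stand-ins for the training tensors, and the model, loss and optimizer mirror the ones defined earlier.

```python
import torch
from torch import nn

torch.manual_seed(42)

# Hypothetical stand-ins for float32 training tensors of shape [N, 2] and [N]
x_demo = torch.randn(100, 2)
y_demo = (x_demo[:, 0] > 0).float()

model = nn.Sequential(nn.Linear(2, 5), nn.Linear(5, 1))
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(100):
    model.train()
    logits = model(x_demo).squeeze(1)   # 1. forward pass, [N, 1] -> [N]
    loss = loss_fn(logits, y_demo)      # 2. calculate loss on raw logits
    optimizer.zero_grad()               # 3. clear gradients from the previous step
    loss.backward()                     # 4. backpropagation
    optimizer.step()                    # 5. gradient descent update
    losses.append(loss.item())

print(losses[0], losses[-1])
```

Zeroing the gradients each iteration matters because PyTorch accumulates gradients across backward() calls by default.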


In [1]:
model_0.eval() #put the model in evaluation mode
#the training loop has not started yet; this manually inspects what the raw model outputs (logits) look like
with torch.inference_mode():
  y_logits=model_0(torch.from_numpy(x_test.astype('float32')).to(device))
y_logits[:5]


In [None]:
y_test[:5]

In [None]:
y_pred_probs=torch.sigmoid(y_logits)# using sigmoid activation function
y_pred_probs

In [None]:
#Calculating predictions two different ways
y_preds=torch.round(y_pred_probs)
#now in full: logits -> probabilities -> labels
y_preds_again= torch.round(torch.sigmoid(model_0(torch.from_numpy(x_test.astype('float32')).to(device))))
#checking for equality
print(torch.eq(y_preds.squeeze(),y_preds_again.squeeze()))
#getting rid of the extra dimension in the predictions
y_preds.squeeze()
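
Rounding the sigmoid output amounts to a 0.5 threshold on the probability, which corresponds to a threshold of 0 on the logit, since sigmoid(0) = 0.5. A short sketch on made-up logits, assuming none of them is so close to zero that float32 rounding matters:

```python
import torch

logits = torch.tensor([-2.0, -0.3, 0.4, 3.1])   # hypothetical raw model outputs

probs = torch.sigmoid(logits)                   # logits -> probabilities
labels_rounded = torch.round(probs)             # probabilities -> 0/1 labels
labels_thresholded = (logits > 0).float()       # same labels taken straight from the logits

print(labels_rounded, labels_thresholded)
```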