# Logistic Regression

Logistic regression models the probability of a discrete outcome given one or more input variables. The most common form models a binary outcome, that is, one that takes one of two values such as true/false or yes/no.
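
As a minimal sketch of this idea, the snippet below maps a linear score to a probability with the sigmoid function and thresholds it at 0.5. The weights `w`, bias `b`, and input `x` are made-up values for illustration, not fitted parameters.

```python
import numpy as np

def sigmoid(z):
    # squashes any real-valued score into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# hypothetical parameters and input, chosen only for illustration
w = np.array([0.8, -0.5])
b = 0.1
x = np.array([2.0, 1.0])

z = np.dot(w, x) + b   # linear score
p = sigmoid(z)         # modeled probability that the outcome is 1
label = int(p > 0.5)   # binary decision at the usual 0.5 threshold
print(p, label)
```

A score of zero corresponds to a probability of exactly 0.5, which is why the decision threshold sits there.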

This week you will fit a logistic regression model to the breast cancer dataset, using the sklearn library for data loading, preprocessing, and evaluation. Feel free to create any new functions you need.

In [None]:
#importing libraries
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn import datasets
import numpy as np

Preparing Data

In [None]:
breast_cancer = datasets.load_breast_cancer()
X, y = breast_cancer.data, breast_cancer.target

In [None]:
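#splitting data for training and testing (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)
#standardizing features: fit the scaler on the training set only, then reuse it on the test set
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)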
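
For intuition, `StandardScaler` subtracts each feature's mean and divides by its standard deviation, using statistics estimated on the training data only. The sketch below reproduces that with plain NumPy on a small made-up array; `train` and `test` are illustrative and are not the breast cancer data.

```python
import numpy as np

# toy data for illustration only: 4 training rows, 2 test rows, 2 features
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
test = np.array([[2.5, 25.0], [5.0, 50.0]])

# statistics come from the training rows only
mean = train.mean(axis=0)
std = train.std(axis=0)   # population std (ddof=0), as StandardScaler uses

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std   # test rows reuse the training statistics

print(train_scaled.mean(axis=0))  # close to 0 for every feature
print(train_scaled.std(axis=0))   # close to 1 for every feature
print(test_scaled)
```

Reusing the training statistics for the test set keeps information about the test rows out of the preprocessing step.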

Binary cross-entropy loss

In [None]:
def BCELoss(y,y_pred):
  # y: true labels (0 or 1); y_pred: predicted probabilities of class 1
  y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7) # clip to avoid log(0)
  loss = -np.mean(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred))
  return loss
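
A quick check on made-up values shows how this loss behaves: confident correct predictions give a loss near zero, a constant prediction of 0.5 gives ln 2, and confident wrong predictions are penalized heavily. The function is repeated so the snippet runs on its own; the label and probability arrays are illustrative only.

```python
import numpy as np

def BCELoss(y, y_pred):
    y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)  # clip to avoid log(0)
    return -np.mean(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred))

y = np.array([1, 0, 1, 0])                 # true labels
good = np.array([0.9, 0.1, 0.8, 0.2])      # mostly right and confident
bad = np.array([0.1, 0.9, 0.2, 0.8])       # mostly wrong and confident
uninformative = np.full(4, 0.5)            # always predicts 0.5

loss_good = BCELoss(y, good)
loss_half = BCELoss(y, uninformative)
loss_bad = BCELoss(y, bad)
print(loss_good, loss_half, loss_bad)
```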

Develop a logistic regression model from scratch. Once the model is trained, evaluate its performance by calculating both the accuracy and the cross-entropy loss.

In [None]:
class LogisticRegression:
    def __init__(self, lr=0.1, iters=1000): #lr (learning rate) & iters (iterations) could be anything of your choice
      self.lr = lr
      self.iters = iters

    def sigmoid(self, X):
      return 1 / (1 + np.exp(-(X.dot(self.W) + self.b)))

    def fit(self, X, y):
      self.m, self.n = X.shape
      self.W = np.zeros(self.n) # weights
      self.b = 0 # bias
      self.X = X
      self.y = y
      for i in range(self.iters):
        self.update_weights()
      return self

    def predict(self, X):
      Z = self.sigmoid(X)
      Y = np.where(Z > 0.5, 1, 0)
      return Y

    def update_weights(self):
      A = self.sigmoid(self.X) # predicted probabilities for the training set
      tmp = A - self.y.T # prediction error (A - y)
      tmp = np.reshape(tmp, self.m) # reshaped to 1D
      dW = np.dot(self.X.T, tmp) / self.m # gradient of the mean BCE loss w.r.t. the weights
      db = np.sum(tmp) / self.m # gradient w.r.t. the bias (average error)
      self.W = self.W - self.lr * dW # gradient descent step: move against the gradient
      self.b = self.b - self.lr * db
      return self
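
The update in `update_weights` relies on the gradients of the mean binary cross-entropy, `dW = X.T @ (A - y) / m` and `db = mean(A - y)`. One way to sanity-check these expressions is to compare them with central finite differences on random data. The sketch below does that with standalone helpers; `loss_fn` and `analytic_grad` are hypothetical names introduced here, not part of the class above, and the data is random rather than the breast cancer set.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_fn(W, b, X, y):
    # mean binary cross-entropy of the logistic model
    p = np.clip(sigmoid(X.dot(W) + b), 1e-7, 1 - 1e-7)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def analytic_grad(W, b, X, y):
    # same expressions as in update_weights
    err = sigmoid(X.dot(W) + b) - y
    return X.T.dot(err) / len(y), np.mean(err)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = (rng.random(50) > 0.5).astype(float)
W = rng.normal(size=3) * 0.1
b = 0.05

dW, db = analytic_grad(W, b, X, y)

# central finite differences, one parameter at a time
eps = 1e-6
num_dW = np.zeros_like(W)
for j in range(len(W)):
    step = np.zeros_like(W)
    step[j] = eps
    num_dW[j] = (loss_fn(W + step, b, X, y) - loss_fn(W - step, b, X, y)) / (2 * eps)
num_db = (loss_fn(W, b + eps, X, y) - loss_fn(W, b - eps, X, y)) / (2 * eps)

# both differences should be tiny if the analytic gradients are right
print(np.max(np.abs(dW - num_dW)), abs(db - num_db))
```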

In [None]:
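model = LogisticRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test) # hard 0/1 labels, thresholded at 0.5
accuracy = accuracy_score(y_test, y_pred)
# note: y_pred holds hard labels rather than probabilities, so every
# misclassified sample contributes the clipped maximum, -log(1e-7), to this loss
loss = BCELoss(y_test, y_pred)
print("Accuracy:", round(accuracy*100, 2), "%")
print("Loss:", round(loss*100, 2), "%")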

Accuracy: 95.61 %
Loss: 70.69 %
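
The reported loss is computed from the hard 0/1 labels returned by `predict`, so it mostly reflects the clipping constant in `BCELoss` rather than the model's confidence. Each misclassified sample contributes about -ln(1e-7), roughly 16.12, and 5 errors out of 114 test samples (which matches the 95.61 % accuracy) give 5 × 16.12 / 114, roughly 0.7069. The sketch below reproduces that figure on synthetic labels and compares it with the same mistakes expressed as probabilities. The arrays are constructed for illustration and are not the actual test split; in the notebook itself, passing probabilities such as `model.sigmoid(X_test)` to `BCELoss` would give a cross-entropy that reflects the model's predicted probabilities.

```python
import numpy as np

def BCELoss(y, y_pred):
    y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)  # clip to avoid log(0)
    return -np.mean(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred))

n, n_wrong = 114, 5            # test-set size and number of mistakes
y_true = np.ones(n)            # synthetic labels, all class 1 for simplicity

# hard labels: 5 samples predicted as 0, the rest as 1
y_hard = np.ones(n)
y_hard[:n_wrong] = 0

# illustrative probabilities for the same 5 mistakes (values are assumptions)
y_soft = np.full(n, 0.9)
y_soft[:n_wrong] = 0.4

accuracy = np.mean(y_true == y_hard)
loss_hard = BCELoss(y_true, y_hard)
loss_soft = BCELoss(y_true, y_soft)

print(round(accuracy * 100, 2), round(loss_hard * 100, 2))  # 95.61 70.69
print(loss_soft)  # much smaller than loss_hard
```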
