# Logistic Regression

Logistic regression is a process of modeling the probability of a discrete outcome given an input variable. The most common logistic regression models a binary outcome; something that can take two values such as true/false, yes/no, and so on.

In this week you will be doing logistic regression on breast cancer dataset using sklearn library. Feel free to create any new functions required.

In [1]:
#importing libraries
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn import datasets
import numpy as np

Preparing Data

In [2]:
breast_cancer = datasets.load_breast_cancer()
X, y = breast_cancer.data, breast_cancer.target

In [16]:
#spliting data for training and testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1234)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

Binary cross entropy loss

In [17]:
def BCELoss(y,y_pred):
  loss = 0
  for i in range(len(y)):
    y_pred = np.clip(y_pred, 1e-7, 1-1e-7)
    loss += (y[i]*np.log(y_pred[i]) + (1-y[i])*np.log(1-y_pred[i]))
    loss /= len(y)
    return loss

Develop a logistic regression model from scratch. Once the model is trained, evaluate its performance by calculating both the accuracy and the cross-entropy loss.

In [21]:
def sigmoid(x):
  return 1 / (1 + np.exp(-x))

class LogisticRegression:
    def __init__(self, lr=0.01, iters=1000): #lr (learning rate) & iters (iterations) could be anything of your choice
      self.lr = lr
      self.iters = iters

    def fit(self, X, y):
      self.X = X
      self.y = y
      self.m, self.n = X.shape
      self.weights = np.zeros(self.n)
      self.bias = 0
      for i in range(self.iters):
        Z = np.dot(X, self.weights) + self.bias
        A = sigmoid(Z)
        self.loss = BCELoss(y,A)
        dw = (1/self.m)*np.dot(X.T, (A-y))
        db = (1/self.m)*np.sum(A-y)
        self.weights -= self.lr*dw
        self.bias -= self.lr*db
      return self

    def predict(self, X):
      Z = np.dot(X, self.weights) + self.bias
      A = sigmoid(Z)
      y_pred = np.where(A >= 0.5, 1, 0)
      return y_pred

model = LogisticRegression(lr=0.01, iters=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred)*100, "%")
print("Cross Entropy Loss:" , abs(model.loss)*100, "%")

Accuracy: 93.85964912280701 %
Cross Entropy Loss: 0.060049062913583647 %
