
In [29]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

In [30]:
from sklearn.datasets import make_classification

x, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=39)

In [31]:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)
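
As a quick sanity check on the split above, the sketch below rebuilds the same data and confirms the partition sizes. With 1000 samples and `test_size=0.33`, the test set has 330 rows, which matches the support totals in the classification reports later in this notebook. The names `n_train` and `n_test` are illustrative and not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Same data and split parameters as the cells above.
x, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=39)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)

n_train, n_test = len(x_train), len(x_test)
print("train rows:", n_train, "test rows:", n_test)

# Per-class counts in the test set (index 0 is class 0, index 1 is class 1).
print("test class counts:", np.bincount(y_test))
```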

In [32]:
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(x_train, y_train)

In [33]:
y_pred = model.predict(x_test)
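
`predict` returns hard class labels. A minimal sketch, assuming the same data and default model as above, of how those labels relate to the class probabilities from `predict_proba`: for a binary problem the predicted label is the class with the larger probability. The name `labels_from_proba` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

x, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=39)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)

model = LogisticRegression().fit(x_train, y_train)

# One row per test sample, one column per class; each row sums to 1.
proba = model.predict_proba(x_test)

# Pick the most probable class and map the column index back to a class label.
labels_from_proba = model.classes_[np.argmax(proba, axis=1)]

print("labels match predict():", np.array_equal(labels_from_proba, model.predict(x_test)))
```

Working with `proba` directly is useful when a decision threshold other than 0.5 is wanted.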

In [34]:
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

score = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
cr = classification_report(y_test, y_pred)

print("Accuracy Score", score)
print("Confusion Matrix", cm)
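# Note: type(cr) only confirms that the report is a str; use print(cr) to display the report itself.
print("Classification Report", type(cr))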

Accuracy Score 0.8939393939393939
Confusion Matrix [[149  14]
 [ 21 146]]
Classification Report <class 'str'>
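
The accuracy above can be recovered directly from the confusion matrix. The sketch below hard-codes the matrix printed in the output (rows are true labels, columns are predicted labels, following scikit-learn's convention) and derives accuracy, plus precision and recall for class 1. The variable names are illustrative.

```python
import numpy as np

# Confusion matrix copied from the output above.
cm = np.array([[149, 14],
               [21, 146]])

# For a binary problem, ravel() yields tn, fp, fn, tp in that order.
tn, fp, fn, tp = cm.ravel()

accuracy = (tn + tp) / cm.sum()      # correct predictions / all predictions
precision_1 = tp / (tp + fp)         # of samples predicted as 1, fraction truly 1
recall_1 = tp / (tp + fn)            # of samples truly 1, fraction predicted as 1

print("accuracy:", accuracy)
print("precision (class 1):", precision_1)
print("recall (class 1):", recall_1)
```

Here 295 of 330 test samples are classified correctly, which gives the reported accuracy of about 0.894.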


Hyperparameter Tuning and Cross-Validation



GridSearchCV

In [35]:
from sklearn.model_selection import GridSearchCV

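model2 = LogisticRegression()
# Candidate values; the grid is the full cross product (4 x 5 x 5 = 100 combinations).
# Not every penalty/solver pair is valid, so some fits fail (see the warning in the output).
penalty = ['l1', 'l2', 'elasticnet', 'none']
solver = ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga']
c_values = [100, 10, 1.0, 0.1, 0.01]

param_grid = dict(penalty=penalty, solver=solver, C=c_values)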

cv = GridSearchCV(estimator=model2, param_grid=param_grid, cv=5, scoring='accuracy')
cv.fit(x_train, y_train)

225 fits failed out of a total of 500.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
25 fits failed with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/sklearn/model_selection/_validation.py", line 686, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/usr/local/lib/python3.10/dist-packages/sklearn/linear_model/_logistic.py", line 1162, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "/usr/local/lib/python3.10/dist-packages/sklearn/linear_model/_logistic.py", line 54, in _check_solver
    raise ValueError(
ValueError: Solver newton-cg supports only 'l2' or 'none' penalties, got l1 penalty.

--------------------------------
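
The failures above come from penalty/solver pairs that `LogisticRegression` rejects. One way to avoid them, sketched here under assumptions, is to pass `GridSearchCV` a list of dicts so that each solver is only paired with penalties it supports. The name `compatible_grid`, the `l1_ratio` value of 0.5, and `max_iter=1000` are choices made for this illustration, not taken from the notebook. The unpenalized option is left out because its spelling differs between scikit-learn releases.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, ParameterGrid, train_test_split

x, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=39)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)

c_values = [100, 10, 1.0, 0.1, 0.01]

# One dict per group of solvers that accept the same penalties.
compatible_grid = [
    {"solver": ["newton-cg", "lbfgs", "sag"], "penalty": ["l2"], "C": c_values},
    {"solver": ["liblinear", "saga"], "penalty": ["l1", "l2"], "C": c_values},
    # elasticnet is only accepted by saga and needs an l1_ratio (0.5 is arbitrary here).
    {"solver": ["saga"], "penalty": ["elasticnet"], "l1_ratio": [0.5], "C": c_values},
]

# Number of parameter combinations the search will evaluate.
n_candidates = len(ParameterGrid(compatible_grid))
print("candidates:", n_candidates)

# error_score="raise" makes any remaining invalid combination fail loudly
# instead of being silently scored as nan.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    compatible_grid,
    cv=5,
    scoring="accuracy",
    error_score="raise",
)
search.fit(x_train, y_train)
print("best params:", search.best_params_)
```

Compared with the full cross product, this grid spends no fits on combinations that cannot run, so every cross-validation score is a real number.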

In [36]:
y2_pred = cv.predict(x_test)

In [37]:
score = accuracy_score(y_test, y2_pred)
cm = confusion_matrix(y_test, y2_pred)
cr = classification_report(y_test, y2_pred)

print("Accuracy Score", score)
print("Confusion Matrix", cm)
print("Classification Report", cr)

Accuracy Score 0.9
Confusion Matrix [[152  11]
 [ 22 145]]
Classification Report               precision    recall  f1-score   support

           0       0.87      0.93      0.90       163
           1       0.93      0.87      0.90       167

    accuracy                           0.90       330
   macro avg       0.90      0.90      0.90       330
weighted avg       0.90      0.90      0.90       330
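
The report above scores the refitted best model on the test set, but the notebook does not show which parameters the grid search selected. A minimal sketch of how a fitted search can be inspected, using a deliberately small grid over `C` only so that it runs quickly; the names `search` and `ranked` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

x, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=39)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)

# Small illustrative grid: only the regularization strength varies.
search = GridSearchCV(
    LogisticRegression(),
    {"C": [100, 10, 1.0, 0.1, 0.01]},
    cv=5,
    scoring="accuracy",
)
search.fit(x_train, y_train)

# best_score_ is the mean cross-validated accuracy on the training folds.
print("best params:", search.best_params_)
print("best mean CV accuracy:", search.best_score_)

# Rank every candidate by its mean cross-validation score.
results = search.cv_results_
ranked = sorted(
    zip(results["rank_test_score"], results["params"], results["mean_test_score"]),
    key=lambda row: row[0],
)
for rank, params, mean_score in ranked:
    print(rank, params, round(float(mean_score), 4))

# Accuracy of the refitted best estimator on the held-out test set.
test_accuracy = search.score(x_test, y_test)
print("test accuracy:", test_accuracy)
```

The cross-validated score and the test accuracy are computed on different data, so they will generally differ slightly.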



RandomizedSearchCV

In [38]:
from sklearn.model_selection import RandomizedSearchCV

In [39]:
model3 = LogisticRegression()

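# n_iter defaults to 10, so 10 sampled combinations x 5 folds = 50 fits.
# No random_state is set, so the sampled combinations change between runs.
random = RandomizedSearchCV(estimator=model3, param_distributions=param_grid, cv=5, scoring='accuracy')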
random.fit(x_train, y_train)

40 fits failed out of a total of 50.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
5 fits failed with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/sklearn/model_selection/_validation.py", line 686, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/usr/local/lib/python3.10/dist-packages/sklearn/linear_model/_logistic.py", line 1162, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "/usr/local/lib/python3.10/dist-packages/sklearn/linear_model/_logistic.py", line 54, in _check_solver
    raise ValueError(
ValueError: Solver sag supports only 'l2' or 'none' penalties, got elasticnet penalty.

---------------------------------
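
With the default `n_iter=10`, 40 failed fits out of 50 means that 8 of the 10 sampled combinations were invalid penalty/solver pairs. A sketch of a more robust setup, under stated assumptions: restrict the search space to the `l2` penalty, which all five solvers accept, draw `C` from a log-uniform distribution over the same range as the original list, and fix `random_state` so the sampled candidates are repeatable. The name `search` (used instead of `random`, which shadows the standard-library module), the seed 0, and `max_iter=1000` are illustrative choices.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split

x, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=39)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=42)

# Lists are sampled uniformly; distributions are sampled via their rvs() method.
param_distributions = {
    "solver": ["newton-cg", "lbfgs", "liblinear", "sag", "saga"],
    "penalty": ["l2"],                  # accepted by every solver listed above
    "C": loguniform(1e-2, 1e2),         # continuous range from 0.01 to 100
}

search = RandomizedSearchCV(
    LogisticRegression(max_iter=1000),
    param_distributions,
    n_iter=10,             # number of sampled candidates (same as the default)
    cv=5,
    scoring="accuracy",
    random_state=0,        # makes the sampled candidates repeatable
    error_score="raise",   # surface invalid combinations instead of scoring nan
)
search.fit(x_train, y_train)

print("best params:", search.best_params_)
print("best mean CV accuracy:", search.best_score_)
```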

In [40]:
random.best_params_

{'solver': 'liblinear', 'penalty': 'l1', 'C': 10}

In [41]:
y_pred3 = random.predict(x_test)

In [42]:
score = accuracy_score(y_test, y_pred3)
cm = confusion_matrix(y_test, y_pred3)
cr = classification_report(y_test, y_pred3)

print("Accuracy Score", score)
print("Confusion Matrix", cm)
print("Classification Report", cr)

Accuracy Score 0.8909090909090909
Confusion Matrix [[148  15]
 [ 21 146]]
Classification Report               precision    recall  f1-score   support

           0       0.88      0.91      0.89       163
           1       0.91      0.87      0.89       167

    accuracy                           0.89       330
   macro avg       0.89      0.89      0.89       330
weighted avg       0.89      0.89      0.89       330

