## Logistic Regression for Multiclass Classification

## One-vs-Rest (OvR)

Definition: The one-vs-rest approach fits one binary classifier per class, where each classifier is trained to separate its class from all the other classes combined. At prediction time, the class whose classifier returns the highest score is chosen. It is computationally efficient, requiring only n_classes classifiers, and it is easy to interpret because each class is represented by a single classifier.
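To make the definition concrete, here is a minimal hand-rolled sketch of one-vs-rest. It is only an illustration under assumptions: the names `X_demo`, `y_demo`, `binary_models`, and `ovr_pred` are hypothetical, the dataset is a small synthetic one rather than the notebook's data, and `max_iter=1000` is an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Small synthetic 3-class problem (illustrative values, not the notebook's data)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=3, n_classes=3, random_state=0
)

classes = np.unique(y_demo)
binary_models = []
for k in classes:
    # Relabel the target: 1 for "class k", 0 for "any other class"
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_demo, (y_demo == k).astype(int))
    binary_models.append(clf)

# Score every sample with each binary model, then pick the most confident class
scores = np.column_stack([m.predict_proba(X_demo)[:, 1] for m in binary_models])
ovr_pred = classes[np.argmax(scores, axis=1)]

print(len(binary_models))  # one classifier per class
print(ovr_pred[:10])
```

Each entry of `binary_models` answers a single yes/no question about one class, which is where the interpretability of the approach comes from.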

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

In [2]:
from sklearn.linear_model import LogisticRegression

In [3]:
from sklearn.datasets import make_classification

In [4]:
# Create a synthetic 3-class dataset: 1000 samples, 10 features, 3 of them informative
X, y = make_classification(n_samples=1000, n_features=10, n_informative=3, n_classes=3, random_state=15)

In [12]:
X.shape

(1000, 10)

In [11]:
y.shape

(1000,)

In [7]:
from sklearn.model_selection import train_test_split

In [13]:
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.30,random_state=42)

In [14]:
logistic = LogisticRegression(multi_class='ovr')

In [15]:
logistic.fit(X_train,y_train)
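In recent scikit-learn releases (1.5 and later, as far as I am aware) the `multi_class` argument of `LogisticRegression` is deprecated and raises a `FutureWarning`. A possible alternative is to wrap the estimator in `OneVsRestClassifier`. The sketch below assumes the same data settings as the cells above; the name `ovr_model` is hypothetical, and the score may differ slightly from the one reported in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Same data-generation and split settings as the cells above
X, y = make_classification(n_samples=1000, n_features=10, n_informative=3,
                           n_classes=3, random_state=15)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

# Explicit one-vs-rest wrapper around a binary logistic regression
ovr_model = OneVsRestClassifier(LogisticRegression())
ovr_model.fit(X_train, y_train)

print(len(ovr_model.estimators_))        # number of fitted binary classifiers
print(ovr_model.score(X_test, y_test))   # accuracy on the held-out split
```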



In [16]:
y_pred = logistic.predict(X_test)

In [17]:
from sklearn.metrics import classification_report,accuracy_score,confusion_matrix

In [18]:
print(accuracy_score(y_test,y_pred))
print(classification_report(y_test,y_pred))
print(confusion_matrix(y_test,y_pred))


0.79
              precision    recall  f1-score   support

           0       0.87      0.82      0.84       102
           1       0.81      0.73      0.77       102
           2       0.71      0.82      0.76        96

    accuracy                           0.79       300
   macro avg       0.79      0.79      0.79       300
weighted avg       0.80      0.79      0.79       300

[[84 10  8]
 [ 3 74 25]
 [10  7 79]]
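The numbers in the classification report can be recovered from the confusion matrix alone. The sketch below copies the matrix printed above (rows are true classes, columns are predicted classes) and recomputes recall, precision, and accuracy; the variable names are illustrative.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows = true class, columns = predicted class)
cm = np.array([[84, 10,  8],
               [ 3, 74, 25],
               [10,  7, 79]])

recall = np.diag(cm) / cm.sum(axis=1)      # correct / all samples truly in the class
precision = np.diag(cm) / cm.sum(axis=0)   # correct / all samples predicted as the class
accuracy = np.trace(cm) / cm.sum()         # all correct predictions / all samples

print(np.round(recall, 2))      # [0.82 0.73 0.82]
print(np.round(precision, 2))   # [0.87 0.81 0.71]
print(round(accuracy, 2))       # 0.79
```

The row sums (102, 102, 96) match the support column of the report, and the 25 samples of class 1 predicted as class 2 explain why class 1 has the lowest recall.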


## Hyperparameter Tuning

### Randomized Search CV

In [19]:
from sklearn.model_selection import RandomizedSearchCV

In [30]:
model = LogisticRegression(multi_class='ovr')

In [31]:
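# Candidate values for the search. Not every solver supports every penalty
# (for example, 'lbfgs' only supports 'l2' or no penalty), so some sampled combinations fail to fit.
penalty = ['l1', 'l2', 'elasticnet']
c_values = [100, 10, 1.0, 0.1, 0.01]
solver = ['lbfgs', 'liblinear', 'newton-cg', 'newton-cholesky', 'sag', 'saga']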

In [32]:
params = dict(penalty=penalty, C=c_values, solver=solver)
params

{'penalty': ['l1', 'l2', 'elasticnet'],
 'C': [100, 10, 1.0, 0.1, 0.01],
 'solver': ['lbfgs',
  'liblinear',
  'newton-cg',
  'newton-cholesky',
  'sag',
  'saga']}

In [33]:
randomcv = RandomizedSearchCV(model,param_distributions=params,cv=5,scoring='accuracy')

In [34]:
randomcv.fit(X_train,y_train)

10 fits failed out of a total of 50.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
5 fits failed with the following error:
Traceback (most recent call last):
  File "C:\Users\Win\AppData\Roaming\Python\Python313\site-packages\sklearn\model_selection\_validation.py", line 866, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Win\AppData\Roaming\Python\Python313\site-packages\sklearn\base.py", line 1389, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "C:\Users\Win\AppData\Roaming\Python\Python313\site-packages\sklearn\linear_model\_logistic.py", line 1193, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  Fil
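The failed fits above come from `_check_solver`, which rejects solver and penalty pairs that do not work together (for instance `lbfgs` with `l1`, or `elasticnet` with anything other than `saga`). One way to avoid wasting search iterations on them is to pass `param_distributions` as a list of dictionaries, each grouping solvers that share the same supported penalties. The sketch below is one possible setup and not the notebook's original search: it assumes an `OneVsRestClassifier` wrapper (hence the `estimator__` prefix), leaves out `elasticnet` and `newton-cholesky` for simplicity, and uses hypothetical choices for `max_iter`, `n_iter`, and `random_state`.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Same data-generation and split settings as the cells above
X, y = make_classification(n_samples=1000, n_features=10, n_informative=3,
                           n_classes=3, random_state=15)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

c_values = [100, 10, 1.0, 0.1, 0.01]

# One dict per group of solvers that accept the same penalties.
# The "estimator__" prefix routes parameters to the wrapped LogisticRegression.
param_distributions = [
    {"estimator__solver": ["lbfgs", "newton-cg", "sag"],
     "estimator__penalty": ["l2"],
     "estimator__C": c_values},
    {"estimator__solver": ["liblinear", "saga"],
     "estimator__penalty": ["l1", "l2"],
     "estimator__C": c_values},
]

search = RandomizedSearchCV(
    OneVsRestClassifier(LogisticRegression(max_iter=2000)),
    param_distributions=param_distributions,
    n_iter=10,
    cv=5,
    scoring="accuracy",
    random_state=0,       # makes the sampled candidates reproducible
    error_score="raise",  # surface any invalid combination instead of scoring it as NaN
)
search.fit(X_train, y_train)

print(search.best_params_)
print(search.best_score_)
```

Fixing `random_state` also makes the search repeatable; without it, each run samples a different set of candidates, so the best parameters can change between runs.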

In [35]:
y_pred_new = randomcv.predict(X_test)

In [36]:
print(accuracy_score(y_test,y_pred_new))
print(classification_report(y_test,y_pred_new))
print(confusion_matrix(y_test,y_pred_new))


0.7766666666666666
              precision    recall  f1-score   support

           0       0.88      0.81      0.85       102
           1       0.77      0.71      0.73       102
           2       0.70      0.81      0.75        96

    accuracy                           0.78       300
   macro avg       0.78      0.78      0.78       300
weighted avg       0.78      0.78      0.78       300

[[83 11  8]
 [ 4 72 26]
 [ 7 11 78]]


In [37]:
randomcv.best_params_

{'solver': 'liblinear', 'penalty': 'l1', 'C': 0.1}

In [38]:
randomcv.best_score_

np.float64(0.8028571428571428)