___

<a href='http://www.pieriandata.com'><img src='../Pierian_Data_Logo.png'/></a>
___
<center><em>Copyright by Pierian Data Inc.</em></center>
<center><em>For more information, visit us at <a href='http://www.pieriandata.com'>www.pieriandata.com</a></em></center>

# Multi-Class Logistic Regression

Students often ask how to perform non-binary classification with Logistic Regression. Fortunately, the process with scikit-learn is nearly the same as for binary classification. To expand our understanding, we'll work through a simple data set and see how to use LogisticRegression with a manual GridSearchCV (instead of LogisticRegressionCV).
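
As a rough illustration of how little changes between the binary and multi-class cases, here is a minimal sketch. It assumes scikit-learn's bundled copy of Iris (`load_iris`) rather than the `../DATA/iris.csv` file used in this notebook, and the `demo_` names are hypothetical. The same `fit` and `predict_proba` calls apply, but there is now one probability column per class.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Assumption: scikit-learn's bundled Iris data stands in for ../DATA/iris.csv
X_demo, y_demo = load_iris(return_X_y=True)

# Same estimator and same calls as in the binary case
demo_model = LogisticRegression(max_iter=1000)
demo_model.fit(X_demo, y_demo)

# One probability column per class; each row sums to 1
demo_proba = demo_model.predict_proba(X_demo[:5])
print(demo_model.classes_)
print(demo_proba.shape)
print(demo_proba.sum(axis=1))
```

The bundled data set encodes the three species as the integers 0, 1 and 2, whereas the CSV used below stores them as strings; scikit-learn handles both kinds of label.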

## Imports

In [None]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [None]:
df = pd.read_csv('../DATA/iris.csv')
df.head()

In [None]:
df.describe()

In [None]:
pd.DataFrame(df['species'].value_counts())

In [None]:
sns.countplot(x='species', data=df)

In [None]:
sns.scatterplot(x='petal_length', y='petal_width', data=df, hue='species')

In [None]:
sns.pairplot(data=df, hue='species')

In [None]:
# numeric_only=True skips the string 'species' column (required on pandas 2.x)
sns.heatmap(df.corr(numeric_only=True), annot=True)

In [None]:
X = df.drop('species', axis=1)
y = df['species']

In [None]:
pd.DataFrame(y)

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing   import StandardScaler

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=101)

scaler = StandardScaler()
scaled_X_train = scaler.fit_transform(X_train)
scaled_X_test  = scaler.transform(X_test)

In [None]:
from sklearn.linear_model    import LogisticRegression
from sklearn.model_selection import GridSearchCV

In [None]:
log_model = LogisticRegression(solver='saga', multi_class='ovr', max_iter=5000)

In [None]:
penalty  = ['l1', 'l2', 'elasticnet']
l1_ratio = np.linspace(0, 1, 10)
C        = np.logspace(0, 10, 20)

param_grid = {'penalty': penalty, 'l1_ratio': l1_ratio, 'C': C}

In [None]:
grid_model = GridSearchCV(estimator=log_model, param_grid=param_grid)
grid_model.fit(scaled_X_train, y_train)

In [None]:
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report, ConfusionMatrixDisplay

In [None]:
grid_model.best_params_

In [None]:
y_pred = grid_model.predict(scaled_X_test)
y_pred

In [None]:
accuracy_score(y_test, y_pred)

In [None]:
confusion_matrix(y_test, y_pred)

In [None]:
# plot_confusion_matrix was removed in scikit-learn 1.2; ConfusionMatrixDisplay replaces it
ConfusionMatrixDisplay.from_estimator(grid_model, scaled_X_test, y_test)

In [None]:
print(classification_report(y_test, y_pred))

In [None]:
# plot_roc_curve was removed in scikit-learn 1.2 and is not used below
from sklearn.metrics import roc_curve, auc

## Data

We will work with the classic Iris data set. The Iris flower data set, also known as Fisher's Iris data set, is a multivariate data set introduced by the British statistician, eugenicist, and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems" as an example of linear discriminant analysis.

Full Details: https://en.wikipedia.org/wiki/Iris_flower_data_set

In [None]:
df = pd.read_csv('../DATA/iris.csv')

In [None]:
df.head()

### Exploratory Data Analysis and Visualization

Feel free to explore the data further on your own.

In [None]:
df.info()

In [None]:
df.describe()

In [None]:
df['species'].value_counts()

In [None]:
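# Pass the column by keyword; recent seaborn versions treat the first positional argument as data
sns.countplot(x='species', data=df)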

In [None]:
sns.scatterplot(x='sepal_length',y='sepal_width',data=df,hue='species')

In [None]:
sns.scatterplot(x='petal_length',y='petal_width',data=df,hue='species')

In [None]:
sns.pairplot(df,hue='species')

In [None]:
# numeric_only=True skips the string 'species' column (required on pandas 2.x)
sns.heatmap(df.corr(numeric_only=True),annot=True)

You can easily discover new plot types with a Google search! Searching for "3d matplotlib scatter plot" quickly takes you to: https://matplotlib.org/3.1.1/gallery/mplot3d/scatter3d.html

In [None]:
df['species'].unique()

In [None]:
from mpl_toolkits.mplot3d import Axes3D 
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
colors = df['species'].map({'setosa':0, 'versicolor':1, 'virginica':2})
ax.scatter(df['sepal_width'],df['petal_width'],df['petal_length'],c=colors);

### Train | Test Split and Scaling

In [None]:
X = df.drop('species',axis=1)
y = df['species']

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=101)

In [None]:
scaler = StandardScaler()

In [None]:
scaled_X_train = scaler.fit_transform(X_train)
scaled_X_test = scaler.transform(X_test)

## Multi-Class Logistic Regression Model

In [None]:
from sklearn.linear_model import LogisticRegression

In [None]:
from sklearn.model_selection import GridSearchCV

In [None]:
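# Depending on warnings, you may need to raise the maximum number of iterations
# or experiment with different solvers.
# Note: scikit-learn 1.5+ deprecates multi_class; on those versions the warning
# suggests wrapping the model in OneVsRestClassifier for one-vs-rest behaviour.
log_model = LogisticRegression(solver='saga',multi_class="ovr",max_iter=5000)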

### GridSearch for Best Hyper-Parameters

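The main hyperparameters to tune are the type of regularization penalty and the value of C. In scikit-learn, C is the inverse of the regularization strength, so smaller values of C mean stronger regularization.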

In [None]:
# Penalty Type
penalty = ['l1', 'l2']

# Use logarithmically spaced C values (recommended in official docs)
C = np.logspace(0, 4, 10)
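
To make the role of C more concrete, the following sketch fits the same model at three values of C and compares the size of the learned coefficients. It assumes scikit-learn's bundled Iris data and the default L2 penalty and solver, not the `saga` setup used in this notebook, and the names `coef_norms` and `C_value` are hypothetical. Smaller C values should shrink the coefficients more strongly.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Assumption: bundled Iris data, scaled on the full set for illustration only
X_demo, y_demo = load_iris(return_X_y=True)
X_demo = StandardScaler().fit_transform(X_demo)

coef_norms = {}
for C_value in [0.01, 1.0, 100.0]:
    # Default penalty is L2; C is the inverse of regularization strength
    model = LogisticRegression(C=C_value, max_iter=5000)
    model.fit(X_demo, y_demo)
    coef_norms[C_value] = np.linalg.norm(model.coef_)

for C_value, norm in coef_norms.items():
    print(f"C={C_value:>6}: coefficient norm = {norm:.3f}")
```

Because the effect of C spans orders of magnitude, a logarithmic grid such as `np.logspace` covers the range more evenly than a linear one.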

In [None]:
grid_model = GridSearchCV(log_model,param_grid={'C':C,'penalty':penalty})

In [None]:
grid_model.fit(scaled_X_train,y_train)

In [None]:
grid_model.best_params_

### Model Performance on Classification Tasks

In [None]:
from sklearn.metrics import accuracy_score,confusion_matrix,classification_report,ConfusionMatrixDisplay

In [None]:
y_pred = grid_model.predict(scaled_X_test)

In [None]:
accuracy_score(y_test,y_pred)

In [None]:
confusion_matrix(y_test,y_pred)

In [None]:
# plot_confusion_matrix was removed in scikit-learn 1.2; ConfusionMatrixDisplay replaces it
ConfusionMatrixDisplay.from_estimator(grid_model,scaled_X_test,y_test)

In [None]:
# Normalized over the true labels, so each row sums to 1
ConfusionMatrixDisplay.from_estimator(grid_model,scaled_X_test,y_test,normalize='true')

In [None]:
print(classification_report(y_test,y_pred))

## Evaluating Curves and AUC

**Make sure to watch the video on this! We need to manually create the plots for a Multi-Class situation. Fortunately, Scikit-learn's documentation already has plenty of examples on this.**

Source: https://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html

We have created a function for you that does this automatically, essentially creating and plotting an ROC per class.

In [None]:
from sklearn.metrics import roc_curve, auc

In [None]:
def plot_multiclass_roc(clf, X_test, y_test, n_classes, figsize=(5,5)):
    y_score = clf.decision_function(X_test)

    # structures
    fpr = dict()
    tpr = dict()
    roc_auc = dict()

    # calculate dummies once
    y_test_dummies = pd.get_dummies(y_test, drop_first=False).values
    for i in range(n_classes):
        fpr[i], tpr[i], _ = roc_curve(y_test_dummies[:, i], y_score[:, i])
        roc_auc[i] = auc(fpr[i], tpr[i])

    # roc for each class
    fig, ax = plt.subplots(figsize=figsize)
    ax.plot([0, 1], [0, 1], 'k--')
    ax.set_xlim([0.0, 1.0])
    ax.set_ylim([0.0, 1.05])
    ax.set_xlabel('False Positive Rate')
    ax.set_ylabel('True Positive Rate')
    ax.set_title('Receiver operating characteristic example')
    for i in range(n_classes):
        ax.plot(fpr[i], tpr[i], label='ROC curve (area = %0.2f) for label %i' % (roc_auc[i], i))
    ax.legend(loc="best")
    ax.grid(alpha=.4)
    sns.despine()
    plt.show()

In [None]:
plot_multiclass_roc(grid_model, scaled_X_test, y_test, n_classes=3, figsize=(16, 10))
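
As a numeric complement to the per-class curves, `roc_auc_score` can compute a one-vs-rest AUC directly from predicted probabilities. The sketch below is self-contained and makes several assumptions: it uses scikit-learn's bundled Iris data, a plain `LogisticRegression` instead of the tuned `grid_model`, and a stratified split so that every class appears in the test set. All variable names are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: bundled Iris data and a stratified split (the notebook's split is not stratified)
X_demo, y_demo = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=101, stratify=y_demo
)

# Fit the scaler on the training data only, then apply it to the test data
scaler_demo = StandardScaler()
X_tr = scaler_demo.fit_transform(X_tr)
X_te = scaler_demo.transform(X_te)

demo_model = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)

# Multi-class AUC needs class probabilities whose rows sum to 1
proba = demo_model.predict_proba(X_te)
macro_auc = roc_auc_score(y_te, proba, multi_class='ovr', average='macro')
print(f"Macro-averaged one-vs-rest AUC: {macro_auc:.3f}")
```

The macro average weights each class equally; passing `average='weighted'` instead weights each class by its number of test samples.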

------
------