### Codio Activity 13.7: Multi-Class Logistic Regression

**Expected Time = 90 minutes**

**Total Points = 60**

This activity focuses on using the `LogisticRegression` estimator with three approaches to multi-class classification.  Two of these, one vs. rest and multinomial, are available through the estimator directly.  The third, one vs. one, is implemented with the scikit-learn `multiclass` module.  Most importantly, you can treat each of these models as an option when building classification models and select the best one according to your chosen metric.

#### Index

- [Problem 1](#Problem-1)
- [Problem 2](#Problem-2)
- [Problem 3](#Problem-3)
- [Problem 4](#Problem-4)
- [Problem 5](#Problem-5)
- [Problem 6](#Problem-6)

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier

### The Data

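Below, the penguins data is loaded and the target is also encoded numerically as `y_num`, where 0, 1, and 2 represent Adelie, Chinstrap, and Gentoo respectively.  Note that the train/test split further down uses the original string labels in `y`; scikit-learn sorts the class labels, so the class order (Adelie, Chinstrap, Gentoo) is the same either way.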

In [None]:
penguins = sns.load_dataset('penguins').dropna()
X = penguins.drop(['species', 'island', 'sex'], axis = 1)
y = penguins.species
# factorize returns the integer codes and the unique labels in a single call
y_num, categories = pd.factorize(y)
print(categories)
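
As a small illustration of what `pd.factorize` returns, here is a sketch on a hypothetical toy series (the labels below are made up for the example and are not the penguins data). Codes are assigned in order of first appearance.

```python
import pandas as pd

# Hypothetical toy labels standing in for the penguins `species` column.
species = pd.Series(["Adelie", "Adelie", "Chinstrap", "Gentoo", "Chinstrap"])

# factorize returns (integer codes, unique labels), numbered by first appearance.
codes, uniques = pd.factorize(species)

print(codes)          # [0 0 1 2 1]
print(list(uniques))  # ['Adelie', 'Chinstrap', 'Gentoo']
```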

In [None]:
X.head()

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=518)

[Back to top](#Index)

### Problem 1

#### One vs. Rest Classification

**10 Points**

To begin, use the `LogisticRegression` estimator with the arguments `multi_class = 'ovr'` and `random_state = 42` to fit a model on the training data.  Assign the fitted model to `ovr_lgr`.
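
To make the one vs. rest idea concrete, the sketch below fits one binary logistic model per class. It is only an illustration under stated assumptions: it uses the bundled iris data as a stand-in for the penguins, and it uses the `OneVsRestClassifier` wrapper from `sklearn.multiclass` instead of the `multi_class` argument (which recent scikit-learn releases deprecate). The names `X_demo`, `y_demo`, and `ovr_demo` are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Stand-in data (assumption): iris has three classes, like the penguins target.
X_demo, y_demo = load_iris(return_X_y=True)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(X_demo, y_demo, random_state=0)

# One binary logistic regression is fit per class (that class vs. all the others).
ovr_demo = OneVsRestClassifier(LogisticRegression(max_iter=1000, random_state=42))
ovr_demo.fit(Xd_train, yd_train)

print(len(ovr_demo.estimators_))  # number of fitted binary models, one per class
print(ovr_demo.classes_)          # class labels, in the order used for predictions
```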

In [None]:
### GRADED

ovr_lgr = ''

### BEGIN SOLUTION
ovr_lgr = LogisticRegression(multi_class='ovr', random_state=42).fit(X_train, y_train)
### END SOLUTION

# Answer check
ovr_lgr

In [None]:
### BEGIN HIDDEN TESTS
ovr_lgr_ = LogisticRegression(multi_class='ovr', random_state=42).fit(X_train, y_train)
#
#
#
assert ovr_lgr.multi_class == ovr_lgr_.multi_class
### END HIDDEN TESTS

[Back to top](#Index)

### Problem 2

#### Examining the Probabilities

**10 Points**

Examine the predicted probabilities on the testing data.  Assign these to `ovr_probs` as an array below.  
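
One way to see where one vs. rest probabilities come from is to rebuild them by hand. The sketch below assumes the iris data as a stand-in and the `OneVsRestClassifier` wrapper (not the `multi_class='ovr'` argument used in this activity): each binary model contributes its probability for "its" class, and each row is then rescaled to sum to 1. All names are hypothetical.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Stand-in data (assumption): three-class iris instead of the penguins.
X_demo, y_demo = load_iris(return_X_y=True)
ovr_demo = OneVsRestClassifier(
    LogisticRegression(max_iter=1000, random_state=42)
).fit(X_demo, y_demo)

probs = ovr_demo.predict_proba(X_demo)

# Rebuild the same numbers by hand: take each binary model's P(positive class),
# place them side by side, then renormalize every row so it sums to 1.
raw = np.column_stack([est.predict_proba(X_demo)[:, 1] for est in ovr_demo.estimators_])
manual = raw / raw.sum(axis=1, keepdims=True)

print(probs.shape)                 # (n_samples, n_classes)
print(np.allclose(probs, manual))  # compares wrapper output with the manual version
```

The columns of `predict_proba` follow the order of `classes_`, which is worth checking before labeling them in a DataFrame.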

In [None]:
### GRADED

ovr_probs = ''

### BEGIN SOLUTION
ovr_probs = ovr_lgr.predict_proba(X_test)
### END SOLUTION

# Answer check
# predict_proba columns follow ovr_lgr.classes_ (sorted: Adelie, Chinstrap, Gentoo)
pd.DataFrame(ovr_probs, columns = ['p(adelie)', 'p(chinstrap)', 'p(gentoo)']).head()

In [None]:
### BEGIN HIDDEN TESTS
ovr_lgr_ = LogisticRegression(multi_class='ovr', random_state=42).fit(X_train, y_train)
ovr_probs_ = ovr_lgr_.predict_proba(X_test)
#
#
#
np.testing.assert_array_equal(ovr_probs, ovr_probs_)
### END HIDDEN TESTS

[Back to top](#Index)

### Problem 3

#### Trying Multinomial

**10 Points**

Now, instantiate a `LogisticRegression` estimator with `multi_class = 'multinomial'` and `random_state = 42`, fit it on the training data, and assign the result to `multi_lgr` below.

In [None]:
### GRADED

multi_lgr = ''

### BEGIN SOLUTION
multi_lgr = LogisticRegression(multi_class='multinomial', random_state=42).fit(X_train, y_train)
### END SOLUTION

# Answer check
multi_lgr

In [None]:
### BEGIN HIDDEN TESTS
multi_lgr_ = LogisticRegression(multi_class='multinomial', random_state=42).fit(X_train, y_train)
#
#
#
assert multi_lgr.multi_class == multi_lgr_.multi_class
### END HIDDEN TESTS

[Back to top](#Index)

### Problem 4

#### Examining the Probabilities

**10 Points**

Again, examine the probabilities from the multinomial estimator above on the test data.  Assign them as an array to `multi_probs` below. 
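
For the multinomial model, the predicted probabilities are the softmax of the per-class linear scores. The sketch below checks this on the iris data as a stand-in. It assumes a scikit-learn version in which `LogisticRegression` with the default `lbfgs` solver fits a multinomial model for three or more classes, so `multi_class` is not passed. The names are hypothetical.

```python
import numpy as np
from scipy.special import softmax
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Stand-in data (assumption): three-class iris instead of the penguins.
X_demo, y_demo = load_iris(return_X_y=True)

# Assumption: the default settings give a multinomial (softmax) model here.
multi_demo = LogisticRegression(max_iter=1000, random_state=42).fit(X_demo, y_demo)

scores = multi_demo.decision_function(X_demo)  # one linear score per class
manual = softmax(scores, axis=1)               # normalize scores across classes
probs = multi_demo.predict_proba(X_demo)

print(np.allclose(probs, manual))  # compares predict_proba with the manual softmax
```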

In [None]:
### GRADED

multi_probs = ''

### BEGIN SOLUTION
multi_probs = multi_lgr.predict_proba(X_test)
### END SOLUTION

# Answer check
# predict_proba columns follow multi_lgr.classes_ (sorted: Adelie, Chinstrap, Gentoo)
pd.DataFrame(multi_probs, columns = ['p(adelie)', 'p(chinstrap)', 'p(gentoo)']).head()

In [None]:
### BEGIN HIDDEN TESTS
multi_lgr_ = LogisticRegression(multi_class='multinomial', random_state=42).fit(X_train, y_train)
multi_probs_ = multi_lgr_.predict_proba(X_test)
#
#
#
np.testing.assert_array_equal(multi_probs, multi_probs_)
### END HIDDEN TESTS

[Back to top](#Index)

### Problem 5

#### One vs. One Classifier

**10 Points**

Similar in spirit to the one vs. rest approach, the one vs. one approach fits a binary logistic model for every pair of target classes.  For `k` classes this gives `k(k-1)/2` models, so for three classes you would have 3 different logistic regressors.

The `LogisticRegression` estimator does not offer this as an option, but scikit-learn implements the approach through the `OneVsOneClassifier`, which accepts a classification estimator.  Below, instantiate a `OneVsOneClassifier` with a `LogisticRegression` estimator as `ovo_clf`, and fit it on the training data.  In your `LogisticRegression` estimator set `random_state = 42`.
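
The number of pairwise models can be checked directly: `k` classes give `k(k-1)/2` pairs. The sketch below is an illustration on the iris data as a stand-in (an assumption, not the penguins data), with hypothetical names.

```python
from itertools import combinations

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier

# Stand-in data (assumption): three-class iris instead of the penguins.
X_demo, y_demo = load_iris(return_X_y=True)
ovo_demo = OneVsOneClassifier(
    LogisticRegression(max_iter=1000, random_state=42)
).fit(X_demo, y_demo)

# Every unordered pair of classes gets its own binary model.
pairs = list(combinations(ovo_demo.classes_, 2))

print(len(pairs))                 # number of class pairs
print(len(ovo_demo.estimators_))  # number of fitted binary models
```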

In [None]:
### GRADED

ovo_clf = ''

### BEGIN SOLUTION
ovo_clf = OneVsOneClassifier(LogisticRegression(random_state = 42)).fit(X_train, y_train)
### END SOLUTION

# Answer check
ovo_clf 

In [None]:
### BEGIN HIDDEN TESTS
ovo_clf_ = OneVsOneClassifier(LogisticRegression(random_state = 42)).fit(X_train, y_train)
#
#
#
assert type(ovo_clf) == type(ovo_clf_)
### END HIDDEN TESTS

[Back to top](#Index)

### Problem 6

#### Comparing Performance

**10 Points**

Create a DataFrame that contains each classifier's accuracy on the testing data and assign it to `eval_df` below.  Which classifier performed best in terms of accuracy?  Assign your answer as a string -- `ovr`, `multi`, or `ovo` -- to `best_acc` below.

| estimator | accuracy | 
| ------ | ------ |
| ovr | - |
| multi | - |
| ovo | - |
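
A comparison table of this shape could be assembled as in the sketch below. It is a hedged illustration only: it runs on the iris data as a stand-in, uses the `OneVsRestClassifier` wrapper in place of `multi_class='ovr'`, and relies on the default multinomial behavior for `multi`. The names `make_lgr`, `eval_demo`, and `best_demo` are hypothetical, and the resulting accuracies say nothing about the penguins models.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

# Stand-in data (assumption): three-class iris instead of the penguins.
X_demo, y_demo = load_iris(return_X_y=True)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(X_demo, y_demo, random_state=0)


def make_lgr():
    """Hypothetical helper: a fresh base estimator for each strategy."""
    return LogisticRegression(max_iter=1000, random_state=42)


models = {
    "ovr": OneVsRestClassifier(make_lgr()),
    "multi": make_lgr(),  # assumption: default settings give a multinomial model
    "ovo": OneVsOneClassifier(make_lgr()),
}

# Fit each model and record its test accuracy as one row of the table.
rows = []
for name, model in models.items():
    model.fit(Xd_train, yd_train)
    rows.append({"estimator": name, "accuracy": accuracy_score(yd_test, model.predict(Xd_test))})

eval_demo = pd.DataFrame(rows)
best_demo = eval_demo.loc[eval_demo["accuracy"].idxmax(), "estimator"]

print(eval_demo)
print(best_demo)
```

Note that `idxmax` returns the first row in case of a tie, so equal accuracies should be checked by eye.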

In [None]:
### GRADED

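eval_df = ''
best_acc = ''
### BEGIN SOLUTION
# Accuracy of each fitted classifier on the test set
eval_df = pd.DataFrame({
    'estimator': ['ovr', 'multi', 'ovo'],
    'accuracy': [accuracy_score(y_test, model.predict(X_test))
                 for model in (ovr_lgr, multi_lgr, ovo_clf)],
})
best_acc = 'multi'
### END SOLUTION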

# Answer check
print(best_acc)

In [None]:
### BEGIN HIDDEN TESTS
best_acc_ = 'multi'
#
#
#
assert best_acc == best_acc_
### END HIDDEN TESTS

Hopefully this activity increased your facility with the `LogisticRegression` estimator and how it can be used in a multi-class setting.  These options are things you might search over in a grid search rather than fitting each one separately; however, one vs. one has to be implemented as its own object.  Many of the fitting procedures above are also likely to raise convergence warnings.  As seen before, the estimator applies regularization by default, so the data should be scaled prior to fitting.  You may also need to give the solver more iterations to converge, which you can control with the `max_iter` argument.
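
The scaling and `max_iter` advice can be combined in a pipeline. The following is a minimal sketch under stated assumptions: it uses the iris data as a stand-in, and `max_iter=1000` is an arbitrary illustrative value, not a recommendation. The name `scaled_lgr` is hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): three-class iris instead of the penguins.
X_demo, y_demo = load_iris(return_X_y=True)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(X_demo, y_demo, random_state=0)

# Scale first, then fit; the scaler is learned on the training data only.
scaled_lgr = make_pipeline(
    StandardScaler(),
    LogisticRegression(max_iter=1000, random_state=42),
)
scaled_lgr.fit(Xd_train, yd_train)

# n_iter_ reports how many solver iterations were actually used.
lgr_step = scaled_lgr.named_steps["logisticregression"]
print(lgr_step.n_iter_)
print(scaled_lgr.score(Xd_test, yd_test))
```

If `n_iter_` reaches `max_iter`, the solver stopped before converging and a larger value (or better-scaled features) may be needed.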