# Multiclass Logistic Regression

## Multiclass using SKlearn's LogisticRegression

In the previous sections, we learned how to use scikit-learn's LogisticRegression module and how to fine-tune its parameters for a two-class (binary) problem.

In this section, we will learn how to use LogisticRegression for a multiclass problem involving three or more classes.

According to the sklearn documentation, in the multiclass scenario, the LogisticRegression algorithm uses the one-vs-rest (OvR) scheme if the ‘multi_class’ option is set to ‘ovr’.
If the ‘multi_class’ option is set to ‘multinomial’, it instead fits a single model by minimizing the multinomial (softmax) cross-entropy loss.

A ‘multi_class’ value of ‘multinomial’ is supported only by certain solvers, such as ‘lbfgs’, ‘sag’, ‘saga’ and ‘newton-cg’.

Let us start with a simple example: the iris dataset.

```python
from sklearn.linear_model import LogisticRegression
import pandas as pd

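# Load the iris data; rows with missing values (marked '?') are dropped
iris = pd.read_csv('../../../data/iris.csv', na_values='?').dropna()
iris.info()
print(iris.shape)

# Assumption: the CSV stores the class label in a column named 'species'
X = iris.drop(columns='species')
y = iris['species']

lr_iris = LogisticRegression()
lr_iris = lr_iris.fit(X, y)
y_pred = lr_iris.predict(X)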
```
Measure the performance of the trained model on the training set:
```python
# score() takes the features and the true labels and returns the mean accuracy
lr_iris.score(X, y)
0.96
```


## Exercise:

Train a LogisticRegression model on the iris data.

- Train using scikit-learn's logistic regression module.
- Get the predictions on the training set and print out the score.

```python
# Load the iris data into a DataFrame and separate the features from the target.
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn import datasets, metrics
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
iris_data = iris.data
iris_data = pd.DataFrame(iris_data, columns=iris.feature_names)
iris_data['species'] = iris.target
iris_data['species'].unique()

features = iris.feature_names
target = 'species'

X = iris_data[features]
y = iris_data[target]

# Write your code below
```

### Solution

```python

lr_iris = LogisticRegression()
lr_iris = lr_iris.fit(X, y)
y_pred = lr_iris.predict(X)

# accuracy_score expects the true labels first, then the predictions
print(metrics.accuracy_score(y, y_pred))
```
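
Once the model is fitted, a few of its attributes make the multiclass structure visible. The following is a minimal sketch, assuming the iris data bundled with scikit-learn. The names `X_demo`, `y_demo` and `model` are illustrative, and `max_iter=1000` is an assumption made here only to give the solver room to converge.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Bundled iris data: 150 samples, 4 features, 3 classes.
X_demo, y_demo = load_iris(return_X_y=True)

# max_iter is raised above the default only to avoid convergence warnings.
model = LogisticRegression(max_iter=1000).fit(X_demo, y_demo)

# The class labels seen during fit.
print(model.classes_)

# One row of coefficients per class, one column per feature.
print(model.coef_.shape)

# One probability per class for each sample; every row sums to 1.
proba = model.predict_proba(X_demo[:5])
print(proba.shape)
print(proba.sum(axis=1))
```

For a problem with three classes and four features, `coef_` holds three coefficient vectors, one per class.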


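## The multi_class Parameter

scikit-learn's LogisticRegression class takes a parameter called multi_class that selects how the algorithm handles the multiclass scenario. LogisticRegression supports two approaches to a multiclass problem.

Note that in scikit-learn 1.5 and later the multi_class parameter is deprecated: the multinomial approach is used for problems with three or more classes, and the documentation recommends wrapping the model in OneVsRestClassifier for one-vs-rest. On those versions, code that sets multi_class may emit a FutureWarning.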

### 1. One-vs-Rest (OvR)

A one-vs-rest (OvR, also called one-vs-all or OvA) classifier trains one binary classifier per class, with the samples of that class as positives and all other samples as negatives. For $N$ classes this gives $N$ binary classification problems, and for each class we learn a logistic (probability) model. The key assumption is that each of these problems is treated independently of the other $N-1$ logistic regression problems. Hence, for each sample, the classifier for class $c$ decides only whether the sample belongs to class $c$ or not. This is repeated for all classes, and the class whose classifier reports the highest score is predicted.
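
The one-vs-rest idea can be sketched by hand as follows. This is an illustration under stated assumptions, not scikit-learn's internal implementation: it uses the bundled iris data, illustrative names (`X_demo`, `y_demo`, `binary_models`, `scores`), and `max_iter=1000` chosen only to avoid convergence warnings.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X_demo, y_demo = load_iris(return_X_y=True)
classes = np.unique(y_demo)

# One binary "class c vs. everything else" model per class.
binary_models = [
    LogisticRegression(max_iter=1000).fit(X_demo, (y_demo == c).astype(int))
    for c in classes
]

# Column j holds each sample's estimated probability of belonging to classes[j].
scores = np.column_stack(
    [m.predict_proba(X_demo)[:, 1] for m in binary_models]
)

# Predict the class whose binary model is most confident.
y_pred_ovr = classes[scores.argmax(axis=1)]
print("Training accuracy (manual OvR):", (y_pred_ovr == y_demo).mean())
```

Because each binary model is fitted separately, the columns of `scores` do not generally sum to 1 across classes.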


### 2. Multinomial

The alternative to the one-vs-rest classifier is the multinomial logistic regression classifier. The multinomial classifier does not classify each class separately; instead, it uses the softmax function to predict the probability that a single data point falls in each of the $N$ classes. Most of the time you may not see a significant difference in the results, but one benefit is that you model the entire distribution over classes jointly, $P(Y_i = c \mid x_i) = \frac{\exp(\beta_c^\top x_i)}{\sum_{k=1}^{N} \exp(\beta_k^\top x_i)}$, where $\beta_k$ is the coefficient vector for class $k$ and $x_i$ is the feature vector of sample $i$, rather than $N$ separate binary distributions of the form $P(Y_i = c)$ versus $P(Y_i \neq c)$.
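
To make the softmax step concrete, here is a small sketch using only the Python standard library. The function name `softmax` and the score values are made up for illustration; in the model, each score would be the linear combination $\beta_k^\top x_i$ for class $k$.

```python
import math


def softmax(scores):
    """Turn a list of per-class scores into probabilities that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three classes (illustrative values only).
class_scores = [2.0, 1.0, 0.1]
probs = softmax(class_scores)
print(probs)
print(sum(probs))
```

The class with the largest score always receives the largest probability, and raising one class's score lowers the probabilities of all the others.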

Which method should you use? It typically pays to try both and see how well each one works on your validation/test set.
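
One way to compare the two approaches on held-out data is cross-validation. The sketch below makes several assumptions: it uses the bundled iris data, 5-fold cross-validation, `max_iter=1000`, and a scikit-learn version (0.22 or later) in which a plain LogisticRegression with the default solver fits a multinomial model on multiclass data. One-vs-rest is obtained here through the OneVsRestClassifier wrapper rather than the multi_class parameter.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsRestClassifier

X_demo, y_demo = load_iris(return_X_y=True)

candidates = {
    # Explicit one-vs-rest wrapper around a binary logistic regression.
    "one-vs-rest": OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    # Assumption: this version of scikit-learn fits a multinomial model here.
    "multinomial": LogisticRegression(max_iter=1000),
}

cv_means = {}
for name, estimator in candidates.items():
    # Each fold is scored on data the model did not see during fitting.
    fold_scores = cross_val_score(estimator, X_demo, y_demo, cv=5)
    cv_means[name] = fold_scores.mean()
    print(f"{name}: mean accuracy {fold_scores.mean():.3f} "
          f"over {len(fold_scores)} folds")
```

On a dataset as small as iris the two scores are likely to be close, so differences of a sample or two should not be over-interpreted.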

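Let us now set the appropriate value for the multi_class parameter.

First, let us try the OvR approach. In older scikit-learn releases this uses the liblinear solver by default; since version 0.22 the default solver is lbfgs.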

## Exercise

Set multi_class to 'ovr' in the LogisticRegression model.

```python
# Write your code below
```

### Solution

```python

lr_iris = LogisticRegression(multi_class='ovr')
lr_iris = lr_iris.fit(X, y)
y_pred = lr_iris.predict(X)

print(metrics.accuracy_score(y, y_pred))
```


## Multinomial option

Now let us try the multinomial approach by setting the appropriate value for the multi_class parameter.

## Exercise

Set multi_class to 'multinomial' and solver to 'newton-cg'.

```python
# Write your code below
```

### Solution

```python

lr_iris = LogisticRegression(multi_class='multinomial',solver='newton-cg')
lr_iris = lr_iris.fit(X, y)
y_pred = lr_iris.predict(X)

print(metrics.accuracy_score(y, y_pred))
```


## Other Solver options - lbfgs

Now let us try different solvers for the multinomial option, starting with lbfgs.

## Exercise

Set multi_class to 'multinomial' and solver to 'lbfgs'.

```python
# Write your code below
```

### Solution

```python
lr_iris = LogisticRegression(multi_class='multinomial',solver='lbfgs')
lr_iris = lr_iris.fit(X, y)
y_pred = lr_iris.predict(X)

print(metrics.accuracy_score(y, y_pred))
```


## Other Solver options - SAG

Next, let us try sag as the solver for the multinomial option.

## Exercise

Set multi_class to 'multinomial' and solver to 'sag'.

```python
# Write your code below
```

### Solution

```python
lr_iris = LogisticRegression(multi_class='multinomial',solver='sag')
lr_iris = lr_iris.fit(X, y)
y_pred = lr_iris.predict(X)

print(metrics.accuracy_score(y, y_pred))
```


## Other Solver options - SAGA

Finally, let us try saga as the solver for the multinomial option.

## Exercise

Set multi_class to 'multinomial' and solver to 'saga'.

```python
# Write your code below
```

### Solution

```python
lr_iris = LogisticRegression(multi_class='multinomial',solver='saga')
lr_iris = lr_iris.fit(X, y)
y_pred = lr_iris.predict(X)

print(metrics.accuracy_score(y, y_pred))
```
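
The scikit-learn documentation notes that fast convergence of the sag and saga solvers is only guaranteed when the features have approximately the same scale. With unscaled data and the default `max_iter`, these solvers may emit a ConvergenceWarning. The sketch below shows one possible remedy, standardizing the features in a pipeline. It assumes the bundled iris data; the names `X_demo`, `y_demo` and `scaled_saga`, and the choices `max_iter=1000` and `random_state=0`, are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_demo, y_demo = load_iris(return_X_y=True)

# Standardize each feature to zero mean and unit variance, then fit with saga.
scaled_saga = make_pipeline(
    StandardScaler(),
    LogisticRegression(solver='saga', max_iter=1000, random_state=0),
)
scaled_saga.fit(X_demo, y_demo)

train_accuracy = scaled_saga.score(X_demo, y_demo)
print("Training accuracy:", train_accuracy)

# Number of iterations the solver actually used.
n_iter = scaled_saga.named_steps['logisticregression'].n_iter_
print("Iterations used:", n_iter)
```

Putting the scaler inside the pipeline also means that, under cross-validation, the scaling statistics are learned from the training folds only.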


## GridSearchCV to tune the model

Now let us use GridSearchCV with the saga solver and the multinomial option.


## Exercise

Write code that uses GridSearchCV to find the best values of C, max_iter and penalty, starting from the code below.

```python
from sklearn.model_selection import GridSearchCV
import time

penalty = ['l1', 'l2']
max_iter = [80, 100, 140]
C = np.linspace(0.1, 1.0, num=5)

# Write your code below

X.head()
```

|   | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |


### Solution

```python
param_grid = dict(max_iter=max_iter, C=C, penalty=penalty)

lr_iris = LogisticRegression(multi_class='multinomial',solver='saga')

grid = GridSearchCV(estimator=lr_iris, param_grid=param_grid, cv = 5)

start_time = time.time()
grid_result = grid.fit(X, y)
# Summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
# time.time() returns seconds, so the elapsed time is in seconds
print("Execution time: " + str((time.time() - start_time)) + ' seconds')
```



From the output above, the best parameters for the multiclass model are:

```python
Best: 0.986667 using {'C': 0.325, 'max_iter': 100, 'penalty': 'l2'}
```

### Exercise

Create LogisticRegression model with the best parameters from the GridSearchCV in the previous step.


```python
# Write your code below
```


#### Solution

```python
lr_iris = LogisticRegression(multi_class='multinomial',solver='saga', C=0.325, max_iter= 100, penalty= 'l2')
lr_iris = lr_iris.fit(X, y)
y_pred = lr_iris.predict(X)

print(metrics.accuracy_score(y, y_pred))
```
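
Instead of copying the best parameters by hand, the fitted search object can supply the tuned model directly. The following is a hedged sketch rather than the tutorial's own solution: it assumes the bundled iris data, searches over `C` only, and uses the default solver with `max_iter=1000`. The names `X_demo`, `y_demo`, `demo_grid` and `best_model` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X_demo, y_demo = load_iris(return_X_y=True)

# A deliberately small, illustrative grid over C only.
demo_grid = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000),
    param_grid={'C': np.linspace(0.1, 1.0, num=5)},
    cv=5,
)
demo_grid.fit(X_demo, y_demo)

# With the default refit=True, best_estimator_ has already been refitted
# on all of the data using best_params_.
best_model = demo_grid.best_estimator_
print(demo_grid.best_params_)
print(best_model.score(X_demo, y_demo))
```

Using `best_estimator_` avoids transcription mistakes such as rounding a tuned value when retyping it.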