# Multinomial Logistic Regression

Logistic regression is not limited to binary classification; it extends to problems with more than two classes.

What if we want an algorithm that discriminates between cats, dogs, birds, and bees?

This is where multinomial classification comes in.

The multinomial regression function consists of two functional layers:

1. Linear prediction function (a.k.a. logit layer)
2. Softmax function (a.k.a. softmax layer)

The simplest way to think of it is as $k$ linear models being fit, one for each class, each producing a score (logit) for its class. We then take the [softmax](https://en.wikipedia.org/wiki/Softmax_function) of those scores, which turns them into probabilities that sum to 1, and pick the class with the highest probability:

![](../assets/logit_matrix.png)
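The two layers can be sketched in a few lines of NumPy. This is a minimal illustration, not the scikit-learn implementation: the names `weights`, `bias`, and `x` and all their values are made up for the example (3 classes, 2 features).

```python
import numpy as np

def softmax(z):
    """Row-wise softmax; subtracting the max keeps exp() from overflowing."""
    z = np.asarray(z, dtype=float)
    shifted = z - z.max(axis=-1, keepdims=True)
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum(axis=-1, keepdims=True)

# Hypothetical parameters: one row of weights and one bias per class
weights = np.array([[ 1.0, -0.5],
                    [ 0.2,  0.3],
                    [-1.0,  0.8]])
bias = np.array([0.1, 0.0, -0.1])
x = np.array([2.0, 1.0])  # a single observation with 2 features

logits = weights @ x + bias               # logit layer: one linear score per class
probs = softmax(logits)                   # softmax layer: scores -> probabilities
predicted_class = int(np.argmax(probs))   # pick the most probable class

print(logits)
print(probs, probs.sum())
print(predicted_class)
```

The softmax does not change which class has the largest score; it only rescales the scores so they can be read as probabilities.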

In [1]:
# imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.metrics import confusion_matrix

In [2]:
# Load the Cleveland data; the file has no header row, so column names are set below
df = pd.read_csv('../data/cleveland_data.csv', header = None)
col_names = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal', 'num'] 

df.columns = col_names # setting df col names

df.sample(5)

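| | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | num |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 262 | 60.0 | 0.0 | 1.0 | 150.0 | 240.0 | 0.0 | 0.0 | 171.0 | 0.0 | 0.9 | 1.0 | 0.0 | 3.0 | 0 |
| 88 | 53.0 | 0.0 | 4.0 | 138.0 | 234.0 | 0.0 | 2.0 | 160.0 | 0.0 | 0.0 | 1.0 | 0.0 | 3.0 | 0 |
| 7 | 57.0 | 0.0 | 4.0 | 120.0 | 354.0 | 0.0 | 0.0 | 163.0 | 1.0 | 0.6 | 1.0 | 0.0 | 3.0 | 0 |
| 63 | 54.0 | 0.0 | 3.0 | 135.0 | 304.0 | 1.0 | 0.0 | 170.0 | 0.0 | 0.0 | 1.0 | 0.0 | 3.0 | 0 |
| 113 | 43.0 | 0.0 | 4.0 | 132.0 | 341.0 | 1.0 | 2.0 | 136.0 | 1.0 | 3.0 | 2.0 | 0.0 | 7.0 | 2 |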


In [3]:
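# Basic data cleaning

# The raw file marks missing values with '?'; convert them to NaN
df.replace({'?': np.nan}, inplace=True)
# 'ca' and 'thal' are read as object dtype because of the '?' entries; cast them to float
df[['ca', 'thal']] = df[['ca', 'thal']].astype('float64')

# Keep indicator columns so the model can tell which values were filled in
df['ca_null'] = df['ca'].isnull().astype(int)
df['thal_null'] = df['thal'].isnull().astype(int)

# Fill the missing values with 0
df.ca = df.ca.fillna(0.)
df.thal = df.thal.fillna(0.)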

df.describe()

| | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | num | ca_null | thal_null |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 | 303.0 |
| mean | 54.438944 | 0.679868 | 3.158416 | 131.689769 | 246.693069 | 0.148515 | 0.990099 | 149.607261 | 0.326733 | 1.039604 | 1.60066 | 0.663366 | 4.70297 | 0.937294 | 0.013201 | 0.006601 |
| std | 9.038662 | 0.467299 | 0.960126 | 17.599748 | 51.776918 | 0.356198 | 0.994971 | 22.875003 | 0.469794 | 1.161075 | 0.616226 | 0.934375 | 1.971038 | 1.228536 | 0.114325 | 0.08111 |
| min | 29.0 | 0.0 | 1.0 | 94.0 | 126.0 | 0.0 | 0.0 | 71.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 48.0 | 0.0 | 3.0 | 120.0 | 211.0 | 0.0 | 0.0 | 133.5 | 0.0 | 0.0 | 1.0 | 0.0 | 3.0 | 0.0 | 0.0 | 0.0 |
| 50% | 56.0 | 1.0 | 3.0 | 130.0 | 241.0 | 0.0 | 1.0 | 153.0 | 0.0 | 0.8 | 2.0 | 0.0 | 3.0 | 0.0 | 0.0 | 0.0 |
| 75% | 61.0 | 1.0 | 4.0 | 140.0 | 275.0 | 0.0 | 2.0 | 166.0 | 1.0 | 1.6 | 2.0 | 1.0 | 7.0 | 2.0 | 0.0 | 0.0 |
| max | 77.0 | 1.0 | 4.0 | 200.0 | 564.0 | 1.0 | 2.0 | 202.0 | 1.0 | 6.2 | 3.0 | 3.0 | 7.0 | 4.0 | 1.0 | 1.0 |


In [4]:
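# Build the target and the feature matrix
y = df.num

cat_cols = ['cp', 'restecg', 'slope']
# Note: 'restecg' is listed in both cat_cols and num_cols, so it enters the model
# as a raw numeric column (with polynomial terms) and again as dummy variables
num_cols = ['age', 'trestbps', 'chol', 'restecg', 'thalach', 'oldpeak', 'ca', 'thal']

# Categorical columns are left out here because they are added as dummies below
X = df[num_cols + ['ca_null', 'thal_null', 'sex']]

# One-hot encode each categorical column, dropping its first level
for c in cat_cols:
    X = X.join(pd.get_dummies(df[c].astype(int), drop_first=True, prefix=c))

# Add squared and cubed terms for the numeric columns
for c in num_cols:
    X[c + '2'] = X[c] ** 2
    X[c + '3'] = X[c] ** 3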

In [5]:
# Hold out 30% of the rows for testing, then fit the model on the rest
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
logreg = LogisticRegression()
logreg.fit(X_train, y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


LogisticRegression()
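The warning above suggests two remedies: raising `max_iter` or scaling the features. A sketch of both, assuming a `StandardScaler` inside a pipeline, is shown here on synthetic stand-in data rather than the Cleveland set; the names `X_demo`, `y_demo`, and `scaled_logreg` are made up for the example, and one feature is deliberately blown up to mimic the large range of a cubed column.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (not the Cleveland set): 5 classes, 10 features
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=6,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)
X_demo[:, 0] *= 1000.0  # exaggerate one feature's scale, like a cubed column

Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0
)

# Scaling inside the pipeline means the scaler is fit on the training rows only
scaled_logreg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_logreg.fit(Xd_train, yd_train)

# Number of solver iterations actually used (well under max_iter if it converged)
n_iter_used = scaled_logreg.named_steps['logisticregression'].n_iter_
print(n_iter_used)
print(scaled_logreg.score(Xd_test, yd_test))
```

Whether this changes the accuracy on the Cleveland data is not something the sketch can show; it only addresses the convergence warning.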

In [6]:
# evaluating the model
y_pred = logreg.predict(X_test)
print('Accuracy of logistic regression classifier on test set: {:.2f}'.format(logreg.score(X_test, y_test)))
print(confusion_matrix(y_test, y_pred))

Accuracy of logistic regression classifier on test set: 0.54
[[43  0  3  0  1]
 [16  0  3  0  0]
 [ 8  0  4  0  0]
 [ 5  0  2  2  0]
 [ 3  0  0  1  0]]
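To read the confusion matrix more closely, the sketch below recomputes a few summary numbers from it. The matrix is copied from the output above (scikit-learn puts true classes in rows and predicted classes in columns); the variable names are illustrative.

```python
import numpy as np

# Confusion matrix from the output above: rows = true class, columns = predicted class
cm = np.array([
    [43, 0, 3, 0, 1],
    [16, 0, 3, 0, 0],
    [ 8, 0, 4, 0, 0],
    [ 5, 0, 2, 2, 0],
    [ 3, 0, 0, 1, 0],
])

accuracy = np.trace(cm) / cm.sum()                    # correct predictions / all predictions
recall_per_class = np.diag(cm) / cm.sum(axis=1)       # share of each true class recovered
majority_baseline = cm.sum(axis=1).max() / cm.sum()   # accuracy of always predicting class 0

print(f"accuracy: {accuracy:.2f}")
print("recall per class:", np.round(recall_per_class, 2))
print(f"majority-class baseline: {majority_baseline:.2f}")
```

The accuracy of 0.54 is only slightly above the 0.52 obtained by always predicting class 0, and classes 1 and 4 are never predicted correctly, so most of the score comes from the majority class.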


In [7]:
# Predicted class probabilities for the first 20 test rows (one column per class)
logreg.predict_proba(X_test)[:20]


array([[0.95640433, 0.03124195, 0.00426723, 0.00587675, 0.00220974],
       [0.0269394 , 0.05303587, 0.405702  , 0.06483994, 0.44948279],
       [0.26584122, 0.21651808, 0.2046417 , 0.18965608, 0.12334293],
       [0.27308189, 0.26727389, 0.20642197, 0.23086519, 0.02235707],
       [0.6018074 , 0.16620791, 0.0904168 , 0.08602482, 0.05554307],
       [0.11489436, 0.18983399, 0.3045917 , 0.23020493, 0.16047503],
       [0.09322384, 0.21795057, 0.30738864, 0.30044066, 0.08099628],
       [0.6236485 , 0.1709294 , 0.07987833, 0.08729041, 0.03825335],
       [0.14613694, 0.20725994, 0.27369907, 0.23210143, 0.14080261],
       [0.57028932, 0.19416427, 0.09283723, 0.10914735, 0.03356183],
       [0.49598953, 0.19789044, 0.12499561, 0.12135499, 0.05976944],
       [0.67472233, 0.15700248, 0.06809724, 0.07374504, 0.02643291],
       [0.21145779, 0.2110533 , 0.26031424, 0.19783531, 0.11933938],
       [0.4517966 , 0.20944516, 0.14034854, 0.13721376, 0.06119593],
       [0.81277024, 0.1105747 , 0.

In [8]:
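# `b` was never defined in the cells above; assuming it was meant to hold the predicted probabilities
b = logreg.predict_proba(X_test)
b.shape, logreg.classes_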

In [None]:
logreg.predict(X_test)[:20]
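The link between `predict_proba`, `classes_`, and `predict` can be checked directly: each column of the probability array corresponds to one entry of `classes_`, and the predicted label is the class whose column holds the largest probability. The sketch below uses synthetic data and made-up names (`X_demo`, `y_demo`, `clf`) so that it runs on its own.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (not the Cleveland set): 4 classes, 8 features
X_demo, y_demo = make_classification(
    n_samples=200, n_features=8, n_informative=5,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

clf = LogisticRegression(max_iter=1000).fit(X_demo, y_demo)

proba = clf.predict_proba(X_demo)   # shape (n_samples, n_classes); rows sum to 1
# Column j of `proba` belongs to clf.classes_[j], so argmax indexes into classes_
labels_from_proba = clf.classes_[np.argmax(proba, axis=1)]

print(proba.shape, clf.classes_)
print((labels_from_proba == clf.predict(X_demo)).all())
```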
