# **Building multiclass classification models using SkLearn**

---
**We have chosen a clean dataset that needs no heavy pre-processing. The primary focus is on the logistics of model building for multiclass problems. We will only build base models, without any parameter tuning or cross-validation.**

### The following are the alternatives for dealing with multiclass classification
1. **Models that inherently support multiclass classification.**

    *These algorithms usually work out of the box with little or no change to the default parameters. However, certain models may need explicit parameters (e.g. `multi_class="multinomial"` or `multi_class="ovr"` in `LogisticRegression`).*
    - Logistic Regression
    - Naive Bayes
    - Decision Trees
    - Nearest Neighbours (KNN)
    
    
2. **Use wrappers to add multiclass support to binary classifiers**

    *These wrappers take a binary classifier as input and build multiple classifiers that work together to perform multiclass classification.*
    - One vs Rest Classifier (n estimators built for n classes; efficient)
    - One vs One Classifier (n(n-1)/2 estimators built for n classes; compute intensive)
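
To make the cost difference between the two wrappers concrete, the number of binary estimators each scheme trains can be counted directly. This is only a small sketch; the helper name `estimator_counts` is hypothetical and not part of scikit-learn.

```python
from math import comb


def estimator_counts(n_classes):
    """Return (one-vs-rest, one-vs-one) binary estimator counts."""
    ovr = n_classes            # one "class k vs. everything else" model per class
    ovo = comb(n_classes, 2)   # one model per unordered pair of classes
    return ovr, ovo


for n in (3, 4, 10):
    print(n, estimator_counts(n))
```

With only three classes both schemes need three estimators; the gap opens up as the number of classes grows.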

In [2]:
import pandas as pd
import pandas_profiling

### **Importing data and generating profile report**

In [3]:
contraceptive_data = pd.read_csv("Contraceptive method choice.csv")
pandas_profiling.ProfileReport(contraceptive_data)

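| Dataset statistic | Value |
|---|---|
| Number of variables | 10 |
| Number of observations | 1473 |
| Total Missing (%) | 0.0% |
| Total size in memory | 115.2 KiB |
| Average record size in memory | 80.1 B |

| Variable type | Count |
|---|---|
| Numeric | 7 |
| Categorical | 0 |
| Boolean | 3 |
| Date | 0 |
| Text (Unique) | 0 |
| Rejected | 0 |
| Unsupported | 0 |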

**Husband_Occupation** (numeric)

| Statistic | Value |
|---|---|
| Distinct count | 4 |
| Unique (%) | 0.3% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Zeros (%) | 0.0% |
| Mean | 2.1378 |
| Minimum | 1 |
| 5-th percentile | 1 |
| Q1 | 1 |
| Median | 2 |
| Q3 | 3 |
| 95-th percentile | 3 |
| Maximum | 4 |
| Range | 3 |
| Interquartile range | 2 |
| Standard deviation | 0.86486 |
| Coef of variation | 0.40455 |
| Kurtosis | -1.325 |
| MAD | 0.7531 |
| Skewness | -0.098918 |
| Sum | 3149 |
| Variance | 0.74798 |
| Memory size | 11.6 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 436 | 29.6% |
| 2 | 425 | 28.9% |
| 3 | 585 | 39.7% |
| 4 | 27 | 1.8% |

**Husband_education** (numeric)

| Statistic | Value |
|---|---|
| Distinct count | 4 |
| Unique (%) | 0.3% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Zeros (%) | 0.0% |
| Mean | 3.4297 |
| Minimum | 1 |
| 5-th percentile | 2 |
| Q1 | 3 |
| Median | 4 |
| Q3 | 4 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 3 |
| Interquartile range | 1 |
| Standard deviation | 0.81635 |
| Coef of variation | 0.23802 |
| Kurtosis | 0.65601 |
| MAD | 0.69609 |
| Skewness | -1.2661 |
| Sum | 5052 |
| Variance | 0.66642 |
| Memory size | 11.6 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 44 | 3.0% |
| 2 | 178 | 12.1% |
| 3 | 352 | 23.9% |
| 4 | 899 | 61.0% |

**Media_Exposure** (boolean)

| Statistic | Value |
|---|---|
| Distinct count | 2 |
| Unique (%) | 0.1% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Mean | 0.073999 |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1364 | 92.6% |
| 1 | 109 | 7.4% |

**Method** (numeric, target)

| Statistic | Value |
|---|---|
| Distinct count | 3 |
| Unique (%) | 0.2% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Zeros (%) | 0.0% |
| Mean | 1.9199 |
| Minimum | 1 |
| 5-th percentile | 1 |
| Q1 | 1 |
| Median | 2 |
| Q3 | 3 |
| 95-th percentile | 3 |
| Maximum | 3 |
| Range | 2 |
| Interquartile range | 2 |
| Standard deviation | 0.87638 |
| Coef of variation | 0.45647 |
| Kurtosis | -1.681 |
| MAD | 0.78562 |
| Skewness | 0.15611 |
| Sum | 2828 |
| Variance | 0.76803 |
| Memory size | 11.6 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 629 | 42.7% |
| 2 | 333 | 22.6% |
| 3 | 511 | 34.7% |

**Number_of_children_ever_born** (numeric)

| Statistic | Value |
|---|---|
| Distinct count | 15 |
| Unique (%) | 1.0% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Zeros (%) | 6.6% |
| Mean | 3.2614 |
| Minimum | 0 |
| 5-th percentile | 0 |
| Q1 | 1 |
| Median | 3 |
| Q3 | 4 |
| 95-th percentile | 8 |
| Maximum | 16 |
| Range | 16 |
| Interquartile range | 3 |
| Standard deviation | 2.3585 |
| Coef of variation | 0.72318 |
| Kurtosis | 1.5296 |
| MAD | 1.8416 |
| Skewness | 1.099 |
| Sum | 4804 |
| Variance | 5.5628 |
| Memory size | 11.6 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 97 | 6.6% |
| 1 | 276 | 18.7% |
| 2 | 276 | 18.7% |
| 3 | 259 | 17.6% |
| 4 | 197 | 13.4% |
| 5 | 135 | 9.2% |
| 6 | 92 | 6.2% |
| 7 | 49 | 3.3% |
| 8 | 47 | 3.2% |
| 9 | 16 | 1.1% |
| 10 | 11 | 0.7% |
| 11 | 11 | 0.7% |
| 12 | 4 | 0.3% |
| 13 | 2 | 0.1% |
| 16 | 1 | 0.1% |

**Standard_of_living** (numeric)

| Statistic | Value |
|---|---|
| Distinct count | 4 |
| Unique (%) | 0.3% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Zeros (%) | 0.0% |
| Mean | 3.1337 |
| Minimum | 1 |
| 5-th percentile | 1 |
| Q1 | 3 |
| Median | 3 |
| Q3 | 4 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 3 |
| Interquartile range | 1 |
| Standard deviation | 0.97616 |
| Coef of variation | 0.3115 |
| Kurtosis | -0.42365 |
| MAD | 0.80451 |
| Skewness | -0.83614 |
| Sum | 4616 |
| Variance | 0.95289 |
| Memory size | 11.6 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 129 | 8.8% |
| 2 | 229 | 15.5% |
| 3 | 431 | 29.3% |
| 4 | 684 | 46.4% |

**Wife_Employment** (boolean)

| Statistic | Value |
|---|---|
| Distinct count | 2 |
| Unique (%) | 0.1% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Mean | 0.74949 |

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1104 | 74.9% |
| 0 | 369 | 25.1% |

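**Wife_age** (numeric)

| Statistic | Value |
|---|---|
| Distinct count | 34 |
| Unique (%) | 2.3% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Zeros (%) | 0.0% |
| Mean | 32.538 |
| Minimum | 16 |
| 5-th percentile | 21 |
| Q1 | 26 |
| Median | 32 |
| Q3 | 39 |
| 95-th percentile | 47 |
| Maximum | 49 |
| Range | 33 |
| Interquartile range | 13 |
| Standard deviation | 8.2272 |
| Coef of variation | 0.25285 |
| Kurtosis | -0.94389 |
| MAD | 6.9653 |
| Skewness | 0.25645 |
| Sum | 47929 |
| Variance | 67.688 |
| Memory size | 11.6 KiB |

Most frequent values:

| Value | Count | Frequency (%) |
|---|---|---|
| 25 | 80 | 5.4% |
| 26 | 69 | 4.7% |
| 32 | 64 | 4.3% |
| 30 | 64 | 4.3% |
| 28 | 63 | 4.3% |
| 35 | 62 | 4.2% |
| 24 | 61 | 4.1% |
| 22 | 59 | 4.0% |
| 27 | 59 | 4.0% |
| 29 | 59 | 4.0% |

Five smallest values:

| Value | Count | Frequency (%) |
|---|---|---|
| 16 | 3 | 0.2% |
| 17 | 8 | 0.5% |
| 18 | 7 | 0.5% |
| 19 | 18 | 1.2% |
| 20 | 28 | 1.9% |

Five largest values:

| Value | Count | Frequency (%) |
|---|---|---|
| 45 | 41 | 2.8% |
| 46 | 22 | 1.5% |
| 47 | 43 | 2.9% |
| 48 | 30 | 2.0% |
| 49 | 23 | 1.6% |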

**Wife_education** (numeric)

| Statistic | Value |
|---|---|
| Distinct count | 4 |
| Unique (%) | 0.3% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Infinite (%) | 0.0% |
| Infinite (n) | 0 |
| Zeros (%) | 0.0% |
| Mean | 2.9586 |
| Minimum | 1 |
| 5-th percentile | 1 |
| Q1 | 2 |
| Median | 3 |
| Q3 | 4 |
| 95-th percentile | 4 |
| Maximum | 4 |
| Range | 3 |
| Interquartile range | 2 |
| Standard deviation | 1.015 |
| Coef of variation | 0.34307 |
| Kurtosis | -0.95113 |
| MAD | 0.83893 |
| Skewness | -0.51037 |
| Sum | 4358 |
| Variance | 1.0302 |
| Memory size | 11.6 KiB |

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 152 | 10.3% |
| 2 | 334 | 22.7% |
| 3 | 410 | 27.8% |
| 4 | 577 | 39.2% |

**Wife_religion** (boolean)

| Statistic | Value |
|---|---|
| Distinct count | 2 |
| Unique (%) | 0.1% |
| Missing (%) | 0.0% |
| Missing (n) | 0 |
| Mean | 0.85064 |

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1253 | 85.1% |
| 0 | 220 | 14.9% |

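| | Wife_age | Wife_education | Husband_education | Number_of_children_ever_born | Wife_religion | Wife_Employment | Husband_Occupation | Standard_of_living | Media_Exposure | Method |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 24 | 2 | 3 | 3 | 1 | 1 | 2 | 3 | 0 | 1 |
| 1 | 45 | 1 | 3 | 10 | 1 | 1 | 3 | 4 | 0 | 1 |
| 2 | 43 | 2 | 3 | 7 | 1 | 1 | 3 | 4 | 0 | 1 |
| 3 | 42 | 3 | 2 | 9 | 1 | 1 | 3 | 3 | 0 | 1 |
| 4 | 36 | 3 | 3 | 8 | 1 | 1 | 3 | 2 | 0 | 1 |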


### **Decoupling target**

In [4]:
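data = contraceptive_data
target = "Method"

# Features: every column except the target (columns.difference returns them in sorted order)
X = data[data.columns.difference([target])]
# Target: reuse the `target` name instead of repeating the string
y = data[target]
del data  # removes only this extra name; contraceptive_data is still available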
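
One detail of this step is easy to miss: `columns.difference` returns the remaining column names in sorted order, so the features in `X` end up rearranged alphabetically. If the original order matters, `DataFrame.drop` is one alternative. A small sketch on a made-up frame (the toy values are illustrative, not taken from the dataset):

```python
import pandas as pd

toy = pd.DataFrame({
    "Wife_age": [24, 45],
    "Husband_education": [3, 3],
    "Method": [1, 1],
})
target = "Method"

X_sorted = toy[toy.columns.difference([target])]   # column names come back sorted
X_ordered = toy.drop(columns=[target])             # original column order is kept

print(list(X_sorted.columns))
print(list(X_ordered.columns))
```

Either form works for the models below, since none of them depends on column order, as long as train and test features are selected the same way.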

### **Test-Train split**

In [5]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=124421)
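
The split above is purely random, so the class proportions in the train and test sets can drift from those of the full data. A possible variation passes `stratify` to `train_test_split`. The sketch below uses synthetic inputs: `y_demo` is rebuilt from the `Method` counts in the profile report, and `X_demo` is random placeholder data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
y_demo = np.repeat([1, 2, 3], [629, 333, 511])    # class sizes from the profile report
X_demo = rng.normal(size=(y_demo.size, 4))        # placeholder features

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=124421, stratify=y_demo
)

# Share of each class (1, 2, 3) in the full data and in the test split
full_share = np.bincount(y_demo)[1:] / y_demo.size
test_share = np.bincount(y_te)[1:] / y_te.size
print(full_share.round(3), test_share.round(3))
```

With stratification the two sets of shares agree to within rounding, which makes per-class metrics on the test set easier to compare across runs.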

# **Model Building**

### **1. Inherently multinomial classifiers**

In [6]:
from sklearn.linear_model import LogisticRegression
logistic_classifier = LogisticRegression()
logistic_classifier.fit(X_train, y_train)

print("Logistic Regression")
print("Train Accuracy = ",logistic_classifier.score(X_train, y_train))
print("Test Accuracy = ", logistic_classifier.score(X_test, y_test))

Logistic Regression
Train Accuracy =  0.49951503394762364
Test Accuracy =  0.5475113122171946
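
What makes a classifier "inherently multiclass" is visible on the fitted object: it stores one label per class and returns a full probability vector for every row. A minimal sketch on synthetic data (generated with `make_classification`, not the contraceptive dataset; `max_iter=1000` is an assumption made here only to avoid convergence warnings):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X_demo, y_demo = make_classification(
    n_samples=300, n_features=6, n_informative=4,
    n_classes=3, random_state=0,
)

clf = LogisticRegression(max_iter=1000).fit(X_demo, y_demo)

proba = clf.predict_proba(X_demo[:5])
print(clf.classes_)         # class labels seen during fit
print(clf.coef_.shape)      # one row of coefficients per class
print(proba.sum(axis=1))    # each row of class probabilities sums to 1
```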


In [7]:
test_predictions = logistic_classifier.predict(X_test)

In [8]:
from sklearn.metrics import precision_recall_fscore_support as score
precision, recall, fscore, support = score(y_test, test_predictions)

print('Precision:',precision)
print('Recall:',recall)
print('Fscore:',fscore)
print('Support:',support)

Precision: [0.62735849 0.60344828 0.43023256]
Recall: [0.66834171 0.33018868 0.54014599]
Fscore: [0.64720195 0.42682927 0.4789644 ]
Support: [199 106 137]
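
Each entry in these arrays belongs to one class, in sorted label order (1, 2, 3). The per-class numbers follow from the confusion matrix: precision divides the diagonal by the column sums (everything predicted as that class), and recall divides it by the row sums (everything truly in that class). A sketch with made-up labels, checked against scikit-learn:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support

# Made-up labels for illustration only
y_true = np.array([1, 1, 1, 2, 2, 3, 3, 3, 3, 1])
y_pred = np.array([1, 1, 3, 2, 3, 3, 3, 1, 3, 1])

cm = confusion_matrix(y_true, y_pred, labels=[1, 2, 3])  # rows: true, columns: predicted
tp = np.diag(cm)                                         # correct predictions per class
manual_precision = tp / cm.sum(axis=0)
manual_recall = tp / cm.sum(axis=1)

precision, recall, _, support = precision_recall_fscore_support(
    y_true, y_pred, labels=[1, 2, 3]
)
print(manual_precision, precision)
print(manual_recall, recall)
print(support)
```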


In [9]:
from sklearn.metrics import precision_score

# pos_label only applies to binary targets; with a multiclass target and average='macro'
# it is ignored, so varying it returns the same macro average every time.
# average=None gives one value per class; average='macro' gives their unweighted mean.
per_class_precision = precision_score(y_test, test_predictions, average=None)
macro_precision = precision_score(y_test, test_predictions, average='macro')

print("Per-class precision (classes 1, 2, 3) = ", per_class_precision)
print("Macro-averaged precision = ", macro_precision)

Per-class precision (classes 1, 2, 3) =  [0.62735849 0.60344828 0.43023256]
Macro-averaged precision =  0.5536797748558805
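
The value 0.5537 printed above is the macro average: the unweighted mean of the three per-class precisions reported earlier. `pos_label` only applies to binary averaging and has no effect here. The arithmetic can be checked from the printed per-class values; the support-weighted mean is included for comparison:

```python
import numpy as np

# Per-class precision and support as printed by precision_recall_fscore_support above
per_class_precision = np.array([0.62735849, 0.60344828, 0.43023256])
support = np.array([199, 106, 137])

macro = per_class_precision.mean()                            # every class counts equally
weighted = np.average(per_class_precision, weights=support)   # classes weighted by support

print(round(macro, 4))
print(round(weighted, 4))
```

In scikit-learn these correspond to `average='macro'` and `average='weighted'`; the two diverge more as the classes become more imbalanced.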




In [10]:
from sklearn.naive_bayes import GaussianNB
nb_classifier = GaussianNB()
nb_classifier.fit(X_train, y_train)

print("Gaussian NB")
print("Train Accuracy = ",nb_classifier.score(X_train, y_train))
print("Test Accuracy = ", nb_classifier.score(X_test, y_test))

Gaussian NB
Train Accuracy =  0.46459747817652763
Test Accuracy =  0.5067873303167421


In [11]:
from sklearn.tree import DecisionTreeClassifier
dt_classifier = DecisionTreeClassifier()
dt_classifier.fit(X_train, y_train)

print("Decision Tree")
print("Train Accuracy = ",dt_classifier.score(X_train, y_train))
print("Test Accuracy = ", dt_classifier.score(X_test, y_test))

Decision Tree
Train Accuracy =  0.9631425800193987
Test Accuracy =  0.46153846153846156


In [12]:
from sklearn.neighbors import KNeighborsClassifier
knn_classifier = KNeighborsClassifier()
knn_classifier.fit(X_train, y_train)

print("KNN")
print("Train Accuracy = ",knn_classifier.score(X_train, y_train))
print("Test Accuracy = ", knn_classifier.score(X_test, y_test))

KNN
Train Accuracy =  0.6585838991270611
Test Accuracy =  0.5271493212669683


### **2. Wrapper functions**

In [13]:
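from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Note: SVC is fitted directly here; the OneVsRestClassifier imported above is not applied.
# SVC accepts multiclass targets on its own by training one-vs-one binary classifiers internally.
svc_classifier = SVC()
svc_classifier.fit(X_train, y_train)

print("SVC")
print("Train Accuracy = ",svc_classifier.score(X_train, y_train))
print("Test Accuracy = ", svc_classifier.score(X_test, y_test))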

SVC
Train Accuracy =  0.6624636275460718
Test Accuracy =  0.5678733031674208
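
The cell above fits `SVC` directly. To route an estimator through the wrappers described at the top, it can be passed to `OneVsRestClassifier` or `OneVsOneClassifier`. A minimal sketch on synthetic four-class data (from `make_classification`, chosen so that the two schemes build different numbers of estimators); accuracies on the contraceptive data would have to be measured separately:

```python
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X_demo, y_demo = make_classification(
    n_samples=400, n_features=8, n_informative=5,
    n_classes=4, random_state=0,
)

ovr = OneVsRestClassifier(SVC()).fit(X_demo, y_demo)
ovo = OneVsOneClassifier(SVC()).fit(X_demo, y_demo)

print(len(ovr.estimators_))   # one binary SVC per class
print(len(ovo.estimators_))   # one binary SVC per pair of classes
print(ovr.predict(X_demo[:3]), ovo.predict(X_demo[:3]))
```

Both wrappers expose the usual `fit`, `predict` and `score` methods, so they can replace `svc_classifier` in the cell above without other changes.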


In [14]:
y_train.value_counts()

1    430
3    374
2    227
Name: Method, dtype: int64
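
These counts give a reference point for the accuracies above: a model that always predicts the most frequent class (1) would already be right on 430 of the 1031 training rows. A sketch using `DummyClassifier`, with the labels rebuilt from the counts shown (the single zero-valued feature column is a placeholder, since the dummy model ignores features):

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Training label counts as printed by y_train.value_counts()
y_counts = {1: 430, 3: 374, 2: 227}
y_demo = np.repeat(list(y_counts.keys()), list(y_counts.values()))
X_placeholder = np.zeros((y_demo.size, 1))   # features are ignored by the dummy model

baseline = DummyClassifier(strategy="most_frequent").fit(X_placeholder, y_demo)
baseline_accuracy = baseline.score(X_placeholder, y_demo)
print(round(baseline_accuracy, 3))
```

On the test split the same rule would score 199/442, roughly 0.45, so the test accuracies reported above (0.46 to 0.57) sit only modestly over this baseline.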