# Analysis

In [1]:
%run pipeline_utils.ipynb

(32560, 14)
(32560, 1)


## Load models 

In [5]:
import joblib
# votingP = joblib.load('voting_model.pkl') # using tuned models assigned heuristic weights
xgbP = joblib.load('xgb_grid_model.pkl') # tuned using grid search
rfcP = joblib.load('rfc_grid_model.pkl') # tuned using randomized search

In [6]:
# Run the voting pipeline explicitly: that model could not be pickled because of a function transformer used in the pipeline

In [7]:
# %run voting_pipeline.ipynb

In [8]:
Ygb = xgbP.predict(Xva)

In [9]:
Yrf = rfcP.predict(Xva)

In [11]:
diff = Ygb != Yrf

In [33]:
diff[:10]

array([False,  True, False, False, False, False,  True, False, False,
       False])
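Before looking at individual rows, it can help to summarise how often the two models disagree. The following is a minimal sketch, not part of the original notebook; the arrays `y_gb` and `y_rf` are made-up stand-ins for `Ygb` and `Yrf`.

```python
import numpy as np

# Made-up stand-ins for Ygb and Yrf (illustrative only, not the notebook's data).
y_gb = np.array([0, 1, 0, 0, 1, 0, 1, 0])
y_rf = np.array([0, 0, 0, 0, 1, 0, 0, 0])

# Boolean mask: True wherever the two models predict different classes.
disagree = y_gb != y_rf

n_disagree = int(disagree.sum())      # number of disagreements
rate = float(disagree.mean())         # fraction of samples with a disagreement
positions = np.flatnonzero(disagree)  # positional indices, usable with .iloc

print(n_disagree, rate, positions)
```

The positions returned by `np.flatnonzero` are positional, so in the notebook they would be passed to `Xva.iloc[...]` rather than `Xva.loc[...]`.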

# RF misclassifies a data point as 0 while XGB classifies it as 1

In [None]:
# The second sample, Xva.iloc[1], is misclassified by one of the classifiers.
# Here Xva.iloc[1] should be classified as 1, but RF misclassifies it as 0.

In [14]:
Xva.iloc[1]

age                                32
workclass                     Private
fnlwgt                         185027
education                  Assoc-acdm
education-num                      12
marital-status     Married-civ-spouse
occupation               Craft-repair
relationship                  Husband
race                            White
sex                              Male
capital-gain                        0
capital-loss                     1887
hours-per-week                     40
native-country                Ireland
Name: 11931, dtype: object

In [17]:
Yva[1] # the actual class is 1 

array([1])

In [18]:
Ygb[1] # XGB predicts correctly

1

In [19]:
Yrf[1] # RF predicts wrongly

0

In [28]:
# Let's look at the confidence scores for this classification
Yrf_conf = rfcP.predict_proba(Xva)
Yrf_conf[1] # RF is torn between the two classes: about 61% vs 39%

array([0.60641334, 0.39358666])

In [29]:
Ygb_conf = xgbP.predict_proba(Xva)
Ygb_conf[1] # XGB, on the other hand, is very sure about its prediction: about 88%

array([0.1217519, 0.8782481], dtype=float32)

## Observation
Although random forest chose class 0, it did so with a confidence of only about 61%, while XGBoost assigned the same point to class 1 with a confidence of about 88%. A voting mechanism that gives less weight to low-confidence predictions and more weight to high-confidence ones could improve classification accuracy. This is an area for future exploration.
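One simple way to let confidence count is soft voting: average the class probabilities of the models and pick the class with the highest mean. The sketch below is an illustration under that assumption, not the voting pipeline used in this project; `soft_vote` is a hypothetical helper, and the two probability rows are copied from the outputs above.

```python
import numpy as np

def soft_vote(*probas):
    """Average per-class probabilities across models and pick the top class.

    Hypothetical helper: each argument is an array-like of shape
    (n_samples, n_classes), e.g. the output of predict_proba.
    """
    stacked = np.stack([np.asarray(p, dtype=float) for p in probas])  # (models, samples, classes)
    mean_proba = stacked.mean(axis=0)
    return mean_proba.argmax(axis=1), mean_proba

# predict_proba rows for Xva.iloc[1], copied from the outputs above.
rf_row = [[0.60641334, 0.39358666]]
xgb_row = [[0.1217519, 0.8782481]]

labels, mean_proba = soft_vote(rf_row, xgb_row)
print(labels, mean_proba)
```

For this sample the averaged probabilities favour class 1, which matches the true label. In the notebook the same helper could be called as `soft_vote(Yrf_conf, Ygb_conf)`; whether that improves accuracy over the whole validation set has not been measured here.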

# XGB misclassifies a data point as 1 while RF classifies it as 0

In [69]:
ind = np.logical_and((Yva.ravel()!=Ygb.ravel()),(Yva.ravel()==Yrf.ravel()))
ind[-2]

True
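The mask above can also be used to list every sample where exactly one model is right. This is a small sketch on made-up arrays (`y_true`, `y_gb`, and `y_rf` are stand-ins for `Yva`, `Ygb`, and `Yrf`, not the notebook's data).

```python
import numpy as np

# Made-up stand-ins (illustrative only). y_true is a column vector, like Yva.
y_true = np.array([[1], [0], [1], [0], [0], [1]])
y_gb = np.array([1, 0, 0, 1, 0, 1])
y_rf = np.array([1, 0, 1, 0, 1, 1])

# Flatten the labels so the element-wise comparisons line up with the predictions.
y_true = y_true.ravel()

# Samples that XGB gets wrong and RF gets right, and the reverse case.
only_rf_right = (y_true != y_gb) & (y_true == y_rf)
only_gb_right = (y_true == y_gb) & (y_true != y_rf)

print(np.flatnonzero(only_rf_right))  # positions where only RF is correct
print(np.flatnonzero(only_gb_right))  # positions where only XGB is correct
```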

In [71]:
Xva.iloc[-2] # the data point that got misclassified

age                                40
workclass                     Private
fnlwgt                         136244
education                Some-college
education-num                      10
marital-status     Married-civ-spouse
occupation               Adm-clerical
relationship                     Wife
race                            White
sex                            Female
capital-gain                        0
capital-loss                        0
hours-per-week                     40
native-country          United-States
Name: 7933, dtype: object

In [72]:
Yva[-2]

array([0])

In [73]:
Ygb[-2] # xgb got it wrong

1

In [75]:
Yrf[-2] # RF got it right

0

In [76]:
# So let's analyze their confidence levels for this prediction
Ygb_conf[-2] # XGB is nearly undecided on this data point: about 49% vs 51%

array([0.48785216, 0.51214784], dtype=float32)

In [79]:
Yrf_conf[-2] # RF, on the other hand, has decent confidence in its prediction: about 73%

array([0.73363899, 0.26636101])

We tried to improve on this by adding a voting classifier over our models, but we did not implement dynamic weighting of predictions based on their confidence level.
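As a rough idea of what dynamic weighting could look like, the sketch below weights each model's probabilities, per sample, by that model's own top-class probability. This is one possible scheme among many and an assumption on our part, not something implemented in this project; `confidence_weighted_vote` is a hypothetical helper. The probability rows are the two cases analysed above.

```python
import numpy as np

def confidence_weighted_vote(*probas):
    """Combine class probabilities, weighting each model per sample by its confidence.

    Hypothetical helper: confidence is taken to be the model's highest class
    probability for that sample. Each argument has shape (n_samples, n_classes).
    """
    stacked = np.stack([np.asarray(p, dtype=float) for p in probas])  # (models, samples, classes)
    conf = stacked.max(axis=2, keepdims=True)                         # (models, samples, 1)
    combined = (conf * stacked).sum(axis=0) / conf.sum(axis=0)        # (samples, classes)
    return combined.argmax(axis=1), combined

# predict_proba rows for Xva.iloc[1] and Xva.iloc[-2], copied from the outputs above.
rf_rows = [[0.60641334, 0.39358666], [0.73363899, 0.26636101]]
xgb_rows = [[0.1217519, 0.8782481], [0.48785216, 0.51214784]]
y_true = np.array([1, 0])  # actual classes of the two samples

labels, combined = confidence_weighted_vote(rf_rows, xgb_rows)
print(labels, labels == y_true)
```

On these two samples the more confident model wins each time and both combined predictions match the true class. Two hand-picked rows say nothing about overall accuracy, so the scheme would still need to be evaluated on the full validation set.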