# Neural Network: DC Heroes Classifier
Deep Learning Module: Neural Networks
Goal: Build a multi-layer perceptron (MLP) neural network to predict labels on a dataset of our choosing, and vary the MLP's hyperparameters. Then compare this model to a random forest model and describe the relative tradeoffs between complexity and accuracy.

## Data Set Description:
This folder contains the data behind the story "Comic Books Are Still Made By Men, For Men And About Men."

The data comes from DC Wikia. Characters were scraped on August 24. Appearance counts were scraped on September 2. The month and year of the first issue each character appeared in was pulled on October 6.

In [41]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

from collections import Counter

import os
import seaborn as sns

plt.style.use('ggplot')
from tqdm import tqdm

import re
from scipy.cluster.vq import kmeans, vq
from pylab import plot, show
from matplotlib.lines import Line2D
import matplotlib.colors as mcolors

from sklearn.cluster import KMeans
from sklearn import neighbors
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Import Perceptron.
from sklearn.linear_model import Perceptron

In [42]:
# Load Dataset
dc_hero_df = pd.read_csv('/Users/mehrunisaqayyum/Downloads/dc-wikia-data.csv')
dc_hero_df

Unnamed: 0,page_id,name,urlslug,ID,ALIGN,EYE,HAIR,SEX,GSM,ALIVE,APPEARANCES,FIRST APPEARANCE,YEAR
0,1422,Batman (Bruce Wayne),\/wiki\/Batman_(Bruce_Wayne),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,,Living Characters,3093.0,"1939, May",1939.0
1,23387,Superman (Clark Kent),\/wiki\/Superman_(Clark_Kent),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,,Living Characters,2496.0,"1986, October",1986.0
2,1458,Green Lantern (Hal Jordan),\/wiki\/Green_Lantern_(Hal_Jordan),Secret Identity,Good Characters,Brown Eyes,Brown Hair,Male Characters,,Living Characters,1565.0,"1959, October",1959.0
3,1659,James Gordon (New Earth),\/wiki\/James_Gordon_(New_Earth),Public Identity,Good Characters,Brown Eyes,White Hair,Male Characters,,Living Characters,1316.0,"1987, February",1987.0
4,1576,Richard Grayson (New Earth),\/wiki\/Richard_Grayson_(New_Earth),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,,Living Characters,1237.0,"1940, April",1940.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
6891,66302,Nadine West (New Earth),\/wiki\/Nadine_West_(New_Earth),Public Identity,Good Characters,,,Female Characters,,Living Characters,,,
6892,283475,Warren Harding (New Earth),\/wiki\/Warren_Harding_(New_Earth),Public Identity,Good Characters,,,Male Characters,,Living Characters,,,
6893,283478,William Harrison (New Earth),\/wiki\/William_Harrison_(New_Earth),Public Identity,Good Characters,,,Male Characters,,Living Characters,,,
6894,283471,William McKinley (New Earth),\/wiki\/William_McKinley_(New_Earth),Public Identity,Good Characters,,,Male Characters,,Living Characters,,,


In [43]:
# What are our column labels? 
dc_hero_df.columns

Index(['page_id', 'name', 'urlslug', 'ID', 'ALIGN', 'EYE', 'HAIR', 'SEX',
       'GSM', 'ALIVE', 'APPEARANCES', 'FIRST APPEARANCE', 'YEAR'],
      dtype='object')

## Model Preparation
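We do not need to normalize the data when the columns are mainly dummy variables with values of 1 and 0. The continuous columns (`APPEARANCES` and `YEAR`) are the exception if they are kept as features.

### Normalize Data 
Standardize so that all variables have a mean of 0 and a standard deviation of 1.

Code to run if needed (requires `from sklearn.preprocessing import StandardScaler` and an all-numeric frame such as `new_df2`):
X = StandardScaler().fit_transform(new_df2)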
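
As a minimal sketch of the standardization step described above, the snippet below scales only the continuous columns and leaves a dummy column untouched. The frame `toy_df` and its values are a hypothetical stand-in for the real feature set, not the notebook's data.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy frame standing in for the real features (assumption).
toy_df = pd.DataFrame({
    "APPEARANCES": [3093.0, 2496.0, 1565.0, 1316.0, 1237.0],
    "YEAR": [1939.0, 1986.0, 1959.0, 1987.0, 1940.0],
    "SEX_Male Characters": [1, 1, 1, 1, 1],
})

# Scale only the continuous columns; 0/1 dummy columns are left as they are.
numeric_cols = ["APPEARANCES", "YEAR"]
scaled_df = toy_df.copy()
scaled_df[numeric_cols] = StandardScaler().fit_transform(toy_df[numeric_cols])

# Each scaled column now has mean 0 and (population) standard deviation 1.
print(scaled_df[numeric_cols].mean().round(6))
print(scaled_df[numeric_cols].std(ddof=0).round(6))
```

Scaling a subset of columns keeps the dummy variables interpretable as plain indicators.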

In [44]:
# Drop columns we will not use as features: 'GSM' and the identifier column 'page_id'
dc_hero_df = dc_hero_df.drop(columns = ['GSM','page_id'])

In [45]:
dc_hero_df

Unnamed: 0,name,urlslug,ID,ALIGN,EYE,HAIR,SEX,ALIVE,APPEARANCES,FIRST APPEARANCE,YEAR
0,Batman (Bruce Wayne),\/wiki\/Batman_(Bruce_Wayne),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,Living Characters,3093.0,"1939, May",1939.0
1,Superman (Clark Kent),\/wiki\/Superman_(Clark_Kent),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,Living Characters,2496.0,"1986, October",1986.0
2,Green Lantern (Hal Jordan),\/wiki\/Green_Lantern_(Hal_Jordan),Secret Identity,Good Characters,Brown Eyes,Brown Hair,Male Characters,Living Characters,1565.0,"1959, October",1959.0
3,James Gordon (New Earth),\/wiki\/James_Gordon_(New_Earth),Public Identity,Good Characters,Brown Eyes,White Hair,Male Characters,Living Characters,1316.0,"1987, February",1987.0
4,Richard Grayson (New Earth),\/wiki\/Richard_Grayson_(New_Earth),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,Living Characters,1237.0,"1940, April",1940.0
...,...,...,...,...,...,...,...,...,...,...,...
6891,Nadine West (New Earth),\/wiki\/Nadine_West_(New_Earth),Public Identity,Good Characters,,,Female Characters,Living Characters,,,
6892,Warren Harding (New Earth),\/wiki\/Warren_Harding_(New_Earth),Public Identity,Good Characters,,,Male Characters,Living Characters,,,
6893,William Harrison (New Earth),\/wiki\/William_Harrison_(New_Earth),Public Identity,Good Characters,,,Male Characters,Living Characters,,,
6894,William McKinley (New Earth),\/wiki\/William_McKinley_(New_Earth),Public Identity,Good Characters,,,Male Characters,Living Characters,,,


In [46]:
# Preview the rows that have no NaN in any column; the Perceptron and Random Forest Classifier models need complete numeric records.
# Note: the result is not assigned, so dc_hero_df itself is unchanged by this call.
dc_hero_df.dropna(axis=0)

Unnamed: 0,name,urlslug,ID,ALIGN,EYE,HAIR,SEX,ALIVE,APPEARANCES,FIRST APPEARANCE,YEAR
0,Batman (Bruce Wayne),\/wiki\/Batman_(Bruce_Wayne),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,Living Characters,3093.0,"1939, May",1939.0
1,Superman (Clark Kent),\/wiki\/Superman_(Clark_Kent),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,Living Characters,2496.0,"1986, October",1986.0
2,Green Lantern (Hal Jordan),\/wiki\/Green_Lantern_(Hal_Jordan),Secret Identity,Good Characters,Brown Eyes,Brown Hair,Male Characters,Living Characters,1565.0,"1959, October",1959.0
3,James Gordon (New Earth),\/wiki\/James_Gordon_(New_Earth),Public Identity,Good Characters,Brown Eyes,White Hair,Male Characters,Living Characters,1316.0,"1987, February",1987.0
4,Richard Grayson (New Earth),\/wiki\/Richard_Grayson_(New_Earth),Secret Identity,Good Characters,Blue Eyes,Black Hair,Male Characters,Living Characters,1237.0,"1940, April",1940.0
...,...,...,...,...,...,...,...,...,...,...,...
6506,William Magnus (robot) (New Earth),\/wiki\/William_Magnus_(robot)_(New_Earth),Secret Identity,Bad Characters,Blue Eyes,Brown Hair,Male Characters,Living Characters,1.0,"1963, July",1963.0
6508,Boka (New Earth),\/wiki\/Boka_(New_Earth),Public Identity,Good Characters,Hazel Eyes,Black Hair,Female Characters,Living Characters,1.0,"1962, March",1962.0
6521,Jeffrey Graham (New Earth),\/wiki\/Jeffrey_Graham_(New_Earth),Public Identity,Good Characters,Blue Eyes,Blond Hair,Male Characters,Living Characters,1.0,"1951, May",1951.0
6526,Green Arrow (Oliver Queen),\/wiki\/Green_Arrow_(Oliver_Queen),Secret Identity,Good Characters,Green Eyes,Blond Hair,Male Characters,Living Characters,1.0,"1941, November",1941.0
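
The `dropna` call above returns a new frame rather than modifying `dc_hero_df`, which is why later cells still report 6896 rows. A minimal sketch of the difference, using a small hypothetical frame (`toy_df` is an assumption, not the real data):

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame with missing values (assumption).
toy_df = pd.DataFrame({
    "name": ["A", "B", "C"],
    "EYE": ["Blue Eyes", np.nan, "Brown Eyes"],
    "YEAR": [1939.0, 1986.0, np.nan],
})

# A bare call only displays the filtered frame; toy_df keeps all rows.
toy_df.dropna(axis=0)
rows_after_bare_call = len(toy_df)

# Assigning the result is what actually keeps the filtered rows.
cleaned_df = toy_df.dropna(axis=0).reset_index(drop=True)
rows_after_assignment = len(cleaned_df)

print(rows_after_bare_call, rows_after_assignment)
```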


## Create dummies separately for 'X' and 'Y'

### Note: Create dummy (one-hot) columns for the feature and target columns so we can classify DC characters as good or bad based on their characteristics.

In [48]:
#Create dummies for the feature and target columns to classify DC characters as good or bad based on their characteristics.
#new_df = pd.get_dummies(dc_hero_df['EYE','HAIR','ID','SEX','ALIVE'])

#dc_hero_df.info()
#new_df = pd.get_dummies(old_df['EYE'])
new_df = pd.get_dummies(dc_hero_df, columns = ['EYE','HAIR','SEX','ALIVE','ID'])

new_target_df= pd.get_dummies(dc_hero_df['ALIGN'])
#need drop original "EYE"
new_df.info()
new_df.head()
new_target_df.info()
new_target_df.head()

#Y is our new_target_df

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6896 entries, 0 to 6895
Data columns (total 49 columns):
name                          6896 non-null object
urlslug                       6896 non-null object
ALIGN                         6295 non-null object
APPEARANCES                   6541 non-null float64
FIRST APPEARANCE              6827 non-null object
YEAR                          6827 non-null float64
EYE_Amber Eyes                6896 non-null uint8
EYE_Auburn Hair               6896 non-null uint8
EYE_Black Eyes                6896 non-null uint8
EYE_Blue Eyes                 6896 non-null uint8
EYE_Brown Eyes                6896 non-null uint8
EYE_Gold Eyes                 6896 non-null uint8
EYE_Green Eyes                6896 non-null uint8
EYE_Grey Eyes                 6896 non-null uint8
EYE_Hazel Eyes                6896 non-null uint8
EYE_Orange Eyes               6896 non-null uint8
EYE_Photocellular Eyes        6896 non-null uint8
EYE_Pink Eyes                 6896 

Unnamed: 0,Bad Characters,Good Characters,Neutral Characters,Reformed Criminals
0,0,1,0,0
1,0,1,0,0
2,0,1,0,0
3,0,1,0,0
4,0,1,0,0


In [49]:
new_target_df.info

<bound method DataFrame.info of       Bad Characters  Good Characters  Neutral Characters  Reformed Criminals
0                  0                1                   0                   0
1                  0                1                   0                   0
2                  0                1                   0                   0
3                  0                1                   0                   0
4                  0                1                   0                   0
...              ...              ...                 ...                 ...
6891               0                1                   0                   0
6892               0                1                   0                   0
6893               0                1                   0                   0
6894               0                1                   0                   0
6895               1                0                   0                   0

[6896 rows x 4 columns]>

In [36]:
new_target_df

Unnamed: 0,Bad Characters,Good Characters,Neutral Characters,Reformed Criminals
0,0,1,0,0
1,0,1,0,0
2,0,1,0,0
3,0,1,0,0
4,0,1,0,0
...,...,...,...,...
6891,0,1,0,0
6892,0,1,0,0
6893,0,1,0,0
6894,0,1,0,0


In [27]:
print(*new_target_df)

Bad Characters Good Characters Neutral Characters Reformed Criminals


In [28]:
print(new_target_df)

      Bad Characters  Good Characters  Neutral Characters  Reformed Criminals
0                  0                1                   0                   0
1                  0                1                   0                   0
2                  0                1                   0                   0
3                  0                1                   0                   0
4                  0                1                   0                   0
...              ...              ...                 ...                 ...
6891               0                1                   0                   0
6892               0                1                   0                   0
6893               0                1                   0                   0
6894               0                1                   0                   0
6895               1                0                   0                   0

[6896 rows x 4 columns]


In [37]:
new_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6896 entries, 0 to 6895
Data columns (total 49 columns):
name                          6896 non-null object
urlslug                       6896 non-null object
ALIGN                         6295 non-null object
APPEARANCES                   6541 non-null float64
FIRST APPEARANCE              6827 non-null object
YEAR                          6827 non-null float64
EYE_Amber Eyes                6896 non-null uint8
EYE_Auburn Hair               6896 non-null uint8
EYE_Black Eyes                6896 non-null uint8
EYE_Blue Eyes                 6896 non-null uint8
EYE_Brown Eyes                6896 non-null uint8
EYE_Gold Eyes                 6896 non-null uint8
EYE_Green Eyes                6896 non-null uint8
EYE_Grey Eyes                 6896 non-null uint8
EYE_Hazel Eyes                6896 non-null uint8
EYE_Orange Eyes               6896 non-null uint8
EYE_Photocellular Eyes        6896 non-null uint8
EYE_Pink Eyes                 6896 

In [50]:
#Drop the string columns (listed as "non-null object" by new_df.info()) from the features.
## Assign the result, which still includes the dummies, to a new data set.
new_df2 = new_df.drop(columns = ['name','urlslug','FIRST APPEARANCE','ALIGN'])

#new_target_df = new_target_df.drop(columns = ['ALIGN']) 

In [11]:
new_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6896 entries, 0 to 6895
Data columns (total 45 columns):
APPEARANCES                   6541 non-null float64
YEAR                          6827 non-null float64
EYE_Amber Eyes                6896 non-null uint8
EYE_Auburn Hair               6896 non-null uint8
EYE_Black Eyes                6896 non-null uint8
EYE_Blue Eyes                 6896 non-null uint8
EYE_Brown Eyes                6896 non-null uint8
EYE_Gold Eyes                 6896 non-null uint8
EYE_Green Eyes                6896 non-null uint8
EYE_Grey Eyes                 6896 non-null uint8
EYE_Hazel Eyes                6896 non-null uint8
EYE_Orange Eyes               6896 non-null uint8
EYE_Photocellular Eyes        6896 non-null uint8
EYE_Pink Eyes                 6896 non-null uint8
EYE_Purple Eyes               6896 non-null uint8
EYE_Red Eyes                  6896 non-null uint8
EYE_Violet Eyes               6896 non-null uint8
EYE_White Eyes                6896 non-

In [51]:
# Establish X and Y
X = new_df2
Y = new_target_df

In [53]:
X.info

<bound method DataFrame.info of       APPEARANCES    YEAR  EYE_Amber Eyes  EYE_Auburn Hair  EYE_Black Eyes  \
0          3093.0  1939.0               0                0               0   
1          2496.0  1986.0               0                0               0   
2          1565.0  1959.0               0                0               0   
3          1316.0  1987.0               0                0               0   
4          1237.0  1940.0               0                0               0   
...           ...     ...             ...              ...             ...   
6891          NaN     NaN               0                0               0   
6892          NaN     NaN               0                0               0   
6893          NaN     NaN               0                0               0   
6894          NaN     NaN               0                0               0   
6895          NaN     NaN               0                0               0   

      EYE_Blue Eyes  EYE_Brown 

In [13]:
Y

Unnamed: 0,Bad Characters,Good Characters,Neutral Characters,Reformed Criminals
0,0,1,0,0
1,0,1,0,0
2,0,1,0,0
3,0,1,0,0
4,0,1,0,0
...,...,...,...,...
6891,0,1,0,0
6892,0,1,0,0
6893,0,1,0,0
6894,0,1,0,0


## Feature Selection

In [54]:
# Apply the SelectKBest class to extract the 5 best features
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2

bestfeatures = SelectKBest(score_func=chi2, k=5)
fit = bestfeatures.fit(X,Y)

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
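
The error above comes from the missing values in `APPEARANCES` and `YEAR`, which `chi2` cannot score. One possible workaround, sketched here on hypothetical toy data (`toy_X` and `toy_y` are assumptions), is to keep only complete rows and pass a one-dimensional label column before fitting the selector. Dropping rows is only one option; imputing the missing values would be another.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical features with NaN in the continuous column (assumption).
toy_X = pd.DataFrame({
    "APPEARANCES": [3093.0, 2496.0, np.nan, 1316.0, 12.0, 5.0, np.nan, 3.0],
    "EYE_Blue Eyes": [1, 1, 0, 0, 1, 0, 0, 0],
    "SEX_Male Characters": [1, 1, 1, 1, 0, 1, 0, 1],
})
# Hypothetical 1-D labels aligned with toy_X (assumption).
toy_y = pd.Series(["Good", "Good", "Good", "Good", "Bad", "Bad", "Bad", "Bad"])

# Keep only rows with no missing feature values, and the matching labels.
complete_rows = toy_X.notna().all(axis=1)
X_complete = toy_X[complete_rows]
y_complete = toy_y[complete_rows]

# chi2 requires non-negative features, which counts and dummies satisfy.
selector = SelectKBest(score_func=chi2, k=2).fit(X_complete, y_complete)
selected = list(X_complete.columns[selector.get_support()])
print(selected)
```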

### Split and Train Data
#### Observation: Y should have the same number of rows as X: 6896. And Y_train should have the same number of rows as X_train: 5172 rows.

In [60]:
#Split data into train and test sets, holding out 25% as the test sample
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size=.25,random_state =5)
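
The expected row counts can be checked without the real data. This sketch uses placeholder arrays of zeros (an assumption) with the 6896 rows reported by `new_df.info()` and the same `test_size` and `random_state` as the cell above.

```python
import numpy as np
from sklearn.model_selection import train_test_split

n_rows = 6896  # row count reported by new_df.info()

# Placeholder arrays (assumption): only the shapes matter here.
X_demo = np.zeros((n_rows, 3))
Y_demo = np.zeros((n_rows, 4))

X_tr, X_te, Y_tr, Y_te = train_test_split(
    X_demo, Y_demo, test_size=.25, random_state=5
)

# 75% of the rows go to training and 25% to testing.
print(X_tr.shape, X_te.shape, Y_tr.shape, Y_te.shape)
```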

In [61]:
X_train

Unnamed: 0,EYE_Amber Eyes,EYE_Auburn Hair,EYE_Black Eyes,EYE_Blue Eyes,EYE_Brown Eyes,EYE_Gold Eyes,EYE_Green Eyes,EYE_Grey Eyes,EYE_Hazel Eyes,EYE_Orange Eyes,...,HAIR_White Hair,SEX_Female Characters,SEX_Genderless Characters,SEX_Male Characters,SEX_Transgender Characters,ALIVE_Deceased Characters,ALIVE_Living Characters,ID_Identity Unknown,ID_Public Identity,ID_Secret Identity
3008,0,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,1,0,1,0
5304,0,0,0,0,1,0,0,0,0,0,...,0,1,0,0,0,1,0,0,0,0
1653,0,0,0,0,0,0,0,0,0,0,...,1,1,0,0,0,1,0,0,0,0
1001,0,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,1,0,1,0
3701,0,0,1,0,0,0,0,0,0,0,...,1,0,0,1,0,0,1,0,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3046,0,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,1,0,0,1
1725,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,1,0,0,0
4079,0,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,1,0,0,0,1
2254,0,0,0,0,0,0,0,0,0,0,...,0,0,1,0,0,0,1,0,1,0


### Standardized X

## Perceptron Model

In [57]:
X = new_df2
Y = new_target_df
X = X.dropna(axis=1)

In [58]:
X

Unnamed: 0,EYE_Amber Eyes,EYE_Auburn Hair,EYE_Black Eyes,EYE_Blue Eyes,EYE_Brown Eyes,EYE_Gold Eyes,EYE_Green Eyes,EYE_Grey Eyes,EYE_Hazel Eyes,EYE_Orange Eyes,...,HAIR_White Hair,SEX_Female Characters,SEX_Genderless Characters,SEX_Male Characters,SEX_Transgender Characters,ALIVE_Deceased Characters,ALIVE_Living Characters,ID_Identity Unknown,ID_Public Identity,ID_Secret Identity
0,0,0,0,1,0,0,0,0,0,0,...,0,0,0,1,0,0,1,0,0,1
1,0,0,0,1,0,0,0,0,0,0,...,0,0,0,1,0,0,1,0,0,1
2,0,0,0,0,1,0,0,0,0,0,...,0,0,0,1,0,0,1,0,0,1
3,0,0,0,0,1,0,0,0,0,0,...,1,0,0,1,0,0,1,0,1,0
4,0,0,0,1,0,0,0,0,0,0,...,0,0,0,1,0,0,1,0,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6891,0,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,1,0,1,0
6892,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,1,0,1,0
6893,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,1,0,1,0
6894,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,1,0,1,0


In [65]:
Y
#Y should have the same number of rows as X: 6896. And Y_train should have the same number of rows as X_train: 5172 rows.

Unnamed: 0,Bad Characters,Good Characters,Neutral Characters,Reformed Criminals
0,0,1,0,0
1,0,1,0,0
2,0,1,0,0
3,0,1,0,0
4,0,1,0,0
...,...,...,...,...
6891,0,1,0,0
6892,0,1,0,0
6893,0,1,0,0
6894,0,1,0,0


In [66]:
X_train

Unnamed: 0,EYE_Amber Eyes,EYE_Auburn Hair,EYE_Black Eyes,EYE_Blue Eyes,EYE_Brown Eyes,EYE_Gold Eyes,EYE_Green Eyes,EYE_Grey Eyes,EYE_Hazel Eyes,EYE_Orange Eyes,...,HAIR_White Hair,SEX_Female Characters,SEX_Genderless Characters,SEX_Male Characters,SEX_Transgender Characters,ALIVE_Deceased Characters,ALIVE_Living Characters,ID_Identity Unknown,ID_Public Identity,ID_Secret Identity
3008,0,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,1,0,1,0
5304,0,0,0,0,1,0,0,0,0,0,...,0,1,0,0,0,1,0,0,0,0
1653,0,0,0,0,0,0,0,0,0,0,...,1,1,0,0,0,1,0,0,0,0
1001,0,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,1,0,1,0
3701,0,0,1,0,0,0,0,0,0,0,...,1,0,0,1,0,0,1,0,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3046,0,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,1,0,0,1
1725,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,1,0,0,0
4079,0,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,1,0,0,0,1
2254,0,0,0,0,0,0,0,0,0,0,...,0,0,1,0,0,0,1,0,1,0


In [67]:
Y_train

Unnamed: 0,Bad Characters,Good Characters,Neutral Characters,Reformed Criminals
3008,0,1,0,0
5304,0,1,0,0
1653,0,1,0,0
1001,0,1,0,0
3701,0,1,0,0
...,...,...,...,...
3046,1,0,0,0
1725,0,0,1,0
4079,1,0,0,0
2254,1,0,0,0


In [68]:
# Need to Import Perceptron.
from sklearn.linear_model import Perceptron

# Establish Perceptron Model.
# Allow up to 10,000 iterations to give the model room to converge, since the data is not normalized.
#perceptron = Perceptron(n_iter=10000)
### If running in your own environment on scikit-learn 0.21, run the line of code below instead:
perceptron = Perceptron(max_iter=10000, tol=0, n_iter_no_change=10000)

# Fit Perceptron.
perceptron.fit(X_train, Y_train)

ValueError: y should be a 1d array, got an array of shape (5172, 4) instead.
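
The error arises because `Perceptron` expects a one-dimensional label vector, while `Y_train` is a four-column dummy frame. One way around it, sketched on a hypothetical toy frame (`toy_df` is an assumption), is to use the original `ALIGN` column as the target and drop the rows where it is missing. Note that `get_dummies` encodes a missing `ALIGN` as a row of all zeros, so those rows carry no label in the dummy frame either.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Perceptron

# Hypothetical toy frame with one missing ALIGN value (assumption).
toy_df = pd.DataFrame({
    "ALIGN": ["Good Characters", "Bad Characters", np.nan,
              "Good Characters", "Bad Characters", "Good Characters"],
    "SEX": ["Male Characters", "Male Characters", "Female Characters",
            "Female Characters", "Male Characters", "Female Characters"],
    "ALIVE": ["Living Characters", "Deceased Characters", "Living Characters",
              "Living Characters", "Deceased Characters", "Living Characters"],
})

# One-hot encoding the target gives a 2-D frame; the NaN row is all zeros.
one_hot = pd.get_dummies(toy_df["ALIGN"])

# Instead, keep labelled rows only and use ALIGN itself as a 1-D target.
labelled = toy_df.dropna(subset=["ALIGN"])
X_toy = pd.get_dummies(labelled[["SEX", "ALIVE"]])
y_toy = labelled["ALIGN"]

model = Perceptron(max_iter=1000, random_state=0).fit(X_toy, y_toy)
preds = model.predict(X_toy)
print(one_hot.shape, y_toy.shape, preds[:3])
```

Because the features and the target are both taken from `labelled`, their rows stay aligned after the missing labels are dropped.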

In [64]:
Y_train

Unnamed: 0,Bad Characters,Good Characters,Neutral Characters,Reformed Criminals
3008,0,1,0,0
5304,0,1,0,0
1653,0,1,0,0
1001,0,1,0,0
3701,0,1,0,0
...,...,...,...,...
3046,1,0,0,0
1725,0,0,1,0
4079,1,0,0,0
2254,1,0,0,0


In [None]:
# Get Parameters.
print('Score: ' + str(perceptron.score(X_train, Y_train)))

### Visualize Perceptron Model's Border

In [None]:
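# The mesh plot needs exactly two numeric features, so this cell fits a
# separate Perceptron on APPEARANCES and YEAR (Good vs. Bad characters only)
# instead of reusing the full model.
plot_df = dc_hero_df[['APPEARANCES', 'YEAR', 'ALIGN']].dropna()
plot_df = plot_df[plot_df.ALIGN.isin(['Good Characters', 'Bad Characters'])]
X_plot = MinMaxScaler().fit_transform(plot_df[['APPEARANCES', 'YEAR']])
is_good = (plot_df.ALIGN == 'Good Characters').to_numpy()
perceptron_2d = Perceptron(max_iter=1000).fit(X_plot, is_good)

# Establish a mesh for our plot.
x_min, x_max = X_plot[:, 0].min() - .1, X_plot[:, 0].max() + .1
y_min, y_max = X_plot[:, 1].min() - .1, X_plot[:, 1].max() + .1
xx, yy = np.meshgrid(np.arange(x_min, x_max, .01),
                     np.arange(y_min, y_max, .01))

# Predict over that mesh.
Z = perceptron_2d.predict(np.c_[xx.ravel(), yy.ravel()])

# Reshape the prediction to be plottable (booleans converted to 0/1).
Z = Z.reshape(xx.shape).astype(int)

# Plot the mesh.
plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8)

# Plot our two scatters: good characters as 'x', bad characters as 'o'.
plt.scatter(X_plot[is_good, 0], X_plot[is_good, 1], marker='x')
plt.scatter(X_plot[~is_good, 0], X_plot[~is_good, 1], marker='o')

# Aesthetics.
plt.xlim(x_min, x_max)
plt.ylim(y_min, y_max)
plt.xlabel('Appearances (scaled)')
plt.ylabel('Year (scaled)')
plt.title('DC Alignment Perceptron (Two Features)')
plt.show()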

## Random Forest Classifier Model

In [19]:
from sklearn import ensemble
from sklearn.model_selection import cross_val_score

rfc = ensemble.RandomForestClassifier()
X = new_df
Y = new_target_df
X = X.dropna(axis=1)

cross_val_score(rfc, X, Y, cv=10)

array([0.34927536, 0.35797101, 0.38695652, 0.39710145, 0.41449275,
       0.43768116, 0.42525399, 0.46008708, 0.41944848, 0.4383164 ])

### Analysis: 
Cross-validation reports the accuracy of the random forest on each of the ten folds. The folds average roughly 41% accuracy, ranging from about 35% to 46%. This was a weak-performing classifier model.
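
To summarize the fold scores rather than reading them by eye, the array printed above can be averaged directly. The values below are copied from that output; only the variable names are introduced here.

```python
import numpy as np

# Fold scores copied from the cross_val_score output above.
fold_scores = np.array([
    0.34927536, 0.35797101, 0.38695652, 0.39710145, 0.41449275,
    0.43768116, 0.42525399, 0.46008708, 0.41944848, 0.4383164,
])

mean_score = fold_scores.mean()
std_score = fold_scores.std()

print(f"mean accuracy: {mean_score:.3f} (std {std_score:.3f})")
print(f"range: {fold_scores.min():.3f} to {fold_scores.max():.3f}")
```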

In [69]:
from sklearn import ensemble
from sklearn.model_selection import cross_val_score

rfc = ensemble.RandomForestClassifier()
X = new_df2
Y = new_target_df
X = X.dropna(axis=1)

cross_val_score(rfc, X, Y, cv=10)

array([0.35507246, 0.34637681, 0.37391304, 0.4       , 0.40724638,
       0.44057971, 0.42670537, 0.44847605, 0.42960813, 0.43251089])

### Analysis: 
We re-ran the RFC model on the updated feature set (`new_df2`), but cross-validation reports a similar accuracy: still roughly 41% on average. This remained a weak-performing classifier model.
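
The stated goal also calls for a multi-layer perceptron with varied hyperparameters. The sketch below shows one way such a comparison could be set up with scikit-learn's `MLPClassifier`. It runs on synthetic data from `make_classification` (an assumption, so it needs no CSV file), and the layer sizes and other settings are illustrative choices, not tuned values or results for the DC data set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (assumption): 300 samples, 20 features, 2 classes.
X_syn, y_syn = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_classes=2, random_state=0
)

# Candidate models: one random forest and two MLPs with different layouts.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "mlp_1x10": MLPClassifier(hidden_layer_sizes=(10,), max_iter=500,
                              random_state=0),
    "mlp_2x20": MLPClassifier(hidden_layer_sizes=(20, 20), max_iter=500,
                              random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
results = {
    name: cross_val_score(model, X_syn, y_syn, cv=5).mean()
    for name, model in candidates.items()
}

for name, score in results.items():
    print(f"{name}: {score:.3f}")
```

Swapping `X_syn` and `y_syn` for the cleaned features and a one-dimensional `ALIGN` target would apply the same comparison to the DC data. Wider or deeper `hidden_layer_sizes` add parameters and training time, which is the complexity side of the tradeoff.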