# Challenge
The idea behind this challenge is for you to explore the dataset, build a prediction model from it, and then write a Python module that serves the model. There are therefore two main parts:

1. Your Jupyter (IPython) notebook containing all your experiments, analyses, and results. In this notebook, you will perform all your visualizations, data normalizations, training and evaluation of your model.
2. A complete, ready-to-use Python module containing the model you trained. Write this module as if the engineering team were going to use it: it must contain usage instructions and a clear interface for accessing your model.

## The dataset
Kickstarter is one of the main online crowdfunding platforms in the world. The dataset provided contains more than 300,000 projects launched on the platform in 2018. The `data.csv` file contains the following columns:

- **ID**: internal ID, _numeric_
- **name**: name of the project, _string_
- **category**: project's category, _string_
- **main_category**: campaign's category, _string_
- **currency**: project's currency, _string_
- **deadline**: project's deadline date, _timestamp_
- **goal**: fundraising goal, _numeric_
- **launched**: project's start date, _timestamp_
- **pledged**: amount pledged by backers (project's currency), _numeric_
- **state**: project's current state, _string_; **this is what you have to predict**
- **backers**: number of people that backed the project, _numeric_
- **country**: project's country, _string_
- **usd pledged**: amount pledged by backers converted to USD (conversion made by KS), _numeric_
- **usd_pledged_real**: amount pledged by backers converted to USD (conversion made by fixer.io api), _numeric_
- **usd_goal_real**: fundraising goal in USD, _numeric_

## Goal
Your goal is to predict whether a project will be successful or not. It is entirely up to you which features you will use and which model. When it comes to performance metrics you should be able to say when the model is good enough. There are no minimum requirements or tricky conditions. What we are trying to evaluate is how you handle an unknown dataset in a classification task and your ability to deliver the results.

## Deliverables
Do not use this notebook for your submission. The expected outputs are:

1. A Jupyter (IPython) notebook (that you have to create) containing your work and explanations. This is where you will put all your experiments, notes, visualizations and transformations in the data. This is also where you will prepare your data and train your prediction model.
2. A Python module containing your model and functions to use to predict Kickstarter projects' state. Assume that in order to review your work an engineer will import this module and try to make some predictions so your model should be in it.
3. A Markdown file containing usage instructions for your Python module. 

In [1]:
# This cell shows how you can use the challenge package

from challenge import Data, MyKNN
data = Data("data.csv")
model = MyKNN(data.X, data.y)
model.train()
model.predict()
model.display_acc()
model.display_confusion_matrix()

Accuracy of K Nearest Neighbor 0.99955

Confusion Matrix for K Nearest Neighbor  is as follows :
[[65143    47]
 [    2 44261]]



In [2]:
import numpy as np
import pandas as pd
from sklearn import svm

from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

%matplotlib inline


In [3]:
def read_as_pandas(file_path):
    return pd.read_csv(file_path)

In [4]:
def success(row):
    # Map a single row to a binary label: 1 for 'successful', 0 otherwise.
    if row['state'] == 'successful':
        return 1
    else:
        return 0

In [5]:
data_file_path = "data.csv"
data = read_as_pandas(data_file_path)

## Usefulness of Columns

| Column           | Usefulness   |
|:----------------:|:------------:|
| ID               | Useless      |
| name             | Useless      |
| category         | Maybe useful |
| currency         | Maybe useful |
| deadline         | Useful       |
| goal             | Maybe useful |
| launched         | Useful       |
| pledged          | Useful       |
| backers          | Useful       |
| country          | Useless      |
| usd pledged      | Useful       |
| usd_pledged_real | Useful       |
| usd_goal_real    | Useful       |





In [6]:
data.head(10)

Unnamed: 0,ID,name,category,main_category,currency,deadline,goal,launched,pledged,state,backers,country,usd pledged,usd_pledged_real,usd_goal_real
0,1000002330,The Songs of Adelaide & Abullah,Poetry,Publishing,GBP,2015-10-09,1000.0,2015-08-11 12:12:28,0.0,failed,0,GB,0.0,0.0,1533.95
1,1000003930,Greeting From Earth: ZGAC Arts Capsule For ET,Narrative Film,Film & Video,USD,2017-11-01,30000.0,2017-09-02 04:43:57,2421.0,failed,15,US,100.0,2421.0,30000.0
2,1000004038,Where is Hank?,Narrative Film,Film & Video,USD,2013-02-26,45000.0,2013-01-12 00:20:50,220.0,failed,3,US,220.0,220.0,45000.0
3,1000007540,ToshiCapital Rekordz Needs Help to Complete Album,Music,Music,USD,2012-04-16,5000.0,2012-03-17 03:24:11,1.0,failed,1,US,1.0,1.0,5000.0
4,1000011046,Community Film Project: The Art of Neighborhoo...,Film & Video,Film & Video,USD,2015-08-29,19500.0,2015-07-04 08:35:03,1283.0,canceled,14,US,1283.0,1283.0,19500.0
5,1000014025,Monarch Espresso Bar,Restaurants,Food,USD,2016-04-01,50000.0,2016-02-26 13:38:27,52375.0,successful,224,US,52375.0,52375.0,50000.0
6,1000023410,Support Solar Roasted Coffee & Green Energy! ...,Food,Food,USD,2014-12-21,1000.0,2014-12-01 18:30:44,1205.0,successful,16,US,1205.0,1205.0,1000.0
7,1000030581,Chaser Strips. Our Strips make Shots their B*tch!,Drinks,Food,USD,2016-03-17,25000.0,2016-02-01 20:05:12,453.0,failed,40,US,453.0,453.0,25000.0
8,1000034518,SPIN - Premium Retractable In-Ear Headphones w...,Product Design,Design,USD,2014-05-29,125000.0,2014-04-24 18:14:43,8233.0,canceled,58,US,8233.0,8233.0,125000.0
9,100004195,STUDIO IN THE SKY - A Documentary Feature Film...,Documentary,Film & Video,USD,2014-08-10,65000.0,2014-07-11 21:55:48,6240.57,canceled,43,US,6240.57,6240.57,65000.0


In [7]:
data.describe()

Unnamed: 0,ID,goal,pledged,backers,usd pledged,usd_pledged_real,usd_goal_real
count,378661.0,378661.0,378661.0,378661.0,374864.0,378661.0,378661.0
mean,1074731000.0,49080.79,9682.979,105.617476,7036.729,9058.924,45454.4
std,619086200.0,1183391.0,95636.01,907.185035,78639.75,90973.34,1152950.0
min,5971.0,0.01,0.0,0.0,0.0,0.0,0.01
25%,538263500.0,2000.0,30.0,2.0,16.98,31.0,2000.0
50%,1075276000.0,5200.0,620.0,12.0,394.72,624.33,5500.0
75%,1610149000.0,16000.0,4076.0,56.0,3034.09,4050.0,15500.0
max,2147476000.0,100000000.0,20338990.0,219382.0,20338990.0,20338990.0,166361400.0


In [8]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 378661 entries, 0 to 378660
Data columns (total 15 columns):
ID                  378661 non-null int64
name                378657 non-null object
category            378661 non-null object
main_category       378661 non-null object
currency            378661 non-null object
deadline            378661 non-null object
goal                378661 non-null float64
launched            378661 non-null object
pledged             378661 non-null float64
state               378661 non-null object
backers             378661 non-null int64
country             378661 non-null object
usd pledged         374864 non-null float64
usd_pledged_real    378661 non-null float64
usd_goal_real       378661 non-null float64
dtypes: float64(5), int64(2), object(8)
memory usage: 43.3+ MB


In [9]:
data.isnull().sum()

ID                     0
name                   4
category               0
main_category          0
currency               0
deadline               0
goal                   0
launched               0
pledged                0
state                  0
backers                0
country                0
usd pledged         3797
usd_pledged_real       0
usd_goal_real          0
dtype: int64

In [10]:
data['state'].unique()

array(['failed', 'canceled', 'successful', 'live', 'undefined', 'suspended'], dtype=object)

## Processing Labels

- The `state` column contains the following values: 'failed', 'canceled', 'successful', 'live', 'undefined', 'suspended'
- We only have to predict failed or successful
- We keep only the rows whose state is 'failed' or 'successful'; rows with 'canceled', 'live', 'undefined' or 'suspended' are dropped
- 'failed' is encoded as 0 and 'successful' as 1
- The encoded label is stored in a new column, `success_or_not`
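
The filtering and encoding can also be written without a row-wise `apply`. The following is a minimal sketch on a toy frame (not the real data) that keeps only the 'successful' and 'failed' rows, as the notebook's filtering cell does, and builds the label with a vectorised comparison. The names `toy` and `binary` are illustrative, and unlike the `pd.concat` approach the original row order is preserved.

```python
import pandas as pd

# Toy frame with the six states found in the dataset (not the real data).
toy = pd.DataFrame({
    "state": ["failed", "canceled", "successful", "live", "undefined", "suspended"],
})

# Keep only campaigns with a clear outcome.
binary = toy[toy["state"].isin(["successful", "failed"])].copy()

# Vectorised label: 1 = successful, 0 = failed.
binary["success_or_not"] = (binary["state"] == "successful").astype(int)

print(binary)
```

On a frame of several hundred thousand rows, a vectorised comparison like this is typically much faster than `DataFrame.apply` with `axis=1`.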

In [11]:
processed_data = pd.concat([data[data['state'] == 'successful'], data[data['state'] == 'failed']])

In [12]:
processed_data['success_or_not']= processed_data.apply(func = success, axis = 1)

In [13]:
processed_data

Unnamed: 0,ID,name,category,main_category,currency,deadline,goal,launched,pledged,state,backers,country,usd pledged,usd_pledged_real,usd_goal_real,success_or_not
5,1000014025,Monarch Espresso Bar,Restaurants,Food,USD,2016-04-01,50000.0,2016-02-26 13:38:27,52375.00,successful,224,US,52375.00,52375.00,50000.00,1
6,1000023410,Support Solar Roasted Coffee & Green Energy! ...,Food,Food,USD,2014-12-21,1000.0,2014-12-01 18:30:44,1205.00,successful,16,US,1205.00,1205.00,1000.00,1
11,100005484,Lisa Lim New CD!,Indie Rock,Music,USD,2013-04-08,12500.0,2013-03-09 06:42:58,12700.00,successful,100,US,12700.00,12700.00,12500.00,1
14,1000057089,Tombstone: Old West tabletop game and miniatur...,Tabletop Games,Games,GBP,2017-05-03,5000.0,2017-04-05 19:44:18,94175.00,successful,761,GB,57763.78,121857.33,6469.73,1
18,1000070642,Mike Corey's Darkness & Light Album,Music,Music,USD,2012-08-17,250.0,2012-08-02 14:11:32,250.00,successful,7,US,250.00,250.00,250.00,1
20,1000072011,CMUK. Shoes: Take on Life Feet First.,Fashion,Fashion,USD,2013-12-30,20000.0,2013-11-25 07:06:11,34268.00,successful,624,US,34268.00,34268.00,20000.00,1
24,1000091520,The Book Zoo - A Mini-Comic,Comics,Comics,USD,2014-11-12,175.0,2014-10-23 17:15:50,701.66,successful,66,US,701.66,701.66,175.00,1
25,1000102741,Matt Cavenaugh & Jenny Powers make their 1st a...,Music,Music,USD,2011-01-06,10000.0,2010-12-07 23:16:50,15827.00,successful,147,US,15827.00,15827.00,10000.00,1
27,1000104688,Permaculture Skills,Webseries,Film & Video,CAD,2014-12-14,17757.0,2014-11-14 18:02:00,48905.00,successful,571,CA,43203.25,42174.03,15313.04,1
28,1000104953,Rebel Army Origins: The Heroic Story Of Major ...,Comics,Comics,GBP,2016-01-28,100.0,2015-12-29 16:59:29,112.38,successful,27,GB,167.70,160.60,142.91,1


## New Feature : Duration

- The `launched` and `deadline` columns on their own are not enough to get good results
- If we know how long a project runs and how much money is involved, we may be able to predict its state from that
- Duration is therefore an important feature
- Duration is calculated as `deadline` minus `launched`
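
As a small worked example of this calculation, the sketch below uses two `launched`/`deadline` pairs copied from the `data.head(10)` output (rows 5 and 6). The frame name `toy` is illustrative. Note that `.dt.days` keeps only the whole-day component of the timedelta, so the hours and minutes are discarded.

```python
import pandas as pd

# Two rows copied from the data.head(10) output (rows 5 and 6).
toy = pd.DataFrame({
    "launched": ["2016-02-26 13:38:27", "2014-12-01 18:30:44"],
    "deadline": ["2016-04-01", "2014-12-21"],
})

# Parse the string columns into datetimes.
for col in ["launched", "deadline"]:
    toy[col] = pd.to_datetime(toy[col])

# Timedelta between deadline and launch, then its whole-day component.
toy["Duration"] = toy["deadline"] - toy["launched"]
toy["Number of days"] = toy["Duration"].dt.days

print(toy)
```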

In [14]:
for i in ['launched','deadline']:
    processed_data[i] = pd.to_datetime(processed_data[i])

In [15]:
processed_data

Unnamed: 0,ID,name,category,main_category,currency,deadline,goal,launched,pledged,state,backers,country,usd pledged,usd_pledged_real,usd_goal_real,success_or_not
5,1000014025,Monarch Espresso Bar,Restaurants,Food,USD,2016-04-01,50000.0,2016-02-26 13:38:27,52375.00,successful,224,US,52375.00,52375.00,50000.00,1
6,1000023410,Support Solar Roasted Coffee & Green Energy! ...,Food,Food,USD,2014-12-21,1000.0,2014-12-01 18:30:44,1205.00,successful,16,US,1205.00,1205.00,1000.00,1
11,100005484,Lisa Lim New CD!,Indie Rock,Music,USD,2013-04-08,12500.0,2013-03-09 06:42:58,12700.00,successful,100,US,12700.00,12700.00,12500.00,1
14,1000057089,Tombstone: Old West tabletop game and miniatur...,Tabletop Games,Games,GBP,2017-05-03,5000.0,2017-04-05 19:44:18,94175.00,successful,761,GB,57763.78,121857.33,6469.73,1
18,1000070642,Mike Corey's Darkness & Light Album,Music,Music,USD,2012-08-17,250.0,2012-08-02 14:11:32,250.00,successful,7,US,250.00,250.00,250.00,1
20,1000072011,CMUK. Shoes: Take on Life Feet First.,Fashion,Fashion,USD,2013-12-30,20000.0,2013-11-25 07:06:11,34268.00,successful,624,US,34268.00,34268.00,20000.00,1
24,1000091520,The Book Zoo - A Mini-Comic,Comics,Comics,USD,2014-11-12,175.0,2014-10-23 17:15:50,701.66,successful,66,US,701.66,701.66,175.00,1
25,1000102741,Matt Cavenaugh & Jenny Powers make their 1st a...,Music,Music,USD,2011-01-06,10000.0,2010-12-07 23:16:50,15827.00,successful,147,US,15827.00,15827.00,10000.00,1
27,1000104688,Permaculture Skills,Webseries,Film & Video,CAD,2014-12-14,17757.0,2014-11-14 18:02:00,48905.00,successful,571,CA,43203.25,42174.03,15313.04,1
28,1000104953,Rebel Army Origins: The Heroic Story Of Major ...,Comics,Comics,GBP,2016-01-28,100.0,2015-12-29 16:59:29,112.38,successful,27,GB,167.70,160.60,142.91,1


In [16]:
processed_data['Duration']=processed_data['deadline']-processed_data['launched']


In [17]:
processed_data

Unnamed: 0,ID,name,category,main_category,currency,deadline,goal,launched,pledged,state,backers,country,usd pledged,usd_pledged_real,usd_goal_real,success_or_not,Duration
5,1000014025,Monarch Espresso Bar,Restaurants,Food,USD,2016-04-01,50000.0,2016-02-26 13:38:27,52375.00,successful,224,US,52375.00,52375.00,50000.00,1,34 days 10:21:33
6,1000023410,Support Solar Roasted Coffee & Green Energy! ...,Food,Food,USD,2014-12-21,1000.0,2014-12-01 18:30:44,1205.00,successful,16,US,1205.00,1205.00,1000.00,1,19 days 05:29:16
11,100005484,Lisa Lim New CD!,Indie Rock,Music,USD,2013-04-08,12500.0,2013-03-09 06:42:58,12700.00,successful,100,US,12700.00,12700.00,12500.00,1,29 days 17:17:02
14,1000057089,Tombstone: Old West tabletop game and miniatur...,Tabletop Games,Games,GBP,2017-05-03,5000.0,2017-04-05 19:44:18,94175.00,successful,761,GB,57763.78,121857.33,6469.73,1,27 days 04:15:42
18,1000070642,Mike Corey's Darkness & Light Album,Music,Music,USD,2012-08-17,250.0,2012-08-02 14:11:32,250.00,successful,7,US,250.00,250.00,250.00,1,14 days 09:48:28
20,1000072011,CMUK. Shoes: Take on Life Feet First.,Fashion,Fashion,USD,2013-12-30,20000.0,2013-11-25 07:06:11,34268.00,successful,624,US,34268.00,34268.00,20000.00,1,34 days 16:53:49
24,1000091520,The Book Zoo - A Mini-Comic,Comics,Comics,USD,2014-11-12,175.0,2014-10-23 17:15:50,701.66,successful,66,US,701.66,701.66,175.00,1,19 days 06:44:10
25,1000102741,Matt Cavenaugh & Jenny Powers make their 1st a...,Music,Music,USD,2011-01-06,10000.0,2010-12-07 23:16:50,15827.00,successful,147,US,15827.00,15827.00,10000.00,1,29 days 00:43:10
27,1000104688,Permaculture Skills,Webseries,Film & Video,CAD,2014-12-14,17757.0,2014-11-14 18:02:00,48905.00,successful,571,CA,43203.25,42174.03,15313.04,1,29 days 05:58:00
28,1000104953,Rebel Army Origins: The Heroic Story Of Major ...,Comics,Comics,GBP,2016-01-28,100.0,2015-12-29 16:59:29,112.38,successful,27,GB,167.70,160.60,142.91,1,29 days 07:00:31


In [18]:
processed_data['Number of days']= processed_data['Duration'].dt.days


In [19]:
processed_data

Unnamed: 0,ID,name,category,main_category,currency,deadline,goal,launched,pledged,state,backers,country,usd pledged,usd_pledged_real,usd_goal_real,success_or_not,Duration,Number of days
5,1000014025,Monarch Espresso Bar,Restaurants,Food,USD,2016-04-01,50000.0,2016-02-26 13:38:27,52375.00,successful,224,US,52375.00,52375.00,50000.00,1,34 days 10:21:33,34
6,1000023410,Support Solar Roasted Coffee & Green Energy! ...,Food,Food,USD,2014-12-21,1000.0,2014-12-01 18:30:44,1205.00,successful,16,US,1205.00,1205.00,1000.00,1,19 days 05:29:16,19
11,100005484,Lisa Lim New CD!,Indie Rock,Music,USD,2013-04-08,12500.0,2013-03-09 06:42:58,12700.00,successful,100,US,12700.00,12700.00,12500.00,1,29 days 17:17:02,29
14,1000057089,Tombstone: Old West tabletop game and miniatur...,Tabletop Games,Games,GBP,2017-05-03,5000.0,2017-04-05 19:44:18,94175.00,successful,761,GB,57763.78,121857.33,6469.73,1,27 days 04:15:42,27
18,1000070642,Mike Corey's Darkness & Light Album,Music,Music,USD,2012-08-17,250.0,2012-08-02 14:11:32,250.00,successful,7,US,250.00,250.00,250.00,1,14 days 09:48:28,14
20,1000072011,CMUK. Shoes: Take on Life Feet First.,Fashion,Fashion,USD,2013-12-30,20000.0,2013-11-25 07:06:11,34268.00,successful,624,US,34268.00,34268.00,20000.00,1,34 days 16:53:49,34
24,1000091520,The Book Zoo - A Mini-Comic,Comics,Comics,USD,2014-11-12,175.0,2014-10-23 17:15:50,701.66,successful,66,US,701.66,701.66,175.00,1,19 days 06:44:10,19
25,1000102741,Matt Cavenaugh & Jenny Powers make their 1st a...,Music,Music,USD,2011-01-06,10000.0,2010-12-07 23:16:50,15827.00,successful,147,US,15827.00,15827.00,10000.00,1,29 days 00:43:10,29
27,1000104688,Permaculture Skills,Webseries,Film & Video,CAD,2014-12-14,17757.0,2014-11-14 18:02:00,48905.00,successful,571,CA,43203.25,42174.03,15313.04,1,29 days 05:58:00,29
28,1000104953,Rebel Army Origins: The Heroic Story Of Major ...,Comics,Comics,GBP,2016-01-28,100.0,2015-12-29 16:59:29,112.38,successful,27,GB,167.70,160.60,142.91,1,29 days 07:00:31,29


In [20]:
processed_data.drop(labels=['usd pledged','name'], inplace=True, axis = 1)


In [21]:
processed_data.isnull().sum()


ID                  0
category            0
main_category       0
currency            0
deadline            0
goal                0
launched            0
pledged             0
state               0
backers             0
country             0
usd_pledged_real    0
usd_goal_real       0
success_or_not      0
Duration            0
Number of days      0
dtype: int64

In [22]:
processed_data.columns

Index(['ID', 'category', 'main_category', 'currency', 'deadline', 'goal',
       'launched', 'pledged', 'state', 'backers', 'country',
       'usd_pledged_real', 'usd_goal_real', 'success_or_not', 'Duration',
       'Number of days'],
      dtype='object')

## Features Selected

- **Number of days**: number of whole days between the project's launch and its deadline, _numeric_
- **backers**: number of people that backed the project, _numeric_
- **usd_pledged_real**: amount pledged by backers converted to USD (conversion made by fixer.io api), _numeric_
- **usd_goal_real**: fundraising goal in USD, _numeric_

In [23]:
y = processed_data['success_or_not']
X = processed_data[['Number of days', 'backers', 'usd_pledged_real', 'usd_goal_real']]

## Applying different Classifiers

- **Logistic Regression**
- **Multi-Layer Perceptron**
- **K Nearest Neighbor**
- **Decision Tree**
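
All four classifiers go through the same steps: split, fit, predict, report. One possible way to express that once is a helper that accepts any scikit-learn estimator. This is only a sketch: `evaluate_classifier`, the fixed `random_state=42`, and the synthetic demo data are assumptions for illustration and do not come from the notebook.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def evaluate_classifier(model, X, y, test_size=0.33, random_state=42):
    """Split the data, fit `model`, and return (accuracy, confusion matrix).

    Hypothetical helper: fixing `random_state` gives every model the same
    train/test split, so their scores are directly comparable.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return accuracy_score(y_test, y_pred), confusion_matrix(y_test, y_pred)


# Synthetic stand-in data (not the Kickstarter set): the label is 1 when
# the first feature exceeds the second.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 2))
y_demo = (X_demo[:, 0] > X_demo[:, 1]).astype(int)

results = {}
for name, model in [
    ("Logistic Regression", LogisticRegression()),
    ("Decision Tree", DecisionTreeClassifier(max_depth=3, random_state=100)),
]:
    acc, cm = evaluate_classifier(model, X_demo, y_demo)
    results[name] = (acc, cm)
    print(f"{name}: accuracy {acc:.5f}")
    print(cm)
```

Without a fixed `random_state`, each call to `train_test_split` draws a different split, so accuracies and confusion matrices change slightly from run to run.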


In [24]:
class LogisticReg(object):
    def __init__(self, dataX, dataY):
        self.model = LogisticRegression()
        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(dataX, dataY, test_size=0.33)
    
    def train(self):
        self.model.fit(self.X_train, self.y_train)
        
    def predict(self):
        self.y_pred = self.model.predict(self.X_test)
        
    def disp_acc(self):
        print('Accuracy of Logistic Regression {:.5f}'.format(self.model.score(self.X_test, self.y_test)))
        print("")
    
    def disp_conf_matrix(self):
        conf_matrix = confusion_matrix(self.y_test, self.y_pred)
        print("Confusion Matrix for Logistic Regression is as follows :")
        print(conf_matrix)
        print("")

        
        
logistic_reg = LogisticReg(X, y)
logistic_reg.train()
logistic_reg.predict()
logistic_reg.disp_acc()
logistic_reg.disp_conf_matrix()

Accuracy of Logistic Regression 0.99895

Confusion Matrix for Logistic Regression is as follows :
[[64928   114]
 [    1 44410]]



In [25]:
class MultiLayerPerceptron(object):
    def __init__(self, dataX, dataY):
        self.model = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(25,11,7,5,3,), random_state=1)
        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(dataX, dataY, test_size=0.33)
    
    def train(self):
        self.model.fit(self.X_train, self.y_train)
        
    def predict(self):
        self.y_pred = self.model.predict(self.X_test)
        
    def disp_acc(self):
        print('Accuracy of Multi Layered Perceptron Network {:.5f}'.format(self.model.score(self.X_test, self.y_test)))
        print("")
    
    def disp_conf_matrix(self):
        conf_matrix = confusion_matrix(self.y_test, self.y_pred)
        print("Confusion Matrix for Multi Layered Perceptron Network is as follows :")
        print(conf_matrix)
        print("")

        
        
mlp = MultiLayerPerceptron(X, y)
mlp.train()
mlp.predict()
mlp.disp_acc()
mlp.disp_conf_matrix()

Accuracy of Multi Layered Perceptron Network 0.99891

Confusion Matrix for Multi Layered Perceptron Network is as follows :
[[65049   117]
 [    2 44285]]



In [26]:
class KNN(object):
    def __init__(self, dataX, dataY, n=3):
        self.model = KNeighborsClassifier(n_neighbors=n)
        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(dataX, dataY, test_size=0.33)
    
    def train(self):
        self.model.fit(self.X_train, self.y_train)
        
    def predict(self):
        self.y_pred = self.model.predict(self.X_test)
        
    def disp_acc(self):
        print('Accuracy of K Nearest Neighbor {:.5f}'.format(self.model.score(self.X_test, self.y_test)))
        print("")
    
    def disp_conf_matrix(self):
        conf_matrix = confusion_matrix(self.y_test, self.y_pred)
        print("Confusion Matrix for K Nearest Neighbor  is as follows :")
        print(conf_matrix)
        print("")

        
        
knn = KNN(X, y)
knn.train()
knn.predict()
knn.disp_acc()
knn.disp_conf_matrix()

Accuracy of K Nearest Neighbor 0.99964

Confusion Matrix for K Nearest Neighbor  is as follows :
[[65260    33]
 [    6 44154]]



In [27]:
class DecisionTree(object):
    def __init__(self, dataX, dataY):
        self.model = DecisionTreeClassifier(criterion = "gini", random_state = 100, max_depth=3, min_samples_leaf=5)
        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(dataX, dataY, test_size=0.33)
    
    def train(self):
        self.model.fit(self.X_train, self.y_train)
        
    def predict(self):
        self.y_pred = self.model.predict(self.X_test)
        
    def disp_acc(self):
        print('Accuracy of Decision Tree {:.5f}'.format(self.model.score(self.X_test, self.y_test)))
        print("")
    
    def disp_conf_matrix(self):
        conf_matrix = confusion_matrix(self.y_test, self.y_pred)
        print("Confusion Matrix for Decision Tree is as follows :")
        print(conf_matrix)
        print("")

        
        
dec_tree = DecisionTree(X, y)
dec_tree.train()
dec_tree.predict()
dec_tree.disp_acc()
dec_tree.disp_conf_matrix()

Accuracy of Decision Tree 0.92330

Confusion Matrix for Decision Tree is as follows :
[[59096  6156]
 [ 2239 41962]]



## Comparing Results based on Accuracy

- **Logistic Regression**: 0.99895
- **Multi-Layer Perceptron**: 0.99891
- **K Nearest Neighbor**: 0.99964
- **Decision Tree**: 0.92330
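
Accuracy alone can hide how the errors are distributed between the two classes. As a rough sketch, precision and recall for the 'successful' class can be derived from the confusion matrices printed in the classifier cells. The helper name `summarize` is an assumption; the matrices are copied from the cell outputs and follow the scikit-learn layout (rows are true labels, columns are predicted labels).

```python
import numpy as np

# Confusion matrices copied from the classifier cell outputs.
# scikit-learn layout: rows are true labels (0, 1), columns are predictions.
matrices = {
    "Logistic Regression": np.array([[64928, 114], [1, 44410]]),
    "Multi-Layer Perceptron": np.array([[65049, 117], [2, 44285]]),
    "K Nearest Neighbor": np.array([[65260, 33], [6, 44154]]),
    "Decision Tree": np.array([[59096, 6156], [2239, 41962]]),
}


def summarize(cm):
    """Return accuracy, precision and recall for the positive class."""
    tn, fp, fn, tp = cm.ravel()
    return {
        "accuracy": (tp + tn) / cm.sum(),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


metrics = {name: summarize(cm) for name, cm in matrices.items()}
for name, m in metrics.items():
    print(f"{name}: " + ", ".join(f"{k} {v:.5f}" for k, v in m.items()))
```

The accuracies recomputed this way agree with the values printed by each classifier cell, which is a quick check that the matrices were read in the right orientation.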



- Since K Nearest Neighbor shows the best performance, it is the model used in the `challenge.py` package
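
To illustrate how the notebook's preprocessing and the chosen model might be packaged together, here is a hypothetical sketch. It is not the actual content of `challenge.py`: the names `FEATURES`, `make_features` and `KickstarterKNN` are assumptions, and the toy rows are copied from the `data.head(10)` output.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

FEATURES = ["Number of days", "backers", "usd_pledged_real", "usd_goal_real"]


def make_features(df):
    """Build the four selected features from raw columns."""
    duration = pd.to_datetime(df["deadline"]) - pd.to_datetime(df["launched"])
    X = df[["backers", "usd_pledged_real", "usd_goal_real"]].copy()
    X.insert(0, "Number of days", duration.dt.days)
    return X[FEATURES]


class KickstarterKNN:
    """Hypothetical wrapper around KNeighborsClassifier (illustration only)."""

    def __init__(self, n_neighbors=3):
        self.model = KNeighborsClassifier(n_neighbors=n_neighbors)

    def fit(self, df):
        # Train only on campaigns with a clear outcome.
        df = df[df["state"].isin(["successful", "failed"])]
        y = (df["state"] == "successful").astype(int)
        self.model.fit(make_features(df), y)
        return self

    def predict(self, df):
        # Returns 1 for predicted 'successful', 0 for predicted 'failed'.
        return self.model.predict(make_features(df))


# Toy rows copied from the data.head(10) output (rows 0, 1, 2, 4, 5, 6).
toy = pd.DataFrame({
    "deadline": ["2015-10-09", "2017-11-01", "2013-02-26",
                 "2015-08-29", "2016-04-01", "2014-12-21"],
    "launched": ["2015-08-11 12:12:28", "2017-09-02 04:43:57",
                 "2013-01-12 00:20:50", "2015-07-04 08:35:03",
                 "2016-02-26 13:38:27", "2014-12-01 18:30:44"],
    "state": ["failed", "failed", "failed", "canceled", "successful", "successful"],
    "backers": [0, 15, 3, 14, 224, 16],
    "usd_pledged_real": [0.0, 2421.0, 220.0, 1283.0, 52375.0, 1205.0],
    "usd_goal_real": [1533.95, 30000.0, 45000.0, 19500.0, 50000.0, 1000.0],
})

clf = KickstarterKNN().fit(toy)
predictions = clf.predict(toy.drop(columns="state"))
print(predictions)
```

Keeping feature construction in one function means the same transformation is applied at training and at prediction time, and `predict` does not need the `state` column.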
