# KNN Bank Marketing Campaigns

In this activity, you will use a dataset from a bank's telemarketing campaign. The bank's marketing partner ran the campaign, and the bank labeled the customers who opened an account after receiving a phone call. The bank now wants you to build a model that identifies likely customers, so that it can give the marketer a better list of prospects in the future.

## Instructions:

1. Read the CSV file into a Pandas DataFrame.

2. Separate the features `X` from the target `y`.

3. Encode the categorical variables from the features data using [`get_dummies`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.get_dummies.html).

4. Separate the data into training and testing subsets.

5. Scale the data using [`StandardScaler`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html).

6. Instantiate a K-Nearest Neighbors classifier.

7. Fit the model using the training data.

8. Make predictions using the testing data.

9. Generate the classification report for the test data.

## Load Data
### 1. Read the CSV file into a Pandas DataFrame.


In [1]:
# Import modules
import pandas as pd
from pathlib import Path
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.preprocessing import StandardScaler

In [2]:
# Read the CSV file into a Pandas DataFrame
df = # YOUR CODE HERE

# Review the DataFrame
# YOUR CODE HERE

Unnamed: 0,age,job,marital,education,default,balance,housing,loan,contact,day,month,duration,campaign,pdays,previous,poutcome,y
0,30,unemployed,married,primary,no,1787,no,no,cellular,19,oct,79,1,-1,0,unknown,no
1,33,services,married,secondary,no,4789,yes,yes,cellular,11,may,220,1,339,4,failure,no
2,35,management,single,tertiary,no,1350,yes,no,cellular,16,apr,185,1,330,1,failure,no
3,30,management,married,tertiary,no,1476,yes,yes,unknown,3,jun,199,4,-1,0,unknown,no
4,59,blue-collar,married,secondary,no,0,yes,no,unknown,5,may,226,1,-1,0,unknown,no
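
As a minimal sketch of the loading step, the example below reads CSV text with `pd.read_csv` and reviews the result. The activity's file path is not given here, so the in-memory `csv_text` and the name `sample_df` are hypothetical stand-ins, and only a few of the columns are reproduced.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the activity's CSV file (the real path is not given here)
csv_text = StringIO(
    "age,job,balance,y\n"
    "30,unemployed,1787,no\n"
    "33,services,4789,no\n"
    "35,management,1350,yes\n"
)

# pd.read_csv accepts a file path or any file-like object
sample_df = pd.read_csv(csv_text)

# Review the first rows and the inferred column dtypes
print(sample_df.head())
print(sample_df.dtypes)
```

With a file on disk, the same call would take a path (for example a `pathlib.Path` object) in place of `csv_text`.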


### 2. Separate the features `X` from the target `y`.

In [3]:
# Separate the features, X, from the target variable, y
y = # YOUR CODE HERE
X = # YOUR CODE HERE


In [4]:
# Preview the features data
# YOUR CODE HERE


Unnamed: 0,age,job,marital,education,default,balance,housing,loan,contact,day,month,duration,campaign,pdays,previous,poutcome
0,30,unemployed,married,primary,no,1787,no,no,cellular,19,oct,79,1,-1,0,unknown
1,33,services,married,secondary,no,4789,yes,yes,cellular,11,may,220,1,339,4,failure
2,35,management,single,tertiary,no,1350,yes,no,cellular,16,apr,185,1,330,1,failure
3,30,management,married,tertiary,no,1476,yes,yes,unknown,3,jun,199,4,-1,0,unknown
4,59,blue-collar,married,secondary,no,0,yes,no,unknown,5,may,226,1,-1,0,unknown


In [5]:
# Preview the first five entries for the target variable
# YOUR CODE HERE


0    no
1    no
2    no
3    no
4    no
Name: y, dtype: object
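
One common way to split a DataFrame into features and target is to select the target column and drop it from a copy of the frame. The sketch below assumes a small hypothetical frame (`toy_df`) whose target column is named `y`, as in the activity's data. The row values are invented for illustration.

```python
import pandas as pd

# Hypothetical miniature of the campaign data
toy_df = pd.DataFrame({
    "age": [30, 33, 35, 59],
    "job": ["unemployed", "services", "management", "blue-collar"],
    "balance": [1787, 4789, 1350, 0],
    "y": ["no", "no", "yes", "no"],
})

# The target is the single labeled column
toy_y = toy_df["y"]

# The features are every other column; drop returns a new DataFrame
toy_X = toy_df.drop(columns="y")

print(toy_X.head())
print(toy_y.head())
```

Because `drop` returns a new object by default, `toy_df` itself still contains the `y` column afterwards.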

### 3. Encode the categorical variables from the features data using `get_dummies`.

In [6]:
# Encode the categorical variables using get_dummies
X = # YOUR CODE HERE


In [7]:
# Review the features data
# YOUR CODE HERE


Unnamed: 0,age,balance,day,duration,campaign,pdays,previous,job_admin.,job_blue-collar,job_entrepreneur,...,month_jun,month_mar,month_may,month_nov,month_oct,month_sep,poutcome_failure,poutcome_other,poutcome_success,poutcome_unknown
0,30,1787,19,79,1,-1,0,0,0,0,...,0,0,0,0,1,0,0,0,0,1
1,33,4789,11,220,1,339,4,0,0,0,...,0,0,1,0,0,0,1,0,0,0
2,35,1350,16,185,1,330,1,0,0,0,...,0,0,0,0,0,0,1,0,0,0
3,30,1476,3,199,4,-1,0,0,0,0,...,1,0,0,0,0,0,0,0,0,1
4,59,0,5,226,1,-1,0,0,1,0,...,0,0,1,0,0,0,0,0,0,1
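
To illustrate what `get_dummies` does, the sketch below encodes a small hypothetical frame (`toy_X`). Text columns are replaced by one indicator column per category, and numeric columns pass through unchanged. Passing `dtype=int` is an assumption made here so the indicators print as 0/1 on pandas versions whose default is boolean.

```python
import pandas as pd

# Hypothetical feature frame with one numeric and two categorical columns
toy_X = pd.DataFrame({
    "age": [30, 33, 35],
    "marital": ["married", "married", "single"],
    "housing": ["no", "yes", "yes"],
})

# Each categorical column becomes one 0/1 column per category,
# named <column>_<category>; the numeric column is left as-is
toy_encoded = pd.get_dummies(toy_X, dtype=int)

print(toy_encoded)
```

The new column names follow the same `<column>_<category>` pattern seen in the activity's output, such as `job_blue-collar` and `poutcome_unknown`.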


### 4. Separate the data into training and testing subsets.

In [8]:
# Split the dataset using train_test_split
X_train, X_test, y_train, y_test = # YOUR CODE HERE
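
The sketch below shows `train_test_split` on hypothetical toy arrays. By default the function holds out 25% of the rows for testing. The `random_state` and `stratify` arguments are assumptions added here, not requirements of the activity: the first makes the split reproducible, and the second keeps the class proportions the same in both subsets, which can help when one class is rare.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 20 rows, 2 features, imbalanced labels (16 "no", 4 "yes")
toy_X = np.arange(40).reshape(20, 2)
toy_y = np.array(["no"] * 16 + ["yes"] * 4)

# Default test_size is 0.25, so 15 rows go to training and 5 to testing
X_tr, X_te, y_tr, y_te = train_test_split(
    toy_X, toy_y, random_state=1, stratify=toy_y
)

print(X_tr.shape, X_te.shape)
print((y_te == "yes").sum(), "positive rows in the test subset")
```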


### 5. Scale the data using `StandardScaler`.

In [9]:
# Instantiate a StandardScaler instance
scaler = # YOUR CODE HERE

# Fit the scaler to the training data
X_scaler = # YOUR CODE HERE

# Transform the training data using the scaler
X_train_scaled = # YOUR CODE HERE

# Transform the testing data using the scaler
X_test_scaled = # YOUR CODE HERE
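
The scaling pattern is to fit on the training data only and then apply the same transformation to both subsets, so that no information from the test set leaks into the scaler. The sketch below uses hypothetical toy arrays (`toy_train`, `toy_test`). KNN relies on distances, so features on large scales such as `balance` would otherwise dominate features on small scales such as `age`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training and testing features (age, balance)
toy_train = np.array([[30.0, 1787.0], [33.0, 4789.0], [35.0, 1350.0], [59.0, 0.0]])
toy_test = np.array([[40.0, 2000.0]])

# Learn the per-column mean and standard deviation from the training data only
toy_scaler = StandardScaler()
toy_scaler.fit(toy_train)

# Apply the training statistics to both subsets
toy_train_scaled = toy_scaler.transform(toy_train)
toy_test_scaled = toy_scaler.transform(toy_test)

# Scaled training columns have mean 0 and standard deviation 1
print(toy_train_scaled.mean(axis=0), toy_train_scaled.std(axis=0))
```

`fit` returns the scaler itself, which is why its result can be assigned to a separate name and used for the two `transform` calls.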

### 6. Instantiate a K-Nearest Neighbors classifier.

In [10]:
# Import the KNeighborsClassifier class from sklearn
from sklearn.neighbors import KNeighborsClassifier

# Instantiate the KNeighborsClassifier model with n_neighbors=3
knn = # YOUR CODE HERE


### 7. Fit the model using the training data.

In [11]:
# Train the model using the training data
knn.# YOUR CODE HERE


KNeighborsClassifier(n_neighbors=3)
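
As a small illustration of instantiating and fitting, the sketch below trains a `KNeighborsClassifier` with `n_neighbors=3` on hypothetical, already-scaled toy points. For KNN, fitting mostly amounts to storing the training samples so that neighbors can be looked up at prediction time.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical scaled training data: two well-separated clusters
toy_X_train = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9],
])
toy_y_train = np.array(["no", "no", "no", "yes", "yes", "yes"])

# Each prediction will be a majority vote among the 3 closest training points
toy_knn = KNeighborsClassifier(n_neighbors=3)

# fit returns the fitted estimator, which is what a notebook cell echoes
toy_knn.fit(toy_X_train, toy_y_train)

print(toy_knn.classes_)
```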

### 8. Make predictions using the testing data.

In [12]:
# Create predictions using the testing data
y_pred = # YOUR CODE HERE
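
The prediction step can be sketched on the same kind of hypothetical toy data: `predict` takes feature rows scaled the same way as the training rows and returns one label per row. All names below are illustrative.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical scaled training data: two well-separated clusters
toy_X_train = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9],
])
toy_y_train = np.array(["no", "no", "no", "yes", "yes", "yes"])

toy_knn = KNeighborsClassifier(n_neighbors=3)
toy_knn.fit(toy_X_train, toy_y_train)

# Two unseen points, one near each cluster
toy_X_new = np.array([[0.05, 0.05], [1.0, 0.95]])

# Each point receives the majority label of its 3 nearest training points
toy_pred = toy_knn.predict(toy_X_new)
print(toy_pred)
```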


### 9. Generate the classification report for the test data.

In [13]:
# Print the classification report comparing the test labels to the model predictions
# YOUR CODE HERE


              precision    recall  f1-score   support

          no       0.90      0.97      0.93       988
         yes       0.54      0.22      0.32       143

    accuracy                           0.88      1131
   macro avg       0.72      0.60      0.62      1131
weighted avg       0.85      0.88      0.86      1131
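
To show how such a report is produced and read, the sketch below calls `classification_report` on hypothetical label lists. Passing `output_dict=True`, an optional argument not required by the activity, returns the same numbers as a nested dictionary so individual metrics can be inspected.

```python
from sklearn.metrics import classification_report

# Hypothetical true labels and model predictions
toy_y_true = ["no", "no", "no", "no", "yes", "yes"]
toy_y_pred = ["no", "no", "no", "yes", "yes", "no"]

# True labels come first, predictions second
print(classification_report(toy_y_true, toy_y_pred))

# The same metrics as a dictionary keyed by class label
toy_report = classification_report(toy_y_true, toy_y_pred, output_dict=True)

# Recall for "yes": the share of actual "yes" rows the model found (1 of 2)
print(toy_report["yes"]["recall"])
```

Read the same way, the activity's report shows a recall of 0.22 for `yes`: the model identified 22% of the 143 test customers who opened an account, even though overall accuracy is 0.88.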

