## Train a KNN classifier

You are working for a home security company to develop a system that predicts whether a detected motion event (from sensors installed in the home) is likely to be a false alarm. This will help reduce unnecessary notifications to customers.

You have been provided with a dataset of motion events. Each sample includes:

- Event ID
- Street address at which the event occurred
- False alarm (1 for yes, 0 for no)
- Time of day (encoded as 0-23, where 0 represents midnight and 23 represents 11 PM)
- Duration of motion event (in seconds)
- Number of other motion events detected in the last hour
- Outdoor temperature at the time of the event (in Fahrenheit)
- Presence of pets in the household (1 for yes, 0 for no)
- Percentage of frames with detected human presence during the event

In the attached workspace, you will read this data from a file and split it into training and test sets. Then, you will fit a `KNeighborsClassifier` (using the `sklearn` implementation) on the training set, and evaluate its accuracy in predicting the false alarm label on the test set.

You'll need to specify this random state in your notebook:

> random_state = 13

The following items will be graded:

| Name | Type | Description |
| ---- | ---- | ---- |
| `Xtr` | pandas data frame | Training data: features used as input to the model. |
| `Xts` | pandas data frame | Test data: features used as input to the model. |
| `ytr` | pandas series OR pandas data frame OR 1d numpy array | Training data: target variable. |
| `yts` | pandas series OR pandas data frame OR 1d numpy array | Test data: target variable. |
| `yts_hat` | 1d numpy array | Model prediction for the test data. |
| `acc` | float | Accuracy of the model on the test data. |

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

First, we'll load the dataset:

In [2]:
df = pd.read_csv('data.csv', index_col = 'Event_ID')

You can add some code here to inspect the data, see the names of features, and see the data types. For example, what is the proportion of false alarms? The cell below will not be graded.

In [3]:
df.head()

| Event_ID | Address | Presence_of_Pets | Percentage_of_Frames_with_Human_Presence | False_Alarm | Time_of_Day | Duration_of_Motion_Event | Events_in_Last_Hour | Outdoor_Temperature |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| 1 | 2422 Willow Rd | 1 | 90.078592 | 0 | 5 | 19 | 0 | 53.212582 |
| 2 | 3036 Main Ct | 1 | 98.169261 | 0 | 17 | 20 | 1 | 70.452566 |
| 3 | 6228 Chestnut Ln | 0 | 25.555773 | 0 | 4 | 25 | 2 | 69.543186 |
| 4 | 5620 Maple Rd | 0 | 26.367181 | 0 | 18 | 25 | 0 | 69.03484 |
| 5 | 9363 Oak Rd | 0 | 99.221166 | 0 | 19 | 7 | 1 | 67.19568 |


Note, however, that your code will be evaluated on *different* data organized in a data frame with the same columns, so your solution should not hard-code anything specific to this data.
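
The inspection suggested above (feature names, data types, proportion of false alarms) can be sketched as follows. This is only an illustration: `toy` is a hypothetical stand-in with made-up values and a subset of the real column names, so that the snippet runs on its own. In the notebook you would call the same methods on `df`.

```python
import pandas as pd

# Hypothetical stand-in for the assignment data: real column names, made-up values.
toy = pd.DataFrame(
    {
        "Address": ["1 A St", "2 B Rd", "3 C Ln", "4 D Ct"],
        "Presence_of_Pets": [1, 0, 0, 1],
        "False_Alarm": [0, 1, 0, 0],
        "Time_of_Day": [5, 17, 4, 18],
    },
    index=pd.Index([1, 2, 3, 4], name="Event_ID"),
)

# Column names and their data types.
print(toy.dtypes)

# False_Alarm is coded 0/1, so its mean is the proportion of false alarms.
false_alarm_rate = toy["False_Alarm"].mean()
print(false_alarm_rate)

# The share of each class, which also shows how imbalanced the labels are.
class_shares = toy["False_Alarm"].value_counts(normalize=True)
print(class_shares)
```

The class shares are worth knowing before training, because they give the accuracy that a model would reach by always predicting the majority class.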

Now we will split the data into training and test sets using `train_test_split`.

* Reserve 20% of the data for testing.
* Use the random state specified on the question page.

The following cell should create:

* `Xtr` and `Xts` as pandas data frames including *only* the features used to train the model
* `ytr` and `yts` as either pandas data series or 1d numpy arrays containing the target variable

(For pandas data frames or data series, don't change the names of any columns.)

In [10]:
#grade (write your code in this cell and DO NOT DELETE THIS LINE)
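# Features: every column except the street address (a string column KNN cannot use) and the target.
# df itself is left unmodified, so this cell can be re-run safely.
X = df.drop(columns=['Address', 'False_Alarm'])
# Target: 1 for a false alarm, 0 otherwise.
y = df['False_Alarm']
random_state = 13
# Hold out 20% of the samples for testing.
Xtr, Xts, ytr, yts = train_test_split(X, y, test_size=0.2, random_state=random_state)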
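
A quick way to sanity-check a split like this one is to look at the shapes and indexes of the four pieces. The sketch below does that on a hypothetical 10-row frame (`toy`, with made-up values), so it runs without `data.csv`. The same checks can be applied to `Xtr`, `Xts`, `ytr`, and `yts` in an ungraded cell.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical 10-row data set with two features and a 0/1 target.
toy = pd.DataFrame(
    {
        "Time_of_Day": range(10),
        "Duration_of_Motion_Event": range(10, 20),
        "False_Alarm": [0, 1] * 5,
    }
)
X_toy = toy.drop(columns="False_Alarm")
y_toy = toy["False_Alarm"]

Xtr_toy, Xts_toy, ytr_toy, yts_toy = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=13
)

# 20% of 10 rows is 2 test rows, leaving 8 training rows.
print(Xtr_toy.shape, Xts_toy.shape)

# Features and targets stay aligned row by row after the split.
print(Xtr_toy.index.equals(ytr_toy.index))

# No sample appears in both the training and the test set.
overlap = Xtr_toy.index.intersection(Xts_toy.index)
print(len(overlap))
```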


Now we are ready to fit the `KNeighborsClassifier`. Using 

* 9 neighbors
* and default settings for everything else, 

fit the model on the training data. Then, use it to make predictions for the test samples, and save this prediction in `yts_hat`. Evaluate the accuracy score of the model on the test data, and save this in `acc`.

In [11]:
#grade (write your code in this cell and DO NOT DELETE THIS LINE)
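# Fit a 9-nearest-neighbor classifier, leaving all other settings at their defaults.
knn = KNeighborsClassifier(n_neighbors=9)
knn.fit(Xtr, ytr)
# Predict labels for the test samples and compare them with the true labels.
yts_hat = knn.predict(Xts)
acc = accuracy_score(yts, yts_hat)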
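
To make the accuracy number concrete: `accuracy_score` returns the fraction of test samples whose predicted label matches the true label. The sketch below checks this on a hypothetical one-feature toy problem with made-up values. It uses 3 neighbors rather than 9 only because the toy training set has six points.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: two well-separated clusters on a single feature.
X_train = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
y_train = np.array([0, 0, 0, 1, 1, 1])

# Four test points; the third true label is deliberately set to disagree
# with its cluster, so the classifier will get exactly one sample wrong.
X_test = np.array([[0.05], [5.05], [0.15], [4.9]])
y_test = np.array([0, 1, 1, 1])

knn_toy = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
y_hat = knn_toy.predict(X_test)

# The library metric and the hand-computed fraction of matches agree.
acc_sklearn = accuracy_score(y_test, y_hat)
acc_manual = np.mean(y_test == y_hat)
print(acc_sklearn, acc_manual)
```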