# K-NN Implementation using Scikit-learn

Let's first create our own dummy dataset.
K-NN is a supervised learning algorithm, so the data needs two kinds of attributes: features and a label.

In [1]:
# Assigning features and label variables

# First Feature
weather=['Sunny','Sunny','Overcast','Rainy','Rainy','Rainy','Overcast','Sunny','Sunny',
'Rainy','Sunny','Overcast','Overcast','Rainy']

# Second Feature
temp=['Hot','Hot','Hot','Mild','Cool','Cool','Cool','Mild','Cool','Mild','Mild','Mild','Hot','Mild']

# Label or target variable
play=['No','No','Yes','Yes','Yes','No','Yes','No','Yes','Yes','Yes','Yes','Yes','No']

In this dataset, we have now defined two features (weather and temperature) and one output label (play).

Now K-NN requires numerical input data, so we need to represent our categorical features in a numerical fashion.
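
The reason numbers are needed is that K-NN compares points by distance. As a minimal sketch (the helper name `euclidean` is an illustrative assumption, not part of sklearn), the Euclidean distance works on integer codes but fails on raw strings:

```python
import math

def euclidean(p, q):
    # Straight-line distance between two points of equal length
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Two already-encoded points: the distance is well defined
print(euclidean((1, 1), (2, 1)))  # 1.0

# Raw categorical strings cannot be subtracted, so this raises TypeError
try:
    euclidean(('Rainy', 'Hot'), ('Sunny', 'Hot'))
except TypeError as err:
    print("cannot compute a distance on strings:", err)
```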

To encode this data, we map each value to a number, e.g. Overcast:0, Rainy:1, and Sunny:2 for weather, and Cool:0, Hot:1, and Mild:2 for temperature.

This process is known as label encoding, and sklearn conveniently does this for you with the LabelEncoder class.
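
To make the idea concrete, here is a rough pure-Python sketch of label encoding. The function `label_encode` is a hypothetical helper written for illustration; it assumes, as LabelEncoder does for strings, that classes are numbered in sorted order.

```python
def label_encode(values):
    # Number the distinct values in sorted (alphabetical) order
    classes = sorted(set(values))
    mapping = {cls: idx for idx, cls in enumerate(classes)}
    # Replace every value by its integer code
    return [mapping[v] for v in values], mapping

temp = ['Hot', 'Hot', 'Hot', 'Mild', 'Cool', 'Cool', 'Cool', 'Mild',
        'Cool', 'Mild', 'Mild', 'Mild', 'Hot', 'Mild']

codes, mapping = label_encode(temp)
print(mapping)  # {'Cool': 0, 'Hot': 1, 'Mild': 2}
print(codes)
```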

In [2]:
# Import LabelEncoder
from sklearn import preprocessing
# Create the LabelEncoder object
le = preprocessing.LabelEncoder()
# Converting string labels into numbers.
weather_encoded=le.fit_transform(weather)
print(weather_encoded)

[2 2 0 1 1 1 0 2 2 1 2 0 0 1]


Here, you imported the preprocessing module and created a LabelEncoder object.

Using this LabelEncoder object, you can fit and transform the "weather" column into a numeric column.

Similarly, we will now encode temperature and output label (play) into numeric columns.

In [3]:
# Converting string labels into numbers (le is refit on each column)
temp_encoded=le.fit_transform(temp)
label=le.fit_transform(play)
print(temp_encoded)

[1 1 1 2 0 0 0 2 0 2 2 2 1 2]
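
Because the same `le` object is refit on every column, it only remembers the mapping of the last column it saw. One possible alternative, sketched here under the assumption that you want to decode predictions later, is to keep one encoder per column (the `columns` and `encoders` dictionaries are illustrative names):

```python
from sklearn.preprocessing import LabelEncoder

columns = {
    "weather": ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Rainy', 'Overcast',
                'Sunny', 'Sunny', 'Rainy', 'Sunny', 'Overcast', 'Overcast', 'Rainy'],
    "temp": ['Hot', 'Hot', 'Hot', 'Mild', 'Cool', 'Cool', 'Cool', 'Mild',
             'Cool', 'Mild', 'Mild', 'Mild', 'Hot', 'Mild'],
    "play": ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'No',
             'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No'],
}

# One fitted encoder per column, so every mapping stays available
encoders = {}
encoded = {}
for name, values in columns.items():
    encoders[name] = LabelEncoder()
    encoded[name] = encoders[name].fit_transform(values)

# classes_ lists the original values; the position of each value is its code
print(list(encoders["weather"].classes_))

# inverse_transform turns codes back into the original strings
print(encoders["play"].inverse_transform([0, 1]))
```

With this layout a numeric prediction can be decoded through `encoders["play"].inverse_transform(...)` instead of being read off by hand.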


Now we will combine the two features into a single list of tuples using the zip() function.

In [4]:
# Combining weather and temp into a single list of tuples
features=list(zip(weather_encoded,temp_encoded))
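
As a small illustration of what zip() produces, the snippet below pairs the first four encoded values of each feature (copied from the printed outputs above), so each tuple is one row of the form (weather, temp):

```python
# First four encoded values of each feature, taken from the outputs above
weather_sample = [2, 2, 0, 1]
temp_sample = [1, 1, 1, 2]

# zip pairs the i-th weather code with the i-th temp code
rows = list(zip(weather_sample, temp_sample))
print(rows)  # [(2, 1), (2, 1), (0, 1), (1, 2)]
```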

Let's build the KNN classifier model.

First, import the KNeighborsClassifier class and create a classifier object, passing the number of neighbors as the n_neighbors argument.

Then, fit the model on the training data using fit() and make predictions using predict(). This small example has no separate test set, so the model is fit on all 14 rows and queried with a single new point.

In [5]:
from sklearn.neighbors import KNeighborsClassifier

model = KNeighborsClassifier(n_neighbors=3)

# Train the model on the full dataset
model.fit(features, label)

# Predict output
predicted = model.predict([[1, 1]])  # 1: Rainy, 1: Hot
print(predicted)

[0]


In the above example, the input is [1,1], where the first 1 means Rainy weather and the second 1 means Hot temperature. The model predicts [0], which means do not play (No).
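
To see which training rows drive a prediction, the fitted model's kneighbors() method can be queried. The sketch below rebuilds the model from the encoded values printed above; the `label` list is written out by hand on the assumption that LabelEncoder maps No to 0 and Yes to 1 (sorted order).

```python
from sklearn.neighbors import KNeighborsClassifier

# Encoded data copied from the outputs above
weather_encoded = [2, 2, 0, 1, 1, 1, 0, 2, 2, 1, 2, 0, 0, 1]
temp_encoded = [1, 1, 1, 2, 0, 0, 0, 2, 0, 2, 2, 2, 1, 2]
label = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0]  # No -> 0, Yes -> 1

features = list(zip(weather_encoded, temp_encoded))
model = KNeighborsClassifier(n_neighbors=3).fit(features, label)

# Distances to, and row indices of, the 3 neighbors chosen for the query
distances, indices = model.kneighbors([[1, 1]])
print(distances)
print(indices)

# Labels of the chosen neighbors; the prediction is their majority vote
print([label[i] for i in indices[0]])
```

In this dataset nine training rows lie at distance 1 from [1, 1], five labeled Yes and four labeled No, so the result depends on which three of the tied rows are selected. With categorical codes and so many ties, it is worth treating a single prediction like this with some caution.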

In [7]:
!pip install watermark

Collecting watermark
  Downloading watermark-2.2.0-py2.py3-none-any.whl (6.8 kB)
Installing collected packages: watermark
Successfully installed watermark-2.2.0



In [8]:
%load_ext watermark

# python, ipython, packages, and machine characteristics
%watermark -v -m -p wget,pandas,numpy,geopy,altair,vega,vega_datasets,watermark 

# date
print (" ")
%watermark -u -n -t -z

Python implementation: CPython
Python version       : 3.8.7
IPython version      : 7.19.0

wget         : not installed
pandas       : 1.2.2
numpy        : 1.19.5
geopy        : not installed
altair       : 4.1.0
vega         : not installed
vega_datasets: not installed
watermark    : 2.2.0

Compiler    : MSC v.1928 64 bit (AMD64)
OS          : Windows
Release     : 10
Machine     : AMD64
Processor   : Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
CPU cores   : 8
Architecture: 64bit

 
Last updated: Wed Oct 06 2021 00:44:33 Pakistan Standard Time

