# KNN for Classification – Implementation Guidelines

## Steps

1. **Import Dataset**  
   - Load `weight-height.csv` into a pandas DataFrame.

2. **Separate Features and Target**  
   - Features (`X`): the independent variables, `Height` and `Weight`  
   - Target (`Y`): `Gender`, label-encoded as integers

3. **Split Dataset**  
   - Training set: **70%**  
   - Testing set: **30%**

4. **Apply KNN Classifier**  
   - Use **[scikit-learn’s](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html) `KNeighborsClassifier`** with `n_neighbors=5`.  
   - Train the model on the training set.  
   - Predict on the testing set.  

5. **Evaluate the Model**  
   - Use **Accuracy** as the evaluation metric.  


In [3]:
import pandas as pd
from sklearn.preprocessing import LabelEncoder
import warnings as wr
wr.filterwarnings('ignore')

In [5]:
df = pd.read_csv('weight-height.csv')
df.head()

|   | Gender | Height | Weight |
|---|--------|-----------|------------|
| 0 | Male | 73.847017 | 241.893563 |
| 1 | Male | 68.781904 | 162.310473 |
| 2 | Male | 74.110105 | 212.740856 |
| 3 | Male | 71.730978 | 220.042470 |
| 4 | Male | 69.881796 | 206.349801 |


In [7]:
encoder = LabelEncoder()
df['Gender'] = encoder.fit_transform(df['Gender'])
df

|   | Gender | Height | Weight |
|---|--------|-----------|------------|
| 0 | 1 | 73.847017 | 241.893563 |
| 1 | 1 | 68.781904 | 162.310473 |
| 2 | 1 | 74.110105 | 212.740856 |
| 3 | 1 | 71.730978 | 220.042470 |
| 4 | 1 | 69.881796 | 206.349801 |
| ... | ... | ... | ... |
| 8550 | 0 | 60.483946 | 110.565497 |
| 8551 | 0 | 63.423372 | 129.921671 |
| 8552 | 0 | 65.584057 | 155.942671 |
| 8553 | 0 | 67.429971 | 151.678405 |
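
The encoded output above shows `Male` rows as `1` and the trailing rows as `0`. A minimal sketch of how to confirm that mapping rather than infer it: `LabelEncoder` stores the sorted class labels in `classes_`, and each label's position is its integer code. The `toy` frame below is made-up data standing in for the real `Gender` column.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for the Gender column (made-up rows, not the real dataset)
toy = pd.DataFrame({"Gender": ["Male", "Female", "Female", "Male"]})

encoder = LabelEncoder()
toy["Gender"] = encoder.fit_transform(toy["Gender"])

# classes_ is sorted, so each label's position is its integer code
mapping = {str(label): code for code, label in enumerate(encoder.classes_)}
print(mapping)

# inverse_transform recovers the original strings from the codes
decoded = [str(s) for s in encoder.inverse_transform(toy["Gender"])]
print(decoded)
```

Because the classes are sorted alphabetically, `Female` receives `0` and `Male` receives `1`, which matches the encoded table.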


In [9]:
X = df.drop(columns=['Gender'])  # features: Height and Weight
X.head()

|   | Height | Weight |
|---|-----------|------------|
| 0 | 73.847017 | 241.893563 |
| 1 | 68.781904 | 162.310473 |
| 2 | 74.110105 | 212.740856 |
| 3 | 71.730978 | 220.042470 |
| 4 | 69.881796 | 206.349801 |


In [11]:
Y = df[['Gender']]
Y.head()

|   | Gender |
|---|--------|
| 0 | 1 |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |


In [15]:
from sklearn.model_selection import train_test_split

In [17]:
# 70% training / 30% testing; random_state fixes the shuffle for reproducibility
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=42)
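
A quick way to check that the split really is 70/30 is to inspect the resulting shapes. The sketch below uses made-up data (`X_demo`, `y_demo`) rather than the real dataset. It also passes `stratify`, which the notebook does not use; it is shown only as an optional variation that keeps the class proportions equal in both subsets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 2))       # 100 made-up rows, 2 features
y_demo = np.array([0] * 50 + [1] * 50)   # balanced made-up labels

# stratify is an optional addition, not part of the notebook's split
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo)

print(X_tr.shape, X_te.shape)    # row counts of the two subsets
print(y_tr.mean(), y_te.mean())  # share of class 1 in each subset
```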

In [19]:
from sklearn.neighbors import KNeighborsClassifier
cneigh = KNeighborsClassifier(n_neighbors=5)


In [21]:
# Y_train is a one-column DataFrame; flatten it to the 1-D array that fit() expects
cneigh.fit(X_train, Y_train.values.ravel())

In [23]:
ypredCla = cneigh.predict(X_test)
ypredCla

array([0, 1, 1, ..., 0, 1, 0])
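
To make the prediction step less of a black box, here is a minimal from-scratch sketch of what a KNN classifier does for one query point: measure the distance to every training point, take the `k` closest, and return the majority label. It assumes Euclidean distance and unweighted votes, which are scikit-learn's defaults for `KNeighborsClassifier`. The helper `knn_predict_one` and the toy points are hypothetical illustrations, not part of the notebook, and tie-breaking may differ from the library's.

```python
import numpy as np
from collections import Counter

def knn_predict_one(X_train, y_train, x, k=5):
    """Majority vote among the k training points closest to x (Euclidean)."""
    dists = np.linalg.norm(X_train - x, axis=1)  # distance to every training row
    nearest = np.argsort(dists)[:k]              # indices of the k smallest distances
    votes = Counter(y_train[nearest].tolist())   # count labels among the neighbours
    return votes.most_common(1)[0][0]

# Made-up (height, weight) points: label 0 clusters low, label 1 clusters high
X_toy = np.array([[60, 110], [62, 120], [63, 125], [64, 130],
                  [70, 190], [71, 200], [72, 210], [73, 220]], dtype=float)
y_toy = np.array([0, 0, 0, 0, 1, 1, 1, 1])

pred_low = knn_predict_one(X_toy, y_toy, np.array([61.0, 115.0]))
pred_high = knn_predict_one(X_toy, y_toy, np.array([72.5, 215.0]))
print(pred_low, pred_high)
```

With `k=5` and four points per cluster, each query gets four votes from its own cluster and one from the other, so the majority label wins.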

In [25]:
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(Y_test, ypredCla)
print(f'Accuracy: {accuracy:.2f}')


Accuracy: 0.91
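
Accuracy is the fraction of test samples whose predicted label equals the true label. The sketch below uses made-up label arrays (not the notebook's results) to show that the manual calculation agrees with `accuracy_score`.

```python
import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 0])  # made-up true labels
y_pred = np.array([0, 1, 0, 0, 1, 0, 1, 1, 1, 0])  # made-up predictions

manual = (y_true == y_pred).mean()        # correct predictions / total
library = accuracy_score(y_true, y_pred)  # same quantity via scikit-learn
print(manual, library)
```

Two of the ten made-up predictions are wrong, so both values come out to 8/10.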
