**Data Set Information:**

The file "sonar.mines" contains 111 patterns obtained by bouncing sonar signals off a metal cylinder at various angles and under various conditions. The file "sonar.rocks" contains 97 patterns obtained from rocks under similar conditions. The transmitted sonar signal is a frequency-modulated chirp, rising in frequency. The data set contains signals obtained from a variety of different aspect angles, spanning 90 degrees for the cylinder and 180 degrees for the rock.

Each pattern is a set of 60 numbers in the range 0.0 to 1.0. Each number represents the energy within a particular frequency band, integrated over a certain period of time. The integration aperture for higher frequencies occurs later in time, since these frequencies are transmitted later during the chirp.

The label associated with each record contains the letter "R" if the object is a rock and "M" if it is a mine (metal cylinder). The numbers in the labels are in increasing order of aspect angle, but they do not encode the angle directly.

In [None]:
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline 

In [None]:
# Import dataset
sn = pd.read_csv("../input/sonar-dataset-suitable-for-classification/sonar.all-data.csv")

In [None]:
#Data overview
sn.head()

In [None]:
sn.info()
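
The data set description says each pattern has 60 values between 0.0 and 1.0 and a label of "R" or "M". A quick check of those claims could look like the sketch below. It runs on a small synthetic frame, and the column names `Freq_1` to `Freq_60` are assumptions made for illustration; the same checks could be applied to `sn`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the sonar frame; column names are hypothetical.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.random((5, 60)),
                    columns=[f"Freq_{i}" for i in range(1, 61)])
demo["Label"] = ["R", "M", "R", "M", "M"]

features = demo.drop("Label", axis=1)

# Number of feature columns (the description says 60).
n_features = features.shape[1]

# True only if every feature value lies in [0.0, 1.0].
in_unit_range = bool(features.ge(0.0).all().all()
                     and features.le(1.0).all().all())

# True only if no label other than "R" or "M" appears.
labels_ok = set(demo["Label"]) <= {"R", "M"}

print(n_features, in_unit_range, labels_ok)
```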

**EDA**

In [None]:
sns.countplot(data=sn, x="Label")

In [None]:
sn["Label"].value_counts()

The two classes are roughly balanced: the data set description lists 111 mine patterns and 97 rock patterns.
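
To put a number on the balance, the class shares can be computed with `value_counts(normalize=True)`. The sketch below builds a stand-in label series from the counts given in the data set description rather than reading the CSV, so `labels` is an assumption in place of `sn["Label"]`.

```python
import pandas as pd

# Stand-in for sn["Label"], built from the counts in the description
# (111 mines, 97 rocks); an assumption, not the loaded file.
labels = pd.Series(["M"] * 111 + ["R"] * 97, name="Label")

# Share of each class; values near 0.5 indicate a balanced problem.
proportions = labels.value_counts(normalize=True)
print(proportions.round(3))
```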

In [None]:
# Determine the features and the label
X = sn.drop("Label", axis=1)
y = sn["Label"]

In [None]:
# Split the Data to Train & Test
from sklearn.model_selection import train_test_split

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
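
With only about 200 samples, a random split can shift the class shares between train and test. One optional variation, not used in this notebook, is to pass `stratify=` to `train_test_split`. The sketch below uses synthetic features and labels (`X_demo`, `y_demo` are assumptions) sized like the sonar data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data with the same shape and class counts as the description.
rng = np.random.default_rng(0)
X_demo = rng.random((208, 60))
y_demo = np.array(["M"] * 111 + ["R"] * 97)

# stratify=y_demo keeps the class shares similar in both subsets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

train_share = np.mean(y_tr == "M")
test_share = np.mean(y_te == "M")
print(round(train_share, 3), round(test_share, 3))
```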

In [None]:
# Scaling the Features
from sklearn.preprocessing import StandardScaler

In [None]:
scaler = StandardScaler()

In [None]:
scaler.fit(X_train)

In [None]:
scaled_X_train = scaler.transform(X_train)
scaled_X_test = scaler.transform(X_test)
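
The scaler is fitted on the training data only and then applied to both subsets, so the test set never influences the scaling statistics. A minimal sketch of what that implies, on synthetic arrays (`X_tr`, `X_te` are assumptions): the scaled training columns have mean 0 and standard deviation 1, while the scaled test columns are only approximately centered.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for X_train and X_test.
rng = np.random.default_rng(0)
X_tr = rng.random((145, 60))
X_te = rng.random((63, 60))

# Fit on the training data only, then transform both subsets.
scaler_demo = StandardScaler().fit(X_tr)
Z_tr = scaler_demo.transform(X_tr)
Z_te = scaler_demo.transform(X_te)

# Training columns are exactly standardized (up to floating-point error).
train_mean_zero = np.allclose(Z_tr.mean(axis=0), 0.0)
train_std_one = np.allclose(Z_tr.std(axis=0), 1.0)

# Test columns use the training statistics, so their means are not exactly 0.
max_test_mean = np.abs(Z_te.mean(axis=0)).max()
print(train_mean_zero, train_std_one, max_test_mean > 0)
```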

In [None]:
#Train the Model
from sklearn.neighbors import KNeighborsClassifier

In [None]:
knn_model = KNeighborsClassifier(n_neighbors=1)
knn_model.fit(scaled_X_train, y_train)

In [None]:
#Predicting Test Data
y_pred= knn_model.predict(scaled_X_test)

In [None]:
pd.DataFrame({"y_test" : y_test , "y_pred" : y_pred})

Comparing the predicted labels with the actual labels, the model appears to classify most test samples correctly.

In [None]:
#Evaluating the Model
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

In [None]:
accuracy_score(y_test, y_pred)

In [None]:
confusion_matrix(y_test, y_pred)

In [None]:
print(classification_report(y_test, y_pred))

**The model's overall accuracy is about 93%.**
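
Accuracy is the share of correct predictions, which is the sum of the confusion matrix diagonal divided by the total number of samples. The sketch below shows this on a tiny set of made-up labels (`y_true` and `y_hat` are illustrative assumptions, not this notebook's results).

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels for illustration only.
y_true = np.array(["M", "M", "M", "R", "R", "R", "M", "R"])
y_hat = np.array(["M", "M", "R", "R", "R", "M", "M", "R"])

# Rows are actual classes, columns are predicted classes, ordered M then R.
cm = confusion_matrix(y_true, y_hat, labels=["M", "R"])

# Correct predictions sit on the diagonal.
acc_from_cm = np.trace(cm) / cm.sum()
acc_direct = accuracy_score(y_true, y_hat)
print(cm)
print(acc_from_cm, acc_direct)
```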

In [None]:
# Elbow method for choosing a reasonable K value
test_error_rate = []

for k in range(1, 20):
    knn_model = KNeighborsClassifier(n_neighbors=k)
    knn_model.fit(scaled_X_train, y_train)

    y_pred_test = knn_model.predict(scaled_X_test)

    # Error rate is the complement of accuracy on the test set
    test_error = 1 - accuracy_score(y_test, y_pred_test)
    test_error_rate.append(test_error)

In [None]:
test_error_rate

In [None]:
plt.figure(figsize=(10, 6))
plt.plot(range(1, 20), test_error_rate, label='Test Error')
plt.legend()
plt.ylabel('Error Rate')
plt.xlabel('K Value')

**From the elbow method and the chart, a reasonable K value is 1.**
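
The curve above is computed on the test set, so the chosen K has already seen the data later used to judge it. A possible alternative is to build the same error curve from cross-validation on the training data alone. The sketch below does this on synthetic data from `make_classification`; `X_demo`, `y_demo`, and the choice of 5 folds are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the training split.
X_demo, y_demo = make_classification(
    n_samples=145, n_features=60, n_informative=10, random_state=0
)

k_range = range(1, 20)
cv_error_rate = []

for k in k_range:
    # Scaling inside the pipeline is refitted on each training fold.
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    scores = cross_val_score(model, X_demo, y_demo, cv=5, scoring="accuracy")
    cv_error_rate.append(1 - scores.mean())

# K with the lowest mean cross-validated error.
best_k = k_range[int(np.argmin(cv_error_rate))]
print(best_k)
```

Plotting `cv_error_rate` against `k_range` gives an elbow chart that leaves the test set untouched for the final evaluation.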

In [None]:
# Creating a Pipeline to find K value
scaler= StandardScaler()

In [None]:
knn= KNeighborsClassifier()

In [None]:
knn.get_params().keys()

In [None]:
operations= [('scaler', scaler), ('knn', knn)]

In [None]:
from sklearn.pipeline import Pipeline

In [None]:
pipe= Pipeline(operations)

In [None]:
from sklearn.model_selection import GridSearchCV

In [None]:
k_values= list(range(1, 20))

In [None]:
param_grid= {'knn__n_neighbors': k_values}

In [None]:
full_cv_classifier= GridSearchCV(pipe, param_grid, cv=5, scoring='accuracy')

In [None]:
full_cv_classifier.fit(X_train, y_train)

In [None]:
full_cv_classifier.best_estimator_.get_params()

**Grid search with cross-validation also selects k = 1 as the best value.**
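
`get_params()` returns every parameter of the pipeline, so the selected K is easier to read from `best_params_`. The fitted search object can also score the held-out test set directly. The sketch below runs the same kind of search on synthetic data; `X_demo`, `y_demo`, `pipe_demo`, and `search` are illustrative names, not objects from this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the sonar features and labels.
X_demo, y_demo = make_classification(
    n_samples=208, n_features=60, n_informative=10, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42
)

pipe_demo = Pipeline([("scaler", StandardScaler()),
                      ("knn", KNeighborsClassifier())])
search = GridSearchCV(pipe_demo, {"knn__n_neighbors": list(range(1, 20))},
                      cv=5, scoring="accuracy")
search.fit(X_tr, y_tr)

# Only the tuned parameter, instead of the full parameter dictionary.
best_k = search.best_params_["knn__n_neighbors"]

# Accuracy of the refitted best pipeline on data the search never saw.
test_accuracy = search.score(X_te, y_te)
print(best_k, round(test_accuracy, 3))
```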