# Explanation of the intuition behind the KNN algorithm.

The K-Nearest Neighbors (K-NN) algorithm, often described as a lazy learning algorithm, is a clear and simple supervised machine learning method for regression and classification. It assumes that a new case is similar to the existing cases closest to it, and places it in the category most common among them.

The K-NN algorithm stores all available data and classifies a new data point based on its similarity to the stored data. This means that when new data appears, K-NN can classify it into a suitable category without any retraining. As a non-parametric, lazy learner, K-NN makes no assumptions about the data upfront; it simply stores the dataset and defers all computation until a point has to be classified. Applications include pattern recognition, data mining, and intrusion detection.


# Pseudocode of the algorithm.
**Step-I:** Select the number K of neighbors.

**Step-II:** Compute the Euclidean distance from the new data point to every point in the stored data.

**Step-III:** Take the K closest neighbors based on the Euclidean distance calculated.

**Step-IV:** Count the number of data points in each category among these K neighbors.

**Step-V:** Assign the new data point to the category with the greatest number of neighbors.

**Step-VI:** Our model is complete.
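
The steps above can be sketched directly in NumPy. This is only a minimal illustration, not the implementation scikit-learn uses: the function name `knn_predict`, the toy points, and the labels `"A"`/`"B"` are all made up for the example, and ties are resolved by whatever `Counter.most_common` returns first.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=3):
    """Classify one point by majority vote among its k nearest stored points."""
    X_train = np.asarray(X_train, dtype=float)
    x_new = np.asarray(x_new, dtype=float)

    # Step II: Euclidean distance from the new point to every stored point
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))

    # Step III: indices of the k smallest distances
    nearest = np.argsort(distances)[:k]

    # Step IV: count how many of those k neighbors fall in each category
    votes = Counter(y_train[i] for i in nearest)

    # Step V: the category with the most neighbors wins
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups (illustrative values only)
X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
y = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(X, y, [1.1, 1.0], k=3))  # A
print(knn_predict(X, y, [5.1, 5.0], k=3))  # B
```

Note that there is no separate training step: the stored data is the model, which is why K (Step-I) is the only thing chosen in advance.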



# Implementation of the algorithm

In [30]:
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Generate random data for 25 people
np.random.seed(42)

random_weights = np.random.normal(loc=75, scale=5, size=25)
random_heights = np.random.normal(loc=1.75, scale=0.1, size=25)
random_genders = np.random.choice(['Male', 'Female'], size=25)

random_data = {
    'Weight': random_weights,
    'Height': random_heights,
    'Gender': random_genders
}

# Convert to DataFrame
df_random = pd.DataFrame(random_data)

# Create and train the KNN model
knn_model = KNeighborsClassifier(n_neighbors=3)
knn_model.fit(df_random[['Weight', 'Height']], df_random['Gender'])

# Add a new data point for prediction
new_data_point = {'Weight': [76], 'Height': [1.56]}
df_new = pd.DataFrame(new_data_point)

# Make prediction on the new data point
prediction = knn_model.predict(df_new)

# Show the table of random data
print("Random Data Table:")
print(df_random)

# Show the prediction for the new data point
print("\nPrediction for the New Data Point:")
print(df_new)
print(f'Predicted Gender: {prediction[0]}')


Random Data Table:
       Weight    Height  Gender
0   77.483571  1.761092    Male
1   74.308678  1.634901    Male
2   78.238443  1.787570    Male
3   82.615149  1.689936    Male
4   73.829233  1.720831    Male
5   73.829315  1.689829    Male
6   82.896064  1.935228    Male
7   78.837174  1.748650    Male
8   72.652628  1.644229  Female
9   77.712800  1.832254    Male
10  72.682912  1.627916  Female
11  72.671351  1.770886  Female
12  76.209811  1.554033  Female
13  65.433599  1.617181    Male
14  66.375411  1.769686    Male
15  72.188562  1.823847    Male
16  69.935844  1.767137    Male
17  76.571237  1.738435  Female
18  70.459880  1.719890    Male
19  67.938481  1.602148    Male
20  82.328244  1.678016    Male
21  73.871118  1.703936    Male
22  75.337641  1.855712    Male
23  67.876259  1.784362  Female
24  72.278086  1.573696    Male

Prediction for the New Data Point:
   Weight  Height
0      76    1.56
Predicted Gender: Female


# Loss function + Optimization function identification
A loss function measures the error between a model's predicted output and the actual output for each training sample. Given this definition, the K-Nearest Neighbors (KNN) algorithm involves neither a loss function nor an optimization function, since it is an instance-based algorithm that does not adjust parameters through optimization.

In KNN, prediction is performed by finding the nearest neighbors under a distance measure, such as Euclidean distance, and assigning a label by majority vote (in classification) or by taking the mean of the neighbors' values (in regression). There is neither a loss function to minimize nor an optimization function that adjusts parameters, as in some other supervised learning algorithms.
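
To make the regression case concrete, here is a small sketch that predicts a value as the mean of the targets of the K nearest neighbors. The function name `knn_regress` and the one-dimensional toy data are assumptions for illustration only; nothing is fitted or optimized, and the prediction is read straight off the stored points.

```python
import numpy as np


def knn_regress(X_train, y_train, x_new, k=3):
    """Predict a numeric value as the mean target of the k nearest stored points."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)

    # Euclidean distance from the new point to every stored point
    distances = np.linalg.norm(X_train - np.asarray(x_new, dtype=float), axis=1)

    # Average the targets of the k closest points; no loss is minimized anywhere
    nearest = np.argsort(distances)[:k]
    return float(y_train[nearest].mean())


# Hypothetical toy data (illustrative values only)
X = [[1.0], [2.0], [3.0], [10.0], [11.0]]
y = [10.0, 20.0, 30.0, 100.0, 110.0]

# The three nearest points to 2.1 are 2.0, 3.0 and 1.0, so the mean is 20.0
print(knn_regress(X, y, [2.1], k=3))  # 20.0
```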

However, we can still calculate the accuracy of our predictions; an example follows below.
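
For reference, the accuracy computed in the next cell with `np.mean(predictions == y_test)` can be written as the fraction of test samples whose predicted label matches the actual one. The symbols here are notation assumed for this note: $n$ is the number of test samples, $y_i$ the actual label, $\hat{y}_i$ the predicted label, and $\mathbf{1}[\cdot]$ equals 1 when its condition holds and 0 otherwise.

$$
\text{accuracy} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\left[\hat{y}_i = y_i\right]
$$

This is an evaluation metric computed after prediction, not a loss that KNN minimizes.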

In [31]:
import pandas as pd
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Generate random data for 25 people
np.random.seed(42)

random_weights = np.random.normal(loc=75, scale=5, size=25)
random_heights = np.random.normal(loc=1.75, scale=0.1, size=25)
random_genders = np.random.choice(['Male', 'Female'], size=25)

random_data = {
    'Weight': random_weights,
    'Height': random_heights,
    'Gender': random_genders
}

# Convert to DataFrame
df_random = pd.DataFrame(random_data)

# Shuffle and split the data into training and test set
X = df_random[['Weight', 'Height']]
Y = df_random['Gender']
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.1, random_state=42)

# Convert to numpy arrays
x_train = np.array(x_train)
y_train = np.array(y_train)
x_test = np.array(x_test)
y_test = np.array(y_test)

# Create and train the KNN model
knn_model = KNeighborsClassifier(n_neighbors=3)
knn_model.fit(x_train, y_train)

# Make predictions on the test data
predictions = knn_model.predict(x_test)

# Evaluate the accuracy
accuracy = np.mean(predictions == y_test)

# Show the table of random data
print("Random Data Table:")
print(df_random)

# Show the predictions for the test data
print("\nPredictions for the Test Data:")
print(pd.DataFrame({'Weight': x_test[:, 0], 'Height': x_test[:, 1], 'Actual Gender': y_test, 'Predicted Gender': predictions}))

# Show the accuracy
print("\nAccuracy:", accuracy)


Random Data Table:
       Weight    Height  Gender
0   77.483571  1.761092    Male
1   74.308678  1.634901    Male
2   78.238443  1.787570    Male
3   82.615149  1.689936    Male
4   73.829233  1.720831    Male
5   73.829315  1.689829    Male
6   82.896064  1.935228    Male
7   78.837174  1.748650    Male
8   72.652628  1.644229  Female
9   77.712800  1.832254    Male
10  72.682912  1.627916  Female
11  72.671351  1.770886  Female
12  76.209811  1.554033  Female
13  65.433599  1.617181    Male
14  66.375411  1.769686    Male
15  72.188562  1.823847    Male
16  69.935844  1.767137    Male
17  76.571237  1.738435  Female
18  70.459880  1.719890    Male
19  67.938481  1.602148    Male
20  82.328244  1.678016    Male
21  73.871118  1.703936    Male
22  75.337641  1.855712    Male
23  67.876259  1.784362  Female
24  72.278086  1.573696    Male

Predictions for the Test Data:
      Weight    Height Actual Gender Predicted Gender
0  72.652628  1.644229        Female           Female
1  69.935