5. Develop a program that implements the k-Nearest Neighbour (KNN) algorithm to classify 100 randomly generated values of x in the range [0, 1]. Perform the following on the generated dataset:
a) Label the first 50 points {x1, …, x50} as follows: if xi ≤ 0.5, then xi ∈ Class1, else xi ∈ Class2.
b) Classify the remaining points, x51, …, x100, using KNN. Perform this for k = 1, 2, 3, 4, 5, 20, 30.
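
Before reaching for a library, it may help to see the voting step written out by hand. The block below is a minimal sketch, not the notebook's own solution: the helper name `knn_predict_1d`, the seed `0`, and the alphabetical tie-break are illustrative assumptions. It also assumes that the same threshold rule (x ≤ 0.5) can be applied to x51, …, x100 to obtain reference labels, which the exercise itself does not state.

```python
import numpy as np
from collections import Counter


def knn_predict_1d(train_x, train_y, query_x, k):
    """Classify each query value by majority vote among its k nearest training values.

    Hypothetical helper for illustration; distances are absolute differences,
    which is what Euclidean distance reduces to for a single feature.
    """
    train_x = np.asarray(train_x, dtype=float)
    predictions = []
    for q in np.asarray(query_x, dtype=float):
        distances = np.abs(train_x - q)
        # Indices of the k closest training points (stable sort keeps input order on equal distances)
        nearest = np.argsort(distances, kind="stable")[:k]
        votes = Counter(train_y[i] for i in nearest)
        top = max(votes.values())
        # Assumption: on a tied vote, pick the alphabetically first label
        predictions.append(min(label for label, count in votes.items() if count == top))
    return np.array(predictions)


# Generate 100 values in [0, 1); the seed is an arbitrary choice for repeatability
rng = np.random.default_rng(0)
x = rng.random(100)

# Assumption: label all 100 points with the threshold rule so the predictions can be scored
labels = np.where(x <= 0.5, "Class1", "Class2")
train_x, train_y = x[:50], labels[:50]
test_x, test_y = x[50:], labels[50:]

for k in [1, 2, 3, 4, 5, 20, 30]:
    pred = knn_predict_1d(train_x, train_y, test_x, k)
    accuracy = np.mean(pred == test_y)
    print(f"k = {k:2d}: agreement with the threshold rule = {accuracy:.2f}")
```

Comparing the agreement across k values gives a rough sense of how larger neighbourhoods affect points that lie close to the 0.5 boundary.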

In [None]:
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# --- Data Generation and Preparation ---

np.random.seed(42) # For reproducibility of random data
data = np.random.rand(100) # Generate 100 random numbers between 0 and 1

# Define labels for the first 50 points (our "training" set)
# These are the "true" labels for the training data based on a simple rule
train_labels_raw = ["Class1" if x <= 0.5 else "Class2" for x in data[:50]]
train_labels = np.array(train_labels_raw) # Convert to NumPy array for scikit-learn

# Prepare training and testing data for scikit-learn
# Scikit-learn expects input features (X) to be 2D arrays, even for a single feature
train_data = data[:50].reshape(-1, 1)
test_data = data[50:].reshape(-1, 1)

print("--- k-Nearest Neighbors Classification with scikit-learn ---")
print("Training dataset: First 50 points labeled based on the rule (x <= 0.5 -> Class1, x > 0.5 -> Class2)")
print("Testing dataset: Remaining 50 points to be classified\n")

# --- k-NN Classification for various 'k' values ---

k_values = [1, 2, 3, 4, 5, 20, 30]
results_by_k = {} # Dictionary to store classified labels for each k

for k in k_values:
    print(f"Results for k = {k}:")

    # Initialize the k-NN classifier
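    # n_neighbors is the 'k' value (the number of neighbours that vote on each prediction)
    # metric='euclidean' sets the distance measure; for a single feature this is |x_i - x_j|
    # Note: with an even k the vote can split evenly; how such ties are resolved
    # is an implementation detail of scikit-learn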
    knn_model = KNeighborsClassifier(n_neighbors=k, metric='euclidean')

    # Fit the model (k-NN has no real training phase; fit() only stores the labeled points)
    knn_model.fit(train_data, train_labels)

    # Make predictions on the test data
    classified_labels = knn_model.predict(test_data)
    results_by_k[k] = classified_labels # Store results

    # Print individual classifications for insight
    for i, label in enumerate(classified_labels):
        # Access the actual numerical value of the test point
        point_value = test_data[i, 0]
        print(f"  Point x{i + 51} (value: {point_value:.4f}) is classified as {label}")
    print("\n")

print("Classification complete.\n")

--- k-Nearest Neighbors Classification with scikit-learn ---
Training dataset: First 50 points labeled based on the rule (x <= 0.5 -> Class1, x > 0.5 -> Class2)
Testing dataset: Remaining 50 points to be classified

Results for k = 1:
  Point x51 (value: 0.9696) is classified as Class2
  Point x52 (value: 0.7751) is classified as Class2
  Point x53 (value: 0.9395) is classified as Class2
  Point x54 (value: 0.8948) is classified as Class2
  Point x55 (value: 0.5979) is classified as Class2
  Point x56 (value: 0.9219) is classified as Class2
  Point x57 (value: 0.0885) is classified as Class1
  Point x58 (value: 0.1960) is classified as Class1
  Point x59 (value: 0.0452) is classified as Class1
  Point x60 (value: 0.3253) is classified as Class1
  Point x61 (value: 0.3887) is classified as Class1
  Point x62 (value: 0.2713) is classified as Class1
  Point x63 (value: 0.8287) is classified as Class2
  Point x64 (value: 0.3568) is classified as Class1
  Point x65 (value: 0.2809) is classi