# #Q1. Write a Python code to implement the KNN classifier algorithm on load_iris dataset in sklearn.datasets.

The k-Nearest Neighbors (KNN) algorithm is a simple and effective classification algorithm: it labels a sample by majority vote among the k training samples closest to it. Here's how you can implement the KNN classifier on the load_iris dataset from sklearn.datasets:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Feature scaling (optional but recommended)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Create a KNN classifier object with k=3
k = 3
knn_classifier = KNeighborsClassifier(n_neighbors=k)

# Train the classifier on the training data
knn_classifier.fit(X_train, y_train)

# Make predictions on the test data
y_pred = knn_classifier.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
```


In this code, we first load the Iris dataset and split it into training and testing sets using train_test_split. We then standardize the features with StandardScaler, fitting it on the training set only and reusing the same transformation for the test set. Next, we create a KNN classifier with n_neighbors=3 (you can change this value to experiment with different numbers of neighbors), train it on the training data, and make predictions on the test data. Finally, we calculate and print the accuracy of the KNN classifier.

Please note that the Iris dataset is relatively small, and the KNN classifier is straightforward. In practice, you might want to perform additional data preprocessing, tune the number of neighbors, or use more sophisticated classification algorithms for larger and more complex datasets.
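A single accuracy number hides which classes get confused with each other. As a minimal sketch, assuming the same split, scaling, and k=3 as in the code above, a confusion matrix shows the per-class breakdown:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Same split and preprocessing as in the code above
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Rows are the true classes, columns are the predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)
```

Off-diagonal entries count the misclassified test samples; the diagonal counts the correct ones.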






# #Q2. Write a Python code to implement the KNN regressor algorithm on load_boston dataset in sklearn.datasets.

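The k-Nearest Neighbors (KNN) regressor is a simple algorithm used for regression tasks: it predicts a value by averaging the targets of the k nearest training samples. Let's implement the KNN regressor using the load_boston dataset from sklearn.datasets. Make sure you have the necessary libraries installed by running pip install scikit-learn. Note that load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so the code below needs an older release to run.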

Here's the Python code for the KNN regressor on the Boston dataset:

```python
import numpy as np
import pandas as pd
# Note: load_boston was removed in scikit-learn 1.2; this import needs an older release
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Load the Boston dataset
boston = load_boston()
X = boston.data
y = boston.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize the KNN regressor
knn_regressor = KNeighborsRegressor(n_neighbors=5)  # Set the number of neighbors (k) to 5

# Train the regressor on the training data
knn_regressor.fit(X_train, y_train)

# Make predictions on the test data
y_pred = knn_regressor.predict(X_test)

# Calculate the Mean Squared Error (MSE) and R-squared score
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean Squared Error:", mse)
print("R-squared score:", r2)
```


In this code, we load the Boston dataset and split it into training and testing sets using train_test_split. Then, we create a KNN regressor with n_neighbors=5 (you can change this value to experiment with different numbers of neighbors). We train the regressor using the training data and make predictions on the test data. Finally, we calculate and print the Mean Squared Error (MSE) and R-squared score, which are commonly used evaluation metrics for regression tasks.

Note that KNN is distance-based, so in practical scenarios you will usually want to scale the features first. You may also want to perform additional data preprocessing or try other regression algorithms to improve performance on larger and more complex datasets.
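Because load_boston is not available in scikit-learn 1.2 and later, the following is a minimal sketch of the same workflow on a dataset that still ships with the library. The choice of load_diabetes is an assumption made here for illustration, not part of the original question, and its scores are not comparable to Boston results.

```python
from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Assumption: load_diabetes stands in for load_boston, which newer
# scikit-learn releases no longer provide
X, y = load_diabetes(return_X_y=True)

# Same split settings as in the Boston example
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# KNN regressor with k=5, as in the Boston example
knn_regressor = KNeighborsRegressor(n_neighbors=5)
knn_regressor.fit(X_train, y_train)
y_pred = knn_regressor.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean Squared Error:", mse)
print("R-squared score:", r2)
```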

# #Q3. Write a Python code snippet to find the optimal value of K for the KNN classifier algorithm using cross-validation on load_iris dataset in sklearn.datasets.

To find the optimal value of K for the KNN classifier algorithm, we can use cross-validation to evaluate the model's performance for different values of K. In k-fold cross-validation, the dataset is divided into a fixed number of subsets (folds), and the model is trained and evaluated once per fold, each time using a different fold as the test set and the rest as the training set. Note that the number of folds is unrelated to the K in KNN, which is the number of neighbors.

In this example, I'll use a simple loop to iterate over different values of K and perform k-fold cross-validation to find the optimal K for the KNN classifier on the Iris dataset.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Define a range of K values to try
k_values = list(range(1, 31))

# Dictionary to store cross-validation scores for each K
k_scores = {}

# Perform k-fold cross-validation for each K value
for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k)
    scores = cross_val_score(knn, X, y, cv=5)  # 5-fold cross-validation
    k_scores[k] = np.mean(scores)

# Find the optimal K with the highest mean cross-validation score
optimal_k = max(k_scores, key=k_scores.get)

print("Cross-validation scores for each K:")
for k, score in k_scores.items():
    print(f"K={k}, Mean Accuracy={score:.4f}")

print("\nOptimal K:", optimal_k)
```


In this code, we define a range of K values from 1 to 30. Then, we loop through each K value, create a KNN classifier with that K value, and perform 5-fold cross-validation using cross_val_score. The mean cross-validation accuracy for each K value is stored in the k_scores dictionary. Finally, we find the optimal K by selecting the K with the highest mean accuracy. If several K values tie for the best score, max returns the smallest of them, because the dictionary keeps the K values in insertion order.

Please note that the choice of the number of folds (cv=5 in this case) can vary based on the dataset size and other considerations. In practice, you can experiment with different values of cv to find the best option for your specific problem.
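The same search can also be sketched with scikit-learn's GridSearchCV. The version below additionally places a StandardScaler in a Pipeline so that scaling is refit inside each fold; adding the scaler is an assumption made here, not something the loop above does, and the step names "scaler" and "knn" are arbitrary labels.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Scaling inside the pipeline means each fold is scaled using only
# its own training portion
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Try K from 1 to 30 with 5-fold cross-validation
param_grid = {"knn__n_neighbors": list(range(1, 31))}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

best_k = search.best_params_["knn__n_neighbors"]
print("Optimal K:", best_k)
print(f"Best mean accuracy: {search.best_score_:.4f}")
```

Because the features are scaled here, the selected K may differ from the one found by the unscaled loop.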






# #Q4. Implement the KNN regressor algorithm with feature scaling on load_boston dataset in sklearn.datasets.

To implement the KNN regressor algorithm with feature scaling on the `load_boston` dataset, we will use the `StandardScaler` from `sklearn.preprocessing` to scale the features before training the KNN regressor. Feature scaling is essential for distance-based algorithms like KNN: without it, features with large numeric ranges dominate the distance calculation, while scaling lets all features contribute comparably.

Here's the Python code for the KNN regressor with feature scaling on the Boston dataset:

```python
import numpy as np
import pandas as pd
# Note: load_boston was removed in scikit-learn 1.2; this import needs an older release
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score

# Load the Boston dataset
boston = load_boston()
X = boston.data
y = boston.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize the StandardScaler
scaler = StandardScaler()

# Scale the features in the training and testing sets
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Initialize the KNN regressor
knn_regressor = KNeighborsRegressor(n_neighbors=5)  # Set the number of neighbors (k) to 5

# Train the regressor on the scaled training data
knn_regressor.fit(X_train_scaled, y_train)

# Make predictions on the scaled test data
y_pred = knn_regressor.predict(X_test_scaled)

# Calculate the Mean Squared Error (MSE) and R-squared score
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean Squared Error:", mse)
print("R-squared score:", r2)
```

In this code, we first load the Boston dataset and split it into training and testing sets using `train_test_split`. Then, we initialize the `StandardScaler` to scale the features in the training and testing sets. We fit the scaler on the training data using `fit_transform` and transform the test data using `transform`.

After scaling the features, we create a KNN regressor with `n_neighbors=5` (you can change this value to experiment with different numbers of neighbors). We train the regressor using the scaled training data and make predictions on the scaled test data. Finally, we calculate and print the Mean Squared Error (MSE) and R-squared score for the KNN regressor with feature scaling.
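To see why scaling matters for a distance-based model, here is a small sketch on synthetic data. The data is made up for illustration: one small-range feature determines the target, and one large-range feature is unrelated to it.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration only)
rng = np.random.default_rng(0)
n = 500
informative = rng.uniform(0, 1, n)     # small range, determines the target
irrelevant = rng.uniform(0, 1000, n)   # large range, unrelated to the target
X = np.column_stack([informative, irrelevant])
y = 10 * informative

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Without scaling, distances are dominated by the large-range feature
raw_model = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
r2_raw = r2_score(y_test, raw_model.predict(X_test))

# With scaling, both features contribute on a comparable footing
scaler = StandardScaler().fit(X_train)
scaled_model = KNeighborsRegressor(n_neighbors=5).fit(scaler.transform(X_train), y_train)
r2_scaled = r2_score(y_test, scaled_model.predict(scaler.transform(X_test)))

print(f"R-squared without scaling: {r2_raw:.3f}")
print(f"R-squared with scaling:    {r2_scaled:.3f}")
```

On real datasets the gap depends on how different the feature ranges are and on which features carry the signal.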

# #Q5. Write a Python code snippet to implement the KNN classifier algorithm with weighted voting on load_iris dataset in sklearn.datasets.

To implement the KNN classifier algorithm with weighted voting, we can modify the standard KNN classifier by assigning weights to each neighbor's vote based on their distance from the query point. Closer neighbors will have a higher weight, indicating their higher importance in the voting process. We can achieve this by setting the `weights` parameter of the `KNeighborsClassifier` to `'distance'`.

Here's the Python code for the KNN classifier with weighted voting on the Iris dataset:

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize the KNN classifier with weighted voting
knn = KNeighborsClassifier(n_neighbors=3, weights='distance')

# Train the classifier on the training data
knn.fit(X_train, y_train)

# Make predictions on the test data
y_pred = knn.predict(X_test)

# Calculate the accuracy of the classifier
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```

In this code, we load the Iris dataset and split it into training and testing sets using `train_test_split`. We create a KNN classifier with `n_neighbors=3` (you can change this value to experiment with different numbers of neighbors) and set `weights='distance'` to enable weighted voting.

We then train the classifier on the training data and make predictions on the test data. Finally, we calculate and print the accuracy of the KNN classifier with weighted voting.

Weighted voting allows the KNN classifier to give more importance to the closer neighbors, making it potentially more robust in certain scenarios where nearby neighbors might provide more relevant information for classification.
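A toy case makes the difference concrete. In this sketch (the three training points are made up for illustration), one class-0 point lies close to the query and two class-1 points lie farther away. With uniform weights the two class-1 votes win; with `weights='distance'` each vote counts as 1/distance, so class 0 gets 1/0.5 = 2 against 1/2 + 1/2 = 1 for class 1.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy 1-D training set (made up for illustration)
X_train = np.array([[0.5], [2.0], [-2.0]])
y_train = np.array([0, 1, 1])
query = np.array([[0.0]])

# Uniform voting: every neighbor counts equally
uniform_knn = KNeighborsClassifier(n_neighbors=3, weights="uniform").fit(X_train, y_train)

# Distance-weighted voting: closer neighbors count more
distance_knn = KNeighborsClassifier(n_neighbors=3, weights="distance").fit(X_train, y_train)

uniform_pred = uniform_knn.predict(query)[0]
distance_pred = distance_knn.predict(query)[0]

print("Uniform voting predicts class:", uniform_pred)
print("Distance-weighted voting predicts class:", distance_pred)
```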

# #Q6. Implement a function to standardise the features before applying KNN classifier.

To standardize the features before applying the KNN classifier, you can write a `standardize_features` function that subtracts each feature's mean and divides by its standard deviation. Here's how you can do it:

```python
import numpy as np

def standardize_features(X, mean=None, std=None):
    """
    Standardize the features of the dataset.

    Parameters:
    X (numpy.ndarray): Input data with shape (num_samples, num_features).
    mean (numpy.ndarray, optional): Per-feature means; computed from X if omitted.
    std (numpy.ndarray, optional): Per-feature standard deviations; computed from X if omitted.

    Returns:
    tuple: (standardized data, mean, std), so the same statistics can be reused.
    """
    if mean is None:
        mean = np.mean(X, axis=0)
    if std is None:
        std = np.std(X, axis=0)
    # Guard against division by zero for constant features
    safe_std = np.where(std == 0, 1.0, std)
    standardized_X = (X - mean) / safe_std
    return standardized_X, mean, std

def knn_classifier(X_train, y_train, X_test, k=3):
    """
    K-nearest neighbors (KNN) classifier.

    Parameters:
    X_train (numpy.ndarray): Training data with shape (num_train_samples, num_features).
    y_train (numpy.ndarray): Training labels (non-negative integers) with shape (num_train_samples,).
    X_test (numpy.ndarray): Test data with shape (num_test_samples, num_features).
    k (int): Number of nearest neighbors to consider.

    Returns:
    numpy.ndarray: Predicted labels for the test data.
    """
    # Compute the statistics on the training data and reuse them for the test data
    standardized_X_train, mean, std = standardize_features(X_train)
    standardized_X_test, _, _ = standardize_features(X_test, mean, std)

    num_test_samples = X_test.shape[0]
    predicted_labels = np.zeros(num_test_samples, dtype=int)

    for i in range(num_test_samples):
        # Euclidean distance from the i-th test sample to every training sample
        distances = np.sqrt(np.sum((standardized_X_train - standardized_X_test[i])**2, axis=1))
        nearest_indices = np.argsort(distances)[:k]
        nearest_labels = y_train[nearest_indices]
        # Majority vote among the k nearest neighbors
        predicted_labels[i] = np.bincount(nearest_labels).argmax()

    return predicted_labels
```

In this implementation, the `standardize_features` function is used to standardize the training and test data before applying the KNN classifier. The mean and standard deviation are computed on the training data and reused for the test data, so both sets are scaled identically. The `knn_classifier` function takes the training data `X_train`, training labels `y_train`, test data `X_test`, and the number of nearest neighbors `k` as inputs. It then returns the predicted labels for the test data using the KNN algorithm.

Remember to call `knn_classifier` with appropriate data and labels to obtain the predicted labels for your specific dataset.
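One pitfall worth illustrating when standardizing before KNN is which statistics to apply to the test data. The sketch below, using small made-up arrays, shows that scaling a single test sample with its own mean and standard deviation breaks down, while reusing the training statistics gives usable values.

```python
import numpy as np

# Made-up data for illustration
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test = np.array([[2.5, 150.0]])  # a single test sample

train_mean = X_train.mean(axis=0)
train_std = X_train.std(axis=0)

# Scaling the test sample with its own statistics: the mean equals the
# sample and the std is 0, so every entry becomes 0/0 = NaN
with np.errstate(divide="ignore", invalid="ignore"):
    own_stats = (X_test - X_test.mean(axis=0)) / X_test.std(axis=0)

# Scaling with the training statistics keeps the sample on the same
# scale as the training data
train_stats = (X_test - train_mean) / train_std

print("Scaled with its own statistics:", own_stats)
print("Scaled with training statistics:", train_stats)
```

Even with larger test sets, where no NaN appears, separate statistics would place the training and test points on slightly different scales and distort the distances KNN relies on.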