Q1. Write Python code to implement the KNN classifier algorithm on the load_iris dataset in sklearn.datasets.

ANS:

The following Python example implements the K-Nearest Neighbors (KNN) classifier on the `load_iris` dataset from `sklearn.datasets`:

```python
# Import necessary libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = iris.data  # Features
y = iris.target  # Target labels

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create a KNN classifier with a specified number of neighbors (k)
k = 3
knn_classifier = KNeighborsClassifier(n_neighbors=k)

# Fit the classifier to the training data
knn_classifier.fit(X_train, y_train)

# Make predictions on the test data
y_pred = knn_classifier.predict(X_test)

# Calculate accuracy on the test set
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")
```

In this code:

1. We import the necessary libraries, including `load_iris` to load the dataset, `train_test_split` to split it into training and testing sets, `KNeighborsClassifier` to create the KNN classifier, and `accuracy_score` to evaluate the model's performance.

2. We load the Iris dataset and split it into features (X) and target labels (y).

3. We split the dataset into training and testing sets using `train_test_split`. In this example, 70% of the data is used for training, and 30% is used for testing.

4. We create a KNN classifier with `KNeighborsClassifier` and specify the number of neighbors (k) to use. In this case, we set `k` to 3.

5. We fit the KNN classifier to the training data using the `fit` method.

6. We make predictions on the test data using the `predict` method.

7. Finally, we calculate and print the accuracy of the KNN classifier on the test set to evaluate its performance.

You can modify the value of `k` or other hyperparameters and explore different evaluation metrics based on your specific requirements and dataset.
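
As one way to explore other evaluation metrics, the sketch below reuses the same split and prints a confusion matrix and a per-class report. The value `k=5` and the variable names `model`, `cm`, and `report` are illustrative assumptions, not part of the original answer.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Same data and split as the example above
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

# k=5 is an arbitrary illustrative choice, not a tuned value
model = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Confusion matrix: rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Per-class precision, recall, and F1 score
report = classification_report(y_test, y_pred, target_names=iris.target_names)
print(report)
```

The confusion matrix shows which species are confused with each other, which a single accuracy number hides.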

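Q2. Write Python code to implement the KNN regressor algorithm on the load_boston dataset in sklearn.datasets.

ANS:

The following Python example implements the K-Nearest Neighbors (KNN) regressor on the `load_boston` dataset from `sklearn.datasets`. Note that `load_boston` was deprecated in scikit-learn 1.0 and removed in 1.2, so this code runs only on versions earlier than 1.2. On newer versions, another regression dataset such as `fetch_california_housing` can be substituted: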

```python
# Import necessary libraries
# load_boston requires scikit-learn < 1.2 (it was removed in 1.2)
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Load the Boston Housing dataset
boston = load_boston()
X = boston.data  # Features
y = boston.target  # Target (housing prices)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create a KNN regressor with a specified number of neighbors (k)
k = 3
knn_regressor = KNeighborsRegressor(n_neighbors=k)

# Fit the regressor to the training data
knn_regressor.fit(X_train, y_train)

# Make predictions on the test data
y_pred = knn_regressor.predict(X_test)

# Calculate and print evaluation metrics
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"R-squared (R2) Score: {r2:.2f}")
```

In this code:

1. We import the necessary libraries, including `load_boston` to load the dataset, `train_test_split` to split it into training and testing sets, `KNeighborsRegressor` to create the KNN regressor, and evaluation metrics (`mean_squared_error` and `r2_score`) to assess the model's performance.

2. We load the Boston Housing dataset and split it into features (X) and target values (y), where the target represents housing prices.

3. We split the dataset into training and testing sets using `train_test_split`. In this example, 70% of the data is used for training, and 30% is used for testing.

4. We create a KNN regressor with `KNeighborsRegressor` and specify the number of neighbors (k) to use. In this case, we set `k` to 3.

5. We fit the KNN regressor to the training data using the `fit` method.

6. We make predictions on the test data using the `predict` method.

7. Finally, we calculate and print evaluation metrics, including the Mean Squared Error (MSE) and R-squared (R2) score, to assess the performance of the KNN regressor.

You can adjust the value of `k` or other hyperparameters and explore different evaluation metrics based on your specific regression task and dataset.
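
For readers on scikit-learn 1.2 or later, where `load_boston` is no longer available, the same steps can be sketched on a dataset that ships with the library. Using `load_diabetes` here is a stand-in assumption, not the dataset the question names. The loop over several `k` values and the extra mean absolute error metric are also illustrative additions.

```python
from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Stand-in dataset (assumption): load_diabetes is bundled with scikit-learn,
# unlike load_boston, which was removed in version 1.2
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Compare a few illustrative values of k on the same split
results = {}
for k in (3, 5, 10):
    model = KNeighborsRegressor(n_neighbors=k).fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[k] = {
        "mse": mean_squared_error(y_test, y_pred),
        "mae": mean_absolute_error(y_test, y_pred),
        "r2": r2_score(y_test, y_pred),
    }
    print(f"k={k}: MSE={results[k]['mse']:.2f}, "
          f"MAE={results[k]['mae']:.2f}, R2={results[k]['r2']:.2f}")
```

Comparing values of `k` on a single test split is only a rough guide. Cross-validation gives a more stable comparison.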

Q3. Write a Python code snippet to find the optimal value of K for the KNN classifier algorithm using cross-validation on the load_iris dataset in sklearn.datasets.

ANS:

To find the optimal value of K for the K-Nearest Neighbors (KNN) classifier on the `load_iris` dataset from `sklearn.datasets`, you can evaluate each candidate K with cross-validation and keep the one with the highest mean accuracy. Here's a Python code snippet to achieve this:

```python
# Import necessary libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
import numpy as np

# Load the Iris dataset
iris = load_iris()
X = iris.data  # Features
y = iris.target  # Target labels

# Hold out a test set; cross-validation below uses only the training portion
X_train, _, y_train, _ = train_test_split(X, y, test_size=0.3, random_state=42)

# Define a range of K values to test
k_values = list(range(1, 21))  # You can adjust the range as needed

# Create an empty dictionary to store K-values and their corresponding cross-validation scores
k_scores = {}

# Perform cross-validation for each K value
for k in k_values:
    knn_classifier = KNeighborsClassifier(n_neighbors=k)
    scores = cross_val_score(knn_classifier, X_train, y_train, cv=5, scoring='accuracy')  # 5-fold cross-validation
    k_scores[k] = np.mean(scores)

# Find the K value with the highest cross-validation score
best_k = max(k_scores, key=k_scores.get)
best_score = k_scores[best_k]

print(f"Best K: {best_k}")
print(f"Best Cross-Validation Score: {best_score:.4f}")
```

In this code:

1. We load the Iris dataset and split it into features (X) and target labels (y).

2. We hold out 30% of the data as a test set using `train_test_split`. The test set is not used during cross-validation, so it stays available for a final evaluation of the chosen K. You can adjust the test size as needed.

3. We define a range of K values (from 1 to 20 in this example) that we want to test.

4. We create an empty dictionary `k_scores` to store the K-values and their corresponding cross-validation scores.

5. We loop through each K value, create a KNN classifier with that K value, and perform 5-fold cross-validation (`cv=5`) to compute the mean accuracy score for each K value.

6. We record the K-value and its cross-validation score in the `k_scores` dictionary.

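7. Finally, we identify the K value with the highest mean cross-validation score as the optimal K for the KNN classifier. If several K values tie, `max` returns the first one it encounters, which here is the smallest.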

You can adjust the range of K values, the number of cross-validation folds, and the scoring metric to suit your specific needs.
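
The manual loop above can also be written with scikit-learn's `GridSearchCV`, which runs the same kind of search and refits the best model on the full training set. This is a sketch of an alternative, not the method the answer above uses. The names `param_grid`, `search`, and `test_accuracy` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Search the same range of K values with 5-fold cross-validation
param_grid = {"n_neighbors": list(range(1, 21))}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print(f"Best K: {search.best_params_['n_neighbors']}")
print(f"Best Cross-Validation Score: {search.best_score_:.4f}")

# The held-out test set gives a final check of the refitted best model
test_accuracy = search.score(X_test, y_test)
print(f"Test Accuracy: {test_accuracy:.4f}")
```

Because the search only sees the training portion, the test accuracy is an estimate that was not used to pick K.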

Q4. Implement the KNN regressor algorithm with feature scaling on the load_boston dataset in sklearn.datasets.

ANS:

To implement the K-Nearest Neighbors (KNN) regressor with feature scaling on the `load_boston` dataset from `sklearn.datasets`, you can follow the steps below. We use the `StandardScaler` from scikit-learn for feature scaling. Because `load_boston` was removed in scikit-learn 1.2, this code requires an earlier version:

```python
# Import necessary libraries
# load_boston requires scikit-learn < 1.2 (it was removed in 1.2)
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score

# Load the Boston Housing dataset
boston = load_boston()
X = boston.data  # Features
y = boston.target  # Target (housing prices)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Perform feature scaling using StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Create a KNN regressor with a specified number of neighbors (k)
k = 3
knn_regressor = KNeighborsRegressor(n_neighbors=k)

# Fit the regressor to the scaled training data
knn_regressor.fit(X_train_scaled, y_train)

# Make predictions on the scaled test data
y_pred = knn_regressor.predict(X_test_scaled)

# Calculate and print evaluation metrics
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"R-squared (R2) Score: {r2:.2f}")
```

In this code:

1. We load the Boston Housing dataset and split it into features (X) and target values (y), representing housing prices.

2. We split the dataset into training and testing sets using `train_test_split`, with 70% of the data for training and 30% for testing.

3. We perform feature scaling using `StandardScaler`. This step standardizes the features to have zero mean and unit variance, which is important for KNN because it relies on distance metrics.

4. We create a KNN regressor with `KNeighborsRegressor` and specify the number of neighbors (k) to use. In this case, we set `k` to 3.

5. We fit the KNN regressor to the scaled training data using the `fit` method.

6. We make predictions on the scaled test data using the `predict` method.

7. Finally, we calculate and print evaluation metrics, including the Mean Squared Error (MSE) and R-squared (R2) score, to assess the performance of the KNN regressor.

By applying feature scaling, you put all features on a comparable scale, so no single feature dominates the distance computation merely because of its units. This is an important step for KNN and often improves its performance.
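
To see why scale matters for distance-based methods, the sketch below compares Euclidean distances before and after standardization on a small made-up table. The data values and the helper `distances_from_first` are hypothetical and exist only for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: column 0 is in the tens of thousands,
# column 1 is a fraction between 0 and 1
X = np.array([
    [30000.0, 0.10],  # reference row
    [30500.0, 0.90],  # A: close in column 0, far in column 1
    [34000.0, 0.12],  # B: farther in column 0, almost identical in column 1
    [20000.0, 0.50],
    [40000.0, 0.30],
    [25000.0, 0.70],
])

def distances_from_first(data):
    """Euclidean distance from row 0 to every other row."""
    return np.linalg.norm(data[1:] - data[0], axis=1)

raw = distances_from_first(X)
scaled = distances_from_first(StandardScaler().fit_transform(X))

# Index 0 corresponds to row A, index 1 to row B
print("Nearest on raw features:   ", "AB"[raw.argmin()])
print("Nearest on scaled features:", "AB"[scaled.argmin()])
```

On the raw features, column 0 dominates and row A looks nearest. After standardization, both columns contribute comparably and row B becomes the nearest neighbor.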

Q5. Write a Python code snippet to implement the KNN classifier algorithm with weighted voting on the load_iris dataset in sklearn.datasets.

ANS:

You can implement the K-Nearest Neighbors (KNN) classifier with weighted voting on the `load_iris` dataset from `sklearn.datasets`. In weighted voting, each neighbor's contribution to the decision is weighted by the inverse of its distance, so closer neighbors count more. Here's a Python code snippet to achieve this:

```python
# Import necessary libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = iris.data  # Features
y = iris.target  # Target labels

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create a KNN classifier with weighted voting and a specified number of neighbors (k)
k = 3
knn_classifier = KNeighborsClassifier(n_neighbors=k, weights='distance')  # Use 'distance' for weighted voting

# Fit the classifier to the training data
knn_classifier.fit(X_train, y_train)

# Make predictions on the test data
y_pred = knn_classifier.predict(X_test)

# Calculate accuracy on the test set
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")
```

In this code:

1. We load the Iris dataset and split it into features (X) and target labels (y).

2. We split the dataset into training and testing sets using `train_test_split`, with 70% of the data for training and 30% for testing.

3. We create a KNN classifier with `KNeighborsClassifier` and specify the number of neighbors (k) to use. Additionally, we set the `weights` parameter to 'distance' to enable weighted voting based on the inverse of distances.

4. We fit the KNN classifier to the training data using the `fit` method.

5. We make predictions on the test data using the `predict` method, and these predictions will take into account the distances to the neighbors.

6. Finally, we calculate and print the accuracy of the KNN classifier with weighted voting on the test set to assess its performance.

Weighted voting assigns more influence to closer neighbors in the decision-making process, which can lead to improved accuracy, especially when some neighbors are more relevant than others. 
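
The effect is easiest to see on a tiny example where the two voting schemes disagree. The one-dimensional data below is made up for illustration. With `k=3`, the query's neighbors are one very close point of class 0 and two distant points of class 1.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: one class-0 point near the query, two class-1 points far away
X_toy = np.array([[0.0], [1.0], [1.1]])
y_toy = np.array([0, 1, 1])
query = np.array([[0.1]])

uniform = KNeighborsClassifier(n_neighbors=3, weights="uniform").fit(X_toy, y_toy)
weighted = KNeighborsClassifier(n_neighbors=3, weights="distance").fit(X_toy, y_toy)

# Uniform voting: class 1 wins with 2 votes against 1
uniform_pred = uniform.predict(query)[0]

# Distance weighting: class 0 gets about 1/0.1 = 10,
# class 1 gets about 1/0.9 + 1/1.0 = 2.11, so class 0 wins
weighted_pred = weighted.predict(query)[0]

print(f"Uniform voting predicts class {uniform_pred}")
print(f"Distance-weighted voting predicts class {weighted_pred}")
```

On Iris the two settings often give similar accuracy, so whether `weights='distance'` helps is worth checking with cross-validation rather than assuming.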

Q6. Implement a function to standardise the features before applying the KNN classifier.

ANS:

To standardize the features before applying a K-Nearest Neighbors (KNN) classifier, you can create a function that uses the `StandardScaler` from scikit-learn to scale the features to zero mean and unit variance. Here's a Python function to perform feature standardization:

```python
from sklearn.preprocessing import StandardScaler

def standardize_features(X_train, X_test):
    """
    Standardize features using StandardScaler.

    Parameters:
    - X_train: Training data features (numpy array or pandas DataFrame).
    - X_test: Test data features (numpy array or pandas DataFrame).

    Returns:
    - X_train_scaled: Standardized training data features.
    - X_test_scaled: Standardized test data features.
    """
    # Create a StandardScaler object
    scaler = StandardScaler()

    # Fit the scaler on the training data and transform both training and test data
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    return X_train_scaled, X_test_scaled
```

You can use this function by passing your training and test features (`X_train` and `X_test`) as arguments. The scaler is fit on the training data only and then applied to both sets, so no information from the test set leaks into preprocessing. The function returns the standardized training and test features, which you can then use for your KNN classifier or other machine learning algorithms.

Here's how you can use the `standardize_features` function in your KNN classifier code:

```python
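# Load necessary libraries and data, and split the dataset into training and test sets
# (the Iris dataset is used here as an example; any feature matrix works)
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Standardize features (uses standardize_features defined above)
X_train_scaled, X_test_scaled = standardize_features(X_train, X_test)

# Create a KNN classifier and fit it to the standardized training data
k = 3
knn_classifier = KNeighborsClassifier(n_neighbors=k)
knn_classifier.fit(X_train_scaled, y_train)

# Make predictions on the standardized test data
y_pred = knn_classifier.predict(X_test_scaled)

# Calculate accuracy on the test set
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")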

By standardizing the features before applying the KNN classifier, you ensure that all features have the same scale and zero mean, which helps the algorithm make more meaningful distance calculations and can improve the model's performance.
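
When standardization is combined with cross-validation, one common alternative to a hand-written helper is to bundle the scaler and the classifier in a scikit-learn pipeline, so the scaler is refit on the training folds of every split. The sketch below assumes the Iris dataset and `k=3` purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# The pipeline standardizes the features, then applies KNN;
# k=3 is an illustrative choice, not a tuned value
pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))

# In each fold the scaler is fit on the training part only,
# so the validation part never influences the scaling
scores = cross_val_score(pipeline, X, y, cv=5, scoring="accuracy")
print(f"Mean cross-validation accuracy: {scores.mean():.4f}")
```

The pipeline object can also be passed to `fit` and `predict` directly, which keeps the scaling and the classifier together as a single model.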