## Q1. What is the KNN Algorithm?

The K-Nearest Neighbors (KNN) algorithm is a simple, non-parametric, supervised learning algorithm used for both classification and regression tasks. It works by finding the "k" closest training data points (neighbors) to a new data point and predicting its label based on the majority label (for classification) or average value (for regression) of those neighbors. It relies heavily on distance metrics (e.g., Euclidean, Manhattan) to determine the proximity between data points.
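
The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a production implementation: the helper names (`k_nearest`, `knn_classify`, `knn_regress`) and the toy data are made up for this example, and Euclidean distance is assumed.

```python
import math
from collections import Counter


def k_nearest(train_X, query, k):
    """Return the indices of the k training points closest to `query`."""
    distances = [
        (math.dist(point, query), index)  # Euclidean distance
        for index, point in enumerate(train_X)
    ]
    distances.sort()
    return [index for _, index in distances[:k]]


def knn_classify(train_X, train_y, query, k=3):
    """Predict the majority label among the k nearest neighbors."""
    neighbors = k_nearest(train_X, query, k)
    votes = Counter(train_y[i] for i in neighbors)
    return votes.most_common(1)[0][0]


def knn_regress(train_X, train_y, query, k=3):
    """Predict the mean target value of the k nearest neighbors."""
    neighbors = k_nearest(train_X, query, k)
    return sum(train_y[i] for i in neighbors) / k


# Toy data (hypothetical): two well-separated clusters in 2D.
X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["a", "a", "a", "b", "b", "b"]
values = [10.0, 12.0, 11.0, 50.0, 52.0, 51.0]

print(knn_classify(X, labels, (1.8, 1.8)))  # a
print(knn_regress(X, values, (6.4, 6.4)))   # 51.0
```

Note that there is no training step: the "model" is just the stored data, and all the work happens at prediction time.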



## Q2. How Do You Choose the Value of K in KNN?
Choosing the right value of K is crucial, as it directly impacts the model’s performance:

- Cross-Validation: Evaluate a range of K values with cross-validation and select the one that gives the best validation score (for example, accuracy).
- Bias-Variance Tradeoff: Small K values give a highly flexible model with low bias but high variance, which is sensitive to noise and prone to overfitting, while large K values smooth the predictions and can lead to underfitting.
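
One way the cross-validation approach might look with scikit-learn is sketched below. The Iris dataset, the candidate range of K, and 5-fold cross-validation are all assumptions chosen for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

candidate_ks = range(1, 22, 2)  # assumed search range: K = 1, 3, ..., 21
mean_scores = {}
for k in candidate_ks:
    model = KNeighborsClassifier(n_neighbors=k)
    # Mean accuracy over 5 folds for this value of K.
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    mean_scores[k] = scores.mean()

best_k = max(mean_scores, key=mean_scores.get)
print(best_k, round(mean_scores[best_k], 3))
```

Plotting `mean_scores` against K usually makes the tradeoff visible: scores tend to be noisy for very small K and to decline once K grows large enough to blur class boundaries.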



## Q3. Difference Between KNN Classifier and KNN Regressor

- KNN Classifier: Used for classification tasks where the model predicts the class label of a new data point based on the majority class of its K-nearest neighbors.
- KNN Regressor: Used for regression tasks where the model predicts a continuous value by averaging the target values of the K-nearest neighbors.
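
In scikit-learn the two variants are exposed as `KNeighborsClassifier` and `KNeighborsRegressor`. The short sketch below fits both on the same made-up one-dimensional data to show that only the aggregation step differs.

```python
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

# Hypothetical 1D data: the same inputs with a class label and a numeric target.
X = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
y_class = [0, 0, 0, 1, 1, 1]
y_value = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y_class)
reg = KNeighborsRegressor(n_neighbors=3).fit(X, y_value)

query = [[2.5]]
predicted_class = clf.predict(query)[0]  # majority class of the 3 nearest points
predicted_value = reg.predict(query)[0]  # mean target of the same 3 points
print(predicted_class, predicted_value)
```

For the query `2.5`, the three nearest training points are `1.0`, `2.0`, and `3.0`, so the classifier votes for class `0` and the regressor returns their mean target.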


## Q4. How Do You Measure the Performance of KNN?
- For Classification: Use metrics like accuracy, precision, recall, F1-score, and the confusion matrix.
- For Regression: Evaluate with metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and R-squared (R²).
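
These metrics are available in `sklearn.metrics`. The sketch below computes them on small hand-written label and prediction lists (hypothetical values, standing in for the output of a fitted KNN model on a held-out test set).

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    mean_absolute_error,
    mean_squared_error,
    precision_score,
    r2_score,
    recall_score,
)

# Classification: true labels vs. predicted labels (made-up values).
c_true = [1, 0, 1, 1, 0, 1, 0, 0]
c_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = accuracy_score(c_true, c_pred)
precision = precision_score(c_true, c_pred)
recall = recall_score(c_true, c_pred)
f1 = f1_score(c_true, c_pred)
matrix = confusion_matrix(c_true, c_pred)  # rows: true class, columns: predicted

# Regression: true targets vs. predicted targets (made-up values).
r_true = [3.0, 5.0, 7.0, 9.0]
r_pred = [2.5, 5.0, 8.0, 9.5]

mae = mean_absolute_error(r_true, r_pred)
mse = mean_squared_error(r_true, r_pred)
r2 = r2_score(r_true, r_pred)

print(accuracy, precision, recall, f1)
print(matrix)
print(mae, mse, r2)
```

In practice these scores should be computed on data the model did not see during fitting, for example a held-out test split or cross-validation folds.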


## Q5. What is the Curse of Dimensionality in KNN?
The curse of dimensionality refers to the phenomenon where the distance between data points becomes less meaningful as the number of features (dimensions) increases. In high-dimensional spaces, the distances from a point to its nearest and farthest neighbors become nearly equal, making it difficult for KNN to distinguish true neighbors from distant points. This can reduce model accuracy, and every added feature also increases the cost of each distance computation.
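
The effect can be observed with a small experiment. The sketch below, which assumes uniformly random points in a unit hypercube and uses a hypothetical helper `distance_contrast`, measures the ratio between the nearest and farthest distance from a query point as the number of dimensions grows.

```python
import math
import random


def distance_contrast(n_points, n_dims, seed=0):
    """Ratio of the nearest to the farthest distance from one query point.

    Values close to 1 mean the nearest and farthest points are almost
    equally far away, so "nearest" carries little information.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    query = [rng.random() for _ in range(n_dims)]
    distances = [math.dist(query, p) for p in points]
    return min(distances) / max(distances)


for dims in (2, 10, 100, 1000):
    print(dims, round(distance_contrast(200, dims), 3))
```

As the dimension increases, the printed ratio moves toward 1, which is the "nearly equal distances" behavior described above. Real datasets with correlated features are often less extreme than this uniform-random setup.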



## Q6. How Do You Handle Missing Values in KNN?
- Imputation: Replace missing values with the mean, median, or mode of the feature.
- KNN Imputation: Fill missing values by finding the K-nearest neighbors and taking the average (for continuous data) or majority value (for categorical data).
- Dropping Records: In cases with few missing values, you may drop records with missing entries if it does not affect data integrity.
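
The three options above might look as follows with NumPy and scikit-learn. The array is made up for illustration. Note that scikit-learn's `KNNImputer` averages numeric features only, so the majority-value variant for categorical data would need a different tool or a custom step.

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

# Hypothetical data: the second feature is missing in the second row.
X = np.array([
    [1.0, 2.0],
    [2.0, np.nan],
    [3.0, 6.0],
    [8.0, 16.0],
])

# Option 1: replace the gap with the column mean of the observed values.
mean_filled = SimpleImputer(strategy="mean").fit_transform(X)

# Option 2: replace the gap with the average over the 2 nearest rows,
# where nearness is measured on the features that are present.
knn_filled = KNNImputer(n_neighbors=2).fit_transform(X)

# Option 3: drop every row that contains a missing value.
complete_rows = X[~np.isnan(X).any(axis=1)]

print(mean_filled[1, 1], knn_filled[1, 1], complete_rows.shape)
```

Here the column mean is pulled upward by the distant fourth row, while KNN imputation uses only the two rows closest to the incomplete one, which gives a more local estimate.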


## Q7. Compare and Contrast the Performance of the KNN Classifier and Regressor
- KNN Classifier: Works well for problems where the decision boundaries are simple and where data classes are well-separated. It’s suitable for applications like image classification and text categorization.
- KNN Regressor: Used in regression problems where target values are continuous. It works best when target values of nearby points are correlated, such as in predictive analysis for house prices.


## Q8. Strengths and Weaknesses of the KNN Algorithm for Classification and Regression Tasks
Strengths:

- Simplicity: KNN is easy to understand and implement.
- Non-parametric: No assumptions about data distribution are required.
- Versatile: Can be used for both classification and regression.

Weaknesses:

- Computationally Intensive: KNN does almost no work at training time but is slow during prediction, especially with large datasets, because it computes the distance from each new data point to the stored training points.
- Sensitive to Irrelevant Features: Irrelevant or redundant features can affect distance calculations and model accuracy.
- Ineffective in High Dimensions: Suffers from the curse of dimensionality.

Addressing Weaknesses:

- Use dimensionality reduction techniques (e.g., PCA).
- Perform feature selection to retain only the most relevant features.
- Use spatial index structures such as a KD-Tree or Ball-Tree to speed up the nearest-neighbor search.
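
Two of these mitigations can be combined in a scikit-learn pipeline, as sketched below. The digits dataset, the choice of 16 principal components, and `n_neighbors=5` are assumptions for illustration and would normally be tuned.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)  # 64 pixel features per sample
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = make_pipeline(
    # Reduce 64 features to 16 components before measuring distances.
    PCA(n_components=16, random_state=0),
    # Use a KD-tree index instead of brute-force distance computation.
    KNeighborsClassifier(n_neighbors=5, algorithm="kd_tree"),
)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(round(accuracy, 3))
```

Tree-based indexes tend to help most when the number of dimensions is modest, which is one reason reducing dimensionality first and then building the tree can work well together.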


## Q9. Difference Between Euclidean Distance and Manhattan Distance in KNN
Euclidean Distance: Measures the straight-line (or "as-the-crow-flies") distance between points in space. It's suitable for continuous, numerical data and when the distance in all directions is equally significant.



Manhattan Distance: Measures the distance between points by only allowing vertical and horizontal moves (like a grid or city block). It's often used when dimensions are not correlated and, because coordinate differences are not squared, it is less sensitive to outliers than Euclidean distance.

Formula:
<img src="https://th.bing.com/th/id/OIP.kYpQBUb08mLQi80C3QevwwHaDK?w=350&h=149&c=7&r=0&o=5&dpr=1.4&pid=1.7">
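
Written out for two points $x$ and $y$ with $n$ features, the standard definitions are $d_{\text{Euclidean}}(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$ and $d_{\text{Manhattan}}(x, y) = \sum_{i=1}^{n} |x_i - y_i|$. A minimal Python sketch of both follows; the function names and the sample points are chosen for this example.

```python
import math


def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block distance: sum of the absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


p, q = (1.0, 2.0), (4.0, 6.0)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7.0
```

In scikit-learn, the metric can be selected through the `metric` parameter of the neighbors estimators, for example `KNeighborsClassifier(metric="manhattan")`.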



## Q10. Role of Feature Scaling in KNN
Feature scaling is crucial in KNN because the algorithm relies on distance calculations to determine neighbors. Features with larger ranges can dominate distance metrics, skewing the algorithm’s results. Applying techniques like min-max normalization or standardization ensures all features contribute equally to the distance calculation, leading to more accurate predictions.
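
A small example makes the effect concrete. In the sketch below, the data (age in years, income in dollars) and the helper names are hypothetical. Min-max scaling is applied using the training set's minimum and maximum, and the code assumes no feature is constant, since a constant feature would cause a division by zero.

```python
import math


def min_max_scale(rows, reference):
    """Scale each column to [0, 1] using the min and max of `reference`."""
    columns = list(zip(*reference))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]  # assumes no constant column
    return [
        tuple((v - lo) / span for v, lo, span in zip(row, lows, spans))
        for row in rows
    ]


def nearest_index(points, query):
    """Index of the point with the smallest Euclidean distance to `query`."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], query))


# Hypothetical (age, income) records and a query close in age to the first one.
train = [(25, 50000), (60, 50500), (30, 58000)]
query = (26, 50400)

raw_nearest = nearest_index(train, query)

scaled_train = min_max_scale(train, train)
scaled_query = min_max_scale([query], train)[0]
scaled_nearest = nearest_index(scaled_train, scaled_query)

print(raw_nearest, scaled_nearest)  # 1 0
```

Without scaling, the income column dominates and the 60-year-old is picked as the nearest neighbor of a 26-year-old because their incomes differ by only 100. After scaling, both features contribute and the 25-year-old becomes the nearest neighbor. With scikit-learn, the same idea is usually expressed by placing `StandardScaler` or `MinMaxScaler` before the KNN estimator in a `Pipeline`, so that the scaler is fitted on the training data only.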