> **Question 1:** What is K-Nearest Neighbors (KNN) and how does it work
> in both classification and regression problems?
>
> **Answer 1 :-**
>
> **K-Nearest Neighbors (KNN)**
>
> K-Nearest Neighbors (KNN) is a supervised machine learning algorithm
> that can be used for both classification and regression problems.​  
> It is called a “lazy learner” because it doesn’t build an explicit
> model during training—instead, it stores all the training data and
> makes predictions only when a new data point needs to be classified or
> predicted.
>
> **How KNN Works**
>
> 1.​ Choose a value of K (the number of nearest neighbors to consider).​
>
> 2.​ For a new data point:​
>
> ○​ Calculate the distance (commonly Euclidean distance) between this
> point and all the training points.​
>
> ○​ Identify the K closest neighbors.​
>
> 3. Make a prediction based on these neighbors.
>
> **KNN for Classification**
>
> ●​ Each of the K nearest neighbors "votes" for its class.​
>
> ●​ The class with the **majority votes** is assigned to the new data
> point.​
>
> Example:​  
> If K = 5 and among the 5 neighbors:
>
> ●​ 3 belong to class “A”​
>
> ●​ 2 belong to class “B”​  
> → The new point is classified as **class A**.
>
> **KNN for Regression**
>
> ● Instead of voting for classes, KNN takes the **average (or weighted
> average)** of the target values of the K nearest neighbors.
>
> ●​ The predicted value is continuous.​
>
> Example:​  
> If K = 3 and the neighbor values are \[50, 60, 70\],​ → Prediction =
> (50 + 60 + 70) / 3 = **60**.
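>
> The voting and averaging steps above can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function names and the toy data are assumptions made for the example, and distances are Euclidean.

```python
import math
from collections import Counter

def k_nearest(train_X, query, k):
    """Return indices of the k training points closest to query (Euclidean)."""
    distances = sorted((math.dist(x, query), i) for i, x in enumerate(train_X))
    return [i for _, i in distances[:k]]

def knn_classify(train_X, train_y, query, k):
    """Majority vote among the k nearest neighbors."""
    votes = Counter(train_y[i] for i in k_nearest(train_X, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train_X, train_y, query, k):
    """Unweighted mean of the k nearest neighbors' target values."""
    return sum(train_y[i] for i in k_nearest(train_X, query, k)) / k

# Toy data (hypothetical): one feature, class labels, numeric targets
X = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (10.0,)]
labels = ["A", "A", "A", "B", "B", "B"]
targets = [50, 60, 70, 80, 90, 200]

# K = 5 around x = 3: neighbors are 3 x "A" and 2 x "B"
predicted_class = knn_classify(X, labels, (3.0,), k=5)
# K = 3 around x = 2: neighbor targets are 50, 60, 70
predicted_value = knn_regress(X, targets, (2.0,), k=3)
print(predicted_class, predicted_value)  # A 60.0
```

> Both predictions reuse the same neighbor search; only the final aggregation (vote vs. mean) differs.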
>
> **Question 2:** What is the Curse of Dimensionality and how does it
> affect KNN performance?
>
> **Answer 2 :-**
>
> **Curse of Dimensionality**
>
> The **Curse of Dimensionality** refers to various problems that arise
> when working with data in **high-dimensional spaces** (i.e., when the
> number of features/variables is very large).
>
> In high dimensions:
>
> ●​ Data becomes sparse.​
>
> ●​ Distance measures (like Euclidean distance) lose meaning.​
>
> ● Models like **KNN**, which rely on distance, struggle to find “true”
> neighbors.
>
> **How it Affects KNN**
>
> KNN works by finding the **nearest neighbors** using distance metrics.
> But in high dimensions:  
> 1.​ **Distances become less informative​**  
> ○​ In low dimensions, the nearest neighbor is clearly closer than
> others.​  
> ○​ In high dimensions, the difference between the nearest and farthest
> points becomes very small → all points look “almost equally far.”​  
> ○​ This makes it hard for KNN to distinguish which points are actually
> neighbors.​  
> 2.​ **Sparsity of data​**  
> ○​ With many features, data points are spread thinly across the
> space.​  
> ○​ A chosen “neighbor” may not really be similar in meaningful ways.​  
> 3. **Increased computational cost**  
> ○ A brute-force KNN query computes the distance to every training point.  
> ○ Each distance costs time proportional to the number of features, so a
> query over n points with d features takes on the order of n × d operations.  
> 4. **Risk of overfitting**  
> ○ With many features and sparse data, the nearest neighbors often match
> on noise rather than signal, so predictions fit quirks of the training set.
>
> **Example (Intuition)**
>
> ●​ In **2D** (say, height & weight), you can easily find similar
> people.​
>
> ●​ In **100D** (lots of random features), even the “closest” neighbor
> may not really be similar—it might just be close due to noise.​
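>
> The claim that all points look "almost equally far" in high dimensions can be checked with a small simulation. The sketch below is an assumption-laden toy (uniform random points in a unit cube, one random query, a fixed seed); it measures the relative contrast (farthest − nearest) / nearest, which should shrink as the number of dimensions grows.

```python
import math
import random

def relative_contrast(n_dims, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    distances = [
        math.dist(query, [rng.random() for _ in range(n_dims)])
        for _ in range(n_points)
    ]
    return (max(distances) - min(distances)) / min(distances)

contrast_by_dim = {d: relative_contrast(d) for d in (2, 10, 100, 500)}
for d, c in contrast_by_dim.items():
    print(f"{d:>4} dims: relative contrast = {c:.2f}")
```

> A large contrast means the nearest neighbor is clearly closer than the rest; a contrast near zero means the "nearest" point is barely distinguishable from the farthest one.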
>
> **Ways to Handle Curse of Dimensionality in KNN**
>
> ●​ **Feature Selection** → keep only the most important features.​
>
> ● **Dimensionality Reduction** (PCA, autoencoders; t-SNE is mostly used for visualization).
>
> ●​ **Scaling/Normalization** → ensures fair contribution of features.​
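>
> To see why scaling matters for a distance-based method, the sketch below compares the nearest neighbor of a query before and after z-score standardization. The data (income in dollars, age in years) and the helper names are hypothetical; the means and standard deviations are taken from the training points only.

```python
import math
from statistics import mean, pstdev

# Hypothetical training points: (annual income in dollars, age in years)
train = [(50000, 25), (51000, 60), (40000, 30), (60000, 55)]
query = (50200, 58)

def nearest_index(points, q):
    """Index of the point with the smallest Euclidean distance to q."""
    return min(range(len(points)), key=lambda i: math.dist(points[i], q))

# Column-wise mean and standard deviation, estimated on the training set only
columns = list(zip(*train))
means = [mean(col) for col in columns]
stds = [pstdev(col) for col in columns]

def standardize(point):
    """Z-score each feature using the training statistics."""
    return tuple((v - m) / s for v, m, s in zip(point, means, stds))

raw_nearest = nearest_index(train, query)
scaled_nearest = nearest_index([standardize(p) for p in train], standardize(query))
print("nearest without scaling:", train[raw_nearest])     # (50000, 25)
print("nearest with scaling:   ", train[scaled_nearest])  # (51000, 60)
```

> Without scaling, the income column dominates the distance and the 25-year-old is picked; after scaling, age counts as much as income and the 60-year-old becomes the nearest neighbor.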
>
> **Question 3:** What is Principal Component Analysis (PCA)? How is it
> different from feature selection?
>
> **Answer 3 :-**
>
> **Principal Component Analysis (PCA)**
>
> PCA is a **dimensionality reduction technique** used in machine
> learning and statistics.​  
> Its main goal is to transform high-dimensional data into a  
> lower-dimensional space while retaining as much **variance
> (information)** as possible.
>
> **How PCA Works (Conceptually)**
>
> 1.​ **Standardize the data** (so all features contribute equally).​
>
> 2.​ **Compute covariance matrix** of the features.​
>
> 3. **Find eigenvectors and eigenvalues** of the covariance matrix.  
> ○ Eigenvectors = new directions (called **principal components**)  
> ○ Eigenvalues = amount of variance explained by each component
>
> 4. **Select top k principal components** (those with highest
> variance).
>
> 5. **Project data** onto this new lower-dimensional space.
>
> Example: If you have 100 features, PCA may reduce them to 2 or 3
> “principal components” that still capture most of the variability in
> the data.
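>
> The five steps can be traced by hand on a two-feature dataset, where the eigen-decomposition of the 2 × 2 covariance matrix has a closed form. This is a sketch under stated assumptions: the data is made up, population (divide-by-n) statistics are used, and the eigenvector formula assumes the covariance term is non-zero.

```python
import math
from statistics import mean, pstdev

# Hypothetical dataset with two strongly correlated features
data = [(2.0, 1.9), (4.0, 4.2), (6.0, 5.8), (8.0, 8.1), (10.0, 10.0)]

# 1. Standardize each feature (zero mean, unit variance)
cols = list(zip(*data))
mu = [mean(c) for c in cols]
sd = [pstdev(c) for c in cols]
Z = [[(v - m) / s for v, m, s in zip(row, mu, sd)] for row in data]

# 2. Covariance matrix [[a, b], [b, c]] of the standardized features
n = len(Z)
a = sum(r[0] * r[0] for r in Z) / n
b = sum(r[0] * r[1] for r in Z) / n
c = sum(r[1] * r[1] for r in Z) / n

# 3. Eigenvalues of a symmetric 2 x 2 matrix in closed form (largest first)
half_trace = (a + c) / 2
radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
eigenvalues = [half_trace + radius, half_trace - radius]

#    Unit eigenvector for the largest eigenvalue (valid when b != 0)
v = (b, eigenvalues[0] - a)
norm = math.hypot(*v)
pc1 = (v[0] / norm, v[1] / norm)

# 4-5. Keep only the first component and project the data onto it
projected = [r[0] * pc1[0] + r[1] * pc1[1] for r in Z]
explained_ratio = eigenvalues[0] / sum(eigenvalues)
print(f"PC1 direction: {pc1}, explained variance ratio: {explained_ratio:.3f}")
```

> Because the two features are almost perfectly correlated, a single component retains nearly all of the variance, which is the situation in which PCA compresses data with little loss.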
>
> **PCA vs Feature Selection**  
> 1.​ **Definition​**  
> ○​ **PCA**: A **dimensionality reduction** technique that creates new
> features (principal components) as linear combinations of the original
> features.​  
> ○​ **Feature Selection**: A process of **choosing a subset** of the
> original features and discarding the rest.​  
> 2. **Nature of Features**  
> ○ **PCA**: Produces **new transformed features** (PC1, PC2, …).  
> ○ **Feature Selection**: Keeps **original features** as they are.  
> 3. **Interpretability**  
> ○ **PCA**: Difficult to interpret, since new features are combinations
> of many variables.  
> ○ **Feature Selection**: Easy to interpret, since it uses original
> features.  
> 4.​ **Objective​**  
> ○​ **PCA**: To capture maximum **variance (information)** with fewer
> dimensions.​  
> ○​ **Feature Selection**: To keep the most **relevant/important**
> features and remove redundant or noisy ones.​  
> 5. **Techniques Used**  
> ○ **PCA**: Uses **linear algebra** (eigenvalues, eigenvectors,
> covariance matrix).  
> ○ **Feature Selection**: Uses **filter methods** (correlation,
> chi-square), **wrapper methods** (forward/backward selection), or
> **embedded methods** (LASSO, tree-based importance).  
> 6. **When to Use**  
> ○ **PCA**: When data has many correlated features and reducing
> dimensionality is important.  
> ○ **Feature Selection**: When interpretability is crucial and we want
> to remove irrelevant/noisy variables.
>
> **Question 4:** What are eigenvalues and eigenvectors in PCA, and why
> are they important?
>
> **Answer 4 :-**
>
> **Eigenvalues and Eigenvectors in PCA**
>
> In Principal Component Analysis (PCA), the **covariance matrix** of
> the dataset is decomposed into its **eigenvectors** and
> **eigenvalues**.
>
> ●​ **Eigenvectors**: These represent the **directions** (principal  
> components) along which the data varies the most. Each eigenvector is
> a new axis in the transformed feature space.​
>
> ●​ **Eigenvalues**: These represent the **magnitude of variance**
> captured by their corresponding eigenvectors. A larger eigenvalue
> means that the corresponding eigenvector explains more of the data’s
> spread.
>
> **Why They Are Important in PCA**
>
> 1.​ **Identify principal components​**
>
> ○​ Eigenvectors define the new coordinate system where the axes are
> aligned with maximum variance.​
>
> 2.​ **Rank components by importance​**
>
> ○​ Eigenvalues tell us how much information (variance) each principal
> component contains. Components with higher eigenvalues are more
> important.​
>
> 3.​ **Dimensionality reduction​**
>
> ○​ By selecting the top-k eigenvectors (with the largest  
> eigenvalues), PCA reduces the number of dimensions while preserving
> most of the data’s variance.
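>
> Ranking components and choosing k both reduce to simple arithmetic on the eigenvalues. The sketch below uses made-up eigenvalues and a hypothetical helper, `components_for_variance`, to show how the explained variance ratio and a variance threshold translate into a number of components.

```python
from itertools import accumulate

def components_for_variance(eigenvalues, threshold):
    """Smallest k whose top-k eigenvalues explain at least `threshold` of the variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = accumulate(ev / total for ev in ordered)
    for k, cum in enumerate(cumulative, start=1):
        if cum >= threshold:
            return k
    return len(ordered)

# Hypothetical eigenvalues of a 5-feature covariance matrix
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]
ratios = [ev / sum(eigenvalues) for ev in eigenvalues]
print(ratios)                                      # [0.5, 0.25, 0.125, 0.0625, 0.0625]
print(components_for_variance(eigenvalues, 0.90))  # 4
print(components_for_variance(eigenvalues, 0.95))  # 5
```

> Each ratio is an eigenvalue divided by the sum of all eigenvalues, so the ratios always add up to 1.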
>
> **Question 5:** How do KNN and PCA complement each other when applied
> in a single pipeline?
>
> **Answer 5 :-**
>
> **How KNN and PCA Complement Each Other**  
> 1.​ **KNN’s Limitation: Curse of Dimensionality​**  
> ○​ KNN relies on **distance metrics** (Euclidean, Manhattan, etc.) to
> find nearest neighbors.​  
> ○​ In **high-dimensional data**, distances lose meaning (all points
> appear equally far), which hurts KNN performance.​  
> 2.​ **PCA’s Role: Dimensionality Reduction​**  
> ○​ PCA reduces the number of dimensions by projecting data into a
> lower-dimensional space.​  
> ○​ It keeps the most important variance while removing  
> redundancy and noise.​  
> 3.​ **Combined Effect in a Pipeline​**  
> ○​ By applying **PCA before KNN**:​  
> ■ Distances between points become more meaningful.  
> ■ Noise and irrelevant features are removed.  
> ■​ Computation is faster (fewer dimensions = fewer distance
> calculations).​  
> ○​ Then KNN can work more effectively on the reduced feature space.​
>
> **Example of Complementary Use**
>
> ●​ Suppose we have an **image classification** problem with 1000+ pixel
> features.​
>
> ●​ Running KNN directly would be slow and unreliable.​
>
> ●​ If we apply PCA and reduce features to, say, 50 principal
> components:​
>
> ○​ The key structure of the images is preserved.​
>
> ○ KNN becomes faster, and often more accurate, since it works on
> compact features with less noise.
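>
> A small-scale version of this image example can be sketched with scikit-learn's bundled digits dataset. The dataset choice (8 × 8 images, 64 pixel features instead of 1000+), the 20 components, and K = 5 are assumptions for illustration; no scaler is used because all pixels share the same intensity range.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# 8x8 grayscale digit images flattened to 64 pixel features
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# PCA is fitted on the training images only, then KNN runs in the reduced space
pca_knn = make_pipeline(PCA(n_components=20), KNeighborsClassifier(n_neighbors=5))
pca_knn.fit(X_train, y_train)

pca_knn_accuracy = pca_knn.score(X_test, y_test)
print(f"Features: {X.shape[1]} -> 20, test accuracy: {pca_knn_accuracy:.3f}")
```

> Wrapping both steps in one pipeline keeps the projection and the classifier in sync: the test images are transformed with the components learned from the training images.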
>
> **Question 6**: Train a KNN Classifier on the Wine dataset with and
> without feature scaling. Compare model accuracy in both cases.
>
> **Answer 6 :-**  
> **Python Code (With and Without Scaling)**  
> import numpy as np  
> from sklearn.datasets import load_wine  
> from sklearn.model_selection import train_test_split  
> from sklearn.preprocessing import StandardScaler  
> from sklearn.neighbors import KNeighborsClassifier  
> from sklearn.metrics import accuracy_score
>
> \# 1. Load dataset  
> wine = load_wine()  
> X, y = wine.data, wine.target
>
> \# 2. Train-test split  
> X_train, X_test, y_train, y_test = train_test_split(  
>     X, y, test_size=0.2, random_state=42, stratify=y  
> )
>
> \# -------------------------  
> \# Case 1: Without Scaling  
> \# -------------------------  
> knn_no_scale = KNeighborsClassifier(n_neighbors=5)  
> knn_no_scale.fit(X_train, y_train)  
> y_pred_no_scale = knn_no_scale.predict(X_test)  
> acc_no_scale = accuracy_score(y_test, y_pred_no_scale)
>
> \# -------------------------  
> \# Case 2: With Scaling  
> \# -------------------------  
> scaler = StandardScaler()  
> X_train_scaled = scaler.fit_transform(X_train)  
> X_test_scaled = scaler.transform(X_test)
>
> knn_scaled = KNeighborsClassifier(n_neighbors=5)  
> knn_scaled.fit(X_train_scaled, y_train)  
> y_pred_scaled = knn_scaled.predict(X_test_scaled)  
> acc_scaled = accuracy_score(y_test, y_pred_scaled)
>
> \# Print results  
> print("KNN Accuracy without Scaling:", acc_no_scale)  
> print("KNN Accuracy with Scaling :", acc_scaled)
>
> **Expected Results (Approximate)**  
> ● **Without Scaling** → Accuracy around **0.70 – 0.75**
>
> ● **With Scaling** → Accuracy improves to around **0.95 – 1.00**
>
> **Question 7:** Train a PCA model on the Wine dataset and print the
> explained variance ratio of each principal component.
>
> **Answer 7 :-**  
> **Python Code**  
> import numpy as np  
> from sklearn.datasets import load_wine  
> from sklearn.preprocessing import StandardScaler  
> from sklearn.decomposition import PCA
>
> \# 1. Load dataset  
> wine = load_wine()  
> X, y = wine.data, wine.target
>
> \# 2. Standardize features (important before PCA)  
> scaler = StandardScaler()  
> X_scaled = scaler.fit_transform(X)
>
> \# 3. Apply PCA  
> pca = PCA()  
> X_pca = pca.fit_transform(X_scaled)
>
> \# 4. Print explained variance ratio  
> print("Explained Variance Ratio of each Principal Component:")  
> for i, ratio in enumerate(pca.explained_variance_ratio\_):  
>     print(f"PC{i+1}: {ratio:.4f}")
>
> **Expected Output (approximate)**  
> Explained Variance Ratio of each Principal Component:  
> PC1: 0.3620  
> PC2: 0.1921  
> PC3: 0.1112  
> PC4: 0.0707  
> PC5: 0.0656  
> PC6: 0.0494  
> PC7: 0.0424  
> PC8: 0.0268  
> PC9: 0.0222  
> PC10: 0.0193  
> PC11: 0.0174  
> PC12: 0.0130  
> PC13: 0.0080
>
> **Question 8:** Train a KNN Classifier on the PCA-transformed dataset
> (retain top 2 components). Compare the accuracy with the original
> dataset.
>
> **Answer 8 :-**
>
> **KNN on PCA-transformed Dataset vs Original**
>
> **Python Code**  
> import numpy as np  
> from sklearn.datasets import load_wine  
> from sklearn.model_selection import train_test_split  
> from sklearn.preprocessing import StandardScaler  
> from sklearn.decomposition import PCA  
> from sklearn.neighbors import KNeighborsClassifier  
> from sklearn.metrics import accuracy_score
>
> \# 1. Load dataset  
> wine = load_wine()  
> X, y = wine.data, wine.target
>
> \# 2. Train-test split  
> X_train, X_test, y_train, y_test = train_test_split(  
>     X, y, test_size=0.2, random_state=42, stratify=y  
> )
>
> \# -------------------------  
> \# Case 1: KNN on original (scaled) data  
> \# -------------------------  
> scaler = StandardScaler()  
> X_train_scaled = scaler.fit_transform(X_train)  
> X_test_scaled = scaler.transform(X_test)
>
> knn_original = KNeighborsClassifier(n_neighbors=5)  
> knn_original.fit(X_train_scaled, y_train)  
> y_pred_original = knn_original.predict(X_test_scaled)  
> acc_original = accuracy_score(y_test, y_pred_original)
>
> \# -------------------------  
> \# Case 2: KNN on PCA-transformed data (top 2 PCs)  
> \# -------------------------  
> pca = PCA(n_components=2)  
> X_train_pca = pca.fit_transform(X_train_scaled)  
> X_test_pca = pca.transform(X_test_scaled)
>
> knn_pca = KNeighborsClassifier(n_neighbors=5)  
> knn_pca.fit(X_train_pca, y_train)  
> y_pred_pca = knn_pca.predict(X_test_pca)  
> acc_pca = accuracy_score(y_test, y_pred_pca)
>
> \# Print results  
> print("KNN Accuracy on Original Scaled Data:", acc_original)  
> print("KNN Accuracy on PCA (2 components):", acc_pca)
>
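> **Expected Results (Approximate)**  
> ● **KNN on original (scaled) data** → Accuracy ≈ **0.95 – 1.00**
>
> ● **KNN on PCA (2 PCs)** → Accuracy is usually close to the full-feature
> result and can be slightly lower, because two components keep only
> about 55% of the variance; the exact value depends on the split.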
>
> **Question 9:** Train a KNN Classifier with different distance metrics
> (euclidean, manhattan) on the scaled Wine dataset and compare the
> results.
>
> **Answer 9 :-**
>
> **KNN with Euclidean vs Manhattan Distance on Wine Dataset**
>
> **Python Code**  
> import numpy as np  
> from sklearn.datasets import load_wine  
> from sklearn.model_selection import train_test_split  
> from sklearn.preprocessing import StandardScaler  
> from sklearn.neighbors import KNeighborsClassifier  
> from sklearn.metrics import accuracy_score
>
> \# 1. Load dataset  
> wine = load_wine()  
> X, y = wine.data, wine.target
>
> \# 2. Train-test split  
> X_train, X_test, y_train, y_test = train_test_split(  
>     X, y, test_size=0.2, random_state=42, stratify=y  
> )
>
> \# 3. Scale features  
> scaler = StandardScaler()  
> X_train_scaled = scaler.fit_transform(X_train)  
> X_test_scaled = scaler.transform(X_test)
>
> \# -------------------------  
> \# Case 1: Euclidean Distance (p=2)  
> \# -------------------------  
> knn_euclidean = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)  
> knn_euclidean.fit(X_train_scaled, y_train)  
> y_pred_euclidean = knn_euclidean.predict(X_test_scaled)  
> acc_euclidean = accuracy_score(y_test, y_pred_euclidean)
>
> \# -------------------------  
> \# Case 2: Manhattan Distance (p=1)  
> \# -------------------------  
> knn_manhattan = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=1)  
> knn_manhattan.fit(X_train_scaled, y_train)  
> y_pred_manhattan = knn_manhattan.predict(X_test_scaled)  
> acc_manhattan = accuracy_score(y_test, y_pred_manhattan)
>
> \# Print results  
> print("KNN Accuracy (Euclidean Distance):", acc_euclidean)  
> print("KNN Accuracy (Manhattan Distance):", acc_manhattan)
>
> **Expected Results (Approximate)**  
> ● **Euclidean (p=2)** → Accuracy ≈ **0.95 – 1.00**
>
> ● **Manhattan (p=1)** → Accuracy ≈ **0.90 – 0.95**
>
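>
> Both metrics are special cases of the Minkowski distance selected by the parameter p. A minimal sketch of the formula, with a hypothetical helper name and two made-up points:

```python
def minkowski(a, b, p):
    """Minkowski distance; p=1 gives Manhattan, p=2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

point_a = (0.0, 0.0)
point_b = (3.0, 4.0)

manhattan = minkowski(point_a, point_b, p=1)  # |3| + |4| = 7
euclidean = minkowski(point_a, point_b, p=2)  # sqrt(9 + 16) = 5
print(manhattan, euclidean)  # 7.0 5.0
```

>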
>
> **Question 10**: You are working with a high-dimensional gene
> expression dataset to classify patients with different types of
> cancer. Due to the large number of features and a small number of
> samples, traditional models overfit. Explain how you would:  
> ● Use PCA to reduce dimensionality  
> ● Decide how many components to keep  
> ● Use KNN for classification post-dimensionality reduction  
> ● Evaluate the model  
> ● Justify this pipeline to your stakeholders as a robust solution for
> real-world biomedical data
>
> **Answer 10 :-**
>
> **Problem Context**
>
> ● Gene expression datasets typically have thousands of features
> (genes) but only a few hundred patients (samples).
>
> ● This imbalance causes overfitting in traditional models, since
> they try to fit noise in high-dimensional space.
>
> ● To handle this, we combine PCA for dimensionality reduction
> with KNN for classification.
>
> **Step 1: Use PCA to Reduce Dimensionality**
>
> ●​ PCA projects the high-dimensional gene expression data into a
> lower-dimensional space.​
>
> ● It keeps the directions of **maximum variance (principal
> components)** and drops low-variance directions, which often carry
> noise and redundancy.
>
> **Step 2: Decide How Many Components to Keep**
>
> ● We use the **explained variance ratio** to decide.
>
> ● A **cumulative explained variance plot** (often shown alongside a
> scree plot) helps select the smallest number of components that
> explain **90–95%** of the variance.
>
> **Step 3: KNN for Classification**
>
> ●​ After reducing dimensionality, train a **KNN classifier** on the
> transformed data.​
>
> ●​ KNN benefits from PCA because distances are more meaningful in low
> dimensions.
>
> **Step 4: Model Evaluation**
>
> ● Use **cross-validation** on the training set to tune parameters
> (number of components, neighbors, distance metric), fitting the scaler
> and PCA inside each fold so no validation data leaks into them.
>
> ● Evaluate on a held-out **test set** using **accuracy, confusion
> matrix, and classification report**.
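>
> The metrics named in this step are simple counts over (actual, predicted) pairs. The sketch below computes accuracy and per-class recall by hand on made-up labels for three hypothetical subtypes; it is meant to show what the reported numbers mean, not to replace a metrics library.

```python
from collections import Counter

def confusion_counts(y_true, y_pred):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(y_true, y_pred))

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the actual label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall_per_class(y_true, y_pred):
    """For each actual class, the fraction of its samples predicted correctly."""
    pairs = confusion_counts(y_true, y_pred)
    totals = Counter(y_true)
    return {label: pairs[(label, label)] / totals[label] for label in totals}

# Hypothetical held-out labels for three cancer subtypes
y_true = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "C"]
y_pred = ["A", "A", "A", "B", "B", "B", "B", "C", "C", "A"]

test_accuracy = accuracy(y_true, y_pred)
test_recall = recall_per_class(y_true, y_pred)
print(test_accuracy)  # 0.8
print(test_recall)
```

> Per-class recall is worth reporting next to accuracy in a clinical setting, because a good overall score can hide a subtype that is frequently missed.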
>
> **Step 5: Justification to Stakeholders**  
> ●​ **Robustness**: PCA reduces noise, avoids overfitting, and speeds up
> computation.​
>
> ● **Interpretability**: PCA compresses thousands of genes into a few
> components while retaining most variance, and each component's loadings
> show which genes contribute most to it.
>
> ●​ **Accuracy**: KNN works better in reduced space, providing reliable
> cancer classification.​
>
> ●​ **Generalizability**: The pipeline is less likely to overfit and
> more likely to work on new patients.
>
> **<u>Python Implementation</u>**  
> import numpy as np  
> import pandas as pd  
> from sklearn.datasets import make_classification  
> from sklearn.model_selection import train_test_split, StratifiedKFold, GridSearchCV  
> from sklearn.preprocessing import StandardScaler  
> from sklearn.decomposition import PCA  
> from sklearn.pipeline import Pipeline  
> from sklearn.neighbors import KNeighborsClassifier  
> from sklearn.metrics import accuracy_score, classification_report, confusion_matrix  
> import matplotlib.pyplot as plt
>
> \# 1. Simulate a high-dimensional gene expression dataset  
> X, y = make_classification(  
>     n_samples=180, \# few patients  
>     n_features=5000, \# thousands of genes  
>     n_informative=60,  
>     n_classes=3,  
>     class_sep=2.0,  
>     random_state=42  
> )
>
> X_train, X_test, y_train, y_test = train_test_split(  
>     X, y, test_size=0.3, stratify=y, random_state=42  
> )
>
> \# 2. PCA for variance analysis  
> scaler = StandardScaler()  
> X_train_scaled = scaler.fit_transform(X_train)
>
> pca_full = PCA().fit(X_train_scaled)  
> cum_var = np.cumsum(pca_full.explained_variance_ratio\_)
>
> plt.figure(figsize=(7,4))  
> plt.plot(np.arange(1, len(cum_var)+1), cum_var, marker=".")  
> plt.axhline(y=0.95, color="r", linestyle="--")  
> plt.xlabel("Number of Principal Components")  
> plt.ylabel("Cumulative Explained Variance")  
> plt.title("Scree Plot - PCA on Gene Expression Data")  
> plt.show()
>
> \# 3. Build PCA + KNN pipeline  
> pipe = Pipeline(\[  
>     ("scaler", StandardScaler()),  
>     ("pca", PCA(svd_solver="full", random_state=42)),  
>     ("knn", KNeighborsClassifier())  
> \])
>
> \# Hyperparameter tuning  
> param_grid = {  
>     "pca\_\_n_components": \[0.90, 0.95, 0.99\], \# keep 90%, 95%, 99% variance  
>     "knn\_\_n_neighbors": \[3, 5, 7, 9\],  
>     "knn\_\_p": \[1, 2\], \# Manhattan vs Euclidean  
>     "knn\_\_weights": \["uniform", "distance"\]  
> }
>
> cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)  
> grid = GridSearchCV(pipe, param_grid, scoring="accuracy", cv=cv, n_jobs=-1)  
> grid.fit(X_train, y_train)
>
> \# 4. Evaluate on test set  
> best_model = grid.best_estimator\_  
> y_pred = best_model.predict(X_test)
>
> print("Best Parameters:", grid.best_params\_)  
> print("Best CV Accuracy: {:.3f}".format(grid.best_score\_))  
> print("Test Accuracy: {:.3f}".format(accuracy_score(y_test, y_pred)))  
> print("\nClassification Report:\n", classification_report(y_test, y_pred))  
> print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
>
> **Expected Results**
>
> ● **Scree Plot** → roughly 100–125 components needed to capture 95%
> variance (the training set has 126 samples, so at most 125 components
> carry any variance).
>
> ●​ **Best Params** → something like:
>
> {'pca\_\_n_components': 0.95, 'knn\_\_n_neighbors': 5, 'knn\_\_p': 2,
> 'knn\_\_weights': 'distance'}
>
> ● **Accuracy** (indicative only; results on simulated data vary) →  
> ○ CV Accuracy ≈ **0.85 – 0.90**
>
> ○ Test Accuracy ≈ **0.80 – 0.90**