### Assignment: Pokémon Data Analysis using Dimensionality Reduction and Clustering  
**Objective**: Explore the Pokémon dataset using dimensionality reduction techniques (PCA, UMAP, t-SNE) and clustering algorithms (k-means, DBSCAN, Agglomerative Clustering) to uncover patterns in Pokémon attributes and types.  

---

### **Tasks**  
#### **1. Data Preprocessing**  
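- Load the dataset and handle missing values (e.g., `Type 2`, which is empty for single-type Pokémon).  
- Standardize or normalize the numerical features (`Total`, `HP`, `Attack`, `Defense`, `Sp. Atk`, `Sp. Def`, `Speed`, `Height`, `Weight`, `BMI`) so that large-scale columns such as `Weight` do not dominate distance-based methods.  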
- Encode categorical variables (`Type 1`, `Type 2`, `Generation`) if needed.  
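
The preprocessing steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the rows are made-up placeholders (not real Pokémon values), only three of the numerical columns are shown, and in practice the frame would come from something like `pd.read_csv("pokemon.csv")`, where the file name is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up placeholder rows; replace with the real dataset,
# e.g. pd.read_csv("pokemon.csv") (hypothetical file name).
df = pd.DataFrame({
    "Name": ["mon_a", "mon_b", "mon_c", "mon_d", "mon_e"],
    "Type 1": ["Grass", "Fire", "Water", "Fire", "Grass"],
    "Type 2": ["Poison", np.nan, np.nan, "Flying", np.nan],
    "HP": [45, 60, 80, 39, 100],
    "Attack": [50, 62, 82, 52, 70],
    "Defense": [49, 63, 83, 43, 120],
    "Generation": [1, 1, 2, 2, 3],
})

num_cols = ["HP", "Attack", "Defense"]
cat_cols = ["Type 1", "Type 2", "Generation"]

# A missing Type 2 means "single-type", so label it instead of dropping rows.
df["Type 2"] = df["Type 2"].fillna("None")

# Standardize: zero mean and unit variance per numerical column.
scaler = StandardScaler()
X_num = pd.DataFrame(
    scaler.fit_transform(df[num_cols]), columns=num_cols, index=df.index
)

# One-hot encode the categorical columns (Generation is treated as a category here).
X_cat = pd.get_dummies(df[cat_cols], columns=cat_cols, dtype=int)

# Final feature matrix for the later steps.
X = pd.concat([X_num, X_cat], axis=1)
print(X.head())
```

Whether to include the one-hot columns in the feature matrix is a design choice: leaving them out keeps types and generations available as external labels for judging the clusters later.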

#### **2. Exploratory Data Analysis (EDA)**  
- Visualize distributions of numerical features (e.g., histograms, boxplots).  
- Analyze correlations between features (e.g., `Total` vs. individual stats, `BMI` vs. `Weight`).  
- Compare `Type 1` and `Type 2` distributions across generations.  
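
As a starting point for the EDA items above, the numeric side can be computed with pandas alone. The sketch below uses randomly generated stand-in data (so the numbers carry no meaning) and assumes the column names from the task list; plotting is left out so that the block runs without a plotting backend.

```python
import numpy as np
import pandas as pd

# Random stand-in data with the assignment's column names (not real values).
rng = np.random.default_rng(0)
n = 200
stats = ["HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed"]
df = pd.DataFrame(rng.integers(20, 150, size=(n, len(stats))), columns=stats)
df["Total"] = df[stats].sum(axis=1)
df["Type 1"] = rng.choice(["Fire", "Water", "Grass"], size=n)
df["Generation"] = rng.integers(1, 4, size=n)

# Distribution summary (count, mean, std, quartiles) per numerical feature.
summary = df[stats + ["Total"]].describe()

# Pearson correlation matrix, then each stat's correlation with Total.
corr = df[stats + ["Total"]].corr()
total_corr = corr["Total"].drop("Total").sort_values(ascending=False)

# Share of each primary type within each generation (rows sum to 1).
type_by_gen = pd.crosstab(df["Generation"], df["Type 1"], normalize="index")

print(summary.round(1))
print(total_corr.round(2))
print(type_by_gen.round(2))
```

For the visual part, `df[stats].hist()` and `df[stats].boxplot()` are the corresponding pandas plotting calls; both require matplotlib to be installed.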

#### **3. Dimensionality Reduction**  
Apply **three techniques** to reduce the dataset to 2D:  
1. **PCA**:  
   - Compute principal components and plot explained variance ratio.  
   - Visualize the 2D projection.  
2. **UMAP**:  
   - Tune hyperparameters (e.g., `n_neighbors`, `min_dist`).  
   - Compare results with PCA.  
3. **t-SNE**:  
   - Experiment with perplexity values (e.g., 5, 30, 50).  
   - Visualize the embeddings and discuss computation time.  
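
The three techniques above might be run along these lines. This is a hedged sketch: the input is random stand-in data rather than the Pokémon features, the UMAP part assumes the third-party `umap-learn` package and is skipped if it is not installed, and the hyperparameter values shown are just starting points, not recommendations.

```python
import time

import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Random stand-in for the standardized feature matrix (150 samples, 8 features).
rng = np.random.default_rng(0)
X = StandardScaler().fit_transform(rng.normal(size=(150, 8)))

# PCA: fit all components to inspect the explained variance, then project to 2D.
pca_full = PCA().fit(X)
cum_var = np.cumsum(pca_full.explained_variance_ratio_)
X_pca = PCA(n_components=2).fit_transform(X)

# t-SNE: one embedding per perplexity, timing each run.
tsne_results = {}
for perplexity in (5, 30, 50):
    start = time.perf_counter()
    emb = TSNE(
        n_components=2, perplexity=perplexity, init="pca", random_state=0
    ).fit_transform(X)
    tsne_results[perplexity] = (emb, time.perf_counter() - start)

# UMAP: optional dependency (umap-learn); skipped when unavailable.
try:
    import umap

    X_umap = umap.UMAP(
        n_neighbors=15, min_dist=0.1, random_state=0
    ).fit_transform(X)
except ImportError:
    X_umap = None

print("cumulative explained variance:", cum_var.round(3))
for perplexity, (emb, seconds) in tsne_results.items():
    print(f"t-SNE perplexity={perplexity}: {seconds:.2f}s, shape={emb.shape}")
```

Each embedding is an `(n_samples, 2)` array, so all three can be passed to the same scatter-plot routine for a side-by-side comparison.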

#### **4. Clustering**  
Apply **three clustering algorithms** to both the reduced 2D space and the original feature space:  
1. **K-means**:  
   - Use the elbow method to choose `k`.  
   - Compare clusters with Pokémon types/generations.  
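2. **DBSCAN**:  
   - Tune `eps` and `min_samples` to handle noise.  
   - Analyze how stable the clusters are as `eps` and `min_samples` vary.  
3. **Agglomerative Clustering**:  
   - Compare linkage criteria (e.g., Ward, average) and choose the number of clusters.  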
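
A minimal sketch of the clustering step, covering the algorithms named in the objective (k-means, DBSCAN, Agglomerative Clustering). The data comes from scikit-learn's `make_blobs` as a stand-in for the Pokémon features, and the values of `k`, `eps`, and `min_samples` are illustrative assumptions that would need tuning on the real data.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 300 points in 6 dimensions around 4 centers.
X, _ = make_blobs(n_samples=300, centers=4, n_features=6, random_state=0)
X = StandardScaler().fit_transform(X)

# Elbow method: record the within-cluster sum of squares (inertia) for each k
# and look for the k after which the curve flattens.
inertias = {}
for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_

kmeans_labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# DBSCAN: label -1 marks noise points; eps and min_samples are guesses here.
db_labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(X)
n_noise = int((db_labels == -1).sum())
n_db_clusters = len(set(db_labels)) - (1 if -1 in db_labels else 0)

# Agglomerative clustering with Ward linkage.
agg_labels = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(X)

print({k: round(v, 1) for k, v in inertias.items()})
print("DBSCAN clusters:", n_db_clusters, "noise points:", n_noise)
```

The same code applies unchanged to a 2D embedding: pass the reduced array instead of `X` to compare clustering in the reduced and original spaces.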

#### **5. Evaluation & Interpretation**  
- Use silhouette scores to evaluate clustering quality.  
- Interpret clusters: Do they group Pokémon by type, generation, or stats (e.g., high `Attack` vs. high `Speed`)?  
- Compare results across dimensionality reduction techniques:  
  - Which method preserves local/global structure best?  
  - Do clusters align with known Pokémon categories (e.g., Legendary, Starter)?  
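
One way to approach the evaluation items above is to score several values of `k` with the silhouette coefficient and then cross-tabulate the resulting clusters against a known label. The sketch below again uses `make_blobs` stand-in data, and the type names attached to the synthetic groups are purely hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Synthetic stand-in data; "groups" plays the role of a known category.
X, groups = make_blobs(n_samples=300, centers=3, n_features=6, random_state=0)
type_names = np.array(["Fire", "Water", "Grass"])[groups]  # hypothetical labels

# Silhouette score per k (ranges from -1 to 1; higher means better separation).
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
best_labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)

# Contingency table: how each cluster splits across the known categories.
table = pd.crosstab(best_labels, type_names, rownames=["cluster"], colnames=["type"])

# Adjusted Rand index: agreement between clusters and labels, corrected for chance.
ari = adjusted_rand_score(type_names, best_labels)

print({k: round(s, 3) for k, s in scores.items()})
print(table)
print("ARI:", round(ari, 3))
```

When scoring DBSCAN output this way, it is common to exclude the noise points (label `-1`) first, since they do not form a cluster of their own.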

#### **6. Bonus (Optional)**  
- Explore 3D projections using PCA/UMAP.  
- Incorporate `Generation` or `BMI` as a color axis in visualizations.  
- Analyze clusters of Pokémon with dual types (`Type 1` + `Type 2`).  

---


### **Questions to Answer**  
- Does `Generation` correlate with stat improvements (e.g., higher `Total` in later generations)?  
- Are dual-type Pokémon harder to cluster than single-type?  
- Which technique (PCA/UMAP/t-SNE) is most suitable for this dataset? Why?  
- How do clusters relate to in-game roles (e.g., tanks, sweepers)?  
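
For the first question, a per-generation summary plus a rank correlation is a reasonable first check. The sketch below runs on random stand-in data, so its output says nothing about the real dataset; only the column names `Generation`, `Total`, and `Type 2` are taken from the assignment.

```python
import numpy as np
import pandas as pd

# Random stand-in data (no real relationship between the columns is implied).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "Generation": rng.integers(1, 7, size=n),
    "Total": rng.integers(180, 700, size=n),
})
# Roughly half of the rows get no second type (NaN), mimicking single-type entries.
df["Type 2"] = pd.Series(rng.choice(["Flying", "Poison"], size=n)).mask(
    rng.random(n) < 0.5
)

# Mean, median, and count of Total per generation.
total_by_gen = df.groupby("Generation")["Total"].agg(["mean", "median", "count"])

# Spearman rank correlation between Generation and Total (monotonic trend).
rho = df[["Generation", "Total"]].corr(method="spearman").loc["Generation", "Total"]

# Flag dual-type rows so later analyses can compare the two groups.
df["is_dual"] = df["Type 2"].notna()
total_by_dual = df.groupby("is_dual")["Total"].mean()

print(total_by_gen.round(1))
print("Spearman rho:", round(rho, 3))
print(total_by_dual.round(1))
```

The `is_dual` flag can also be used to group per-sample silhouette values, which gives one concrete way to ask whether dual-type Pokémon are harder to cluster.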

---

