### 🍄 Mushroom Classification Using Machine Learning

#### 📌 Project Overview

This project explores the classification of mushrooms as **edible** or **poisonous** using machine learning. The goal is to build accurate models that can predict mushroom safety based on various physical attributes.

The dataset contains **categorical features** describing cap shape, gill color, odor, stalk characteristics, and more, which makes it a good real-world example for applying classification algorithms.

---

#### 🧪 Dataset Details

- **Source**: UCI Mushroom Dataset  
- **Rows**: 8,124 mushroom samples  
- **Columns**: 22 categorical features + 1 target label (`edible` or `poisonous`)  
- **Target Variable**: `class` (e = edible, p = poisonous)

Each mushroom is described by features like:

- `odor`
- `gill-color`
- `stalk-shape`
- `spore-print-color`
- `veil-type`, etc.

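As a rough sketch, the data described above might be loaded with pandas as follows. This assumes the raw UCI file, which ships without a header row (a CSV that already has one would not need `names`). The file name `agaricus-lepiota.data` and the helper `load_mushrooms` are assumptions for illustration, and the column names follow the UCI attribute list.

```python
import pandas as pd

# Column order assumed from the UCI attribute list: target first, then 22 features.
COLUMNS = [
    "class", "cap-shape", "cap-surface", "cap-color", "bruises", "odor",
    "gill-attachment", "gill-spacing", "gill-size", "gill-color",
    "stalk-shape", "stalk-root", "stalk-surface-above-ring",
    "stalk-surface-below-ring", "stalk-color-above-ring",
    "stalk-color-below-ring", "veil-type", "veil-color", "ring-number",
    "ring-type", "spore-print-color", "population", "habitat",
]


def load_mushrooms(path_or_buffer):
    """Read a header-less, comma-separated mushroom file into a DataFrame.

    `load_mushrooms` is a hypothetical helper, not part of the project code.
    """
    # dtype=str keeps the single-letter category codes as plain text
    return pd.read_csv(path_or_buffer, header=None, names=COLUMNS, dtype=str)


# Example usage (the path is an assumption):
# df = load_mushrooms("agaricus-lepiota.data")
# print(df.shape)                    # (number of rows, 23)
# print(df["class"].value_counts())  # counts of 'e' and 'p'
```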
---

#### 🧹 Preprocessing

Since the dataset is entirely **categorical**, I applied **Label Encoding** to convert categories into numerical values:

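```python
from sklearn.preprocessing import LabelEncoder

# df: pandas DataFrame holding the mushroom data (every column is categorical)
label_encoders = {}
for column in df.columns:
    le = LabelEncoder()
    # Map each category in this column to an integer code
    df[column] = le.fit_transform(df[column])
    # Keep the fitted encoder so codes can be mapped back to labels later
    label_encoders[column] = le
```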

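One caveat worth noting: label encoding assigns arbitrary integer codes, which distance- and margin-based models (KNN, SVM, logistic regression) may read as an ordering. A possible alternative, not what this project used, is to one-hot encode the features and label-encode only the target. The sketch below assumes a DataFrame with a `class` column; `encode_features` is a hypothetical helper.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder


def encode_features(df, target="class"):
    """One-hot encode the feature columns and label-encode the target.

    Hypothetical helper for illustration.
    Returns (X, y, feature_encoder, target_encoder).
    """
    features = df.drop(columns=[target])
    # handle_unknown="ignore" encodes categories unseen during fit as all zeros
    feature_encoder = OneHotEncoder(handle_unknown="ignore")
    # Result is a sparse matrix with one column per (feature, category) pair
    X = feature_encoder.fit_transform(features)
    target_encoder = LabelEncoder()
    # Classes are sorted alphabetically, so 'e' -> 0 and 'p' -> 1
    y = target_encoder.fit_transform(df[target])
    return X, y, feature_encoder, target_encoder
```

The fitted encoders are returned so that new samples can be transformed the same way and predictions mapped back to `e` / `p`.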
#### 🤖 Models Trained

I experimented with multiple classification algorithms:

- **Logistic Regression**
- **Random Forest Classifier**
- **Support Vector Machine (SVM)**
- **K-Nearest Neighbors (KNN)**

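A minimal sketch of how the four models listed above might be set up and fitted with scikit-learn. The hyperparameters, the 80/20 stratified split, and the helper name `fit_models` are assumptions, not the settings used in this project.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC


def fit_models(X, y, test_size=0.2, random_state=42):
    """Fit the four classifiers on a stratified train split.

    Hypothetical helper. Returns (fitted_models, X_test, y_test).
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=random_state
    )
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Random Forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
        # probability=True enables predict_proba, which ROC-AUC scoring needs
        "SVM": SVC(probability=True, random_state=random_state),
        "KNN": KNeighborsClassifier(n_neighbors=5),
    }
    for model in models.values():
        model.fit(X_train, y_train)
    return models, X_test, y_test
```

Holding out a test split before fitting keeps the later metrics from being measured on data the models have already seen.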
#### 📈 Model Evaluation

Each model was evaluated using:

- **Accuracy**
- **Precision**
- **Recall**
- **F1 Score**
- **ROC curve and AUC**
- **Cross-validation**
- **Confusion Matrix**

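The metrics listed above could be computed along these lines. This is a sketch under assumptions: binary labels encoded as 0/1 with 1 = poisonous, a fitted model that exposes `predict_proba`, and a hypothetical helper name `evaluate`.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import cross_val_score


def evaluate(model, X_test, y_test, X_all=None, y_all=None, cv=5):
    """Return held-out metrics, plus mean CV accuracy if the full data is given."""
    y_pred = model.predict(X_test)
    # Probability of the positive class (label 1) drives the ROC-AUC score
    y_score = model.predict_proba(X_test)[:, 1]
    results = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
        # Rows are true classes, columns are predicted classes
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }
    if X_all is not None and y_all is not None:
        # cross_val_score clones the model and refits it on each fold
        scores = cross_val_score(model, X_all, y_all, cv=cv)
        results["cv_accuracy_mean"] = scores.mean()
    return results
```

For an edible-versus-poisonous task, recall on the poisonous class is arguably the metric to watch most closely, since a false negative means a poisonous mushroom labelled as edible.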
#### ✅ Best Model: Random Forest

Random Forest achieved perfect scores on every metric listed above, which reflects the dataset's strong predictive features, especially `odor`.

#### 📉 Feature Selection

Using Recursive Feature Elimination (RFE) and feature importance from Random Forest, I identified the most impactful features:

- `odor`
- `spore-print-color`
- `gill-size`

These features alone provided high classification accuracy, which underlines how much of the predictive signal they carry.

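One way the two techniques named above could be combined is sketched below. It assumes `X` is a DataFrame of numerically encoded features and `y` the encoded target; the helper name `rank_features`, the number of trees, and the choice of keeping three features are assumptions for illustration.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE


def rank_features(X, y, n_features_to_select=3, random_state=42):
    """Return (importances sorted descending, list of RFE-selected columns)."""
    forest = RandomForestClassifier(n_estimators=100, random_state=random_state)
    forest.fit(X, y)
    # Impurity-based importances: one value per column
    importances = pd.Series(forest.feature_importances_, index=X.columns)
    importances = importances.sort_values(ascending=False)

    # RFE refits the estimator repeatedly, dropping the weakest feature each round
    selector = RFE(
        RandomForestClassifier(n_estimators=100, random_state=random_state),
        n_features_to_select=n_features_to_select,
    )
    selector.fit(X, y)
    # support_ is a boolean mask over the columns that survived elimination
    selected = list(X.columns[selector.support_])
    return importances, selected
```

Comparing the top of `importances` with `selected` gives a quick check on whether both methods agree on the same features.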
#### 📊 Visualization

Histograms revealed clear trends in feature distributions, especially in `odor`, where some values are heavily associated with poisonous mushrooms. This is consistent with the models' high performance.

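A per-category count plot of this kind might look like the sketch below. It uses pandas' matplotlib-backed plotting on the un-encoded DataFrame; the project's own plots may have been drawn differently, and both helper names are hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd


def class_counts_by_feature(df, feature="odor", target="class"):
    """Cross-tabulate a feature against the target.

    Rows are feature values, columns are classes, cells are sample counts.
    """
    return pd.crosstab(df[feature], df[target])


def plot_feature_by_class(df, feature="odor", target="class"):
    """Draw a grouped bar chart of class counts per feature value; return the Axes."""
    counts = class_counts_by_feature(df, feature, target)
    ax = counts.plot(kind="bar")
    ax.set_xlabel(feature)
    ax.set_ylabel("count")
    ax.set_title(f"{feature} by {target}")
    return ax


# Example usage (df is the raw mushroom DataFrame):
# plot_feature_by_class(df, "odor")
# plt.show()
```

A feature value whose bar appears for only one class is one that separates edible from poisonous samples on its own.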


#### 🤝 Credits

**Dataset**: UCI Machine Learning Repository

**Tools**: Python, pandas, scikit-learn, matplotlib, seaborn
