# 🍄 Mushroom Classification using Machine Learning
This notebook demonstrates how to build a machine learning model to classify mushrooms as edible or poisonous based on their physical characteristics.
Dataset source: [UCI Mushroom Classification Dataset](https://www.kaggle.com/datasets/uciml/mushroom-classification)

In [None]:
# 📦 Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

In [None]:
# 📁 Load Dataset
df = pd.read_csv('mushrooms.csv')
df.head()

In [None]:
# 🔍 Dataset Overview
df.info()
df.describe()

In [None]:
# 🎯 Check for Missing Values
df.isnull().sum()
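
One caveat, stated as an assumption about this particular CSV: in the UCI mushroom data, missing values are recorded as the character `?` (in the `stalk-root` column) rather than as NaN, so `isnull()` may report zero for every column. Below is a minimal sketch of counting such placeholders. The helper name `count_placeholder` and the small stand-in frame are illustrative only; in the notebook the helper would be called on `df` before the label-encoding cell.

```python
import pandas as pd

def count_placeholder(frame: pd.DataFrame, placeholder: str = "?") -> pd.Series:
    """Count the cells in each column that are equal to `placeholder`."""
    return (frame == placeholder).sum()

# Stand-in frame for illustration; in the notebook, pass `df` instead.
demo = pd.DataFrame({
    "class": ["e", "p", "e", "p"],
    "stalk-root": ["b", "?", "?", "c"],
})

placeholder_counts = count_placeholder(demo)
print(placeholder_counts)
```

If a column turns out to contain many placeholders, treating `?` as its own category (which label encoding does implicitly) is one option; dropping the column is another.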

In [None]:
# 📊 Visualize Class Distribution
sns.countplot(x='class', data=df)
plt.title('Distribution of Edible vs Poisonous Mushrooms')
plt.show()

In [None]:
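# 🧹 Preprocessing: Label Encoding
# Fit one encoder per column and keep it, so that the integer codes
# can later be mapped back to the original category letters.
encoders = {}
for col in df.columns:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])
df.head()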
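
Label encoding maps each category to an integer, which gives nominal features such as odor or cap shape an arbitrary numeric order. One common alternative is one-hot encoding. The sketch below assumes the raw string-valued columns (that is, the data as it is before the encoding loop) and uses a small stand-in frame; the names `raw`, `X_demo`, and `y_demo` are hypothetical and not part of the notebook.

```python
import pandas as pd

# Stand-in for the raw (string-valued) mushroom columns; illustrative only.
raw = pd.DataFrame({
    "class": ["e", "p", "e", "p"],
    "odor": ["a", "f", "n", "f"],
    "cap-shape": ["x", "x", "b", "f"],
})

# Encode the binary target as 0/1 (assumption: 'p', poisonous, is the positive class).
y_demo = (raw["class"] == "p").astype(int)

# Expand every feature column into one indicator column per category.
X_demo = pd.get_dummies(raw.drop(columns="class"), dtype=int)
print(X_demo.columns.tolist())
```

Tree-based models often cope well with integer codes, so this matters most for the scaled neural network, where the numeric distance between codes is taken literally.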

In [None]:
# 🧪 Train-Test Split
X = df.drop('class', axis=1)
y = df['class']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [None]:
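# 🔄 Feature Scaling
# The scaler is fitted on the training set only and then applied to the test set.
# Scaling matters for the MLP; tree-based models are insensitive to it.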
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

In [None]:
# 🌳 Random Forest Model
rf = RandomForestClassifier(random_state=42)
rf.fit(X_train_scaled, y_train)
rf_preds = rf.predict(X_test_scaled)
print("Random Forest Accuracy:", accuracy_score(y_test, rf_preds))

In [None]:
# 🔥 Gradient Boosting Model
gb = GradientBoostingClassifier(random_state=42)
gb.fit(X_train_scaled, y_train)
gb_preds = gb.predict(X_test_scaled)
print("Gradient Boosting Accuracy:", accuracy_score(y_test, gb_preds))

In [None]:
# 🤖 Neural Network Model
mlp = MLPClassifier(random_state=42, max_iter=500)
mlp.fit(X_train_scaled, y_train)
mlp_preds = mlp.predict(X_test_scaled)
print("Neural Network Accuracy:", accuracy_score(y_test, mlp_preds))
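
The scaler and the network can also be bundled into a single scikit-learn `Pipeline`, so that scaling is always fitted on the training portion and applied consistently at prediction time. The following is a sketch under assumptions: it uses synthetic data from `make_classification` in place of the mushroom features, and the names `pipe`, `X_demo`, and `y_demo` are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; in the notebook, use X and y from the split cell.
X_demo, y_demo = make_classification(n_samples=400, n_features=10, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Calling fit() scales the training data and then trains the network on it.
pipe = make_pipeline(StandardScaler(), MLPClassifier(random_state=42, max_iter=500))
pipe.fit(X_tr, y_tr)

# score() applies the already-fitted scaler to the test data before predicting.
pipe_accuracy = pipe.score(X_te, y_te)
print("Pipeline accuracy:", pipe_accuracy)
```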

In [None]:
# 📈 Evaluation Metrics (Random Forest)
print(confusion_matrix(y_test, rf_preds))
print(classification_report(y_test, rf_preds))
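
For this task the costly mistake is predicting "edible" for a poisonous mushroom, so recall on the poisonous class is worth reading separately from overall accuracy. Because `LabelEncoder` sorts labels alphabetically, `e` becomes 0 and `p` becomes 1. The sketch below uses small hand-written label lists as stand-ins for `y_test` and `rf_preds`.

```python
from sklearn.metrics import confusion_matrix, recall_score

# Hand-written stand-ins for y_test and rf_preds (1 = poisonous, 0 = edible).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]

# For binary labels ordered [0, 1], ravel() yields tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

# Recall on the poisonous class: the share of poisonous samples that were caught.
poison_recall = recall_score(y_true, y_pred, pos_label=1)

print("Poisonous predicted as edible (false negatives):", fn)
print("Recall on poisonous class:", poison_recall)
```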

### ✅ Conclusion
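- All three models performed very well, with Random Forest achieving the highest accuracy in this run; compare the printed accuracies directly, since the gaps between models on this dataset can be small.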
- This notebook demonstrates a basic end-to-end ML workflow using a categorical dataset.
- Next steps could include hyperparameter tuning, cross-validation, and feature importance analysis.
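
Two of the next steps listed above, cross-validation and feature importance analysis, can be sketched as follows. This is a minimal illustration on synthetic data from `make_classification`, not a result for the mushroom dataset; `X_demo`, `y_demo`, and `model` are hypothetical names, and in the notebook `X` and `y` would take their place.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data; in the notebook, use X and y from the split cell.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=42
)

model = RandomForestClassifier(n_estimators=50, random_state=42)

# 5-fold cross-validated accuracy (an integer cv uses stratified folds for classifiers).
cv_scores = cross_val_score(model, X_demo, y_demo, cv=5, scoring="accuracy")
print("CV accuracy: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))

# Impurity-based feature importances from a model fitted on all the data.
model.fit(X_demo, y_demo)
ranking = np.argsort(model.feature_importances_)[::-1]
for idx in ranking[:3]:
    print(f"feature {idx}: {model.feature_importances_[idx]:.3f}")
```

With the real data, indexing `X.columns` by `ranking` would turn the feature positions into column names.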