# 🌲 Random Forest Classifier Project
This notebook demonstrates how to use a **Random Forest Classifier** to build a predictive model.

We’ll walk through:
- Importing necessary libraries
- Loading the dataset
- Exploratory Data Analysis (EDA)
- Data preprocessing
- Model training and evaluation

**Dataset Used**: `Social_Network_Ads.csv` (or you can replace it with your own)

In [None]:
# 📦 Import Libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix
import warnings
warnings.filterwarnings('ignore')

In [None]:
# 📥 Load Dataset
df = pd.read_csv("Social_Network_Ads.csv")  # Replace with your dataset path
df.head()

## 📊 Exploratory Data Analysis (EDA)

In [None]:
df.info()
df.describe()

In [None]:
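# Leave out the identifier column so the plot shows only meaningful features
sns.pairplot(df.drop(columns=['User ID'], errors='ignore'), hue='Purchased')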
plt.show()

## 🧹 Data Preprocessing

In [None]:
# Drop ID/User columns if present
df = df.drop(columns=['User ID'], errors='ignore')

# Encode the categorical Gender column as integers
df['Gender'] = df['Gender'].map({'Male': 0, 'Female': 1})
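
`Series.map` with a dict returns `NaN` for any value missing from the dict, so an inconsistent label would silently become a missing value. Below is a minimal sketch of a check for that, using a small made-up frame (the sample values, including the lowercase `male`, are illustrative assumptions and not rows from the dataset):

```python
import pandas as pd

# Hypothetical sample; the lowercase 'male' entry is deliberately inconsistent.
sample = pd.DataFrame({"Gender": ["Male", "Female", "male"]})

encoded = sample["Gender"].map({"Male": 0, "Female": 1})

# Labels absent from the mapping come back as NaN instead of raising an error.
unmapped = sample.loc[encoded.isna(), "Gender"].unique().tolist()
print("Unmapped labels:", unmapped)
```

If the list is not empty, the labels probably need cleaning (for example with `str.strip()` or `str.title()`) before encoding.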

In [None]:
# Splitting features and labels
X = df.drop('Purchased', axis=1)
y = df['Purchased']

In [None]:
# 🔀 Train-test Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
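
A plain random split can leave the test set with a different class balance than the full data. One option is the `stratify` argument of `train_test_split`, which keeps the class proportions the same in both parts. The sketch below shows the effect on made-up labels (`X_demo` and `y_demo` are hypothetical stand-ins, not the notebook's data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 80 negatives and 20 positives.
y_demo = np.array([0] * 80 + [1] * 20)
X_demo = np.arange(100).reshape(-1, 1)

# Same split settings as the notebook, without and with stratification.
_, _, _, y_test_plain = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
_, _, _, y_test_strat = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print("Positive share, plain split:     ", y_test_plain.mean())
print("Positive share, stratified split:", y_test_strat.mean())
```

With stratification, the 20-row test set keeps the 20% positive share of the full label array.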

## 🌲 Train Random Forest Classifier

In [None]:
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

## 📈 Model Evaluation

In [None]:
y_pred = model.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))

sns.heatmap(confusion_matrix(y_test, y_pred), annot=True, fmt='d', cmap='Blues')
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.title("Confusion Matrix")
plt.show()
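
To connect the heatmap to the classification report: in scikit-learn's `confusion_matrix`, rows are actual classes and columns are predicted classes, so for a binary problem the four cells are true negatives, false positives, false negatives, and true positives. Here is a small sketch on made-up labels (`y_true_demo` and `y_pred_demo` are hypothetical, not model output) that derives precision and recall by hand:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels for illustration only.
y_true_demo = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred_demo = [0, 1, 0, 1, 0, 1, 1, 0]

# For binary labels, ravel() flattens the 2x2 matrix in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()

precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that were found

print("TN, FP, FN, TP:", tn, fp, fn, tp)
print("Precision:", precision, "Recall:", recall)
```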

## ✅ Conclusion
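- The accuracy score, classification report, and confusion matrix above show how the Random Forest classifier performs on the held-out 20% test set.
- Tuning hyperparameters such as `n_estimators` and `max_depth` with `GridSearchCV` or `RandomizedSearchCV` may improve results.
- A single train-test split gives a noisy estimate, especially on small datasets; cross-validation gives a more reliable one.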
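
The two follow-up steps above can be sketched as follows. This is a minimal illustration under stated assumptions: `X_demo` and `y_demo` are synthetic data from `make_classification` standing in for the notebook's `X` and `y`, and the grid values are arbitrary examples rather than tuned recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

# Synthetic stand-in for the notebook's features and labels.
X_demo, y_demo = make_classification(n_samples=200, n_features=4, random_state=42)

# 5-fold cross-validated accuracy of a baseline forest.
baseline = RandomForestClassifier(n_estimators=50, random_state=42)
cv_scores = cross_val_score(baseline, X_demo, y_demo, cv=5, scoring="accuracy")
print("CV accuracy: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))

# Small illustrative grid; every combination is scored with 5-fold CV.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_demo, y_demo)

print("Best params:", search.best_params_)
print("Best CV accuracy: %.3f" % search.best_score_)
```

To apply this to the notebook's data, pass `X_train` and `y_train` instead of the synthetic arrays and keep the test set aside for a final check of `search.best_estimator_`.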

➡️ You can now push this notebook to **GitHub** for version control or to share your project!