# 🌳 Decision Tree Classification
### Internship Task 1 - Machine Learning (EliteTech)

This notebook trains a Decision Tree Classifier on the **Iris dataset** (150 samples, 4 features, 3 species). It includes:
- Exploratory Data Analysis (EDA)
- Decision Tree Model Training & Visualization
- Evaluation using Accuracy, Confusion Matrix, and Classification Report
- Interactive Prediction Tool
- Scope for Improvement

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from ipywidgets import interact, widgets
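import warnings
# Hide only FutureWarnings (typically deprecation notices from library version changes);
# a blanket 'ignore' would also hide warnings that point at real problems
warnings.filterwarnings('ignore', category=FutureWarning)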


In [None]:
# Load the Iris dataset into a DataFrame
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target
# Map integer labels (0, 1, 2) to species names with NumPy indexing
df['target_name'] = iris.target_names[iris.target]
df.head()


In [None]:
# Statistical Summary
df.describe()


In [None]:
# Visualize class distribution
sns.countplot(x='target_name', data=df)
plt.title("Class Distribution")
plt.show()


In [None]:
# Correlation between the four features only (the integer-coded target is excluded)
sns.heatmap(df[iris.feature_names].corr(), annot=True, cmap='coolwarm')
plt.title("Feature Correlation Matrix")
plt.show()


In [None]:
# Feature and target selection
X = df[iris.feature_names]
y = df['target']

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Model training
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)


In [None]:
plt.figure(figsize=(12, 8))
# Pass class names as a plain list: some scikit-learn releases reject a NumPy array here
plot_tree(model, filled=True, feature_names=iris.feature_names, class_names=list(iris.target_names))
plt.title("Decision Tree Visualization")
plt.show()
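
The rules that `plot_tree` draws can also be printed as plain text, which is often easier to read for deeper trees. The following is a minimal sketch, assuming the same split and seed as above. It refits its own classifier (`clf`, a name introduced here for illustration) so that it runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# export_text renders the learned splits as indented if/else rules
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)

# Size of the fitted tree: with default settings it grows until
# the leaves are pure or cannot be split further
print("Depth:", clf.get_depth(), "| Leaves:", clf.get_n_leaves())

# Impurity-based importance of each feature (the values sum to 1)
for name, importance in zip(iris.feature_names, clf.feature_importances_):
    print(f"{name}: {importance:.3f}")
```

Reading the depth and leaf count this way gives a quick sense of how complex the unconstrained tree is before any hyperparameters are tuned.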


In [None]:
y_pred = model.predict(X_test)
print("Accuracy Score:", accuracy_score(y_test, y_pred))
print("\nConfusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred, target_names=iris.target_names))


### 📊 Evaluation Metrics Explained
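- **Accuracy**: Fraction of test samples classified correctly (correct predictions / total predictions).
- **Confusion Matrix**: Table of counts in which each row is a true class and each column is a predicted class. Diagonal entries are correct predictions; off-diagonal entries are misclassifications.
- **Precision, Recall, F1-Score**: Per-class metrics. Precision is the fraction of samples predicted as a class that truly belong to it, recall is the fraction of a class's samples that were predicted correctly, and F1 is the harmonic mean of precision and recall.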
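
To make these definitions concrete, the metrics can be recomputed by hand from the confusion matrix. This is a sketch under the assumption that the split and seed match the training cell; the classifier `clf` and the intermediate variable names are introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# scikit-learn convention: rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])

true_positives = np.diag(cm)       # correct predictions per class
predicted_totals = cm.sum(axis=0)  # column sums: how often each class was predicted
actual_totals = cm.sum(axis=1)     # row sums: how many samples each class really has

# np.maximum guards against division by zero for classes that never occur
precision = true_positives / np.maximum(predicted_totals, 1)
recall = true_positives / np.maximum(actual_totals, 1)
f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
accuracy = true_positives.sum() / cm.sum()

print("Accuracy:", accuracy)
print("Precision per class:", precision)
print("Recall per class:", recall)
print("F1 per class:", f1)
```

The per-class values should agree with the corresponding columns of `classification_report` for the same predictions.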


In [None]:
@interact(
    sepal_length=widgets.FloatSlider(min=4.0, max=8.0, step=0.1, value=5.0),
    sepal_width=widgets.FloatSlider(min=2.0, max=4.5, step=0.1, value=3.0),
    petal_length=widgets.FloatSlider(min=1.0, max=7.0, step=0.1, value=4.0),
    petal_width=widgets.FloatSlider(min=0.1, max=2.5, step=0.1, value=1.0)
)
def predict_species(sepal_length, sepal_width, petal_length, petal_width):
    # Build a one-row DataFrame with the training column names so the input
    # matches what the model was fitted on (avoids a feature-name warning)
    sample = pd.DataFrame(
        [[sepal_length, sepal_width, petal_length, petal_width]],
        columns=iris.feature_names,
    )
    prediction = model.predict(sample)[0]
    print(f"🔍 Predicted Species: {iris.target_names[prediction]}")


### 🔧 Scope for Improvement
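- Tune hyperparameters such as `max_depth`, `min_samples_split`, and `min_samples_leaf` to limit overfitting.
- Compare against ensemble classifiers such as Random Forests or Gradient Boosted Trees.
- Use k-fold cross-validation instead of a single 80/20 split; the test set here has only 30 samples, so one accuracy figure is a noisy estimate.
- Explore feature engineering techniques.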
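
The first three points can be sketched in a few lines. This is an illustrative starting point, not a tuned setup: the fold count, the grid values, and the forest size are all assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 5 stratified folds: each fold keeps the balanced class proportions of the dataset
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Cross-validated accuracy of the untuned tree and of a random forest
tree_scores = cross_val_score(DecisionTreeClassifier(random_state=42), X, y, cv=cv)
forest_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=42), X, y, cv=cv
)
print(f"Decision tree: {tree_scores.mean():.3f} +/- {tree_scores.std():.3f}")
print(f"Random forest: {forest_scores.mean():.3f} +/- {forest_scores.std():.3f}")

# Illustrative hyperparameter grid (values are assumptions, not recommendations)
param_grid = {
    "max_depth": [2, 3, 4, 5, None],
    "min_samples_split": [2, 5, 10],
}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42), param_grid, cv=cv, scoring="accuracy"
)
search.fit(X, y)
print("Best parameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.3f}")
```

Note that `best_score_` is selected on the same folds used for tuning, so it tends to be slightly optimistic. A held-out test set or nested cross-validation gives a fairer final estimate.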
