# 🌳 Iris Classification with Decision Tree and Random Forest  
Goal: classify Iris flowers into 3 classes using tree-based models.  
This notebook covers:

1. Loading the Iris dataset  
2. Train/test split  
3. Feature standardization  
4. Training and evaluating a Decision Tree  
5. Training and evaluating a Random Forest  
6. Visualizing feature importances  

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report

# set_theme is the current name of the seaborn styling entry point (sns.set is its alias)
sns.set_theme(style="whitegrid")

## 1. Loading the Iris dataset

In [None]:
iris = load_iris()
X, y = iris.data, iris.target
feature_names = iris.feature_names
target_names = iris.target_names

df = pd.DataFrame(X, columns=feature_names)
df["target"] = y
df["class"] = df["target"].map(dict(enumerate(target_names)))
df.head()

## 2. Train/Test split

In [None]:
X = df[feature_names]
y = df["target"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
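
The split above samples rows at random, so with only 150 rows the three classes may end up unevenly represented in the 30-row test set. The following is a minimal sketch, assuming balanced test classes are desirable here, that adds the `stratify=y` argument of `train_test_split` (not used in the original cell) and counts the samples per class. The `_s` variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Same settings as the original split, plus stratification on the labels
# (an assumption: the original notebook does not stratify).
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Count how many samples of each class landed in each subset.
train_counts = np.bincount(y_train_s)
test_counts = np.bincount(y_test_s)
print("train:", train_counts)
print("test: ", test_counts)
```

Because Iris has 50 samples per class, a stratified 80/20 split keeps the class proportions identical in both subsets.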

## 3. Feature standardization

In [None]:
# Fit the scaler on the training set only, then reuse its statistics on the test set
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
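
Fitting the scaler on the training set alone keeps test-set statistics out of preprocessing. Tree-based models split on one feature threshold at a time, so in general they are insensitive to this kind of rescaling. The sketch below checks both points; it rebuilds the split independently, and the names `raw_tree`, `scaled_tree`, and `agreement` are hypothetical.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_tr_sc = scaler.fit_transform(X_tr)   # statistics learned from training rows only
X_te_sc = scaler.transform(X_te)       # same statistics applied to test rows

# Each scaled training column should have mean close to 0 and std close to 1.
print(X_tr_sc.mean(axis=0).round(6), X_tr_sc.std(axis=0).round(6))

# Train the same tree on raw and on scaled features and compare predictions.
raw_tree = DecisionTreeClassifier(random_state=42).fit(X_tr, y_tr)
scaled_tree = DecisionTreeClassifier(random_state=42).fit(X_tr_sc, y_tr)

agreement = np.mean(raw_tree.predict(X_te) == scaled_tree.predict(X_te_sc))
print(f"Share of identical test predictions: {agreement:.2f}")
```

If the two trees agree on nearly all test rows, the scaling step changes little for these models and mainly matters when other, scale-sensitive estimators are added later.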

## 4. Training a Decision Tree

In [None]:
tree = DecisionTreeClassifier(random_state=42)
tree.fit(X_train_scaled, y_train)

y_pred_tree = tree.predict(X_test_scaled)
acc_tree = accuracy_score(y_test, y_pred_tree)

print(f"Accuracy Decision Tree: {acc_tree:.3f}")
print(classification_report(y_test, y_pred_tree, target_names=target_names))

## 5. Training a Random Forest

In [None]:
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train_scaled, y_train)

y_pred_forest = forest.predict(X_test_scaled)
acc_forest = accuracy_score(y_test, y_pred_forest)

print(f"Accuracy Random Forest: {acc_forest:.3f}")
print(classification_report(y_test, y_pred_forest, target_names=target_names))
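
A single 30-row test set gives a noisy estimate, so it is hard to tell from one accuracy value which model is more stable. One possible way to compare the two is k-fold cross-validation; the sketch below assumes 5 stratified folds and unscaled features, neither of which comes from the original notebook, and `cv_scores` is an illustrative name.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Same hyperparameters as the two models trained above.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Assumption: 5 shuffled, stratified folds shared by both models.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    cv_scores[name] = scores
    print(f"{name}: mean={scores.mean():.3f}, std={scores.std():.3f}")
```

The mean summarizes overall accuracy, while the standard deviation across folds gives a rough indication of how much each model depends on the particular split.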

## 6. Visualizing feature importances

In [None]:
# Impurity-based importances from the fitted forest, sorted from most to least important
importances = forest.feature_importances_
indices = np.argsort(importances)[::-1]

plt.figure(figsize=(8, 5))
sns.barplot(x=importances[indices], y=np.array(feature_names)[indices])
plt.title("Feature Importance: Random Forest")
plt.xlabel("Importance")
plt.ylabel("Feature")
plt.show()
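
The values in `feature_importances_` are impurity-based and normalized to sum to 1. As a complement to the bar chart, the sketch below tabulates them together with their spread across the individual trees in `forest.estimators_`. It refits a forest on the full, unscaled dataset for self-containment, which is an assumption and differs from the training cell above; `importance_table` is a hypothetical name.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(iris.data, iris.target)

# One row of importances per tree in the ensemble.
per_tree = np.array([t.feature_importances_ for t in forest.estimators_])

importance_table = pd.DataFrame(
    {
        "importance": forest.feature_importances_,
        "std_across_trees": per_tree.std(axis=0),
    },
    index=iris.feature_names,
).sort_values("importance", ascending=False)
print(importance_table)
```

A large spread relative to the mean suggests that a feature's rank depends heavily on which trees happened to use it.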

## ✅ Conclusions

- The Iris dataset was classified with two models: Decision Tree and Random Forest  
- Both models performed very well on the test set  
- Random Forest, as an ensemble of many trees, is generally more robust and stable than a single Decision Tree, although a single train/test split does not demonstrate this on its own  
- The most important features are shown in the final plot