# Machine Learning - Advanced Starter

Data loading, the train/test split, and basic training are already working.

## Your focus:
- Compare multiple algorithms
- Implement cross-validation
- Visualize performance and decision boundaries
- Explore feature importance and selection
- Understand when models perform well and when they perform poorly

In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report
import pandas as pd

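# Load the Iris dataset (150 samples, 4 features, 3 classes)
iris = load_iris()

# Hold out 20% of the samples for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

print(f"Train: {len(X_train)}, Test: {len(X_test)}")

# Fit a decision tree on the training split
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)
print("\nModel trained!")

# Evaluate on the held-out test split
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"\nAccuracy: {accuracy * 100:.2f}%")

# Per-class precision, recall, and F1
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=iris.target_names))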

print("\nSample Predictions:")
df = pd.DataFrame({
    'Actual': [iris.target_names[i] for i in y_test[:10]],
    'Predicted': [iris.target_names[i] for i in y_pred[:10]]
})
print(df)

print("\nNow compare models and add visualizations!")

## Next Steps

Ask Claude.ai to:
1. Compare three or more classifiers (e.g., KNN, SVM, Random Forest)
2. Implement cross-validation
3. Create a confusion matrix heatmap
4. Extract and visualize feature importance
5. Visualize decision boundaries
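
As a starting point for steps 1 and 2, here is a minimal sketch that scores several classifiers with 5-fold stratified cross-validation. The hyperparameters (`n_neighbors=5`, the RBF kernel, `n_estimators=100`), the choice of five folds, and the names `models` and `cv_results` are illustrative assumptions, not tuned or recommended values. KNN and SVM are wrapped in a scaling pipeline because both are sensitive to feature scale.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Candidate models; hyperparameters are illustrative assumptions, not tuned values.
# KNN and SVM get a StandardScaler so the scaling is refit inside each fold.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "KNN (k=5)": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM (RBF)": make_pipeline(StandardScaler(), SVC(kernel="rbf", random_state=42)),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Stratified folds keep the class balance the same in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

cv_results = {}
for name, clf in models.items():
    scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    cv_results[name] = scores
    print(f"{name:<15} mean={scores.mean():.3f}  std={scores.std():.3f}")
```

Comparing the mean and the standard deviation across folds gives a steadier picture than a single train/test split, which on a dataset this small can shift noticeably with the random seed.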

In [None]:
# Your code here!
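
One possible way to begin on steps 3 and 4 is sketched below: it builds a labeled confusion matrix and ranks features by a random forest's impurity-based importances. The choice of `RandomForestClassifier` and the names `forest`, `cm_df`, and `importances` are assumptions made for illustration; the split reuses the same `test_size` and `random_state` as the starter cell.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Illustrative model choice; any fitted classifier works for the confusion matrix.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)
y_pred = forest.predict(X_test)

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
cm_df = pd.DataFrame(cm, index=iris.target_names, columns=iris.target_names)
cm_df.index.name = "Actual"
cm_df.columns.name = "Predicted"
print(cm_df)

# Impurity-based importances from the forest; they sum to 1 across features.
importances = pd.Series(
    forest.feature_importances_, index=iris.feature_names
).sort_values(ascending=False)
print("\nFeature importances:")
print(importances)
```

To turn these into plots, `ConfusionMatrixDisplay.from_predictions(y_test, y_pred, display_labels=iris.target_names)` from `sklearn.metrics` draws the matrix as a heatmap, and `importances.plot.barh()` gives a bar chart; both require matplotlib.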
