# Machine Learning Experiment with Exercises

This notebook extends the basic ML refresher with exercises for you to complete. The first part reviews loading data, training a model, and evaluating it. The exercises that follow provide template code with **TODO** markers to fill in.

## Review: Basic Logistic Regression on Iris Dataset

In [None]:
# !pip install pandas
# !pip install scikit-learn
# !pip install matplotlib

In [None]:
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
import matplotlib.pyplot as plt

# Load dataset
data = load_iris()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target

# Split data
X = df[data.feature_names]
y = df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train and evaluate
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
preds = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, preds):.2f}")

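# Plot confusion matrix
cm = confusion_matrix(y_test, preds)
plt.imshow(cm, cmap='Blues')
plt.title('Confusion Matrix')
plt.xlabel('Predicted')
plt.ylabel('True')
# Label ticks with class names instead of raw indices
plt.xticks(range(len(data.target_names)), data.target_names)
plt.yticks(range(len(data.target_names)), data.target_names)
# Annotate each cell; use dark text on light cells so low counts stay readable
for i in range(cm.shape[0]):
    for j in range(cm.shape[1]):
        color = 'white' if cm[i, j] > cm.max() / 2 else 'black'
        plt.text(j, i, cm[i, j], ha='center', va='center', color=color)
plt.show()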

## Exercises

Complete the following exercises by filling in the `# TODO` sections. Run each cell to test your implementation.


### Exercise 1: Classification Report & Metrics
**Goal**: Generate a classification report including precision, recall, and F1-score.

**Instructions**:
1. Use `sklearn.metrics.classification_report` on the test-set labels and predictions (`y_test`, `preds`).
2. Display the report as a dictionary (pass `output_dict=True`) or as a pandas DataFrame.
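
If you get stuck, the two steps above could look roughly like the sketch below. It rebuilds the split and model from the review cell so that it runs on its own. The names `report` and `report_df` and the use of `target_names` for readable row labels are illustrative choices, not requirements of the exercise.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Recreate the split and model from the review cell (same settings)
data = load_iris()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = LogisticRegression(max_iter=200).fit(X_train, y_train)
preds = model.predict(X_test)

# output_dict=True returns a nested dict instead of a formatted string
report = classification_report(
    y_test,
    preds,
    target_names=data.target_names.tolist(),  # assumption: show species names
    output_dict=True,
)

# Transpose so each class (and each average) becomes one row
report_df = pd.DataFrame(report).transpose()
print(report_df.round(2))
```

The resulting table has one row per class plus rows for overall accuracy and the macro and weighted averages.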

In [None]:
from sklearn.metrics import classification_report

# TODO: Generate classification report


# TODO: Convert to DataFrame and display


### Exercise 2: Cross-Validation with SVC
**Goal**: Evaluate an SVM classifier using 5-fold cross-validation.

**Instructions**:
1. Import `sklearn.svm.SVC` and `sklearn.model_selection.cross_val_score`.
2. Create an `SVC` model with `kernel='rbf'`.
3. Perform 5-fold CV on the entire dataset `X, y`.
4. Print the mean and standard deviation of the accuracy scores.
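
As a reference, a minimal sketch of these four steps might look like the following. It loads the data directly with `load_iris(return_X_y=True)` so the cell is self-contained; the variable names `svc` and `scores` are illustrative, and all `SVC` settings other than `kernel='rbf'` are left at their defaults.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Load the full dataset (cross-validation does its own splitting)
X, y = load_iris(return_X_y=True)

# RBF-kernel SVM with default C and gamma
svc = SVC(kernel='rbf')

# cv=5 with a classifier uses stratified 5-fold splits
scores = cross_val_score(svc, X, y, cv=5, scoring='accuracy')

print("Fold accuracies:", scores.round(3))
print(f"Mean accuracy: {scores.mean():.3f} (std: {scores.std():.3f})")
```

Reporting the standard deviation alongside the mean gives a rough sense of how much the score depends on which fold is held out.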

In [None]:
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# TODO: Initialize SVC model


# TODO: Perform cross-validation


# TODO: Display results

### Exercise 3: Hyperparameter Tuning with GridSearchCV
**Goal**: Use grid search to find the best hyperparameters for a Random Forest classifier.

**Instructions**:
1. Import `RandomForestClassifier` and `GridSearchCV`.
2. Define a parameter grid for `n_estimators` (e.g., [50, 100]) and `max_depth` (e.g., [None, 5, 10]).
3. Use 3-fold CV within `GridSearchCV`.
4. Fit on the training data, then print the best parameters (`best_params_`) and the best cross-validated score (`best_score_`).
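
One possible way to wire these steps together is sketched below. It recreates the train/test split from the review cell so it can run standalone. Fixing `random_state=42` on the forest is an assumption added here for reproducibility, and the variable name `grid` is illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Recreate the split from the review cell
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Grid from the exercise instructions: 2 x 3 = 6 candidate settings
param_grid = {
    'n_estimators': [50, 100],
    'max_depth': [None, 5, 10],
}

# Assumption: seed the forest so repeated runs pick the same winner
grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=3,
    scoring='accuracy',
)
grid.fit(X_train, y_train)

print("Best params:", grid.best_params_)
print(f"Best CV accuracy: {grid.best_score_:.3f}")
```

Note that `best_score_` is the mean cross-validated accuracy on the training data; scoring `grid.best_estimator_` on `X_test` afterwards would give a separate held-out estimate.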

In [None]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Parameter grid (provided as a starting point; extend it if you like)
param_grid = {
    'n_estimators': [50, 100],
    'max_depth': [None, 5, 10]
}

# TODO: Initialize GridSearchCV


# TODO: Fit on training data


# TODO: Print best parameters and score

### Exercise 4: ROC Curve for One-vs-Rest
**Goal**: Plot ROC curves for each class using a One-vs-Rest strategy.

**Instructions**:
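1. Import `label_binarize`, `OneVsRestClassifier`, `roc_curve`, and `auc`.
2. Binarize the target labels with `label_binarize` (classes `[0, 1, 2]`).
3. Fit a logistic regression within `OneVsRestClassifier`.
4. Compute the ROC curve and AUC for each class from the predicted probabilities on the test set, and plot all curves on one figure.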
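
A hedged sketch of one way to approach this is shown below. It makes a design choice the instructions leave open: the classifier is fit on the original integer labels (which `OneVsRestClassifier` handles by training one binary model per class), and only the test labels are binarized so that each column can be passed to `roc_curve`. The names `y_test_bin`, `y_score`, and `roc_auc` are illustrative.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize

# Recreate the split from the review cell
data = load_iris()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# One indicator column per class: shape (n_test_samples, 3)
y_test_bin = label_binarize(y_test, classes=[0, 1, 2])

# One binary logistic regression per class
ovr = OneVsRestClassifier(LogisticRegression(max_iter=200))
ovr.fit(X_train, y_train)

# Per-class scores; column i corresponds to class i
y_score = ovr.predict_proba(X_test)

# ROC curve and AUC for each class against the rest
roc_auc = {}
for i, name in enumerate(data.target_names):
    fpr, tpr, _ = roc_curve(y_test_bin[:, i], y_score[:, i])
    roc_auc[str(name)] = auc(fpr, tpr)
    plt.plot(fpr, tpr, label=f"{name} (AUC = {roc_auc[str(name)]:.2f})")

# Diagonal reference line for a random classifier
plt.plot([0, 1], [0, 1], linestyle='--', color='gray', label='Chance')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('One-vs-Rest ROC Curves')
plt.legend(loc='lower right')
plt.show()
```

With only 30 test samples the curves will look stepped rather than smooth, which is expected for a dataset this small.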

In [None]:
from sklearn.preprocessing import label_binarize
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import roc_curve, auc

# TODO: Binarize labels



# TODO: Initialize OneVsRestClassifier with LogisticRegression



# TODO: Fit model



# TODO: Compute probabilities and ROC/AUC for each class


---

Once you've completed these exercises, you will have practiced key ML workflow tasks: generating detailed metrics, performing cross-validation, tuning hyperparameters, and plotting ROC curves. Feel free to explore additional models and metrics to deepen your understanding!