# Experiments Log

## Notebook Description
This notebook demonstrates how to record a model's hyperparameters together with its performance, measured in terms of:
- Accuracy
- Precision
- Recall
- F1 Score
- True Positives (TP)
- False Positives (FP)
- True Negatives (TN)
- False Negatives (FN)

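The hyperparameters and metrics are logged together in a dictionary and then saved to a CSV file, which makes it easy to find which hyperparameters produce the best-performing model on the current dataset. Because Iris has three classes, precision, recall, and F1 are computed as weighted averages, and TP, FP, TN, and FN are captured through the confusion matrix rather than as single values.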
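For a multiclass problem, TP, FP, TN, and FN are only defined per class (one-vs-rest). The following is a minimal sketch of how they could be derived from a confusion matrix. The helper name `per_class_counts` and the toy labels are assumptions made for illustration and are not part of the notebook.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def per_class_counts(y_true, y_pred, labels=None):
    """Return one-vs-rest TP, FP, FN, and TN counts for each class (hypothetical helper)."""
    # Rows of the matrix are true classes, columns are predicted classes
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    tp = np.diag(cm)                # samples of the class predicted correctly
    fp = cm.sum(axis=0) - tp        # predicted as the class but belonging to another
    fn = cm.sum(axis=1) - tp        # belonging to the class but predicted as another
    tn = cm.sum() - (tp + fp + fn)  # all remaining samples
    return {
        "TP": tp.tolist(),
        "FP": fp.tolist(),
        "FN": fn.tolist(),
        "TN": tn.tolist(),
    }


# Toy labels, made up for illustration only
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
counts = per_class_counts(y_true, y_pred)
print(counts)
```

Each list holds one value per class, in the order of the sorted class labels, so the counts could be added to the log dictionary alongside the other metrics.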

In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the Iris dataset
data = load_iris()
X, y = data.data, data.target

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Show the shape of the data
X_train.shape, X_test.shape, y_train.shape, y_test.shape

In [None]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix
import pandas as pd

# Initialize an empty dictionary to store experiment logs
experiment_logs = {}

# Define hyperparameters
n_estimators = 100
max_depth = 5

# Initialize and train the model
clf = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth, random_state=42)
clf.fit(X_train, y_train)

# Make predictions
y_pred = clf.predict(X_test)

# Calculate metrics; average='weighted' weights each class's score by its support (Iris has three classes)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')
conf_matrix = confusion_matrix(y_test, y_pred)

# Log the experiment
experiment_logs['experiment_1'] = {
    'n_estimators': n_estimators,
    'max_depth': max_depth,
    'accuracy': accuracy,
    'precision': precision,
    'recall': recall,
    'f1_score': f1,
    'confusion_matrix': conf_matrix.tolist()
}

# Convert logs to a DataFrame (one row per experiment) and save as CSV; this overwrites any existing file
pd.DataFrame.from_dict(experiment_logs, orient='index').to_csv('experiment_logs.csv')

# Show the experiment logs
experiment_logs
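To compare several runs, the log file has to accumulate rows instead of being rewritten each time. The sketch below shows one possible way to append experiments to a CSV and then look up the best one by F1 score. The helper `append_experiment`, the temporary file path, and the metric values are all hypothetical placeholders, not results from this notebook.

```python
import os
import tempfile

import pandas as pd


def append_experiment(csv_path, name, record):
    """Append one experiment as a row of csv_path, writing the header only once."""
    row = pd.DataFrame.from_dict({name: record}, orient="index")
    write_header = not os.path.exists(csv_path)
    row.to_csv(csv_path, mode="a", header=write_header, index_label="experiment")


# Write to a temporary directory so no existing log file is touched
csv_path = os.path.join(tempfile.mkdtemp(), "experiment_logs.csv")

# Placeholder records; the f1_score values are invented for illustration
append_experiment(csv_path, "experiment_1", {"n_estimators": 100, "max_depth": 5, "f1_score": 0.90})
append_experiment(csv_path, "experiment_2", {"n_estimators": 200, "max_depth": 3, "f1_score": 0.95})

# Reload the log and find the experiment with the highest F1 score
logs = pd.read_csv(csv_path, index_col="experiment")
best = logs["f1_score"].idxmax()
print(best, logs.loc[best].to_dict())
```

This approach assumes every record uses the same keys in the same order, since rows appended without a header are matched to columns by position.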