![LittleLogger Logo](artifacts/littlelogger-logo.png)

# 1. The Problem: Losing Track of Experiments

We've all been here. You're in a notebook, trying to find the best hyperparameters for a model. You try a few combinations, and soon your notebook is a mess and you've lost track of which run gave the best score.

```python
# Which one was best again?
# run 1: 0.82 f1
# run 2: 0.81 f1
# run 3: 0.83 f1 (I think this was max_depth=5? Or 7?)
```

**`LittleLogger` solves this.** It's a zero-setup decorator that automatically logs your function's inputs (params) and outputs (metrics) to a JSON Lines file, one record per call.
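To make the idea concrete, here is a minimal sketch of how a decorator of this kind could work. It is not `littlelogger`'s actual implementation: the name `log_run_sketch` is hypothetical, and the record fields simply mirror the columns that appear in the log table later in this notebook.

```python
import functools
import inspect
import json
import os
import tempfile
import time
from datetime import datetime, timezone


def log_run_sketch(log_file):
    """Hypothetical stand-in for a logging decorator (not littlelogger's code)."""
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Bind the call to the signature so default values are recorded too.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()

            start = time.perf_counter()
            result = func(*args, **kwargs)
            runtime = time.perf_counter() - start

            record = {
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "function_name": func.__name__,
                "runtime_seconds": runtime,
                "params": dict(bound.arguments),
                "metrics": result,
            }
            # Append one JSON object per line (JSON Lines).
            with open(log_file, "a", encoding="utf-8") as fh:
                fh.write(json.dumps(record) + "\n")
            return result
        return wrapper
    return decorator


# Demo: log two calls of a toy function to a temporary file.
demo_log = os.path.join(tempfile.mkdtemp(), "sketch_log.jsonl")


@log_run_sketch(log_file=demo_log)
def score(alpha, beta=2):
    return {"total": alpha + beta}


score(1)
score(alpha=3, beta=4)

with open(demo_log, encoding="utf-8") as fh:
    records = [json.loads(line) for line in fh]
```

Binding the arguments to the signature is what lets a sketch like this record defaults the caller never passed, such as `beta=2` in the first call.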

# 2. Installation & Setup

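First, let's install the logger. The cell below shows the PyPI install (uncomment the lines to run them). If you're running this from inside the cloned project repository, you can install it in editable mode instead (`pip install -e .`).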

In [7]:
# Install the package from PyPI
# %pip install littlelogger

# We also need scikit-learn for this demo
# %pip install scikit-learn pandas

In [8]:
import os

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score, f1_score
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

import littlelogger
from littlelogger import log_run

In [9]:
littlelogger.__version__

'1.0.0'

In [10]:
# Define our log file path
LOG_FILE = "experiment_log.jsonl"

# Let's delete any old logs to start fresh for this demo
if os.path.exists(LOG_FILE):
    os.remove(LOG_FILE)
    print(f"Removed old log file: {LOG_FILE}")

Removed old log file: experiment_log.jsonl


# 3. Create a Reusable Dataset

We'll create a simple, reusable dataset for our demo. This way, we're not regenerating data inside our training loop.

In [11]:
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_informative=10,
    n_redundant=5,
    random_state=42
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    random_state=42
)

print(f"Created dataset: X_train shape {X_train.shape}, X_test shape {X_test.shape}")

Created dataset: X_train shape (800, 20), X_test shape (200, 20)


# 4. Decorate Your Training Functions

We'll create two *different* model training functions. All we have to do is add `@log_run(log_file=LOG_FILE)` above each one.

In [12]:
@log_run(log_file=LOG_FILE)
def train_random_forest(n_estimators, max_depth, min_samples_leaf=1):
    """Train a RandomForestClassifier."""
    print(f"Training RandomForest with n_estimators={n_estimators}, max_depth={max_depth}...")
    model = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        min_samples_leaf=min_samples_leaf,
        random_state=42,
        n_jobs=-1
    )
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    accuracy = accuracy_score(y_test, y_pred)
    f1 = f1_score(y_test, y_pred, average='weighted')

    return {"accuracy": round(accuracy, 4), "f1_score": round(f1, 4)}

In [13]:
@log_run(log_file=LOG_FILE)
def train_svc(C, kernel, degree=3):
    """Train an SVC (Support Vector Classifier)."""
    print(f"Training SVC with C={C}, kernel={kernel}...")

    # SVCs are sensitive to feature scale, so we use a Pipeline
    pipeline = Pipeline([
        ('scaler', StandardScaler()),
        ('svc', SVC(
            C=C,
            kernel=kernel,
            degree=degree,
            random_state=42
        ))
    ])

    pipeline.fit(X_train, y_train)
    y_pred = pipeline.predict(X_test)

    accuracy = accuracy_score(y_test, y_pred)
    f1 = f1_score(y_test, y_pred, average='weighted')

    return {"accuracy": round(accuracy, 4), "f1_score": round(f1, 4)}

# 5. Run the Experiments

Now we can run *both* sets of experiments, and all the results will go into the *same* log file.

In [14]:
# RandomForest Sweep
rf_param_grid = [
    {'n_estimators': 50, 'max_depth': 5},
    {'n_estimators': 100, 'max_depth': 10},
    {'n_estimators': 200, 'max_depth': None, 'min_samples_leaf': 2},
]

print("--- Starting RandomForest Sweep ---")

for i, params in enumerate(rf_param_grid):
    print(f"\n--- Starting RF Run {i + 1} ---")
    train_random_forest(**params)

print("\n--- All Experiments Finished ---")

--- Starting RandomForest Sweep ---

--- Starting RF Run 1 ---
Training RandomForest with n_estimators=50, max_depth=5...

--- Starting RF Run 2 ---
Training RandomForest with n_estimators=100, max_depth=10...

--- Starting RF Run 3 ---
Training RandomForest with n_estimators=200, max_depth=None...

--- All Experiments Finished ---


In [21]:
# --- SVC Sweep ---
svc_param_grid = [
    {'C': 1.0, 'kernel': 'rbf'},
    {'C': 1.0, 'kernel': 'linear'},
    {'C': 0.5, 'kernel': 'rbf'},
]

print("\n--- Starting SVC Sweep ---")

for i, params in enumerate(svc_param_grid):
    print(f"\n--- Starting SVC Run {i + 1} ---")
    train_svc(**params)

print("\n--- All Experiments Finished ---")


--- Starting SVC Sweep ---

--- Starting SVC Run 1 ---
Training SVC with C=1.0, kernel=rbf...

--- Starting SVC Run 2 ---
Training SVC with C=1.0, kernel=linear...

--- Starting SVC Run 3 ---
Training SVC with C=0.5, kernel=rbf...

--- All Experiments Finished ---


# 6. The Payoff: Analyzing Your Combined Results

This is the best part. We read the *single* log file into pandas.

In [16]:
# This is the magic line!
df_raw = pd.read_json(LOG_FILE, lines=True)

df_raw

| | timestamp | function_name | runtime_seconds | params | metrics |
|---|---|---|---|---|---|
| 0 | 2025-11-02 07:25:38+00:00 | train_random_forest | 0.45747 | {'n_estimators': 50, 'max_depth': 5, 'min_samp... | {'accuracy': 0.88, 'f1_score': 0.8799} |
| 1 | 2025-11-02 07:25:38+00:00 | train_random_forest | 0.2738 | {'n_estimators': 100, 'max_depth': 10, 'min_sa... | {'accuracy': 0.915, 'f1_score': 0.915} |
| 2 | 2025-11-02 07:25:39+00:00 | train_random_forest | 0.48362 | {'n_estimators': 200, 'max_depth': None, 'min_... | {'accuracy': 0.92, 'f1_score': 0.92} |
| 3 | 2025-11-02 07:25:39+00:00 | train_svc | 0.071125 | {'C': 1.0, 'kernel': 'rbf', 'degree': 3} | {'accuracy': 0.9450000000000001, 'f1_score': 0... |
| 4 | 2025-11-02 07:25:39+00:00 | train_svc | 0.030865 | {'C': 1.0, 'kernel': 'linear', 'degree': 3} | {'accuracy': 0.805, 'f1_score': 0.804900000000... |
| 5 | 2025-11-02 07:25:39+00:00 | train_svc | 0.025628 | {'C': 0.5, 'kernel': 'rbf', 'degree': 3} | {'accuracy': 0.93, 'f1_score': 0.93} |
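`pd.read_json(..., lines=True)` works here because every line of the log is a standalone JSON object. As a rough illustration, the sketch below writes two hand-made records (using a subset of the fields from the table above, and a temporary file rather than the real log) and reads them back with only the standard library.

```python
import json
import os
import tempfile
from collections import Counter

# Hand-written records using a subset of the fields shown in the table above.
sample_records = [
    {
        "function_name": "train_random_forest",
        "params": {"n_estimators": 50, "max_depth": 5, "min_samples_leaf": 1},
        "metrics": {"accuracy": 0.88, "f1_score": 0.8799},
    },
    {
        "function_name": "train_svc",
        "params": {"C": 1.0, "kernel": "rbf", "degree": 3},
        "metrics": {"accuracy": 0.945, "f1_score": 0.945},
    },
]

# Write one JSON object per line to a throwaway file.
path = os.path.join(tempfile.mkdtemp(), "sample_log.jsonl")
with open(path, "w", encoding="utf-8") as fh:
    for record in sample_records:
        fh.write(json.dumps(record) + "\n")

# Each non-empty line parses on its own; no enclosing list or commas are needed.
with open(path, encoding="utf-8") as fh:
    parsed = [json.loads(line) for line in fh if line.strip()]

# Count how many runs each function contributed.
runs_per_function = Counter(r["function_name"] for r in parsed)
```

Because records are appended line by line, two different training functions can share one file without either needing to rewrite what the other logged.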


Now, when we use `json_normalize`, pandas will *automatically* create a column for every parameter (`n_estimators`, `C`, `kernel`, etc.) and fill in `NaN` for the runs where a parameter didn't apply.

This makes comparing different model types incredibly easy!

In [17]:
# Use json_normalize and then add_prefix
df_params = pd.json_normalize(df_raw['params']).add_prefix('param_')
df_metrics = pd.json_normalize(df_raw['metrics']).add_prefix('metric_')

# Get the other columns we want
df_main = df_raw[['timestamp', 'function_name', 'runtime_seconds']]

# Join them all together into one clean DataFrame
df = pd.concat([df_main, df_params, df_metrics], axis=1)

# Re-order columns to group params and metrics for clarity
all_cols = (list(df_main.columns) +
           sorted([c for c in df.columns if c.startswith('param_')]) +
           sorted([c for c in df.columns if c.startswith('metric_')]))

df[all_cols]

| | timestamp | function_name | runtime_seconds | param_C | param_degree | param_kernel | param_max_depth | param_min_samples_leaf | param_n_estimators | metric_accuracy | metric_f1_score |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2025-11-02 07:25:38+00:00 | train_random_forest | 0.45747 | NaN | NaN | NaN | 5.0 | 1.0 | 50.0 | 0.88 | 0.8799 |
| 1 | 2025-11-02 07:25:38+00:00 | train_random_forest | 0.2738 | NaN | NaN | NaN | 10.0 | 1.0 | 100.0 | 0.915 | 0.915 |
| 2 | 2025-11-02 07:25:39+00:00 | train_random_forest | 0.48362 | NaN | NaN | NaN | NaN | 2.0 | 200.0 | 0.92 | 0.92 |
| 3 | 2025-11-02 07:25:39+00:00 | train_svc | 0.071125 | 1.0 | 3.0 | rbf | NaN | NaN | NaN | 0.945 | 0.945 |
| 4 | 2025-11-02 07:25:39+00:00 | train_svc | 0.030865 | 1.0 | 3.0 | linear | NaN | NaN | NaN | 0.805 | 0.8049 |
| 5 | 2025-11-02 07:25:39+00:00 | train_svc | 0.025628 | 0.5 | 3.0 | rbf | NaN | NaN | NaN | 0.93 | 0.93 |
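The `NaN`-filling behaviour can be checked in isolation on a tiny example. The sketch below uses two made-up parameter dictionaries with no keys in common; the variable names are illustrative only.

```python
import pandas as pd

# Two parameter dicts with disjoint keys, like the RF and SVC runs.
params = [
    {"n_estimators": 50, "max_depth": 5},
    {"C": 1.0, "kernel": "rbf"},
]

# json_normalize builds one column per key seen in any record.
flat = pd.json_normalize(params).add_prefix("param_")

# Keys missing from a record show up as NaN in that row.
missing_per_column = flat.isna().sum()
```

This also explains why integer parameters such as `n_estimators` display as `50.0` in the table above: a numeric column that contains `NaN` is stored as float.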


### Now, finding your *overall* best run is trivial:

In [18]:
# Sort by our key metric to find the best-performing run
df_sorted = df.sort_values(by="metric_f1_score", ascending=False)

df_sorted.head(1)

| | timestamp | function_name | runtime_seconds | param_n_estimators | param_max_depth | param_min_samples_leaf | param_C | param_kernel | param_degree | metric_accuracy | metric_f1_score |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 2025-11-02 07:25:39+00:00 | train_svc | 0.071125 | NaN | NaN | NaN | 1.0 | rbf | 3.0 | 0.945 | 0.945 |
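If you also want the best run *per model type* rather than just the overall winner, a `groupby` with `idxmax` is one option. The sketch below rebuilds only the two relevant columns by hand, using the F1 scores from the combined results table above, so that it runs on its own; `df_demo` is an illustrative name.

```python
import pandas as pd

# F1 scores copied from the combined results table above.
df_demo = pd.DataFrame({
    "function_name": ["train_random_forest"] * 3 + ["train_svc"] * 3,
    "metric_f1_score": [0.8799, 0.915, 0.92, 0.945, 0.8049, 0.93],
})

# idxmax returns the row label of the highest score within each group.
best_idx = df_demo.groupby("function_name")["metric_f1_score"].idxmax()

# Select those rows to get one best run per training function.
best_per_model = df_demo.loc[best_idx]
```

On the real DataFrame, the same two lines applied to `df` would keep all parameter columns alongside the winning score.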


# 7. Feature: Graceful Error Handling

A key feature of `littlelogger` is that it **will never crash your script.**

If you return something that can't be saved to JSON, it will simply print a warning and continue.

In [22]:
@log_run(log_file=LOG_FILE)
def bad_function():
    # We return `object()`, which is not JSON-serializable
    return {"model_object": object()}

print("\nRunning a function that will fail to log...")

# Note: This will print a UserWarning, but NOT crash!
result = bad_function()

print("\nScript continued successfully!")
print(f"We still got our return value: {result}")


Running a function that will fail to log...

Script continued successfully!
We still got our return value: {'model_object': <object object at 0x000002C268EA6090>}




If we check the log file, we can see that only the 6 successfully logged runs are in it. The seventh call still returned its value, but its log entry was skipped, and our script was unharmed.

In [20]:
df_final = pd.read_json(LOG_FILE, lines=True)
print(f"Total runs logged: {len(df_final)}")

Total runs logged: 6
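One plausible way to get this "warn, don't crash" behaviour is to wrap the serialization step in a `try`/`except` and emit a `UserWarning` on failure. The sketch below is an assumption about the general pattern, not `littlelogger`'s actual code; `safe_serialize` and the warning text are hypothetical.

```python
import json
import warnings


def safe_serialize(record):
    """Return a JSON string, or None (with a warning) if serialization fails."""
    try:
        return json.dumps(record)
    except (TypeError, ValueError) as exc:
        # Warn instead of raising so the caller's script keeps running.
        warnings.warn(f"Could not log run: {exc}", UserWarning)
        return None


# Capture warnings so the behaviour can be inspected programmatically.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok_line = safe_serialize({"f1_score": 0.93})
    bad_line = safe_serialize({"model_object": object()})
```

Here `ok_line` holds a JSON string that could be appended to the log, while `bad_line` is `None` and `caught` contains the single warning raised for the unserializable record.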
