# LG-CoTrain: All Disasters Re-Run

This notebook re-runs the **LG-CoTrain** co-training pipeline across **all 10 disaster
events** with a **configurable pseudo-label source** and stores results in a
**named sub-folder** under `results/`.

### Why a separate notebook?

We may run experiments multiple times with different pseudo-label sets (e.g.,
gpt-4o vs llama-3, or multiple runs of the same model). Each run is stored in
its own sub-folder so results are never overwritten.

### Configuration (Cell 2)

Edit the following variables in the **Configuration** cell before running:

| Variable | Description |
|---|---|
| `PSEUDO_LABEL_SOURCE` | Name of the pseudo-label directory under `data/pseudo-labelled/` |
| `RUN_NAME` | Sub-folder name under `results/` for this run |
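
Because a full run takes many hours, it can help to validate these two settings before starting. The following is a minimal sketch, not part of `lg_cotrain`: the helper name `check_run_config` is hypothetical, and it only assumes the `data/pseudo-labelled/<PSEUDO_LABEL_SOURCE>` layout described in the table above.

```python
from pathlib import Path


def check_run_config(data_root, pseudo_label_source, run_name):
    """Return a list of problems; an empty list means the config looks usable.

    Hypothetical helper: assumes pseudo-labels live under
    <data_root>/pseudo-labelled/<pseudo_label_source>/.
    """
    problems = []

    source_dir = Path(data_root) / "pseudo-labelled" / pseudo_label_source
    if not source_dir.is_dir():
        problems.append(f"pseudo-label directory not found: {source_dir}")

    # RUN_NAME becomes a single sub-folder of results/, so reject path separators.
    if not run_name or any(sep in run_name for sep in ("/", "\\")):
        problems.append(f"RUN_NAME must be a single folder name, got {run_name!r}")

    return problems


# Example: report problems instead of failing hours into a run.
for problem in check_run_config("data", "gpt-4o", "gpt-4o-run-2"):
    print("WARNING:", problem)
```

Returning a list rather than raising lets the notebook print every problem at once.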

### Resume Support

Resume works at two levels, as in notebook 02:

1. **Event level**: Events that already have all 12 `metrics.json` files (4 budgets × 3 seed sets) are skipped entirely.
2. **Experiment level**: Within each remaining event, individual `(budget, seed_set)` combinations with existing
   results are skipped.
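
The two levels can be illustrated with a small sketch. It assumes the `results/<RUN_NAME>/<event>/<budget>_set<seed_set>/metrics.json` layout used by this notebook; the helper names `missing_experiments` and `resume_plan` are hypothetical and are not part of `lg_cotrain`.

```python
import tempfile
from pathlib import Path


def missing_experiments(results_root, event, budgets, seed_sets):
    """List the (budget, seed_set) pairs of an event that have no metrics.json yet."""
    event_dir = Path(results_root) / event
    return [
        (budget, seed_set)
        for budget in budgets
        for seed_set in seed_sets
        if not (event_dir / f"{budget}_set{seed_set}" / "metrics.json").exists()
    ]


def resume_plan(results_root, events, budgets, seed_sets):
    """Map each event to its missing pairs; an empty list means the event is skipped."""
    return {
        event: missing_experiments(results_root, event, budgets, seed_sets)
        for event in events
    }


# Demo on a throwaway directory: one finished experiment out of four.
with tempfile.TemporaryDirectory() as tmp:
    done_dir = Path(tmp) / "event_a" / "5_set1"
    done_dir.mkdir(parents=True)
    (done_dir / "metrics.json").write_text("{}")

    plan = resume_plan(tmp, ["event_a"], budgets=[5, 10], seed_sets=[1, 2])
    print(plan)  # {'event_a': [(5, 2), (10, 1), (10, 2)]}
```

An event whose list is empty corresponds to the event-level skip; a partly filled list corresponds to the experiment-level skip.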

In [1]:
import json
import statistics
import sys
import time
from pathlib import Path

def _find_repo_root(marker: str = "lg_cotrain") -> Path:
    """Walk up from the current working directory to the first folder containing `marker/`."""
    cwd = Path.cwd().resolve()
    for candidate in (cwd, *cwd.parents):
        if (candidate / marker).is_dir():
            return candidate
    raise RuntimeError(
        f"Cannot find repo root: no ancestor directory contains '{marker}/'. "
        "Run the notebook from inside the repository."
    )

repo_root = _find_repo_root()
if str(repo_root) not in sys.path:
    sys.path.insert(0, str(repo_root))

import matplotlib.pyplot as plt
import numpy as np

from lg_cotrain.run_all import BUDGETS, SEED_SETS, run_all_experiments, format_summary_table

print(f"Repo root: {repo_root}")
print(f"Budgets: {BUDGETS}")
print(f"Seed sets: {SEED_SETS}")
print(f"Experiments per event: {len(BUDGETS) * len(SEED_SETS)}")

Repo root: D:\Workspace\Co-Training
Budgets: [5, 10, 25, 50]
Seed sets: [1, 2, 3]
Experiments per event: 12


In [2]:
# ---- User-editable configuration ----
PSEUDO_LABEL_SOURCE = "gpt-4o"      # Change to use different pseudo-labels (e.g. "llama-3")
RUN_NAME = "gpt-4o-run-2"            # Sub-folder name under results/

DATA_ROOT = str(repo_root / "data")
RESULTS_ROOT = str(repo_root / "results" / RUN_NAME)

print(f"Pseudo-label source: {PSEUDO_LABEL_SOURCE}")
print(f"Run name: {RUN_NAME}")
print(f"Data root: {DATA_ROOT}")
print(f"Results root: {RESULTS_ROOT}")

Pseudo-label source: gpt-4o
Run name: gpt-4o-run-2
Data root: D:\Workspace\Co-Training\data
Results root: D:\Workspace\Co-Training\results\gpt-4o-run-2


In [3]:
# Derived from the imported grids so the counts stay correct if BUDGETS or SEED_SETS change
EXPERIMENTS_PER_EVENT = len(BUDGETS) * len(SEED_SETS)

def is_event_complete(event, results_root):
    """Return True if every (budget, seed_set) pair of the event has a metrics.json."""
    event_dir = Path(results_root) / event
    return all(
        (event_dir / f"{budget}_set{seed_set}" / "metrics.json").exists()
        for budget in BUDGETS
        for seed_set in SEED_SETS
    )

# Discover events from the data directory (one sub-folder per event)
data_dir = Path(DATA_ROOT) / "original"
all_events = sorted(p.name for p in data_dir.iterdir() if p.is_dir())

completed_events = [e for e in all_events if is_event_complete(e, RESULTS_ROOT)]
pending_events = [e for e in all_events if e not in completed_events]

print(f"Found {len(all_events)} events total")
print(f"  Completed: {len(completed_events)} ({len(completed_events) * EXPERIMENTS_PER_EVENT} experiments)")
print(f"  Pending:   {len(pending_events)} (up to {len(pending_events) * EXPERIMENTS_PER_EVENT} experiments)")

if completed_events:
    print("\nCompleted events (will be skipped):")
    for e in completed_events:
        print(f"  - {e}")

if pending_events:
    print("\nPending events (will be run):")
    for e in pending_events:
        print(f"  - {e}")

Found 10 events total
  Completed: 0 (0 experiments)
  Pending:   10 (up to 120 experiments)

Pending events (will be run):
  - california_wildfires_2018
  - canada_wildfires_2016
  - cyclone_idai_2019
  - hurricane_dorian_2019
  - hurricane_florence_2018
  - hurricane_harvey_2017
  - hurricane_irma_2017
  - hurricane_maria_2017
  - kaikoura_earthquake_2016
  - kerala_floods_2018


## Running Experiments

For each pending event, we call `run_all_experiments` with the configured
`pseudo_label_source` and `results_root` pointing to our named sub-folder.

Individual experiments that already have `metrics.json` are automatically
skipped (useful if the notebook crashed mid-event).
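
After each event, the results are summarized per budget as mean and standard deviation across seed sets. The sketch below shows one way to compute such a summary. The result keys `budget` and `macro_f1` and the helper `summarize_macro_f1` are assumptions for illustration; how `format_summary_table` actually computes its figures is not shown in this notebook. Applied to the rounded per-seed scores recorded for `california_wildfires_2018` at budget 5, the sample standard deviation reproduces the `0.6338+/-0.0093` in the output below.

```python
import statistics
from collections import defaultdict


def summarize_macro_f1(results):
    """Return {budget: (mean, sample_std)} of macro-F1 across seed sets.

    Assumes each result is a dict with "budget" and "macro_f1" keys
    (hypothetical key names, chosen for this sketch).
    """
    by_budget = defaultdict(list)
    for result in results:
        by_budget[result["budget"]].append(result["macro_f1"])

    summary = {}
    for budget, scores in sorted(by_budget.items()):
        # statistics.stdev needs at least two values; report 0.0 for a single seed.
        std = statistics.stdev(scores) if len(scores) > 1 else 0.0
        summary[budget] = (statistics.mean(scores), std)
    return summary


# Rounded scores taken from the recorded run (budget 5, seed sets 1 to 3).
example = [
    {"budget": 5, "macro_f1": 0.6405},
    {"budget": 5, "macro_f1": 0.6377},
    {"budget": 5, "macro_f1": 0.6231},
]

for budget, (mean, std) in summarize_macro_f1(example).items():
    print(f"budget={budget}: {mean:.4f}+/-{std:.4f}")  # budget=5: 0.6338+/-0.0093
```

Other rows may differ in the last digit when recomputed this way, because the table is presumably built from unrounded scores.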

In [None]:
class ProgressTracker:
    """Track global progress across all experiments."""

    def __init__(self, total, already_done, start_time):
        self.total = total
        self.done = already_done
        self.start_time = start_time
        # Experiments finished since start_time; results from earlier runs
        # must not count toward the per-experiment time used for the ETA.
        self.done_this_session = 0

    def update(self, event, budget, seed_set, status):
        self.done += 1
        self.done_this_session += 1
        elapsed = time.time() - self.start_time
        pct = 100.0 * self.done / self.total if self.total else 100.0
        elapsed_h = elapsed / 3600

        # ETA = average time per experiment in this session * experiments left
        remaining = max(self.total - self.done, 0)
        eta_h = (elapsed / self.done_this_session) * remaining / 3600

        print(
            f"[PROGRESS] {self.done}/{self.total} ({pct:.1f}%)"
            f" | Elapsed: {elapsed_h:.2f}h | ETA: {eta_h:.2f}h"
        )

# Count already-completed experiments (from previous runs)
already_done = sum(
    1
    for e in all_events
    for b in BUDGETS
    for s in SEED_SETS
    if (Path(RESULTS_ROOT) / e / f"{b}_set{s}" / "metrics.json").exists()
)
total_experiments = len(all_events) * len(BUDGETS) * len(SEED_SETS)

print(f"Total experiments: {total_experiments}")
print(f"Already completed: {already_done}")
print(f"Remaining: {total_experiments - already_done}")

all_event_results = {}
overall_start = time.time()
tracker = ProgressTracker(total_experiments, already_done, overall_start)

# Run pending events
for i, event in enumerate(pending_events, 1):
    print(f"\n{'=' * 60}")
    print(f"Event {i}/{len(pending_events)}: {event}")
    print(f"{'=' * 60}")

    results = run_all_experiments(
        event,
        pseudo_label_source=PSEUDO_LABEL_SOURCE,
        data_root=DATA_ROOT,
        results_root=RESULTS_ROOT,
        _on_experiment_done=tracker.update,
    )
    all_event_results[event] = results

    print()
    print(format_summary_table(results, event))

# Load results for already-completed events
for event in completed_events:
    results = []
    for budget in BUDGETS:
        for seed_set in SEED_SETS:
            path = Path(RESULTS_ROOT) / event / f"{budget}_set{seed_set}" / "metrics.json"
            with open(path) as f:
                results.append(json.load(f))
    all_event_results[event] = results

overall_elapsed = time.time() - overall_start
print(f"\n{'=' * 60}")
print(f"All events done in {overall_elapsed / 3600:.2f}h")
print(f"Total events with results: {len(all_event_results)}")

Total experiments: 120
Already completed: 0
Remaining: 120

Event 1/10: california_wildfires_2018



[1/12] budget=5, seed=1 -- starting...


2026-02-18 13:26:52,260 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=5, seed_set=1
2026-02-18 13:26:52,292 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 13:26:52,303 - lg_cotrain - INFO - D_l1: 30, D_l2: 20, D_LG: 5113
2026-02-18 13:26:52,305 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1072.54it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1091.60it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 13:27:10,883 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.0981, m

[1/12] budget=5, seed=1 -- done (macro_f1=0.6405)
[PROGRESS] 1/120 (0.8%) | Elapsed: 0.25h | ETA: 29.76h
[2/12] budget=5, seed=2 -- starting...


2026-02-18 13:41:48,308 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=5, seed_set=2
2026-02-18 13:41:48,357 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 13:41:48,366 - lg_cotrain - INFO - D_l1: 30, D_l2: 20, D_LG: 5113
2026-02-18 13:41:48,368 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1047.42it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1150.91it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 13:42:07,741 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.0968, m

[2/12] budget=5, seed=2 -- done (macro_f1=0.6377)
[PROGRESS] 2/120 (1.7%) | Elapsed: 0.50h | ETA: 29.72h
[3/12] budget=5, seed=3 -- starting...


2026-02-18 13:57:01,634 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=5, seed_set=3
2026-02-18 13:57:01,682 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 13:57:01,691 - lg_cotrain - INFO - D_l1: 30, D_l2: 20, D_LG: 5113
2026-02-18 13:57:01,693 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1169.82it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1110.97it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 13:57:22,364 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1056, m

[3/12] budget=5, seed=3 -- done (macro_f1=0.6231)
[PROGRESS] 3/120 (2.5%) | Elapsed: 0.75h | ETA: 29.40h
[4/12] budget=10, seed=1 -- starting...


2026-02-18 14:12:01,789 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=10, seed_set=1
2026-02-18 14:12:01,845 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 14:12:01,854 - lg_cotrain - INFO - D_l1: 50, D_l2: 50, D_LG: 5063
2026-02-18 14:12:01,856 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1206.87it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1094.29it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 14:12:20,950 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1004, 

[4/12] budget=10, seed=1 -- done (macro_f1=0.6291)
[PROGRESS] 4/120 (3.3%) | Elapsed: 0.99h | ETA: 28.72h
[5/12] budget=10, seed=2 -- starting...


2026-02-18 14:26:12,791 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=10, seed_set=2
2026-02-18 14:26:12,839 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 14:26:12,848 - lg_cotrain - INFO - D_l1: 50, D_l2: 50, D_LG: 5063
2026-02-18 14:26:12,850 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1202.79it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1237.68it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 14:26:31,709 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.0967, 

[5/12] budget=10, seed=2 -- done (macro_f1=0.6440)
[PROGRESS] 5/120 (4.2%) | Elapsed: 1.23h | ETA: 28.39h
[6/12] budget=10, seed=3 -- starting...


2026-02-18 14:40:51,520 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=10, seed_set=3
2026-02-18 14:40:51,569 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 14:40:51,579 - lg_cotrain - INFO - D_l1: 50, D_l2: 50, D_LG: 5063
2026-02-18 14:40:51,580 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1187.39it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1175.77it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 14:41:10,327 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1066, 

[6/12] budget=10, seed=3 -- done (macro_f1=0.6125)
[PROGRESS] 6/120 (5.0%) | Elapsed: 1.47h | ETA: 27.97h
[7/12] budget=25, seed=1 -- starting...


2026-02-18 14:55:08,220 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=25, seed_set=1
2026-02-18 14:55:08,262 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 14:55:08,270 - lg_cotrain - INFO - D_l1: 130, D_l2: 120, D_LG: 4913
2026-02-18 14:55:08,270 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1247.00it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1189.34it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 14:55:27,230 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1076

[7/12] budget=25, seed=1 -- done (macro_f1=0.6596)
[PROGRESS] 7/120 (5.8%) | Elapsed: 1.71h | ETA: 27.53h
[8/12] budget=25, seed=2 -- starting...


2026-02-18 15:09:07,455 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=25, seed_set=2
2026-02-18 15:09:07,508 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 15:09:07,516 - lg_cotrain - INFO - D_l1: 130, D_l2: 120, D_LG: 4913
2026-02-18 15:09:07,521 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1202.28it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1182.40it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 15:09:26,466 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1022

[8/12] budget=25, seed=2 -- done (macro_f1=0.6554)
[PROGRESS] 8/120 (6.7%) | Elapsed: 1.95h | ETA: 27.24h
[9/12] budget=25, seed=3 -- starting...


2026-02-18 15:23:32,395 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=25, seed_set=3
2026-02-18 15:23:32,440 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 15:23:32,450 - lg_cotrain - INFO - D_l1: 130, D_l2: 120, D_LG: 4913
2026-02-18 15:23:32,453 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1056.38it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1168.84it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 15:23:51,506 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1079

[9/12] budget=25, seed=3 -- done (macro_f1=0.6418)
[PROGRESS] 9/120 (7.5%) | Elapsed: 2.18h | ETA: 26.90h
[10/12] budget=50, seed=1 -- starting...


2026-02-18 15:37:38,363 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=50, seed_set=1
2026-02-18 15:37:38,416 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 15:37:38,426 - lg_cotrain - INFO - D_l1: 250, D_l2: 250, D_LG: 4663
2026-02-18 15:37:38,428 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1164.94it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1286.11it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 15:37:57,719 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1094

[10/12] budget=50, seed=1 -- done (macro_f1=0.6379)
[PROGRESS] 10/120 (8.3%) | Elapsed: 2.41h | ETA: 26.52h
[11/12] budget=50, seed=2 -- starting...


2026-02-18 15:51:26,826 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=50, seed_set=2
2026-02-18 15:51:26,876 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 15:51:26,885 - lg_cotrain - INFO - D_l1: 250, D_l2: 250, D_LG: 4663
2026-02-18 15:51:26,887 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1243.79it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1249.68it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 15:51:46,145 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1000

[11/12] budget=50, seed=2 -- done (macro_f1=0.6578)
[PROGRESS] 11/120 (9.2%) | Elapsed: 2.64h | ETA: 26.17h
[12/12] budget=50, seed=3 -- starting...


2026-02-18 16:05:14,856 - lg_cotrain - INFO - Starting LG-CoTrain: event=california_wildfires_2018, budget=50, seed_set=3
2026-02-18 16:05:14,900 - lg_cotrain - INFO - Detected 10 classes for event california_wildfires_2018: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 16:05:14,911 - lg_cotrain - INFO - D_l1: 250, D_l2: 250, D_LG: 4663
2026-02-18 16:05:14,911 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1147.85it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1155.68it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 16:05:34,349 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1073

[12/12] budget=50, seed=3 -- done (macro_f1=0.6349)
[PROGRESS] 12/120 (10.0%) | Elapsed: 2.88h | ETA: 25.94h

Batch complete: 12 ran, 0 skipped, 0 failed (10373.9s total)

=== Results for california_wildfires_2018 ===

Budget    Seed 1              Seed 2              Seed 3                  Mean       Std
           ErrR%  MacF1   ErrR%  MacF1   ErrR%  MacF1     ErrR%     MacF1
-------------------------------------------------------------------------
     5     26.90 0.6405   28.54 0.6377   27.58 0.6231  27.68+/-0.83   0.6338+/-0.0093
    10     27.24 0.6291   28.54 0.6440   29.77 0.6125  28.52+/-1.27   0.6286+/-0.0158
    25     27.79 0.6596   28.06 0.6554   27.38 0.6418  27.74+/-0.34   0.6522+/-0.0093
    50     29.91 0.6379   26.76 0.6578   28.75 0.6349  28.47+/-1.59   0.6435+/-0.0124

Event 2/10: canada_wildfires_2016
[1/12] budget=5, seed=1 -- starting...


2026-02-18 16:19:45,948 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=5, seed_set=1
2026-02-18 16:19:46,006 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 16:19:46,011 - lg_cotrain - INFO - D_l1: 24, D_l2: 16, D_LG: 1529
2026-02-18 16:19:46,012 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1063.32it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1005.63it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 16:19:52,787 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1048, mean_prob2=0.1358
2026-02-18 16:19:58,273 - lg_cotrain - INFO -

[1/12] budget=5, seed=1 -- done (macro_f1=0.5683)
[PROGRESS] 13/120 (10.8%) | Elapsed: 2.96h | ETA: 24.34h
[2/12] budget=5, seed=2 -- starting...


2026-02-18 16:24:14,344 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=5, seed_set=2
2026-02-18 16:24:14,375 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 16:24:14,380 - lg_cotrain - INFO - D_l1: 24, D_l2: 16, D_LG: 1529
2026-02-18 16:24:14,380 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1042.84it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|████████████████| 199/199 [00:00<00:00, 961.99it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 16:24:21,504 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1398, mean_prob2=0.1292
2026-02-18 16:24:27,234 - lg_cotrain - INFO -

[2/12] budget=5, seed=2 -- done (macro_f1=0.5202)
[PROGRESS] 14/120 (11.7%) | Elapsed: 3.03h | ETA: 22.97h
[3/12] budget=5, seed=3 -- starting...


2026-02-18 16:28:51,436 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=5, seed_set=3
2026-02-18 16:28:51,463 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 16:28:51,467 - lg_cotrain - INFO - D_l1: 24, D_l2: 16, D_LG: 1529
2026-02-18 16:28:51,468 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1159.62it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1133.46it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 16:28:58,021 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1292, mean_prob2=0.1199
2026-02-18 16:29:03,698 - lg_cotrain - INFO -

[3/12] budget=5, seed=3 -- done (macro_f1=0.5625)
[PROGRESS] 15/120 (12.5%) | Elapsed: 3.11h | ETA: 21.77h
[4/12] budget=10, seed=1 -- starting...


2026-02-18 16:33:21,493 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=10, seed_set=1
2026-02-18 16:33:21,526 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 16:33:21,532 - lg_cotrain - INFO - D_l1: 40, D_l2: 40, D_LG: 1489
2026-02-18 16:33:21,533 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1133.13it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1077.62it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 16:33:28,219 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1075, mean_prob2=0.1384
2026-02-18 16:33:33,769 - lg_cotrain - INFO 

[4/12] budget=10, seed=1 -- done (macro_f1=0.6029)
[PROGRESS] 16/120 (13.3%) | Elapsed: 3.18h | ETA: 20.70h
[5/12] budget=10, seed=2 -- starting...


2026-02-18 16:37:50,185 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=10, seed_set=2
2026-02-18 16:37:50,212 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 16:37:50,217 - lg_cotrain - INFO - D_l1: 40, D_l2: 40, D_LG: 1489
2026-02-18 16:37:50,219 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1161.52it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1156.27it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 16:37:56,781 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1304, mean_prob2=0.1303
2026-02-18 16:38:02,373 - lg_cotrain - INFO 

[5/12] budget=10, seed=2 -- done (macro_f1=0.5938)
[PROGRESS] 17/120 (14.2%) | Elapsed: 3.26h | ETA: 19.75h
[6/12] budget=10, seed=3 -- starting...


2026-02-18 16:42:21,062 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=10, seed_set=3
2026-02-18 16:42:21,093 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 16:42:21,098 - lg_cotrain - INFO - D_l1: 40, D_l2: 40, D_LG: 1489
2026-02-18 16:42:21,099 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1034.52it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1012.78it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 16:42:27,885 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1236, mean_prob2=0.1257
2026-02-18 16:42:33,428 - lg_cotrain - INFO 

[6/12] budget=10, seed=3 -- done (macro_f1=0.6116)
[PROGRESS] 18/120 (15.0%) | Elapsed: 3.33h | ETA: 18.89h
[7/12] budget=25, seed=1 -- starting...


2026-02-18 16:46:51,381 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=25, seed_set=1
2026-02-18 16:46:51,408 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 16:46:51,413 - lg_cotrain - INFO - D_l1: 98, D_l2: 91, D_LG: 1380
2026-02-18 16:46:51,414 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1062.22it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1120.91it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 16:46:58,416 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1074, mean_prob2=0.1377
2026-02-18 16:47:04,087 - lg_cotrain - INFO 

[7/12] budget=25, seed=1 -- done (macro_f1=0.6072)
[PROGRESS] 19/120 (15.8%) | Elapsed: 3.41h | ETA: 18.10h
[8/12] budget=25, seed=2 -- starting...


2026-02-18 16:51:06,741 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=25, seed_set=2
2026-02-18 16:51:06,769 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 16:51:06,774 - lg_cotrain - INFO - D_l1: 98, D_l2: 91, D_LG: 1380
2026-02-18 16:51:06,774 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1210.95it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1157.01it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 16:51:13,572 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1423, mean_prob2=0.1325
2026-02-18 16:51:19,165 - lg_cotrain - INFO 

[8/12] budget=25, seed=2 -- done (macro_f1=0.5994)
[PROGRESS] 20/120 (16.7%) | Elapsed: 3.48h | ETA: 17.39h
[9/12] budget=25, seed=3 -- starting...


2026-02-18 16:55:31,574 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=25, seed_set=3
2026-02-18 16:55:31,602 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 16:55:31,607 - lg_cotrain - INFO - D_l1: 98, D_l2: 91, D_LG: 1380
2026-02-18 16:55:31,608 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1213.58it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1186.71it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 16:55:38,309 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1237, mean_prob2=0.1271
2026-02-18 16:55:43,919 - lg_cotrain - INFO 

[9/12] budget=25, seed=3 -- done (macro_f1=0.6146)
[PROGRESS] 21/120 (17.5%) | Elapsed: 3.55h | ETA: 16.75h
[10/12] budget=50, seed=1 -- starting...


2026-02-18 16:59:55,324 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=50, seed_set=1
2026-02-18 16:59:55,334 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 16:59:55,334 - lg_cotrain - INFO - D_l1: 182, D_l2: 182, D_LG: 1205
2026-02-18 16:59:55,334 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1181.12it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1137.46it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 17:00:02,381 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1182, mean_prob2=0.1448
2026-02-18 17:00:08,231 - lg_cotrain - INF

[10/12] budget=50, seed=1 -- done (macro_f1=0.6098)
[PROGRESS] 22/120 (18.3%) | Elapsed: 3.63h | ETA: 16.15h
[11/12] budget=50, seed=2 -- starting...


2026-02-18 17:04:22,056 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=50, seed_set=2
2026-02-18 17:04:22,086 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 17:04:22,092 - lg_cotrain - INFO - D_l1: 182, D_l2: 182, D_LG: 1205
2026-02-18 17:04:22,093 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1030.63it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1144.36it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 17:04:29,848 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1365, mean_prob2=0.1344
2026-02-18 17:04:36,222 - lg_cotrain - INF

[11/12] budget=50, seed=2 -- done (macro_f1=0.6061)
[PROGRESS] 23/120 (19.2%) | Elapsed: 3.70h | ETA: 15.59h
[12/12] budget=50, seed=3 -- starting...


2026-02-18 17:08:39,639 - lg_cotrain - INFO - Starting LG-CoTrain: event=canada_wildfires_2016, budget=50, seed_set=3
2026-02-18 17:08:39,667 - lg_cotrain - INFO - Detected 8 classes for event canada_wildfires_2016: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 17:08:39,672 - lg_cotrain - INFO - D_l1: 182, D_l2: 182, D_LG: 1205
2026-02-18 17:08:39,673 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1125.20it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1104.85it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 17:08:47,012 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1162, mean_prob2=0.1397
2026-02-18 17:08:53,222 - lg_cotrain - INF

[12/12] budget=50, seed=3 -- done (macro_f1=0.6280)
[PROGRESS] 24/120 (20.0%) | Elapsed: 3.77h | ETA: 15.09h

Batch complete: 12 ran, 0 skipped, 0 failed (3202.2s total)

=== Results for canada_wildfires_2016 ===

Budget    Seed 1              Seed 2              Seed 3                  Mean       Std
           ErrR%  MacF1   ErrR%  MacF1   ErrR%  MacF1     ErrR%     MacF1
-------------------------------------------------------------------------
     5     22.70 0.5683   31.91 0.5202   26.29 0.5625  26.97+/-4.64   0.5503+/-0.0262
    10     24.49 0.6029   22.92 0.5938   23.82 0.6116  23.75+/-0.79   0.6027+/-0.0089
    25     23.82 0.6072   23.60 0.5994   21.80 0.6146  23.07+/-1.11   0.6071+/-0.0076
    50     21.35 0.6098   24.27 0.6061   22.47 0.6280  22.70+/-1.47   0.6146+/-0.0117

Event 3/10: cyclone_idai_2019
[1/12] budget=5, seed=1 -- starting...


2026-02-18 17:13:08,264 - lg_cotrain - INFO - Starting LG-CoTrain: event=cyclone_idai_2019, budget=5, seed_set=1
2026-02-18 17:13:08,299 - lg_cotrain - INFO - Detected 10 classes for event cyclone_idai_2019: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 17:13:08,305 - lg_cotrain - INFO - D_l1: 30, D_l2: 20, D_LG: 2703
2026-02-18 17:13:08,307 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1000.51it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1084.58it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 17:13:20,034 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.0924, mean_prob2=0.1170

[1/12] budget=5, seed=1 -- done (macro_f1=0.5733)
[PROGRESS] 25/120 (20.8%) | Elapsed: 3.90h | ETA: 14.83h
[2/12] budget=5, seed=2 -- starting...


2026-02-18 17:20:54,938 - lg_cotrain - INFO - Starting LG-CoTrain: event=cyclone_idai_2019, budget=5, seed_set=2
2026-02-18 17:20:54,977 - lg_cotrain - INFO - Detected 10 classes for event cyclone_idai_2019: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 17:20:54,983 - lg_cotrain - INFO - D_l1: 30, D_l2: 20, D_LG: 2703
2026-02-18 17:20:54,985 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1115.40it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1111.41it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 17:21:05,763 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1068, mean_prob2=0.1222

[2/12] budget=5, seed=2 -- done (macro_f1=0.6294)
[PROGRESS] 26/120 (21.7%) | Elapsed: 4.03h | ETA: 14.58h
[3/12] budget=5, seed=3 -- starting...


2026-02-18 17:28:42,536 - lg_cotrain - INFO - Starting LG-CoTrain: event=cyclone_idai_2019, budget=5, seed_set=3
2026-02-18 17:28:42,582 - lg_cotrain - INFO - Detected 10 classes for event cyclone_idai_2019: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 17:28:42,590 - lg_cotrain - INFO - D_l1: 30, D_l2: 20, D_LG: 2703
2026-02-18 17:28:42,592 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1004.37it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1153.02it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 17:28:53,878 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.0998, mean_prob2=0.0980

[3/12] budget=5, seed=3 -- done (macro_f1=0.5878)
[PROGRESS] 27/120 (22.5%) | Elapsed: 4.16h | ETA: 14.34h
[4/12] budget=10, seed=1 -- starting...


2026-02-18 17:36:35,335 - lg_cotrain - INFO - Starting LG-CoTrain: event=cyclone_idai_2019, budget=10, seed_set=1
2026-02-18 17:36:35,360 - lg_cotrain - INFO - Detected 10 classes for event cyclone_idai_2019: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 17:36:35,367 - lg_cotrain - INFO - D_l1: 50, D_l2: 50, D_LG: 2653
2026-02-18 17:36:35,368 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1148.81it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1084.30it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 17:36:46,675 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.0945, mean_prob2=0.112

[4/12] budget=10, seed=1 -- done (macro_f1=0.6149)
[PROGRESS] 28/120 (23.3%) | Elapsed: 4.29h | ETA: 14.10h
[5/12] budget=10, seed=2 -- starting...


2026-02-18 17:44:18,466 - lg_cotrain - INFO - Starting LG-CoTrain: event=cyclone_idai_2019, budget=10, seed_set=2
2026-02-18 17:44:18,493 - lg_cotrain - INFO - Detected 10 classes for event cyclone_idai_2019: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 17:44:18,508 - lg_cotrain - INFO - D_l1: 50, D_l2: 50, D_LG: 2653
2026-02-18 17:44:18,509 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1051.14it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1186.38it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 17:44:29,274 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1078, mean_prob2=0.107

[5/12] budget=10, seed=2 -- done (macro_f1=0.5451)
[PROGRESS] 29/120 (24.2%) | Elapsed: 4.42h | ETA: 13.87h
[6/12] budget=10, seed=3 -- starting...


2026-02-18 17:52:03,554 - lg_cotrain - INFO - Starting LG-CoTrain: event=cyclone_idai_2019, budget=10, seed_set=3
2026-02-18 17:52:03,599 - lg_cotrain - INFO - Detected 10 classes for event cyclone_idai_2019: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 17:52:03,606 - lg_cotrain - INFO - D_l1: 50, D_l2: 50, D_LG: 2653
2026-02-18 17:52:03,607 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1130.88it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1161.36it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 17:52:14,281 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.0962, mean_prob2=0.103

[6/12] budget=10, seed=3 -- done (macro_f1=0.6207)
[PROGRESS] 30/120 (25.0%) | Elapsed: 4.55h | ETA: 13.65h
[7/12] budget=25, seed=1 -- starting...


2026-02-18 17:59:49,740 - lg_cotrain - INFO - Starting LG-CoTrain: event=cyclone_idai_2019, budget=25, seed_set=1
2026-02-18 17:59:49,763 - lg_cotrain - INFO - Detected 10 classes for event cyclone_idai_2019: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 17:59:49,769 - lg_cotrain - INFO - D_l1: 124, D_l2: 114, D_LG: 2515
2026-02-18 17:59:49,771 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1223.06it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1116.50it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 18:00:00,835 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.0994, mean_prob2=0.1

[7/12] budget=25, seed=1 -- done (macro_f1=0.6207)
[PROGRESS] 31/120 (25.8%) | Elapsed: 4.68h | ETA: 13.44h
[8/12] budget=25, seed=2 -- starting...


2026-02-18 18:07:34,560 - lg_cotrain - INFO - Starting LG-CoTrain: event=cyclone_idai_2019, budget=25, seed_set=2
2026-02-18 18:07:34,601 - lg_cotrain - INFO - Detected 10 classes for event cyclone_idai_2019: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 18:07:34,609 - lg_cotrain - INFO - D_l1: 124, D_l2: 114, D_LG: 2515
2026-02-18 18:07:34,610 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1127.75it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1240.80it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 18:07:45,462 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1017, mean_prob2=0.1

[8/12] budget=25, seed=2 -- done (macro_f1=0.6023)
[PROGRESS] 32/120 (26.7%) | Elapsed: 4.80h | ETA: 13.21h
[9/12] budget=25, seed=3 -- starting...


2026-02-18 18:15:00,748 - lg_cotrain - INFO - Starting LG-CoTrain: event=cyclone_idai_2019, budget=25, seed_set=3
2026-02-18 18:15:00,786 - lg_cotrain - INFO - Detected 10 classes for event cyclone_idai_2019: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 18:15:00,792 - lg_cotrain - INFO - D_l1: 124, D_l2: 114, D_LG: 2515
2026-02-18 18:15:00,793 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1206.68it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1190.31it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 18:15:11,680 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1000, mean_prob2=0.1

[9/12] budget=25, seed=3 -- done (macro_f1=0.6124)
[PROGRESS] 33/120 (27.5%) | Elapsed: 4.93h | ETA: 12.99h
[10/12] budget=50, seed=1 -- starting...


2026-02-18 18:22:24,949 - lg_cotrain - INFO - Starting LG-CoTrain: event=cyclone_idai_2019, budget=50, seed_set=1
2026-02-18 18:22:24,971 - lg_cotrain - INFO - Detected 10 classes for event cyclone_idai_2019: ['caution_and_advice', 'displaced_people_and_evacuations', 'infrastructure_and_utility_damage', 'injured_or_dead_people', 'missing_or_found_people', 'not_humanitarian', 'other_relevant_information', 'requests_or_urgent_needs', 'rescue_volunteering_or_donation_effort', 'sympathy_and_support']
2026-02-18 18:22:24,971 - lg_cotrain - INFO - D_l1: 227, D_l2: 226, D_LG: 2300
2026-02-18 18:22:24,971 - lg_cotrain - INFO - === Phase 1: Weight Generation ===
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1188.14it/s, Materializing param=bert.pooler.dense.weight]
Loading weights: 100%|███████████████| 199/199 [00:00<00:00, 1216.71it/s, Materializing param=bert.pooler.dense.weight]
2026-02-18 18:22:36,569 - lg_cotrain - INFO - Phase 1 epoch 1/7: mean_prob1=0.1059, mean_prob2=0.1

## Cross-Disaster Results

We now aggregate results across all events to compare how the pipeline
performs on different disaster types and how performance scales with the
labeled data budget.
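
As a sanity check on the aggregation, the sketch below recomputes the Mean and Std columns for `canada_wildfires_2016` at budget 5 from the three per-seed values printed in the per-event table above. It assumes the spread is the sample standard deviation (`statistics.stdev`, n-1 denominator). The helper name `mean_and_std` is illustrative and not part of `lg_cotrain`.

```python
import statistics

# Per-seed values for canada_wildfires_2016, budget=5, copied from the log above
macro_f1 = [0.5683, 0.5202, 0.5625]
error_rate = [22.70, 31.91, 26.29]


def mean_and_std(values):
    """Return (mean, sample std); std is None with fewer than two values."""
    mean = statistics.mean(values)
    std = statistics.stdev(values) if len(values) >= 2 else None
    return mean, std


f1_mean, f1_std = mean_and_std(macro_f1)
err_mean, err_std = mean_and_std(error_rate)
print(f"MacF1: {f1_mean:.4f}+/-{f1_std:.4f}")
print(f"ErrR%: {err_mean:.2f}+/-{err_std:.2f}")
```

The table reports 0.5503+/-0.0262 and 26.97+/-4.64. The recomputed values should agree up to rounding of the already-rounded inputs, whereas a population standard deviation (`statistics.pstdev`) would give a visibly smaller spread.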

In [None]:
# Build cross-disaster summary: event -> budget -> mean macro-F1
summary = {}
for event in sorted(all_event_results.keys()):
    results = all_event_results[event]
    by_budget = {b: [] for b in BUDGETS}
    for r in results:
        if r is not None:
            by_budget[r["budget"]].append(r)
    summary[event] = {}
    for b in BUDGETS:
        f1s = [r["test_macro_f1"] for r in by_budget[b]]
        errs = [r["test_error_rate"] for r in by_budget[b]]
        summary[event][b] = {
            "f1_mean": statistics.mean(f1s) if f1s else None,
            "f1_std": statistics.stdev(f1s) if len(f1s) >= 2 else None,
            "err_mean": statistics.mean(errs) if errs else None,
            "err_std": statistics.stdev(errs) if len(errs) >= 2 else None,
            "n_seeds": len(f1s),
        }

# Print grand summary table
header = f"{'Event':<35}"
for b in BUDGETS:
    header += f" | B={b:<11}"
print(header)
print("-" * len(header))

for event in sorted(summary.keys()):
    row = f"{event:<35}"
    for b in BUDGETS:
        s = summary[event][b]
        if s["f1_mean"] is not None and s["f1_std"] is not None:
            cell = f"{s['f1_mean']:.3f}+/-{s['f1_std']:.3f}"
        elif s["f1_mean"] is not None:
            cell = f"{s['f1_mean']:.3f}"
        else:
            cell = "N/A"
        # Pad every cell to the 13-character header column ("B=" plus 11)
        row += f" | {cell:<13}"
    print(row)

# Line plot: Macro-F1 by budget, one line per event
fig, ax = plt.subplots(figsize=(10, 6))

for event in sorted(summary.keys()):
    cells = [summary[event][b] for b in BUDGETS]
    # Missing means become NaN so gaps are not drawn as a score of 0
    means = [np.nan if c["f1_mean"] is None else c["f1_mean"] for c in cells]
    stds = [c["f1_std"] or 0 for c in cells]
    ax.errorbar(BUDGETS, means, yerr=stds, marker="o", capsize=3, label=event)

ax.set_xlabel("Budget (labeled samples per class)")
ax.set_ylabel("Test Macro-F1 (mean +/- std across seeds)")
ax.set_title(f"LG-CoTrain Performance — {PSEUDO_LABEL_SOURCE}")
ax.set_xticks(BUDGETS)
ax.legend(bbox_to_anchor=(1.05, 1), loc="upper left", fontsize=8)
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Heatmap: events (rows) x budgets (columns), colored by mean macro-F1
events_sorted = sorted(summary.keys())
# NaN marks cells without results so they are not colored as a score of 0
heatmap_data = np.full((len(events_sorted), len(BUDGETS)), np.nan)

for i, event in enumerate(events_sorted):
    for j, b in enumerate(BUDGETS):
        val = summary[event][b]["f1_mean"]
        if val is not None:
            heatmap_data[i, j] = val

fig, ax = plt.subplots(figsize=(8, 8))
im = ax.imshow(heatmap_data, cmap="YlOrRd", aspect="auto")

ax.set_xticks(range(len(BUDGETS)))
ax.set_xticklabels([f"B={b}" for b in BUDGETS])
ax.set_yticks(range(len(events_sorted)))
ax.set_yticklabels(events_sorted, fontsize=9)
ax.set_title(f"Mean Test Macro-F1 — {PSEUDO_LABEL_SOURCE}")

for i in range(len(events_sorted)):
    for j in range(len(BUDGETS)):
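        val = heatmap_data[i, j]
        missing = summary[events_sorted[i]][BUDGETS[j]]["f1_mean"] is None
        color = "white" if val > 0.6 else "black"
        # Label cells that have no results as N/A instead of a numeric score
        label = "N/A" if missing else f"{val:.3f}"
        ax.text(j, i, label, ha="center", va="center", color=color, fontsize=9)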

fig.colorbar(im, ax=ax, label="Macro-F1")
plt.tight_layout()
plt.show()

In [None]:
from lg_cotrain.dashboard import collect_all_metrics, generate_html

metrics = collect_all_metrics(RESULTS_ROOT)
html = generate_html(metrics, RESULTS_ROOT)
dashboard_path = Path(RESULTS_ROOT) / "dashboard.html"
dashboard_path.parent.mkdir(parents=True, exist_ok=True)
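# Write as UTF-8 explicitly; the platform default (e.g. cp1252 on Windows) can fail on non-ASCII HTML
dashboard_path.write_text(html, encoding="utf-8")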
print(f"Dashboard written to: {dashboard_path}")
print(f"Metrics loaded: {len(metrics)} experiments")

## Summary

This notebook ran the 12 experiments (4 budgets x 3 seed sets) for each of the 10 disaster events using:
- **Pseudo-label source**: configured via `PSEUDO_LABEL_SOURCE`
- **Results folder**: `results/{RUN_NAME}/`

Results are stored separately from previous runs, enabling side-by-side
comparison via the multi-tab dashboard.
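
For a quick numeric comparison outside the dashboard, two run folders can also be diffed directly. The following is a minimal sketch, not part of `lg_cotrain`. It assumes each experiment writes a `metrics.json` somewhere under its run folder and that the file contains `event`, `budget`, and `test_macro_f1` keys (the `event` key in particular is an assumption). The function names and the second folder name in the usage comment are hypothetical.

```python
import json
import statistics
from collections import defaultdict
from pathlib import Path


def mean_f1_by_event_budget(run_dir):
    """Return {(event, budget): mean test macro-F1} for one run folder."""
    grouped = defaultdict(list)
    # Assumed layout: metrics.json files anywhere below the run folder
    for path in Path(run_dir).rglob("metrics.json"):
        m = json.loads(path.read_text(encoding="utf-8"))
        grouped[(m["event"], m["budget"])].append(m["test_macro_f1"])
    return {key: statistics.mean(vals) for key, vals in grouped.items()}


def compare_runs(run_a, run_b):
    """Return {(event, budget): f1_b - f1_a} for cells present in both runs."""
    a = mean_f1_by_event_budget(run_a)
    b = mean_f1_by_event_budget(run_b)
    return {key: b[key] - a[key] for key in sorted(a.keys() & b.keys())}


# Example usage (folder names are illustrative):
# for (event, budget), delta in compare_runs(
#     "results/gpt-4o-run1", "results/llama-3-run1"
# ).items():
#     print(f"{event:<35} B={budget:<3} delta_f1={delta:+.4f}")
```

Only `(event, budget)` cells present in both runs are compared, so a partially completed run does not raise a `KeyError`.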

### CLI equivalent
```bash
# Run all experiments for all events with custom pseudo-label source and output folder
python -m lg_cotrain.run_experiment \
    --events california_wildfires_2018 canada_wildfires_2016 cyclone_idai_2019 \
            hurricane_dorian_2019 hurricane_florence_2018 hurricane_harvey_2017 \
            hurricane_irma_2017 hurricane_maria_2017 kaikoura_earthquake_2016 \
            kerala_floods_2018 \
    --pseudo-label-source gpt-4o \
    --output-folder results/gpt-4o-run1
```

### Generate multi-tab dashboard from the CLI
```bash
python -m lg_cotrain.dashboard --results-root results/
```