# Training, Evaluation, and Logging

This notebook documents how experiments are currently run and which artifacts each run produces.


## 1) Canonical Training Entry

- `run_experiment6_tape` in `src/notebook_helpers/tcn_phase1.py`.

Main cycle:

- rollout collection,
- PPO updates,
- curriculum updates,
- per-episode and summary metric logging,
- checkpointing and run-manifest JSON output.


## 2) Current PPO Defaults

- `max_total_timesteps = 100000`
- `timesteps_per_ppo_update = 250`
- `num_ppo_epochs = 5`
- `batch_size_ppo = 256`
- `actor_lr = 7e-4`
- `critic_lr = 7e-4`
- `entropy_coef = 0.01`
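
To make the update cadence implied by these defaults concrete, here is a minimal sketch that derives the number of PPO updates and gradient steps. The `PPODefaults` dataclass and `update_schedule` helper are hypothetical names used only for illustration. The sketch assumes each rollout buffer holds exactly `timesteps_per_ppo_update` transitions and that minibatches are cut with ceiling division; the real trainer may batch differently.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PPODefaults:
    # Values copied from the defaults listed above.
    max_total_timesteps: int = 100_000
    timesteps_per_ppo_update: int = 250
    num_ppo_epochs: int = 5
    batch_size_ppo: int = 256
    actor_lr: float = 7e-4
    critic_lr: float = 7e-4
    entropy_coef: float = 0.01


def update_schedule(cfg: PPODefaults) -> dict:
    """Derive update counts (assumption: ceiling-division minibatching)."""
    num_updates = cfg.max_total_timesteps // cfg.timesteps_per_ppo_update
    minibatches_per_epoch = math.ceil(
        cfg.timesteps_per_ppo_update / cfg.batch_size_ppo
    )
    return {
        "num_updates": num_updates,
        "minibatches_per_epoch": minibatches_per_epoch,
        "gradient_steps": num_updates * cfg.num_ppo_epochs * minibatches_per_epoch,
    }


cfg = PPODefaults()
schedule = update_schedule(cfg)
print(schedule)
```

Under these assumptions the run performs 400 PPO updates. Because `batch_size_ppo` (256) exceeds `timesteps_per_ppo_update` (250), each epoch would reduce to a single full-batch step, which is worth keeping in mind when tuning either value.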


## 3) Evaluation Behavior

Primary function:

- `evaluate_experiment6_checkpoint`

Tracks:

- deterministic evaluation (both `det_mode` and `det_mean` are reported when mode comparison is enabled),
- stochastic evaluation (sampled actions across multiple runs).

Important toggles:

- `deterministic_eval_mode` / `compare_deterministic_modes`,
- `stochastic_eval_mode`,
- `num_eval_runs`,
- `checkpoint_path_override`.
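
The sketch below illustrates one way the deterministic and stochastic modes could differ. It rests on an assumption not stated in this notebook: that the policy outputs Dirichlet concentration parameters, which the names `det_mode`, `det_mean`, and the `alpha` diagnostics suggest but do not confirm. All function names here are hypothetical and are not taken from `evaluate_experiment6_checkpoint`.

```python
import random


def dirichlet_mean(alpha):
    """Mean of Dirichlet(alpha): alpha_i / sum(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]


def dirichlet_mode(alpha):
    """Closed-form mode, defined only when every alpha_i > 1; else None."""
    if any(a <= 1.0 for a in alpha):
        return None
    denom = sum(alpha) - len(alpha)
    return [(a - 1.0) / denom for a in alpha]


def dirichlet_sample(alpha, rng):
    """Draw one sample by normalizing independent Gamma(alpha_i, 1) variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


alpha = [4.0, 2.0, 2.0]      # illustrative concentrations, not real model output
num_eval_runs = 5            # mirrors the toggle name above; the value is arbitrary
rng = random.Random(0)

det_mean = dirichlet_mean(alpha)
det_mode = dirichlet_mode(alpha)
stochastic = [dirichlet_sample(alpha, rng) for _ in range(num_eval_runs)]

print("det_mean:", det_mean)
print("det_mode:", det_mode)
print("stochastic runs:", len(stochastic))
```

If the Dirichlet assumption holds, the mode is undefined whenever any concentration is at or below 1, which would make a mean-based deterministic action a natural point of comparison.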


## 4) Core Logged Outputs

- metadata JSON manifest,
- per-episode CSV,
- summary CSV,
- actor/critic checkpoint weights,
- evaluation CSV rows with deterministic and stochastic metrics.
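
As a starting point for inspecting these outputs, here is a minimal sketch that loads the most recent metadata manifest from a log directory. The helper name `load_latest_manifest` is hypothetical. The sketch assumes manifests match `*_metadata.json`, that their file names sort chronologically, and that each file contains a top-level JSON object; none of this is confirmed by the training code.

```python
import json
from pathlib import Path


def load_latest_manifest(log_dir):
    """Return (file_name, parsed_json) for the last manifest by name, or None."""
    paths = sorted(Path(log_dir).glob("*_metadata.json"))
    if not paths:
        return None
    latest = paths[-1]
    with latest.open(encoding="utf-8") as fh:
        return latest.name, json.load(fh)


result = load_latest_manifest("tcn_results/logs")
if result is None:
    print("No metadata manifests found.")
else:
    name, manifest = result
    # Assumption: the manifest is a JSON object, so sorted() lists its keys.
    print(name, sorted(manifest))
```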


## 5) Diagnostics to Watch

Common diagnostics include:

- `action_uniques`,
- `argmax_alpha_uniques`,
- `alpha_le1_fraction`,
- `drawdown_lambda` snapshots,
- turnover and max drawdown trends.
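
For reference, a few of these diagnostics can be computed from raw series as sketched below. These are common textbook definitions, not necessarily the exact ones the logging code uses: turnover conventions vary (some halve the L1 sum), and the rounding used to count unique actions is an arbitrary choice. The function names are hypothetical.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline as a fraction of the running peak.

    Assumes a non-empty series of positive portfolio values.
    """
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def turnover(weights):
    """Sum of absolute weight changes between consecutive steps."""
    return sum(
        sum(abs(curr_w - prev_w) for prev_w, curr_w in zip(prev, curr))
        for prev, curr in zip(weights, weights[1:])
    )


def unique_action_count(actions, decimals=4):
    """Number of distinct action vectors after rounding each component."""
    return len({tuple(round(x, decimals) for x in a) for a in actions})


# Illustrative inputs only; these are not real run outputs.
equity = [100.0, 120.0, 90.0, 110.0, 80.0]
weights = [[0.5, 0.5], [0.6, 0.4], [0.6, 0.4]]

print("max drawdown:", max_drawdown(equity))
print("turnover:", turnover(weights))
print("unique actions:", unique_action_count(weights))
```

A very low unique-action count across an evaluation run may indicate that the policy has collapsed to a near-constant output, so it is useful to read it alongside the turnover trend.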


```python
from pathlib import Path

logs = Path('tcn_results/logs')
if logs.exists():
    metas = sorted(logs.glob('*_metadata.json'))
    print('metadata files:', len(metas))
    for p in metas[-5:]:
        print(' -', p.name)
else:
    print('No tcn_results/logs folder found in this workspace.')
```
