v0.3.2
[0.3.2] - 2026-03-11
Added
- Adaptive learning rate controller: EMA-based z-score spike detection, patience-based plateau detection, and linear regression divergence detection — automatically adjusts LR multiplier during training to recover from loss spikes, reduce LR on plateaus, and halt on divergence
- Manual LR override via TUI: Press
Lin Training, Distillation, or GRPO tabs to set a custom learning rate mid-run; uses atomic control file protocol ({output_dir}/.lr_control.json) for safe subprocess communication - WSD (Warmup-Stable-Decay) scheduler: New
LrSchedulerType::Wsdwith configurablestable_ratio— holds peak LR for a plateau phase before linear decay, popular for large-scale pretraining - GRPO adaptive LR + callbacks:
GrpoTrainernow supports adaptive LR,TrainingCallbacklifecycle events, andStepMetricsemission for live TUI monitoring - HuggingFace Hub search (
pmetal search): CLI command and TUI integration (pressSin Models tab) to search HF Hub for text-generation models with download counts, parameter estimates, and memory fit assessment - Memory fit estimation: New
pmetal-hubmodule estimates inference/training memory requirements, tok/s throughput, and color-coded fit levels (green/yellow/red) based on device specs and model architecture - Model detail panel: Models tab shows memory breakdown — weights, KV cache, overhead, training estimate, and recommended batch size
- Distillation metrics callbacks:
DistillationTrainernow emits step-by-step metrics viaTrainingCallback, enabling live TUI dashboard during distillation runs - Command logging in Jobs tab: Spawned commands are logged with the full CLI invocation for easier debugging
Fixed
- NaN/Inf loss guard: Adaptive LR skips EMA updates on non-finite losses to prevent EMA poisoning — returns scheduled LR unchanged
- EMA variance bias correction: Early-training z-scores now use bias-corrected variance (
raw_var / (1 - alpha^n)), matching Adam's moment correction — prevents false spike detection in first ~20 steps - Zero-variance z-score fallback: When loss variance is near zero (std_dev < 1e-8), uses absolute deviation threshold instead of division-by-zero; returns z=10 for >50% deviation, z=0 otherwise
- Atomic control file protocol: LR control file is renamed to
.lr_control.claimedbefore reading and deleted after — prevents race conditions between TUI writer and training subprocess reader - Distillation metrics LR: Distillation step metrics now report post-adaptive LR instead of pre-adjustment scheduled LR
- Adaptive LR in all training paths:
apply_adaptive_lr()now called inrun_metal_fused(),run_compiled(),run_jit_compiled(), andrun_packed()paths (was only inrun_standard()) - TUI LR override validation: LR range check now accepts 1.0 (was exclusive upper bound); shows error modal on invalid input instead of silent log warning
- Distillation/GRPO job routing: Status updates were always routed to the Training tab regardless of job type. Added
active_job_typetracking to route metrics, completion, and failure to the correct tab (Distill, GRPO, or Training) - Distillation CLI args: TUI sent
--lora-alphaand--log-metricsflags that the CLI didn't accept, causing immediate exit code 2. Added both args to theDistillcommand and--log-metricstoGrpo - Parquet dataset support in distill/GRPO: Distillation and GRPO commands only supported JSONL datasets. Now auto-detect
.parquetfiles and route to the parquet loader, matching the training command's behavior - Tab click targeting: Mouse clicks on Monitor, Inference, and Jobs tabs selected the wrong tab due to hardcoded fixed-width hit-testing. Now computes actual tab widths from rendered text
- Error diagnostics: Failed jobs now show the last 5 stderr lines in the tab status panel instead of just "Process exited with code N", with a hint to check the Jobs tab for full output
- UTF-8 safe string truncation:
truncate_strused byte indexing which panics on multi-byte characters; switched tochars()iterator - Leaked channel in HF search:
search_hf()created a sender/receiver pair even without a CommandRunner, silently dropping results - Integer overflow in fit estimation:
estimate_params_from_configused plain multiplication; switched tosaturating_mul/saturating_add - Context length truncation: u64→u32 cast could wrap for extreme values; capped at 1M before cast
Improved
- TUI tab ordering: System (formerly Device) is now the default first tab; Dashboard renamed to Monitor
- Empty state messaging: Monitor tab shows actionable guidance ("Start a run from Training, Distill, or GRPO tab") instead of "Waiting for training data..."
- Idle state hint: Tabs show "Press S to start" instead of "Press S to start training" (generic across all job types)
Security
- Bounded API responses:
bounded_json()caps HF API response bodies at 4MB to prevent heap exhaustion - Model ID validation:
is_valid_model_id()rejects path traversal, URL injection, and malformed values in HF API paths
Full Changelog: v0.3.1...v0.3.2