Release Backpropagate v1.1.0 · mcp-tool-shop-org/backpropagate

A minor release that takes the project from "polished v1" to "real v1" via a 10-wave dogfood swarm. It covers a bug and security pass, a proactive health pass, UX humanization, a full UI redesign (Gradio → Reflex), and 5 P0 features.

Added

Reflex web UI — the optional [ui] extra now installs Reflex (Radix UI) instead of Gradio. Pure-Python implementation, WebSocket-driven live state, refined Ocean Mist palette, full dark + light mode, WCAG 2.4.7 focus indicators, 30 SVG icons, heartbeat / sparkline / event-log / structured-error / recovery-banner patterns
Hugging Face Hub push — backprop push <local> --repo <owner/name> + backprop export --push-to-hub <repo> for one-shot export+push. Adapter-only by default; --include-base for the full merged model. Token resolution from HF_TOKEN / HUGGING_FACE_HUB_TOKEN / HF CLI cache. model_card.md is mirrored to the repo's README.md so HF picks it up as the model card
Resume from checkpoint — backprop resume <run_id> (and backprop train --resume <run_id> / backprop multi-run --resume) reconstructs a crashed or interrupted run from RunHistoryManager + the atomic checkpoint manifest. A 5-run multi-run that crashes at run 4 is now recoverable
Run history — RunHistoryManager is now actually wired into Trainer + MultiRunTrainer. New backprop list-runs (with --json, --status, --limit filters + aligned columns) and backprop show-run <run_id> (partial-prefix matching) subcommands surface the history
Model card generation — every export emits a model_card.md following the HF model-card schema, with full provenance (run_id, base model, dataset hash, seed, training duration, ASCII loss sparkline, Ship Gate trust signals). Opt out via --no-model-card
Experiment tracking auto-wired — [monitoring] extra (W&B, TensorBoard) now actually integrates. report_to defaults to "auto" (detect what's installed); the run shows up with name backprop-<run_id_short> for cross-system correlation
Atomic checkpoint writes — Trainer.save / SLAOMerger.save / export_lora / export_gguf all write to <path>.partial then rename to final. Disk-full mid-write no longer leaves corrupt artifacts
OOM auto-recovery — Trainer(oom_recovery=True) (default-on) halves batch_size + doubles gradient_accumulation_steps on torch.cuda.OutOfMemoryError, preserving effective batch. Aborts after 3 consecutive failures at batch=1
HF Hub transient retry — every from_pretrained / load_dataset / snapshot_download retries on 5xx / 429 / connection errors with exponential backoff. 401 / 403 / 404 surface in < 1s with cause-classified hints
GPU pause-on-overheat — Trainer(pause_on_overheat=True) now actually pauses training (the wiring was a no-op in v1.0)
Unsloth fallback — Trainer(unsloth_fallback=True) (default-on) falls back to AutoModelForCausalLM + peft on Unsloth failures
run_id correlation — every training run mints a UUID4 that flows through every log line + checkpoint manifest + SLAO merge record
Stable error codes — BackpropagateError.code is now an explicit Ship Gate registry-prefixed identifier on every subclass. 28-entry ERROR_CODES catalog visible via backprop info --error-codes. cause_category enum on ModelLoadError surfaces cause-specific remediation hints
CLI exit codes — proper 0 / 1 user-error / 2 runtime-error / 3 partial-success / 130 SIGINT contract
Stage C humanization — structured errors with actionable hints, progress feedback on long ops, bare backprop prints help, backprop info --json for support attachments, friendly first-run messages
CI hardening — every third-party GitHub Action SHA-pinned. PyPI publish via OIDC trusted publishing (Sigstore provenance). Docker image digest-pinned + HEALTHCHECK. Multi-OS test matrix (Linux + Windows + macOS + Python 3.13). pip-audit + Trivy + Bandit + Semgrep + TruffleHog all gate on findings
Documentation — new handbook pages: error-codes.md, troubleshooting.md, env-vars.md, cli-reference.md. README Troubleshooting + Reporting bugs + Web UI subsections. examples/quickstart.jsonl so the "3 lines" Quick Start runs on a clean install
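
The OOM auto-recovery entry above describes halving batch_size and doubling gradient_accumulation_steps so that the effective batch stays the same, with an abort after 3 consecutive failures at batch=1. A minimal sketch of that arithmetic follows, assuming even batch sizes. BatchConfig, shrink, run_with_oom_recovery, and FakeOutOfMemoryError are hypothetical names used for illustration, not Backpropagate's API; the stand-in exception replaces torch.cuda.OutOfMemoryError so the sketch runs without a GPU.

```python
from dataclasses import dataclass
from typing import Callable


class FakeOutOfMemoryError(RuntimeError):
    """Hypothetical stand-in for torch.cuda.OutOfMemoryError (no GPU needed)."""


@dataclass(frozen=True)
class BatchConfig:
    batch_size: int
    gradient_accumulation_steps: int

    @property
    def effective_batch(self) -> int:
        # Samples contributing to one optimizer update.
        return self.batch_size * self.gradient_accumulation_steps


def shrink(cfg: BatchConfig) -> BatchConfig:
    """Halve the per-step batch and double accumulation.

    The product is preserved exactly only when batch_size is even
    (an assumption of this sketch).
    """
    return BatchConfig(cfg.batch_size // 2, cfg.gradient_accumulation_steps * 2)


def run_with_oom_recovery(
    train_step: Callable[[BatchConfig], None],
    cfg: BatchConfig,
    max_failures_at_batch_one: int = 3,
) -> BatchConfig:
    """Retry train_step, shrinking the batch on each simulated OOM.

    Returns the configuration that finally succeeded. Re-raises once the
    step has failed max_failures_at_batch_one times in a row at batch_size == 1.
    """
    failures_at_one = 0
    while True:
        try:
            train_step(cfg)
            return cfg
        except FakeOutOfMemoryError:
            if cfg.batch_size > 1:
                cfg = shrink(cfg)
                continue
            failures_at_one += 1
            if failures_at_one >= max_failures_at_batch_one:
                raise
```

Starting from BatchConfig(8, 2) with a step that only fits batches of 2 or fewer, the loop tries batch sizes 8, 4, and 2 and settles on BatchConfig(2, 8), so the effective batch remains 16 throughout.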

Changed

Default model — Trainer() (and backprop train / multi-run CLI defaults) now use Qwen/Qwen2.5-7B-Instruct instead of unsloth/Qwen2.5-7B-Instruct-bnb-4bit. The non-quantized form works without bitsandbytes; users who want the bnb-4bit speedup install [unsloth] and pass --model unsloth/... explicitly
safe_path stricter — absolute path + .. segment + no allowed_base argument now raises PathTraversalError instead of warn-only-and-pass-through
Multi-run validation-overlap fix — _get_data_chunk and _get_replay_samples now hard-cap at the train/validation boundary. Silent contamination is impossible; ConfigurationError surfaces a clear "reduce samples or increase dataset" hint
Random state isolation — multi-run replay sampling uses a local random.Random(seed) instead of mutating the global Python RNG
SLAO NaN/inf detection — SLAOMerger.merge raises SLAO_MERGE_DIVERGED with run_index + run_id + offending layer on non-finite weights
Rate limiter Address handling — _extract_client_ip now correctly reads .host from Starlette's Address namedtuple (was including :port, giving every TCP connection its own bucket)
UI output dir denylist — BACKPROPAGATE_UI__OUTPUT_DIR is validated against a denylist (/etc, ~/.ssh, etc.) on first use
--share + --auth gating — backprop ui --share now requires --auth user:pass (or explicit env-var opt-out with 5-second grace period + loud warning)
Scorecard re-audited — B (Error Handling) row 3/7* → 5/7. Total 23/31 → 25/31
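
The random state isolation entry above replaces calls on the global Python RNG with a local random.Random(seed). A minimal sketch of the pattern is shown here; sample_replay is a hypothetical helper for illustration, not the project's _get_replay_samples.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_replay(pool: Sequence[T], k: int, seed: int) -> List[T]:
    """Draw k replay samples without touching the module-level RNG.

    A private random.Random instance carries its own state, so callers
    that rely on random.seed(...) elsewhere are unaffected.
    """
    rng = random.Random(seed)  # local generator, seeded per call
    return rng.sample(pool, k)


# The global generator's state is identical before and after the call.
random.seed(123)
state_before = random.getstate()
picked = sample_replay(list(range(100)), k=5, seed=42)
assert random.getstate() == state_before
```

Because the generator is rebuilt from the seed on every call, the same seed and pool always yield the same samples, which keeps multi-run replay reproducible regardless of what other code does with the global RNG.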

Removed

Gradio web UI — moved to backpropagate/ui_gradio_legacy.py with a DEPRECATED docstring. Preserved for v1.1 reference; will be removed in v1.2. backpropagate.launch / create_backpropagate_theme / get_theme_info / get_css now raise ImportError with the migration message
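
One common way to keep removed entry points importable while failing loudly on use is a stub that raises ImportError. The sketch below illustrates that pattern under stated assumptions: make_removed_stub and the message wording are hypothetical, and the project's actual mechanism and migration text may differ.

```python
from typing import Any, Callable, NoReturn


def make_removed_stub(name: str) -> Callable[..., NoReturn]:
    """Build a placeholder that raises ImportError whenever it is called."""

    def stub(*args: Any, **kwargs: Any) -> NoReturn:
        # Hypothetical wording; the real migration message is not quoted in the notes.
        raise ImportError(
            f"{name} belonged to the Gradio UI, which has been replaced by the "
            "Reflex UI. The legacy code lives in backpropagate/ui_gradio_legacy.py."
        )

    stub.__name__ = name
    return stub


# The names below mirror the removed functions listed in the release notes.
launch = make_removed_stub("launch")
get_css = make_removed_stub("get_css")
```

Calling launch() in this sketch raises ImportError immediately, so callers written against the old Gradio API get a pointer to the replacement rather than an AttributeError.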

Tests

1654 → 1766 (+112): regression tests for every Stage A/B contract that landed and every P0 feature that shipped. Coverage threshold holds at 50%.
