Backpropagate v1.3.0

@github-actions github-actions released this 24 May 09:58
· 82 commits to main since this release
7733e36

Fixed

  • CLI flag-vs-runtime mismatches. Two CLI flags advertised functionality the runtime never delivered: (a) --host <addr> was accepted and validated but never threaded to the Reflex subprocess argv — the UI silently stayed loopback-only since v1.1.0; (b) --share was a no-op post-Reflex-migration (Gradio's gradio.com tunnel was removed in v1.1.0 and nothing replaced it). v1.3 fixes (a) by wiring --host through to the Reflex backend bind via the --backend-host argument that landed in reflex 0.9.2, and fixes (b) by implementing real cloudflared-based tunneling for --share (consumes the existing BACKPROPAGATE_UI_SHARE_HOST Origin-allowlist plumbing the auth middleware was already wired for). Operators relying on either flag should re-verify their deployment surface; the SSH-port-forward pattern in handbook/security.md remains an alternative for --share for operators without cloudflared installed.
  • Web UI auth-success now leaves an audit trail. v1.1.x had no log line for a successful cookie-set on the GHSA-f65r-h4g3-3h9h surface — operators could see failed-auth lines (close code 4401 / 4403) but never knew which cookie just succeeded. v1.3 emits one auth_success INFO line per session at the cookie-set sites (both TOKEN_AUTO and EXPLICIT_CREDS / PRODUCTION modes), with {user, mode, host} fields. Per-request validation passes log at DEBUG. No cookie value, no password, no Basic-header bytes are recorded; the line is safe to ship to a central log aggregator.
  • Test surface no longer silently green-passes on regression. Eight tests across the auth-middleware + SLAO-integration + GPU-emergency-callback + Hypothesis-property-test families were tautological: each wrapped its assertion in an if on the value under test, so the assertion was silently skipped whenever that value was falsy, and a regression that returned None from the helper would still report green. v1.3 converts them to an assert ... is not None precondition followed by property checks. Notable: test_token_lock_file_mode_0600 and the cookie-hardening test pair now skip-with-reason (not silently green) when the underlying surface is not yet wired; they re-engage when the Wave 5/6 auth-middleware polish lands.
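
The tautological-test fix in the last item above can be illustrated with a minimal sketch. The helper `load_token` and both check functions are hypothetical names chosen for illustration; they are not code from the repository, and the length check stands in for whatever property the real tests assert.

```python
from typing import Optional


def load_token(store: dict) -> Optional[str]:
    """Hypothetical helper under test; a regression could make it return None."""
    return store.get("token")


def tautological_check(store: dict) -> None:
    # Anti-pattern: the assertion only runs when the value is truthy,
    # so a helper that regresses to returning None still "passes".
    token = load_token(store)
    if token:
        assert len(token) >= 8


def precondition_check(store: dict) -> None:
    # Fixed shape: assert the precondition first, then check the property.
    token = load_token(store)
    assert token is not None, "load_token returned None"
    assert len(token) >= 8
```

With an empty store, `tautological_check` returns without asserting anything, while `precondition_check` raises `AssertionError`, which is the behavior a regression test needs.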

Added

  • Recommended isolated-install path documented in README. The Installation section now leads with pipx install backpropagate / uv tool install backpropagate as the recommended modes (isolated venv + automatic PATH integration), with pip install retained for users managing their own venv. The original pip install backpropagate[extra] table is preserved below as a reference.
  • bin/backpropagate.js is now a friendly-error shim. Running backpropagate after npm install -g backpropagate prints clear install guidance for the supported channels and exits with code 2 (configuration error). Operators who land here from an old README copy get a single screen of next-step commands rather than a silent download failure.
  • Coverage floor is now a single source of truth. pyproject.toml [tool.coverage.report].fail_under is authoritative; the CI workflow reads it via tomllib at run time so bumping the floor in one place takes effect in both surfaces.
  • release.yml is idempotent and re-runnable. The gh release create step now precheck-skips when the release already exists (no more HTTP 422 "Release.tag_name already exists" on retry), and workflow_dispatch is enabled so a maintainer can re-run from the GitHub UI without an extra tag-push. A concurrency: block with cancel-in-progress: false serializes per-tag releases so a mid-flight npm publish is never cut off after the Sigstore attestation is signed.
  • Bandit scan now uploads a JSON artifact. The gating Bandit step writes a structured bandit-gating.json alongside the txt output, and the artifact is uploaded on every run (including failures) so the maintainer can grep the JSON for test_id / filename / line on a red run without re-running CI locally.
  • Nightly train smoke CI workflow. .github/workflows/nightly-train-smoke.yml runs Trainer(model='Qwen/Qwen2.5-0.5B-Instruct').train(max_steps=1) on a CPU runner at 04:00 UTC each night, with a 15-minute hard timeout. Asserts checkpoint write + run_history entry + finite loss. The workflow opens / appends a ci + nightly-smoke labeled GitHub Issue on failure (collapsed onto the same issue across consecutive red nights, no spam). Observability only — never gates a release. Runner script: scripts/nightly_train_smoke.py.
  • Post-publish smoke workflow. .github/workflows/post-publish-smoke.yml fires after Publish completes successfully and runs pip install backpropagate==<tag> + backprop --version across {ubuntu, macos, windows} × {3.10, 3.11, 3.12, 3.13} plus a docker run ghcr.io/.../backpropagate:<tag> --version smoke. 10-minute per-cell timeout. PyPI CDN lag handled with a 5-attempt × 30s-backoff retry loop. Failure opens / appends a post-publish-smoke labeled issue.
  • verify.sh is now consumed by CI. New verify-smoke job in ci.yml runs verify.sh --format=json, uploads verify.json + per-stage .log files as an artifact, and gates on .first_failed_stage being null. The job is soft-gated (continue-on-error: true) for the first rotation of v1.3.x patches to bed in — the gate flips to strict after 3+ green runs confirm the JSON shape is stable.
  • CycloneDX SBOM attached to every GitHub Release. The release workflow now generates backpropagate-sbom.cdx.json (Python install closure via cyclonedx-py environment) and a best-effort backpropagate-npm-sbom.cdx.json (npm shim closure via @cyclonedx/cyclonedx-npm), uploaded to the GH Release after creation with --clobber for idempotent re-runs. Sigstore SBOM attestation already rides the existing --provenance flag on npm publish; the standalone files give auditors a grep-able artifact without re-running install.
  • OpenSSF Scorecard workflow. .github/workflows/scorecard.yml runs the ossf/scorecard-action@v2 analysis weekly (Mon 06:00 UTC) and on push to main, publishes results to scorecard.dev (publish_results: true, OIDC-authed) AND the GitHub Security tab (SARIF upload). Badge added to the README badges row.
  • PR template. .github/PULL_REQUEST_TEMPLATE.md collects Summary / Test plan / Breaking changes / Related issues + advisories / Doctrine touchpoints. The Doctrine section's checklist mirrors the four checks scripts/check_doc_drift.py enforces, so a contributor sees the drift surface they need to update BEFORE the gate fires on their PR.
  • Issue templates upgraded. bug_report.yml now requires run_id, error code, backprop info output, traceback (with BACKPROPAGATE_DEBUG=1 hint), repro steps, and install-channel dropdown — matching the load-bearing context fields named in the README "Reporting bugs" section + CONTRIBUTING.md. feature_request.yml collects use-case-first framing + proposed API + backward-compat impact. config.yml disables blank issues and routes security reports to the private advisory form.
  • Opt-in pytest-xdist parallel execution. All three pytest invocations in ci.yml now respect the BACKPROPAGATE_PYTEST_PARALLEL repository variable — set to 1 to enable -n auto --dist worksteal. Defaults to 0 (serial), so the existing 1865-test baseline + coverage measurement is byte-identical until an operator opts in. Test agent owns the per-test serial / xdist_group marker audit that lets the suite go green under -n auto; this YAML is the consumer.
  • Updated-quality LoRA defaults shipped (rank 256, target_modules="all-linear", 10× learning rate scale). Per Biderman 2024 — "LoRA Learns Less and Forgets Less" and Thinking Machines 2025 — "LoRA Without Regret", this configuration matches full fine-tuning quality on most post-training tasks at ~67% of the compute. v1.2.x defaulted to rank 16 / q+v target — leaving 15–20% quality on the table. The new defaults are the largest free quality win in the v1.3 release.
  • DoRA support. LoraConfig.use_dora field (default False); enable via Python Trainer(..., use_dora=True) or CLI --use-dora. peft's LoraConfig(use_dora=True) underneath. Rank-8 DoRA matches rank-32 LoRA on standard evals (+2.8% on LLaMA-7B); merges to zero inference overhead.
  • Sample packing default-on. SFTConfig.packing=True by default in v1.3 — 1.7–3× documented throughput on variable-length conversational datasets. Opt-out via Python Trainer(..., packing=False) or CLI --no-packing. Attention-backend agnostic (FA2 / FA3 / xFormers / SDPA).
  • Paged 8-bit Adam auto-detected on consumer GPUs. optim="paged_adamw_8bit" becomes the default on detected RTX 40/50-series cards (Ada / Blackwell). +25% throughput per arXiv:2509.12229. Override via --optim adamw_torch (or any other transformers.TrainingArguments optim string).
  • PiSSA / LoftQ LoRA initialization flags. LoraConfig.init_lora_weights accepts "default" | "pissa" | "loftq"; CLI flag --init-lora-weights. Free quality recovery on QLoRA runs; pairs cleanly with DoRA.
  • Ada-architecture mixed-precision tuning. RTX 40/50-series autodetection switches the default mixed-precision dtype from bf16 to fp16 per arXiv:2509.12229's peer-reviewed RTX 4060 study (bf16 underperforms fp16 on Ada cards). bf16 remains the default on Hopper / Ampere / non-Ada cards. Override unchanged via Trainer(..., mixed_precision="bf16").
  • Three new model presets. Phi-4-mini-3.8B (MIT license — best-in-class reasoning / math / code at ≤4B), Qwen-3.5-4B (Apache 2.0 — current sub-5B leader, MMLU-Pro 79.1, native long context), SmolLM3-3B (Apache 2.0 — fully open recipe, native 64K context). All three are Unsloth-supported and 4-bit-quantization clean.
  • Multi-run subcommand surface. Three new CLI subcommands consuming RunHistoryManager: backprop diff-runs <A> <B> (side-by-side config / loss / hyperparameter diff, colorized by default; --format=json for machine consumption), backprop replay <run-id> (re-run with the same config + dataset; --override key=value for surgical tweaks), backprop export-runs --format=jsonl (bulk export of all run history for offline analytics / W&B-MLflow pipeline integration / disaster-recovery snapshots).
  • README rewrite landed. The README opens with "Train an adapter. Ship it to Ollama. Move on." instead of feature-bullet shorthand, positions backpropagate against Axolotl / LLaMA-Factory / Unsloth / torchtune explicitly, adds an honest 16GB capability envelope table, an explicit "what backpropagate is NOT for" anti-pitch section with citations, and a References section. Re-translation runs at Phase 10.
  • Web UI request-logging middleware. New ASGI middleware emitting one structured access log per request (method, path, status, duration_ms, auth_mode, auth_user, remote_addr). Opt-in via BACKPROPAGATE_UI_REQUEST_LOG=1; defaults off because Reflex's own logging already covers most surfaces. Integrates with the existing structlog pipeline. Lands AFTER auth in the middleware chain so the log record captures the resolved username (or "anonymous").
  • Web UI rate-limit middleware. slowapi-shaped per-IP limiter on the /_event WS upgrade and the POST / PUT / PATCH / DELETE HTTP surface. Default 100 req/min per IP, 10 WS upgrades per IP per minute; tunable via BACKPROPAGATE_UI_RATE_LIMIT_* env vars. Lands BEFORE auth in the middleware chain so brute-force attempts can't exhaust the HMAC budget.
  • Run-history UI drill-down. Per-run page at /runs/<run_id> exposing run metadata, hyperparameter table, training-metrics chart, checkpoint list, log tail, and action buttons (Diff vs ..., Replay, Delete, Export). Strictly read-only at the data layer: Replay and Delete shell out to backprop replay <run_id> and backprop delete-run <run_id> respectively, rather than mutating run-history in the UI process.
  • GitHub Discussions enabled. Discussions categories (Announcements / Q&A / Ideas / Show & Tell) configured, pinned welcome post links the bug-vs-question routing + GHSA private-advisory path. CONTRIBUTING.md names Discussions as the canonical Q&A channel.
  • Docker images now multi-arch (linux/amd64 + linux/arm64). release.yml uses docker/setup-qemu-action + docker/setup-buildx-action + docker/build-push-action with platforms: linux/amd64,linux/arm64. Apple Silicon and ARM Linux operators get a native image instead of the prior x86-64-only push.
  • compose.yaml at repo root for "UI in a container". Canonical Docker Compose service exposes the Reflex web UI on port 7860 with a persistent ~/.backpropagate volume mount and the standard BACKPROPAGATE_UI_AUTH / BACKPROPAGATE_UI_HOST_BIND env-var passthrough. docker compose up brings the UI up in one command. Cross-referenced from the handbook deployment page.
  • CITATION.cff rewritten. Title aligned with the README h1, authors set to the org-level "MCP Tool Shop authors" collective with the contact email, keywords mirror pyproject.toml, and the references: section adds Biderman 2024 + the foundational LoRA paper (Hu 2021) alongside the existing SLAO paper (Qiao & Mahdavi 2025). DOI placeholder commented; mint on the v1.3.0 GitHub release.
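
The post-publish smoke item above mentions a 5-attempt × 30s retry loop for PyPI CDN lag. A minimal sketch of that shape follows, assuming a fixed 30-second delay between attempts; the function name `retry_until_available` and the injectable `sleep` parameter are hypothetical and exist only to make the sketch easy to exercise.

```python
import time
from typing import Callable


def retry_until_available(
    action: Callable[[], bool],
    attempts: int = 5,
    delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call action up to `attempts` times, pausing `delay_s` between failures."""
    for attempt in range(1, attempts + 1):
        if action():
            return True
        # No pause after the final attempt; the caller just sees the failure.
        if attempt < attempts:
            sleep(delay_s)
    return False
```

In a workflow, `action` would wrap the install command and return whether it succeeded; a `False` result after the last attempt is what would fail the cell.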
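
The rate-limit item above describes a per-IP limiter with a default of 100 requests per minute. The sketch below shows one way such a limiter can work, using a sliding window of timestamps per IP. It is not the project's middleware and does not use slowapi; the class name `PerIPRateLimiter` and the sliding-window strategy are assumptions.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class PerIPRateLimiter:
    """Allow at most `limit` hits per IP within a sliding `window_s` window."""

    def __init__(
        self,
        limit: int = 100,
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        now = self.clock()
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Running a check like this before authentication means rejected requests never reach the credential comparison, which matches the ordering rationale given in the item.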

Changed

  • verify.sh accepts --format=human|json. Default stays human (the existing banner output). --format=json emits one JSON object per stage ({"stage", "status", "exit_code", "duration_seconds"}) plus a final aggregate object — CI can parse the stream without screen-scraping. Each stage's stdout/stderr is captured to verify-<stage>.log so the JSON channel stays parseable.
  • CI workflow action SHAs aligned across workflows. release.yml's actions/setup-python pin moved from v6.0.0 → v6.2.0 (already in use by ci.yml + publish.yml) so the Node-runtime parity across workflows is consistent.
  • Default --lora-r from 16 → 256. Backward-compat available via --lora-preset=fast (rank 16 / q+v target / 1× LR — the v1.2.x footprint). Operators who pinned LORA__R=16 in their env or Trainer(lora_r=16, ...) calls are unaffected; only the implicit default changes.
  • Default packing=True. SFTConfig.packing flips from off to on. Opt out with --no-packing (CLI) or Trainer(..., packing=False) (Python). Operators with strict-determinism requirements who used packing-off implicitly should pass the explicit flag in v1.3.
  • Qwen-2.5-3B preset now boots with a license caveat. The preset is preserved for backward compatibility (existing CLI / Python users with --model Qwen/Qwen2.5-3B-Instruct continue to work) but the loader emits one WARN preset_license_caveat preset=qwen-2.5-3b license=qwen-research notes="research license — commercial use restricted; consider Qwen-3.5-4B (Apache 2.0) or SmolLM3-3B (Apache 2.0) for commercial deployments." line on first use. The preset table in the README and handbook flag the same caveat.
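
The verify.sh --format=json item above describes one JSON object per stage plus a final aggregate object. A consumer could gate on that stream roughly as sketched below. The sketch assumes the objects arrive one per line and that the aggregate is the object carrying a `first_failed_stage` key; neither detail is confirmed by these notes, and the status strings in the sample are invented.

```python
import json
from typing import Optional


def first_failed_stage(stream: str) -> Optional[str]:
    """Return the aggregate's first_failed_stage (None means all stages passed)."""
    aggregate = None
    for line in stream.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: only the aggregate object has this key.
        if "first_failed_stage" in record:
            aggregate = record
    if aggregate is None:
        raise ValueError("no aggregate object found in verify output")
    return aggregate["first_failed_stage"]


# Invented sample output for illustration only.
SAMPLE = "\n".join([
    '{"stage": "lint", "status": "pass", "exit_code": 0, "duration_seconds": 1.2}',
    '{"stage": "tests", "status": "fail", "exit_code": 1, "duration_seconds": 9.8}',
    '{"first_failed_stage": "tests"}',
])
```

A gating job would exit non-zero whenever the returned value is not `None`.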
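
The --lora-r default change above only affects callers who never set a rank explicitly. The resolution order can be sketched as follows. The `LoraPreset` dataclass, the preset name "quality", the function `resolve_rank`, and the module names `q_proj` / `v_proj` are all hypothetical; only the numbers (rank 256 with all-linear targets and 10× LR, versus rank 16 with q+v targets and 1× LR for the fast preset) come from these notes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class LoraPreset:
    rank: int
    target_modules: Union[str, Tuple[str, ...]]
    lr_scale: float


PRESETS = {
    # v1.3 implicit defaults; the key "quality" is a made-up label.
    "quality": LoraPreset(256, "all-linear", 10.0),
    # v1.2.x footprint; module names are an assumed spelling of "q+v".
    "fast": LoraPreset(16, ("q_proj", "v_proj"), 1.0),
}


def resolve_rank(explicit_rank: Optional[int] = None, preset: str = "quality") -> int:
    """An explicitly pinned rank always wins over the preset default."""
    if explicit_rank is not None:
        return explicit_rank
    return PRESETS[preset].rank
```

Under this ordering, a pinned rank of 16 resolves to 16 regardless of the preset, and only the no-argument call picks up the new 256 default.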

Removed

  • npm distribution deprecated. The bin/backpropagate.js shim used to bootstrap a Linux venv or download PyInstaller binaries from a GitHub Release via @mcptoolshop/npm-launcher. The binary build pipeline failed three consecutive times in v1.2.0 and the v1.0/v1.1/v1.2 release tags have zero attached binary assets — the launcher would 404. The shim now prints install guidance for the supported channels (pipx install backpropagate recommended, plus uv tool install backpropagate and pip install backpropagate) and exits 2. The npm package stays published so this message reaches existing npm install -g backpropagate users. The @mcptoolshop/npm-launcher runtime dependency was dropped from package.json (every npm install was pulling dead code).
  • PyInstaller binary distribution retired (full migration). The v1.0–v1.2 install story shipped PyInstaller binaries from a GitHub Release, pulled via @mcptoolshop/npm-launcher. Linux bootstrapped a managed venv instead because libtorch_cpu.so blew past GitHub's 2 GB release-asset cap, and the binary build pipeline failed three consecutive times in v1.2.0 — the v1.0 / v1.1 / v1.2 release tags shipped zero binary assets, so npm install -g backpropagate would download then 404. v1.3 retires the path completely, in three steps the v1.3 brief tracked under D2 SPLIT:
    1. Wave 1 — bin/backpropagate.js rewritten as a friendly-error shim. npm install -g backpropagate still works; the shim prints install guidance for the three supported channels (pipx install backpropagate recommended, plus uv tool install backpropagate and pip install backpropagate) and exits 2. The @mcptoolshop/npm-launcher runtime dependency was dropped from package.json so each npm install no longer pulls dead code.
    2. Wave 3.5 — .github/workflows/release-binaries.yml deleted. It had failed 4 of the last 5 release runs and produced no shipped assets at any v1.x tag. Surviving comment references in publish.yml / release.yml / ci.yml / bin/backpropagate.js were updated to reflect retirement.
    3. Wave 6a — PyInstaller .spec files removed from the repo root and the v1.2.x → v1.3 handbook migration page (site/src/content/docs/handbook/migrations.md) extended with a new "Switching off the PyInstaller / npm binary install" section that walks operators from the pre-deprecation npm install -g backpropagate install line through pipx install backpropagate, the optional-extras checklist, and the deprecation timeline (npm package itself remains published indefinitely so the shim's guidance reaches stragglers). .gitignore keeps *.spec so a stray local PyInstaller build doesn't leak back in via a future commit.

Security

  • Transitive-dependency CVE sweep. Bumped 14 packages in uv.lock to close 33 open advisories surfaced on the GitHub Security tab. High-severity closures: urllib3 2.6.3 → 2.7.0 (CVE-2026-44431, CVE-2026-44432), python-multipart 0.0.22 → 0.0.29 (CVE-2026-42561, CVE-2026-40347), GitPython 3.1.46 → 3.1.50 (CVE-2026-42284, CVE-2026-42215, CVE-2026-44243, CVE-2026-44244, plus GHSA-only advisory 57), PyJWT 2.11.0 → 2.13.0 (CVE-2026-32597 — this advisory affects the [security] extra's JWTManager helper in ui_security.py, which is a separate optional layer never reached by the auth middleware that closed GHSA-f65r-h4g3-3h9h; the bump still ships because operators who import JWTManager directly are on the user-facing path), pillow 12.1.1 → 12.2.0 (CVE-2026-40192, CVE-2026-42311, CVE-2026-42308, CVE-2026-42309, CVE-2026-42310). Medium / low closures: aiohttp 3.13.3 → 3.13.5 (10 CVEs), cryptography 46.0.5 → 48.0.0 (CVE-2026-34073, CVE-2026-39892), Pygments 2.19.2 → 2.20.0 (CVE-2026-4539), idna 3.11 → 3.16 (CVE-2026-45409), pip 26.0.1 → 26.1.1 (CVE-2026-3219, CVE-2026-6357), pytest 9.0.2 → 9.0.3 (CVE-2025-71176), python-dotenv 1.2.1 → 1.2.2 (CVE-2026-28684), requests 2.32.5 → 2.34.2 (CVE-2026-25645). Full test suite (1981 passed, 6 skipped) and the auth-middleware regression set (23 passed, 3 skipped) remain green across both tiers of bumps; no behavioral change observed in CI.
  • Two dependency fixes deferred to v1.4 (upstream blockers), covering three CVEs. diffusers 0.36.0 → 0.37.1 partially advances, but the patched 0.38.0 requires safetensors>=0.8.0rc0 (a pre-release); enabling pre-releases for a security bump trades one risk for another. CVE-2026-44513 and CVE-2026-45804 will close when the safetensors 0.8.0 GA lands or when unsloth loosens its safetensors floor. Mitigation: diffusers is transitive via unsloth and is not imported by backpropagate/**/*.py, so there is no reachable codepath from the operator-facing surface into the vulnerable image-decode functions. transformers 4.57.6 is held back from the 5.0.0rc3 fix for CVE-2026-1839 by the same pre-release policy; this one IS a direct dependency, so the codepath argument does not apply, and the bump is held only on the major-version-compat work that 5.0 would require across trainer.py + datasets.py. Both deferred items are tracked in the v1.3 brief for a v1.4 paired bump.
  • One CVE dismissed (no upstream patch). diskcache CVE-2025-69872 has no first_patched_version in the advisory feed and no newer release exists on PyPI. diskcache is transitive via llama-cpp-python (only pulled in by the [export] extra) and is not imported by backpropagate/**/*.py; the alert will close automatically when upstream ships a fix.
  • CRITICAL-only pip-audit + Trivy floor preserved from v1.2.0. No relaxation in v1.3; the same hard gates surface CRITICAL transitive CVEs while the advisory MEDIUM+ feed continues to populate the GitHub Security tab. The Bandit gating step now emits both txt (gating surface) and JSON (post-mortem artifact) for MEDIUM severity + MEDIUM confidence and above.
  • SECURITY.md is the canonical reporting policy + supported-versions surface; the operator-facing threat model + auth-middleware mode matrix is at handbook/security.md. CONTRIBUTING.md's historical pointer to SECURITY_AUDIT_REPORT.md (which became a stub in Wave 1) now points at the live docs instead.

Known issues / tech debt

  • The pre-v1.3 ERROR-severity Trivy alert cohort (incl. PyJWT CVE-2026-32597) was closed by the v1.3 dep-sweep above; two dependency fixes (diffusers 0.38.0, transformers 5.0.0rc3) are deferred to v1.4 because each is either a pre-release itself or requires one.
  • Python 3.10 reaches upstream EOL October 2026. v1.3 still supports 3.10 (CI matrix runs 3.10 / 3.11 / 3.12 / 3.13). A future release (target: v1.4) will drop 3.10 to align with the upstream EOL. Operators standing up new installs should prefer Python 3.11 or 3.12 — 3.11 is the most-tested floor (the UI + Windows + macOS smoke cells all run on 3.11).
  • The PyInstaller binary distribution migration is complete in v1.3 (see "Removed" above). The migration handbook page at site/src/content/docs/handbook/migrations.md is the operator-facing landing point for anyone still running npm install -g backpropagate.
  • uv.lock CI-consumption migration deferred to v1.4. CI today installs via pip install -e ".[dev,full]", which IGNORES uv.lock — the lockfile is scanned by Trivy as a security surface but never used to materialize the install. CIDOCS-F-011 / F-012 (Wave 5 audit) proposed migrating CI to uv sync --frozen so the lockfile becomes the install contract. The migration is non-trivial on the current 6-cell matrix (Linux × {3.10, 3.11, 3.12, 3.13} + Windows 3.11 + macOS 3.11) because uv sync --extra semantics differ from pip's bracket-extras syntax and the cross-platform [tool.uv] resolution markers need a separate audit. v1.3 ships with the pip install path unchanged; v1.4 will pick this up against a single cell first (Linux 3.11), then expand once the cross-platform behavior is validated.