PUMA v2.3.0 — dashboard production-quality + docs structure
PUMA v2.3.0 Release Notes
Release date: 2026-05-13
Previous release: v2.2.0 (2026-05-13)
Branch: develop → main (post-tag)
Summary
This release consolidates Sprint 6 (dashboard polish + structural
refactor) and retrospective documentation work (INDEX.md +
docs/overview.md + README branding) onto the v2.2.0 base. With this
release, Phase C of the master plan is fully complete.
Highlights
Dashboard production-quality (Sprint 6)
Major refactor: app.py reduced from an 803-LOC monolith to a 168-LOC
router (-79 %). View logic delegated to seven modules in
src/puma/dashboard/views/. Each view is independently importable
and testable; the router publishes filters to st.session_state and
dispatches via a VIEWS dict.
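
As an illustration of the router pattern described above, here is a minimal sketch. It is not the actual app.py: the two render functions, their signatures, and the `route` helper are hypothetical, and a plain dict stands in for st.session_state so the example runs without Streamlit.

```python
from typing import Callable, Dict, MutableMapping

# Hypothetical stand-ins for view modules. In the real dashboard each view
# lives in its own module under src/puma/dashboard/views/.
def render_overview(state: MutableMapping) -> str:
    return f"overview for model={state['filters']['model']}"

def render_drilldown(state: MutableMapping) -> str:
    return f"drill-down for model={state['filters']['model']}"

# Lookup table used by the router: sidebar label -> render callable.
VIEWS: Dict[str, Callable[[MutableMapping], str]] = {
    "Overview": render_overview,
    "Instance Drill-down": render_drilldown,
}

def route(state: MutableMapping, selected: str, filters: dict) -> str:
    """Publish filters to shared state, then dispatch to the chosen view."""
    state["filters"] = filters  # views read their filters from shared state
    view = VIEWS.get(selected)
    if view is None:
        raise KeyError(f"unknown view: {selected!r}")
    return view(state)

# A plain dict stands in for st.session_state in this sketch.
session_state: dict = {}
print(route(session_state, "Overview", {"model": "example-model"}))
```

Because each view only depends on the shared state it receives, it can be imported and exercised in a test without going through the router.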
Ten polish improvements applied:
| # | Improvement | Impact |
|---|---|---|
| 1 | @st.cache_data(ttl=60) on 7 loaders | Performance |
| 2 | st.spinner on slow operations | UX |
| 3 | CSV export on 4 tables | Productivity |
| 4 | Tooltips on ≈ 12 metric cards | UX |
| 5 | Unified empty-filtered-state component | UX |
| 6 | Friendly expander titles in Overview | UX |
| 7 | Module-level imports (no more inline) | Code quality |
| 8 | Emoji prefixes consistent across 7 view titles | UX |
| 9 | Dark-mode dataframe text legibility | UX (bug fix) |
| 10 | Empty-selectbox guard in Instance Drill-down | Robustness |
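
To make the CSV export and empty-state items concrete, the following is a small sketch of a serialisation helper. The function name `rows_to_csv_bytes` and the row shape are assumptions for illustration, not the dashboard's actual helper. The returned bytes are the kind of payload that could be handed to Streamlit's st.download_button.

```python
import csv
import io
from typing import Dict, List, Optional

def rows_to_csv_bytes(rows: List[Dict[str, object]]) -> Optional[bytes]:
    """Serialise table rows to UTF-8 CSV.

    Returns None when there are no rows, so the caller can show an
    empty-state message instead of offering an empty download.
    """
    if not rows:
        return None
    buffer = io.StringIO(newline="")
    # Column order follows the keys of the first row (hypothetical convention).
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")

payload = rows_to_csv_bytes([{"model": "example-model", "f1": 0.5}])
print(payload)
```

Returning None for an empty table keeps the "nothing to export" decision in one place, which fits the unified empty-filtered-state idea in row 5.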
Plus: first-visit guided tour with view overview and tips
(download CSV, dark mode, tooltips). Persistent dismiss via
st.session_state["tour_dismissed"]; "📖 Show tour" button in the
sidebar to re-open.
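
The dismiss and re-open behaviour can be sketched as three small state helpers. Only the key name tour_dismissed comes from these notes; the helper names are hypothetical, and a plain dict stands in for st.session_state.

```python
from typing import MutableMapping

TOUR_KEY = "tour_dismissed"  # key named in the release notes

def should_show_tour(state: MutableMapping) -> bool:
    """Show the tour on first visit, or after it has been re-opened."""
    return not state.get(TOUR_KEY, False)

def dismiss_tour(state: MutableMapping) -> None:
    """Record that the user closed the tour."""
    state[TOUR_KEY] = True

def reopen_tour(state: MutableMapping) -> None:
    """Handler for a 'Show tour' button: clear the dismissed flag."""
    state[TOUR_KEY] = False

state: dict = {}               # stand-in for st.session_state
print(should_show_tour(state))  # first visit
dismiss_tour(state)
print(should_show_tour(state))  # after dismissing
reopen_tour(state)
print(should_show_tour(state))  # after re-opening from the sidebar
```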
Documentation structure (Phase E.bis retrospective + Phase E.ter)
- INDEX.md (root, uppercase): project status, phases, releases,
debt tracking, architecture entry points. Created in Phase E.bis;
this release updates it for v2.3.0 status.
- docs/overview.md (new location): preserves the 256 LOC of
architectural content from the legacy lowercase index.md.
- README.md: branded header with PUMA logo, descriptive
blockquote, and Related-Resources section linking to puma-vault,
the published knowledge garden, releases, INDEX.md, and
docs/overview.md.
Quality
- Tests: 318 passing (up from 313 in v2.2.0; +5 dashboard smoke
tests covering view module integrity, polish helpers, cache
decorator presence, and the end-to-end AppTest render with the
live database).
- Coverage: 58 % (up from 55 % in v2.2.0).
- Pre-commit: 10/10 hooks green.
- CI: green on both main and develop.
- Baseline reproducibility: F1 = 0.5867 ± 0.01 holds; verified
via puma validate-baseline (PASS at 0.5831, delta −0.0036).
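
The arithmetic behind the baseline result can be checked with a short sketch. This is not the implementation of puma validate-baseline; the function name and the inclusive tolerance comparison are assumptions. Only the three numbers come from these notes.

```python
from typing import Tuple

BASELINE_F1 = 0.5867  # reference F1 from the release notes
TOLERANCE = 0.01      # allowed absolute deviation

def check_baseline(observed: float,
                   baseline: float = BASELINE_F1,
                   tolerance: float = TOLERANCE) -> Tuple[bool, float]:
    """Return (passed, delta), where delta = observed - baseline."""
    delta = observed - baseline
    # Assumption: a run passes when |delta| is within the tolerance, inclusive.
    return abs(delta) <= tolerance, delta

passed, delta = check_baseline(0.5831)
print(f"{'PASS' if passed else 'FAIL'} delta={delta:+.4f}")
```

With the observed value 0.5831, the delta is 0.5831 − 0.5867 = −0.0036, whose magnitude is below 0.01, so the run passes.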
Methodological findings (academic traceability)
Sprint 6 surfaced one additional finding consistent with the
meta-pattern documented in docs/known_debt.md ("symptom in layer
N, root cause in layer M ≠ N"):
- Dark-mode dataframe invisibility. The CSS rule applied
light-mode colours globally; under dark mode, table text inherited
light-mode colours against the dark background, rendering tables
nearly unreadable. Symptom (invisible tables) appeared in the
dashboard layer; root cause (CSS scope without theme awareness)
was in the styling layer. Resolved in the same commit as the
refactor by adding a theme-aware CSS override
(color: #E5E7EB + background-color: #16213E when
dark_mode == True).
This brings the meta-pattern catalogue to five instances (D15, D18,
D21, D22, and this CSS scope issue); the fifth is retired in the
same commit that surfaced it.
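
A theme-aware override of the kind described above might look like the sketch below. The two colour values are the ones quoted in this section. The function name and the CSS selector are hypothetical, and the real stylesheet may target different elements. In a Streamlit app the returned string would typically be injected with st.markdown(..., unsafe_allow_html=True).

```python
DARK_TEXT = "#E5E7EB"  # text colour quoted in the release notes
DARK_BG = "#16213E"    # background colour quoted in the release notes

# Hypothetical selector, used only for illustration.
SELECTOR = ".dataframe-container"

def dataframe_css(dark_mode: bool) -> str:
    """Return a <style> block for dark mode, or an empty string otherwise.

    Emitting nothing in light mode keeps the light-mode rules from being
    overridden, which avoids applying one theme's colours globally.
    """
    if not dark_mode:
        return ""
    return (
        "<style>\n"
        f"{SELECTOR} {{ color: {DARK_TEXT}; background-color: {DARK_BG}; }}\n"
        "</style>"
    )

print(dataframe_css(True))
```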
CI workflow hygiene
The .github/workflows/release.yml fix introduced in Phase E.bis
(commit 863c166) is now exercised end-to-end by the v2.3.0 tag
push. After the tag was pushed and gh release create ran, exactly
one release was created (no duplicate draft). The fix is verified
effective for v2.X.0 releases going forward.
Debt tracking
- No new open debt introduced by this release.
- Total resolved across v2.0.0 → v2.3.0: 15 of 24 items (62.5 %).
- Phase C: ✓ COMPLETE (was the last open phase; all five
Gate-C criteria met).
Full inventory and diagnostic write-ups in
docs/known_debt.md.
Known limitations
- Single hardware tier evaluated (gpu-entry); models requiring
gpu-mid and above (qwen2.5:14b, gemma3:27b, deepseek-r1:14b,
the gemma4 family, llama3.1:70b) catalogued but not yet
empirically evaluated.
- AMD ROCm and Apple Metal backends not yet detected (development
hardware is NVIDIA-only).
- TAWOS SHA-256 end-to-end fetch test pending (Gate D criterion 3).
- input_text not persisted in triage_jira instances (D22, Low;
future data-pipeline enhancement). The Dashboard Instance
Drill-down handles this gracefully with an informative message.
Master plan status (post-v2.3.0)
| Phase | Status |
|---|---|
| A — Foundations | ✓ COMPLETE |
| B — Multi-model sweep | ✓ COMPLETE |
| C — Professional dashboard | ✓ COMPLETE (this release) |
| D — Technical depth | ✓ ~95 % (ROCm/Metal n/a in current hardware) |
| E — Documentation and releases | ✓ COMPLETE (v2.0.0, v2.1.0, v2.2.0, v2.3.0) |
All five phases of the original master plan are now complete or
effectively complete (Phase D's remaining items are
hardware-dependent or scope-deferred).
Upgrade notes
- No breaking changes to the public CLI or YAML run-spec schema.
- Dashboard refactor is internal; user-facing behaviour is preserved.
- Existing run-specs and CLI invocations work unchanged.
- The dashboard module structure has changed (app.py is now a
router; each view lives in src/puma/dashboard/views/<name>.py).
Any external tooling that imported view code from app.py should
migrate to the new module paths.
Acknowledgments
Development assistance provided by generative AI tooling. All commits
are attributed to the project's git identity per repository
convention.