PUMA v4.0.0 — Sprint 12 closure
PUMA is a local, reproducible benchmarking framework for open LLMs on
project-management tasks (issue triage and effort estimation), run entirely
on your own hardware via Ollama — deterministic, offline by default, and
verifiable end to end.
This release is validated by a real-world milestone. PUMA's federated
community-submission infrastructure is now operational and was proven end to
end by the first official production submission: qwen2.5:3b on
triage_jira / zero_shot, with an F1-macro of 0.3898. The submission landed at
pumacp/puma-community#8
(merge SHA 111cee36) and was mirrored to the Hugging Face submissions dataset.
That F1 is the reproducible floor anchor for the zero-shot strategy on
triage_jira; see docs/first-submission.md.
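For readers unfamiliar with the metric, the sketch below shows how a macro-averaged F1 is conventionally computed: a per-class F1, then an unweighted mean across classes. It is a minimal illustration, not PUMA's scoring code; the function name `macro_f1` and the toy labels are hypothetical and are not taken from the triage_jira dataset.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical labels, for illustration only)
y_true = ["bug", "bug", "feature", "feature"]
y_pred = ["bug", "feature", "feature", "feature"]
print(round(macro_f1(y_true, y_pred), 4))
```

Because every class counts equally, a model that does well on frequent classes but poorly on rare ones scores lower on macro F1 than on accuracy, which is one reason the metric suits imbalanced triage labels.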
Sprint 12 highlights
- #45 — PyPI + Docker (ghcr.io) publishing workflows + Dockerfile.publish; pyproject.toml hardened (distribution name puma-cp)
- #46 — Multi-model comparison dashboard view + corporate Streamlit palette
- #47 — README channel-directory restructure; acrostic visual flexibility
- #48 — mkdocs full content sync (nav 6 → 28 pages); D30 resolved
- #49 — Manual IDE contribution workflow docs
- #50 — Security audit MVP: pip-audit + bandit + gitleaks + Trivy + SECURITY.md + threat model
- #51 — Consolidated technical reference (~5100 words, 17-decision timeline)
- #52 — Inaugural production submission documented (docs/first-submission.md)
Installation
pip install puma-cp==4.0.0
The container image (ghcr.io/pumacp/puma:4.0.0) is not yet published for
this release: the Trivy gate in the publish pipeline blocked it on 3 HIGH-severity
base-image CVEs (0 CRITICAL). This is the security gate working as designed; the
image will be re-published once the base image is patched (tracked for S12.19).
Use the PyPI package in the meantime.
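After installing from PyPI, one way to confirm that the pinned release is what actually landed in the environment is to query the distribution metadata. The sketch below uses only the standard library; the helper name `check_installed` is hypothetical, and it assumes the distribution is registered under the name puma-cp as stated above.

```python
from importlib.metadata import PackageNotFoundError, version


def check_installed(dist_name: str, expected: str) -> str:
    """Return 'ok', 'missing', or a mismatch message for a distribution."""
    try:
        found = version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment
        return "missing"
    return "ok" if found == expected else f"mismatch: found {found}"


if __name__ == "__main__":
    # Distribution name and version taken from the install command above
    print(check_installed("puma-cp", "4.0.0"))
```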
Quick start
See the documentation site and the
getting-started overview.
What's in v4.0.0
Added — federated submission infrastructure (S12.15) validated by the
inaugural submission; PyPI + ghcr.io publishing; multi-model dashboard view;
consolidated technical reference; manual IDE contribution workflow; security
audit MVP; corporate monochrome visual identity; acrostic visual flexibility.
Changed — mkdocs nav 6 → 28 public pages; read-only puma models
sub-group (D30 resolved); pyproject.toml hardened; project version → 4.0.0.
Security — pip-audit / bandit / gitleaks on every push; Trivy on every
container publish (blocks HIGH/CRITICAL); SECURITY.md disclosure policy;
9-check submission validation pipeline.
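The "blocks HIGH/CRITICAL" rule can be illustrated with a small sketch that reads a Trivy JSON report and decides whether publishing may proceed. This is an illustration of the gate's logic, not the project's workflow, which may well rely on Trivy's own severity and exit-code options instead. The helper names and the sample report (including its placeholder vulnerability IDs) are hypothetical; the `Results` / `Vulnerabilities` / `Severity` keys follow Trivy's JSON output layout.

```python
from collections import Counter

# Severities that stop a container publish, per the policy described above
BLOCKING = {"HIGH", "CRITICAL"}


def severity_counts(report: dict) -> Counter:
    """Count vulnerabilities by severity across all scan results."""
    counts = Counter()
    for result in report.get("Results") or []:
        # "Vulnerabilities" can be absent or null when a target is clean
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def gate_passes(report: dict) -> bool:
    """True only when no blocking-severity vulnerability is present."""
    counts = severity_counts(report)
    return not any(counts[severity] for severity in BLOCKING)


# Hypothetical report mirroring the situation in this release:
# 3 HIGH findings, 0 CRITICAL (IDs are placeholders, not real CVEs)
sample_report = {
    "Results": [
        {
            "Target": "base-image (example)",
            "Vulnerabilities": [
                {"VulnerabilityID": "EXAMPLE-0001", "Severity": "HIGH"},
                {"VulnerabilityID": "EXAMPLE-0002", "Severity": "HIGH"},
                {"VulnerabilityID": "EXAMPLE-0003", "Severity": "HIGH"},
                {"VulnerabilityID": "EXAMPLE-0004", "Severity": "MEDIUM"},
            ],
        }
    ]
}

print(dict(severity_counts(sample_report)))
print("publish allowed:", gate_passes(sample_report))
```

With this policy a single HIGH finding is enough to block, so three HIGH findings and zero CRITICAL still fail the gate, matching the behavior described for the 4.0.0 image.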
Infrastructure — Dockerfile.publish (multi-stage, non-root, OCI labels);
GitHub Pages live with a 28-page nav; Hugging Face dataset mirror operational.
Full detail in CHANGELOG.md.
Links
- Documentation: https://pumacp.github.io/puma
- Community repository: https://github.com/pumacp/puma-community
- Hugging Face submissions dataset: https://huggingface.co/datasets/pumaproject/puma-community-submissions
- Hugging Face leaderboard Space: https://huggingface.co/spaces/pumaproject/puma-leaderboard
Known limitations (deferred to S12.19 / post-Sprint-12)
See docs/known_debt.md.
- D38 — validate-submission workflow references a non-existent action version.
- D39 — verify-integrity workflow broken by gradio_client API drift; the inaugural submission is therefore self-attested.
- D40 — puma share-results CLI hangs after the Review panel.
- Container publish blocked by 3 HIGH base-image CVEs (re-publish after patching).
- notify-discord lacks the DISCORD_WEBHOOK secret (optional integration).
Acknowledgments
Thanks to everyone who tested the submission pipeline end to end and helped land
the first official community submission — the milestone that validates this release.