PUMA v4.0.0 — Sprint 12 closure

pumacp released this on 02 Jun at 02:50 (3 commits to main since this release)

PUMA is a local, reproducible benchmarking framework for open LLMs on
project-management tasks (issue triage and effort estimation), run entirely
on your own hardware via Ollama — deterministic, offline by default, and
verifiable end to end.

This release is validated by a real-world milestone. PUMA's federated
community-submission infrastructure is now operational and was proven end to
end by the first official production submission (qwen2.5:3b on
triage_jira / zero_shot, F1-macro 0.3898), which landed at
pumacp/puma-community#8 (merge SHA 111cee36) and was mirrored to the
Hugging Face submissions dataset.
That F1 is the reproducible floor anchor for the zero-shot strategy on
triage_jira; see docs/first-submission.md.
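
For readers unfamiliar with the metric, F1-macro is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a common one. The following is a minimal sketch in plain Python, not PUMA's own scoring code; the function name `macro_f1` and the toy labels are hypothetical and serve only as illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative
    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (hypothetical labels, not taken from triage_jira).
truth = ["bug", "bug", "task", "task"]
preds = ["bug", "task", "task", "task"]
print(round(macro_f1(truth, preds), 4))
```

Here "bug" scores 2/3 and "task" scores 4/5, so the macro average is about 0.7333. Because each class contributes equally, a model that ignores a minority class is penalized even when its overall accuracy looks acceptable.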

Sprint 12 highlights

  • #45 — PyPI + Docker (ghcr.io) publishing workflows + Dockerfile.publish; pyproject.toml hardened (distribution name puma-cp)
  • #46 — Multi-model comparison dashboard view + corporate Streamlit palette
  • #47 — README channel-directory restructure; acrostic visual flexibility
  • #48 — mkdocs full content sync (nav 6 → 28 pages); D30 resolved
  • #49 — Manual IDE contribution workflow docs
  • #50 — Security audit MVP: pip-audit + bandit + gitleaks + Trivy + SECURITY.md + threat model
  • #51 — Consolidated technical reference (~5100 words, 17-decision timeline)
  • #52 — Inaugural production submission documented (docs/first-submission.md)

Installation

pip install puma-cp==4.0.0
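
After installing, the pinned version can be confirmed from Python. This is a small sketch using the standard-library `importlib.metadata`; the helper names `installed_version` and `is_pinned` are hypothetical, while the distribution name `puma-cp` and version `4.0.0` come from these notes.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def is_pinned(dist_name, expected):
    """True only when the distribution is installed at exactly `expected`."""
    return installed_version(dist_name) == expected


if __name__ == "__main__":
    # Distribution name and version as stated in the release notes.
    print("puma-cp 4.0.0 installed:", is_pinned("puma-cp", "4.0.0"))
```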

The container image (ghcr.io/pumacp/puma:4.0.0) is not yet published for
this release: the Trivy gate in the publish pipeline blocked it on 3 HIGH-severity
base-image CVEs (0 CRITICAL). This is the security gate working as designed; the
image will be re-published once the base image is patched (tracked for S12.19).
Use the PyPI package in the meantime.
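
The gate decision described above can be sketched as follows. This is an illustration only, not the project's publish workflow: it assumes a scan report shaped like Trivy's JSON output (`Results` → `Vulnerabilities` → `Severity`), and the helper names `severity_counts` and `gate_blocks` are hypothetical. The real pipeline most likely relies on the scanner's own severity filtering rather than custom code.

```python
from collections import Counter

# Severities that stop a publish, per the policy stated in these notes.
BLOCKING = {"HIGH", "CRITICAL"}


def severity_counts(report):
    """Count findings per severity across every result in the report."""
    counts = Counter()
    for result in report.get("Results", []):
        # A result with no findings may omit the key or carry None.
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def gate_blocks(report):
    """True when at least one finding has a blocking severity."""
    counts = severity_counts(report)
    return any(counts[severity] > 0 for severity in BLOCKING)


# Hypothetical report mirroring the situation above: 3 HIGH, 0 CRITICAL.
report = {"Results": [{"Vulnerabilities": [{"Severity": "HIGH"}] * 3}]}
print(dict(severity_counts(report)), "blocked:", gate_blocks(report))
```

With three HIGH findings and no CRITICAL ones the gate still blocks, which matches the behaviour reported for this release.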

Quick start

See the documentation site and the
getting-started overview.

What's in v4.0.0

Added — federated submission infrastructure (S12.15) validated by the
inaugural submission; PyPI + ghcr.io publishing; multi-model dashboard view;
consolidated technical reference; manual IDE contribution workflow; security
audit MVP; corporate monochrome visual identity; acrostic visual flexibility.

Changed — mkdocs nav 6 → 28 public pages; read-only puma models
sub-group (D30 resolved); pyproject.toml hardened; project version → 4.0.0.

Security — pip-audit / bandit / gitleaks on every push; Trivy on every
container publish (blocks HIGH/CRITICAL); SECURITY.md disclosure policy;
9-check submission validation pipeline.

Infrastructure: Dockerfile.publish (multi-stage, non-root, OCI labels);
GitHub Pages live with a 28-page nav; Hugging Face dataset mirror operational.

Full detail in CHANGELOG.md.

Known limitations (deferred to S12.19 / post-Sprint-12)

See docs/known_debt.md.

  • D38: the validate-submission workflow references a non-existent action version.
  • D39: the verify-integrity workflow is broken by gradio_client API drift; the inaugural submission is therefore self-attested.
  • D40: the puma share-results CLI hangs after the Review panel.
  • Container publish blocked by 3 HIGH base-image CVEs (re-publish after patching).
  • notify-discord lacks the DISCORD_WEBHOOK secret (optional integration).

Acknowledgments

Thanks to everyone who tested the submission pipeline end to end and helped land
the first official community submission — the milestone that validates this release.