feat: SDK-driven pipeline + plain-HTTP frontend, with PR-gating CI hardening #1
Open
added 9 commits
May 7, 2026 09:35
…cket sentinel)
End state every `pip install beava` user gets, end-to-end:
- Server: beava v2 binary built from current workspace (deploy/Dockerfile.beava).
- Pipeline: defined as `@bv.event PageView` + `@bv.table SiteMetrics` (no key=,
one global row per ADR-003). Registered via the SDK wire format the
binary natively accepts — `{nodes:[{kind:"event"...},{kind:"derivation"...}]}`.
- Frontend: pure `fetch('/api/push/PageView', {path, dwell_ms})` on pagehide.
Zero JS SDK shim. Zero bucket sentinel. Zero per-path fan-out.
- Read: `POST /api/get {table:"SiteMetrics", key:""}` returns the row,
ADR-003 empty-string-key route into the global bucket. Mirror of Python
SDK `app.get("SiteMetrics")` with no key kwarg.
Hetzner deployment changes (live):
- deploy/Dockerfile.beava: builds beava:next from `crates/*` workspace.
- beava-website/deploy/beava.yaml: listen 8080, admin 8090, wal+snapshot to /data.
- beava-website/deploy/docker-compose.prod.yml: drops the env-var soup, mounts yaml.
- beava-website/deploy/Caddyfile: /api/push/* → :8080, /api/get → :8080,
/api/registry → :8090. No more Authorization injection — register goes
through public POST per the v2 binary contract.
- beava-website/deploy/site-metrics-pipeline.json: the registered shape,
re-runnable for cold-start setups.
Removed: beava-website/project/js/beava-client.js (no longer needed —
frontend talks plain HTTP directly).
Python SDK touch: docstring example on `App.get(table)` showing the
`@bv.table` (no key=) → `app.get(...)` (no key) symmetry per ADR-003.
Verified live on beava.dev: `curl -X POST .../api/get -d '{"table":"SiteMetrics","key":""}'`
returns `{"site_med_dwell_1h":7400.0,"site_views_1h":17,"site_views_today":17}`.
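The push and read calls described in this commit can be sketched in Python with only the standard library. This is a minimal illustration, not part of the beava SDK: the helper names `build_push` and `build_get`, the `Content-Type` header, and the placeholder host are assumptions, and the requests are built but never sent.

```python
import json
import urllib.request

# Placeholder host: the real base URL is elided in the commit message.
EXAMPLE_BASE = "https://example.invalid"


def _post_json(url: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request object without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


def build_push(base_url: str, path: str, dwell_ms: int) -> urllib.request.Request:
    """PageView push, the same body the frontend sends on pagehide."""
    return _post_json(f"{base_url}/api/push/PageView",
                      {"path": path, "dwell_ms": dwell_ms})


def build_get(base_url: str, table: str, key: str = "") -> urllib.request.Request:
    """Read request; the empty-string key targets the global row (ADR-003)."""
    return _post_json(f"{base_url}/api/get", {"table": table, "key": key})


if __name__ == "__main__":
    req = build_get(EXAMPLE_BASE, "SiteMetrics")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Passing either request to `urllib.request.urlopen` would send it; against a live deployment the read is expected to return a JSON row like the one quoted in the verification line.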
`pip install beava`, `brew install beava-dev/tap/beava`, and the prebuilt ghcr.io image don't exist yet: there's no v0.0.0 release tag, no PyPI package, no published image. Replacing the install snippets with paths that actually work today against the source repo:
- pip: `pip install git+https://github.com/beava-dev/beava#subdirectory=python`
- cargo: `cargo install --git https://github.com/beava-dev/beava beava-server`
- docker: `git clone ... && docker build -f deploy/Dockerfile.beava -t beava .`
The brew/pip-from-PyPI/static-binary forms come back when v0.0.0 ships (homebrew-bump.yml is wired and waiting for the tag). A blockquote on /docs/install/ flags this explicitly so visitors aren't left guessing.
Workflow uses GITHUB_TOKEN (no extra secret) to push the image to GHCR under the org. Triggers on changes to crates/ + Dockerfile + Cargo + the workflow itself; manual workflow_dispatch is also wired. Tags every push as :edge (rolling) + :sha-<commit> (immutable, for rollback).
After this lands on main, the homepage Docker tab points at the prebuilt image (`docker run -p 8080:8080 ghcr.io/beava-dev/beava:edge`) instead of the build-from-source dance.
Note: the first push to main with this workflow is what bootstraps the image. Until that run completes, the docker tab points at a tag that 404s. This is a minor pre-release rough edge, healed on first merge.
DOCKERHUB_USERNAME + DOCKERHUB_TOKEN secrets are already wired on beava-dev/beava (set 2026-05-07 per gh secret list). Earlier commit 7df9179f locked beavadev/beava as the Docker namespace. Workflow now authenticates against Docker Hub and pushes there; GitHub-side perms drop to contents:read since GHCR is no longer involved. Homepage docker tab + /docs/install/ doc both updated to `docker run -p 8080:8080 beavadev/beava:edge`.
The pipeline JSON in beava-website/deploy/site-metrics-pipeline.json is the source of truth for the SiteMetrics shape. Until this commit, changing it required SSH + manual curl POST. Wired into deploy-hetzner.yml so any push to main that touches the JSON re-registers it on the live binary (with force=true, so schema-changing edits land cleanly).
Workflow now:
1. rsync project/ static files
2. rsync deploy/site-metrics-pipeline.json + POST to beava:8080/register
3. smoke: homepage + /docs/ + POST /api/get for SiteMetrics row
Path filter expanded to fire on either project/** or the pipeline JSON.
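Before the re-register step, a deploy job could sanity-check the pipeline file. The following is a minimal sketch under stated assumptions: it relies only on the shape the commit messages describe (a top-level `nodes` list whose entries have a `kind` of `event` or `derivation`). The function name `check_pipeline_shape` is hypothetical, and the other node fields are elided in the source, so they are not inspected.

```python
import json

# The only two node kinds named in the commit messages.
ALLOWED_KINDS = {"event", "derivation"}


def check_pipeline_shape(text: str) -> list[str]:
    """Return a list of problems; an empty list means the shape looks plausible."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(doc, dict):
        return ["top level must be a JSON object"]
    nodes = doc.get("nodes")
    if not isinstance(nodes, list) or not nodes:
        return ["'nodes' must be a non-empty list"]
    problems = []
    for i, node in enumerate(nodes):
        kind = node.get("kind") if isinstance(node, dict) else None
        # Guard the type first so unhashable values never reach the set lookup.
        if not isinstance(kind, str) or kind not in ALLOWED_KINDS:
            problems.append(f"nodes[{i}]: unexpected kind {kind!r}")
    return problems


if __name__ == "__main__":
    sample = '{"nodes": [{"kind": "event"}, {"kind": "derivation"}]}'
    print(check_pipeline_shape(sample) or "shape looks plausible")
```

A check like this only catches gross mistakes such as a truncated file or a typo in `kind`; the binary remains the authority on whether a pipeline registers.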
The job ran `cargo test -p beava --test sharding_parity`, a test against a `beava` package that doesn't exist in the v2 workspace (it's split into beava-core / beava-server / beava-persistence / beava-runtime-core / beava-bench). The underlying sharding-parity test never landed in the v2 greenfield reset, and per the locked architectural decision (`project_no_sharded_apply`, locked 2026-04-27) the project deliberately ships single-threaded apply forever: sharded apply is a Redis-cluster-shape operator concern, not a data-plane concern. The check was failing on every PR and gating nothing real. Phase 52 D-14 / TPC-CORR-05 is no longer in scope.
Reminder for the operator: remove sharding-parity-smoke from required status checks in GitHub repo Settings → Branches → main if it's listed there, otherwise PRs will stay blocked on a job that no longer exists.
Adds CODEOWNERS (catch-all + per-tree mappings to @petrpan26) so GitHub auto-requests code-owner review on every PR, and gates the heavy Rust + Python check jobs behind the `ok-to-test` label.
How it composes:
- CODEOWNERS makes review-by-code-owner mandatory before merge (when branch protection is on).
- pr.yml `rust` job runs only if (a) PR is from the same repo (i.e. a branch by a code owner / collaborator) OR (b) the `ok-to-test` label is present. `python` cascades via `needs: rust`.
- `pull_request: types: [..., labeled, ...]` so adding the label re-fires the workflow without a fresh push.
External contributors get cheap-fast review: code owner reads the diff, adds `ok-to-test` if it looks reasonable, the workflow runs.
PR template: replaces the old skeleton with a Local checks section showing the exact `cargo fmt / clippy / nextest`, `pytest`, `docker build`, and `npm run build` commands matching what CI runs. Saves the "why is this red?" round-trip on every external PR.
Even owner-authored PRs require the explicit `ok-to-test` label before any check fires. Drops the `head.repo.full_name == repository` escape hatch in pr.yml; gates every job in ci.yml on PR events with the same condition. Push-to-main short-circuits via `github.event_name == 'push'` so post-merge CI on `main` runs unconditionally — no label management for direct merges.
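The gating rule this commit describes reduces to a small predicate. The sketch below restates it in Python for clarity only. The real gate is a workflow `if:` expression, and the function name `should_run_checks` is hypothetical.

```python
def should_run_checks(event_name: str, labels: list[str]) -> bool:
    """Pushes (post-merge CI on main) always run; PR events need `ok-to-test`."""
    if event_name == "push":
        return True
    return "ok-to-test" in labels


if __name__ == "__main__":
    # An owner-authored PR without the label no longer runs any check.
    print(should_run_checks("pull_request", []))
    print(should_run_checks("pull_request", ["ok-to-test"]))
    print(should_run_checks("push", []))
```

Note that authorship does not appear anywhere in the predicate, which is the point of dropping the same-repo escape hatch.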
Why: contributors land PRs with checks failing because they don't run the same gates CI runs. With heavy CI gated on `ok-to-test`, a code owner has to label-and-wait for every untested attempt. Friction for the reviewer.
What:
- `.github/scripts/check.sh`: one-shot that runs `cargo fmt --check` + `cargo clippy -D warnings` + `cargo nextest` (falls back to `cargo test` if nextest absent) + `pytest python/tests`. Prints a PASS/FAIL summary block sized for direct paste into a PR body. `--fast` skips cargo test (~10× faster). Full log to `~/.beava-check.log`.
- PR template gains a "Verification" section that asks for the paste-back. Reviewer eyeballs the summary before labeling.
The script + template together compress the contributor loop: clone → bash check.sh → paste output → push → reviewer reads, labels.
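The run-each-gate-then-summarize idea can be sketched as follows. This is an illustration in Python rather than the actual `check.sh` (which is a bash script). The summary layout, the `summarize` and `run_step` names, and the injectable `runner` are assumptions made so the logic can be exercised without cargo installed.

```python
import subprocess
from typing import Callable, Sequence


def run_step(cmd: Sequence[str]) -> bool:
    """Run one command quietly; a missing tool counts as a failure here."""
    try:
        done = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
    except FileNotFoundError:
        return False
    return done.returncode == 0


def summarize(steps: Sequence[tuple[str, Sequence[str]]],
              runner: Callable[[Sequence[str]], bool] = run_step) -> tuple[str, bool]:
    """Run every step (no early exit) and build a paste-able PASS/FAIL block."""
    lines = []
    all_ok = True
    for name, cmd in steps:
        ok = runner(cmd)
        all_ok = all_ok and ok
        lines.append(f"{name:<14} {'PASS' if ok else 'FAIL'}")
    lines.append(f"{'overall':<14} {'PASS' if all_ok else 'FAIL'}")
    return "\n".join(lines), all_ok


if __name__ == "__main__":
    steps = [
        ("cargo fmt", ["cargo", "fmt", "--check"]),
        ("cargo clippy", ["cargo", "clippy", "--", "-D", "warnings"]),
    ]
    text, _ = summarize(steps)
    print(text)
```

Running every step even after a failure is a deliberate choice in this sketch: the reviewer sees the whole picture in one paste instead of one failure per round-trip.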
added 7 commits
May 7, 2026 10:19
Earlier commits wrote the new PR template (with `bash check.sh` one-shot + Verification block) to `.github/pull_request_template.md` — but git's index already tracked `.github/PULL_REQUEST_TEMPLATE.md` (uppercase). On macOS's case-insensitive filesystem the two paths look identical, so the writes appeared to land while git was actually tracking a sibling file. GitHub's PR-template lookup hits the uppercase path, so contributors were still seeing the old skeleton. Removes the lowercase ghost; consolidates content into the canonical uppercase file.
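This class of bug can be caught mechanically by looking for tracked paths that differ only by case. A minimal sketch follows. The function name `case_collisions` is hypothetical, and `str.casefold` is only an approximation of how a case-insensitive filesystem compares names (it ignores Unicode normalization, for instance).

```python
from collections import defaultdict
from typing import Iterable


def case_collisions(paths: Iterable[str]) -> list[list[str]]:
    """Group paths that become identical once case is ignored."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    # Keep only groups with more than one spelling; sort for stable output.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


if __name__ == "__main__":
    tracked = [
        ".github/PULL_REQUEST_TEMPLATE.md",
        ".github/pull_request_template.md",
        "README.md",
    ]
    for group in case_collisions(tracked):
        print("case collision:", ", ".join(group))
```

Feeding it the output of `git ls-files` (one path per line) would flag the two template files from this commit.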
Closes the lint-coverage gap from the audit:
- ruff: fully wired in CI (ci.yml + pr.yml) and check.sh. Fast, fails the build on any lint regression. Reads existing config from python/pyproject.toml [tool.ruff] (line 100, target py310, E/F/I/W/B).
- mypy --strict: wired but `continue-on-error: true` in CI and `WARN` in check.sh. Strict mode catches a lot, and the existing codebase isn't fully clean yet. Drop the flag once green; the gate stays in place.
Local check.sh now runs both with graceful skips if the tools aren't installed (`pip install ruff mypy` to enable). Output formatted to paste straight into the PR Verification block.
Rust side already covered (cargo fmt + clippy -D warnings); this brings Python to parity except for the strict-mode hold-back.
Three-layer review tuned for this codebase:
1. Real bugs: off-by-one, .unwrap() in non-test code, mio-token leaks in the hand-rolled runtime, .concurrent races violating the locked single-threaded-apply invariant, wire-format breaks across the SDK <-> binary boundary, TDD discipline (test: before feat: per CLAUDE.md §Conventions).
2. Architectural invariants from CLAUDE.md / memory: mio-only hot path, events-only v0, Redis-shaped processing-time-only, no-same-key sketch batching, no AI attribution in commits.
3. AI slop: hollow/phantom code, disconnected pipelines, defensive-but-unreachable conditions, what-not-why comments, em-dash density, hedge-y openings.
Severity scaled (BLOCK / WARN / NIT) so reviewers don't drown in NITs. Output format is fixed (Summary / BLOCK / WARN / NIT / Looks good) keyed by file:line so follow-up automation can grep it.
Sources:
- Antislop (ICLR 2026): slop frequency baselines
- AI-SLOP Detector: 27 adversarial pattern checks
- LLM code-review early results (arXiv 2404.18496)
- CLAUDE.md + memory for beava-specific invariants
.gitignore updated to ship `.claude/skills/beava-pr-review/` while keeping the rest of `.claude/` excluded, so contributors using Claude Code in the repo get the skill auto-discoverable.
Run after `ruff check . --fix` then hand-fix the remaining 6:
- 3× E501 wrap long error-message strings in tests/conformance/test_cross_sdk.py
- 2× E501 shrink long docstrings (test_app_lifecycle, test_pipeline_dsl)
- 1× B007 rename unused loop var `i` to `_` (test_point_ordinal)
ruff check . now reports "All checks passed".
…on flags
Two fixes:
1. `pytest tests/` was bypassing pyproject's `testpaths = ["tests/v0"]` by passing an explicit path. CI runs plain `pytest -q`, which honors testpaths and only gates the curated v0 acceptance suite (100 tests). Aligning local check.sh with CI: drop the `tests/` arg. The 76 failures in tests/internal/, tests/bench/, tests/integration/, tests/conformance/ are pre-Phase-13.5.1 SDK rewrite drift, explicitly documented in pyproject.toml as v0.0.x cleanup backlog and not gated.
2. Add `--rust` / `--python` flags so contributors who only touch one stack can skip the long-running cargo step. Mutually exclusive; default (no flag) still runs everything.
Updates PR template to match the new flags + verification block format.
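The flag semantics in the second fix (mutually exclusive, no flag means everything) can be illustrated with `argparse`. This is a sketch of the behavior only: `check.sh` itself is a bash script, and the `parse_stacks` name and the returned set are assumptions for illustration.

```python
import argparse
from typing import Sequence


def parse_stacks(argv: Sequence[str]) -> set[str]:
    """Return which stacks to check; with no flag, both stacks run."""
    parser = argparse.ArgumentParser(prog="check")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--rust", action="store_true", help="Rust checks only")
    group.add_argument("--python", action="store_true", help="Python checks only")
    args = parser.parse_args(list(argv))
    if args.rust:
        return {"rust"}
    if args.python:
        return {"python"}
    return {"rust", "python"}


if __name__ == "__main__":
    print(sorted(parse_stacks([])))
    print(sorted(parse_stacks(["--python"])))
```

Passing both flags makes `argparse` exit with a usage error, which matches the "mutually exclusive" wording above.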
… all-features
pr.yml shipped with `--all-features`, but ci.yml had already documented that the flag triggers pre-existing borrow-checker errors in the io_uring backend (v0.0.x backlog). ci.yml works around it with `--features testing`; pr.yml was missing the same workaround.
Result: PR checks failed with 6 compile errors in beava-runtime-core (unused imports, E0283/E0499/E0502 borrow violations) that don't reproduce with `--features testing`. This commit makes pr.yml mirror ci.yml's clippy + test invocations exactly.
Heavy gate now runs once per ok-to-test label-add. Subsequent pushes to the PR do NOT auto-rerun. To revalidate a new tip, a code owner removes + re-adds the `ok-to-test` label. Tradeoff: explicit re-approval per commit (no CI burn from rapid pushes) at the cost of a small UX hit for active iteration. Matches the spirit of the label-gate: code-owner approval is per-commit, not "I trust this branch forever."
added 4 commits
May 7, 2026 11:49
Previous commit dropped synchronize. This drops opened/reopened. Only `labeled` remains: the workflow fires exactly when a code owner adds `ok-to-test`, and only then. One-shot per label-add — no auto-rerun on push, no auto-fire on PR open. To revalidate a new tip: remove + re-add `ok-to-test`.
…l gates
Two-step heavy-CI flow:
1. Code owner adds `ok-to-test` (per CODEOWNERS): the gate.
2. PR owner: Actions → "PR Checks" → Run workflow → PR number → Run.
A new `gate` job validates `ok-to-test` is present at run time, looks up the PR's head SHA, and exposes it for downstream checkout. Removing the label revokes permission for future runs, even mid-iteration.
Behavioral changes vs previous trigger:
- No auto-fire on label-add (label unlocks; doesn't trigger)
- No auto-fire on push, PR open, reopen, synchronize
- Manual click in Actions tab is the trigger
- Concurrency keyed on inputs.pr_number (not event.pull_request.number)
- Adds `pull-requests: read` to default-deny token
- Python step now honors pyproject testpaths (matches local check.sh fix)
Updates PR template to document the two-step trigger flow.
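The run-time check performed by the `gate` job can be sketched as a pure function over PR metadata. This is an illustration under stated assumptions: the input is taken to be JSON in the shape produced by `gh pr view <number> --json labels,headRefOid`, and the names `head_sha_if_allowed` and `GateError` are hypothetical. How the actual job queries GitHub is not shown in the commit message.

```python
import json


class GateError(Exception):
    """Raised when the PR is not cleared for heavy CI."""


def head_sha_if_allowed(pr_view_json: str, label: str = "ok-to-test") -> str:
    """Return the PR head SHA if the label is present at call time, else raise."""
    pr = json.loads(pr_view_json)
    # Labels arrive as objects; only the name matters for the gate.
    names = {item["name"] for item in pr.get("labels", [])}
    if label not in names:
        raise GateError(f"label {label!r} is not set on this PR")
    return pr["headRefOid"]


if __name__ == "__main__":
    sample = '{"headRefOid": "abc123", "labels": [{"name": "ok-to-test"}]}'
    print(head_sha_if_allowed(sample))
```

Because the label is read when the job runs rather than when it was queued, removing the label blocks any later run, which is the revocation behavior described above.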
…spatch escape hatch
Researched what tokio / ruff / polars / arrow / huggingface actually do.
Universal finding: there is no native "PR-owner clicks button to start
CI" feature in GitHub. The closest thing is "Approve and run workflows"
in the PR's Checks tab, but that's a maintainer button surfaced by repo
Settings → Actions → "Require approval for all outside collaborators."
This commit aligns pr.yml with the standard pattern:
    on:
      pull_request:
        types: [labeled]     # ok-to-test → CI fires once
      workflow_dispatch:     # maintainer escape hatch w/ PR number
        inputs:
          pr_number: ...
    jobs:
      gate:
        if: workflow_dispatch || labels contain 'ok-to-test'
        # plus run-time recheck via gh pr view → revoking the label
        # mid-flight kills future runs
Differences from canonical OSS (tokio etc.):
- DROP `synchronize` — most projects auto-rerun on push when the label
is set; here we keep one-shot semantics per maintainer's request.
To revalidate new commits: remove + re-add ok-to-test.
- Run-time gate job re-checks the label via gh pr view (defense in
depth — protects against label-removed-after-trigger races).
Required repo setting (one-time, manual): Settings → Actions → General →
"Fork pull request workflows from outside collaborators" → "Require
approval for all outside collaborators." This unlocks GitHub's native
"Approve and run" button as the second physical gate for fork PRs.
Updates PR template to document the standard flow.
Native fork-PR approval (just enabled via gh api PUT
.../actions/permissions/fork-pr-contributor-approval = all_external_contributors)
is the security boundary. Label gate on internal PRs added friction
without security value for a single-maintainer repo where the maintainer
trusts their own pushes.
Changes:
- pr.yml: simple pull_request[main] trigger; no label, no workflow_dispatch,
no gate job. Just rust + python.
- ci.yml: drop the `[opened, synchronize, reopened, labeled]` types
filter and the per-job `if: push || ok-to-test` gate. Standard PR
auto-CI now.
- PULL_REQUEST_TEMPLATE.md: replace label-flow language with the
"fork → Approve and run, internal → auto" model.
The `ok-to-test` label itself is left in place (harmless; can be deleted
later if not used). Repo setting `Require approval for all outside
collaborators` carries the security gate.
What
End-to-end clean state for `pip install beava` users. The bv.dev homepage now uses the SDK's wire format end-to-end: no shim layer, no bucket-field sentinel, no JS client SDK on the frontend.
First PR woo testing some flow.
Pipeline
Wire (what flies on the network)
- `POST /api/push/PageView` with `{path, dwell_ms}`. Pure HTTP from the browser, no SDK shim.
- `POST /api/get` with `{table: "SiteMetrics", key: ""}`: empty-key sentinel for the keyless table per ADR-003.
- Registration: `{nodes: [{kind:"event"...}, {kind:"derivation"...}]}`, POSTed to `/register`. The binary on the box accepts it natively.
Hetzner deployment changed (live)
- `deploy/Dockerfile.beava`: builds beava:next from `crates/*` workspace.
- `beava-website/deploy/{beava.yaml, docker-compose.prod.yml, Caddyfile}`: yaml-driven config (listen 8080, admin 8090), drops the env-var soup, drops the admin-token Authorization injection.
- `beava-website/deploy/site-metrics-pipeline.json`: registered shape, re-runnable for cold-start.
Frontend
- `beava-website/project/js/track-pageview.js`: pure HTTP, ~30 lines, no shim.
- `beava-website/project/js/beava-client.js`: deleted. We tried it, then dropped it once the binary upgrade let the frontend talk plain HTTP.
- `beava-website/project/index.html`: LiveMetrics polls `POST /api/get` for the global SiteMetrics row.
Verified
Followups (separate)
- `PageMetrics` derivation once feature names are namespaced.
- `beava-website_beava-state` volume (still has old format data hanging around).
- `v0.0.0` to fire the homebrew-bump workflow.