feat(api): AIN-179 launch hardening — public-audit cache + AIN-183 P0-1 by hizrianraz · Pull Request #45 · ainfera-ai/api

hizrianraz · 2026-05-19T09:17:55Z

Summary

Two safe pre-launch changes against the public surface of the api repo, bundled because they share scope (security audit, P0 launch readiness) and ship into the same Railway deploy.

1. /v1/audit/public — explicit short-TTL cache

Live probe at 09:08 UTC showed cf-cache-status: DYNAMIC (no cache header set), meaning every public-feed hit lands on Railway origin. Fine at steady state — but the route is advertised as a "live audit chain" and we expect launch-day traffic from social/Show HN.

Setting Cache-Control: public, max-age=10, s-maxage=10, stale-while-revalidate=60:

10s edge cache absorbs thundering-herd
60s SWR keeps the page responsive during origin blips
Advertised "live" lag stays ≤10s
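
A minimal sketch of how the header value above could be assembled from its three TTLs. The function and constant names are illustrative assumptions, not code from this PR; only the numeric values come from the description.

```python
def build_cache_control(max_age: int, s_maxage: int, swr: int) -> str:
    """Assemble a public Cache-Control value from its three TTLs (seconds)."""
    if min(max_age, s_maxage, swr) < 0:
        raise ValueError("TTLs must be non-negative")
    return (
        f"public, max-age={max_age}, s-maxage={s_maxage}, "
        f"stale-while-revalidate={swr}"
    )


# Values taken from the PR description; the names here are hypothetical.
AUDIT_PUBLIC_CACHE_CONTROL = build_cache_control(max_age=10, s_maxage=10, swr=60)
print(AUDIT_PUBLIC_CACHE_CONTROL)
```

Keeping the TTLs as named parameters makes it harder for the browser TTL (max-age) and the shared-cache TTL (s-maxage) to drift apart unnoticed.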

/v1/audit/{agent_id} is auth-gated (CurrentTenant), not public, and is NOT touched.

2. AIN-183 P0-1 remediation for api repo

The web/ side of P0-1 was fixed by 7bbb8a0 on 2026-05-19. The api/ side was missed: git ls-files .claude/ still returned .claude/CLAUDE.md, leaking founder Workspace email + internal agent fleet topology to anyone with read access to github.com/ainfera-ai/api.

Same fix shape as web/:

Drop !.claude/CLAUDE.md / SKILL.md / AGENTS.md whitelist from .gitignore
git rm --cached .claude/CLAUDE.md removes from HEAD index. File stays on disk for local sessions.

The file remains in git history. Force-push / BFG sweep is deferred to AIN-183 PR F republish lane.
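
The .gitignore half of the fix can be sketched as a filter that drops the negation lines. This is an illustration under assumptions: the sample .gitignore content and the helper name are hypothetical, since the real file is not shown on this page.

```python
def drop_whitelist(gitignore_lines: list[str], paths: set[str]) -> list[str]:
    """Remove negation ("!path") lines that re-include the given paths."""
    kept = []
    for line in gitignore_lines:
        stripped = line.strip()
        if stripped.startswith("!") and stripped[1:] in paths:
            continue  # drop the whitelist entry so the path stays ignored
        kept.append(line)
    return kept


# Hypothetical .gitignore content, for illustration only.
before = [
    ".claude/*",
    "!.claude/CLAUDE.md",
    "!.claude/SKILL.md",
    "!.claude/AGENTS.md",
    ".env",
]
after = drop_whitelist(
    before, {".claude/CLAUDE.md", ".claude/SKILL.md", ".claude/AGENTS.md"}
)
print(after)  # → ['.claude/*', '.env']
```

Editing .gitignore alone does not untrack a file that is already in the index, which is why the git rm --cached step is still required.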

Test plan

CI green (ruff + mypy --strict + pytest passed locally pre-commit)
After Railway redeploy, curl -sI https://api.ainfera.ai/v1/audit/public shows cache-control: public, max-age=10, s-maxage=10, stale-while-revalidate=60
After Railway redeploy, two curl hits within 10s return identical top event_id (edge cache hit)
gh api repos/ainfera-ai/api/contents/.claude/CLAUDE.md --jq . returns 404 on main post-merge
Founder confirms local .claude/CLAUDE.md still loads (untouched on disk)
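
The header check in the test plan could be automated along these lines. This is a sketch, not part of the PR: the parser name is hypothetical, it only handles simple name or name=integer directives, and the observed value is hard-coded rather than fetched from the live endpoint.

```python
def parse_cache_control(value: str) -> dict[str, object]:
    """Parse a Cache-Control value into {directive: seconds-or-True}."""
    directives: dict[str, object] = {}
    for part in value.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directives without "=" (e.g. "public") are stored as True.
        directives[name] = int(arg) if sep else True
    return directives


EXPECTED = {
    "public": True,
    "max-age": 10,
    "s-maxage": 10,
    "stale-while-revalidate": 60,
}

# Header value as the test plan expects curl -sI to report it.
observed = "public, max-age=10, s-maxage=10, stale-while-revalidate=60"
assert parse_cache_control(observed) == EXPECTED
```

Comparing parsed directives rather than raw strings tolerates differences in casing or directive order between origin and edge.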

Closes: AIN-179 audit-feed launch hardening child, AIN-183 P0-1 api-side
Discipline: #1, #3, #11

…-1 .claude untrack

Two safe pre-launch changes against the public surface of the api repo.

1. /v1/audit/public — explicit short-TTL Cache-Control.

Live probe at 2026-05-19 09:08 UTC showed the route running with no Cache-Control header (cf-cache-status: DYNAMIC), meaning every hit lands on the Railway origin. At steady state this is fine — top events are ~minutes old, no staleness exists — but the prompt CC dispatched against this PR was diagnosed on an earlier transient; the underlying gap (no explicit launch-time cache policy on a public endpoint advertised as a live audit chain) is still real and worth closing before traffic spikes.

public, max-age=10, s-maxage=10, stale-while-revalidate=60 gives:

- 10s edge cache absorbs thundering-herd from Show HN / social
- 60s SWR keeps the page responsive during upstream blips
- advertised "live audit chain" lag stays ≤10s — well within what a reasonable observer would call "live"

/v1/audit/{agent_id} is NOT touched — that route requires CurrentTenant auth and is not a public surface.

2. AIN-183 P0-1 remediation for the api repo.

The 2026-05-19 11:59 commit on web/ (7bbb8a0) untracked .claude/CLAUDE.md on the marketing repo but the equivalent fix was never applied here. git ls-files .claude/ on this repo still returned .claude/CLAUDE.md — leaking founder Workspace email, internal agent fleet topology (Manwe/Namo/Aule/Tulkas), and brand v1.3 spec to anyone with read access to the public github.com/ainfera-ai/api repo.

Fix is the same shape as the web/ fix:

- Drop the !.claude/CLAUDE.md / !.claude/SKILL.md / !.claude/AGENTS.md whitelist lines from .gitignore. None of those three were ever needed tracked in a public repo; the whitelist was defense-in-the-wrong-direction.
- git rm --cached .claude/CLAUDE.md removes from HEAD index. The file stays on disk so local Claude sessions in this checkout continue to load it as session memory.

This commit only removes from HEAD; the file remains in git history and the founder will need a force-push or BFG sweep to scrub from history. That work is deferred to AIN-183 follow-up (PR F republish lane).

Discipline: #1 (production claim ≤10s lag matches reality), #3 (founder PII off public surface), #11 (cache policy + advertised liveness align).

Closes: AIN-179 child (audit feed launch hardening), AIN-183 P0-1 api-side

linear-code · 2026-05-19T09:18:00Z

AIN-179 🚀 Session 3.5 DELIVERY CHARTER — Close all Linear tickets by delivering, except payment + regulation lanes (Aule's canonical execution reference)

Severity: URGENT 🚀 — AULE'S CANONICAL EXECUTION CHARTER FOR THIS DELIVERY WAVE

Founder directive 2026-05-18 PM: "Patch now and close all linear tickets now by delivering except payment and regulation. Tell CC."

This ticket is Aule's persistent reference. Read at every poll. Updates appear as comments.

What this means in one paragraph

Close every open Linear ticket by shipping it (PR merged + prod deploy verified + Tulkas re-probe passes), EXCEPT tickets in the payment lane (deferred to AIN-129 CDP unlock) or regulation lane (deferred to legal review). All other open tickets are in scope — bug fixes, Phase 0 epics, v6.1 docs, fleet agent legs, founder-decision-pending items get clear recommendations from Aule, and ghost-state cleanups.

DEFER lanes (do NOT touch)

Payment lane (gated by founder AIN-129 CDP signup ~10 min)

Ticket	What
AIN-128	USDC x402 ship (parent)
AIN-129	CDP signup (founder action)
AIN-130	Wire x402 USDC settlement
AIN-131	Dashboard /wallet topup UI
AIN-132	Marketing /pricing honesty
AIN-86	Wallet topup end-to-end
AIN-111	Manwe migration (touches wallet caps)

Plus: any new ticket that requires USDC/Stripe/Xendit/wallet-topup work. Flag and skip.

Regulation lane (gated by legal review)

EU AI Act Annex IV mapping work
MAS PSA legal review (pre-Seed)
SG IPOS trademark filings
USDC settlement legal review
Anything explicitly tagged "legal review required" or that would create regulatory exposure if shipped without counsel

No current tickets in this lane. Flag-and-skip if new ones surface during this delivery wave.

DELIVERY queue (SHIP ALL)

Already In Progress (Aule executing — 5 of 6 since 13:46–14:06 UTC)

Ticket	Started	What
AIN-173 🔴	13:51	Bug 1 catalog/inference drift fix
AIN-175 🟠	13:46	Bug 2 agent_handle resolver in error path
AIN-176 🟡	13:49	Bug 3 finish_reason normalization
AIN-177 🟡	13:49	Bug 4 content shape normalization
AIN-178 🔴	14:06	Tulkas Phase 0 batteries (Phase 1 cron deferred to Lock U)

Pending pickup (Backlog → In Progress when Aule reaches them)

Ticket	What
AIN-174 🔴	Bug 5 native SSE streaming (highest-risk fix, STAGING canary mandatory)

Phase 0 epics (designed + locked, founder-gate-free)

Ticket	What	Time
AIN-152	Waitlist GitHub OAuth scaffold (Lock P unlocks live flow)	60 min
AIN-153	Dashboard 8-page Unicorn (D19/D20/D21 locked)	90 min
AIN-154	Router hardening Phase 0 (D22-D26 locked) — parent of 5 bug fixes	90 min
AIN-158	Marketing 5-page Ampersend (D34-D37 locked)	90 min
AIN-161	Models leaderboard 24h (D27-D30 locked)	75 min

In Review (close after merge — 8 tickets)

Ticket	What	Action
AIN-83	Pre-launch polish	Audit PRs #11/#12 sdk+web via `gh pr list --state merged --search "AIN-83"`; if merged → Done
AIN-84	Dashboard end-to-end fix	Verify against prod, close as Done
AIN-120	Aule GitHub MCP transport decision	Close per session-3 decision OR file follow-up
AIN-151	Yavanna heuristics	PR #38; founder merge → Done
AIN-160	Fleet tools wiring	PR #39/40/41/42; founder merge → Done
AIN-169	v6.1 patch Discipline #6 corollary	PR #36; founder merge → Done
AIN-170	INC-2026-05-18-002 re-rotation	Verify token rotated, audit clean → Done
AIN-172	SPOF env override generalization	PR #37; founder merge → Done

Founder-decision-pending (Aule writes recommendation, founder picks)

Ticket	What	Aule's recommendation
AIN-87	Settings API Keys + Tenant schema	Per Aule's 14:24 comment: demote to v1.9 schema spike, split into AIN-87a (api_keys migration + endpoints), AIN-87b (tenant PATCH + region column), AIN-87c (dashboard wiring). OR close as blocked. Founder decision needed within 24h to avoid ghost-state.

Ghost-state to verify clean

Aule's session-3 close claimed "AIN-87/88/89/90/124 closeouts filed." Verified:

AIN-87 NOT a closeout (see above)
AIN-88, AIN-89, AIN-90 status not yet spot-checked — Aule verify next poll
AIN-124 status not yet spot-checked — Aule verify next poll

If any of AIN-88/89/90/124 closeout comments recommend Done state transition, Aule moves them. If any recommend a decision, Aule files a comment summarizing for founder.

Execution sequencing (Aule's 6-PR ship plan)

Per AIN-154 comment + AIN-160 comment from 13:39 UTC:

PR #1 → AIN-173 catalog audit + AIN-178 Phase 0 Tulkas batteries
        STAGING canary per When-stuck #19 → Tulkas validates → prod
      
PR #2 → AIN-174 native SSE streaming (highest risk)
        STAGING canary + Tulkas validation → prod
      
PR #3 → AIN-175 agent_handle resolver in error path (low risk)
        Direct to prod after pre-commit gates
      
PR #4 → AIN-176 + AIN-177 bundled response normalization
        STAGING canary + Tulkas validation → prod
      
PR #5 → AIN-178 Phase 1 Tulkas cron activation
        ⚠️ Hetzner systemd is shared-infra per Discipline #6 corollary
        STOP and request Lock U itemized auth before opening PR
      
PR #6 → After all 5 bugs fixed: re-validate PR #40/#41/#42 (Yavanna/Aule/Namo legs)
        against fixed /v1/inference surface before founder merges them

After PR #1-4 land:

Replay Tulkas Battery #4 (cross-framework simulation) against fixed /v1/inference
If all 5 frameworks pass ≥95% success rate → notify founder to merge PR #40/#41/#42
If any framework fails → file a child ticket under AIN-160 with reproducer
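
The pass/fail gate above can be expressed as a small check. This is a sketch under assumptions: the function name, the result shape (passed, total), and the framework names are hypothetical; only the ≥95% threshold comes from the charter.

```python
THRESHOLD_PERCENT = 95  # "≥95% success rate" from the charter


def frameworks_below_threshold(results: dict[str, tuple[int, int]]) -> list[str]:
    """Return frameworks whose pass rate is under the threshold.

    results maps framework name -> (passed, total). Integer arithmetic
    avoids floating-point edge cases exactly at 95%.
    """
    failing = []
    for name, (passed, total) in results.items():
        if total == 0 or passed * 100 < total * THRESHOLD_PERCENT:
            failing.append(name)
    return sorted(failing)


# Hypothetical framework names and counts, for illustration only.
sample = {"fw_a": (100, 100), "fw_b": (95, 100), "fw_c": (94, 100)}
print(frameworks_below_threshold(sample))  # → ['fw_c']
```

An empty result would correspond to the "notify founder to merge" branch; any listed framework would correspond to filing a child ticket with a reproducer.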

After PR #5 (Tulkas Phase 1):

Verify cron schedule running on Hetzner CX42
Monitor 24h: audit chain shows ≥100 tulkas.* events/hour
If Tulkas hits daily cap unexpectedly → file bug + revisit caps

After Phase 0 epics deliver (AIN-152/153/154/158/161):

Each epic's Phase 1 is a separate planning cycle, not part of this delivery wave
Mark each Phase 0 ticket as Done when PR merged + smoke passes
File Phase 1 tickets if appropriate, but do NOT auto-execute Phase 1

Disciplines still gating Aule (do NOT relitigate)

Discipline #1 — No Done without proof

Curl + browser + test must all pass. No "PR merged" = "Done." Verify deploy with explicit curl against prod surface, browser smoke if UI-touching, all pre-commit gates green.

Discipline #6 corollary — Shared-infra config requires itemized auth

Even with this blanket "deliver all" directive, the following STILL require explicit per-change auth:

Docker compose changes (any service touched)
Kubernetes / Hetzner systemd / Railway env changes
DNS records / Cloudflare zones / Vercel projects
IAM / KMS / Doppler vault scope changes
1Password vault scope changes
Prod database migrations (use staging-canary first per When-stuck #19)

Canonical case this wave: AIN-178 Phase 1 Tulkas cron on Hetzner systemd. STOP and request Lock U before shipping.

Discipline #11 corollary — No catalog vs runtime drift

Bug 1 (AIN-173) IS this corollary in production. When fixing, also add catalog-runtime sync verification: any change to models.capabilities or cost_per_call_estimate_usd MUST be followed by a smoke against /v1/inference that auto-routes through that capability/cost path. Otherwise we're shipping a drift.
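
One way to decide which models need a follow-up smoke is to diff two catalog snapshots on the two watched fields. This is a sketch only: the snapshot shape, the function name, and the model ids are assumptions; the field names capabilities and cost_per_call_estimate_usd come from the text above.

```python
WATCHED_FIELDS = ("capabilities", "cost_per_call_estimate_usd")


def models_needing_smoke(old: dict[str, dict], new: dict[str, dict]) -> list[str]:
    """List model ids whose watched catalog fields differ between snapshots."""
    changed = []
    for model_id, row in new.items():
        previous = old.get(model_id, {})
        if any(row.get(f) != previous.get(f) for f in WATCHED_FIELDS):
            changed.append(model_id)
    return sorted(changed)


# Hypothetical catalog rows, for illustration only.
old_catalog = {
    "m1": {"capabilities": ["chat"], "cost_per_call_estimate_usd": 0.01},
    "m2": {"capabilities": ["chat", "vision"], "cost_per_call_estimate_usd": 0.02},
}
new_catalog = {
    "m1": {"capabilities": ["chat"], "cost_per_call_estimate_usd": 0.01},
    "m2": {"capabilities": ["chat", "vision"], "cost_per_call_estimate_usd": 0.03},
}
print(models_needing_smoke(old_catalog, new_catalog))  # → ['m2']
```

Each returned id would then be exercised with a /v1/inference smoke that routes through the changed capability or cost path.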

Discipline #12 ESCALATE — Moat decisions stay founder-only

Even with deliver-all auth, the following are NOT in scope:

AAMC voter weights or composition changes
Settlement model (deferred with payment anyway)
ATS dimension weights or veto thresholds
Founder identity surface (PUBLIC LOCK — absolute, no exceptions)
Public scoring methodology
OSS license terms
Cross-tenant data sharing rules

If a fix touches any of these, STOP and surface to founder before shipping.

Discipline #14 corollary — Secrets clipboard-only

Per INC-2026-05-18-001 + INC-2026-05-18-002. If any bug fix surfaces a secret value (env var, token, hash), DO NOT echo it in transcripts. Use pbcopy / xclip and tell founder "copied to clipboard."

When-stuck #19 — STAGING canary before prod for catalog/threshold/router migrations

AIN-173 catalog fix is the canonical case for this wave. STAGING canary first, then prod. Verify Tulkas Battery #1 passes against STAGING before deploying to prod.

Communication protocol

When you ship a PR

Post a comment on this ticket (AIN-179) with:

PR URL
Tickets it closes
Production verification curl output (sanitized of secrets)
Tulkas Battery results if applicable

When you hit a Discipline #6 or #12 wall

Post a comment on this ticket asking for itemized auth. Don't proceed.

When you find a NEW bug class (not regression of existing)

File a new child ticket under AIN-154 (router hardening) or AIN-160 (fleet tools), per the bug's surface. Add to delivery queue.

When a founder-decision ticket needs decision

Post a comment summarizing the decision + your recommendation. Stop on that ticket, continue with others.

Scope NOT in this charter (out of scope or already done)

AIN-115 D30-D45 launch posture lock — stays as posture anchor, don't close
AIN-163 v1.9-CP Universal Control Plane — backlog, post-v1.8 (~Aug 2026)
AIN-133 Sprint v1.7 PARENT — already Done
AIN-171 production 500 incident — already Done
Anything cancelled in session-3 — leave cancelled

Verification gate before declaring "all delivered"

This charter is closed (state → Done) when:

Founder authorization trail

Session 3 close: "All tickets should be delivered except payment."
Session 3.5 close: "Patch now and close all linear tickets now by delivering except payment and regulation. Tell CC." (2026-05-18 PM)
All cited disciplines remain in force per CC Super Prompt v6 + v6.1 patches.

Aule: start at PR #1 in the sequence. Update this ticket as you ship.

AIN-183 🟠 All-repos audit sweep — 14 repos × spec-vs-built + Discipline #3/#4/#11/#17 verification

Severity: 🟠 HIGH — comprehensive repo-level Discipline #1 + #17 audit

Filed 2026-05-18 PM after AIN-153 + AIN-158 spec-vs-built audit revealed Aule shipping less than spec'd then marking parents Done. Pattern requires systematic verification across ALL 14 repos in the ainfera-ai GitHub org.

Founder directive: "Hard revert to In Progress and force the missing work. Also check all repos."

Scope

For each of the 14 repos in ainfera-ai org, Aule audits:

Spec vs built — does the repo's stated purpose match what's actually in main?
Tickets marked Done — for each Done ticket touching this repo in last 14d, verify the deliverables match the ticket's acceptance gates
Surfaces live in production — does the prod deployment match what main branch's app/ or src/ declares?
Founder PII grep — 0 matches for "Hizrian|Izzy|Raz|founder|fibromyalgia|ADHD|snowboard|Julius Baer" in any public surface or repo README/docs
Internal agent name grep — public-facing surfaces should NOT mention Manwe/Yavanna/Namo/Aule/Tulkas (Varda is acceptable per PUBLIC #1 lock)
Lock compliance — D7-D37 locks honored in any new code
Discipline #11 — Notion canonical vs code drift

14 repos to audit

Production-facing (Phase 1, urgent)

#	Repo	Audit focus
1	`ainfera-ai/web`	Marketing + dashboard pages — match D14 + Part 2 spec
2	`ainfera-ai/api`	FastAPI surfaces, /v1/* endpoints match docs.ainfera.ai claims
3	`ainfera-ai/ainfera-os`	Monorepo root, docs/ folder, SKILL.md files, package versions
4	`ainfera-ai/mcp`	mcp.ainfera.ai FastMCP server, tool surface matches docs
5	`ainfera-ai/sdk` (or python-sdk if named differently)	PyPI `ainfera` package matches API surface, exported symbols

Agent-implementation (Phase 2)

#	Repo	Audit focus
6	`ainfera-ai/varda` (or similar)	NemoClaw orchestrator config, GPT-5.5 binding, public surface compliance
7	`ainfera-ai/aule`	Claude SDK + Opus 4.7 xhigh config, author override in commit history
8	`ainfera-ai/yavanna`	LangGraph + Sonnet 4.6 + Grok 4, response_review heuristics
9	`ainfera-ai/namo`	Letta + Gemini 3.1 Pro, memory consolidation
10	`ainfera-ai/tulkas`	Garak + Mistral Large 3, probe batteries

Customer + tooling (Phase 3)

#	Repo	Audit focus
11	`ainfera-ai/hermes-agent` (Manwe Customer #1 fork)	v0.14.0 SHA a91a57fa matches reported state
12	`ainfera-ai/examples`	5 example agent repos per AIN-78
13	`ainfera-ai/specs` (or similar)	CC-BY 4.0 spec docs match Notion canonical
14	`ainfera-ai/.github` or org-level	Issue templates, PR templates, CODEOWNERS, branch protection

Audit deliverable per repo

For each repo, Aule produces a comment on this ticket with:

## Repo: ainfera-ai/<name>

### Spec match
- Stated purpose: <from README>
- Actual state: <from main branch traversal>
- Drift: <none / specifics>

### Recent Done tickets touching this repo
- AIN-XYZ — claimed: "..." → actual state: ✅ matches / ⚠️ partial / ❌ missing

### Production-vs-main drift
- Last deploy SHA: <sha>
- main HEAD: <sha>
- Drift: <none / files differ / etc>

### Grep results
- Founder PII: <count> matches → <files>
- Internal agent names in public: <count> matches → <files>

### Lock compliance
- D7-D37 references: <count> 
- Discipline #6 corollary violations: <count>

### Recommendation
- ✅ Clean / ⚠️ Cleanup needed / 🔴 Active violation

### Tickets to file
- <list of child tickets needed if cleanup work surfaces>

Audit commands Aule runs per repo

cd ~/code/ainfera-ai/<repo>
git pull origin main

# Discipline #3 grep
rg -i "hizrian|izzy|raz|fibromyalgia|adhd|snowboard|julius baer|sommelier" \
   --type-not lock \
   --type-not log \
   -l

# Internal agent naming in public surfaces
rg -i "manwe|yavanna|namo|aule|tulkas" \
   src/ app/ public/ docs/ README.md \
   --type-not lock \
   -l

# Discipline #4 author override check (last 50 commits)
git log -50 --pretty=format:'%h %an <%ae>' | rg -v "Aule <aule@" | head -20

# Production-vs-main drift (for deployed repos)
gh api repos/ainfera-ai/<repo>/deployments --jq '.[0:3] | .[].sha'
# Compare against `git rev-parse main`

# Spec vs files
ls -la docs/
cat README.md | head -30
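
The Discipline #3 grep above can be mirrored in Python when the scan needs to run inside a test suite. This is a sketch: the helper name and the sample file contents are hypothetical, and the pattern is copied from the rg command.

```python
import re

# Same alternation as the rg command above, case-insensitive like `rg -i`.
PII_PATTERN = re.compile(
    r"hizrian|izzy|raz|fibromyalgia|adhd|snowboard|julius baer|sommelier",
    re.IGNORECASE,
)


def files_with_matches(files: dict[str, str]) -> list[str]:
    """Return names of files whose text matches, similar to `rg -l`."""
    return sorted(name for name, text in files.items() if PII_PATTERN.search(text))


# Hypothetical file contents, for illustration only.
sample_files = {
    "README.md": "Ainfera API",
    "notes.md": "Contact Izzy for access",
    "src/app.py": "def braze(): pass",
}
print(files_with_matches(sample_files))  # → ['notes.md', 'src/app.py']
```

The src/app.py hit shows that a short alternative such as raz matches inside longer words like braze. Word boundaries (rg -w, or \b in the pattern) would reduce that noise at the cost of missing embedded occurrences.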

Acceptance gates

All 14 repos audited
Audit comment posted per repo on this ticket
Any 🔴 Active violation gets immediate child ticket filed
Any ⚠️ Cleanup ticket gets queued for AIN-179 delivery wave
Summary comment at end: total findings + per-severity counts + recommended next actions
Aule author override on the audit branch
No remediation in this ticket — only findings. Remediation = follow-up tickets.

Out of scope

Code refactoring (only audit-and-report)
Linter sweeps (separate cleanup pass)
Test coverage analysis (covered by AIN-118 or successor)
Performance audit (separate concern)

Connection

Triggered by: AIN-153 + AIN-158 spec-vs-built mismatch finding (2026-05-18 PM session 3.5 audit)
Authority: AIN-179 delivery wave + "Also check all repos" directive
Discipline references: #1 (no Done without proof) + #11 (no drift) + #17 (verify before claim)
Pattern reference: This is Discipline #17 operationalized at repo level

Founder authorization

Per "Hard revert to In Progress and force the missing work. Also check all repos" (2026-05-18 session 3.5 PM).



hizrianraz merged commit ba86479 into main May 19, 2026
3 checks passed
