feat(api): AIN-179 launch hardening — public-audit cache + AIN-183 P0-1#45

Merged
hizrianraz merged 1 commit into main from feat/ain-179-audit-cache-control-and-p01-remediation
May 19, 2026

@hizrianraz
Contributor

Summary

Two safe pre-launch changes against the public surface of the api repo, bundled because they share scope (security audit, P0 launch readiness) and ship into the same Railway deploy.

1. /v1/audit/public — explicit short-TTL cache

Live probe at 09:08 UTC showed cf-cache-status: DYNAMIC (no cache header set), meaning every public-feed hit lands on Railway origin. Fine at steady state — but the route is advertised as a "live audit chain" and we expect launch-day traffic from social/Show HN.

Setting Cache-Control: public, max-age=10, s-maxage=10, stale-while-revalidate=60:

  • 10s edge cache absorbs thundering-herd
  • 60s SWR keeps the page responsive during origin blips
  • Advertised "live" lag stays ≤10s

/v1/audit/{agent_id} is auth-gated (CurrentTenant), not public, and is NOT touched.
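The policy above can be sketched as a small value object. This is an illustrative model only, not the code in the api repo: the class and method names are hypothetical, and the state labels are made up for this example. It also shows one consequence worth keeping in mind: under the stale-while-revalidate extension (RFC 5861), a cache may keep serving a stale copy for up to 60s past the 10s freshness lifetime while it revalidates, so the 10s figure describes fresh hits. Whether the CDN in front of Railway honors that directive is not stated in the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachePolicy:
    """Directive values in seconds, copied from the PR description."""

    max_age: int = 10
    s_maxage: int = 10
    stale_while_revalidate: int = 60

    def header_value(self) -> str:
        """Render the Cache-Control value in the order the PR states it."""
        return (
            f"public, max-age={self.max_age}, s-maxage={self.s_maxage}, "
            f"stale-while-revalidate={self.stale_while_revalidate}"
        )

    def edge_state(self, age_seconds: int) -> str:
        """Classify a cached copy of the given age at a shared cache.

        A response is fresh while its age is below the freshness lifetime;
        the labels returned here are hypothetical names for this sketch.
        """
        if age_seconds < self.s_maxage:
            return "fresh"
        if age_seconds < self.s_maxage + self.stale_while_revalidate:
            return "stale-while-revalidate"
        return "must-refetch"

    def worst_case_staleness(self) -> int:
        """Oldest copy a cache honoring the SWR window may still serve."""
        return self.s_maxage + self.stale_while_revalidate
```

For example, `CachePolicy().edge_state(30)` falls in the revalidation window, which is the case to watch if the "live" claim is meant as a hard upper bound.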

2. AIN-183 P0-1 remediation for api repo

The web/ side of P0-1 was fixed by 7bbb8a0 on 2026-05-19. The api/ side was missed: git ls-files .claude/ still returned .claude/CLAUDE.md, leaking founder Workspace email + internal agent fleet topology to anyone with read access to github.com/ainfera-ai/api.

Same fix shape as web/:

  • Drop !.claude/CLAUDE.md / SKILL.md / AGENTS.md whitelist from .gitignore
  • git rm --cached .claude/CLAUDE.md removes the file from the index, so this commit drops it from the tree at HEAD. The file stays on disk for local sessions.

The file remains in git history. Force-push / BFG sweep is deferred to AIN-183 PR F republish lane.
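The two steps of the fix can be illustrated with plain string handling. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the sample .gitignore content in the usage note is invented for illustration, and the real change was made by editing .gitignore and running git rm --cached directly.

```python
def drop_whitelist_lines(gitignore_text: str, names: tuple[str, ...]) -> str:
    """Remove negation ("!") patterns that re-include files under .claude/."""
    blocked = {f"!.claude/{name}" for name in names}
    kept = [
        line
        for line in gitignore_text.splitlines()
        if line.strip() not in blocked
    ]
    return "\n".join(kept) + "\n"


def still_tracked(ls_files_output: str) -> list[str]:
    """Return the paths listed by `git ls-files .claude/`.

    An empty list means nothing under .claude/ is tracked any more.
    """
    return [path for path in ls_files_output.splitlines() if path.strip()]
```

With a hypothetical .gitignore of `.claude/*` followed by the three `!.claude/...` lines, `drop_whitelist_lines(text, ("CLAUDE.md", "SKILL.md", "AGENTS.md"))` leaves only the `.claude/*` rule, and `still_tracked("")` returning an empty list corresponds to the clean post-fix state.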

Test plan

  • CI green (ruff + mypy --strict + pytest passed locally pre-commit)
  • After Railway redeploy, curl -sI https://api.ainfera.ai/v1/audit/public shows cache-control: public, max-age=10, s-maxage=10, stale-while-revalidate=60
  • After Railway redeploy, two curl hits within 10s return identical top event_id (edge cache hit)
  • gh api repos/ainfera-ai/api/contents/.claude/CLAUDE.md --jq . returns 404 on main post-merge
  • Founder confirms local .claude/CLAUDE.md still loads (untouched on disk)
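The two curl checks in the test plan could be automated roughly as follows. This is a sketch, not part of the PR: the function names are hypothetical and the input is assumed to be the raw text printed by curl -sI. Note that two identical top event_id values are consistent with an edge cache hit but would also occur if no new event was written between the requests, so a cf-cache-status: HIT header is the stronger signal.

```python
EXPECTED_CACHE_CONTROL = "public, max-age=10, s-maxage=10, stale-while-revalidate=60"


def parse_head_response(raw: str) -> dict[str, str]:
    """Parse `curl -sI` output into a dict keyed by lowercase header name."""
    headers: dict[str, str] = {}
    for line in raw.splitlines()[1:]:  # first line is the status line
        if ":" not in line:
            continue  # blank separator or malformed line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def cache_header_ok(raw: str) -> bool:
    """True when the response carries exactly the expected Cache-Control."""
    return parse_head_response(raw).get("cache-control") == EXPECTED_CACHE_CONTROL


def served_from_edge(raw: str) -> bool:
    """True when Cloudflare reports the response as a cache HIT."""
    return parse_head_response(raw).get("cf-cache-status", "").upper() == "HIT"
```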

Closes: AIN-179 audit-feed launch hardening child, AIN-183 P0-1 api-side
Discipline: #1, #3, #11

…-1 .claude untrack

Two safe pre-launch changes against the public surface of the api repo.

1. /v1/audit/public — explicit short-TTL Cache-Control.

   Live probe at 2026-05-19 09:08 UTC showed the route running with no
   Cache-Control header (cf-cache-status: DYNAMIC), meaning every hit lands
   on the Railway origin. At steady state this is fine — top events are
   ~minutes old, no staleness exists — but the prompt CC dispatched against
   this PR was diagnosed on an earlier transient; the underlying gap (no
   explicit launch-time cache policy on a public endpoint advertised as a
   live audit chain) is still real and worth closing before traffic spikes.

   public, max-age=10, s-maxage=10, stale-while-revalidate=60 gives:
     - 10s edge cache absorbs thundering-herd from Show HN / social
     - 60s SWR keeps the page responsive during upstream blips
     - advertised "live audit chain" lag stays ≤10s — well within
       what a reasonable observer would call "live"

   /v1/audit/{agent_id} is NOT touched — that route requires CurrentTenant
   auth and is not a public surface.

2. AIN-183 P0-1 remediation for the api repo.

   The 2026-05-19 11:59 commit on web/ (7bbb8a0) untracked .claude/CLAUDE.md
   on the marketing repo but the equivalent fix was never applied here.
   git ls-files .claude/ on this repo still returned .claude/CLAUDE.md —
   leaking founder Workspace email, internal agent fleet topology
   (Manwe/Namo/Aule/Tulkas), and brand v1.3 spec to anyone with read access
   to the public github.com/ainfera-ai/api repo.

   Fix is the same shape as the web/ fix:
   - Drop the !.claude/CLAUDE.md / !.claude/SKILL.md / !.claude/AGENTS.md
     whitelist lines from .gitignore. None of those three were ever needed
     tracked in a public repo; the whitelist was defense-in-the-wrong-
     direction.
   - git rm --cached .claude/CLAUDE.md removes the file from the index, so
     this commit drops it from the tree at HEAD. The file stays on disk so
     local Claude sessions in this checkout continue to load it as session
     memory.

   This commit only removes from HEAD; the file remains in git history and
   the founder will need a force-push or BFG sweep to scrub from history.
   That work is deferred to AIN-183 follow-up (PR F republish lane).

Discipline: #1 (production claim ≤10s lag matches reality), #3 (founder
PII off public surface), #11 (cache policy + advertised liveness align).

Closes: AIN-179 child (audit feed launch hardening), AIN-183 P0-1 api-side
@linear-code

linear-code Bot commented May 19, 2026

AIN-179 🚀 Session 3.5 DELIVERY CHARTER — Close all Linear tickets by delivering, except payment + regulation lanes (Aule's canonical execution reference)

Severity: URGENT 🚀 — AULE'S CANONICAL EXECUTION CHARTER FOR THIS DELIVERY WAVE

Founder directive 2026-05-18 PM: "Patch now and close all linear tickets now by delivering except payment and regulation. Tell CC."

This ticket is Aule's persistent reference. Read at every poll. Updates appear as comments.


What this means in one paragraph

Close every open Linear ticket by shipping it (PR merged + prod deploy verified + Tulkas re-probe passes), EXCEPT tickets in the payment lane (deferred to AIN-129 CDP unlock) or regulation lane (deferred to legal review). All other open tickets are in scope — bug fixes, Phase 0 epics, v6.1 docs, fleet agent legs, founder-decision-pending items get clear recommendations from Aule, and ghost-state cleanups.


DEFER lanes (do NOT touch)

Payment lane (gated by founder AIN-129 CDP signup ~10 min)

| Ticket | What |
| --- | --- |
| AIN-128 | USDC x402 ship (parent) |
| AIN-129 | CDP signup (founder action) |
| AIN-130 | Wire x402 USDC settlement |
| AIN-131 | Dashboard /wallet topup UI |
| AIN-132 | Marketing /pricing honesty |
| AIN-86 | Wallet topup end-to-end |
| AIN-111 | Manwe migration (touches wallet caps) |

Plus: any new ticket that requires USDC/Stripe/Xendit/wallet-topup work. Flag and skip.

Regulation lane (gated by legal review)

  • EU AI Act Annex IV mapping work
  • MAS PSA legal review (pre-Seed)
  • SG IPOS trademark filings
  • USDC settlement legal review
  • Anything explicitly tagged "legal review required" or that would create regulatory exposure if shipped without counsel

No current tickets in this lane. Flag-and-skip if new ones surface during this delivery wave.


DELIVERY queue (SHIP ALL)

Already In Progress (Aule executing — 5 of 6 since 13:46–14:06 UTC)

| Ticket | Started | What |
| --- | --- | --- |
| AIN-173 🔴 | 13:51 | Bug 1 catalog/inference drift fix |
| AIN-175 🟠 | 13:46 | Bug 2 agent_handle resolver in error path |
| AIN-176 🟡 | 13:49 | Bug 3 finish_reason normalization |
| AIN-177 🟡 | 13:49 | Bug 4 content shape normalization |
| AIN-178 🔴 | 14:06 | Tulkas Phase 0 batteries (Phase 1 cron deferred to Lock U) |

Pending pickup (Backlog → In Progress when Aule reaches them)

| Ticket | What |
| --- | --- |
| AIN-174 🔴 | Bug 5 native SSE streaming (highest-risk fix, STAGING canary mandatory) |

Phase 0 epics (designed + locked, founder-gate-free)

| Ticket | What | Time |
| --- | --- | --- |
| AIN-152 | Waitlist GitHub OAuth scaffold (Lock P unlocks live flow) | 60 min |
| AIN-153 | Dashboard 8-page Unicorn (D19/D20/D21 locked) | 90 min |
| AIN-154 | Router hardening Phase 0 (D22-D26 locked); parent of 5 bug fixes | 90 min |
| AIN-158 | Marketing 5-page Ampersend (D34-D37 locked) | 90 min |
| AIN-161 | Models leaderboard 24h (D27-D30 locked) | 75 min |

In Review (close after merge — 8 tickets)

| Ticket | What | Action |
| --- | --- | --- |
| AIN-83 | Pre-launch polish | Audit PRs #11/#12 sdk+web via gh pr list --state merged --search "AIN-83"; if merged → Done |
| AIN-84 | Dashboard end-to-end fix | Verify against prod, close as Done |
| AIN-120 | Aule GitHub MCP transport decision | Close per session-3 decision OR file follow-up |
| AIN-151 | Yavanna heuristics | PR #38; founder merge → Done |
| AIN-160 | Fleet tools wiring | PR #39/40/41/42; founder merge → Done |
| AIN-169 | v6.1 patch Discipline #6 corollary | PR #36; founder merge → Done |
| AIN-170 | INC-2026-05-18-002 re-rotation | Verify token rotated, audit clean → Done |
| AIN-172 | SPOF env override generalization | PR #37; founder merge → Done |

Founder-decision-pending (Aule writes recommendation, founder picks)

| Ticket | What | Aule's recommendation |
| --- | --- | --- |
| AIN-87 | Settings API Keys + Tenant schema | Per Aule's 14:24 comment: demote to v1.9 schema spike, split into AIN-87a (api_keys migration + endpoints), AIN-87b (tenant PATCH + region column), AIN-87c (dashboard wiring). OR close as blocked. Founder decision needed within 24h to avoid ghost-state. |

Ghost-state to verify clean

Aule's session-3 close claimed "AIN-87/88/89/90/124 closeouts filed." Verified:

  • AIN-87 NOT a closeout (see above)
  • AIN-88, AIN-89, AIN-90 status not yet spot-checked — Aule verify next poll
  • AIN-124 status not yet spot-checked — Aule verify next poll

If any of AIN-88/89/90/124 closeout comments recommend Done state transition, Aule moves them. If any recommend a decision, Aule files a comment summarizing for founder.


Execution sequencing (Aule's 6-PR ship plan)

Per AIN-154 comment + AIN-160 comment from 13:39 UTC:

PR #1 → AIN-173 catalog audit + AIN-178 Phase 0 Tulkas batteries
        STAGING canary per When-stuck #19 → Tulkas validates → prod
      
PR #2 → AIN-174 native SSE streaming (highest risk)
        STAGING canary + Tulkas validation → prod
      
PR #3 → AIN-175 agent_handle resolver in error path (low risk)
        Direct to prod after pre-commit gates
      
PR #4 → AIN-176 + AIN-177 bundled response normalization
        STAGING canary + Tulkas validation → prod
      
PR #5 → AIN-178 Phase 1 Tulkas cron activation
        ⚠️ Hetzner systemd is shared-infra per Discipline #6 corollary
        STOP and request Lock U itemized auth before opening PR
      
PR #6 → After all 5 bugs fixed: re-validate PR #40/#41/#42 (Yavanna/Aule/Namo legs)
        against fixed /v1/inference surface before founder merges them
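The gating rules in the plan above can be written down as data plus one predicate. This is a hedged sketch, not Aule's tooling: the type and function names are hypothetical, PR #6 is omitted because it is a re-validation step rather than a shipped change, and treating pre-commit gates as mandatory for every PR is an assumption drawn from Discipline #1. The plan does not say whether PR #5 needs a canary, so it is marked as not requiring one here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedPR:
    number: int
    tickets: tuple[str, ...]
    staging_canary: bool        # STAGING canary + Tulkas validation required
    needs_lock_u: bool = False  # shared-infra change needing itemized auth


SHIP_PLAN = (
    PlannedPR(1, ("AIN-173", "AIN-178 Phase 0"), staging_canary=True),
    PlannedPR(2, ("AIN-174",), staging_canary=True),
    PlannedPR(3, ("AIN-175",), staging_canary=False),
    PlannedPR(4, ("AIN-176", "AIN-177"), staging_canary=True),
    # Canary requirement for PR #5 is not specified in the plan (assumed False).
    PlannedPR(5, ("AIN-178 Phase 1",), staging_canary=False, needs_lock_u=True),
)


def may_ship_to_prod(
    pr: PlannedPR,
    *,
    precommit_green: bool,
    canary_passed: bool = False,
    tulkas_passed: bool = False,
    lock_u_granted: bool = False,
) -> bool:
    """Apply the plan's gates in order: pre-commit, Lock U, then canary."""
    if not precommit_green:
        return False
    if pr.needs_lock_u and not lock_u_granted:
        return False
    if pr.staging_canary and not (canary_passed and tulkas_passed):
        return False
    return True
```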

After PR #1-4 land:

  1. Replay Tulkas Battery #4 (cross-framework simulation) against fixed /v1/inference
  2. If all 5 frameworks pass ≥95% success rate → notify founder to merge PR #40/#41/#42
  3. If any framework fails → file a child ticket under AIN-160 with reproducer
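The pass/fail decision in steps 2 and 3 amounts to a per-framework threshold check. A minimal sketch, assuming results arrive as (passed, total) counts per framework; the function name and the framework labels in the usage note are hypothetical, since the charter does not name the five frameworks. Integer arithmetic is used so that exactly 95% counts as a pass without floating-point surprises.

```python
def failing_frameworks(
    results: dict[str, tuple[int, int]],
    threshold_percent: int = 95,
) -> list[str]:
    """Return frameworks whose success rate is below the threshold.

    `results` maps a framework label to (passed, total). A framework with
    no recorded runs is treated as failing rather than silently passing.
    """
    failing = []
    for framework, (passed, total) in results.items():
        if total == 0 or passed * 100 < threshold_percent * total:
            failing.append(framework)
    return failing
```

An empty list corresponds to step 2 (notify the founder); any entry corresponds to step 3 (file a child ticket under AIN-160 with a reproducer). For instance, `failing_frameworks({"framework_a": (95, 100), "framework_b": (94, 100)})` flags only `framework_b`.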

After PR #5 (Tulkas Phase 1):

  1. Verify cron schedule running on Hetzner CX42
  2. Monitor 24h: audit chain shows ≥100 tulkas.* events/hour
  3. If Tulkas hits daily cap unexpectedly → file bug + revisit caps
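The 24h monitoring step can be expressed as an hourly bucket count over audit events. This sketch assumes each event is available as an (event_type, timestamp) pair with timezone-aware timestamps; how events are actually fetched from the audit chain is not described in the charter, and the function name is hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable


def hours_below_floor(
    events: Iterable[tuple[str, datetime]],
    window_start: datetime,
    hours: int = 24,
    floor: int = 100,
) -> list[datetime]:
    """Return the start of every hour with fewer than `floor` tulkas.* events."""
    counts: Counter[int] = Counter()
    one_hour = timedelta(hours=1)
    for event_type, timestamp in events:
        if not event_type.startswith("tulkas."):
            continue
        index = (timestamp - window_start) // one_hour  # floor division -> int
        if 0 <= index < hours:
            counts[index] += 1
    return [
        window_start + i * one_hour
        for i in range(hours)
        if counts[i] < floor
    ]
```

An empty result over the 24h window would satisfy the "≥100 tulkas.* events/hour" check; any returned hour is a candidate for investigation.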

After Phase 0 epics deliver (AIN-152/153/154/158/161):

  1. Each epic's Phase 1 is a separate planning cycle, not part of this delivery wave
  2. Mark each Phase 0 ticket as Done when PR merged + smoke passes
  3. File Phase 1 tickets if appropriate, but do NOT auto-execute Phase 1

Disciplines still gating Aule (do NOT relitigate)

Discipline #1 — No Done without proof

Curl + browser + test must all pass. No "PR merged" = "Done." Verify deploy with explicit curl against prod surface, browser smoke if UI-touching, all pre-commit gates green.

Discipline #6 corollary — Shared-infra config requires itemized auth

Even with this blanket "deliver all" directive, the following STILL require explicit per-change auth:

Canonical case this wave: AIN-178 Phase 1 Tulkas cron on Hetzner systemd. STOP and request Lock U before shipping.

Discipline #11 corollary — No catalog vs runtime drift

Bug 1 (AIN-173) IS this corollary in production. When fixing, also add catalog-runtime sync verification: any change to models.capabilities or cost_per_call_estimate_usd MUST be followed by a smoke against /v1/inference that auto-routes through that capability/cost path. Otherwise we're shipping a drift.
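One way to enforce the sync rule is to diff two catalog snapshots and list the models that need a follow-up smoke. This is a sketch under assumptions: the field names capabilities and cost_per_call_estimate_usd come from the charter, but the snapshot shape (a dict keyed by model id) and the function name are hypothetical, and the smoke request against /v1/inference itself is left out.

```python
WATCHED_FIELDS = ("capabilities", "cost_per_call_estimate_usd")


def models_needing_smoke(
    before: dict[str, dict],
    after: dict[str, dict],
) -> list[str]:
    """Return ids of models whose watched catalog fields changed.

    Models added or removed between snapshots are included when any
    watched field differs from the (missing) other side.
    """
    changed = []
    for model_id in sorted(set(before) | set(after)):
        old = before.get(model_id, {})
        new = after.get(model_id, {})
        if any(old.get(field) != new.get(field) for field in WATCHED_FIELDS):
            changed.append(model_id)
    return changed
```

Each returned id would then be exercised through the auto-routing path before the catalog change is considered shipped.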

Discipline #12 ESCALATE — Moat decisions stay founder-only

Even with deliver-all auth, the following are NOT in scope:

  • AAMC voter weights or composition changes
  • Settlement model (deferred with payment anyway)
  • ATS dimension weights or veto thresholds
  • Founder identity surface (PUBLIC LOCK — absolute, no exceptions)
  • Public scoring methodology
  • OSS license terms
  • Cross-tenant data sharing rules

If a fix touches any of these, STOP and surface to founder before shipping.

Discipline #14 corollary — Secrets clipboard-only

Per INC-2026-05-18-001 + INC-2026-05-18-002. If any bug fix surfaces a secret value (env var, token, hash), DO NOT echo it in transcripts. Use pbcopy / xclip and tell founder "copied to clipboard."
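A clipboard-only handoff might look like the following. This is a rough sketch, not the fleet's actual helper: the function names are hypothetical, it assumes pbcopy on macOS and xclip (with its clipboard selection) elsewhere, and it deliberately returns only the fixed confirmation string so the secret never reaches a transcript.

```python
import subprocess
import sys


def clipboard_command(platform: str = sys.platform) -> list[str]:
    """Pick the clipboard tool named in the discipline for this platform."""
    if platform == "darwin":
        return ["pbcopy"]
    # Assumption: any non-macOS host has xclip installed.
    return ["xclip", "-selection", "clipboard"]


def copy_secret(value: str) -> str:
    """Send the secret to the clipboard via stdin and return a safe message.

    The value is passed on stdin, not as an argument, so it does not show
    up in the process list; nothing derived from it is returned or printed.
    """
    subprocess.run(clipboard_command(), input=value.encode(), check=True)
    return "copied to clipboard"
```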

When-stuck #19 — STAGING canary before prod for catalog/threshold/router migrations

AIN-173 catalog fix is the canonical case for this wave. STAGING canary first, then prod. Verify Tulkas Battery #1 passes against STAGING before deploying to prod.


Communication protocol

When you ship a PR

Post a comment on this ticket (AIN-179) with:

  • PR URL
  • Tickets it closes
  • Production verification curl output (sanitized of secrets)
  • Tulkas Battery results if applicable

When you hit a Discipline #6 or #12 wall

Post a comment on this ticket asking for itemized auth. Don't proceed.

When you find a NEW bug class (not regression of existing)

File a new child ticket under AIN-154 (router hardening) or AIN-160 (fleet tools), per the bug's surface. Add to delivery queue.

When a founder-decision ticket needs decision

Post a comment summarizing the decision + your recommendation. Stop on that ticket, continue with others.


Scope NOT in this charter (out of scope or already done)

  • AIN-115 D30-D45 launch posture lock — stays as posture anchor, don't close
  • AIN-163 v1.9-CP Universal Control Plane — backlog, post-v1.8 (~Aug 2026)
  • AIN-133 Sprint v1.7 PARENT — already Done
  • AIN-171 production 500 incident — already Done
  • Anything cancelled in session-3 — leave cancelled

Verification gate before declaring "all delivered"

This charter is closed (state → Done) when:

  1. All 5 bug fixes (AIN-173/174/175/176/177) shipped + Tulkas re-probes pass
  2. AIN-178 Phase 0 Tulkas batteries live on STAGING + producing audit events
  3. AIN-178 Phase 1 either shipped (Lock U granted) OR explicitly deferred with founder note
  4. All 5 Phase 0 epics (AIN-152/153/154/158/161) shipped to prod with smoke
  5. All 8 In Review tickets either Done or commented with founder-decision-pending
  6. AIN-87 founder-decision posted with clear recommendation
  7. AIN-88/89/90/124 status verified (Done or recommendation comment)
  8. PR #40/#41/#42 re-validated against fixed /v1/inference and either merged or flagged with regression
  9. No new bug regressions on Tulkas Battery #4 (cross-framework) for 24h
  10. Notion canonical updated with delivery wave summary at end

Founder authorization trail

  • Session 3 close: "All tickets should be delivered except payment."
  • Session 3.5 close: "Patch now and close all linear tickets now by delivering except payment and regulation. Tell CC." (2026-05-18 PM)
  • All cited disciplines remain in force per CC Super Prompt v6 + v6.1 patches.

Aule: start at PR #1 in the sequence. Update this ticket as you ship.

AIN-183 🟠 All-repos audit sweep — 14 repos × spec-vs-built + Discipline #3/#4/#11/#17 verification

Severity: 🟠 HIGH — comprehensive repo-level Discipline #1 + #17 audit

Filed 2026-05-18 PM after AIN-153 + AIN-158 spec-vs-built audit revealed Aule shipping less than spec'd then marking parents Done. Pattern requires systematic verification across ALL 14 repos in the ainfera-ai GitHub org.

Founder directive: "Hard revert to In Progress and force the missing work. Also check all repos."

Scope

For each of the 14 repos in ainfera-ai org, Aule audits:

  1. Spec vs built — does the repo's stated purpose match what's actually in main?
  2. Tickets marked Done — for each Done ticket touching this repo in last 14d, verify the deliverables match the ticket's acceptance gates
  3. Surfaces live in production — does the prod deployment match what main branch's app/ or src/ declares?
  4. Founder PII grep — 0 matches for "Hizrian|Izzy|Raz|founder|fibromyalgia|ADHD|snowboard|Julius Baer" in any public surface or repo README/docs
  5. Internal agent name grep: public-facing surfaces should NOT mention Manwe/Yavanna/Namo/Aule/Tulkas (Varda is acceptable per PUBLIC #1 lock)
  6. Lock compliance — D7-D37 locks honored in any new code
  7. Discipline #11: Notion canonical vs code drift

14 repos to audit

Production-facing (Phase 1, urgent)

| # | Repo | Audit focus |
| --- | --- | --- |
| 1 | ainfera-ai/web | Marketing + dashboard pages: match D14 + Part 2 spec |
| 2 | ainfera-ai/api | FastAPI surfaces, /v1/* endpoints match docs.ainfera.ai claims |
| 3 | ainfera-ai/ainfera-os | Monorepo root, docs/ folder, SKILL.md files, package versions |
| 4 | ainfera-ai/mcp | mcp.ainfera.ai FastMCP server, tool surface matches docs |
| 5 | ainfera-ai/sdk (or python-sdk if named differently) | PyPI ainfera package matches API surface, exported symbols |

Agent-implementation (Phase 2)

| # | Repo | Audit focus |
| --- | --- | --- |
| 6 | ainfera-ai/varda (or similar) | NemoClaw orchestrator config, GPT-5.5 binding, public surface compliance |
| 7 | ainfera-ai/aule | Claude SDK + Opus 4.7 xhigh config, author override in commit history |
| 8 | ainfera-ai/yavanna | LangGraph + Sonnet 4.6 + Grok 4, response_review heuristics |
| 9 | ainfera-ai/namo | Letta + Gemini 3.1 Pro, memory consolidation |
| 10 | ainfera-ai/tulkas | Garak + Mistral Large 3, probe batteries |

Customer + tooling (Phase 3)

| # | Repo | Audit focus |
| --- | --- | --- |
| 11 | ainfera-ai/hermes-agent (Manwe Customer #1 fork) | v0.14.0 SHA a91a57fa matches reported state |
| 12 | ainfera-ai/examples | 5 example agent repos per AIN-78 |
| 13 | ainfera-ai/specs (or similar) | CC-BY 4.0 spec docs match Notion canonical |
| 14 | ainfera-ai/.github or org-level | Issue templates, PR templates, CODEOWNERS, branch protection |

Audit deliverable per repo

For each repo, Aule produces a comment on this ticket with:

## Repo: ainfera-ai/<name>

### Spec match
- Stated purpose: <from README>
- Actual state: <from main branch traversal>
- Drift: <none / specifics>

### Recent Done tickets touching this repo
- AIN-XYZ — claimed: "..." → actual state: ✅ matches / ⚠️ partial / ❌ missing

### Production-vs-main drift
- Last deploy SHA: <sha>
- main HEAD: <sha>
- Drift: <none / files differ / etc>

### Grep results
- Founder PII: <count> matches → <files>
- Internal agent names in public: <count> matches → <files>

### Lock compliance
- D7-D37 references: <count> 
- Discipline #6 corollary violations: <count>

### Recommendation
- ✅ Clean / ⚠️ Cleanup needed / 🔴 Active violation

### Tickets to file
- <list of child tickets needed if cleanup work surfaces>

Audit commands Aule runs per repo

cd ~/code/ainfera-ai/<repo>
git pull origin main

# Discipline #3 grep
rg -i "hizrian|izzy|raz|fibromyalgia|adhd|snowboard|julius baer|sommelier" \
   --type-not lock \
   --type-not log \
   -l

# Internal agent naming in public surfaces
rg -i "manwe|yavanna|namo|aule|tulkas" \
   src/ app/ public/ docs/ README.md \
   --type-not lock \
   -l

# Discipline #4 author override check (last 50 commits)
git log -50 --pretty=format:'%h %an <%ae>' | rg -v "Aule <aule@" | head -20

# Production-vs-main drift (for deployed repos)
gh api repos/ainfera-ai/<repo>/deployments --jq '.[0:3] | .[].sha'
# Compare against `git rev-parse main`

# Spec vs files
ls -la docs/
cat README.md | head -30
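The term greps above match substrings, so a short term such as "raz" also hits unrelated words. The sketch below shows the same case-insensitive alternation in Python with an optional whole-word mode to cut that noise. It is an illustration only: the function names are hypothetical, the terms are passed in by the caller, and it scans a single string, not a repository tree.

```python
import re


def build_pattern(terms: tuple[str, ...], whole_words: bool = False) -> re.Pattern[str]:
    """Compile a case-insensitive alternation over the given terms."""
    body = "|".join(re.escape(term) for term in terms)
    if whole_words:
        body = rf"\b(?:{body})\b"  # require word boundaries on both sides
    return re.compile(body, re.IGNORECASE)


def matching_lines(text: str, pattern: re.Pattern[str]) -> list[tuple[int, str]]:
    """Return (line_number, line) for every line the pattern matches."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]
```

With the term "raz", the loose pattern flags a line containing "Brazil" while the whole-word pattern does not; which mode is appropriate depends on whether false positives or missed matches are the bigger risk for the audit.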

Acceptance gates

  • All 14 repos audited
  • Audit comment posted per repo on this ticket
  • Any 🔴 Active violation gets immediate child ticket filed
  • Any ⚠️ Cleanup ticket gets queued for AIN-179 delivery wave
  • Summary comment at end: total findings + per-severity counts + recommended next actions
  • Aule author override on the audit branch
  • No remediation in this ticket — only findings. Remediation = follow-up tickets.
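The summary comment's per-severity counts and the "all 14 repos audited" gate can be derived mechanically from the per-repo recommendations. A minimal sketch, assuming each repo's result is recorded as one of three labels; the label strings and function names are hypothetical stand-ins for the ✅ / ⚠️ / 🔴 outcomes in the deliverable template.

```python
from collections import Counter

# Hypothetical labels for: Clean / Cleanup needed / Active violation.
SEVERITIES = ("clean", "cleanup", "violation")


def severity_counts(findings: dict[str, str]) -> dict[str, int]:
    """Count repos per severity, rejecting labels outside the known set."""
    counts = Counter(findings.values())
    unknown = sorted(set(counts) - set(SEVERITIES))
    if unknown:
        raise ValueError(f"unknown severity labels: {unknown}")
    return {severity: counts.get(severity, 0) for severity in SEVERITIES}


def repos_not_yet_audited(findings: dict[str, str], expected_total: int = 14) -> int:
    """Number of repos still missing an audit result."""
    return expected_total - len(findings)
```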

Out of scope

  • Code refactoring (only audit-and-report)
  • Linter sweeps (separate cleanup pass)
  • Test coverage analysis (covered by AIN-118 or successor)
  • Performance audit (separate concern)

Connection

Founder authorization

Per "Hard revert to In Progress and force the missing work. Also check all repos" (2026-05-18 session 3.5 PM).

@cursor

cursor Bot commented May 19, 2026

You have used all Bugbot PR reviews included in your free trial for your GitHub account on this workspace.

To continue using Bugbot reviews, enable Bugbot for your team in the Cursor dashboard.

@hizrianraz hizrianraz merged commit ba86479 into main May 19, 2026
3 checks passed