feat(orchestrator-leaderboard): Daydream discovery plans and cache refresh by eliteprox · Pull Request #337 · livepeer/naap

eliteprox · 2026-05-23T05:38:45Z

Summary

Narrows the orchestrator-leaderboard discovery-plan work from #307 into a Daydream-only foundation for PR #337.

Adds generic billingProviderSlug support on DiscoveryPlan, with PR feat(orchestrator-leaderboard): Daydream discovery plans and cache refresh #337 validating and defaulting to daydream only.
Adds plan CRUD/list/get filtering, plan-scoped results, capability catalog, and default/plan-scoped python-gateway discovery endpoints.
Preserves the generic discovery cache path: startup/cron global dataset refresh, plan lazy-eval cache, query caching, and plan-cache invalidation.
Updates the plugin Plans UI and docs to present Daydream as the only provider in this PR.

Out of scope / stacked follow-up

PymtHouse-specific behavior has been removed from this PR and belongs in #338 (feat/pymthouse-integration-followups), including manifest sync/gating, PymtHouse env/docs, provider UI toggle/copy, OIDC/billing APIs, and billing provider seeding.

Test plan

npm --workspace @naap/web-next test
npm --workspace @naap/plugin-orchestrator-leaderboard-frontend run build
Push hook quick checks passed for @naap/plugin-build, @naap/plugin-sdk, and Vercel runtime safety
Vercel preview smoke: plan CRUD, capability catalog, python-gateway discovery

Related: #307, #338

Summary by CodeRabbit

New Features
- Python Gateway discovery endpoints for orchestrator address lookup
- Provider-scoped Capability Catalog and a Plan creation page (/plans/new)
- Billing-provider scoping added across plan APIs and UI
Bug Fixes
- Hardened CRON auth with constant-time checks
- Corrected 401 vs 403 authorization behavior
- Simplified plan-results response envelope for clients
Improvements
- Provider-aware caching and invalidation for discovery results
- Redesigned plan/discovery UI and capability selection components
- Startup dataset refresh and deploy-time seeding workflow updated

Add billingProviderSlug on DiscoveryPlan with API filtering, CRUD support, and default plan seeding for pymthouse. Plans list/get accept an optional billingProviderSlug query param; pymthouse requests also match legacy rows with a null provider slug. Co-authored-by: Cursor <cursoragent@cursor.com>

vercel · 2026-05-23T05:38:48Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
naap-platform	Ready	Preview, Comment	May 27, 2026 6:52pm

github-actions · 2026-05-23T05:38:54Z

🗃️ Database Migration Detected

This PR includes changes to Prisma schema files. Please ensure:

Migration is backward-compatible or a rollback plan exists
Data migration scripts are included if needed
Schema changes will be auto-applied to the preview database (Neon preview branch) during the Vercel preview deployment
Verify the preview deployment works correctly with the new schema
On merge to main, schema changes will auto-promote to production via prisma db push

Preview DB: This PR's Vercel preview deployment uses an isolated Neon database branch. Schema changes are applied automatically via prisma db push during the preview build. The preview branch is reset after each production deploy.

Requesting review from the core team: @livepeer/core

github-actions · 2026-05-23T05:38:54Z

⚠️ This PR is very large (4936 lines changed). Please split it into smaller, focused PRs if possible.

coderabbitai · 2026-05-23T05:38:57Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
✅ Review completed - (🔄 Check again to review again)

📝 Walkthrough

Walkthrough

Adds billing-provider slug support across types/DB/CRUD and seeds; implements provider-scoped capability catalog; adds plan-scoped and stateless python-gateway discovery with tiered shuffling and composite evaluation cache keys; refactors ClickHouse query routing and adapters; hardens cron/admin auth; updates frontend components, hooks, API client, tests, docs, and startup refresh.

Changes

Billing Provider Plan Scoping

Layer / File(s)	Summary
DB schema & npm seed script `packages/database/prisma/schema.prisma`, `package.json`, `bin/seed-discovery-plans.ts`, `bin/db-setup.sh`	Adds `DiscoveryPlan.billingProviderSlug` + index, backfills legacy nulls to `'daydream'`, seeds default discovery plans, and adds `seed:discovery-plans` npm script.
Types & schemas `apps/web-next/src/lib/orchestrator-leaderboard/types.ts`	Exports `BILLING_PROVIDER_SLUGS`, `BillingProviderSlug` type and `BillingProviderSlugSchema`; adjusts plan schemas to accept nullish filters/SLA weights and include billingProviderSlug.
Plan CRUD logic `apps/web-next/src/lib/orchestrator-leaderboard/plans.ts`	Adds billing-provider where helpers and list/get signatures, persists default `'daydream'` on create, tightens mutation scoping, maps null JSON inputs to Prisma.JsonNull, and normalizes DB runtime shape.
Plans API parsing & results `apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans*.ts`, `.../plans/[id]/results/route.ts`	Parses/validates optional `billingProviderSlug` query param (400 on invalid), threads parsed slug into list/get/getPlanResults, flattens results response envelope, and sets DISCOVERY_RESPONSE_CACHE_CONTROL.
Tests & validation `apps/web-next/src/lib/orchestrator-leaderboard/__tests__/*`	Adds/updates tests for billingProviderSlug scoping, schema defaults/null handling, and cache/invalidate behaviors.

Python Gateway Discovery

Layer / File(s)	Summary
Tiered discovery ordering `apps/web-next/src/lib/orchestrator-leaderboard/discovery-order.ts`, `.../discovery-order.test.ts`	Implements tier partitioning, effective tier count, and tier-local shuffling while preserving first-seen deduplication.
Plan-scoped gateway `apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/python-gateway/route.ts`, tests	New GET returning bare JSON array of addresses: authorize, optional slug validation, resolve provider-allowed capabilities, evaluate/cached plan, dedupe/truncate/shuffle addresses, respond with cache/refresh headers.
Stateless gateway `apps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.ts`, tests	New GET supporting repeated caps/topN/provider scoping: parse params, provider-filter capabilities, fetch leaderboards, dedupe/trim/tier-shuffle, return bare JSON array with headers and error text handling.

Capability Catalog and ClickHouse

Layer / File(s)	Summary
Provider restrictions `apps/web-next/src/lib/orchestrator-leaderboard/provider-restrictions.ts`, tests	Normalizes billing provider slugs (Daydream-only), returns constant revision, and currently treats Daydream as pass-through for capability filtering.
Capability catalog route `apps/web-next/src/app/api/v1/orchestrator-leaderboard/capability-catalog/route.ts`	GET /capability-catalog: provider-scoped pipeline/model filtering, manifestOnly branch, optional cache bypass, and DISCOVERY_RESPONSE_CACHE_CONTROL.
ClickHouse query helpers `apps/web-next/src/lib/orchestrator-leaderboard/query.ts`, tests	Adds resolveClickhouseQueryTarget and buildOrchestratorClickhouseFetchParams, refactors fetchFromClickHouse to use them with timeout/AbortSignal, and switches to replaceAll for capability placeholder.
Source adapters `apps/web-next/src/lib/orchestrator-leaderboard/sources/clickhouse.ts`, `.../naap-pricing.ts`, tests	Adds parseCapabilityRows helper, direct-target branch, and normalizes NaaP rows; adds integration/unit tests for direct mode.

Caching, Discovery Policy, and Startup

Layer / File(s)	Summary
Discovery policy `apps/web-next/src/lib/orchestrator-leaderboard/discovery-policy.ts`	Adds DiscoveryPolicy types and applyDiscoveryPolicyToOrchestrators to filter/sort/truncate orchestrator rows.
Discovery constants `apps/web-next/src/lib/orchestrator-leaderboard/discovery-constants.ts`	Adds DISCOVERY_RESPONSE_CACHE_CONTROL and fingerprintCapabilityList for cache keys.
Plan evaluation cache keying `apps/web-next/src/lib/orchestrator-leaderboard/refresh.ts`, tests	Introduces composite cache key (plan id + billingProviderSlug + providerRevision + fingerprint), resolves provider capabilities before cache lookup, writes under composite keys, and updates prefix-based cache scan/invalidate.
Startup refresh `apps/web-next/src/lib/orchestrator-leaderboard/global-refresh.ts`, `apps/web-next/src/instrumentation.ts`	Adds refreshGlobalDatasetOnStartup with in-flight dedupe and freshness check; invokes on server startup and logs outcome.

Security and Misc

Layer / File(s)	Summary
Cron and admin auth hardening `apps/web-next/src/app/api/v1/orchestrator-leaderboard/filters/route.ts`, `.../dataset/refresh/route.ts`, `.../sources/route.ts`	Uses timingSafeEqual for cron secret comparison, sets Authorization header only when token present, and splits missing-user (401) vs missing-admin-role (403) checks.
Docs, OpenAPI, tests config, scripts `plugins/.../docs/`, `plugins/.../docs/openapi.yaml`, `vitest.config.ts`, `bin/`	Updates OpenAPI (python-gateway endpoints, BillingProviderSlug param and schemas), docs/how-to/for-ai, Vitest inline deps, db-setup and vercel build scripts, and example env for PymtHouse.

Possibly related PRs

livepeer/naap#227: Related pipeline-catalog/resolver refactor and caching work.
livepeer/naap#213: Prior changes touching plan CRUD and billingProviderSlug scaffolding.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

seanhanca

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/discovery-plan-billing-provider

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts (1)
164-187: ⚡ Quick win

Add a getPlan(..., 'pymthouse') compatibility test for legacy null rows.

You already cover this fallback in listPlans; adding the same assertion for getPlan would protect the special-case behavior from regressions on the detail endpoint.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts`
around lines 164 - 187, Add a new test that calls getPlan('pub-1', {
ownerUserId: 'user-b' }, 'pymthouse') and asserts mockFindFirst was invoked with
a where clause that applies the provider compatibility fallback (i.e., the
billingProviderSlug condition allows legacy null rows). Reuse publicPlan and
mockFindFirst from the existing test, mirroring the provider-scoped assertion
but expect the provider predicate to include billingProviderSlug: null (or an OR
that allows null) for the 'pymthouse' case so the special-case behavior is
covered on the detail endpoint.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/`[id]/route.ts:
- Around line 41-45: The check `if (!raw)` treats an empty query
`?billingProviderSlug=` as missing and bypasses validation; change the check to
`raw === null` so empty strings still reach
`BillingProviderSlugSchema.safeParse(raw.trim().toLowerCase())`, allowing the
route handler to return a 400 on invalid/empty values; update the logic around
the `raw` variable in route.ts (the block that calls
BillingProviderSlugSchema.safeParse) accordingly to ensure only a truly absent
param is treated as missing.

In `@apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/route.ts`:
- Around line 24-29: The current check uses if (!raw) which treats an empty
query value as absent and bypasses validation; change that to a strict null
check (if (raw === null)) so only missing params return { value: null, error:
null }, and let BillingProviderSlugSchema.safeParse(raw.trim().toLowerCase())
handle empty-string validation; update the conditional around raw and keep
references to the same variables (raw) and schema
(BillingProviderSlugSchema.safeParse) so empty ?billingProviderSlug= is treated
as invalid.

In `@apps/web-next/src/lib/orchestrator-leaderboard/plans.ts`:
- Around line 162-167: The early return when scopeWhere is falsy blocks admin
actions on public plans; in updatePlan and deletePlan adjust the logic around
writeScopeWhere(scope) so that if the existing plan's visibility === 'public'
you do not immediately return 'forbidden' but still construct mutationWhere to
allow admin public-plan mutations (i.e., compute scopeWhere =
writeScopeWhere(scope) only for non-public plans, and for public plans let
mutationWhere be { id } so the public-plan admin branch can apply); apply the
same pattern to both updatePlan and deletePlan, referencing writeScopeWhere,
existing.visibility, mutationWhere, updatePlan and deletePlan to locate the
changes.

---

Nitpick comments:
In
`@apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts`:
- Around line 164-187: Add a new test that calls getPlan('pub-1', { ownerUserId:
'user-b' }, 'pymthouse') and asserts mockFindFirst was invoked with a where
clause that applies the provider compatibility fallback (i.e., the
billingProviderSlug condition allows legacy null rows). Reuse publicPlan and
mockFindFirst from the existing test, mirroring the provider-scoped assertion
but expect the provider predicate to include billingProviderSlug: null (or an OR
that allows null) for the 'pymthouse' case so the special-case behavior is
covered on the detail endpoint.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8f87ff80-6fab-45d6-b0ec-65d538bd0093

📥 Commits

Reviewing files that changed from the base of the PR and between 84fa791 and bf645e6.

📒 Files selected for processing (9)

apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/route.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/route.ts
apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts
apps/web-next/src/lib/orchestrator-leaderboard/__tests__/types-validation.test.ts
apps/web-next/src/lib/orchestrator-leaderboard/plans.ts
apps/web-next/src/lib/orchestrator-leaderboard/types.ts
bin/seed-discovery-plans.ts
package.json
packages/database/prisma/schema.prisma

…tions Treat only absent billingProviderSlug query params as unset so empty values return 400. Allow admin public-plan updates/deletes without a personal write scope. Add getPlan pymthouse null-slug compatibility test. Co-authored-by: Cursor <cursoragent@cursor.com>

Move orchestrator-leaderboard backend, manifest gating, python-gateway routes, discovery-service client, plugin UI, and docs from #307 into this PR. Excludes developer-api and billing OIDC/auth changes. Includes capability catalog, tiered discovery shuffle, default/plan-scoped python-gateway, PymtHouse manifest sync, and removal of demo seed API. Co-authored-by: Cursor <cursoragent@cursor.com>

eliteprox · 2026-05-23T06:27:02Z

@coderabbitai review

coderabbitai · 2026-05-23T06:27:07Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

seanhanca

Review — `feat(orchestrator-leaderboard): discovery plans and PymtHouse manifest gating`

Summary

The three CodeRabbit-flagged fixes all land correctly — empty ?billingProviderSlug= is rejected by zod, and updatePlan / deletePlan now branch on existing.visibility === 'public' so admins no longer fail when their personal scope is empty. The remaining concerns are a non-obvious fail-closed gap for legacy billingProviderSlug = null rows and a chunk of dead code in discovery-service/client.ts. With those two resolved this is a solid feature.

Blocking issues (must fix)

1. Legacy billingProviderSlug = null plans bypass PymtHouse manifest gating end-to-end

plans.ts:74-78 deliberately maps ?billingProviderSlug=pymthouse to OR: [{ slug: 'pymthouse' }, { slug: null }] so legacy rows are queryable as pymthouse-compatible. But the runtime gating treats them as "no provider":

// apps/web-next/src/lib/orchestrator-leaderboard/provider-restrictions.ts:34-43
if (normalizeBillingProviderSlug(billingProviderSlug) !== 'pymthouse') {
  return capabilities;  // bypasses manifest filter for null
}

// apps/web-next/src/lib/orchestrator-leaderboard/refresh.ts:111-120
if (normalizeBillingProviderSlug(plan.billingProviderSlug) === 'pymthouse') {
  await ensurePymthouseManifestFresh(...);
}

Net effect: a personal plan created before this PR (slug null) returned via /plans/:id/python-gateway will (a) skip the manifest refresh and (b) pass its capabilities through verbatim — i.e. excluded pipelines/models will still be discovered. That contradicts the documented invariant in docs/pymthouse-integration.md:90 ("missing manifest denies by default").

Fix: either treat null as pymthouse in normalizeBillingProviderSlug (so the default-pymthouse semantic applies everywhere), or backfill billingProviderSlug = 'pymthouse' for existing rows during prisma db push (a one-shot UPDATE in bin/seed-discovery-plans.ts would be enough — createPlan already defaults new rows to 'pymthouse').

2. apps/web-next/src/lib/discovery-service/client.ts is dead code

No callers anywhere in the repo (verified queryPlanFromDiscoveryService, refreshDiscoveryServiceDataset, isDiscoveryServiceEnabled, and DISCOVERY_SERVICE_URL). It exports a 158-line HTTP client with no tests. Either wire it into refresh.ts / global-refresh.ts and add a vitest, or delete it from this PR. Shipping it unused increases the surface that has to be kept in sync with a service this codebase doesn't actually call.

Important (should fix before merge)

1. Stale doc reference to the removed POST /plans/seed

// plugins/orchestrator-leaderboard/docs/for-ai.md:65
| `POST` | `/plans/seed` | JWT or `gw_` | Seed 4 demo plans (idempotent). Onboarding. |

The route was deleted in this PR; replace this row with a pointer to npm run seed:discovery-plans / bin/seed-discovery-plans.ts.

2. openapi.yaml is not updated for any of the new surface

Searches in plugins/orchestrator-leaderboard/docs/openapi.yaml for billingProviderSlug, python-gateway, and capability-catalog all return 0 hits. The new query param on list/get and the three new routes (/plans/:id/python-gateway, /python-gateway, /capability-catalog) need OpenAPI entries — this is the contract the PR description points external consumers at.

3. capability-catalog/route.ts:89 always forces bypassCache: true

const catalog = await getDashboardPipelineCatalog({ bypassCache: true });

Combined with Cache-Control: private, no-store, must-revalidate, every plan-edit page open recomputes the entire pipeline catalog. Drop bypassCache (or move the bypass behind ?refresh=1) so the facade-level cache helps the hot UI path.

4. Inconsistent empty-slug handling on the python-gateway plan route

// apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/python-gateway/route.ts:32-35
const raw = request.nextUrl.searchParams.get('billingProviderSlug');
if (!raw) {
  return { value: null, error: null };
}

The fixed CRUD/list routes use if (raw === null) so empty strings 400. This route uses if (!raw) so ?billingProviderSlug= is silently treated as missing. Match the strict 400 behavior (preferred) or document the asymmetry — otherwise python-gateway clients sending the param will silently fall back to "no provider override" instead of getting a clear validation error.

Migration & rollout

Schema change is safe (billingProviderSlug String? + nullable index, prisma db push --accept-data-loss will ALTER TABLE without nuking existing rows). Personal plans created before this PR will land with null — see Blocking #1; consider adding a one-time UPDATE "DiscoveryPlan" SET "billingProviderSlug" = 'pymthouse' WHERE "billingProviderSlug" IS NULL to the seed script so the runtime invariant holds without code changes to the gating helpers.

.env.local.example additions reference (commented-out) PYMTHOUSE_* knobs only — no secrets, and PYMTHOUSE_ALLOW_MISSING_MANIFEST_FAIL_OPEN is correctly documented as audited and opt-in.

Verdict

REQUEST_CHANGES — primarily for the legacy-null PymtHouse-gating gap (blocking #1) and the unused discovery-service/client.ts (blocking #2). Once those are resolved and the docs/openapi update lands, the rest are happy "important" cleanups that can ride along.

…points - Updated the capability catalog route to support a `refresh` query parameter for bypassing cache. - Added a test to validate 400 responses when `billingProviderSlug` is empty. - Modified the `parseBillingProviderSlugParam` function to treat null values correctly. - Implemented a backfill function for legacy billing provider slugs in the seeding script. - Introduced new API endpoints for plan-scoped and default python-gateway discovery, along with capability catalog retrieval. Co-authored-by: Cursor <cursoragent@cursor.com>

coderabbitai

Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

plugins/orchestrator-leaderboard/docs/openapi.yaml (1)
69-78: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document the new 400 path for invalid billingProviderSlug.

These operations now accept billingProviderSlug, and the shared parameter explicitly says an empty value returns 400. The spec still only advertises 200/401(/404), so generated clients and external integrators will miss a real validation branch.

Also applies to: 136-154
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/orchestrator-leaderboard/docs/openapi.yaml` around lines 69 - 78, The
OpenAPI operations that reference the shared parameter BillingProviderSlugQuery
must include a "400" response to document the empty/invalid billingProviderSlug
validation branch; update the responses block for the affected operations (the
one shown and the other occurrence around lines 136-154) to add a "400" response
that references the existing bad-request response/schema (e.g.,
components/responses/BadRequest or a suitable error schema) so generated clients
and integrators will surface the validation error for BillingProviderSlugQuery.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@apps/web-next/src/app/api/v1/orchestrator-leaderboard/capability-catalog/route.ts`:
- Around line 62-67: When handling requestedBillingProviderSlug === 'pymthouse'
wrap the call to ensurePymthouseManifestFresh() in a try/catch so that any
exception is swallowed (and optionally logged) and manifestAvailable is set to
false instead of letting the error propagate; after the try/catch still call
getPymthouseManifestSnapshot() only when no error occurred to determine
manifestAvailable, referencing ensurePymthouseManifestFresh(),
getPymthouseManifestSnapshot(), and the manifestAvailable variable to locate
where to add the try/catch.

In
`@apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/`[id]/results/route.ts:
- Around line 29-31: The current check uses `if (!raw)` which treats an empty
query like `?billingProviderSlug=` as missing; change the validation so you
distinguish null vs empty string: keep the null check (`raw === null`) for
missing param, and add an explicit empty-string check (e.g., `raw.trim() ===
''`) to reject empty values and return a 400/validation error. Update the logic
around the `raw` variable in route.ts (the `billingProviderSlug` handling) so
empty strings are rejected rather than treated as missing.

In
`@apps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.ts`:
- Around line 145-148: The handler currently returns the raw variable message in
the NextResponse for all errors; change it so that when invalidCapability is
true you still return message with status 400, but when status is 500 return a
generic non-sensitive message (e.g., "Internal server error") instead of the raw
message; keep the original message available for server-side logging (e.g., log
the error before returning) and update the NextResponse creation (referencing
NextResponse, message, invalidCapability) accordingly.
- Around line 87-89: Call to ensurePymthouseManifestFresh() is currently outside
the try/catch so any rejection bypasses your structured API error handling; move
the ensurePymthouseManifestFresh() invocation into the existing try block that
handles the 'pymthouse' billingProvider (the same guarded path that creates/uses
the Python gateway client in route.ts) so its errors are caught and translated
into your API error response rather than escaping the handler.

In `@apps/web-next/src/lib/orchestrator-leaderboard/query.ts`:
- Around line 56-58: The current resolveDirectClickhouseUrl function drops any
configured path/query by doing new URL('/', rawUrl); change it to preserve the
original path and query from CLICKHOUSE_URL: parse rawUrl with new URL(rawUrl),
ensure the pathname ends with a trailing slash if needed (so endpoints like
base/path become base/path/), and return that URL.toString(); update the
function resolveDirectClickhouseUrl to use this logic so direct mode calls the
correct path-prefixed endpoint.

In `@apps/web-next/src/lib/pymthouse-discovery-plans.test.ts`:
- Around line 35-37: The test sets inconsistent env vars using PMTHOUSE_*;
update the vi.stubEnv calls so all keys match the rest of the integration (use
PYMTHOUSE_*). Specifically, change vi.stubEnv('PMTHOUSE_CLIENT_ID', 'app_x') to
vi.stubEnv('PYMTHOUSE_CLIENT_ID', 'app_x') and
vi.stubEnv('PMTHOUSE_M2M_CLIENT_SECRET', 'secret') to
vi.stubEnv('PYMTHOUSE_M2M_CLIENT_SECRET', 'secret'), and then run the test to
ensure no other references still use the old PMTHOUSE_* names.

In `@apps/web-next/src/lib/pymthouse-manifest.test.ts`:
- Around line 114-121: The test "uses server manifestVersion when provided" is
asserting the opposite; update the assertion in the test that calls
computeManifestRevision so it expects the returned revision to equal the
provided server manifestVersion ('server-rev-abc') and, if helpful, rename the
test or adjust its description to remain consistent with the behavior being
validated; target the computeManifestRevision call and its expect(...) assertion
for this change.

In `@apps/web-next/src/lib/pymthouse-manifest.ts`:
- Around line 245-265: The current fetch catch block clears any existing
snapshot by setting body and etag to null before calling applyManifestSnapshot,
which causes transient failures to wipe the last good manifest; change the logic
to preload the last known snapshot into body/etag before attempting the network
call (e.g. call a helper like getLastManifestSnapshot() to populate body and
etag), then only overwrite body/etag when the fetch succeeds and
parseManifestJson returns a valid manifest, and in the catch block do not set
body/etag to null so applyManifestSnapshot(body, etag) will preserve the
previous snapshot; reference parseManifestJson, applyManifestSnapshot and the
fetch block around creds.url when making the change.

In `@plugins/orchestrator-leaderboard/docs/how-to-guide.md`:
- Around line 451-452: The key-takeaway line uses the wrong path
`results.data.capabilities[...]`; update it to match the examples by referencing
`results.capabilities[...]` (or `env.data.capabilities[...]` where the doc uses
`env`) so clients parse the same structure; keep the note about `.orchUri` as
the dialer field.

In `@plugins/orchestrator-leaderboard/docs/openapi.yaml`:
- Around line 311-315: Change the query parameters "caps" and "capability" from
a scalar string to an array schema so generators emit repeatable query params;
update each param (e.g., the param named caps and the one named capability) to
use schema: { type: array, items: { type: string } } and include query
serialization hints (style: form, explode: true) so repeated usages like
?caps=a&caps=b are represented correctly.

In `@plugins/orchestrator-leaderboard/frontend/e2e/leaderboard.spec.ts`:
- Around line 260-262: The test asserts properties on resp.data.plan (e.g.,
expect(resp.data.plan.name)...), but the stubbed response still nests the
payload under resp.data.data.plan; update the mocked/stubbed response object
used in the spec so it returns the plan at resp.data.plan (remove the extra
"data" wrapper) and ensure capabilities are at resp.data.capabilities (so checks
like expect(resp.data.capabilities.noop).toHaveLength(1) pass); locate the stub
in the same spec where the fake response body is constructed and adjust the
shape accordingly.

In `@plugins/orchestrator-leaderboard/frontend/src/globals.css`:
- Around line 109-112: Remove the extra blank line between the background and
color declarations to satisfy the Stylelint rule declaration-empty-line-before:
inside the CSS rule that contains "background: var(--bg-tertiary);" and "color:
`#34d399`;", delete the empty line so the two declarations are adjacent (no blank
line before the color declaration).

In `@plugins/orchestrator-leaderboard/frontend/src/pages/PlanDetailPage.tsx`:
- Around line 125-134: bulkToggleCapabilities currently only compares exact
strings (using current.includes and removeSet) so legacy aliases like "sdxl"
remain selected when their canonical counterparts are removed; update
bulkToggleCapabilities to normalize/expand capabilities during comparison:
create or use a mapping of legacy aliases to canonical capability names (e.g.,
aliasMap) and when deselecting expand removeSet to include both the supplied
capability keys and their known aliases/canonical equivalents, then filter
current by checking against this expanded set (referencing
bulkToggleCapabilities, current, next, and removeSet to locate changes).

---

Outside diff comments:
In `@plugins/orchestrator-leaderboard/docs/openapi.yaml`:
- Around line 69-78: The OpenAPI operations that reference the shared parameter
BillingProviderSlugQuery must include a "400" response to document the
empty/invalid billingProviderSlug validation branch; update the responses block
for the affected operations (the one shown and the other occurrence around lines
136-154) to add a "400" response that references the existing bad-request
response/schema (e.g., components/responses/BadRequest or a suitable error
schema) so generated clients and integrators will surface the validation error
for BillingProviderSlugQuery.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3ceef9c9-0bbb-4e57-82b5-f15ae5ae6e10

📥 Commits

Reviewing files that changed from the base of the PR and between bf645e6 and 09d0fa7.

⛔ Files ignored due to path filters (1)

package-lock.json is excluded by !**/package-lock.json, !package-lock.json

📒 Files selected for processing (68)

apps/web-next/.env.local.example
apps/web-next/src/app/api/v1/dashboard/orchestrators/route.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/capability-catalog/route.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/dataset/refresh/route.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/filters/route.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/python-gateway/route.test.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/python-gateway/route.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/results/route.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/route.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/route.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/seed/route.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.test.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.ts
apps/web-next/src/app/api/v1/orchestrator-leaderboard/sources/route.ts
apps/web-next/src/instrumentation.ts
apps/web-next/src/lib/facade/index.ts
apps/web-next/src/lib/facade/resolvers/pipeline-catalog.ts
apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts
apps/web-next/src/lib/orchestrator-leaderboard/__tests__/provider-restrictions.test.ts
apps/web-next/src/lib/orchestrator-leaderboard/__tests__/query.test.ts
apps/web-next/src/lib/orchestrator-leaderboard/__tests__/refresh.test.ts
apps/web-next/src/lib/orchestrator-leaderboard/__tests__/sources/integration.test.ts
apps/web-next/src/lib/orchestrator-leaderboard/discovery-order.test.ts
apps/web-next/src/lib/orchestrator-leaderboard/discovery-order.ts
apps/web-next/src/lib/orchestrator-leaderboard/global-refresh.ts
apps/web-next/src/lib/orchestrator-leaderboard/plans.ts
apps/web-next/src/lib/orchestrator-leaderboard/provider-restrictions.ts
apps/web-next/src/lib/orchestrator-leaderboard/query.ts
apps/web-next/src/lib/orchestrator-leaderboard/refresh.ts
apps/web-next/src/lib/orchestrator-leaderboard/sources/clickhouse.ts
apps/web-next/src/lib/orchestrator-leaderboard/sources/naap-pricing.ts
apps/web-next/src/lib/orchestrators-discovery-policy.test.ts
apps/web-next/src/lib/orchestrators-discovery-policy.ts
apps/web-next/src/lib/pymthouse-discovery-plans.test.ts
apps/web-next/src/lib/pymthouse-discovery-plans.ts
apps/web-next/src/lib/pymthouse-manifest.test.ts
apps/web-next/src/lib/pymthouse-manifest.ts
apps/web-next/tests/leaderboard-posture.spec.ts
apps/web-next/tests/orchestrator-leaderboard.spec.ts
apps/web-next/vitest.config.ts
bin/db-setup.sh
bin/seed-discovery-plans.ts
bin/vercel-build.sh
docs/pymthouse-integration.md
plugins/orchestrator-leaderboard/docs/data-sources.md
plugins/orchestrator-leaderboard/docs/for-ai.md
plugins/orchestrator-leaderboard/docs/how-to-guide.md
plugins/orchestrator-leaderboard/docs/openapi.yaml
plugins/orchestrator-leaderboard/frontend/e2e/leaderboard.spec.ts
plugins/orchestrator-leaderboard/frontend/src/App.tsx
plugins/orchestrator-leaderboard/frontend/src/components/AdminSettings.tsx
plugins/orchestrator-leaderboard/frontend/src/components/CapabilityGroupPicker.tsx
plugins/orchestrator-leaderboard/frontend/src/components/CapabilityTag.tsx
plugins/orchestrator-leaderboard/frontend/src/components/CollapsibleTagList.tsx
plugins/orchestrator-leaderboard/frontend/src/components/EndpointGuide.tsx
plugins/orchestrator-leaderboard/frontend/src/components/FormLabel.tsx
plugins/orchestrator-leaderboard/frontend/src/components/SectionLabel.tsx
plugins/orchestrator-leaderboard/frontend/src/components/StyledCheckbox.tsx
plugins/orchestrator-leaderboard/frontend/src/components/TabNav.tsx
plugins/orchestrator-leaderboard/frontend/src/globals.css
plugins/orchestrator-leaderboard/frontend/src/hooks/useCapabilityCatalog.ts
plugins/orchestrator-leaderboard/frontend/src/hooks/usePlanDetail.ts
plugins/orchestrator-leaderboard/frontend/src/hooks/usePlans.ts
plugins/orchestrator-leaderboard/frontend/src/lib/api.ts
plugins/orchestrator-leaderboard/frontend/src/pages/LeaderboardPage.tsx
plugins/orchestrator-leaderboard/frontend/src/pages/PlanCreatePage.tsx
plugins/orchestrator-leaderboard/frontend/src/pages/PlanDetailPage.tsx
plugins/orchestrator-leaderboard/frontend/src/pages/PlansOverviewPage.tsx

💤 Files with no reviewable changes (1)

apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/seed/route.ts

✅ Files skipped from review due to trivial changes (6)

plugins/orchestrator-leaderboard/frontend/src/components/StyledCheckbox.tsx
plugins/orchestrator-leaderboard/frontend/src/components/TabNav.tsx
plugins/orchestrator-leaderboard/docs/data-sources.md
apps/web-next/tests/orchestrator-leaderboard.spec.ts
bin/vercel-build.sh
docs/pymthouse-integration.md

… routes - Added error handling for PymtHouse manifest refresh failures in the capability catalog route. - Improved validation logic for `billingProviderSlug` in the plans results route. - Updated the Python gateway route to ensure manifest refresh is only attempted when necessary. - Enhanced tests to cover scenarios where manifest fetch fails, preserving the last known good state. - Refactored utility functions to improve clarity and maintainability. Co-authored-by: Cursor <cursoragent@cursor.com>

…rator discovery plans - Replaced PymtHouse manifest dependencies with Daydream-specific logic in the capability catalog and Python gateway routes. - Updated billing provider handling to default to 'daydream' instead of 'pymthouse'. - Introduced new discovery constants and policies to support Daydream-only functionality. - Removed legacy PymtHouse manifest and discovery plan files, streamlining the codebase. - Enhanced the orchestrator discovery policy to apply filters without relying on PymtHouse exclusions. Co-authored-by: Cursor <cursoragent@cursor.com>

seanhanca

Re-review after Daydream-only pivot

Picking this back up after the 8 follow-up commits. Scope-pivot to Daydream-only is clean — PymtHouse manifest/discovery code is gone and BILLING_PROVIDER_SLUGS = ['daydream'] is enforced by Zod end-to-end. 5 of 6 prior blockers are fully resolved. Remaining issues below.

Prior-review blocker status

Legacy billingProviderSlug = null plans bypassed PymtHouse manifest gating — RESOLVED. provider-restrictions.ts no longer does manifest gating; normalizeBillingProviderSlug only treats 'daydream' as valid; bin/seed-discovery-plans.ts:78-91 backfills legacy null rows at deploy time.
Dead apps/web-next/src/lib/discovery-service/client.ts — RESOLVED. Directory deleted.
Stale POST /plans/seed doc reference — RESOLVED. Tables in for-ai.md and how-to-guide.md:636-640 now point at npm run seed:discovery-plans. Route file deleted; no callers remain.
openapi.yaml not updated for new endpoints/params — PARTIALLY RESOLVED. billingProviderSlug (openapi.yaml:796-803), python-gateway (openapi.yaml:252-375), and capability-catalog (openapi.yaml:377-417) are documented. Still wrong: the /plans/{id}/results response schema (openapi.yaml:237-243) keeps the old double-data wrapper — see Blocking #3.
capability-catalog/route.ts:89 always forced bypassCache: true — RESOLVED. Now only bypasses when ?refresh=1 is present.
Inconsistent empty-slug handling on /plans/[id]/python-gateway — RESOLVED. All four route handlers consistently 400 on empty value, and the python-gateway test asserts it (plans/[id]/python-gateway/route.test.ts:84-93).

Blocking issues (must fix)

1. Vitest is red — two tests fail in plans-visibility.test.ts

apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts:111-115 and :167-172 expect listPlans / getPlan to throw 'Invalid billingProviderSlug' when called with a blank slug ' '. The implementation in apps/web-next/src/lib/orchestrator-leaderboard/plans.ts:67-75 does no validation — it constructs { billingProviderSlug } from whatever string it receives and passes it to Prisma. Local run: Tests 2 failed | 137 passed | 1 skipped.

Fix: add a trim/empty-string guard inside billingProviderWhere (or at the call sites in listPlans / getPlan / listPlansWhere) that throws Error('Invalid billingProviderSlug') when a non-null/undefined value normalizes to ''. The tests cast ' ' as never precisely because the type alone doesn't enforce the runtime contract — fix the implementation, not the tests.

2. apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/refresh/route.ts:14-17 CRON auth still uses === string compare

Release notes claim "Hardened CRON auth with constant-time checks," but only dataset/refresh/route.ts:21-31 and filters/route.ts:27-37 use timingSafeEqual. The plans-refresh route is still:

function authorized(request: NextRequest): boolean {
  const auth = request.headers.get('authorization');
  return Boolean(process.env.CRON_SECRET) && auth === `Bearer ${process.env.CRON_SECRET}`;
}

Fix: replicate the timingSafeEqual pattern from dataset/refresh/route.ts:21-31 (length short-circuit + buffered compare). Extract into a shared verifyCronAuth(request) helper so all CRON entry points share the same hardened check and the next addition can't drift.

3. /plans/{id}/results response-shape change is undocumented and contradicts for-ai.md + openapi.yaml

apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/results/route.ts:83 now returns success(withPlanMeta) — single-wrapped { success, data: PlanResults }. Pre-PR it returned success({ data: withPlanMeta }) — double-wrapped. how-to-guide.md was updated correctly, but:

plugins/orchestrator-leaderboard/docs/for-ai.md:36-37 still says "the payload is double-nested: body.data.data.capabilities[<cap>][]. This is intentional, not a bug."
for-ai.md:265-269 reference impl reads out.data from a { data: PlanResults } wrapper.
for-ai.md:309 pitfalls table still tells readers "Path is body.data.data.capabilities — double .data."
plugins/orchestrator-leaderboard/docs/openapi.yaml:237-243 still defines the response as data: { data: PlanResults }.

This is a backward-incompatible break for any client following the docs. Fix: either revert to the double-wrap on the server (low-risk option) or update all four doc sites + the openapi schema in this PR, add a "Breaking change" note to the PR description, and pin the new shape with a regression test (body.data.capabilities defined, body.data.data undefined).

Important (should fix before merge)

plans/[id]/python-gateway/route.ts:129-135 leaks raw error messages in 500 responses. The default /python-gateway route correctly returns 'Internal server error' and logs server-side (python-gateway/route.ts:131-148). The plan-scoped sibling returns err instanceof Error ? err.message : 'Failed to evaluate plan' verbatim. Match the default route's pattern.
dataset/refresh/route.ts:87-93 and sources/route.ts:112-118 also surface err.message in 500 envelopes. Lower severity (admin-only), but the generic-message + server-side console.error pattern should apply across the plugin.
useCapabilityCatalog.ts:30-48 is dead code in this PR. The hook type is BillingProviderSlug = 'daydream' (line 10), so the guard if (billingProviderSlug !== 'pymthouse' || …) is always true and loadManifest always returns early. Drop the manifest… state + first effect (it's all dead until PR #338), or restore the manifest plumbing now. Leaving dead branches makes the next PR riskier.
provider-restrictions.ts:17-22 providerRestrictionRevision accepts a parameter it ignores. Footgun for the cache-key contract in refresh.ts:31-36 (rev will always be 'na'). Remove the parameter or drop the cache-key field.
packages/database/prisma/schema.prisma:2250 doc comment is stale. Still says "pymthouse | daydream — PymtHouse Builder allowlist intersection applies when pymthouse." Update to "daydream" (or note that other slugs land in #338).
openapi.yaml:858 Capability pattern is ^[a-zA-Z0-9_-]+$. Actual Zod regex in types.ts:96 is /^[a-zA-Z0-9_.:\-/]+$/ and types-validation.test.ts:39-50 confirms dots, colons, and / are accepted. OpenAPI under-specifies; generated clients will reject valid plans.
openapi.yaml:882-884 documents SLAWeights defaults 0.4 / 0.3 / 0.3. The Zod schema (types.ts:106-110) sets no defaults. Either remove the defaults from the spec or add them in Zod.

Nits / suggestions

plans-visibility.test.ts:111 casts ' ' as never for a reason — once the implementation throws, the cast can stay but documents the runtime contract.
Slug validation drifts between routes: plans/[id]/results/route.ts:35-38 tests normalized === '' explicitly, while plans/route.ts:29-33 and friends rely on BillingProviderSlugSchema.safeParse('') failing. Consolidate into a single parseBillingProviderSlugParam utility.
provider-restrictions.test.ts:11-13 has an empty afterEach with only a comment. Delete the hook.
python-gateway/route.ts:75-78 accepts both billingProviderSlug and billingProvider aliases via normalizeBillingProviderSlug, but the plan-scoped python-gateway only accepts billingProviderSlug and 400s otherwise. Document the asymmetry inline.
plans/[id]/python-gateway/route.ts:89 always sets X-Pymthouse-Manifest: empty on the empty-allowlist branch — there is no PymtHouse manifest in this PR. Drop or rename the header.
discovery-policy.ts:49-64: the case 'latency': / case 'price': arms fall back to slaScore. Add an inline comment or TODO so the silent fallback isn't surprising once those fields exist.
for-ai.md:73-77 lists POST /plans/refresh as "Cron-only bulk refresh. Never call." — accurate, but cross-link the openapi entry (openapi.yaml:419-452) so it doesn't drift again.

Test coverage gaps

No test pins the new single-wrapped shape of /plans/{id}/results (would have caught the doc/code drift in Blocking #3).
No test verifies that pre-existing rows with billingProviderSlug = null become invisible when the UI requests ?billingProviderSlug=daydream, or that the seed backfill rewrites them.
No test for /plans/refresh/route.ts: missing CRON_SECRET → 401, wrong secret → 401, valid secret → 200 (would have caught Blocking #2).
No test asserts the capability-catalog route only bypasses cache when ?refresh=1 is present.
No test for capability-catalog/route.ts error paths (Facade throws). The route currently lacks a try/catch around getDashboardPipelineCatalog — an upstream failure surfaces as Next's default error page, not a JSON envelope.
discovery-order.test.ts is solid for partitioning but doesn't assert all orchestrators appear exactly once when duplicates are trimmed mid-input (discovery-order.ts:65-74 dedupe).
bin/seed-discovery-plans.ts has no test for backfillLegacyBillingProviderSlugs — given it mutates production rows at build time, an idempotency test (run twice → no change after first) is warranted.

Migration & rollout notes

DiscoveryPlan.billingProviderSlug is nullable + indexed (schema.prisma:2247-2272), so prisma db push is non-destructive.
Backfill runs at deploy in bin/seed-discovery-plans.ts:78-99 via npx tsx, scheduled in bin/vercel-build.sh:108-111. Runs after prisma db push, so legacy null rows are immediately rewritten to 'daydream'. If the seed crashes mid-loop, partial state is possible — but the next deploy is idempotent.
During the deploy window (schema push → seed → new pods take traffic), old pods still query without the new column and don't care about NULL slugs. No read path in this PR breaks on NULL after the backfill completes.
CreatePlanSchema defaults to 'daydream' and the UI hardcodes 'daydream' (PlanCreatePage.tsx:25); post-deploy plans start with the correct slug.
For PR #338 (PymtHouse): the seed script will need to learn that null → 'daydream' is no longer always correct, otherwise future PymtHouse imports get silently coerced.
The /plans/{id}/results envelope change is breaking for external SDKs reading body.data.data.capabilities. Surface in the PR description and release notes; consider a one-release deprecation window if external consumers exist.

Verdict

REQUEST_CHANGES — two failing tests in plans-visibility.test.ts, plain === CRON auth in /plans/refresh/route.ts, and the undocumented breaking change to the /plans/{id}/results envelope (with for-ai.md and openapi.yaml still describing the old shape) all need to land before merge. The Daydream-only pivot is the right call and resolves the structural concerns from the previous round.

- Introduced a check to throw an error if billingProviderSlug is an empty string, ensuring better error handling and input validation in the billing provider logic.

- Replaced inline CRON authorization checks with a dedicated `verifyCronAuth` function for improved code clarity and maintainability. - Updated affected routes to utilize the new authorization method, ensuring consistent authentication handling across the application. - Enhanced error logging for better debugging in case of authorization failures.

…d update data retrieval - Removed references to the 'pymthouse' billing provider, ensuring the capability catalog is now exclusively focused on 'daydream'. - Simplified the logic for filtering capabilities, allowing all requested capabilities to pass through unfiltered when manifestOnly is true. - Updated the data retrieval process to utilize the new `getDatasetCapabilities` function, enhancing the clarity and maintainability of the code. - Adjusted the seeding script to backfill billing provider slugs, ensuring legacy plans are correctly updated to 'daydream'.

seanhanca

Code Review — Critical & Major Findings

Reviewed the full diff across all 72 changed files. CI is green and the architecture is solid — Daydream-only scoping, DB-persisted dataset, lazy plan cache, and provider-restriction abstraction are all clean decisions. Below are the issues that should be addressed before merge.

Summary

#	Severity	File	Issue
1	Critical	global-dataset.ts	writeGlobalDataset deletes all rows before inserting — concurrent reads during refresh window get empty results
2	Major	refresh.ts	evaluateAndCache / refreshAllPlans accept authToken, requestUrl, cookieHeader params that are never forwarded — dead parameters create a false contract
3	Major	plans/[id]/results/route.ts	errors.internal(message) on line 90 leaks raw error text from evaluateAndCache to the client on 500
4	Major	global-refresh.ts	startupRefreshPromise cached-but-cleared-on-reject creates a duplicate refresh race
5	Major	cron-auth.ts	Length check before timing-safe compare leaks CRON_SECRET length

Detailed inline comments follow.

…r dataset updates - Implemented a new approach for updating the global dataset that uses upsert operations followed by pruning stale rows, ensuring concurrent readers always see a populated table. - Introduced a dedicated `upsertDatasetBatch` function to handle batch upserts, improving performance and maintainability. - Updated related tests to reflect changes in the dataset writing logic and ensure proper validation of upsert behavior. - Removed legacy delete-then-insert logic to prevent empty table states during refresh operations.

eliteprox · 2026-05-27T19:17:32Z

All 5 findings from this review (plus the inline pymthouse dead-branch note) are addressed in the latest push. Summary:

#	Issue	Resolution
1	`writeGlobalDataset` delete-then-insert window	Switched to upsert-then-prune: every row is upserted via `INSERT … ON CONFLICT (capability, orchUri) DO UPDATE` with the new `refreshedAt`, then a single `deleteMany({ refreshedAt: { lt: now } })` prunes stragglers. Table is never empty mid-refresh.
2	Dead `authToken/requestUrl/cookieHeader` on `evaluateAndCache`/`refreshAllPlans`	Removed the unused parameters and updated all 4 callers (`plans/[id]/results`, `plans/[id]/python-gateway`, `plans/refresh`, `startLocalRefreshLoop`) plus tests. `refreshGlobalDataset` keeps `authToken` because it actually does forward it to source adapters.
3	Raw error text leaked to client on 500	Log `err` server-side, return generic `Failed to evaluate plan` from `errors.internal(...)`.
4	`startupRefreshPromise` cleared-on-reject duplicate-refresh race	Rejected promise is held for a 30 s `STARTUP_REJECT_COOLDOWN_MS` cooldown via `setTimeout(...).unref?.()` so concurrent cold-start callers share the failure; only then can a fresh attempt run. `inflight.catch(() => undefined)` suppresses spurious unhandled-rejection warnings while leaving the original rejection observable to awaiters.
5	`cron-auth` length-leak before timing-safe compare	Both `auth` and `expected` are HMAC-SHA256'd before `timingSafeEqual` — length is no longer observable via timing.
(bonus)	Dead `pymthouse` branch in `plans.ts` `billingProviderWhere`	Removed — `BillingProviderSlug` is `['daydream']` only in this PR. Will return in #338 with test coverage.

Verification: npm --workspace @naap/web-next test → 705 passed, 1 skipped, 0 failed. Tests for writeGlobalDataset updated to assert the new $executeRaw upsert + deleteMany({ refreshedAt: { lt: ... } }) contract.

eliteprox · 2026-05-27T20:13:31Z

@qianghan @seanhanca This is ready for re-review

Rebased onto main (post #337 "Daydream discovery plans and cache refresh"). Re-applies PymtHouse integration on top of the canonical Daydream discovery work that landed in #337: - PymtHouse OIDC device-flow login + device approval routes (consent screen, CSRF, constant-time cookie verify, per-instance rate limiting) - Billing token/usage APIs and usage helpers via @pymthouse/builder-sdk - Developer API key expiry, billing key prefix masking, safe key serialization - developer-api plugin UI (usage, keys, discovery plan tooltip) + backend proxy - PymtHouse manifest gating restored across discovery: provider-restrictions, capability-catalog, python-gateway, plan refresh, global refresh - billingProviderSlug supports 'pymthouse' alongside 'daydream' (OR-null backfill for legacy rows); defaults to pymthouse - UMD plugin stylesheet lifecycle, Modal escape-to-close, plugin visibleToUsers Conflict resolution preserved #337's Daydream discovery baseline and layered the branch's PymtHouse functionality on top. The plugin frontend (identical to main except 5 files) takes the branch versions of the PymtHouse-aware pages. Co-authored-by: Cursor <cursoragent@cursor.com>

…#338) feat(pymthouse): add OIDC, billing APIs, and manifest-gated discovery Re-apply the PymtHouse integration on top of the canonical Daydream discovery baseline from #337. Add OIDC device-flow login and approval routes, billing token/usage APIs, developer API key handling, developer-api plugin UI/backend proxy, and PymtHouse manifest gating across discovery routes and refresh flows. Restore billingProviderSlug support for both pymthouse and daydream, including legacy OR-null handling and pymthouse defaults. Also harden sensitive API responses with safer error handling and no-store caching, align API key status serialization to uppercase, update jsdom-safe manifest tests, sync the lockfile, and clean up UMD stylesheet lifecycle handling.

github-actions Bot added has-migration Includes database migration size/L Large PR (201-500 lines) labels May 23, 2026

github-actions Bot added scope/shell Shell app changes scope/packages Shared package changes scope/infra Infrastructure changes labels May 23, 2026

vercel Bot deployed to Preview May 23, 2026 05:41 View deployment

coderabbitai Bot requested changes May 23, 2026

View reviewed changes

Comment thread apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/route.ts

Comment thread apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/route.ts

Comment thread apps/web-next/src/lib/orchestrator-leaderboard/plans.ts Outdated

vercel Bot deployed to Preview May 23, 2026 05:52 View deployment

eliteprox enabled auto-merge (squash) May 23, 2026 05:57

vercel Bot deployed to Preview May 23, 2026 06:07 View deployment

eliteprox changed the title ~~feat(discovery): link plans to billing providers~~ feat(orchestrator-leaderboard): discovery plans and PymtHouse manifest gating May 23, 2026

eliteprox mentioned this pull request May 23, 2026

feat(pymthouse): OIDC device flow, billing APIs, and developer-api UI #338

Merged

3 tasks

eliteprox removed the size/L Large PR (201-500 lines) label May 23, 2026

github-actions Bot added the size/XL Extra large PR (500+ lines) label May 23, 2026

eliteprox requested a review from seanhanca May 23, 2026 06:35

seanhanca requested changes May 25, 2026

View reviewed changes

vercel Bot deployed to Preview May 26, 2026 17:22 View deployment

coderabbitai Bot requested changes May 26, 2026

View reviewed changes

eliteprox and others added 2 commits May 26, 2026 14:03

seanhanca requested changes May 26, 2026

View reviewed changes

fix(plans): add validation for empty billingProviderSlug

e4fa461

- Introduced a check to throw an error if billingProviderSlug is an empty string, ensuring better error handling and input validation in the billing provider logic.

eliteprox dismissed coderabbitai[bot]’s stale review via e4fa461 May 26, 2026 23:27

vercel Bot deployed to Preview May 26, 2026 23:30 View deployment

eliteprox added 2 commits May 26, 2026 19:57

vercel Bot deployed to Preview May 27, 2026 00:02 View deployment

eliteprox requested review from qianghan and seanhanca May 27, 2026 00:04