feat(orchestrator-leaderboard): Daydream discovery plans and cache refresh#337
Conversation
Add billingProviderSlug on DiscoveryPlan with API filtering, CRUD support, and default plan seeding for pymthouse. Plans list/get accept an optional billingProviderSlug query param; pymthouse requests also match legacy rows with a null provider slug. Co-authored-by: Cursor <cursoragent@cursor.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
🗃️ Database Migration Detected This PR includes changes to Prisma schema files. Please ensure:
Requesting review from the core team: @livepeer/core |
|
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds billing-provider slug support across types/DB/CRUD and seeds; implements provider-scoped capability catalog; adds plan-scoped and stateless python-gateway discovery with tiered shuffling and composite evaluation cache keys; refactors ClickHouse query routing and adapters; hardens cron/admin auth; updates frontend components, hooks, API client, tests, docs, and startup refresh. ChangesBilling Provider Plan Scoping
Python Gateway Discovery
Capability Catalog and ClickHouse
Caching, Discovery Policy, and Startup
Security and Misc
Possibly related PRs
Estimated code review effort 🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts (1)
164-187: ⚡ Quick winAdd a
getPlan(..., 'pymthouse')compatibility test for legacynullrows.You already cover this fallback in
listPlans; adding the same assertion forgetPlanwould protect the special-case behavior from regressions on the detail endpoint.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts` around lines 164 - 187, Add a new test that calls getPlan('pub-1', { ownerUserId: 'user-b' }, 'pymthouse') and asserts mockFindFirst was invoked with a where clause that applies the provider compatibility fallback (i.e., the billingProviderSlug condition allows legacy null rows). Reuse publicPlan and mockFindFirst from the existing test, mirroring the provider-scoped assertion but expect the provider predicate to include billingProviderSlug: null (or an OR that allows null) for the 'pymthouse' case so the special-case behavior is covered on the detail endpoint.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/`[id]/route.ts:
- Around line 41-45: The check `if (!raw)` treats an empty query
`?billingProviderSlug=` as missing and bypasses validation; change the check to
`raw === null` so empty strings still reach
`BillingProviderSlugSchema.safeParse(raw.trim().toLowerCase())`, allowing the
route handler to return a 400 on invalid/empty values; update the logic around
the `raw` variable in route.ts (the block that calls
BillingProviderSlugSchema.safeParse) accordingly to ensure only a truly absent
param is treated as missing.
In `@apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/route.ts`:
- Around line 24-29: The current check uses if (!raw) which treats an empty
query value as absent and bypasses validation; change that to a strict null
check (if (raw === null)) so only missing params return { value: null, error:
null }, and let BillingProviderSlugSchema.safeParse(raw.trim().toLowerCase())
handle empty-string validation; update the conditional around raw and keep
references to the same variables (raw) and schema
(BillingProviderSlugSchema.safeParse) so empty ?billingProviderSlug= is treated
as invalid.
In `@apps/web-next/src/lib/orchestrator-leaderboard/plans.ts`:
- Around line 162-167: The early return when scopeWhere is falsy blocks admin
actions on public plans; in updatePlan and deletePlan adjust the logic around
writeScopeWhere(scope) so that if the existing plan's visibility === 'public'
you do not immediately return 'forbidden' but still construct mutationWhere to
allow admin public-plan mutations (i.e., compute scopeWhere =
writeScopeWhere(scope) only for non-public plans, and for public plans let
mutationWhere be { id } so the public-plan admin branch can apply); apply the
same pattern to both updatePlan and deletePlan, referencing writeScopeWhere,
existing.visibility, mutationWhere, updatePlan and deletePlan to locate the
changes.
---
Nitpick comments:
In
`@apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts`:
- Around line 164-187: Add a new test that calls getPlan('pub-1', { ownerUserId:
'user-b' }, 'pymthouse') and asserts mockFindFirst was invoked with a where
clause that applies the provider compatibility fallback (i.e., the
billingProviderSlug condition allows legacy null rows). Reuse publicPlan and
mockFindFirst from the existing test, mirroring the provider-scoped assertion
but expect the provider predicate to include billingProviderSlug: null (or an OR
that allows null) for the 'pymthouse' case so the special-case behavior is
covered on the detail endpoint.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8f87ff80-6fab-45d6-b0ec-65d538bd0093
📒 Files selected for processing (9)
apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/route.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/route.tsapps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.tsapps/web-next/src/lib/orchestrator-leaderboard/__tests__/types-validation.test.tsapps/web-next/src/lib/orchestrator-leaderboard/plans.tsapps/web-next/src/lib/orchestrator-leaderboard/types.tsbin/seed-discovery-plans.tspackage.jsonpackages/database/prisma/schema.prisma
…tions Treat only absent billingProviderSlug query params as unset so empty values return 400. Allow admin public-plan updates/deletes without a personal write scope. Add getPlan pymthouse null-slug compatibility test. Co-authored-by: Cursor <cursoragent@cursor.com>
Move orchestrator-leaderboard backend, manifest gating, python-gateway routes, discovery-service client, plugin UI, and docs from #307 into this PR. Excludes developer-api and billing OIDC/auth changes. Includes capability catalog, tiered discovery shuffle, default/plan-scoped python-gateway, PymtHouse manifest sync, and removal of demo seed API. Co-authored-by: Cursor <cursoragent@cursor.com>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
seanhanca
left a comment
There was a problem hiding this comment.
Review — feat(orchestrator-leaderboard): discovery plans and PymtHouse manifest gating
Summary
The three CodeRabbit-flagged fixes all land correctly — empty ?billingProviderSlug= is rejected by zod, and updatePlan / deletePlan now branch on existing.visibility === 'public' so admins no longer fail when their personal scope is empty. The remaining concerns are a non-obvious fail-closed gap for legacy billingProviderSlug = null rows and a chunk of dead code in discovery-service/client.ts. With those two resolved this is a solid feature.
Blocking issues (must fix)
1. Legacy billingProviderSlug = null plans bypass PymtHouse manifest gating end-to-end
plans.ts:74-78 deliberately maps ?billingProviderSlug=pymthouse to OR: [{ slug: 'pymthouse' }, { slug: null }] so legacy rows are queryable as pymthouse-compatible. But the runtime gating treats them as "no provider":
// apps/web-next/src/lib/orchestrator-leaderboard/provider-restrictions.ts:34-43
if (normalizeBillingProviderSlug(billingProviderSlug) !== 'pymthouse') {
return capabilities; // bypasses manifest filter for null
}
// apps/web-next/src/lib/orchestrator-leaderboard/refresh.ts:111-120
if (normalizeBillingProviderSlug(plan.billingProviderSlug) === 'pymthouse') {
await ensurePymthouseManifestFresh(...);
}
Net effect: a personal plan created before this PR (slug null) returned via /plans/:id/python-gateway will (a) skip the manifest refresh and (b) pass its capabilities through verbatim — i.e. excluded pipelines/models will still be discovered. That contradicts the documented invariant in docs/pymthouse-integration.md:90 ("missing manifest denies by default").
Fix: either treat null as pymthouse in normalizeBillingProviderSlug (so the default-pymthouse semantic applies everywhere), or backfill billingProviderSlug = 'pymthouse' for existing rows during prisma db push (a one-shot UPDATE in bin/seed-discovery-plans.ts would be enough — createPlan already defaults new rows to 'pymthouse').
2. apps/web-next/src/lib/discovery-service/client.ts is dead code
No callers anywhere in the repo (verified queryPlanFromDiscoveryService, refreshDiscoveryServiceDataset, isDiscoveryServiceEnabled, and DISCOVERY_SERVICE_URL). It exports a 158-line HTTP client with no tests. Either wire it into refresh.ts / global-refresh.ts and add a vitest, or delete it from this PR. Shipping it unused increases the surface that has to be kept in sync with a service this codebase doesn't actually call.
Important (should fix before merge)
1. Stale doc reference to the removed POST /plans/seed
// plugins/orchestrator-leaderboard/docs/for-ai.md:65
| `POST` | `/plans/seed` | JWT or `gw_` | Seed 4 demo plans (idempotent). Onboarding. |
The route was deleted in this PR; replace this row with a pointer to npm run seed:discovery-plans / bin/seed-discovery-plans.ts.
2. openapi.yaml is not updated for any of the new surface
Searches in plugins/orchestrator-leaderboard/docs/openapi.yaml for billingProviderSlug, python-gateway, and capability-catalog all return 0 hits. The new query param on list/get and the three new routes (/plans/:id/python-gateway, /python-gateway, /capability-catalog) need OpenAPI entries — this is the contract the PR description points external consumers at.
3. capability-catalog/route.ts:89 always forces bypassCache: true
const catalog = await getDashboardPipelineCatalog({ bypassCache: true });
Combined with Cache-Control: private, no-store, must-revalidate, every plan-edit page open recomputes the entire pipeline catalog. Drop bypassCache (or move the bypass behind ?refresh=1) so the facade-level cache helps the hot UI path.
4. Inconsistent empty-slug handling on the python-gateway plan route
// apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/python-gateway/route.ts:32-35
const raw = request.nextUrl.searchParams.get('billingProviderSlug');
if (!raw) {
return { value: null, error: null };
}
The fixed CRUD/list routes use if (raw === null) so empty strings 400. This route uses if (!raw) so ?billingProviderSlug= is silently treated as missing. Match the strict 400 behavior (preferred) or document the asymmetry — otherwise python-gateway clients sending the param will silently fall back to "no provider override" instead of getting a clear validation error.
Migration & rollout
Schema change is safe (billingProviderSlug String? + nullable index, prisma db push --accept-data-loss will ALTER TABLE without nuking existing rows). Personal plans created before this PR will land with null — see Blocking #1; consider adding a one-time UPDATE "DiscoveryPlan" SET "billingProviderSlug" = 'pymthouse' WHERE "billingProviderSlug" IS NULL to the seed script so the runtime invariant holds without code changes to the gating helpers.
.env.local.example additions reference (commented-out) PYMTHOUSE_* knobs only — no secrets, and PYMTHOUSE_ALLOW_MISSING_MANIFEST_FAIL_OPEN is correctly documented as audited and opt-in.
Verdict
REQUEST_CHANGES — primarily for the legacy-null PymtHouse-gating gap (blocking #1) and the unused discovery-service/client.ts (blocking #2). Once those are resolved and the docs/openapi update lands, the rest are happy "important" cleanups that can ride along.
…points - Updated the capability catalog route to support a `refresh` query parameter for bypassing cache. - Added a test to validate 400 responses when `billingProviderSlug` is empty. - Modified the `parseBillingProviderSlugParam` function to treat null values correctly. - Implemented a backfill function for legacy billing provider slugs in the seeding script. - Introduced new API endpoints for plan-scoped and default python-gateway discovery, along with capability catalog retrieval. Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
plugins/orchestrator-leaderboard/docs/openapi.yaml (1)
69-78:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDocument the new 400 path for invalid
billingProviderSlug.These operations now accept
billingProviderSlug, and the shared parameter explicitly says an empty value returns 400. The spec still only advertises 200/401(/404), so generated clients and external integrators will miss a real validation branch.Also applies to: 136-154
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plugins/orchestrator-leaderboard/docs/openapi.yaml` around lines 69 - 78, The OpenAPI operations that reference the shared parameter BillingProviderSlugQuery must include a "400" response to document the empty/invalid billingProviderSlug validation branch; update the responses block for the affected operations (the one shown and the other occurrence around lines 136-154) to add a "400" response that references the existing bad-request response/schema (e.g., components/responses/BadRequest or a suitable error schema) so generated clients and integrators will surface the validation error for BillingProviderSlugQuery.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@apps/web-next/src/app/api/v1/orchestrator-leaderboard/capability-catalog/route.ts`:
- Around line 62-67: When handling requestedBillingProviderSlug === 'pymthouse'
wrap the call to ensurePymthouseManifestFresh() in a try/catch so that any
exception is swallowed (and optionally logged) and manifestAvailable is set to
false instead of letting the error propagate; after the try/catch still call
getPymthouseManifestSnapshot() only when no error occurred to determine
manifestAvailable, referencing ensurePymthouseManifestFresh(),
getPymthouseManifestSnapshot(), and the manifestAvailable variable to locate
where to add the try/catch.
In
`@apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/`[id]/results/route.ts:
- Around line 29-31: The current check uses `if (!raw)` which treats an empty
query like `?billingProviderSlug=` as missing; change the validation so you
distinguish null vs empty string: keep the null check (`raw === null`) for
missing param, and add an explicit empty-string check (e.g., `raw.trim() ===
''`) to reject empty values and return a 400/validation error. Update the logic
around the `raw` variable in route.ts (the `billingProviderSlug` handling) so
empty strings are rejected rather than treated as missing.
In
`@apps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.ts`:
- Around line 145-148: The handler currently returns the raw variable message in
the NextResponse for all errors; change it so that when invalidCapability is
true you still return message with status 400, but when status is 500 return a
generic non-sensitive message (e.g., "Internal server error") instead of the raw
message; keep the original message available for server-side logging (e.g., log
the error before returning) and update the NextResponse creation (referencing
NextResponse, message, invalidCapability) accordingly.
- Around line 87-89: Call to ensurePymthouseManifestFresh() is currently outside
the try/catch so any rejection bypasses your structured API error handling; move
the ensurePymthouseManifestFresh() invocation into the existing try block that
handles the 'pymthouse' billingProvider (the same guarded path that creates/uses
the Python gateway client in route.ts) so its errors are caught and translated
into your API error response rather than escaping the handler.
In `@apps/web-next/src/lib/orchestrator-leaderboard/query.ts`:
- Around line 56-58: The current resolveDirectClickhouseUrl function drops any
configured path/query by doing new URL('/', rawUrl); change it to preserve the
original path and query from CLICKHOUSE_URL: parse rawUrl with new URL(rawUrl),
ensure the pathname ends with a trailing slash if needed (so endpoints like
base/path become base/path/), and return that URL.toString(); update the
function resolveDirectClickhouseUrl to use this logic so direct mode calls the
correct path-prefixed endpoint.
In `@apps/web-next/src/lib/pymthouse-discovery-plans.test.ts`:
- Around line 35-37: The test sets inconsistent env vars using PMTHOUSE_*;
update the vi.stubEnv calls so all keys match the rest of the integration (use
PYMTHOUSE_*). Specifically, change vi.stubEnv('PMTHOUSE_CLIENT_ID', 'app_x') to
vi.stubEnv('PYMTHOUSE_CLIENT_ID', 'app_x') and
vi.stubEnv('PMTHOUSE_M2M_CLIENT_SECRET', 'secret') to
vi.stubEnv('PYMTHOUSE_M2M_CLIENT_SECRET', 'secret'), and then run the test to
ensure no other references still use the old PMTHOUSE_* names.
In `@apps/web-next/src/lib/pymthouse-manifest.test.ts`:
- Around line 114-121: The test "uses server manifestVersion when provided" is
asserting the opposite; update the assertion in the test that calls
computeManifestRevision so it expects the returned revision to equal the
provided server manifestVersion ('server-rev-abc') and, if helpful, rename the
test or adjust its description to remain consistent with the behavior being
validated; target the computeManifestRevision call and its expect(...) assertion
for this change.
In `@apps/web-next/src/lib/pymthouse-manifest.ts`:
- Around line 245-265: The current fetch catch block clears any existing
snapshot by setting body and etag to null before calling applyManifestSnapshot,
which causes transient failures to wipe the last good manifest; change the logic
to preload the last known snapshot into body/etag before attempting the network
call (e.g. call a helper like getLastManifestSnapshot() to populate body and
etag), then only overwrite body/etag when the fetch succeeds and
parseManifestJson returns a valid manifest, and in the catch block do not set
body/etag to null so applyManifestSnapshot(body, etag) will preserve the
previous snapshot; reference parseManifestJson, applyManifestSnapshot and the
fetch block around creds.url when making the change.
In `@plugins/orchestrator-leaderboard/docs/how-to-guide.md`:
- Around line 451-452: The key-takeaway line uses the wrong path
`results.data.capabilities[...]`; update it to match the examples by referencing
`results.capabilities[...]` (or `env.data.capabilities[...]` where the doc uses
`env`) so clients parse the same structure; keep the note about `.orchUri` as
the dialer field.
In `@plugins/orchestrator-leaderboard/docs/openapi.yaml`:
- Around line 311-315: Change the query parameters "caps" and "capability" from
a scalar string to an array schema so generators emit repeatable query params;
update each param (e.g., the param named caps and the one named capability) to
use schema: { type: array, items: { type: string } } and include query
serialization hints (style: form, explode: true) so repeated usages like
?caps=a&caps=b are represented correctly.
In `@plugins/orchestrator-leaderboard/frontend/e2e/leaderboard.spec.ts`:
- Around line 260-262: The test asserts properties on resp.data.plan (e.g.,
expect(resp.data.plan.name)...), but the stubbed response still nests the
payload under resp.data.data.plan; update the mocked/stubbed response object
used in the spec so it returns the plan at resp.data.plan (remove the extra
"data" wrapper) and ensure capabilities are at resp.data.capabilities (so checks
like expect(resp.data.capabilities.noop).toHaveLength(1) pass); locate the stub
in the same spec where the fake response body is constructed and adjust the
shape accordingly.
In `@plugins/orchestrator-leaderboard/frontend/src/globals.css`:
- Around line 109-112: Remove the extra blank line between the background and
color declarations to satisfy the Stylelint rule declaration-empty-line-before:
inside the CSS rule that contains "background: var(--bg-tertiary);" and "color:
`#34d399`;", delete the empty line so the two declarations are adjacent (no blank
line before the color declaration).
In `@plugins/orchestrator-leaderboard/frontend/src/pages/PlanDetailPage.tsx`:
- Around line 125-134: bulkToggleCapabilities currently only compares exact
strings (using current.includes and removeSet) so legacy aliases like "sdxl"
remain selected when their canonical counterparts are removed; update
bulkToggleCapabilities to normalize/expand capabilities during comparison:
create or use a mapping of legacy aliases to canonical capability names (e.g.,
aliasMap) and when deselecting expand removeSet to include both the supplied
capability keys and their known aliases/canonical equivalents, then filter
current by checking against this expanded set (referencing
bulkToggleCapabilities, current, next, and removeSet to locate changes).
---
Outside diff comments:
In `@plugins/orchestrator-leaderboard/docs/openapi.yaml`:
- Around line 69-78: The OpenAPI operations that reference the shared parameter
BillingProviderSlugQuery must include a "400" response to document the
empty/invalid billingProviderSlug validation branch; update the responses block
for the affected operations (the one shown and the other occurrence around lines
136-154) to add a "400" response that references the existing bad-request
response/schema (e.g., components/responses/BadRequest or a suitable error
schema) so generated clients and integrators will surface the validation error
for BillingProviderSlugQuery.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3ceef9c9-0bbb-4e57-82b5-f15ae5ae6e10
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json,!package-lock.json
📒 Files selected for processing (68)
apps/web-next/.env.local.exampleapps/web-next/src/app/api/v1/dashboard/orchestrators/route.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/capability-catalog/route.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/dataset/refresh/route.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/filters/route.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/python-gateway/route.test.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/python-gateway/route.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/results/route.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/route.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/route.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/seed/route.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.test.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/python-gateway/route.tsapps/web-next/src/app/api/v1/orchestrator-leaderboard/sources/route.tsapps/web-next/src/instrumentation.tsapps/web-next/src/lib/facade/index.tsapps/web-next/src/lib/facade/resolvers/pipeline-catalog.tsapps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.tsapps/web-next/src/lib/orchestrator-leaderboard/__tests__/provider-restrictions.test.tsapps/web-next/src/lib/orchestrator-leaderboard/__tests__/query.test.tsapps/web-next/src/lib/orchestrator-leaderboard/__tests__/refresh.test.tsapps/web-next/src/lib/orchestrator-leaderboard/__tests__/sources/integration.test.tsapps/web-next/src/lib/orchestrator-leaderboard/discovery-order.test.tsapps/web-next/src/lib/orchestrator-leaderboard/discovery-order.tsapps/web-next/src/lib/orchestrator-leaderboard/global-refresh.tsapps/web-next/src/lib/orchestrator-leaderboard/plans.tsapps/web-next/src/lib/orchestrator-leaderboard/provider-restrictions.tsapps/web-next/src/lib/orchestrator-leaderboard/query.tsapps/web-next/src/lib/orchestrator-leaderboard/refresh.tsapps/web-next/src/lib/orchestrator-leaderboard/sources/clickhouse.tsapps/web-next/src/lib/orchestrator-leaderboard/sources/naap-pricing.tsapps/web-next/src/lib/orchestrators-discovery-policy.test.tsapps/web-next/src/lib/orchestrators-discovery-policy.tsapps/web-next/src/lib/pymthouse-discovery-plans.test.tsapps/web-next/src/lib/pymthouse-discovery-plans.tsapps/web-next/src/lib/pymthouse-manifest.test.tsapps/web-next/src/lib/pymthouse-manifest.tsapps/web-next/tests/leaderboard-posture.spec.tsapps/web-next/tests/orchestrator-leaderboard.spec.tsapps/web-next/vitest.config.tsbin/db-setup.shbin/seed-discovery-plans.tsbin/vercel-build.shdocs/pymthouse-integration.mdplugins/orchestrator-leaderboard/docs/data-sources.mdplugins/orchestrator-leaderboard/docs/for-ai.mdplugins/orchestrator-leaderboard/docs/how-to-guide.mdplugins/orchestrator-leaderboard/docs/openapi.yamlplugins/orchestrator-leaderboard/frontend/e2e/leaderboard.spec.tsplugins/orchestrator-leaderboard/frontend/src/App.tsxplugins/orchestrator-leaderboard/frontend/src/components/AdminSettings.tsxplugins/orchestrator-leaderboard/frontend/src/components/CapabilityGroupPicker.tsxplugins/orchestrator-leaderboard/frontend/src/components/CapabilityTag.tsxplugins/orchestrator-leaderboard/frontend/src/components/CollapsibleTagList.tsxplugins/orchestrator-leaderboard/frontend/src/components/EndpointGuide.tsxplugins/orchestrator-leaderboard/frontend/src/components/FormLabel.tsxplugins/orchestrator-leaderboard/frontend/src/components/SectionLabel.tsxplugins/orchestrator-leaderboard/frontend/src/components/StyledCheckbox.tsxplugins/orchestrator-leaderboard/frontend/src/components/TabNav.tsxplugins/orchestrator-leaderboard/frontend/src/globals.cssplugins/orchestrator-leaderboard/frontend/src/hooks/useCapabilityCatalog.tsplugins/orchestrator-leaderboard/frontend/src/hooks/usePlanDetail.tsplugins/orchestrator-leaderboard/frontend/src/hooks/usePlans.tsplugins/orchestrator-leaderboard/frontend/src/lib/api.tsplugins/orchestrator-leaderboard/frontend/src/pages/LeaderboardPage.tsxplugins/orchestrator-leaderboard/frontend/src/pages/PlanCreatePage.tsxplugins/orchestrator-leaderboard/frontend/src/pages/PlanDetailPage.tsxplugins/orchestrator-leaderboard/frontend/src/pages/PlansOverviewPage.tsx
💤 Files with no reviewable changes (1)
- apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/seed/route.ts
✅ Files skipped from review due to trivial changes (6)
- plugins/orchestrator-leaderboard/frontend/src/components/StyledCheckbox.tsx
- plugins/orchestrator-leaderboard/frontend/src/components/TabNav.tsx
- plugins/orchestrator-leaderboard/docs/data-sources.md
- apps/web-next/tests/orchestrator-leaderboard.spec.ts
- bin/vercel-build.sh
- docs/pymthouse-integration.md
… routes - Added error handling for PymtHouse manifest refresh failures in the capability catalog route. - Improved validation logic for `billingProviderSlug` in the plans results route. - Updated the Python gateway route to ensure manifest refresh is only attempted when necessary. - Enhanced tests to cover scenarios where manifest fetch fails, preserving the last known good state. - Refactored utility functions to improve clarity and maintainability. Co-authored-by: Cursor <cursoragent@cursor.com>
…rator discovery plans - Replaced PymtHouse manifest dependencies with Daydream-specific logic in the capability catalog and Python gateway routes. - Updated billing provider handling to default to 'daydream' instead of 'pymthouse'. - Introduced new discovery constants and policies to support Daydream-only functionality. - Removed legacy PymtHouse manifest and discovery plan files, streamlining the codebase. - Enhanced the orchestrator discovery policy to apply filters without relying on PymtHouse exclusions. Co-authored-by: Cursor <cursoragent@cursor.com>
seanhanca
left a comment
There was a problem hiding this comment.
Re-review after Daydream-only pivot
Picking this back up after the 8 follow-up commits. Scope-pivot to Daydream-only is clean — PymtHouse manifest/discovery code is gone and BILLING_PROVIDER_SLUGS = ['daydream'] is enforced by Zod end-to-end. 5 of 6 prior blockers are fully resolved. Remaining issues below.
Prior-review blocker status
- Legacy
billingProviderSlug = nullplans bypassed PymtHouse manifest gating — RESOLVED.provider-restrictions.tsno longer does manifest gating;normalizeBillingProviderSlugonly treats'daydream'as valid;bin/seed-discovery-plans.ts:78-91backfills legacy null rows at deploy time. - Dead
apps/web-next/src/lib/discovery-service/client.ts— RESOLVED. Directory deleted. - Stale
POST /plans/seeddoc reference — RESOLVED. Tables infor-ai.mdandhow-to-guide.md:636-640now point atnpm run seed:discovery-plans. Route file deleted; no callers remain. openapi.yamlnot updated for new endpoints/params — PARTIALLY RESOLVED.billingProviderSlug(openapi.yaml:796-803),python-gateway(openapi.yaml:252-375), andcapability-catalog(openapi.yaml:377-417) are documented. Still wrong: the/plans/{id}/resultsresponse schema (openapi.yaml:237-243) keeps the old double-datawrapper — see Blocking #3.capability-catalog/route.ts:89always forcedbypassCache: true— RESOLVED. Now only bypasses when?refresh=1is present.- Inconsistent empty-slug handling on
/plans/[id]/python-gateway— RESOLVED. All four route handlers consistently 400 on empty value, and the python-gateway test asserts it (plans/[id]/python-gateway/route.test.ts:84-93).
Blocking issues (must fix)
1. Vitest is red — two tests fail in plans-visibility.test.ts
apps/web-next/src/lib/orchestrator-leaderboard/__tests__/plans-visibility.test.ts:111-115 and :167-172 expect listPlans / getPlan to throw 'Invalid billingProviderSlug' when called with a blank slug ' '. The implementation in apps/web-next/src/lib/orchestrator-leaderboard/plans.ts:67-75 does no validation — it constructs { billingProviderSlug } from whatever string it receives and passes it to Prisma. Local run: Tests 2 failed | 137 passed | 1 skipped.
Fix: add a trim/empty-string guard inside billingProviderWhere (or at the call sites in listPlans / getPlan / listPlansWhere) that throws Error('Invalid billingProviderSlug') when a non-null/undefined value normalizes to ''. The tests cast ' ' as never precisely because the type alone doesn't enforce the runtime contract — fix the implementation, not the tests.
2. apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/refresh/route.ts:14-17 CRON auth still uses === string compare
Release notes claim "Hardened CRON auth with constant-time checks," but only dataset/refresh/route.ts:21-31 and filters/route.ts:27-37 use timingSafeEqual. The plans-refresh route is still:
function authorized(request: NextRequest): boolean {
const auth = request.headers.get('authorization');
return Boolean(process.env.CRON_SECRET) && auth === `Bearer ${process.env.CRON_SECRET}`;
}Fix: replicate the timingSafeEqual pattern from dataset/refresh/route.ts:21-31 (length short-circuit + buffered compare). Extract into a shared verifyCronAuth(request) helper so all CRON entry points share the same hardened check and the next addition can't drift.
3. /plans/{id}/results response-shape change is undocumented and contradicts for-ai.md + openapi.yaml
apps/web-next/src/app/api/v1/orchestrator-leaderboard/plans/[id]/results/route.ts:83 now returns success(withPlanMeta) — single-wrapped { success, data: PlanResults }. Pre-PR it returned success({ data: withPlanMeta }) — double-wrapped. how-to-guide.md was updated correctly, but:
plugins/orchestrator-leaderboard/docs/for-ai.md:36-37still says "the payload is double-nested:body.data.data.capabilities[<cap>][]. This is intentional, not a bug."for-ai.md:265-269reference impl readsout.datafrom a{ data: PlanResults }wrapper.for-ai.md:309pitfalls table still tells readers "Path isbody.data.data.capabilities— double.data."plugins/orchestrator-leaderboard/docs/openapi.yaml:237-243still defines the response asdata: { data: PlanResults }.
This is a backward-incompatible break for any client following the docs. Fix: either revert to the double-wrap on the server (low-risk option) or update all four doc sites + the openapi schema in this PR, add a "Breaking change" note to the PR description, and pin the new shape with a regression test (body.data.capabilities defined, body.data.data undefined).
Important (should fix before merge)
-
plans/[id]/python-gateway/route.ts:129-135leaks raw error messages in 500 responses. The default/python-gatewayroute correctly returns'Internal server error'and logs server-side (python-gateway/route.ts:131-148). The plan-scoped sibling returnserr instanceof Error ? err.message : 'Failed to evaluate plan'verbatim. Match the default route's pattern. -
dataset/refresh/route.ts:87-93andsources/route.ts:112-118also surfaceerr.messagein 500 envelopes. Lower severity (admin-only), but the generic-message + server-sideconsole.errorpattern should apply across the plugin. -
useCapabilityCatalog.ts:30-48is dead code in this PR. The hook type isBillingProviderSlug = 'daydream'(line 10), so the guardif (billingProviderSlug !== 'pymthouse' || …)is always true andloadManifestalways returns early. Drop themanifest…state + first effect (it's all dead until PR #338), or restore the manifest plumbing now. Leaving dead branches makes the next PR riskier. -
provider-restrictions.ts:17-22providerRestrictionRevisionaccepts a parameter it ignores. Footgun for the cache-key contract inrefresh.ts:31-36(revwill always be'na'). Remove the parameter or drop the cache-key field. -
packages/database/prisma/schema.prisma:2250doc comment is stale. Still says "pymthouse | daydream— PymtHouse Builder allowlist intersection applies when pymthouse." Update to "daydream" (or note that other slugs land in #338). -
openapi.yaml:858Capabilitypattern is^[a-zA-Z0-9_-]+$. Actual Zod regex intypes.ts:96is/^[a-zA-Z0-9_.:\-/]+$/andtypes-validation.test.ts:39-50confirms dots, colons, and/are accepted. OpenAPI under-specifies; generated clients will reject valid plans. -
openapi.yaml:882-884documentsSLAWeightsdefaults0.4 / 0.3 / 0.3. The Zod schema (types.ts:106-110) sets no defaults. Either remove the defaults from the spec or add them in Zod.
Nits / suggestions
plans-visibility.test.ts:111casts' ' as neverfor a reason — once the implementation throws, the cast can stay but documents the runtime contract.- Slug validation drifts between routes:
plans/[id]/results/route.ts:35-38testsnormalized === ''explicitly, whileplans/route.ts:29-33and friends rely onBillingProviderSlugSchema.safeParse('')failing. Consolidate into a singleparseBillingProviderSlugParamutility. provider-restrictions.test.ts:11-13has an emptyafterEachwith only a comment. Delete the hook.python-gateway/route.ts:75-78accepts bothbillingProviderSlugandbillingProvideraliases vianormalizeBillingProviderSlug, but the plan-scoped python-gateway only acceptsbillingProviderSlugand 400s otherwise. Document the asymmetry inline.plans/[id]/python-gateway/route.ts:89always setsX-Pymthouse-Manifest: emptyon the empty-allowlist branch — there is no PymtHouse manifest in this PR. Drop or rename the header.discovery-policy.ts:49-64: thecase 'latency':/case 'price':arms fall back toslaScore. Add an inline comment or TODO so the silent fallback isn't surprising once those fields exist.for-ai.md:73-77listsPOST /plans/refreshas "Cron-only bulk refresh. Never call." — accurate, but cross-link the openapi entry (openapi.yaml:419-452) so it doesn't drift again.
Test coverage gaps
- No test pins the new single-wrapped shape of
/plans/{id}/results(would have caught the doc/code drift in Blocking #3). - No test verifies that pre-existing rows with
billingProviderSlug = nullbecome invisible when the UI requests?billingProviderSlug=daydream, or that the seed backfill rewrites them. - No test for
/plans/refresh/route.ts: missingCRON_SECRET→ 401, wrong secret → 401, valid secret → 200 (would have caught Blocking #2). - No test asserts the capability-catalog route only bypasses cache when
?refresh=1is present. - No test for
capability-catalog/route.tserror paths (Facade throws). The route currently lacks a try/catch aroundgetDashboardPipelineCatalog— an upstream failure surfaces as Next's default error page, not a JSON envelope. discovery-order.test.tsis solid for partitioning but doesn't assert all orchestrators appear exactly once when duplicates are trimmed mid-input (discovery-order.ts:65-74dedupe).bin/seed-discovery-plans.tshas no test forbackfillLegacyBillingProviderSlugs— given it mutates production rows at build time, an idempotency test (run twice → no change after first) is warranted.
Migration & rollout notes
DiscoveryPlan.billingProviderSlugis nullable + indexed (schema.prisma:2247-2272), soprisma db pushis non-destructive.- Backfill runs at deploy in
bin/seed-discovery-plans.ts:78-99vianpx tsx, scheduled inbin/vercel-build.sh:108-111. Runs afterprisma db push, so legacynullrows are immediately rewritten to'daydream'. If the seed crashes mid-loop, partial state is possible — but the next deploy is idempotent. - During the deploy window (schema push → seed → new pods take traffic), old pods still query without the new column and don't care about NULL slugs. No read path in this PR breaks on NULL after the backfill completes.
CreatePlanSchemadefaults to'daydream'and the UI hardcodes'daydream'(PlanCreatePage.tsx:25); post-deploy plans start with the correct slug.- For PR #338 (PymtHouse): the seed script will need to learn that
null→'daydream'is no longer always correct, otherwise future PymtHouse imports get silently coerced. - The
/plans/{id}/resultsenvelope change is breaking for external SDKs readingbody.data.data.capabilities. Surface in the PR description and release notes; consider a one-release deprecation window if external consumers exist.
Verdict
REQUEST_CHANGES — two failing tests in plans-visibility.test.ts, plain === CRON auth in /plans/refresh/route.ts, and the undocumented breaking change to the /plans/{id}/results envelope (with for-ai.md and openapi.yaml still describing the old shape) all need to land before merge. The Daydream-only pivot is the right call and resolves the structural concerns from the previous round.
- Introduced a check to throw an error if billingProviderSlug is an empty string, ensuring better error handling and input validation in the billing provider logic.
- Replaced inline CRON authorization checks with a dedicated `verifyCronAuth` function for improved code clarity and maintainability. - Updated affected routes to utilize the new authorization method, ensuring consistent authentication handling across the application. - Enhanced error logging for better debugging in case of authorization failures.
…d update data retrieval - Removed references to the 'pymthouse' billing provider, ensuring the capability catalog is now exclusively focused on 'daydream'. - Simplified the logic for filtering capabilities, allowing all requested capabilities to pass through unfiltered when manifestOnly is true. - Updated the data retrieval process to utilize the new `getDatasetCapabilities` function, enhancing the clarity and maintainability of the code. - Adjusted the seeding script to backfill billing provider slugs, ensuring legacy plans are correctly updated to 'daydream'.
seanhanca
left a comment
There was a problem hiding this comment.
Code Review — Critical & Major Findings
Reviewed the full diff across all 72 changed files. CI is green and the architecture is solid — Daydream-only scoping, DB-persisted dataset, lazy plan cache, and provider-restriction abstraction are all clean decisions. Below are the issues that should be addressed before merge.
Summary
| # | Severity | File | Issue |
|---|---|---|---|
| 1 | Critical | global-dataset.ts | writeGlobalDataset deletes all rows before inserting — concurrent reads during refresh window get empty results |
| 2 | Major | refresh.ts | evaluateAndCache / refreshAllPlans accept authToken, requestUrl, cookieHeader params that are never forwarded — dead parameters create a false contract |
| 3 | Major | plans/[id]/results/route.ts | errors.internal(message) on line 90 leaks raw error text from evaluateAndCache to the client on 500 |
| 4 | Major | global-refresh.ts | startupRefreshPromise cached-but-cleared-on-reject creates a duplicate refresh race |
| 5 | Major | cron-auth.ts | Length check before timing-safe compare leaks CRON_SECRET length |
Detailed inline comments follow.
…r dataset updates - Implemented a new approach for updating the global dataset that uses upsert operations followed by pruning stale rows, ensuring concurrent readers always see a populated table. - Introduced a dedicated `upsertDatasetBatch` function to handle batch upserts, improving performance and maintainability. - Updated related tests to reflect changes in the dataset writing logic and ensure proper validation of upsert behavior. - Removed legacy delete-then-insert logic to prevent empty table states during refresh operations.
|
All 5 findings from this review (plus the inline pymthouse dead-branch note) are addressed in the latest push. Summary:
Verification: |
|
@qianghan @seanhanca This is ready for re-review |
Rebased onto main (post #337 "Daydream discovery plans and cache refresh"). Re-applies PymtHouse integration on top of the canonical Daydream discovery work that landed in #337: - PymtHouse OIDC device-flow login + device approval routes (consent screen, CSRF, constant-time cookie verify, per-instance rate limiting) - Billing token/usage APIs and usage helpers via @pymthouse/builder-sdk - Developer API key expiry, billing key prefix masking, safe key serialization - developer-api plugin UI (usage, keys, discovery plan tooltip) + backend proxy - PymtHouse manifest gating restored across discovery: provider-restrictions, capability-catalog, python-gateway, plan refresh, global refresh - billingProviderSlug supports 'pymthouse' alongside 'daydream' (OR-null backfill for legacy rows); defaults to pymthouse - UMD plugin stylesheet lifecycle, Modal escape-to-close, plugin visibleToUsers Conflict resolution preserved #337's Daydream discovery baseline and layered the branch's PymtHouse functionality on top. The plugin frontend (identical to main except 5 files) takes the branch versions of the PymtHouse-aware pages. Co-authored-by: Cursor <cursoragent@cursor.com>
…#338) feat(pymthouse): add OIDC, billing APIs, and manifest-gated discovery Re-apply the PymtHouse integration on top of the canonical Daydream discovery baseline from #337. Add OIDC device-flow login and approval routes, billing token/usage APIs, developer API key handling, developer-api plugin UI/backend proxy, and PymtHouse manifest gating across discovery routes and refresh flows. Restore billingProviderSlug support for both pymthouse and daydream, including legacy OR-null handling and pymthouse defaults. Also harden sensitive API responses with safer error handling and no-store caching, align API key status serialization to uppercase, update jsdom-safe manifest tests, sync the lockfile, and clean up UMD stylesheet lifecycle handling.
Summary
Narrows the orchestrator-leaderboard discovery-plan work from #307 into a Daydream-only foundation for PR #337.
billingProviderSlugsupport onDiscoveryPlan, with PR feat(orchestrator-leaderboard): Daydream discovery plans and cache refresh #337 validating and defaulting todaydreamonly.Out of scope / stacked follow-up
PymtHouse-specific behavior has been removed from this PR and belongs in #338 (
feat/pymthouse-integration-followups), including manifest sync/gating, PymtHouse env/docs, provider UI toggle/copy, OIDC/billing APIs, and billing provider seeding.Test plan
npm --workspace @naap/web-next testnpm --workspace @naap/plugin-orchestrator-leaderboard-frontend run build@naap/plugin-build,@naap/plugin-sdk, and Vercel runtime safetyRelated: #307, #338
Summary by CodeRabbit
New Features
Bug Fixes
Improvements