refactor(dashboard): align BFF data layer with OpenAPI v1 contract and improve job feed by eliteprox · Pull Request #282 · livepeer/naap

eliteprox · 2026-04-28T19:43:52Z

Summary

This PR re-applies #270, which was rolled back because of an upstream data issue that has since been resolved.

This change refactors the dashboard's BFF (Backend for Frontend) data layer to align with the upstream OpenAPI v1 contract, consolidates streaming and requests model data, improves orchestrator/pipeline data handling, and hardens job feed polling with cache busting and recovery retry logic.

Changes

API / Data Layer

Merged streaming and requests model data in the network data layer for consistency with the upstream OpenAPI contract
Updated developer network models endpoint to combine streaming and requests models into a single response shape
Enhanced pricing resolver to accommodate the new OpenAPI v1 structure and ensure accurate pricing data
Introduced utility functions for dashboard window formatting and timeout management
Improved dashboard hooks to handle the new merged data structures

Orchestrator & KPI Resolvers

Added a utility function to determine the effective upstream window for orchestrators based on the requested period
Updated orchestrator cache key to use the effective window for improved data accuracy
Simplified KPI resolver by removing unnecessary net orchestrator data fetching — now directly returns parsed KPI data
Renamed orchestratorsOnline → orchestratorsObserved to better reflect actual semantics
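
The effective-window idea above can be sketched roughly as follows. This is an illustration, not the PR's code: the period format (`24h`, `7d`), the 24-hour cap (taken from the review walkthrough's mention of a 24h window cap), and every helper name here are assumptions.

```typescript
// Hypothetical sketch: clamp a requested period to the window the upstream
// orchestrator endpoint is assumed to support, then key the cache on the result.
const ORCHESTRATOR_MAX_WINDOW_HOURS = 24; // assumed cap

// Parses strings such as "6h" or "7d" into hours; returns null when unparseable.
function parsePeriodHours(period: string): number | null {
  const match = /^(\d+)([hd])$/.exec(period.trim().toLowerCase());
  if (!match) return null;
  const value = Number(match[1]);
  return match[2] === 'd' ? value * 24 : value;
}

function effectiveOrchestratorWindow(period: string): string {
  const hours = parsePeriodHours(period);
  // Unparseable or non-positive input falls back to the maximum window.
  if (hours === null || hours <= 0) return `${ORCHESTRATOR_MAX_WINDOW_HOURS}h`;
  return `${Math.min(hours, ORCHESTRATOR_MAX_WINDOW_HOURS)}h`;
}

// Keying on the effective window means "7d" and "24h" share one cache entry.
function orchestratorCacheKey(period: string): string {
  return `dashboard:orchestrators:${effectiveOrchestratorWindow(period)}`;
}
```

Keying the cache on the clamped window rather than the raw period avoids storing identical upstream responses under several keys.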

Pipeline & Timeframe Handling

Introduced OVERVIEW_TIMEFRAME_MAX_HOURS constant to cap timeframe options consistently across resolvers
Updated KPI and pipeline resolvers to use the new normalization function for timeframe parsing
Changed default sort in PipelinesCard from model to gpus to surface the busiest models first
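
A minimal sketch of the timeframe cap, assuming a 48-hour maximum (the value reported in the CodeRabbit summary) and a 24-hour default. The function signatures and the default are assumptions for illustration; the PR's actual helpers may differ.

```typescript
const OVERVIEW_TIMEFRAME_MAX_HOURS = 48; // 48h per the review summary
const DEFAULT_TIMEFRAME_HOURS = 24; // assumed default, not confirmed by the PR

// Accepts a raw query value and returns a whole number of hours within the cap.
function normalizeTimeframeHours(input: string | number | null | undefined): number {
  const hours =
    typeof input === 'number' ? input : Number.parseInt(String(input ?? ''), 10);
  if (!Number.isFinite(hours) || hours <= 0) return DEFAULT_TIMEFRAME_HOURS;
  return Math.min(Math.floor(hours), OVERVIEW_TIMEFRAME_MAX_HOURS);
}

// Formats the normalized hours as an upstream window string (format assumed).
function formatDashboardWindow(hours: number): string {
  return `${hours}h`;
}
```

Routing every resolver through one normalization function keeps the cap consistent: a request for 168 hours and a request for 48 hours produce the same upstream window.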

Job Feed

Added bustCache option to appendJobFeedPollQuery for explicit cache invalidation
Introduced JOB_FEED_RECOVERY_RETRY_DELAYS for staged recovery backoff after upstream errors
Enhanced fetchJobFeed to accept a base URL and effective polling interval for better flexibility
Improved error logging with labels on failed ingest requests in the developer API server
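
The staged recovery backoff can be pictured with the sketch below. The delay values and the two helper functions are hypothetical; only the constant name and the "bust cache while the failure streak is non-zero" behavior come from this PR's description and review walkthrough.

```typescript
// Assumed schedule in milliseconds; the real values are not stated in this PR text.
const JOB_FEED_RECOVERY_RETRY_DELAYS = [1_000, 3_000, 10_000] as const;

// Returns the delay before the next poll given the count of consecutive failures.
function nextPollDelayMs(failureStreak: number, basePollMs: number): number {
  if (failureStreak <= 0) return basePollMs; // healthy: normal cadence
  // Later failures reuse the last (longest) stage.
  const idx = Math.min(failureStreak, JOB_FEED_RECOVERY_RETRY_DELAYS.length) - 1;
  return JOB_FEED_RECOVERY_RETRY_DELAYS[idx];
}

// Cache busting is requested only while recovering from failures.
function shouldBustCache(failureStreak: number): boolean {
  return failureStreak > 0;
}
```

A healthy response would reset the streak to zero, which restores the base polling interval and stops cache busting.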

UI / Display Fixes

Fixed decimal value display formatting
Fixed developer model pricing display
JobFeedCard now conditionally renders input FPS only when a non-null value is present
Added documentation clarifying that effectiveSuccessRate, noSwapRatio, slaScore, and successRatio are scaled API response values in the orchestrators resolver

Cleanup

Removed the pymthouse minimal design document

Testing

Dashboard KPIs render correctly across all supported timeframe windows
Orchestrator and pipeline data resolves without regression
Job feed recovers gracefully after upstream errors using the retry delay schedule
Developer model pricing and decimal formatting display correctly

Summary by CodeRabbit

New Features
- Dashboard timeframe selection now limited to 48 hours maximum.
- Job feed polling now features adaptive retry recovery with automatic cache busting on failures.
- Network model pricing data now calculated and displayed.
Bug Fixes
- Orchestrator metric display updated for accuracy.
- Success rate precision increased to 1 decimal place.
- Pipeline FPS data handling improved with better fallback logic.
- Job feed fault tolerance enhanced with configurable recovery delays.
Documentation
- Dashboard schema and examples updated to reflect metric changes.
- API documentation clarified for data source alignment.

…PI compliance
- Enhanced BFF warm route documentation to reference the upstream OpenAPI contract.
- Updated developer network models endpoint to merge streaming and requests models.
- Refined NAAP API warm route comments to clarify data sources.
- Adjusted performance by model resolver to utilize a single cache key for streaming models.
- Improved dashboard hooks to handle new data structures and ensure compatibility with requests.
- Introduced new utility functions for dashboard window formatting and timeout management.
- Merged streaming and requests model data in the network data layer for better consistency.
- Updated orchestrator and pipeline resolvers to reflect changes in data fetching and merging logic.
- Enhanced pricing resolver to accommodate new OpenAPI v1 structure and ensure accurate pricing data representation.

- Introduced a new utility function to determine the effective upstream window for orchestrators based on the requested period.
- Updated the orchestrator cache key to utilize the effective window for improved data accuracy.
- Simplified the KPI resolver by removing unnecessary net orchestrator data fetching and directly returning parsed KPI data.
- Enhanced documentation to clarify the behavior of the orchestrator endpoint and its limitations on timeframes.

…ed performance
- Changed default sorting in PipelinesCard from 'model' to 'gpus' to prioritize busy models.
- Introduced OVERVIEW_TIMEFRAME_MAX_HOURS constant to cap timeframe options and ensure consistent handling across various resolvers.
- Updated KPI and pipeline resolvers to utilize the new normalization function for timeframe parsing, enhancing data accuracy and performance.

…ntation

…sponse handling
- Updated the JobFeedCard component to conditionally display input FPS, ensuring proper formatting when values are null.
- Clarified documentation in orchestrators resolver regarding the scaling of API response values for effectiveSuccessRate, noSwapRatio, slaScore, and successRatio.
- Enhanced error handling in the developer API server by adding labels to the ingest function for better logging of failed requests.

…oved error handling
- Added a `bustCache` option to `appendJobFeedPollQuery` for cache invalidation.
- Updated `fetchJobFeed` to accept options for cache busting and improved error handling.
- Introduced `JOB_FEED_RECOVERY_RETRY_DELAYS` for better retry logic on upstream errors.
- Enhanced logging to reflect the actual request URL during fetch operations.

- Enhanced `fetchJobFeed` to accept a base URL and effective polling interval, improving flexibility.
- Updated retry logic with `JOB_FEED_RECOVERY_RETRY_DELAYS` for better handling of transient errors.
- Refactored job feed polling to utilize the new fetch function, ensuring consistent cache busting behavior.
- Added documentation for recovery backoff strategy after failed polls.

vercel · 2026-04-28T19:43:58Z

Project	Deployment	Actions	Updated (UTC)
naap-platform	Ready	Preview, Comment	Apr 28, 2026 8:06pm

github-actions · 2026-04-28T19:44:05Z

⚠️ This PR is very large (1797 lines changed). Please split it into smaller, focused PRs if possible.

coderabbitai · 2026-04-28T19:44:07Z

Warning

Rate limit exceeded

@eliteprox has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 41 minutes and 8 seconds before requesting another review.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 83cc6c55-cef4-4255-abd5-44e19799e973

📥 Commits

Reviewing files that changed from the base of the PR and between e22e398 and 6bbdc3b.

📒 Files selected for processing (11)

apps/web-next/src/hooks/useJobFeedStream.ts
apps/web-next/src/lib/facade/dashboard-window.ts
apps/web-next/src/lib/facade/network-data.ts
apps/web-next/src/lib/facade/resolvers/net-capacity.ts
apps/web-next/src/lib/facade/resolvers/net-orchestrators.ts
apps/web-next/src/lib/facade/resolvers/perf-by-model.ts
apps/web-next/src/lib/facade/resolvers/pricing.ts
packages/plugin-sdk/src/contracts/__tests__/dashboard.test.ts
packages/plugin-sdk/src/contracts/createDashboardProvider.ts
packages/plugin-sdk/src/contracts/dashboard.ts
plugins/developer-api/backend/src/server.ts

📝 Walkthrough

This PR migrates dashboard data aggregation from legacy NAAP endpoints to OpenAPI v1 streaming/models and requests/models, renames orchestratorsOnline to orchestratorsObserved across schemas and types, introduces upstream window formatting and cache-busting utilities, and refactors facade resolvers to merge streaming/requests data while adding request-mode job statistics.

Changes

Cohort / File(s)	Summary
Dashboard KPI/Orchestrator field rename `apps/web-next/src/app/(dashboard)/dashboard/page.tsx`, `packages/types/src/index.ts`, `packages/plugin-sdk/src/contracts/dashboard.ts`, `packages/plugin-sdk/src/contracts/__tests__/dashboard.test.ts`, `plugins/dashboard-data-provider/frontend/src/__tests__/provider.test.ts`, `plugins/dashboard-data-provider/frontend/src/provider.ts`, `apps/web-next/src/content/docs/guides/dashboard-data-provider.mdx`	Updated GraphQL query and contract definitions to request `orchestratorsObserved` instead of `orchestratorsOnline`; updated test expectations and documentation to match.
Contract and type system expansion `packages/plugin-sdk/src/contracts/dashboard.ts`, `packages/plugin-sdk/src/contracts/index.ts`, `packages/plugin-sdk/src/index.ts`	Introduced `DashboardKPIWithRequests`, `DashboardPipelinesWithRequests`, and job breakdown types (`DashboardJobsStats`, `DashboardJobsOverview`, `DashboardJobsByPipelineRow`, `DashboardJobsByCapabilityRow`); re-exported new types across SDK modules.
Facade and resolver refactoring `apps/web-next/src/lib/facade/index.ts`, `apps/web-next/src/lib/facade/resolvers/kpi.ts`, `apps/web-next/src/lib/facade/resolvers/orchestrators.ts`, `apps/web-next/src/lib/facade/resolvers/pipelines.ts`, `apps/web-next/src/lib/facade/stubs.ts`	Updated `getDashboardKPI` and `getDashboardPipelines` to return enriched types with request metadata; refactored `resolveKPI` to use single unified endpoint; changed `resolveOrchestrators` to apply 24h window cap and merge service URIs; updated stubs to use `orchestratorsObserved`.
Upstream window and cache utilities `apps/web-next/src/lib/dashboard/overview-timeframe.ts`, `apps/web-next/src/lib/facade/dashboard-window.ts`, `apps/web-next/src/lib/facade/resolvers/gpu-capacity.ts`	Introduced `OVERVIEW_TIMEFRAME_MAX_HOURS` constant (48h) and removed 72h/7d options; added `formatDashboardWindow` and `dashboardUpstreamTimeoutMs` helpers; updated resolvers to use new window formatting and timeframe clamping.
Upstream response parsing `apps/web-next/src/lib/facade/upstream-parse.ts`	New module providing `parseDashboardKpiBody`, `parseDashboardKpiWithRequests`, and `parseDashboardPipelinesBody` to safely normalize streaming/requests combined payloads.
Network model aggregation `apps/web-next/src/lib/facade/network-data.ts`, `apps/web-next/src/lib/facade/resolvers/net-capacity.ts`, `apps/web-next/src/lib/facade/resolvers/net-orchestrators.ts`, `plugins/developer-api/backend/src/server.ts`, `packages/plugin-sdk/src/types/network-model.ts`	Migrated from single `/net/models` endpoint to merged `streaming/models` + `requests/models` approach; updated `getRawNetModels` to fetch, merge, and enrich with pricing; refactored `resolveNetOrchestratorData` to aggregate orchestrators from streaming and requests endpoints with deduplication and URI merging; updated documentation to reflect new data source.
FPS and performance caching `apps/web-next/src/lib/facade/resolvers/perf-by-model.ts`, `apps/web-next/src/app/api/v1/network/perf-by-model/route.ts`	Changed `resolvePerfByModel` to derive FPS from `streaming/models` instead of `perf/by-model` endpoint; unified cache key across all query ranges; updated route handler to use shared `perf-by-model:streaming-models` cache key.
Pricing data aggregation `apps/web-next/src/lib/facade/resolvers/pricing.ts`	Extended resolver to support both legacy `priceAvgWeiPerUnit` and OpenAPI v1 per-orchestrator `priceWeiPerUnit` formats; added per-model min/max/avg computation with warm-orchestrator-aware capacity selection.
Dashboard UI and component updates `apps/web-next/src/components/dashboard/overview-content.tsx`	Updated KPI cards to display success rate at 1 decimal precision using `orchestratorsObserved`; removed FPS fallback to `modelUsage?.avgFps`, treating it as available only from `modelFpsByPipelineModel`; added `headerInfo` prop to `PipelineTH` for optional info icon with provenance; changed default sort to GPU descending; updated job feed tooltips and FPS cell formatting to show input/output FPS to 1 decimal place.
Job feed polling with adaptive recovery `apps/web-next/src/hooks/useJobFeedStream.ts`	Introduced cache-busting parameter and failure streak tracking; added failure-dependent backoff using `JOB_FEED_RECOVERY_RETRY_DELAYS`; refactored polling loop to enable cache busting on non-zero failure streak and reset on healthy response.
Public dashboard hook updates `apps/web-next/src/hooks/usePublicDashboard.ts`	Updated `kpi` field type to `DashboardKPIWithRequests`; added normalization logic to handle both array and wrapped responses for pipelines endpoint, selecting `streaming` or falling back to direct array.
BFF route documentation and cache strategies `apps/web-next/src/app/api/internal/bff-warm/route.ts`, `apps/web-next/src/app/api/v1/dashboard/orchestrators/route.ts`, `apps/web-next/src/app/api/v1/developer/network-models/route.ts`, `apps/web-next/src/app/api/v1/naap-api/warm/route.ts`	Updated route documentation comments to reference NAAP OpenAPI v1 specs; changed orchestrators route cache key derivation from raw `period` to normalized `orchestratorUpstreamWindowFromPeriod(period)`.
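
The array-or-wrapped normalization described for `usePublicDashboard.ts` in the table above might look like the following sketch. The row shape and the function name are assumptions; only the behavior (prefer `streaming`, otherwise accept a direct array) comes from the summary.

```typescript
// Hypothetical row shape, for illustration only.
type PipelineRow = { pipeline: string; model: string };

// Accepts either a bare array or a wrapped { streaming, requests } payload.
function normalizePipelinesBody(body: unknown): PipelineRow[] {
  if (Array.isArray(body)) return body as PipelineRow[];
  if (body !== null && typeof body === 'object') {
    const streaming = (body as { streaming?: unknown }).streaming;
    if (Array.isArray(streaming)) return streaming as PipelineRow[];
  }
  // Anything else is treated as "no data" rather than an error.
  return [];
}
```

Handling both shapes lets the hook keep working whether or not the endpoint wraps its response.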

Sequence Diagrams

sequenceDiagram
    participant Dashboard as Dashboard<br/>(Client)
    participant BFF as BFF Facade<br/>(Server)
    participant NAAP as NAAP API<br/>(Upstream)

    Dashboard->>BFF: GET /api/v1/dashboard/kpi<br/>(timeframe)
    BFF->>BFF: normalizeTimeframeHours()<br/>formatDashboardWindow()
    BFF->>NAAP: GET /dashboard/kpi<br/>(window, timeoutMs)
    NAAP-->>BFF: { streaming: {...},<br/>requests?: {...} }
    BFF->>BFF: parseDashboardKpiWithRequests()<br/>(merge KPI + requests)
    BFF-->>Dashboard: DashboardKPIWithRequests<br/>(orchestratorsObserved,<br/>requests metadata)

sequenceDiagram
    participant Dashboard as Dashboard<br/>(Client)
    participant BFF as BFF Facade<br/>(Server)
    participant Cache as SWR/Redis<br/>(Cache)
    participant StreamAPI as NAAP<br/>Streaming API
    participant ReqAPI as NAAP<br/>Requests API

    Dashboard->>BFF: GET /api/v1/facade/net/orchestrators
    BFF->>Cache: Check (streaming+requests)<br/>aggregated cache
    alt Cache Hit
        Cache-->>BFF: cached orchestrator data
    else Cache Miss
        par
            BFF->>StreamAPI: GET /v1/streaming/orchestrators
        and
            BFF->>ReqAPI: GET /v1/requests/orchestrators
        end
        StreamAPI-->>BFF: rows (models, gpu_count)
        ReqAPI-->>BFF: rows (capabilities, last_seen)
        BFF->>BFF: ingestRow() aggregate<br/>dedupe by address
        BFF->>BFF: merge inventory URIs<br/>with dashboard serviceUri
        BFF->>Cache: store merged result
        Cache-->>BFF: cached data
    end
    BFF-->>Dashboard: NetOrchestratorData

sequenceDiagram
    participant Dashboard as Dashboard<br/>(Client)
    participant BFF as BFF Facade<br/>(Server)
    participant Cache as SWR/Redis<br/>(Cache)
    participant StreamAPI as NAAP<br/>Streaming API
    participant ReqAPI as NAAP<br/>Requests API
    participant PricingAPI as Pricing<br/>API

    Dashboard->>BFF: GET /api/v1/facade/network/models
    BFF->>Cache: Check (streaming+requests)<br/>merged cache
    alt Cache Hit
        Cache-->>BFF: merged NetworkModel[]
    else Cache Miss
        par
            BFF->>StreamAPI: GET /v1/streaming/models
        and
            BFF->>ReqAPI: GET /v1/requests/models
        end
        StreamAPI-->>BFF: rows (avg_fps, WarmOrchCount)
        ReqAPI-->>BFF: rows (TotalCapacity)
        BFF->>BFF: merge by pipeline:model<br/>sum counts/capacity
        BFF->>PricingAPI: GET /v1/dashboard/pricing
        PricingAPI-->>BFF: pricing rows<br/>(legacy or per-orch)
        BFF->>BFF: aggregate pricing<br/>compute min/max/avg
        BFF->>BFF: enrich NetworkModel<br/>with pricing
        BFF->>Cache: store merged+priced
        Cache-->>BFF: cached data
    end
    BFF-->>Dashboard: NetworkModel[]<br/>(with pricing)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

The diff involves substantial, multi-pattern changes across facades, resolvers, and API routes: orchestrator/KPI field renames, migration from single-endpoint to merged streaming/requests aggregation patterns, introduction of new cache utilities and parsing abstractions, UI component updates with new props, and failure-recovery logic. While many changes follow consistent patterns (e.g., field renames across contracts), the resolver logic is heterogeneous—each resolver addresses merging, deduplication, or data transformation differently—requiring separate reasoning for pricing, orchestrators, network models, and performance caching. Type/export expansion adds clarity but requires tracing public API surface changes across multiple packages.

Possibly related PRs

feat(web-next): align NAAP dashboard and BFF with OpenAPI v1 (streaming + requests) #270: Implements the same code-level changes as this PR (orchestratorsOnline → orchestratorsObserved rename, OpenAPI v1 streaming+requests alignment, cache/type/stub updates).
revert: feat(web-next): align NAAP dashboard and BFF with OpenAPI v1 (streaming + requests)" #281: Directly reverts this PR's OpenAPI v1 changes, including field renames, facade/resolver rewrites, upstream parsing, stubs, and related type exports.
feat(dashboard): integrate NAAP metrics APIs with BFF caching layer #224: Modifies the same BFF/facade surface and resolver implementations (KPI field rename, cache-key logic, orchestrator/performance/pricing resolvers).

Suggested labels

scope/shell, scope/sdk, scope/packages, plugin/developer-api, size/XL

Suggested reviewers

seanhanca

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 16.18% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main changes: refactoring the dashboard BFF data layer to align with OpenAPI v1 contract and improving job feed polling, which are the primary focuses across all file changes.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

apps/web-next/src/hooks/useJobFeedStream.ts (1)

90-93: ⚠️ Potential issue | 🟡 Minor

Fallback query builder drops cache-bust flag.

When URL parsing fails, refresh=1 is not appended even if bustCache is true, so recovery polling may not bypass stale caches.

💡 Proposed fix

   } catch {
     const sep = fetchUrl.includes('?') ? '&' : '?';
-    return `${fetchUrl}${sep}pollMs=${encodeURIComponent(String(ms))}`;
+    const qs = [`pollMs=${encodeURIComponent(String(ms))}`];
+    if (bustCache) qs.push('refresh=1');
+    return `${fetchUrl}${sep}${qs.join('&')}`;
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@apps/web-next/src/hooks/useJobFeedStream.ts` around lines 90 - 93, The
fallback URL builder in useJobFeedStream.ts's catch block currently appends only
pollMs and loses the bustCache flag; update the catch branch that builds the
fallback string using fetchUrl, sep, and ms to also append refresh=1 when the
bustCache boolean is true (e.g., include "&refresh=1" or "?refresh=1" as
appropriate), ensuring both pollMs and refresh are URL-encoded via
encodeURIComponent and joined using the existing sep logic so recovery polling
bypasses cache when requested.
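
For context, a self-contained sketch of a query builder with this fix applied could look like the following. The `try` branch is an assumption about how the real `appendJobFeedPollQuery` parses URLs; the `catch` branch mirrors the proposed fix.

```typescript
// Sketch only: the real function's signature and primary branch may differ.
function appendJobFeedPollQuery(fetchUrl: string, ms: number, bustCache = false): string {
  try {
    const url = new URL(fetchUrl); // throws for relative URLs such as "/api/jobs"
    url.searchParams.set('pollMs', String(ms));
    if (bustCache) url.searchParams.set('refresh', '1');
    return url.toString();
  } catch {
    // Fallback for unparseable URLs: keep both pollMs and the cache-bust flag.
    const sep = fetchUrl.includes('?') ? '&' : '?';
    const qs = [`pollMs=${encodeURIComponent(String(ms))}`];
    if (bustCache) qs.push('refresh=1');
    return `${fetchUrl}${sep}${qs.join('&')}`;
  }
}
```

With this shape, a relative URL takes the fallback path and still carries `refresh=1` during recovery polling.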

plugins/developer-api/backend/src/server.ts (1)

355-374: ⚠️ Potential issue | 🟠 Major

limit is not enforced on merged upstream rows.

When upstream returns more than limit, response still includes all merged rows; only catalog backfill is capped. This violates the endpoint’s limit behavior.

💡 Proposed fix

   merged.sort(
     (a, b) => a.Pipeline.localeCompare(b.Pipeline) || a.Model.localeCompare(b.Model),
   );
-  return { merged, partial };
+  const bounded = limitIsAll ? merged : merged.slice(0, limit ?? merged.length);
+  return { merged: bounded, partial };

Also applies to: 400-404, 430-437

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@plugins/developer-api/backend/src/server.ts` around lines 355 - 374, The
merged array is not being capped by the endpoint limit so upstream rows can
exceed `limit`; after merging upstream rows into `merged` (the array built from
`modelRows` and where keys are tracked by `seen` using `netModelRowKey`),
enforce the limit by truncating `merged` to `limit` when `limitIsAll` is false
(e.g., set merged = merged.slice(0, limit ?? merged.length)); recalc
`catalogSlotsRemaining` from the post-truncation length (or compute it before
backfill using Math.max(0, (limit ?? 0) - merged.length) but ensure merged has
been truncated), and apply the same truncation logic in the other similar merge
blocks referenced (the ones around the other fetch/merge sections).
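
As a standalone illustration of the truncation suggested above, the sketch below sorts and then bounds a merged list. The row interface and the helper name are hypothetical; `limit` and `limitIsAll` follow the names used in the review comment.

```typescript
// Minimal row shape needed for the sort; the real rows carry more fields.
interface NetModelRow {
  Pipeline: string;
  Model: string;
}

// Sorts by pipeline then model, and caps the result unless limitIsAll is set.
function sortAndBound<T extends NetModelRow>(
  rows: T[],
  limit: number | undefined,
  limitIsAll: boolean,
): T[] {
  const sorted = [...rows].sort(
    (a, b) => a.Pipeline.localeCompare(b.Pipeline) || a.Model.localeCompare(b.Model),
  );
  return limitIsAll ? sorted : sorted.slice(0, limit ?? sorted.length);
}
```

Sorting before slicing keeps the bounded result deterministic for a given set of upstream rows.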

🧹 Nitpick comments (2)

apps/web-next/src/lib/facade/dashboard-window.ts (1)

28-31: Timeout helper can be simplified.

hours <= 72 and > 72 return the same value, so this branch is redundant and can be collapsed for clarity.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@apps/web-next/src/lib/facade/dashboard-window.ts` around lines 28 - 31, The
conditional in the timeout helper is redundant: both branches of "if (hours <=
72)" return 55_000; remove the branch and replace it with a single return of
55_000 (i.e., eliminate the if (...) { return 55_000; } return 55_000; pattern)
so the helper simply returns 55_000 unconditionally.

apps/web-next/src/lib/facade/resolvers/net-capacity.ts (1)

12-12: Don’t swallow net-model fetch failures silently.

Returning [] here hides upstream outages and makes capacity collapse look like valid data. Add a warning log (or metric) before fallback.

♻️ Suggested patch

-    const models = await getRawNetModels().catch(() => []);
+    const models = await getRawNetModels().catch((err) => {
+      console.warn('[facade/net-capacity] getRawNetModels failed; returning empty capacity map:', err);
+      return [];
+    });

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@apps/web-next/src/lib/facade/resolvers/net-capacity.ts` at line 12, The
current await getRawNetModels().catch(() => []) swallows errors; change the
catch to log a warning with the caught error before falling back to [] so
upstream outages are visible — e.g. replace .catch(() => []) with .catch((err)
=> { /* log a warning including err and context (use existing logger like
processLogger or console) */; return []; }) referencing the getRawNetModels call
and the models variable so the log includes which fetch failed.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/web-next/src/lib/facade/network-data.ts`:
- Around line 52-55: The merge logic currently adds incoming values into an
existing entry (existing.WarmOrchCount += warm; existing.TotalCapacity +=
slots), which double-counts when the same (pipeline, model) key appears from
multiple sources; update the merge in the network-data.ts routine that handles
existing entries so it uses max semantics instead of summation: set
existing.WarmOrchCount = Math.max(existing.WarmOrchCount, warm) and
existing.TotalCapacity = Math.max(existing.TotalCapacity, slots) (apply the same
max logic wherever the code merges duplicate (pipeline, model) rows).
- Around line 60-70: The loops over stream and req only read snake_case fields
(r.pipeline, r.model, r.warm_orch_count, r.gpu_slots) so PascalCase payloads are
ignored; update the parsing to accept both snake_case and PascalCase by reading
fallback properties (e.g., pipeline = (r.pipeline ?? r.Pipeline)?.trim() ?? '',
model = (r.model ?? r.Model)?.trim() ?? '', and use Number(r.warm_orch_count ??
r.WarmOrchCount ?? 0) and Number(r.gpu_slots ?? r.GpuSlots ?? 0)) before calling
add; apply the same change in both for..of loops so both stream and req entries
support either casing.
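
Both suggestions for `network-data.ts` can be combined in one sketch: accept either casing, and merge duplicates with max semantics. This reflects the reviewer's proposal rather than confirmed merged code, and the interfaces and function name below are assumptions. Whether max or sum is correct depends on what the upstream counts actually represent.

```typescript
// Hypothetical input shape allowing snake_case or PascalCase fields.
interface RawModelRow {
  pipeline?: string;
  Pipeline?: string;
  model?: string;
  Model?: string;
  warm_orch_count?: number;
  WarmOrchCount?: number;
  gpu_slots?: number;
  GpuSlots?: number;
}

interface MergedModel {
  Pipeline: string;
  Model: string;
  WarmOrchCount: number;
  TotalCapacity: number;
}

function mergeModelRows(stream: RawModelRow[], req: RawModelRow[]): MergedModel[] {
  const byKey = new Map<string, MergedModel>();
  for (const r of [...stream, ...req]) {
    const pipeline = (r.pipeline ?? r.Pipeline ?? '').trim();
    const model = (r.model ?? r.Model ?? '').trim();
    if (!pipeline || !model) continue; // skip rows without a usable key
    const warm = Number(r.warm_orch_count ?? r.WarmOrchCount ?? 0);
    const slots = Number(r.gpu_slots ?? r.GpuSlots ?? 0);
    const key = `${pipeline}:${model}`;
    const existing = byKey.get(key);
    if (existing) {
      // Max semantics: the same key seen twice does not double-count.
      existing.WarmOrchCount = Math.max(existing.WarmOrchCount, warm);
      existing.TotalCapacity = Math.max(existing.TotalCapacity, slots);
    } else {
      byKey.set(key, { Pipeline: pipeline, Model: model, WarmOrchCount: warm, TotalCapacity: slots });
    }
  }
  return Array.from(byKey.values());
}
```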

In `@apps/web-next/src/lib/facade/resolvers/net-orchestrators.ts`:
- Around line 157-171: The fetch functions fetchStreamingOrchestrators and
fetchRequestsOrchestrators currently call the offset/limit endpoints without a
limit, risking truncated results; update both naapGet calls to include a query
parameter setting limit=1000 (e.g., pass { limit: 1000 } or include it in the
request options/query) so each function explicitly requests up to 1000 rows and
prevents silent undercounting.

In `@apps/web-next/src/lib/facade/resolvers/perf-by-model.ts`:
- Around line 32-39: The current code swallows all naapGet errors by .catch(()
=> []), which causes transport failures to be cached as empty FPS data; change
the rawRows load so that you do not catch/rewrite network errors—let exceptions
from naapGet propagate (so the route can return 503), but still coerce null or
undefined responses into an empty array (i.e., after awaiting
naapGet<StreamingModelRow[] | null | undefined>(...) set rawRows = result ?? []
rather than catching rejections). Keep the same request options and preserve the
'streaming/models' call and the StreamingModelRow type.
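
The distinction drawn here, coercing empty responses but not swallowing rejections, can be shown with a small sketch. The row type fields and the injected `fetchRows` parameter are assumptions standing in for the real `naapGet` call, whose signature is not shown in this PR text.

```typescript
// Assumed row shape; only the type name comes from the review comment.
type StreamingModelRow = { pipeline: string; model: string; avg_fps: number };

async function loadStreamingRows(
  fetchRows: () => Promise<StreamingModelRow[] | null | undefined>,
): Promise<StreamingModelRow[]> {
  // No .catch here: a rejected fetch propagates so the caller can return 503
  // instead of caching an empty result.
  const result = await fetchRows();
  // A successful but empty (null/undefined) body is still coerced to [].
  return result ?? [];
}
```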

In `@apps/web-next/src/lib/facade/resolvers/pricing.ts`:
- Around line 159-166: The aggregation loop is creating entries for rows whose
pipeline or model are blank/whitespace; before computing pricingKey(r.pipeline,
r.model) and updating map, trim r.pipeline and r.model and skip the row if
either trimmed value is an empty string (e.g., if (!trimmedPipeline ||
!trimmedModel) continue;). Apply the same pre-check in the other aggregation
block that creates slots (the second occurrence around the code handling minWei
at lines ~205-208) so no per-orchestrator rows with blank IDs are inserted into
map or emitted.
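
A sketch of per-model price aggregation with the blank-key guard follows. The row shape, the key format, and the use of `number` for wei values are simplifying assumptions; real wei amounts can exceed the safe integer range and may need `bigint` or string handling.

```typescript
// Hypothetical per-orchestrator pricing row.
interface PricingRow {
  pipeline: string;
  model: string;
  priceWeiPerUnit: number;
}

interface PriceStats {
  min: number;
  max: number;
  avg: number;
  count: number;
}

function aggregatePricing(rows: PricingRow[]): Map<string, PriceStats> {
  const stats = new Map<string, PriceStats>();
  for (const r of rows) {
    const pipeline = r.pipeline.trim();
    const model = r.model.trim();
    if (!pipeline || !model) continue; // skip rows with blank identifiers
    const key = `${pipeline}:${model}`;
    const price = r.priceWeiPerUnit;
    const s = stats.get(key);
    if (!s) {
      stats.set(key, { min: price, max: price, avg: price, count: 1 });
      continue;
    }
    s.min = Math.min(s.min, price);
    s.max = Math.max(s.max, price);
    s.avg = (s.avg * s.count + price) / (s.count + 1); // running mean
    s.count += 1;
  }
  return stats;
}
```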

In `@packages/plugin-sdk/src/contracts/dashboard.ts`:
- Around line 504-509: The kpi resolver signature in the Dashboard contract
declares pipeline and model_id parameters that are never forwarded; either
remove pipeline and model_id from the kpi type declaration or wire them
end-to-end: update the GraphQL schema to accept pipeline and model_id, update
createDashboardProvider (the provider factory that currently extracts only
window and timeframe) to also extract and forward pipeline and model_id into the
resolver call, and ensure any consumer code that implements kpi handles those
additional args; reference the kpi type in
packages/plugin-sdk/src/contracts/dashboard.ts and the createDashboardProvider
function to apply the change consistently.

In `@plugins/developer-api/backend/src/server.ts`:
- Around line 258-275: Promise.all currently causes total failure if either
fetch rejects or res.json() throws, breaking the intended partial-merge
behavior; change the fetch logic to use a safe pattern (e.g., Promise.allSettled
or a safeFetch wrapper) so failures return a sentinel (null or rejected state)
instead of rejecting the whole Promise, and update ingest (the
ingest(res,label,source) function) to guard the JSON parse in a try/catch (log a
warning on parse error and return false) so transport errors and JSON parse
failures are treated as per-source failures rather than aborting the whole
merge; ensure code that currently reads sRes and rRes handles the sentinel/null
results and still calls ingest only for valid Response objects.
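
The per-source failure handling suggested for `server.ts` can be sketched with `Promise.allSettled`. The function name, the `sources` map, and the returned shape are hypothetical; the real code works with `Response` objects and an `ingest` function not reproduced here.

```typescript
// Runs every source loader and keeps rows from those that succeed.
async function fetchAllSources<T>(
  sources: Record<string, () => Promise<T[]>>,
): Promise<{ rows: T[]; failed: string[] }> {
  const labels = Object.keys(sources);
  // allSettled never rejects, so one failing source cannot abort the merge.
  const settled = await Promise.allSettled(labels.map((label) => sources[label]()));
  const rows: T[] = [];
  const failed: string[] = [];
  settled.forEach((result, i) => {
    if (result.status === 'fulfilled') {
      rows.push(...result.value);
    } else {
      failed.push(labels[i]); // caller can log this label and mark the response partial
    }
  });
  return { rows, failed };
}
```

A caller would treat a non-empty `failed` list as a partial result, which matches the partial-merge behavior the comment asks to preserve.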

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 93f0e15a-ba73-4410-ad65-dfdebdd5543b

📥 Commits

Reviewing files that changed from the base of the PR and between 1530887 and e22e398.

📒 Files selected for processing (33)

apps/web-next/src/app/(dashboard)/dashboard/page.tsx
apps/web-next/src/app/api/internal/bff-warm/route.ts
apps/web-next/src/app/api/v1/dashboard/orchestrators/route.ts
apps/web-next/src/app/api/v1/developer/network-models/route.ts
apps/web-next/src/app/api/v1/naap-api/warm/route.ts
apps/web-next/src/app/api/v1/network/perf-by-model/route.ts
apps/web-next/src/components/dashboard/overview-content.tsx
apps/web-next/src/content/docs/guides/dashboard-data-provider.mdx
apps/web-next/src/hooks/useJobFeedStream.ts
apps/web-next/src/hooks/usePublicDashboard.ts
apps/web-next/src/lib/dashboard/overview-timeframe.ts
apps/web-next/src/lib/facade/dashboard-window.ts
apps/web-next/src/lib/facade/index.ts
apps/web-next/src/lib/facade/network-data.ts
apps/web-next/src/lib/facade/resolvers/gpu-capacity.ts
apps/web-next/src/lib/facade/resolvers/kpi.ts
apps/web-next/src/lib/facade/resolvers/net-capacity.ts
apps/web-next/src/lib/facade/resolvers/net-orchestrators.ts
apps/web-next/src/lib/facade/resolvers/orchestrators.ts
apps/web-next/src/lib/facade/resolvers/perf-by-model.ts
apps/web-next/src/lib/facade/resolvers/pipelines.ts
apps/web-next/src/lib/facade/resolvers/pricing.ts
apps/web-next/src/lib/facade/stubs.ts
apps/web-next/src/lib/facade/upstream-parse.ts
packages/plugin-sdk/src/contracts/__tests__/dashboard.test.ts
packages/plugin-sdk/src/contracts/dashboard.ts
packages/plugin-sdk/src/contracts/index.ts
packages/plugin-sdk/src/index.ts
packages/plugin-sdk/src/types/network-model.ts
packages/types/src/index.ts
plugins/dashboard-data-provider/frontend/src/__tests__/provider.test.ts
plugins/dashboard-data-provider/frontend/src/provider.ts
plugins/developer-api/backend/src/server.ts

…filter arguments

- Updated the `kpi` query in the dashboard schema to accept `pipeline` and `model_id` as filter arguments, improving query flexibility.
- Modified the `createDashboardProvider` function to forward the new arguments to the KPI resolver.
- Added tests to verify the correct handling of the new filter arguments in the KPI resolver.
- Enhanced error handling in the developer API for improved robustness during data fetching.

eliteprox and others added 10 commits April 27, 2026 19:41

renamed orchestratorsOnline to orchestratorObserved

f1d90a4

update display of decimal values

9883214

fixed developer model pricing

e5510f1

remove pymthouse minimal design document to streamline project documentation

d795e4f

github-actions Bot added labels on Apr 28, 2026: scope/shell (Shell app changes), scope/sdk (Plugin SDK changes), scope/packages (Shared package changes), plugin/developer-api (Developer API plugin), size/XL (Extra large PR, 500+ lines)

eliteprox enabled auto-merge (squash) April 28, 2026 19:48

eliteprox disabled auto-merge April 28, 2026 19:48

eliteprox enabled auto-merge (squash) April 28, 2026 19:49

coderabbitai Bot requested changes Apr 28, 2026

vercel Bot deployed to Preview April 28, 2026 20:06

coderabbitai Bot approved these changes Apr 28, 2026

eliteprox merged commit e8f689c into main Apr 28, 2026
36 checks passed

eliteprox deleted the feat/dashboard-data-staging branch April 28, 2026 20:08

eliteprox merged 11 commits into main from feat/dashboard-data-staging
