Shaw/live ai pricing by lalalune · Pull Request #455 · elizaOS/cloud

lalalune · 2026-04-14T21:33:30Z

No description provided.

vercel · 2026-04-14T21:33:35Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
eliza-cloud-v2	Error		Apr 15, 2026 1:51am

coderabbitai · 2026-04-14T21:33:37Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cdc64e6b-1738-4f88-8963-35dd963fac15

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


claude · 2026-04-14T21:37:46Z

Code Review: Shaw/live ai pricing (#455)

Overview

This PR introduces a live AI pricing catalog — replacing hardcoded constants with a DB-backed catalog that syncs from external sources (Vercel Gateway, OpenRouter, FAL.ai, ElevenLabs). It adds an admin override endpoint, a public pricing summary endpoint, a cron refresh job, and a new music-metadata-based audio duration estimator for more accurate STT billing. A significant restructuring: voice endpoints are unified under /api/v1/voice/ and the old /api/elevenlabs/ endpoints are now thin re-exports.

🔴 Critical Issues

1. Failed video generation now refunds 100% of credits (generate-video/route.ts)

Previously a failed video attempt was charged VIDEO_GENERATION_FALLBACK_COST (a partial charge for the API attempt). This PR changes it to reconcile(0), a full refund. Users can now trigger video generation requests indefinitely at zero cost whenever the underlying provider fails. If this is intentional, it should be stated explicitly; the prior logic existed specifically to cover provider-side costs.

- await reservation.reconcile(VIDEO_GENERATION_FALLBACK_COST);
+ await reservation.reconcile(0);
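
One possible middle ground between the old fallback constant and a full refund is to keep a small fraction of the quoted cost on failure, which is the direction the follow-up patch in this thread describes (about 10%). The sketch below is only an illustration: failedAttemptCharge, the integer-cents representation, and the default rate are assumptions, not code from the PR.

```typescript
// Hypothetical helper; the name, units, and default rate are assumptions.
const FAILED_ATTEMPT_RATE_PERCENT = 10;

/** Returns the amount (in integer cents) to keep when a generation fails. */
function failedAttemptCharge(
  quotedCostCents: number,
  ratePercent: number = FAILED_ATTEMPT_RATE_PERCENT,
): number {
  if (!Number.isInteger(quotedCostCents) || quotedCostCents < 0) {
    throw new RangeError("quotedCostCents must be a non-negative integer");
  }
  // Round up so that a non-zero quote never reconciles to a free retry.
  return Math.ceil((quotedCostCents * ratePercent) / 100);
}
```

The result would then be passed to the reservation's reconcile call in each failure path instead of 0.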

2. seedance pricing parser is not implemented but the model is in the type union (ai-pricing.ts)

case "seedance":
  throw new Error("Seedance pricing parser is not implemented yet");

The seedance model is listed in the VideoModelId union in ai-pricing-definitions.ts. If refreshPricingCatalog() is called with Seedance in the catalog, it will throw and fail the entire refresh run. Either remove Seedance from the supported models list or implement the parser before shipping.
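
A minimal sketch of the "gate it" option, assuming a parser registry keyed by model ID: models without a registered parser are skipped and reported rather than failing the run. The VideoModelId subset, PricingParser type, and partitionByParser helper here are illustrative and do not mirror the actual definitions in ai-pricing-definitions.ts.

```typescript
// Illustrative subset of the model union; the real union is larger.
type VideoModelId = "veo" | "kling" | "seedance";

// Hypothetical parser shape: page HTML in, price in USD out.
type PricingParser = (html: string) => number;

/** Splits models into those that can be refreshed and those that must be skipped. */
function partitionByParser(
  models: readonly VideoModelId[],
  parsers: Partial<Record<VideoModelId, PricingParser>>,
): { ready: VideoModelId[]; skipped: VideoModelId[] } {
  const ready: VideoModelId[] = [];
  const skipped: VideoModelId[] = [];
  for (const model of models) {
    // A missing registry entry means "not implemented yet", not a fatal error.
    (parsers[model] ? ready : skipped).push(model);
  }
  return { ready, skipped };
}
```

The refresh job would iterate over ready only and log skipped, so an unimplemented parser degrades one model instead of the whole catalog.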

3. /api/v1/pricing/summary has no error handling

The GET handler has no try/catch, and Promise.all rejects as soon as any single lookup rejects, so one failed external catalog fetch or DB lookup fails the whole request. Since this is a public, unauthenticated route, a single failing FAL endpoint will return a 500 with an internal error message instead of a graceful response.

// No try/catch anywhere in this function
const videoCosts = await Promise.all(
  SUPPORTED_VIDEO_MODELS.map(async (model) => { ... })
);

Wrap each lookup in a try/catch or use .catch(() => null) and filter nulls before building the range.
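
The suggestion above can be sketched roughly as follows. This is not the route's actual code: summarizeCosts, CostLookup, and RangeSummary are hypothetical names, and the lookup is injected so the example stays self-contained.

```typescript
// Hypothetical lookup signature; the real route resolves costs from the catalog.
type CostLookup = (model: string) => Promise<number>;

interface RangeSummary {
  min: number;
  max: number;
  warnings: string[];
}

/** Builds a min/max range from whichever lookups succeed; null if none do. */
async function summarizeCosts(
  models: readonly string[],
  lookup: CostLookup,
): Promise<RangeSummary | null> {
  const warnings: string[] = [];
  const results = await Promise.all(
    models.map((model) =>
      lookup(model).catch((): null => {
        // Record a generic warning; never echo the internal error to the caller.
        warnings.push(`pricing unavailable for ${model}`);
        return null;
      }),
    ),
  );
  const costs = results.filter((cost): cost is number => cost !== null);
  if (costs.length === 0) return null; // omit the category entirely
  return { min: Math.min(...costs), max: Math.max(...costs), warnings };
}
```

Because each promise handles its own rejection, Promise.all never rejects here, and a partial result can still be returned with a 200.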

🟡 Medium Issues

4. Eight fragile regex parsers for FAL.ai HTML scraping (ai-pricing.ts)

Models like veo, kling, hailuo_standard, wan, and pixverse are priced by parsing raw HTML from FAL.ai's pricing pages with regexes. Any HTML restructure on FAL's side will throw:

throw new Error(`Unable to parse Veo pricing paragraph: ${paragraph}`);

There's no fallback to the last known-good database price when a parse fails. The behavior should be: fail gracefully, log a warning, and keep serving the last valid price from the DB rather than blowing up the entire refresh run.
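
As a rough illustration of that behavior, the sketch below wraps a single model's parse in try/catch and falls back to a last known-good value. Everything here is assumed for the example: resolvePrice, PriceEntry, and the in-memory Map standing in for the database lookup.

```typescript
interface PriceEntry {
  model: string;
  usdPerSecond: number;
  source: "live" | "db_fallback";
}

/**
 * Tries the live parser first; on failure, warns and serves the last
 * known-good price. Returns null when neither source is available.
 */
function resolvePrice(
  model: string,
  html: string,
  parse: (html: string) => number,
  lastKnownGood: Map<string, number>, // stand-in for a DB query
  warn: (message: string) => void = console.warn,
): PriceEntry | null {
  try {
    return { model, usdPerSecond: parse(html), source: "live" };
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    warn(`pricing parse failed for ${model}: ${reason}`);
    const fallback = lastKnownGood.get(model);
    return fallback === undefined
      ? null
      : { model, usdPerSecond: fallback, source: "db_fallback" };
  }
}
```

Returning null for a model with no fallback lets the caller drop just that model while the rest of the refresh run proceeds.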

5. externalCatalogCache Map grows unbounded

const externalCatalogCache = new Map<string, ExternalCacheValue>();

This module-level Map is never pruned. In a long-running Node.js instance (Next.js on Vercel Node runtime), entries accumulate indefinitely. Add an eviction step when setting: before inserting, delete entries where expiresAt < Date.now(), or cap the map size.
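
A minimal sketch of the eviction step, assuming cache values carry an expiresAt timestamp as the review implies. CacheValue and setWithEviction are hypothetical names; the real ExternalCacheValue type may differ.

```typescript
interface CacheValue<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

/** Drops expired entries, then inserts the new one with a fresh TTL. */
function setWithEviction<T>(
  cache: Map<string, CacheValue<T>>,
  key: string,
  value: T,
  ttlMs: number,
  now: number = Date.now(),
): void {
  // Deleting entries while iterating with forEach is safe for a Map.
  cache.forEach((entry, existingKey) => {
    if (entry.expiresAt < now) cache.delete(existingKey);
  });
  cache.set(key, { value, expiresAt: now + ttlMs });
}
```

The sweep is linear in the number of entries, which is negligible if the cache only ever holds a handful of catalog keys.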

6. ElevenLabs pricing is a hardcoded snapshot, not live

ai-pricing-definitions.ts includes static elevenLabsSnapshotPricing that never updates automatically. If ElevenLabs changes rates, the system silently uses stale prices until a manual code deploy. At minimum, document that this requires manual updates and add a staleness warning in the admin endpoint.
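
One way such a staleness warning could look, assuming the snapshot records the date it was captured. The capturedAt parameter and the snapshotStalenessWarning helper are hypothetical; the source does not say whether elevenLabsSnapshotPricing carries a timestamp.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

/**
 * Returns a warning string when the snapshot is older than maxAgeDays,
 * or null when it is still considered fresh.
 */
function snapshotStalenessWarning(
  capturedAt: string, // ISO 8601 timestamp, assumed to be stored with the snapshot
  maxAgeDays: number,
  now: Date = new Date(),
): string | null {
  const captured = Date.parse(capturedAt);
  if (Number.isNaN(captured)) {
    return "ElevenLabs snapshot pricing has no valid capture date";
  }
  const ageDays = Math.floor((now.getTime() - captured) / DAY_MS);
  return ageDays > maxAgeDays
    ? `ElevenLabs snapshot pricing is ${ageDays} days old (limit ${maxAgeDays})`
    : null;
}
```

The admin endpoint could include the returned string in its response so stale rates are visible without a code search.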

7. music-metadata audio parsing blocks the request thread for large files

parseBuffer() is called synchronously in the STT route handler for files up to 25 MB. For large files this is a measurable delay on the request. Consider moving this to a worker thread or using a streaming parser to extract duration without loading the full buffer.

🟢 Positive Observations

Decimal.js for all financial math — consistent and correct, avoids floating-point drift in billing calculations.
Admin override endpoint is properly auth-gated with requireAdminWithResponse() and records updated_by for audit purposes.
Cron endpoint uses verifyCronSecret — consistent with other cron routes in the project.
Dimension-based pricing with wildcard fallback is a clean design that will scale to new models without schema changes.
20% platform markup is centralized via applyPlatformMarkup() — easy to change in one place.
Transaction-wrapped refresh runs with ai_pricing_refresh_runs audit table is a solid approach.
Voice endpoint consolidation under /api/v1/voice/ with the legacy paths as re-exports is clean.
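
To illustrate what dimension-based pricing with a wildcard fallback might look like, here is a small sketch. The row shape, the "*" wildcard convention, and the findPrice helper are assumptions made for the example and are not taken from the PR's schema.

```typescript
interface PriceRow {
  dimensions: Record<string, string>; // "*" matches any requested value
  unitPriceUsd: number;
}

/** True when every dimension on the row is a wildcard or equals the request. */
function rowMatches(row: PriceRow, request: Record<string, string>): boolean {
  return Object.keys(row.dimensions).every((key) => {
    const expected = row.dimensions[key];
    return expected === "*" || expected === request[key];
  });
}

/** Counts non-wildcard dimensions so exact rows outrank wildcard rows. */
function specificity(row: PriceRow): number {
  return Object.keys(row.dimensions).filter((key) => row.dimensions[key] !== "*").length;
}

/** Picks the most specific matching row, or undefined when nothing matches. */
function findPrice(
  rows: readonly PriceRow[],
  request: Record<string, string>,
): PriceRow | undefined {
  let best: PriceRow | undefined;
  for (const row of rows) {
    if (!rowMatches(row, request)) continue;
    if (best === undefined || specificity(row) > specificity(best)) best = row;
  }
  return best;
}
```

Under this scheme a new model variant only needs a new row; requests that match no exact row still resolve through the wildcard row.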

Minor Nits

as PricingProductFamily and as PricingChargeUnit casts on raw DB string values (aiEntryToPrepared()) have no runtime validation. A Zod parse or explicit set membership check would prevent stale DB values from silently corrupting pricing logic.
The summary route uses await on buildRange(), which is not an async function. This is a minor inconsistency with no functional impact.
The 15-minute EXTERNAL_CACHE_TTL_MS and AbortSignal.timeout(15_000) are hardcoded. Consider moving to named constants with a comment explaining the rationale.
ai-pricing.ts is ~1,286 lines. The FAL HTML parsers are good candidates to extract into a lib/services/ai-pricing/fal-parsers.ts sub-module before this grows further.
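
For the first nit, an explicit set membership check could replace the bare cast along these lines. The unit values listed are placeholders, since the review does not enumerate the real PricingChargeUnit members, and parseChargeUnit is a hypothetical helper.

```typescript
// Placeholder values; the real PricingChargeUnit members may differ.
const PRICING_CHARGE_UNITS = ["token", "second", "image"] as const;
type PricingChargeUnit = (typeof PRICING_CHARGE_UNITS)[number];

/** Narrows a raw DB string to PricingChargeUnit, throwing on unknown values. */
function parseChargeUnit(raw: string): PricingChargeUnit {
  const match = PRICING_CHARGE_UNITS.find((unit) => unit === raw);
  if (match === undefined) {
    throw new Error(`Unknown charge unit in DB row: ${raw}`);
  }
  return match;
}
```

Deriving the type from the constant array keeps the runtime check and the compile-time union in sync, so adding a unit in one place updates both.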

Summary

The architecture is sound and the live-pricing direction is the right call long-term. The three issues to resolve before merging are: (1) clarify the intended behavior of failed video billing, (2) gate or remove the Seedance model until its parser is implemented, and (3) add error handling to the public /pricing/summary endpoint so a single failing external catalog doesn't return a 500 to unauthenticated callers.

…456)

* fix: implement Seedance 2.0 pricing parser and add models to catalog

  Implement the seedance pricing parser that was throwing an unimplemented error, crashing the entire pricing refresh cron when Seedance models appeared in the catalog.

  Parser extracts per-second pricing from fal.ai model pages:
  - Seedance 2.0: $0.3034/second (720p)
  - Seedance 2.0 Fast: $0.2419/second (720p)

  Audio is included in the base price (no audio dimension needed). Also adds both models to SUPPORTED_VIDEO_MODELS in the definitions file.

  Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: document public-only blob access limitation

  Vercel Blob only supports access: 'public' as of 2026-04. Added TODO comment explaining the limitation and noting that a proper fix requires an auth-gated proxy route to serve blob content with session validation.

  Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: charge ~10% on failed video generation instead of full refund

  Shaw changed failed video reconciliation from partial charge to full refund (reconcile(0)), creating an abuse vector where users could trigger failures intentionally at zero cost while fal.ai still charges for the compute attempt.

  Restore partial charge at ~10% of quoted cost for all failure paths:
  - No video URL in response
  - Blob upload failure
  - General generation error

  Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: add error handling to /api/v1/pricing/summary endpoint

  Wrap each model cost lookup in try/catch so one failing fal endpoint (or any external catalog) won't 500 the entire public unauthenticated route. Failed lookups are filtered out and partial results returned with a warnings array. Categories with zero successful lookups are omitted from the response entirely.

  Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: add is_public column to generations, filter explore gallery

  listRandomPublicImages() was returning ALL completed images across all users, leaking private generations in the explore/discover section.

  Added is_public boolean column (default false) via migration 0065. Updated the query to filter by is_public = true so only explicitly opted-in content appears in the explore gallery. Includes a partial index on is_public WHERE true for query performance.

  Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: evict expired entries from externalCatalogCache to prevent unbounded growth

  The module-level externalCatalogCache Map was never pruned. While in practice only ~4 keys are used (gateway, openrouter, fal, elevenlabs), expired entries were never removed. Add eviction of expired entries before inserting new ones in getCachedExternalEntries.

  Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: require authentication for image generation, remove anonymous fallback

  Anonymous users could generate images for free by hitting the endpoint without auth. The authenticateUser() function silently fell back to creating anonymous users who bypassed all credit checks. Now returns 401 if no valid session or API key is provided.

  Cleaned up all anonymous-specific code paths (isAnonymous checks, anonymous reservation creation for unauthenticated users).

  Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: add DB fallback when fal.ai HTML pricing parsers fail

  The 8+ fal.ai pricing parsers scrape HTML with regexes. If fal changes their page structure, the parsers throw and crash the entire fal catalog refresh.

  Now each model's parse is wrapped in try/catch. On failure, fall back to last known active DB entries for that model. If no DB fallback exists, log an error and return empty (other models still succeed). This prevents one broken model page from taking down pricing for all video models.

  Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: move steward passthrough after bearerToken declaration

  bearerToken was referenced before its const declaration (temporal dead zone), causing a ReferenceError that crashed the middleware and redirected to /auth/error on every steward-authenticated request.

  Co-authored-by: wakesync <shadow@shad0w.xyz>

---------

Co-authored-by: wakesync <shadow@shad0w.xyz>

… branch

Brings in:
- Shaw's live AI pricing catalog (PR #455)
- Sol's 8 patches on top via PR #456:
  - Seedance 2.0 parser implemented + models added
  - Failed video charges 10% (was a 100% refund, an abuse vector)
  - /api/v1/pricing/summary error handling
  - externalCatalogCache eviction
  - HTML parser fallback to DB
  - is_public column on generations (default false)
  - Anonymous image gen removed (require auth)
  - Blob access limitation documented

This unblocks production from continuing to charge $0.05 flat on video generation.

Co-authored-by: wakesync <shadow@shad0w.xyz>

lalalune added 3 commits April 14, 2026 15:08

cloud: add a canonical ai pricing catalog

8df84c5

cloud: bill inference endpoints from live catalog

cc0d376

cloud: expose live pricing refresh and summaries

a4798f5

0xSolace mentioned this pull request Apr 14, 2026

fix: patch pricing, privacy, and security issues in live-ai-pricing #456

Merged

vercel bot had a problem deploying to Preview April 15, 2026 01:51 Failure

0xSolace mentioned this pull request Apr 17, 2026

fix(auth+billing): steward auth sweep, dynamic AI pricing catalog, privacy hardening #458

Merged

lalalune merged commit 6a4dc08 into develop Apr 17, 2026
11 of 15 checks passed

lalalune merged 4 commits into develop from shaw/live-ai-pricing
