Shaw/live ai pricing#455

Merged
lalalune merged 4 commits into develop from shaw/live-ai-pricing on Apr 17, 2026

@lalalune
Member

No description provided.

@vercel

vercel bot commented Apr 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
eliza-cloud-v2 Error Error Apr 15, 2026 1:51am

@coderabbitai

coderabbitai bot commented Apr 14, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cdc64e6b-1738-4f88-8963-35dd963fac15


@claude

claude bot commented Apr 14, 2026

Code Review: Shaw/live ai pricing (#455)

Overview

This PR introduces a live AI pricing catalog — replacing hardcoded constants with a DB-backed catalog that syncs from external sources (Vercel Gateway, OpenRouter, FAL.ai, ElevenLabs). It adds an admin override endpoint, a public pricing summary endpoint, a cron refresh job, and a new music-metadata-based audio duration estimator for more accurate STT billing. A significant restructuring: voice endpoints are unified under /api/v1/voice/ and the old /api/elevenlabs/ endpoints are now thin re-exports.


🔴 Critical Issues

1. Failed video generation now refunds 100% of credits (generate-video/route.ts)

Previously, a failed video attempt was charged VIDEO_GENERATION_FALLBACK_COST (a partial charge for the API attempt). This PR changes it to reconcile(0), a full refund. Users can now trigger video generation requests indefinitely at zero cost whenever the underlying provider fails. If this is intentional, it should be stated explicitly; the prior logic existed specifically to cover provider-side costs.

- await reservation.reconcile(VIDEO_GENERATION_FALLBACK_COST);
+ await reservation.reconcile(0);
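
One middle ground between the flat fallback cost and a full refund is to charge a fraction of the quoted cost on failure. The sketch below is illustrative only: `failedAttemptCharge`, the micro-dollar unit, and the 10% default are assumptions, not code from this PR.

```typescript
// Hypothetical helper: amount to charge for a failed attempt, expressed in
// integer micro-dollars so the arithmetic stays exact.
function failedAttemptCharge(
  quotedMicroUsd: number,
  basisPoints: number = 1000, // 1000 bps = 10%; an assumed default
): number {
  if (!Number.isInteger(quotedMicroUsd) || quotedMicroUsd < 0) {
    throw new RangeError("quotedMicroUsd must be a non-negative integer");
  }
  if (!Number.isInteger(basisPoints) || basisPoints < 0 || basisPoints > 10000) {
    throw new RangeError("basisPoints must be an integer between 0 and 10000");
  }
  // Round up so a non-zero quote never yields a zero charge by truncation.
  return Math.ceil((quotedMicroUsd * basisPoints) / 10000);
}
```

The result would then be passed to `reservation.reconcile(...)` in place of `0`, after converting to whatever unit the reservation API expects.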

2. seedance pricing parser is not implemented but the model is in the type union (ai-pricing.ts)

case "seedance":
  throw new Error("Seedance pricing parser is not implemented yet");

The seedance model is listed in the VideoModelId union in ai-pricing-definitions.ts. If refreshPricingCatalog() is called with Seedance in the catalog, it will throw and fail the entire refresh run. Either remove Seedance from the supported models list or implement the parser before shipping.

3. /api/v1/pricing/summary has no error handling

All Promise.all calls in the GET handler will throw unhandled exceptions if any external catalog fetch or DB lookup fails. Since this is a public, unauthenticated route, a single failing FAL endpoint will return a 500 with an internal error message instead of a graceful response.

// No try/catch anywhere in this function
const videoCosts = await Promise.all(
  SUPPORTED_VIDEO_MODELS.map(async (model) => { ... })
);

Wrap each lookup in a try/catch or use .catch(() => null) and filter nulls before building the range.
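
A minimal sketch of that suggestion, assuming a per-model lookup function: each lookup is caught individually, failures become generic warnings (no internal error text is exposed), and the range is built only from successful results. `collectCosts`, `CostLookup`, and `RangeResult` are hypothetical names, not the PR's.

```typescript
// Hypothetical shapes; the real handler and its return types are not shown in this review.
type CostLookup = (model: string) => Promise<number>;

interface RangeResult {
  min: number;
  max: number;
  warnings: string[];
}

async function collectCosts(
  models: readonly string[],
  lookup: CostLookup,
): Promise<RangeResult | null> {
  const warnings: string[] = [];
  const settled = await Promise.all(
    models.map(async (model) => {
      try {
        return await lookup(model);
      } catch {
        // Keep the public response generic; log details server-side instead.
        warnings.push(`pricing unavailable for ${model}`);
        return null;
      }
    }),
  );
  const costs = settled.filter((c): c is number => c !== null);
  // With no successful lookups, the caller can omit the category entirely.
  if (costs.length === 0) return null;
  return { min: Math.min(...costs), max: Math.max(...costs), warnings };
}
```

Because each mapped promise handles its own error, `Promise.all` never rejects here, so one failing provider only removes its own data point.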


🟡 Medium Issues

4. Eight fragile regex parsers for FAL.ai HTML scraping (ai-pricing.ts)

Models like veo, kling, hailuo_standard, wan, and pixverse are priced by parsing raw HTML from FAL.ai's pricing pages with regexes. Any HTML restructure on FAL's side will throw:

throw new Error(`Unable to parse Veo pricing paragraph: ${paragraph}`);

There's no fallback to the last known-good database price when a parse fails. The behavior should be: fail gracefully, log a warning, and keep serving the last valid price from the DB rather than blowing up the entire refresh run.
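
The fallback behavior described above could look roughly like this. It is a sketch under assumptions: `PriceEntry`, `refreshWithFallback`, and the idea of passing last known-good entries in as a map are all hypothetical, and the real code would read them from the pricing tables.

```typescript
interface PriceEntry {
  model: string;
  unitPriceUsd: string; // kept as a string to avoid float drift
}

function refreshWithFallback(
  pages: Record<string, string>, // model id -> raw HTML
  parse: (model: string, html: string) => PriceEntry,
  lastKnownGood: ReadonlyMap<string, PriceEntry>,
): { entries: PriceEntry[]; warnings: string[] } {
  const entries: PriceEntry[] = [];
  const warnings: string[] = [];
  for (const [model, html] of Object.entries(pages)) {
    try {
      entries.push(parse(model, html));
    } catch {
      const fallback = lastKnownGood.get(model);
      if (fallback !== undefined) {
        warnings.push(`${model}: parse failed, serving last known-good price`);
        entries.push(fallback);
      } else {
        // No prior price: skip this model but let the others succeed.
        warnings.push(`${model}: parse failed and no fallback exists`);
      }
    }
  }
  return { entries, warnings };
}
```

A parse failure for one model then degrades to a warning instead of aborting the whole refresh run.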

5. externalCatalogCache Map grows unbounded

const externalCatalogCache = new Map<string, ExternalCacheValue>();

This module-level Map is never pruned. In a long-running Node.js instance (Next.js on Vercel Node runtime), entries accumulate indefinitely. Add an eviction step when setting: before inserting, delete entries where expiresAt < Date.now(), or cap the map size.
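
A minimal sketch of the eviction step, assuming each cached value carries an `expiresAt` timestamp as described above. `CacheValue` and `setWithEviction` are hypothetical names used for illustration.

```typescript
interface CacheValue<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

function setWithEviction<T>(
  cache: Map<string, CacheValue<T>>,
  key: string,
  value: T,
  ttlMs: number,
  now: number = Date.now(),
): void {
  // Deleting entries while iterating a Map is safe in JavaScript.
  for (const [k, entry] of cache) {
    if (entry.expiresAt <= now) cache.delete(k);
  }
  cache.set(key, { value, expiresAt: now + ttlMs });
}
```

The sweep is linear in the number of entries, which is negligible if the cache only ever holds a handful of provider keys.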

6. ElevenLabs pricing is a hardcoded snapshot, not live

ai-pricing-definitions.ts includes static elevenLabsSnapshotPricing that never updates automatically. If ElevenLabs changes rates, the system silently uses stale prices until a manual code deploy. At minimum, document that this requires manual updates and add a staleness warning in the admin endpoint.

7. music-metadata audio parsing blocks the request thread for large files

parseBuffer() is called synchronously in the STT route handler for files up to 25 MB. For large files this is a measurable delay on the request. Consider moving this to a worker thread or using a streaming parser to extract duration without loading the full buffer.


🟢 Positive Observations

  • Decimal.js for all financial math — consistent and correct, avoids floating-point drift in billing calculations.
  • Admin override endpoint is properly auth-gated with requireAdminWithResponse() and records updated_by for audit purposes.
  • Cron endpoint uses verifyCronSecret — consistent with other cron routes in the project.
  • Dimension-based pricing with wildcard fallback is a clean design that will scale to new models without schema changes.
  • 20% platform markup is centralized via applyPlatformMarkup() — easy to change in one place.
  • Transaction-wrapped refresh runs with ai_pricing_refresh_runs audit table is a solid approach.
  • Voice endpoint consolidation under /api/v1/voice/ with the legacy paths as re-exports is clean.
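
To illustrate the dimension-based lookup with wildcard fallback noted above, here is one way such matching could work. It is an assumption about the design, not the PR's implementation: the `"*"` wildcard token, `PriceRow`, and `findPrice` are hypothetical.

```typescript
type Dimensions = Record<string, string>;

interface PriceRow {
  dimensions: Dimensions;
  unitPrice: string;
}

// Returns the number of exact dimension matches, or null if the row does not apply.
function matchScore(row: PriceRow, query: Dimensions): number | null {
  let score = 0;
  for (const [key, value] of Object.entries(row.dimensions)) {
    if (value === "*") continue; // wildcard matches anything, adds no specificity
    if (query[key] !== value) return null;
    score += 1;
  }
  return score;
}

// Picks the most specific applicable row; wildcard rows act as the fallback.
function findPrice(rows: readonly PriceRow[], query: Dimensions): PriceRow | undefined {
  let best: PriceRow | undefined;
  let bestScore = -1;
  for (const row of rows) {
    const score = matchScore(row, query);
    if (score !== null && score > bestScore) {
      best = row;
      bestScore = score;
    }
  }
  return best;
}
```

Under this scheme a new resolution or duration tier only needs a new row, and any unlisted combination falls through to the wildcard row.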

Minor Nits

  • as PricingProductFamily and as PricingChargeUnit casts on raw DB string values (aiEntryToPrepared()) have no runtime validation. A Zod parse or explicit set membership check would prevent stale DB values from silently corrupting pricing logic.
  • buildRange() in the summary route calls await on a non-async function — minor inconsistency, no functional impact.
  • The 15-minute EXTERNAL_CACHE_TTL_MS and AbortSignal.timeout(15_000) are hardcoded. Consider moving to named constants with a comment explaining the rationale.
  • ai-pricing.ts is ~1,286 lines. The FAL HTML parsers are good candidates to extract into a lib/services/ai-pricing/fal-parsers.ts sub-module before this grows further.
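
For the first nit, an explicit set-membership check is enough to replace the bare cast. This is a sketch: the family values listed here are placeholders, since the actual members of `PricingProductFamily` are not shown in this review.

```typescript
// Placeholder values; substitute the real members of the union.
const PRODUCT_FAMILIES = ["language", "image", "video", "voice"] as const;
type PricingProductFamily = (typeof PRODUCT_FAMILIES)[number];

function isProductFamily(value: string): value is PricingProductFamily {
  return (PRODUCT_FAMILIES as readonly string[]).includes(value);
}

// Validates a raw DB string instead of casting it with `as`.
function parseProductFamily(raw: string): PricingProductFamily {
  if (!isProductFamily(raw)) {
    throw new Error(`Unknown product family in DB row: ${raw}`);
  }
  return raw;
}
```

The same pattern applies to `PricingChargeUnit`; deriving the type from the constant array keeps the runtime check and the compile-time union in sync.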

Summary

The architecture is sound and the live-pricing direction is the right call long-term. The three issues to resolve before merging are: (1) clarify the intended behavior of failed video billing, (2) gate or remove the Seedance model until its parser is implemented, and (3) add error handling to the public /pricing/summary endpoint so a single failing external catalog doesn't return a 500 to unauthenticated callers.

…456)

* fix: implement Seedance 2.0 pricing parser and add models to catalog

Implement the seedance pricing parser that was throwing an unimplemented
error, crashing the entire pricing refresh cron when Seedance models
appeared in the catalog.

Parser extracts per-second pricing from fal.ai model pages:
- Seedance 2.0: $0.3034/second (720p)
- Seedance 2.0 Fast: $0.2419/second (720p)

Audio is included in the base price (no audio dimension needed).

Also adds both models to SUPPORTED_VIDEO_MODELS in the definitions file.

Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: document public-only blob access limitation

Vercel Blob only supports access: 'public' as of 2026-04. Added TODO
comment explaining the limitation and noting that a proper fix requires
an auth-gated proxy route to serve blob content with session validation.

Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: charge ~10% on failed video generation instead of full refund

Shaw changed failed video reconciliation from partial charge to full
refund (reconcile(0)), creating an abuse vector where users could
trigger failures intentionally at zero cost while fal.ai still charges
for the compute attempt.

Restore partial charge at ~10% of quoted cost for all failure paths:
- No video URL in response
- Blob upload failure
- General generation error

Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: add error handling to /api/v1/pricing/summary endpoint

Wrap each model cost lookup in try/catch so one failing fal endpoint
(or any third-party catalog) won't 500 the entire public unauthenticated
route.

Failed lookups are filtered out and partial results returned with a
warnings array. Categories with zero successful lookups are omitted
from the response entirely.

Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: add is_public column to generations, filter explore gallery

listRandomPublicImages() was returning ALL completed images across all
users, leaking private generations in the explore/discover section.

Added is_public boolean column (default false) via migration 0065.
Updated the query to filter by is_public = true so only explicitly
opted-in content appears in the explore gallery. Includes a partial
index on is_public (WHERE is_public = true) for query performance.

Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: evict expired entries from externalCatalogCache to prevent unbounded growth

The module-level externalCatalogCache Map was never pruned. While in
practice only ~4 keys are used (gateway, openrouter, fal, elevenlabs),
expired entries were never removed. Add eviction of expired entries
before inserting new ones in getCachedExternalEntries.

Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: require authentication for image generation, remove anonymous fallback

Anonymous users could generate images for free by hitting the endpoint
without auth. The authenticateUser() function silently fell back to
creating anonymous users who bypassed all credit checks.

Now returns 401 if no valid session or API key is provided. Cleaned up
all anonymous-specific code paths (isAnonymous checks, anonymous
reservation creation for unauthenticated users).

Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: add DB fallback when fal.ai HTML pricing parsers fail

The 8+ fal.ai pricing parsers scrape HTML with regexes. If fal changes
their page structure, the parsers throw and crash the entire fal catalog
refresh.

Now each model's parse is wrapped in try/catch. On failure, fall back to
last known active DB entries for that model. If no DB fallback exists,
log an error and return empty (other models still succeed).

This prevents one broken model page from taking down pricing for all
video models.

Co-authored-by: wakesync <shadow@shad0w.xyz>

* fix: move steward passthrough after bearerToken declaration

bearerToken was referenced before its const declaration (temporal dead
zone), causing a ReferenceError that crashed the middleware and redirected
to /auth/error on every steward-authenticated request.

Co-authored-by: wakesync <shadow@shad0w.xyz>

---------

Co-authored-by: wakesync <shadow@shad0w.xyz>
0xSolace added a commit that referenced this pull request Apr 17, 2026
… branch

Brings in:
- Shaw's live AI pricing catalog (PR #455)
- Sol's 8 patches on top via PR #456:
  - Seedance 2.0 parser implemented + models added
  - Failed video charges 10% (was 100% refund abuse vector)
  - /api/v1/pricing/summary error handling
  - externalCatalogCache eviction
  - HTML parser fallback to DB
  - is_public column on generations (default false)
  - Anonymous image gen removed (require auth)
  - Blob access limitation documented

This unblocks production from continuing to charge $0.05 flat on video generation.

Co-authored-by: wakesync <shadow@shad0w.xyz>
@lalalune merged commit 6a4dc08 into develop on Apr 17, 2026
11 of 15 checks passed