fix(billing): detect drizzle-wrapped FK violations, covering the real error shape PR #169 missed #170
Merged
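The catch-and-retry added in PR #169 still failed after the production alpha.4 deployment. The root cause is that drizzle-orm 0.45 wraps the PostgresError thrown by the driver in a DrizzleQueryError: the original error is attached to `.cause`, and the outer object has no `code` field. The old `isPgForeignKeyViolation` only checked `error.code === "23503"`, so it always returned false for the wrapped shape and the retry branch never fired. The logs contain no "billing snapshot FK violation retried with NULL" line, which is consistent with that.

Fix: replace the detector with `extractPgForeignKeyViolation`, which checks both `error` and one level of `error.cause` (drizzle 0.45 wraps a single level) and returns violation info in a uniform shape, so the retry branch does not have to inspect the error shape a second time.

Tests: the api_key_id retry case now uses the real wrapped shape `{ cause: { code, constraint_name } }` to reproduce the production path; one flat-shape case is kept to cover the edge case where postgres-js throws without wrapping; a negative test for a wrapped non-FK error (cause.code = 23505) is added to guard against false positives.

Verified by hand in the repo: the DrizzleQueryError implementation in node_modules/drizzle-orm/errors.js is a single-level cause wrapper, so no recursive unwrapping is needed.

Refs: PR #168, PR #169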
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #170 +/- ##
==========================================
- Coverage 78.58% 78.57% -0.01%
==========================================
Files 145 145
Lines 11504 11515 +11
Branches 3977 3982 +5
==========================================
+ Hits 9040 9048 +8
Misses 1630 1630
- Partials 834 837 +3
g1331 added a commit that referenced this pull request May 23, 2026
fix(billing): make FK violation detection accept the constraint field name, adding the fallback PR #170 missed
This was referenced May 23, 2026
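g1331 added a commit that referenced this pull request May 23, 2026
…hooting (#167) (#183)

Wraps up the 9 usage-side docs; with this, the Phase 2 usage guide is complete.

- logs-stats: lists the 40+ columns of request_logs by group, focusing on the duration_ms INT4 clamp and the stale reconcile (status_code=520) fallback from PR #170/#171; the query parameter set of /api/admin/logs (no model filter), the /api/admin/logs/live SSE endpoint and the in-process pub/sub limitation under multiple replicas; how the overview / timeseries / leaderboard aggregations are computed in real time, in particular the multiple filter conditions for TPS; states explicitly that LOG_RETENTION_DAYS is not currently consumed by any background cleanup job, so the table grows without bound, and gives a manual DELETE fallback.
- request-recording: Runtime Settings are the master control (env is deprecated; shouldRecordFixture reads only the DB), with defaults enabled=false / mode=failure; the on-disk layout includes the latest.json double write, 16 MiB truncation, and compactSSEChunks replacing instructions/tools in large OpenAI Responses snapshot events; the full list of redacted SENSITIVE_HEADER_NAMES; the hook is fire-and-forget and the tee() fork does not affect client latency; /api/mock replay only works when NODE_ENV !== production. Also corrects the mismatch between tests/fixtures in the .env.example comment and the actual source default data/traffic-recordings.
- troubleshooting: symptom → error code → source location → investigation direction tables, organized into six categories: client Key / routing / SSE / CLIProxyAPI / billing / logs. Covers API_KEY_MODEL_NOT_ALLOWED, NO_AUTHORIZED_UPSTREAMS, NO_HEALTHY_CANDIDATES, CONCURRENCY_FULL, QUEUE_WAIT_TIMEOUT, CLIENT_DISCONNECTED, queue_full, REQUEST_TIMEOUT, STREAM_ERROR, the four CliproxyConnectionStatus states, CliproxyInstanceInUseError, duration_ms 24.8 days, status_code=520, the four UnbillableReason categories, and more, and draws clear boundaries with the deployment-side troubleshooting / circuit breaker long-form / cliproxy long-form docs to avoid overlapping topics.

Phase 2 usage side 6/9 → 9/9; remaining: deployment 0/6, architecture 0/9.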
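Background
After PR #169 (v0.3.0-alpha.4) was deployed, the production deploy smoke test still reproduced the same FK violation.
In addition, the logs contain no trace of the warn added in PR #169, "billing snapshot FK violation retried with NULL", which means the catch-and-retry branch never actually ran.
Root cause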
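In drizzle-orm 0.45, the DrizzleQueryError defined in node_modules/drizzle-orm/errors.js wraps the PostgresError thrown by the driver in one layer and puts the original error on `.cause`.
PR #169's `isPgForeignKeyViolation` only checks `error.code === "23503"`, so it always returns false for the outer DrizzleQueryError. The retry branch never fires, and the error propagates unchanged into the catch in `persistBillingSnapshotSafely`, which logs "failed to persist billing snapshot". PR #169's unit tests flattened the fields directly with `Object.assign(new Error(), { code, constraint_name })` and did not reproduce the real wrapped shape, so the tests passed while production failed.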
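To make the mismatch concrete, here is a minimal sketch of the two error shapes. It assumes a plain `Error` with a `cause` property as a stand-in for DrizzleQueryError, and the detector is reconstructed from the description above rather than copied from the repository. The constraint name is a made-up placeholder.

```typescript
// Reconstruction of the old detector as described above:
// it looks only at the top-level `code` field.
function isPgForeignKeyViolation(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { code?: unknown }).code === "23503"
  );
}

// Flat shape, built the way PR #169's unit test built it.
// "hypothetical_fk" is a placeholder, not a real constraint name.
const flat = Object.assign(new Error("fk violation"), {
  code: "23503",
  constraint_name: "hypothetical_fk",
});

// Wrapped shape: a plain Error standing in for DrizzleQueryError,
// with the driver error on `.cause` and no `code` on the outer object.
const wrapped = Object.assign(new Error("Failed query"), { cause: flat });

console.log(isPgForeignKeyViolation(flat)); // true
console.log(isPgForeignKeyViolation(wrapped)); // false
```

The second call is the production path: the outer object carries no `code`, so the check fails before the retry branch is ever reached.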
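Fix
- src/lib/services/billing-cost-service.ts: replaces the detector with `extractPgForeignKeyViolation`, which checks both `error` and one level of `error.cause` (drizzle 0.45 wraps a single level; verified) and returns violation info in a uniform shape, so the retry branch no longer inspects the error shape a second time.
- tests/unit/services/billing-cost-service.test.ts:

Codex double check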
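The approach was discussed with Codex. The recommended option is to keep the catch-and-retry idea and only fix the detector and make the tests use the real shape; the alternatives (SELECT FOR SHARE / changing the schema to ON DELETE SET NULL / a test-side workaround) are not recommended as the main fix. This PR implements the recommended option.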
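The recommended detector change can be sketched roughly as follows. Only the function name `extractPgForeignKeyViolation`, the `23503` code, and the single-level `cause` check come from this PR; the return type `PgForeignKeyViolation`, the helper `readViolation`, and the constraint name are assumptions made for illustration, and the actual code in billing-cost-service.ts may differ.

```typescript
// Hypothetical uniform result shape; the real one is not shown in this PR.
interface PgForeignKeyViolation {
  constraint: string | undefined;
}

const PG_FOREIGN_KEY_VIOLATION = "23503";

// Hypothetical helper: reads FK violation info from one candidate object.
function readViolation(candidate: unknown): PgForeignKeyViolation | null {
  if (typeof candidate !== "object" || candidate === null) return null;
  const { code, constraint_name } = candidate as {
    code?: unknown;
    constraint_name?: unknown;
  };
  if (code !== PG_FOREIGN_KEY_VIOLATION) return null;
  return {
    constraint: typeof constraint_name === "string" ? constraint_name : undefined,
  };
}

// Checks the error itself, then exactly one level of `cause` (no recursion).
function extractPgForeignKeyViolation(error: unknown): PgForeignKeyViolation | null {
  const direct = readViolation(error);
  if (direct !== null) return direct;
  if (typeof error === "object" && error !== null && "cause" in error) {
    return readViolation((error as { cause?: unknown }).cause);
  }
  return null;
}

// The three shapes named in the test description, with a placeholder constraint.
const wrappedFk = { cause: { code: "23503", constraint_name: "hypothetical_api_key_fk" } };
const flatFk = { code: "23503", constraint_name: "hypothetical_api_key_fk" };
const wrappedUnique = { cause: { code: "23505" } };

console.log(extractPgForeignKeyViolation(wrappedFk)?.constraint); // "hypothetical_api_key_fk"
console.log(extractPgForeignKeyViolation(flatFk)?.constraint); // "hypothetical_api_key_fk"
console.log(extractPgForeignKeyViolation(wrappedUnique)); // null
```

Returning one uniform object instead of a boolean means the caller can decide whether to retry from the result alone, without looking at `error` or `error.cause` again.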
Test plan
Refs: #168, #169