Conversation
📝 Walkthrough
This PR introduces a complete public status page feature, including database schema extensions (2 new columns on system_settings, 2 conditional indexes on message_request), localized text in 6 languages, new public-status aggregation and scheduling services, frontend UI components, a settings page, API routes, and comprehensive test coverage.
📊 Changes
🎯 Estimated Code Review Effort: 4 (Complex) | ⏱️ ~75 minutes
Reasoning: Significant logic density in aggregation (token computation, gap filling, state transitions); multi-pattern changes spanning database schema, services, UI components, and localization; heterogeneous across system layers (repo, cache, validation, action, scheduler, component) with critical snapshot/scheduling logic; extensive file spread (~60+ files) but with well-contained cohorts.
🔗 Possibly Related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Code Review
This pull request introduces a Public Status Page feature, allowing administrators to configure and expose the status of specific provider groups and models. It includes database schema updates, internationalization support across multiple languages, background aggregation logic for system status snapshots, and a new public-facing status dashboard. Key feedback includes potential performance issues due to missing database indexes on the message_request table, the use of hardcoded strings in server actions instead of localized keys, restrictive character limits for serialized configuration in the database, and potential hydration mismatches caused by server-side date formatting.
```typescript
or(
  inArray(messageRequest.originalModel, targetModelIds),
  inArray(messageRequest.model, targetModelIds)
),
```
This query filters by originalModel and model over a time range. In large-scale deployments, the message_request table can contain millions of rows. Without a composite index on (model, created_at) and (original_model, created_at), this query could become a performance bottleneck. Note that original_model currently has no index at all in the schema.
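A minimal sketch of the suggested indexes, assuming Postgres and the column names quoted above; the index names are hypothetical and should be checked against the existing drizzle migrations before adopting:

```sql
-- Hypothetical composite indexes for the time-ranged model filters.
CREATE INDEX IF NOT EXISTS message_request_model_created_at_idx
  ON message_request (model, created_at);
CREATE INDEX IF NOT EXISTS message_request_original_model_created_at_idx
  ON message_request (original_model, created_at);
```

With these in place, the created_at range scan can be driven by the model equality/IN predicate rather than a full table scan.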
```typescript
try {
  const session = await getSession();
  if (!session || session.user.role !== "admin") {
    return { ok: false, error: "无权限执行此操作" };
```
```typescript
if (nextDescription && nextDescription.length > 500) {
  return {
    ok: false,
    error: "公开状态配置超过 provider_groups.description 的 500 字符限制",
  };
}
```
The 500-character limit on the description column is quite restrictive for storing serialized JSON configuration. If a provider group has many public models, this limit will be easily exceeded, causing the save operation to fail. Consider increasing the column size or using a dedicated table for these settings.
```tsx
<span className="text-xs font-semibold uppercase tracking-wider">
  {hasFailedModel ? t("statusPage.public.failed") : t("statusPage.public.operational")}{" "}
  · {t("statusPage.public.generatedAt")}{" "}
  {new Date(snapshot.generatedAt).toLocaleString()}
```
🧪 Test Results
Overall result: ✅ All tests passed
```typescript
await updateSystemSettings({
  publicStatusWindowHours: input.publicStatusWindowHours,
  publicStatusAggregationIntervalMinutes: input.publicStatusAggregationIntervalMinutes,
});
```
Missing server-side input validation for numeric settings
publicStatusWindowHours and publicStatusAggregationIntervalMinutes are stored to the DB without going through UpdateSystemSettingsSchema. A user who clears the field submits Number("") = 0; that 0 is persisted, then passed directly to aggregatePublicStatusSnapshot where Array.from({ length: Math.ceil((hours * 60) / 0) }) → Array.from({ length: Infinity }) → RangeError: Invalid array length. The scheduler then throws this error every 30 s indefinitely until the bad value is manually corrected, and the initial save action surfaces an opaque "Invalid array length" error to the admin.
UpdateSystemSettingsSchema already has min(1) / max(168|60) guards for exactly these fields — they just need to be applied here:
```typescript
const SavePublicStatusSchema = z.object({
  publicStatusWindowHours: z.coerce.number().int().min(1).max(168),
  publicStatusAggregationIntervalMinutes: z.coerce.number().int().min(1).max(60),
});
```

Or simply reuse `UpdateSystemSettingsSchema.pick(...)` before calling `updateSystemSettings`.
Prompt To Fix With AI (Path: `src/actions/public-status.ts`, lines 39-43)
```typescript
async function writeSnapshotToRedis(record: PublicStatusSnapshotRecord): Promise<void> {
  const redis = getRedisClient({ allowWhenRateLimitDisabled: true });
  if (!redis || redis.status !== "ready") {
    return;
  }

  await redis.set(SNAPSHOT_CACHE_KEY, JSON.stringify(record));
}
```
No TTL on Redis snapshot key — stale data can persist after feature is disabled
redis.set(SNAPSHOT_CACHE_KEY, ...) is called without an expiry. clearPublicStatusSnapshot silently no-ops when Redis is not ready (lines 50-54). If an admin disables the feature while Redis is briefly unavailable, the snapshot key remains and /api/public-status continues returning 200 with old data. Because stopPublicStatusScheduler was also called, no periodic job will re-attempt the deletion.
Adding a TTL provides a safety net so stale snapshots self-expire:
```typescript
const ttlSeconds = 2 * 24 * 60 * 60; // or derive from payload.windowHours
await redis.set(SNAPSHOT_CACHE_KEY, JSON.stringify(record), "EX", ttlSeconds);
```
Prompt To Fix With AI (Path: `src/repository/public-status-snapshot.ts`, lines 40-47)
```tsx
<Label htmlFor="public-status-window-hours">{t("statusPage.form.windowHours")}</Label>
<Input
  id="public-status-window-hours"
  inputMode="numeric"
  value={windowHours}
  onChange={(event) => setWindowHours(event.target.value)}
/>
```
No client-side bounds on numeric inputs
inputMode="numeric" allows any string value to be typed and submitted. Pairing with type="number" and min/max attributes gives native browser validation and improves UX, complementing (not replacing) the server-side fix:
```suggestion
<Input
  id="public-status-window-hours"
  type="number"
  min={1}
  max={168}
  inputMode="numeric"
  value={windowHours}
  onChange={(event) => setWindowHours(event.target.value)}
/>
```
Prompt To Fix With AI (Path: `src/app/[locale]/settings/status-page/_components/public-status-settings-form.tsx`, lines 107-113)
```typescript
export async function aggregatePublicStatusSnapshot(
  input: AggregatePublicStatusSnapshotInput
): Promise<PublicStatusSnapshotPayload> {
  const now = new Date();
  const windowStart = new Date(now.getTime() - input.windowHours * 60 * 60 * 1000);
  const targetModelIds = Array.from(new Set(input.groups.flatMap((group) => group.modelIds)));
  const latestPricesByModel = await findLatestPricesByModels(targetModelIds);
```
Unnecessary DB query when target model list is empty
findLatestPricesByModels(targetModelIds) is always called before the request query, even when targetModelIds is an empty array. A short-circuit guard avoids the extra round-trip:
```typescript
if (targetModelIds.length === 0) {
  return buildPublicStatusSnapshotFromRequests({
    windowHours: input.windowHours,
    bucketMinutes: input.bucketMinutes,
    now,
    groups: input.groups,
    requests: [],
  });
}
const latestPricesByModel = await findLatestPricesByModels(targetModelIds);
```
Prompt To Fix With AI (Path: `src/lib/public-status/aggregation.ts`, lines 471-477)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1f3016c8e7
```typescript
await updateSystemSettings({
  publicStatusWindowHours: input.publicStatusWindowHours,
  publicStatusAggregationIntervalMinutes: input.publicStatusAggregationIntervalMinutes,
});
```
Validate public status interval values before persistence
This action writes publicStatusWindowHours/publicStatusAggregationIntervalMinutes directly to system_settings without schema validation, so normal UI input like an empty field (Number("") === 0) can persist invalid values. With bucketMinutes=0, the next forced refresh/scheduler run reaches Math.ceil((windowHours * 60) / bucketMinutes) in aggregation and can throw (invalid/infinite bucket sizing), leaving the feature stuck in repeated runtime failures until settings are manually corrected.
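The failure chain described above can be reproduced in isolation; a minimal sketch with illustrative values (not project code):

```typescript
// Sketch of the failure chain: an empty form field coerces to 0, not NaN,
// and a 0 bucket size produces an invalid array length in the aggregator.
const submitted = Number(""); // 0
const windowHours = 24;
const bucketMinutes = submitted; // the persisted 0 reaches the aggregator
const length = Math.ceil((windowHours * 60) / bucketMinutes); // Infinity

let threwRangeError = false;
try {
  Array.from({ length }); // RangeError: Invalid array length
} catch (error) {
  threwRangeError = error instanceof RangeError;
}
console.log(submitted, length, threwRangeError); // 0 Infinity true
```

Coercing and bounding the input with a Zod schema (for example `z.coerce.number().int().min(1)`) rejects the empty-field case before it is ever persisted.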
```typescript
if (nextDescription && nextDescription.length > 500) {
  return {
    ok: false,
    error: "公开状态配置超过 provider_groups.description 的 500 字符限制",
  };
```
Make public-status settings updates atomic on validation failure
The function performs writes before all validations complete: updateSystemSettings runs first, then provider groups are updated one-by-one, and the loop can return early when a later group's serialized description exceeds 500 chars. In that case the API returns an error but earlier writes are already committed, producing a partially-applied configuration (some groups/settings changed, others not), which is hard to recover from and inconsistent with the failed response.
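The validate-first-then-write discipline can be illustrated with a self-contained sketch; the Map store and the 10-character limit are stand-ins for provider_groups and its 500-character description limit, and all names are hypothetical:

```typescript
// Writing while validating leaves earlier writes committed when a later group
// fails; validating everything up front keeps the store untouched on failure.
type Group = { name: string; serialized: string };

const LIMIT = 10; // stand-in for the 500-char description limit

function saveIncrementally(store: Map<string, string>, groups: Group[]): boolean {
  for (const group of groups) {
    if (group.serialized.length > LIMIT) return false; // fails here...
    store.set(group.name, group.serialized); // ...but prior writes stay committed
  }
  return true;
}

function saveAtomically(store: Map<string, string>, groups: Group[]): boolean {
  if (groups.some((group) => group.serialized.length > LIMIT)) return false; // validate all first
  for (const group of groups) store.set(group.name, group.serialized);
  return true;
}

const groups: Group[] = [
  { name: "a", serialized: "short" },
  { name: "b", serialized: "way-too-long-config" },
];

const partial = new Map<string, string>();
const atomic = new Map<string, string>();
console.log(saveIncrementally(partial, groups), partial.size); // false 1 ("a" leaked)
console.log(saveAtomically(atomic, groups), atomic.size); // false 0 (nothing written)
```

In the real action the same effect comes from validating every serialized description before any write and wrapping the writes in a single database transaction.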
```typescript
  input: AggregatePublicStatusSnapshotInput
): Promise<PublicStatusSnapshotPayload> {
  const now = new Date();
  const windowStart = new Date(now.getTime() - input.windowHours * 60 * 60 * 1000);
```
Query full aligned time window for status aggregation
Aggregation fetches rows from now - windowHours, but snapshot construction later aligns windowEnd down to the bucket boundary. When now is not exactly on a boundary, the built timeline starts earlier than the queried range, so the oldest bucket is missing data (up to nearly one full bucket), which skews availability/TTFB/TPS for that segment on every run.
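The mismatch can be seen numerically. This sketch uses assumed values (30-minute buckets, a 1-hour window) and a hand-rolled floor alignment in place of the project's alignWindowEnd:

```typescript
// The naive query start (now - windowHours) is later than the aligned
// timeline start, so data in the gap never reaches the oldest bucket.
const bucketMinutes = 30;
const windowHours = 1;
const bucketMs = bucketMinutes * 60 * 1000;

const now = new Date("2024-01-01T10:17:00Z");
const alignedEnd = new Date(Math.floor(now.getTime() / bucketMs) * bucketMs); // 10:00
const bucketCount = Math.ceil((windowHours * 60) / bucketMinutes); // 2 buckets

const naiveStart = new Date(now.getTime() - windowHours * 60 * 60 * 1000);
const alignedStart = new Date(alignedEnd.getTime() - bucketCount * bucketMs);

console.log(naiveStart.toISOString()); // 2024-01-01T09:17:00.000Z
console.log(alignedStart.toISOString()); // 2024-01-01T09:00:00.000Z
// Requests between 09:00 and 09:17 belong to the oldest bucket but are never queried.
```

Querying from the aligned start (or passing the aligned "now" into both the query and the snapshot builder) removes the skew.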
Actionable comments posted: 14
🧹 Nitpick comments (5)
src/repository/public-status-snapshot.ts (1)

40-47: Consider setting a TTL on the snapshot key (optional)

`redis.set` is currently called without an expiry. Under normal flow the scheduler refreshes the snapshot periodically or `clearPublicStatusSnapshot()` is invoked explicitly, but if the scheduler stops on an exception, the feature is disabled without a clear being triggered, or Redis retains data from another deployment, a stale snapshot can linger indefinitely and keep being served by the public API. Consider adding an `EX` slightly larger than `aggregationIntervalMinutes` as a safety net.

♻️ Suggested change

```diff
- await redis.set(SNAPSHOT_CACHE_KEY, JSON.stringify(record));
+ // TTL as a fallback; renewed on normal scheduler refreshes, expires automatically if the scheduler stops
+ await redis.set(SNAPSHOT_CACHE_KEY, JSON.stringify(record), "EX", 60 * 60 * 24);
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/repository/public-status-snapshot.ts` around lines 40 - 47: the snapshot write currently stores JSON at SNAPSHOT_CACHE_KEY without an expiration; update writeSnapshotToRedis to set an EX TTL slightly larger than aggregationIntervalMinutes (e.g., aggregationIntervalMinutes * 60 + buffer) so stale snapshots are auto-evicted if the scheduler stops; use the Redis client's set with an expiration flag (or call expire on SNAPSHOT_CACHE_KEY) and reference the existing getRedisClient and SNAPSHOT_CACHE_KEY in your change to implement the TTL.

tests/unit/actions/system-config-save.test.ts (1)
273-287: New public-status scheduling field cases: LGTM

The tests cover the path where `publicStatusWindowHours` / `publicStatusAggregationIntervalMinutes` pass `saveSystemSettings` validation and are forwarded to `updateSystemSettings`, with assertions at an appropriate granularity.

Optional suggestion: add a negative case verifying that out-of-range values (for example `publicStatusWindowHours: 0`, an extremely large value, or `publicStatusAggregationIntervalMinutes: 0`) are rejected by the Zod schema in `src/lib/validation/schemas.ts`, so the boundary constraints cannot be silently loosened by a later refactor.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/actions/system-config-save.test.ts` around lines 273 - 287: add a negative test that asserts saveSystemSettings rejects out-of-bound values by calling it with invalid inputs (e.g., publicStatusWindowHours: 0 and publicStatusAggregationIntervalMinutes: 0) and expecting a thrown validation error (or result.ok === false), and/or that updateSystemSettingsMock is not called; refer to the existing positive test using saveSystemSettings and updateSystemSettingsMock and validate against the Zod schema in src/lib/validation/schemas.ts to ensure those constraints remain enforced.

src/app/[locale]/settings/status-page/page.tsx (1)
15-16: The two independent data loads can run in parallel.

`getSystemSettings()` and `findAllProviderGroups()` do not depend on each other; awaiting them with `Promise.all` reduces serial I/O latency during page SSR.

♻️ Suggested refactor

```diff
- const settings = await getSystemSettings();
- const groups = await findAllProviderGroups();
+ const [settings, groups] = await Promise.all([
+   getSystemSettings(),
+   findAllProviderGroups(),
+ ]);
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/app/[locale]/settings/status-page/page.tsx` around lines 15 - 16: the code currently awaits getSystemSettings() and findAllProviderGroups() serially during SSR, causing unnecessary waiting; parallelize the two independent async calls via Promise.all([getSystemSettings(), findAllProviderGroups()]) and destructure the results into settings and groups (replacing the two separate awaits), reducing I/O latency while keeping the variable names (settings, groups) unchanged.

tests/unit/lib/public-status-scheduler.test.ts (1)
81-86: Consider `vi.waitFor` instead of manual microtask polling, for test stability.

The test currently relies on a fixed loop of 10 `await Promise.resolve()` iterations to drive the microtask queue through `runCycle`'s async chain. If `runCycle` later gains an extra `await` or two (for example an additional Redis call or a cleanup step), the iteration count may no longer suffice and the assertion will fail while `started === true`. `vi.waitFor` makes the wait condition explicit and decouples the test from the implementation's microtask depth.

♻️ Suggested refactor

```diff
- for (let index = 0; index < 10; index++) {
-   await Promise.resolve();
-   if (!getPublicStatusSchedulerStatus().started) {
-     break;
-   }
- }
+ await vi.waitFor(() => {
+   expect(getPublicStatusSchedulerStatus().started).toBe(false);
+ });
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/lib/public-status-scheduler.test.ts` around lines 81 - 86: replace the manual microtask polling loop with vi.waitFor so the test waits for the scheduler state explicitly: remove the for-loop that does repeated await Promise.resolve() and instead call vi.waitFor(() => !getPublicStatusSchedulerStatus().started) (or the appropriate boolean expectation used later in assertions) so the test waits until getPublicStatusSchedulerStatus().started reaches the expected value; this avoids coupling to runCycle's internal await depth and makes the test robust to added async steps.

src/lib/public-status/aggregation.ts (1)
564-577: Repeated `latestPricesByModel.get(...)` calls; consider extracting a local variable

The same `model.modelId` is looked up four times across three lines, which hurts readability and makes the consistency of the `priceData.display_name` reads harder to reason about. Fetch the value once and cache it.

Suggestion: extract a local variable

```diff
- models: group.models.map((model) => ({
-   ...model,
-   displayName:
-     typeof latestPricesByModel.get(model.modelId)?.priceData.display_name === "string" &&
-     String(latestPricesByModel.get(model.modelId)?.priceData.display_name).trim().length > 0
-       ? String(latestPricesByModel.get(model.modelId)?.priceData.display_name).trim()
-       : model.displayName,
- })),
+ models: group.models.map((model) => {
+   const priceDisplayName = latestPricesByModel.get(model.modelId)?.priceData.display_name;
+   const normalized =
+     typeof priceDisplayName === "string" ? priceDisplayName.trim() : "";
+   return {
+     ...model,
+     displayName: normalized.length > 0 ? normalized : model.displayName,
+   };
+ }),
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/lib/public-status/aggregation.ts` around lines 564 - 577: inside the models mapping block (snapshot.groups.map -> group.models.map) avoid calling latestPricesByModel.get(model.modelId) repeatedly: fetch it once into a local const (e.g., const latest = latestPricesByModel.get(model.modelId)), then compute the candidate display string from latest?.priceData.display_name (check typeof === "string" and trim length) and fall back to model.displayName; replace the repeated get(...) uses with that local variable to improve readability and consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/actions/public-status.ts`:
- Around line 79-85: The current gate uses hasConfiguredTargets computed from
raw input.groups which can be truthy for entries like [" "] that sanitize to
empty; change the logic to determine scheduling from the sanitized/enabled
groups or, after calling refreshPublicStatusSnapshot({ force: true }), inspect
its return value and only call startPublicStatusScheduler() when the snapshot
indicates enabled/hasTargets; update code paths around hasConfiguredTargets,
input.groups, refreshPublicStatusSnapshot, startPublicStatusScheduler (and keep
the cache invalidations invalidateSystemSettingsCache and
invalidateConfiguredPublicStatusGroupsCache) so the scheduler is not started
when the sanitized config is effectively disabled.
- Around line 30-77: The savePublicStatusSettings action performs writes
(updateSystemSettings and updateProviderGroup) incrementally and can leave
partial commits if validation fails; change it to validate numeric ranges
(publicStatusWindowHours, publicStatusAggregationIntervalMinutes), preload all
provider groups with findAllProviderGroups, compute every nextDescription using
parsePublicStatusDescription and serializePublicStatusDescription and validate
each length <= 500 before any DB writes, then perform updateSystemSettings and
all updateProviderGroup calls inside a single db.transaction() so the whole
operation is atomic; keep references to savePublicStatusSettings,
updateSystemSettings, findAllProviderGroups, parsePublicStatusDescription,
serializePublicStatusDescription, updateProviderGroup and use the existing
db.transaction() pattern in the codebase.
In `@src/app/[locale]/settings/status-page/_components/public-status-settings-form.tsx`:
- Around line 160-166: The Checkbox and the related text inputs (the display
name input and the model IDs textbox) lack accessible labels; update the
Checkbox (the instance calling updateGroup) and the two inputs so each has a
stable id and is associated with a Label (use <Label htmlFor={id}>) or, for the
Checkbox, wrap it in a label or apply a descriptive aria-label; ensure ids are
derived from group.groupName (sanitized) so they remain unique/stable and use
those ids in htmlFor for the display name and model IDs inputs so screen readers
announce the proper control names.
In `@src/app/[locale]/status/_components/public-status-timeline.tsx`:
- Around line 127-129: The timeline is visually reversed because the container
uses "flex-row-reverse", which flips the padded.map output (where tests expect
oldest→newest and use timeline.at(-1) for latest); remove "flex-row-reverse"
from the div with className "flex h-full w-full flex-row-reverse gap-[2px]
p-[2px]" so the mapped segments render left-to-right (old→new) matching
padded.map and timeline.at(-1); apply the same removal for the other instance
around lines 211-213 where the reverse layout is also used.
- Around line 150-162: The empty segment buttons inside HoverCardTrigger lack
accessible names; update the button in public-status-timeline.tsx (the button
that calls setActiveSegmentKey and uses getBucketColor, segment, segmentKey, and
labels) to include a localized aria-label (and/or title) that combines the
appropriate labels text with the bucket's time and state (e.g., using
labels[...] plus segment or bucket time/state values) so keyboard and
screen-reader users can identify each time segment.
- Around line 38-69: compressTimeline currently computes chunkSize via
Math.ceil(items.length / SEGMENT_LIMIT) which groups items two-by-two when
items.length is just over SEGMENT_LIMIT and yields far fewer than SEGMENT_LIMIT
segments, causing many padded empty segments; change the bucketing to produce up
to SEGMENT_LIMIT real aggregated segments by distributing items into
SEGMENT_LIMIT buckets: when items.length > SEGMENT_LIMIT compute baseSize =
Math.floor(items.length / SEGMENT_LIMIT) and remainder = items.length %
SEGMENT_LIMIT, then create the first `remainder` chunks of size baseSize+1 and
the remaining chunks of size baseSize (iterating through `items` and slicing
accordingly), and for each chunk keep the existing aggregation logic (use `last`
for base fields, compute sampleCount by summing `item.sampleCount`, determine
`state` with the reverse-find of non-"no_data", and set `availabilityPct` using
the same sampleCount/last.availabilityPct rules); keep the early return when
items.length <= SEGMENT_LIMIT.
In `@src/app/[locale]/status/page.tsx`:
- Around line 108-110: The timestamp is rendered with new
Date(snapshot.generatedAt).toLocaleString(), which uses the runtime default
locale; replace this with a locale-aware formatted string using getLocale() and
date-fns-tz (e.g., formatInTimeZone or format) so the generatedAt time is
formatted according to the current route locale. In the page component (where
hasFailedModel and snapshot.generatedAt are used) call getLocale() to obtain the
current locale, then use date-fns-tz to format snapshot.generatedAt with that
locale (and appropriate timezone) and render that string instead of
toLocaleString().
In `@src/lib/public-status/aggregation.ts`:
- Around line 471-510: The DB query in aggregatePublicStatusSnapshot uses
windowStart = now - windowHours*3600*1000 which can be misaligned with the
bucket-aligned window used by buildPublicStatusSnapshotFromRequests (which uses
alignWindowEnd(now, bucketMinutes) and windowStartMs = alignedEnd -
bucketCount*bucketMs), causing the leftmost bucket to be partially omitted; fix
by computing the same aligned windowEnd and windowStart used by
buildPublicStatusSnapshotFromRequests (call alignWindowEnd(now, bucketMinutes)
to get alignedEnd, compute windowStart = new Date(alignedEnd.getTime() -
bucketCount * bucketMs)) and use that windowStart in the DB where
gte(messageRequest.createdAt, windowStart), or alternatively pass the alignedEnd
(aligned now) into both aggregatePublicStatusSnapshot and
buildPublicStatusSnapshotFromRequests so both use identical boundaries (refer to
aggregatePublicStatusSnapshot, buildPublicStatusSnapshotFromRequests,
alignWindowEnd, windowStart/windowStartMs, bucketMinutes, bucketCount,
bucketMs).
- Around line 142-170: applyBoundedGapFill currently ignores bucketMinutes and
unconditionally fills any null gap between lastKnownIndex and index; compute a
real fill limit and only fill gaps whose size <= maxFillBuckets. Specifically,
in applyBoundedGapFill use bucketMinutes to derive maxFillBuckets (e.g.,
allowedWindowMinutes / bucketMinutes or a configured maxBucketCount), compare
gapBuckets to maxFillBuckets before the inner loop that assigns
result[fillIndex] = previous, and leave larger gaps as null; update variable
names (gapBuckets, maxFillBuckets, previous, lastKnownIndex, result, fillIndex)
accordingly so only bounded gaps are forward-filled.
In `@src/lib/public-status/config.ts`:
- Around line 108-117: In collectEnabledPublicStatusGroups, sanitize each
group's modelIds before filtering instead of checking the raw array: call
sanitizeModelIds(group.publicStatus?.modelIds) into a local variable, check its
length > 0 to decide whether to keep the group, and use that sanitized list in
the mapped EnabledPublicStatusGroup (set modelIds to the sanitized list and keep
displayName logic using group.publicStatus?.displayName?.trim() ||
group.groupName). This prevents inputs like [" "] from producing empty
modelIds that bypass hasConfiguredPublicStatusTargets().
In `@src/lib/public-status/scheduler.ts`:
- Around line 75-88: The current onLost handler only logs but
refreshPublicStatusSnapshot() continues and may overwrite a newer snapshot;
update the leader-lock logic so that when startLeaderLockKeepAlive (the stop
function returned) signals loss (onLost) the current refresh cycle is aborted,
or add a re-check inside the save path to verify
schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_LOCK__ still matches the lock
before writing; specifically modify startLeaderLockKeepAlive/onLost handling
and/or refreshPublicStatusSnapshot() save path to early-return if the lock is no
longer held (compare schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_LOCK__ to the
original lock id) to prevent writes after lock loss.
- Around line 152-168: stopPublicStatusScheduler currently releases the leader
lock and flips __CCH_PUBLIC_STATUS_SCHEDULER_RUNNING__ to false immediately,
which can allow a new startPublicStatusScheduler to race with an in-flight
runCycle; instead only set __CCH_PUBLIC_STATUS_SCHEDULER_STOP_REQUESTED__ = true
and clear the interval (__CCH_PUBLIC_STATUS_SCHEDULER_INTERVAL_ID__), but do not
release the lock or set RUNNING to false there—let runCycle’s finally block
perform lock release and RUNNING cleanup (or alternatively track and await a
dedicated current promise like __CCH_PUBLIC_STATUS_SCHEDULER_CURRENT_PROMISE__
before cleaning up); update stopPublicStatusScheduler to remove
releaseLeaderLock(currentLock) and avoid touching RUNNING/LOCK fields so the
existing runCycle and its finally can safely finish.
In `@src/repository/public-status-snapshot.ts`:
- Around line 23-32: The current validation only checks parsed.payload
truthiness which is too weak; update the validation in the code handling
PublicStatusSnapshotRecord (the parsed variable and existing check that deletes
SNAPSHOT_CACHE_KEY and returns null) to also verify payload.generatedAt is a
string and payload.groups is an array (e.g.,
Array.isArray(parsed.payload.groups)); if either check fails, call
redis.del(SNAPSHOT_CACHE_KEY) and return null so incompatible/corrupt Redis
entries never get passed to NextResponse.json in route.ts or to page.tsx reading
payload.groups. Optionally, extend checks to ensure each group has the minimal
expected fields before accepting.
In `@src/repository/system-config.ts`:
- Around line 218-219: The current downgrade selections still include the new
columns publicStatusWindowHours and publicStatusAggregationIntervalMinutes
causing read failures when migration 0091 isn't applied; update
selectSettingsRow's selectionWithoutHighConcurrencyMode and
selectionWithoutCodexAndHighConcurrency to exclude those two new columns (or add
a new intermediate downgrade selection that only removes
publicStatusWindowHours/publicStatusAggregationIntervalMinutes) so the
first-level and second-level SELECTs won't throw 42703; also ensure the updates
handling that strips new columns (the updates removal logic around
publicStatusWindowHours/publicStatusAggregationIntervalMinutes) stays consistent
with toSystemSettings and fullSelection/minimalSelection so missing columns are
filled by toSystemSettings defaults rather than causing fallback to
minimalSelection.
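The even-distribution bucketing described in the public-status-timeline item above can be sketched as a standalone helper; the per-chunk aggregation (sampleCount sums, state selection) is omitted here, and only the chunk-size math is shown:

```typescript
// Distribute items into at most `limit` chunks whose sizes differ by at most
// one, instead of Math.ceil-based chunking that can yield far fewer chunks.
function chunkEvenly<T>(items: T[], limit: number): T[][] {
  if (items.length <= limit) return items.map((item) => [item]);
  const baseSize = Math.floor(items.length / limit);
  const remainder = items.length % limit;
  const chunks: T[][] = [];
  let offset = 0;
  for (let i = 0; i < limit; i++) {
    const size = i < remainder ? baseSize + 1 : baseSize; // first `remainder` chunks get one extra
    chunks.push(items.slice(offset, offset + size));
    offset += size;
  }
  return chunks;
}

const items = Array.from({ length: 50 }, (_, i) => i);
const chunks = chunkEvenly(items, 48);
console.log(chunks.length); // 48
// Math.ceil-based chunking would use chunkSize = Math.ceil(50 / 48) = 2,
// producing only 25 real segments and forcing 23 padded empty ones.
console.log(Math.ceil(items.length / Math.ceil(items.length / 48))); // 25
```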
---
Nitpick comments:
In `@src/app/`[locale]/settings/status-page/page.tsx:
- Around line 15-16: 当前代码在 SSR 中串行调用 getSystemSettings() 与
findAllProviderGroups() 导致不必要的等待;将这两个独立的异步调用并行化:在 page.tsx 中用
Promise.all([getSystemSettings(), findAllProviderGroups()]) 并解构结果赋值给 settings 和
groups(替换原先的两次 await 调用),以减少 I/O 延迟并保持变量名(settings, groups)不变。
In `@src/lib/public-status/aggregation.ts`:
- Around line 564-577: Inside the models mapping block (snapshot.groups.map ->
group.models.map) avoid calling latestPricesByModel.get(model.modelId)
repeatedly: fetch it once into a local const (e.g., const latest =
latestPricesByModel.get(model.modelId)), then compute the candidate display
string from latest?.priceData.display_name (check typeof === "string" and trim
length) and fall back to model.displayName; replace the three repeated get(...)
uses with that local variable to improve readability and consistency.
In `@src/repository/public-status-snapshot.ts`:
- Around line 40-47: The snapshot write currently stores JSON at
SNAPSHOT_CACHE_KEY without an expiration; update writeSnapshotToRedis to set an
EX TTL slightly larger than aggregationIntervalMinutes (e.g.,
aggregationIntervalMinutes * 60 + buffer) so stale snapshots are auto-evicted if
the scheduler stops; use the Redis client's set with an expiration flag (or call
expire on SNAPSHOT_CACHE_KEY) and reference the existing getRedisClient and
SNAPSHOT_CACHE_KEY in your change to implement the TTL.
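The TTL arithmetic recommended above can be isolated in a small pure helper. A minimal sketch, assuming a hypothetical helper name (`snapshotTtlSeconds`) and an ioredis-style `set(key, value, "EX", ttl)` call; neither is the repository's confirmed API:

```typescript
// Hypothetical helper: derive a safety-net TTL from the aggregation interval.
// The snapshot is rewritten roughly every `intervalMinutes`, so anything older
// than a few intervals is stale and safe to auto-evict.
function snapshotTtlSeconds(intervalMinutes: number, bufferIntervals = 2): number {
  // Clamp non-positive or non-finite input to one minute before scaling.
  const minutes = Number.isFinite(intervalMinutes)
    ? Math.max(1, Math.trunc(intervalMinutes))
    : 1;
  return minutes * 60 * (1 + bufferIntervals);
}

// Sketch of the write path (redis client and key name are illustrative):
// await redis.set(SNAPSHOT_CACHE_KEY, JSON.stringify(record), "EX",
//   snapshotTtlSeconds(aggregationIntervalMinutes));
```

With a two-interval buffer, a 5-minute aggregation interval yields a 15-minute TTL, so the key survives a couple of missed cycles but is evicted if the scheduler stops.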
In `@tests/unit/actions/system-config-save.test.ts`:
- Around line 273-287: Add a negative test that asserts saveSystemSettings
rejects out-of-bound values by calling it with invalid inputs (e.g.,
publicStatusWindowHours: 0 and publicStatusAggregationIntervalMinutes: 0) and
expecting a thrown validation error (or result.ok === false), and/or that
updateSystemSettingsMock is not called; refer to the existing positive test
using saveSystemSettings and updateSystemSettingsMock and validate against the
Zod schema in src/lib/validation/schemas.ts to ensure those constraints remain
enforced.
In `@tests/unit/lib/public-status-scheduler.test.ts`:
- Around line 81-86: Replace the manual microtask polling loop with vi.waitFor
so the test waits for the scheduler state explicitly: remove the for-loop that
does repeated await Promise.resolve() and instead call vi.waitFor(() =>
!getPublicStatusSchedulerStatus().started) (or the appropriate boolean
expectation used later in assertions) so the test waits until
getPublicStatusSchedulerStatus().started reaches the expected value; this avoids
coupling to runCycle's internal await depth and makes the test robust to added
async steps.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 267f3acc-84ab-424b-ad2f-6126e919e7c7
📒 Files selected for processing (50)
- drizzle/0091_nervous_kronos.sql
- drizzle/meta/0091_snapshot.json
- drizzle/meta/_journal.json
- messages/en/settings/index.ts
- messages/en/settings/nav.json
- messages/en/settings/statusPage.json
- messages/ja/settings/index.ts
- messages/ja/settings/nav.json
- messages/ja/settings/statusPage.json
- messages/ru/settings/index.ts
- messages/ru/settings/nav.json
- messages/ru/settings/statusPage.json
- messages/zh-CN/settings/index.ts
- messages/zh-CN/settings/nav.json
- messages/zh-CN/settings/statusPage.json
- messages/zh-TW/settings/index.ts
- messages/zh-TW/settings/nav.json
- messages/zh-TW/settings/statusPage.json
- package.json
- src/actions/public-status.ts
- src/actions/system-config.ts
- src/app/[locale]/settings/_components/settings-nav.tsx
- src/app/[locale]/settings/_components/settings-page-header.tsx
- src/app/[locale]/settings/_lib/nav-items.ts
- src/app/[locale]/settings/status-page/_components/public-status-settings-form.tsx
- src/app/[locale]/settings/status-page/page.tsx
- src/app/[locale]/status/_components/public-status-timeline.tsx
- src/app/[locale]/status/page.tsx
- src/app/api/public-status/route.ts
- src/components/ui/hover-card.tsx
- src/drizzle/schema.ts
- src/instrumentation.ts
- src/lib/config/system-settings-cache.ts
- src/lib/public-status/aggregation.ts
- src/lib/public-status/config.ts
- src/lib/public-status/scheduler.ts
- src/lib/public-status/service.ts
- src/lib/validation/schemas.ts
- src/proxy.ts
- src/repository/_shared/transformers.ts
- src/repository/public-status-snapshot.ts
- src/repository/system-config.ts
- src/types/system-config.ts
- tests/unit/actions/public-status-save.test.ts
- tests/unit/actions/system-config-save.test.ts
- tests/unit/api/public-status-route.test.ts
- tests/unit/lib/public-status-aggregation.test.ts
- tests/unit/lib/public-status-config.test.ts
- tests/unit/lib/public-status-scheduler.test.ts
- tests/unit/lib/public-status-service.test.ts
```ts
export async function savePublicStatusSettings(
  input: SavePublicStatusSettingsInput
): Promise<ActionResult<{ updatedGroupCount: number }>> {
  try {
    const session = await getSession();
    if (!session || session.user.role !== "admin") {
      return { ok: false, error: "无权限执行此操作" };
    }

    await updateSystemSettings({
      publicStatusWindowHours: input.publicStatusWindowHours,
      publicStatusAggregationIntervalMinutes: input.publicStatusAggregationIntervalMinutes,
    });

    const allGroups = await findAllProviderGroups();
    const enabledByName = new Map(input.groups.map((group) => [group.groupName, group]));

    let updatedGroupCount = 0;

    for (const group of allGroups) {
      const existing = parsePublicStatusDescription(group.description);
      const configured = enabledByName.get(group.name);
      const nextDescription = serializePublicStatusDescription({
        note: existing.note,
        publicStatus: configured
          ? {
              displayName: configured.displayName,
              modelIds: configured.modelIds,
            }
          : null,
      });

      if (nextDescription && nextDescription.length > 500) {
        return {
          ok: false,
          error: "公开状态配置超过 provider_groups.description 的 500 字符限制",
        };
      }

      if ((group.description ?? null) === nextDescription) {
        continue;
      }

      await updateProviderGroup(group.id, {
        description: nextDescription,
      });
      updatedGroupCount++;
    }
```
🧩 Analysis chain
🏁 Script executed:

```shell
# First, let's examine the file and understand the full context
cat -n src/actions/public-status.ts | head -100
```

Repository: ding113/claude-code-hub
Length of output: 4127

🏁 Script executed:

```shell
# Check the SavePublicStatusSettingsInput type definition
rg "SavePublicStatusSettingsInput" -A 10 --type ts
```

Repository: ding113/claude-code-hub
Length of output: 4090

🏁 Script executed:

```shell
# Look for patterns of input validation in other Server Actions
fd -e ts -path "*/actions/*" | head -5
```

Repository: ding113/claude-code-hub
Length of output: 235

🏁 Script executed:

```shell
# Check if there's a database transaction/atomicity pattern in the codebase
rg "transaction|db\.|prisma\." src/actions/ --type ts -B 2 -A 2 | head -50
```

Repository: ding113/claude-code-hub
Length of output: 2592

🏁 Script executed:

```shell
# Verify the transaction pattern more thoroughly to confirm availability
rg "db\.transaction" src/ --type ts | head -3
```

Repository: ding113/claude-code-hub
Length of output: 370

🏁 Script executed:

```shell
# Check EnabledPublicStatusGroup interface to understand data structure
rg "EnabledPublicStatusGroup" -A 5 --type ts | head -20
```

Repository: ding113/claude-code-hub
Length of output: 1461
Complete all validation and description precomputation before any database write.
This Server Action can be invoked directly by clients, and SavePublicStatusSettingsInput has no runtime validation. There is also an atomicity problem: line 39 commits the system-settings update immediately, then lines 49-77 loop over the provider groups and update them one by one. If any group in the loop exceeds 500 characters (line 62) or an update fails, the user receives an error, but the system settings and the already-updated groups have been permanently committed, leaving the data in an inconsistent state.
Suggestions:
- Validate the numeric ranges first (publicStatusWindowHours, publicStatusAggregationIntervalMinutes)
- Precompute every nextDescription and check the length limit, ensuring everything is valid before writing
- Put the system-settings update and all group updates into a single transaction (the codebase already has a db.transaction() pattern to follow)
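The validate-then-commit split can be sketched as a pure planning step that runs before any write; `planGroupUpdates`, `GroupRow`, and the injected serializer are hypothetical stand-ins for the action's real types and for `serializePublicStatusDescription`:

```typescript
type GroupRow = { id: number; name: string; description: string | null };
type PlannedUpdate = { id: number; description: string | null };

// Stand-in for serializePublicStatusDescription.
type Serialize = (row: GroupRow) => string | null;

// Phase 1: compute every nextDescription and validate lengths up front.
// Only if all groups pass does the caller open a transaction and write.
function planGroupUpdates(
  groups: GroupRow[],
  serialize: Serialize,
  maxLen = 500
): { ok: true; updates: PlannedUpdate[] } | { ok: false; error: string } {
  const updates: PlannedUpdate[] = [];
  for (const group of groups) {
    const next = serialize(group);
    if (next && next.length > maxLen) {
      return { ok: false, error: `group ${group.name} exceeds ${maxLen} chars` };
    }
    // Skip no-op writes, mirroring the existing equality check.
    if ((group.description ?? null) !== next) {
      updates.push({ id: group.id, description: next });
    }
  }
  return { ok: true, updates };
}
```

The caller would then open a single `db.transaction()` and apply `updateSystemSettings` plus every planned group update inside it, so a failed plan commits nothing.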
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/actions/public-status.ts` around lines 30 - 77, The
savePublicStatusSettings action performs writes (updateSystemSettings and
updateProviderGroup) incrementally and can leave partial commits if validation
fails; change it to validate numeric ranges (publicStatusWindowHours,
publicStatusAggregationIntervalMinutes), preload all provider groups with
findAllProviderGroups, compute every nextDescription using
parsePublicStatusDescription and serializePublicStatusDescription and validate
each length <= 500 before any DB writes, then perform updateSystemSettings and
all updateProviderGroup calls inside a single db.transaction() so the whole
operation is atomic; keep references to savePublicStatusSettings,
updateSystemSettings, findAllProviderGroups, parsePublicStatusDescription,
serializePublicStatusDescription, updateProviderGroup and use the existing
db.transaction() pattern in the codebase.
```tsx
<Checkbox
  checked={group.enabled}
  onCheckedChange={(checked) =>
    updateGroup(group.groupName, { enabled: checked === true })
  }
/>
<span>{group.groupName}</span>
```
Add accessible names to the form controls.
The Checkbox, the display-name input, and the model-IDs textarea here all lack an explicit label association; visible text does not automatically become a control's accessible name. Give each control a stable id and associate it via Label htmlFor, or add an aria-label / wrapping label to the Checkbox.
Also applies to: 176-196
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@src/app/[locale]/settings/status-page/_components/public-status-settings-form.tsx`
around lines 160 - 166, The Checkbox and the related text inputs (the display
name input and the model IDs textbox) lack accessible labels; update the
Checkbox (the instance calling updateGroup) and the two inputs so each has a
stable id and is associated with a Label (use <Label htmlFor={id}>) or, for the
Checkbox, wrap it in a label or apply a descriptive aria-label; ensure ids are
derived from group.groupName (sanitized) so they remain unique/stable and use
those ids in htmlFor for the display name and model IDs inputs so screen readers
announce the proper control names.
```ts
function compressTimeline(items: PublicStatusTimelineBucket[]): PublicStatusTimelineBucket[] {
  if (items.length <= SEGMENT_LIMIT) {
    return items;
  }

  const chunkSize = Math.ceil(items.length / SEGMENT_LIMIT);
  const result: PublicStatusTimelineBucket[] = [];

  for (let index = 0; index < items.length; index += chunkSize) {
    const chunk = items.slice(index, index + chunkSize);
    const last = chunk.at(-1);
    if (!last) continue;

    const sampleCount = chunk.reduce((sum, item) => sum + item.sampleCount, 0);
    const state = [...chunk].reverse().find((item) => item.state !== "no_data")?.state ?? "no_data";

    result.push({
      ...last,
      state,
      sampleCount,
      availabilityPct:
        sampleCount === 0
          ? state === "operational"
            ? 100
            : state === "failed"
              ? 0
              : null
          : last.availabilityPct,
    });
  }

  return result;
```
The compression algorithm renders valid data as blank placeholders.
When the bucket count just exceeds 60 (e.g. 61), ceil(length / 60) compresses in groups of two and produces only ~31 segments; the remaining 29 are padded and rendered blank, so users mistakenly assume a large amount of data is missing. Bucket by the target segment count instead, emitting up to 60 real aggregated segments.
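The remainder-spread bucketing described above can be sketched as follows; `chunkEvenly` is a hypothetical helper, not the component's current code:

```typescript
// Distribute `items` into at most `limit` chunks whose sizes differ by at most
// one, so 61 buckets yield 60 rendered segments (one of size 2, fifty-nine of
// size 1) instead of ~31 plus blank padding.
function chunkEvenly<T>(items: T[], limit: number): T[][] {
  if (items.length <= limit) return items.map((item) => [item]);
  const baseSize = Math.floor(items.length / limit);
  const remainder = items.length % limit;
  const chunks: T[][] = [];
  let cursor = 0;
  for (let i = 0; i < limit; i++) {
    // The first `remainder` chunks absorb one extra item each.
    const size = baseSize + (i < remainder ? 1 : 0);
    chunks.push(items.slice(cursor, cursor + size));
    cursor += size;
  }
  return chunks;
}
```

Each chunk would then go through the existing aggregation (take the last bucket's fields, sum `sampleCount`, reverse-find the first non-`no_data` state), yielding real segments for the whole window.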
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/app/[locale]/status/_components/public-status-timeline.tsx` around lines
38 - 69, compressTimeline currently computes chunkSize via
Math.ceil(items.length / SEGMENT_LIMIT) which groups items two-by-two when
items.length is just over SEGMENT_LIMIT and yields far fewer than SEGMENT_LIMIT
segments, causing many padded empty segments; change the bucketing to produce up
to SEGMENT_LIMIT real aggregated segments by distributing items into
SEGMENT_LIMIT buckets: when items.length > SEGMENT_LIMIT compute baseSize =
Math.floor(items.length / SEGMENT_LIMIT) and remainder = items.length %
SEGMENT_LIMIT, then create the first `remainder` chunks of size baseSize+1 and
the remaining chunks of size baseSize (iterating through `items` and slicing
accordingly), and for each chunk keep the existing aggregation logic (use `last`
for base fields, compute sampleCount by summing `item.sampleCount`, determine
`state` with the reverse-find of non-"no_data", and set `availabilityPct` using
the same sampleCount/last.availabilityPct rules); keep the early return when
items.length <= SEGMENT_LIMIT.
```tsx
<div className="relative h-8 w-full overflow-hidden rounded-sm bg-muted/20">
  <div className="flex h-full w-full flex-row-reverse gap-[2px] p-[2px]">
    {padded.map((segment, index) => {
```
The timeline direction contradicts the Past/Now labels.
The aggregation tests read the latest bucket with timeline.at(-1), which means the array is ordered from past to present; but flex-row-reverse puts the first bucket on the right, so the "Past" side actually shows the newest data and the "Now" side the oldest. Remove the reversed layout so the visual order stays old → new.
Suggested change

```diff
- <div className="flex h-full w-full flex-row-reverse gap-[2px] p-[2px]">
+ <div className="flex h-full w-full gap-[2px] p-[2px]">
```

Also applies to: 211-213
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/app/[locale]/status/_components/public-status-timeline.tsx` around lines
127 - 129, The timeline is visually reversed because the container uses
"flex-row-reverse", which flips the padded.map output (where tests expect
oldest→newest and use timeline.at(-1) for latest); remove "flex-row-reverse"
from the div with className "flex h-full w-full flex-row-reverse gap-[2px]
p-[2px]" so the mapped segments render left-to-right (old→new) matching
padded.map and timeline.at(-1); apply the same removal for the other instance
around lines 211-213 where the reverse layout is also used.
```ts
export function collectEnabledPublicStatusGroups(
  groups: PublicStatusConfiguredGroupInput[]
): EnabledPublicStatusGroup[] {
  return groups
    .filter((group) => group.publicStatus && group.publicStatus.modelIds.length > 0)
    .map((group) => ({
      groupName: group.groupName,
      displayName: group.publicStatus?.displayName?.trim() || group.groupName,
      modelIds: sanitizeModelIds(group.publicStatus?.modelIds),
    }));
```
Sanitize modelIds before filtering.
The filter currently checks the length of the unsanitized array; if the caller passes [" "], an enabled group with modelIds: [] is returned, and hasConfiguredPublicStatusTargets() also wrongly reports the feature as configured. Sanitize first, then filter on the sanitized array's length.
Suggested change
```diff
 export function collectEnabledPublicStatusGroups(
   groups: PublicStatusConfiguredGroupInput[]
 ): EnabledPublicStatusGroup[] {
   return groups
-    .filter((group) => group.publicStatus && group.publicStatus.modelIds.length > 0)
-    .map((group) => ({
-      groupName: group.groupName,
-      displayName: group.publicStatus?.displayName?.trim() || group.groupName,
-      modelIds: sanitizeModelIds(group.publicStatus?.modelIds),
-    }));
+    .map((group) => {
+      const modelIds = sanitizeModelIds(group.publicStatus?.modelIds);
+      return {
+        groupName: group.groupName,
+        displayName: group.publicStatus?.displayName?.trim() || group.groupName,
+        modelIds,
+      };
+    })
+    .filter((group) => group.modelIds.length > 0);
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
export function collectEnabledPublicStatusGroups(
  groups: PublicStatusConfiguredGroupInput[]
): EnabledPublicStatusGroup[] {
  return groups
    .map((group) => {
      const modelIds = sanitizeModelIds(group.publicStatus?.modelIds);
      return {
        groupName: group.groupName,
        displayName: group.publicStatus?.displayName?.trim() || group.groupName,
        modelIds,
      };
    })
    .filter((group) => group.modelIds.length > 0);
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/public-status/config.ts` around lines 108 - 117, In
collectEnabledPublicStatusGroups, sanitize each group's modelIds before
filtering instead of checking the raw array: call
sanitizeModelIds(group.publicStatus?.modelIds) into a local variable, check its
length > 0 to decide whether to keep the group, and use that sanitized list in
the mapped EnabledPublicStatusGroup (set modelIds to the sanitized list and keep
displayName logic using group.publicStatus?.displayName?.trim() ||
group.groupName). This prevents inputs like [" "] from producing empty
modelIds that bypass hasConfiguredPublicStatusTargets().
```ts
stopKeepAlive = startLeaderLockKeepAlive({
  getLock: () => schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_LOCK__,
  clearLock: () => {
    schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_LOCK__ = undefined;
  },
  ttlMs: LOCK_TTL_MS,
  logTag: "PublicStatusScheduler",
  onLost: () => {
    logger.warn("[PublicStatusScheduler] Lost leader lock");
  },
}).stop;

const result = await refreshPublicStatusSnapshot();
logger.info("[PublicStatusScheduler] Cycle finished", result);
```
Do not keep writing the snapshot after the lock is lost.
onLost currently only logs, and refreshPublicStatusSnapshot() still aggregates and saves; if the Redis lock renewal fails, another instance may already have taken over and written a newer snapshot, which this instance would then overwrite. Abort the current refresh when the lock is lost, or push a check into the save path that re-confirms the same lock is still held after aggregation, before saving.
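A fencing check on the save path might look like this sketch; `makeFencedSave` and `LockState` are illustrative names, not the scheduler's actual API:

```typescript
type LockState = { lockId?: string };

// Capture the lock id when the cycle starts and re-check it right before
// persisting; the keep-alive's clearLock invalidates the fence automatically.
function makeFencedSave(state: LockState, save: () => void) {
  const acquiredId = state.lockId;
  return function saveIfStillLeader(): boolean {
    if (!acquiredId || state.lockId !== acquiredId) {
      return false; // lock lost or replaced: skip the snapshot write
    }
    save();
    return true;
  };
}
```

runCycle would build the fenced save right after acquiring the lock, so a lock lost mid-aggregation turns the final write into a no-op instead of an overwrite.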
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/public-status/scheduler.ts` around lines 75 - 88, The current onLost
handler only logs but refreshPublicStatusSnapshot() continues and may overwrite
a newer snapshot; update the leader-lock logic so that when
startLeaderLockKeepAlive (the stop function returned) signals loss (onLost) the
current refresh cycle is aborted, or add a re-check inside the save path to
verify schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_LOCK__ still matches the
lock before writing; specifically modify startLeaderLockKeepAlive/onLost
handling and/or refreshPublicStatusSnapshot() save path to early-return if the
lock is no longer held (compare
schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_LOCK__ to the original lock id) to
prevent writes after lock loss.
```ts
export async function stopPublicStatusScheduler(): Promise<void> {
  schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_STOP_REQUESTED__ = true;

  const intervalId = schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_INTERVAL_ID__;
  if (intervalId) {
    clearInterval(intervalId);
  }

  const currentLock = schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_LOCK__;
  if (currentLock) {
    await releaseLeaderLock(currentLock);
  }

  schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_STARTED__ = false;
  schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_INTERVAL_ID__ = undefined;
  schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_RUNNING__ = false;
  schedulerState.__CCH_PUBLIC_STATUS_SCHEDULER_LOCK__ = undefined;
```
Do not mark an in-flight cycle as finished when stopping the scheduler.
stopPublicStatusScheduler() releases the lock and flips RUNNING to false immediately on an external stop, but the old runCycle() may still be executing a refresh; a subsequent startPublicStatusScheduler() can then start a new cycle, producing concurrent refreshes within the same process while the old cycle is no longer protected by the lock. Only set the stop request and clear the interval, letting the current cycle's finally block perform the release/cleanup uniformly, or explicitly track and await the currently running promise.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/public-status/scheduler.ts` around lines 152 - 168,
stopPublicStatusScheduler currently releases the leader lock and flips
__CCH_PUBLIC_STATUS_SCHEDULER_RUNNING__ to false immediately, which can allow a
new startPublicStatusScheduler to race with an in-flight runCycle; instead only
set __CCH_PUBLIC_STATUS_SCHEDULER_STOP_REQUESTED__ = true and clear the interval
(__CCH_PUBLIC_STATUS_SCHEDULER_INTERVAL_ID__), but do not release the lock or
set RUNNING to false there—let runCycle’s finally block perform lock release and
RUNNING cleanup (or alternatively track and await a dedicated current promise
like __CCH_PUBLIC_STATUS_SCHEDULER_CURRENT_PROMISE__ before cleaning up); update
stopPublicStatusScheduler to remove releaseLeaderLock(currentLock) and avoid
touching RUNNING/LOCK fields so the existing runCycle and its finally can safely
finish.
```ts
const parsed = JSON.parse(raw) as PublicStatusSnapshotRecord;
if (
  !parsed ||
  typeof parsed !== "object" ||
  typeof parsed.aggregatedAt !== "string" ||
  !parsed.payload
) {
  await redis.del(SNAPSHOT_CACHE_KEY);
  return null;
}
```
The payload validation is too loose
Only parsed.payload's truthiness is checked here, while downstream src/app/api/public-status/route.ts returns the snapshot verbatim via NextResponse.json(snapshot), and src/app/[locale]/status/page.tsx reads payload.groups and other fields directly. If Redis holds a legacy-version snapshot or externally written data in an incompatible format, it passes straight through to the frontend/API and can trigger runtime errors or rendering crashes. Add structural validation here (at minimum verify that payload.generatedAt is a string and payload.groups is an array); on mismatch, del + return null, consistent with the existing corruption-clean strategy.
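The structural guard might look like this sketch; `isValidSnapshotRecord` is a hypothetical name, and the field shape follows the record quoted above:

```typescript
// Accept only records whose shape the API route and status page actually rely on.
function isValidSnapshotRecord(value: unknown): boolean {
  if (!value || typeof value !== "object") return false;
  const record = value as { aggregatedAt?: unknown; payload?: unknown };
  if (typeof record.aggregatedAt !== "string") return false;
  const payload = record.payload as
    | { generatedAt?: unknown; groups?: unknown }
    | undefined;
  if (!payload || typeof payload !== "object") return false;
  return typeof payload.generatedAt === "string" && Array.isArray(payload.groups);
}
```

On a false result the repository would `del` the cache key and return null, matching the existing corruption-clean path.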
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/repository/public-status-snapshot.ts` around lines 23 - 32, The current
validation only checks parsed.payload truthiness which is too weak; update the
validation in the code handling PublicStatusSnapshotRecord (the parsed variable
and existing check that deletes SNAPSHOT_CACHE_KEY and returns null) to also
verify payload.generatedAt is a string and payload.groups is an array (e.g.,
Array.isArray(parsed.payload.groups)); if either check fails, call
redis.del(SNAPSHOT_CACHE_KEY) and return null so incompatible/corrupt Redis
entries never get passed to NextResponse.json in route.ts or to page.tsx reading
payload.groups. Optionally, extend checks to ensure each group has the minimal
expected fields before accepting.
```ts
publicStatusWindowHours: systemSettings.publicStatusWindowHours,
publicStatusAggregationIntervalMinutes: systemSettings.publicStatusAggregationIntervalMinutes,
```
The degraded read/write field sets are asymmetric, so the read path fails wholesale on older databases.
The update path (lines 630-635) treats publicStatusWindowHours / publicStatusAggregationIntervalMinutes as "possibly missing new columns" and strips them from the degraded updates; but selectSettingsRow's first-level fallback selectionWithoutHighConcurrencyMode (lines 218-219) and second-level fallback selectionWithoutCodexAndHighConcurrency (lines 252-253) still SELECT both columns.
Result: when the database has not yet run 0091_nervous_kronos.sql, reads keep throwing 42703 through the first two fallback levels and can only land on minimalSelection, losing most fields (enableThinkingSignatureRectifier, quota*, etc. all fall back to defaults) instead of degrading minimally for "only these two columns missing". For older deployments without this migration, reads during the upgrade window degrade noticeably.
Suggested fix: drop the new columns from the first two fallback selections, or add a fallback level that is missing only the new columns

```diff
 const selectionWithoutHighConcurrencyMode = {
   ...
   quotaLeaseCapUsd: systemSettings.quotaLeaseCapUsd,
-  publicStatusWindowHours: systemSettings.publicStatusWindowHours,
-  publicStatusAggregationIntervalMinutes: systemSettings.publicStatusAggregationIntervalMinutes,
   createdAt: systemSettings.createdAt,
   updatedAt: systemSettings.updatedAt,
 };
```

The same applies to selectionWithoutCodexAndHighConcurrency. toSystemSettings fills missing fields with defaults. Alternatively, after fullSelection fails, first try a fallback set that omits only the two publicStatus columns.
Also applies to: 252-253, 630-635
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/repository/system-config.ts` around lines 218 - 219, The current
downgrade selections still include the new columns publicStatusWindowHours and
publicStatusAggregationIntervalMinutes causing read failures when migration 0091
isn't applied; update selectSettingsRow's selectionWithoutHighConcurrencyMode
and selectionWithoutCodexAndHighConcurrency to exclude those two new columns (or
add a new intermediate downgrade selection that only removes
publicStatusWindowHours/publicStatusAggregationIntervalMinutes) so the
first-level and second-level SELECTs won't throw 42703; also ensure the updates
handling that strips new columns (the updates removal logic around
publicStatusWindowHours/publicStatusAggregationIntervalMinutes) stays consistent
with toSystemSettings and fullSelection/minimalSelection so missing columns are
filled by toSystemSettings defaults rather than causing fallback to
minimalSelection.
Code Review Summary -- PR #1055: Public Uptime Status Page
| Severity | Count |
|---|---|
| Critical | 1 |
| High | 1 |
| Medium | 1 |
Findings
1. [Critical] Missing server-side validation in savePublicStatusSettings (confidence: 95%)
File: src/actions/public-status.ts:39-42
The publicStatusWindowHours and publicStatusAggregationIntervalMinutes values are passed directly from client input to updateSystemSettings() with zero validation. The Zod schema UpdateSystemSettingsSchema in src/lib/validation/schemas.ts:1014-1025 already defines correct bounds (.min(1).max(168) / .min(1).max(60)), and the sibling action saveSystemSettings in src/actions/system-config.ts:96 correctly uses UpdateSystemSettingsSchema.parse() -- but this new action bypasses that entirely.
Impact: A zero bucketMinutes value reaches aggregation.ts:208 where Math.ceil((windowHours * 60) / 0) produces Infinity, then Array.from({ length: Infinity }) throws a RangeError crashing the aggregation engine. NaN or negative values cause similarly undefined behavior.
2. [High] Redis snapshot key written without TTL (confidence: 85%)
File: src/repository/public-status-snapshot.ts:46
The redis.set() call stores the snapshot with no expiry. While clearPublicStatusSnapshot() is called on disable, if that call fails (Redis temporarily unavailable, process crash) the stale snapshot persists indefinitely and continues to be served from the public /api/public-status endpoint.
3. [Medium] Numeric inputs lack client-side bounds (confidence: 80%)
File: src/app/[locale]/settings/status-page/_components/public-status-settings-form.tsx:108-113
The <Input> elements use inputMode="numeric" but not type="number" with min/max. Combined with the missing server-side validation (Issue 1), there is currently no validation layer at all. Even after server-side is fixed, adding min/max improves UX by preventing obvious invalid submissions.
General Observations
- Test coverage is good overall. `tests/unit/actions/public-status-save.test.ts` covers happy paths, auth, group rewriting, and oversized config; `tests/unit/lib/public-status-aggregation.test.ts` covers TPS, exclusion, gap-fill, and provider chain attribution.
- Missing test case: no tests for zero/negative/NaN numeric inputs, which would exercise the exact validation gap in Issue 1.
- Architecture is clean: pure aggregation function, leader-lock scheduler, in-memory config TTL cache, and Redis snapshot storage are well-separated.
- i18n is properly implemented across all 5 locales.
- The scheduler's `globalThis`-based singleton and lock pattern are solid for the distributed use case.
```ts
await updateSystemSettings({
  publicStatusWindowHours: input.publicStatusWindowHours,
  publicStatusAggregationIntervalMinutes: input.publicStatusAggregationIntervalMinutes,
});
```
[Critical] Missing server-side validation -- bypasses existing Zod schema
publicStatusWindowHours and publicStatusAggregationIntervalMinutes are passed from untrusted client input directly to updateSystemSettings() with no validation.
The sibling action saveSystemSettings (in src/actions/system-config.ts:96) correctly uses UpdateSystemSettingsSchema.parse(formData), which already defines the bounds you need (min(1).max(168) / min(1).max(60)) at src/lib/validation/schemas.ts:1014-1025.
Impact: A zero bucketMinutes produces Infinity bucket count at aggregation.ts:208, causing Array.from({ length: Infinity }) -> RangeError. NaN and negative values also cause undefined behavior.
Suggested fix:
```ts
const windowHours = Math.max(1, Math.min(168, Math.trunc(input.publicStatusWindowHours || 24)));
const intervalMinutes = Math.max(1, Math.min(60, Math.trunc(input.publicStatusAggregationIntervalMinutes || 5)));
await updateSystemSettings({
  publicStatusWindowHours: windowHours,
  publicStatusAggregationIntervalMinutes: intervalMinutes,
});
```
Alternatively, reuse the existing Zod schema:
```ts
import { UpdateSystemSettingsSchema } from "@/lib/validation/schemas";

const validated = UpdateSystemSettingsSchema.pick({
  publicStatusWindowHours: true,
  publicStatusAggregationIntervalMinutes: true,
}).parse({
  publicStatusWindowHours: input.publicStatusWindowHours,
  publicStatusAggregationIntervalMinutes: input.publicStatusAggregationIntervalMinutes,
});
```

```ts
  return;
}

await redis.set(SNAPSHOT_CACHE_KEY, JSON.stringify(record));
```
[High] No TTL on Redis snapshot key
This redis.set() writes the snapshot without an expiry. If the feature is disabled but clearPublicStatusSnapshot() fails (Redis temporarily unavailable, process crash), the stale snapshot persists indefinitely and continues to be served from the public /api/public-status endpoint.
A safety-net TTL matching the window size (or a reasonable upper bound like 7 days) would auto-expire stale data.
Suggested fix:
```ts
const TTL_SECONDS = 7 * 24 * 60 * 60; // 7-day safety net
await redis.set(SNAPSHOT_CACHE_KEY, JSON.stringify(record), "EX", TTL_SECONDS);
```
```tsx
id="public-status-window-hours"
inputMode="numeric"
value={windowHours}
onChange={(event) => setWindowHours(event.target.value)}
```
[Medium] Numeric input fields lack min/max bounds
Using inputMode="numeric" without type="number" + min/max allows any text entry. Number("") returns 0, which (combined with the missing server-side validation in savePublicStatusSettings) flows through to the aggregation engine unchecked.
Even after server-side validation is added, client-side bounds improve UX.
Suggested fix:
```tsx
<Input
  id="public-status-window-hours"
  type="number"
  min={1}
  max={168}
  value={windowHours}
  onChange={(event) => setWindowHours(event.target.value)}
/>
```
(Apply the same pattern with min={1} max={60} to the aggregation interval input below.)
🧪 Test results
Overall result: ✅ All tests passed
♻️ Duplicate comments (3)
src/actions/public-status.ts (3)
43-80: ⚠️ Potential issue | 🟠 Major — The save flow lacks atomicity: committing system_settings and provider_groups in separate steps can produce partial writes. Lines 71-74 commit updateSystemSettings independently first, then lines 76-80 execute updateProviderGroup one entry at a time over groupUpdates (see src/repository/provider-groups.ts:118-143, which itself has no transaction wrapper). Any failed group update leaves the inconsistent state "system settings changed, some groups unchanged", and the catch branch only returns the error without rolling back. Put all writes into a single db.transaction(): finish field validation and nextDescription precomputation/length checks first, then execute updateSystemSettings plus the batched updateProviderGroup calls inside the transaction.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/actions/public-status.ts` around lines 43 - 80, The current save flow performs updateSystemSettings(...) then multiple updateProviderGroup(...) calls separately, risking partial commits; instead, after validating and computing nextDescription for each group (using parsePublicStatusDescription and serializePublicStatusDescription) and populating groupUpdates, perform the writes inside a single db.transaction(): within the transaction call execute updateSystemSettings(...) and then apply all updateProviderGroup(...) updates (or a batch update) so either all succeed or all roll back; ensure validation/length checks remain outside/precede the transaction and propagate errors to abort the transaction on failure.
82-92: ⚠️ Potential issue | 🟡 Minor — hasConfiguredTargets is based on the raw input.groups rather than the sanitized configuration. input.groups[*].modelIds is never run through sanitizeModelIds, so client input such as [" "], duplicates, or non-strings gets emptied inside serializePublicStatusDescription yet still evaluates truthy here, triggering refreshPublicStatusSnapshot({ force: true }) and startPublicStatusScheduler(); refreshPublicStatusSnapshot then detects that there are no targets and returns "disabled", leaving the scheduler idling. Decide whether to start the scheduler from the sanitized enabled groups (or from refreshPublicStatusSnapshot's return value): only call startPublicStatusScheduler() when it returns "updated", and take the clearPublicStatusSnapshot() + stopPublicStatusScheduler() branch when it returns "disabled".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/actions/public-status.ts` around lines 82 - 92, The current check uses hasConfiguredTargets computed from the raw input.groups (input.groups.some(...)) which can be truthy for unclean values; instead compute enabled/clean groups using sanitizeModelIds (or call the same sanitation path used by serializePublicStatusDescription) or rely on the return value of refreshPublicStatusSnapshot to decide actions: call refreshPublicStatusSnapshot({force:true}) unconditionally, capture its result and only call startPublicStatusScheduler() when the result === "updated"; if it returns "disabled" (or no enabled targets), call clearPublicStatusSnapshot() and stopPublicStatusScheduler() to avoid an empty scheduler; ensure references to hasConfiguredTargets/input.groups/sanitizeModelIds/serializePublicStatusDescription/refreshPublicStatusSnapshot/startPublicStatusScheduler/clearPublicStatusSnapshot/stopPublicStatusScheduler are updated accordingly.
71-74: ⚠️ Potential issue | 🔴 Critical
publicStatusWindowHours / publicStatusAggregationIntervalMinutes lack server-side numeric validation. Both client-supplied values are written to the database directly, bypassing the existing
UpdateSystemSettingsSchema in src/lib/validation/schemas.ts (min(1).max(168) / min(1).max(60)). When publicStatusAggregationIntervalMinutes === 0, the Array.from({ length: Infinity }) at src/lib/public-status/aggregation.ts:208 throws a RangeError and the scheduler crashes with it; NaN / negative values likewise cause undefined behavior. Before calling
updateSystemSettings, reuse the approach of the sibling action (saveSystemSettings in src/actions/system-config.ts:96) and validate/sanitize with UpdateSystemSettingsSchema.pick({...}).parse(...), or at minimum clamp with Math.max/min/trunc. 🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/actions/public-status.ts` around lines 71 - 74, The two client-provided numeric fields publicStatusWindowHours and publicStatusAggregationIntervalMinutes are written directly via updateSystemSettings without server-side validation; validate and sanitize them first by reusing the existing UpdateSystemSettingsSchema (use UpdateSystemSettingsSchema.pick({ publicStatusWindowHours: true, publicStatusAggregationIntervalMinutes: true }).parse(input) ) and pass the parsed values to updateSystemSettings, or at minimum clamp/truncate the values (e.g. Math.trunc and Math.max/Math.min to enforce the existing ranges) before calling updateSystemSettings to prevent Infinity/NaN/negative inputs from reaching aggregation logic.
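As a fallback to the schema-based validation, the clamp mentioned above can be a small pure helper. The ranges are copied from the review's description of UpdateSystemSettingsSchema; the function name is hypothetical:

```typescript
// Clamp a client-supplied number into [min, max], falling back to a default
// for NaN, Infinity, or non-numeric input. Math.trunc drops fractions.
function clampInt(value: unknown, min: number, max: number, fallback: number): number {
  const n = typeof value === "number" ? Math.trunc(value) : Number.NaN;
  if (!Number.isFinite(n)) return fallback;
  return Math.min(max, Math.max(min, n));
}

// e.g. publicStatusWindowHours in [1, 168], interval minutes in [1, 60]
const windowHours = clampInt(0, 1, 168, 24);            // 0 clamps up to 1
const intervalMinutes = clampInt(Number.NaN, 1, 60, 5); // NaN falls back to 5
```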
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f3915cd6-ba95-4454-9a87-a6848a283e1c
📒 Files selected for processing (6)
- drizzle/0092_concerned_alice.sql
- drizzle/meta/0092_snapshot.json
- drizzle/meta/_journal.json
- src/actions/public-status.ts
- src/drizzle/schema.ts
- src/lib/public-status/aggregation.ts
✅ Files skipped from review due to trivial changes (3)
- drizzle/0092_concerned_alice.sql
- drizzle/meta/_journal.json
- src/lib/public-status/aggregation.ts
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e66e0b231a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
const redis = getRedisClient({ allowWhenRateLimitDisabled: true });
if (!redis || redis.status !== "ready") {
  await stopPublicStatusScheduler();
  return;
Retry scheduler init when Redis is temporarily unavailable
This initialization path permanently disables the public-status scheduler when Redis is not ready at boot, and then returns without any retry. I checked the call sites (src/instrumentation.ts startup hooks and src/actions/public-status.ts save action), so if Redis comes up after startup and no admin re-saves settings, snapshots will never refresh despite existing configured groups.
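One way to address this, sketched under the assumption that Redis readiness can be polled (the names, attempt count, and delay are illustrative, not the project's actual API):

```typescript
// Retry scheduler initialization until Redis becomes ready, instead of
// permanently disabling the feature when Redis is down at boot.
async function initializeWithRetry(
  isRedisReady: () => boolean,
  startScheduler: () => void,
  maxAttempts = 30,
  delayMs = 10_000,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (isRedisReady()) {
      startScheduler();
      return true;
    }
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false; // exhausted retries; a later settings save can still start it
}
```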
if (hasConfiguredTargets) {
  await refreshPublicStatusSnapshot({ force: true });
  startPublicStatusScheduler();
} else {
Start scheduler on all nodes after enabling status page
Enabling public status only starts the scheduler in the single process that handled this action. Other instances remain stopped because they only run initializePublicStatusScheduler() during startup, so in multi-instance deployments the feature can silently stop updating when that one node is rotated or crashes.
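A common fix is to run the interval on every instance and elect a per-tick leader with a short-lived Redis lock. The sketch below assumes a minimal lock-client interface; the project's actual Redis wrapper (and ioredis's real SET argument order) will differ:

```typescript
// Every node ticks; only the node that wins the lock performs the refresh,
// so the feature keeps updating after the "owner" node rotates or crashes.
interface LockClient {
  // Models SET key value NX PX ttl: "OK" when acquired, null when held.
  setLock(key: string, value: string, ttlMs: number): Promise<"OK" | null>;
}

async function runTickIfLeader(
  redis: LockClient,
  instanceId: string,
  refresh: () => Promise<void>,
): Promise<boolean> {
  const acquired = await redis.setLock("public-status:tick-lock", instanceId, 25_000);
  if (acquired !== "OK") return false; // another instance owns this tick
  await refresh();
  return true;
}
```

The 25s TTL keeps the lock shorter than the 30s tick, so a crashed leader releases leadership before the next round.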
  }
}

await updateSystemSettings({
[HIGH] [TYPE-MISSING-VALIDATION] Public status numeric settings are persisted without server-side validation
Why this is a problem: publicStatusWindowHours / publicStatusAggregationIntervalMinutes are written as-is:
await updateSystemSettings({
publicStatusWindowHours: input.publicStatusWindowHours,
publicStatusAggregationIntervalMinutes: input.publicStatusAggregationIntervalMinutes,
});

If the client submits 0/NaN (e.g. Number("")), downstream aggregation divides by bucketMinutes and can throw RangeError: Invalid array length, causing the scheduler to fail every cycle until the DB value is corrected.
Suggested fix:
import { extractZodErrorCode, formatZodError } from "@/lib/utils/zod-i18n";
import { UpdateSystemSettingsSchema } from "@/lib/validation/schemas";
const settingsResult = UpdateSystemSettingsSchema.pick({
publicStatusWindowHours: true,
publicStatusAggregationIntervalMinutes: true,
}).safeParse(input);
if (!settingsResult.success) {
return {
ok: false,
error: formatZodError(settingsResult.error),
errorCode: extractZodErrorCode(settingsResult.error),
};
}
await updateSystemSettings(settingsResult.data);

  return;
}

await redis.set(SNAPSHOT_CACHE_KEY, JSON.stringify(record));
[HIGH] [LOGIC-BUG] Redis snapshot has no TTL; stale public data can persist after disabling
Why this is a problem: the snapshot is written without an expiry:
await redis.set(SNAPSHOT_CACHE_KEY, JSON.stringify(record));

But clearPublicStatusSnapshot() is best-effort and returns early when Redis is not ready. If an admin disables public status while Redis is briefly unavailable, the old snapshot key can remain and /api/public-status will keep returning 200 with stale data, and the scheduler is stopped so nothing retries the deletion.
Suggested fix:
const ttlSeconds = Math.max(60, record.payload.windowHours * 2 * 60 * 60);
await redis.setex(SNAPSHOT_CACHE_KEY, ttlSeconds, JSON.stringify(record));
🧪 Test Results
Overall result: ✅ All tests passed
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/app/[locale]/settings/status-page/_components/public-status-settings-form.tsx`:
- Around line 174-178: The Badge currently shows the wrong i18n key when a group
is disabled: inside the Badge rendering conditional that checks group.enabled
(in the public-status-settings-form component) replace the false branch
t("statusPage.form.groupName") with a dedicated disabled-state key such as
t("statusPage.form.disabled") (or an agreed key name like
statusPage.form.disabledLabel), and then add the corresponding translations for
that key across all 5 language files (zh-CN, zh-TW, en, ja, ru) so the disabled
state displays the correct localized text.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7a9090d9-998b-4a6d-b01d-ede68548679f
📒 Files selected for processing (4)
- src/actions/public-status.ts
- src/app/[locale]/settings/status-page/_components/public-status-settings-form.tsx
- src/repository/public-status-snapshot.ts
- tests/unit/actions/public-status-save.test.ts
✅ Files skipped from review due to trivial changes (1)
- tests/unit/actions/public-status-save.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- src/repository/public-status-snapshot.ts
- src/actions/public-status.ts
<Badge variant={group.enabled ? "default" : "outline"}>
  {group.enabled
    ? t("statusPage.form.enabled")
    : t("statusPage.form.groupName")}
</Badge>
The i18n key for the disabled-state Badge text looks wrong.
When group.enabled === false this renders statusPage.form.groupName, so users most likely see "Group name" instead of "Disabled". Use a dedicated disabled-state translation key and make sure all 5 language files include it.
Suggested change
{group.enabled
  ? t("statusPage.form.enabled")
- : t("statusPage.form.groupName")}
+ : t("statusPage.form.disabled")}

As per coding guidelines, all user-facing strings must use i18n (5 languages supported: zh-CN, zh-TW, en, ja, ru). Never hardcode display text.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/app/[locale]/settings/status-page/_components/public-status-settings-form.tsx`
around lines 174 - 178, The Badge currently shows the wrong i18n key when a
group is disabled: inside the Badge rendering conditional that checks
group.enabled (in the public-status-settings-form component) replace the false
branch t("statusPage.form.groupName") with a dedicated disabled-state key such
as t("statusPage.form.disabled") (or an agreed key name like
statusPage.form.disabledLabel), and then add the corresponding translations for
that key across all 5 language files (zh-CN, zh-TW, en, ja, ru) so the disabled
state displays the correct localized text.
}

  return parsed;
} catch {
[HIGH] [ERROR-SILENT] readSnapshotFromRedis swallows Redis/JSON errors without logging or cleanup
Why this is a problem: this catch block drops all context:
} catch {
return null;
}

If redis.get or JSON.parse fails, the public API will return 404 PUBLIC_STATUS_NOT_CONFIGURED with no server-side signal, and a malformed cached value can keep failing until it expires.
Suggested fix:
import { logger } from "@/lib/logger";
let raw: string | null;
try {
raw = await redis.get(SNAPSHOT_CACHE_KEY);
} catch (error) {
logger.warn("[PublicStatus] Redis get snapshot failed", {
error: error instanceof Error ? error.message : String(error),
});
return null;
}
if (!raw) return null;
let parsed: unknown;
try {
parsed = JSON.parse(raw);
} catch (error) {
logger.warn("[PublicStatus] Snapshot JSON invalid; clearing key", {
error: error instanceof Error ? error.message : String(error),
});
await redis.del(SNAPSHOT_CACHE_KEY);
return null;
}There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 06d3727d57
  }))
);

setConfiguredPublicStatusGroupsCache(enabledGroups);
Do not refresh public-status cache timestamp on cache hits
refreshPublicStatusSnapshot always calls setConfiguredPublicStatusGroupsCache(enabledGroups) even when enabledGroups came from getConfiguredPublicStatusGroupsOnlyCache(). Because the scheduler runs every 30s and TTL is 60s, this line keeps extending cachedConfiguredGroupsAt, so the DB is never re-read and group/model exposure can stay stale indefinitely on long-lived nodes (for example after provider-group edits or when another instance processes the status settings save).
for (const group of allGroups) {
  const existing = parsePublicStatusDescription(group.description);
  const configured = enabledByName.get(group.name);
  const nextDescription = serializePublicStatusDescription({
Keep plain provider descriptions unchanged when status is disabled
This always reserializes every group description through serializePublicStatusDescription, so legacy/plain-text descriptions get rewritten into JSON ({"note":...}) even when the group has no public-status config. That mutates unrelated provider metadata on save and can cause false validation failures near the 500-char DB limit because JSON wrapping adds extra characters. Consider preserving the original description when configured is absent and no existing structured public-status payload exists.
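A sketch of the preservation rule, with hypothetical parse/serialize helpers standing in for parsePublicStatusDescription / serializePublicStatusDescription (the {"note": ...} shape is taken from the comment):

```typescript
// Keep legacy plain-text descriptions untouched unless the group has (or is
// being given) a structured public-status payload.
interface StructuredDesc {
  note?: string;
  publicStatus?: { modelIds: string[] };
}

function parseStructured(description: string | null): StructuredDesc | null {
  if (!description) return null;
  try {
    const parsed = JSON.parse(description);
    return typeof parsed === "object" && parsed !== null ? (parsed as StructuredDesc) : null;
  } catch {
    return null; // legacy plain text, not JSON
  }
}

function nextDescription(
  current: string | null,
  configured: { modelIds: string[] } | undefined,
): string | null {
  const existing = parseStructured(current);
  if (!configured && existing === null) {
    return current; // plain text + no status config: leave as-is
  }
  const note = existing ? existing.note : current ?? undefined;
  const next: StructuredDesc = {};
  if (note) next.note = note;
  if (configured) next.publicStatus = configured;
  return JSON.stringify(next);
}
```

Only groups that opt in (or already carry a structured payload) ever get the JSON wrapping, so plain descriptions never gain the extra characters that risk the 500-char limit.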
Code Review Summary
This is an XL PR (52 files, 12,011 lines changed) that adds a new public surface (/api/public-status, /[locale]/status) plus scheduler + Redis snapshotting. The primary remaining risk is silent error swallowing in the Redis snapshot read path, which can mask Redis/JSON failures and misreport the feature as “not configured”.
PR Size: XL
- Lines changed: 12011
- Files changed: 52
- Split suggestions:
- PR 1: DB/schema/migrations/indexes + type updates
- PR 2: Core aggregation + scheduler + Redis snapshot service
- PR 3: Public UI routes/components + i18n messages
- PR 4: Tests aligned to each PR above
Issues Found
| Category | Critical | High | Medium | Low |
|---|---|---|---|---|
| Logic/Bugs | 0 | 0 | 0 | 0 |
| Security | 0 | 0 | 0 | 0 |
| Error Handling | 0 | 1 | 0 | 0 |
| Types | 0 | 0 | 0 | 0 |
| Comments/Docs | 0 | 0 | 0 | 0 |
| Tests | 0 | 0 | 0 | 0 |
| Simplification | 0 | 0 | 0 | 0 |
Critical Issues (Must Fix)
None
High Priority Issues (Should Fix)
- [HIGH] [ERROR-SILENT] src/repository/public-status-snapshot.ts:37 swallows Redis/JSON errors without logging/cleanup (Confidence: 90)
Review Coverage
- Logic and correctness
- Security (OWASP Top 10)
- Error handling
- Type safety
- Documentation accuracy
- Test coverage
- Code clarity
Automated review by Codex AI
- Reviewed PR #1055 (head commit 06d3727d576f76896744c9a4b35aef7ae8d99764) and applied size/XL (12,011 lines changed; 52 files).
- Posted inline review comments:
  - src/actions/public-status.ts:71 (validation) comment id 3116509455
  - src/repository/public-status-snapshot.ts:46 (snapshot TTL) comment id 3116511317
  - src/repository/public-status-snapshot.ts:37 (silent catch) comment id 3116536133
- Submitted the required summary review via gh pr review --comment (remaining issue called out there: src/repository/public-status-snapshot.ts:37 swallowing Redis/JSON errors).
- Note: initial gh api inline-comment requests failed with HTTP 422 until adding Accept: application/vnd.github+json; retries succeeded.
🧪 Test Results
Overall result: ✅ All tests passed
  : labels.noData}
</Badge>
<span className="font-mono text-[10px] text-muted-foreground">
  {new Date(segment.bucketStart).toLocaleString()}
SSR/hydration mismatch from toLocaleString()
new Date(segment.bucketStart).toLocaleString() is called in a "use client" component that is still server-side rendered by Next.js for the initial HTML. The server uses its system locale/timezone while the browser uses the user's locale, causing React hydration mismatches (Text content did not match). The same call on line 110 of the parent server component is safe because server components don't hydrate.
Pass a stable locale and timezone option, or suppress hydration on this element:
- {new Date(segment.bucketStart).toLocaleString()}
+ {new Date(segment.bucketStart).toLocaleString("en-US", { timeZone: "UTC" })}
await updateSystemSettings({
  publicStatusWindowHours: validatedSettings.publicStatusWindowHours,
  publicStatusAggregationIntervalMinutes:
    validatedSettings.publicStatusAggregationIntervalMinutes,
});

for (const groupUpdate of groupUpdates) {
  await updateProviderGroup(groupUpdate.id, {
    description: groupUpdate.description,
  });
}
System settings and group description updates are not atomic
updateSystemSettings is committed at line 94 before the per-group description updates begin. If any updateProviderGroup call fails partway through, the aggregation settings (window hours / interval) are already persisted with their new values but some provider groups still carry the old public-status configuration. The snapshot refreshed immediately after (refreshPublicStatusSnapshot({ force: true })) will then aggregate with mismatched settings vs. group targets.
Consider wrapping both writes in a single DB transaction, or at minimum move updateSystemSettings after all group updates succeed.
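The rollback behavior can be illustrated with an in-memory stand-in for drizzle's db.transaction (the staging-copy mechanics below are purely illustrative; drizzle delegates this to the database):

```typescript
// Stage writes on a copy and commit only if the callback completes, so a
// failed updateProviderGroup also discards the updateSystemSettings write.
type KV = Record<string, unknown>;

function makeDb(state: KV) {
  return {
    async transaction(fn: (tx: KV) => Promise<void>): Promise<void> {
      const staged: KV = { ...state }; // work on a copy
      await fn(staged);                // a throw here aborts before commit
      Object.assign(state, staged);    // commit point
    },
  };
}
```

With this shape, the save action would perform updateSystemSettings and every updateProviderGroup inside one transaction callback, so a mid-loop failure leaves the old settings in place.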
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 69195462d7
const itemGroups = Array.from(new Set(parseProviderGroups(item.groupTag)));
if (itemGroups.length === 0) {
  continue;
Treat null provider group tags as default during aggregation
Requests routed through providers with groupTag = null are dropped from status metrics because parseProviderGroups(item.groupTag) returns an empty array and the loop immediately continues. In this codebase, null provider tags are treated as the default group in request routing (checkProviderGroupMatch in provider-selector.ts), so a status page configured for the default group will incorrectly show no data/low availability even when traffic exists. Add the same null→default normalization here before computing group outcomes.
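A sketch of that normalization; DEFAULT_GROUP and the comma-separated tag format are assumptions for illustration, mirroring what the comment says about checkProviderGroupMatch:

```typescript
// Treat a null/empty provider group tag as membership in the default group,
// matching the routing behavior instead of dropping the request from metrics.
const DEFAULT_GROUP = "default";

function parseProviderGroupsNormalized(groupTag: string | null): string[] {
  if (groupTag === null || groupTag.trim() === "") return [DEFAULT_GROUP];
  const tags = groupTag
    .split(",")
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0);
  return tags.length > 0 ? tags : [DEFAULT_GROUP];
}
```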
Summary
Adds a public status page feature that exposes provider group uptime metrics via a public route (/api/public-status) and a visual status page (/[locale]/status). The feature aggregates historical message_request data into Redis snapshots on a scheduled interval, keeping the API disabled (404) until at least one provider group/model is explicitly configured for public exposure.
Architecture Overview
Changes
Core Infrastructure
- public_status_window_hours (default: 24) and public_status_aggregation_interval_minutes (default: 5) added to system_settings
- Aggregation service (src/lib/public-status/) - Computes TTFB, TPS, and availability from historical request data
Public API
- GET /api/public-status - Returns 404 PUBLIC_STATUS_NOT_CONFIGURED when disabled, 200 with snapshot data when enabled
UI Components
- Public status page (src/app/[locale]/status/) - Public-facing page with timeline visualization, i18n support, and theme adaptation
- Settings page (src/app/[locale]/settings/status-page/) - Configure which provider groups/models to expose publicly
i18n (5 languages)
Tests (5 new test files)
- tests/unit/lib/public-status-aggregation.test.ts - 224 lines
- tests/unit/lib/public-status-config.test.ts - 80 lines
- tests/unit/lib/public-status-scheduler.test.ts - 101 lines
- tests/unit/lib/public-status-service.test.ts - 140 lines
- tests/unit/api/public-status-route.test.ts - 50 lines
To enable the public status page:
public_status_window_hours(default: 24) - time window for metrics aggregationpublic_status_aggregation_interval_minutes(default: 5) - how often to refresh metricsThe feature is disabled by default until at least one group/model is explicitly configured.
Validation
bun run typecheck✓bun run lint✓ (passes with existing repo-wide warnings/info only)bun run build✓bun run test✓Local Smoke Test Results
GET /api/public-statusreturns404 {\"errorCode\":\"PUBLIC_STATUS_NOT_CONFIGURED\"}GET /api/public-statusreturns200/en/statusSSR HTML containsAlpha Public,GPT-4.1,TTFB,TPS,Availability,OpenAI Compatible,OperationalScreenshot
Description enhanced by Claude AI
Greptile Summary
This PR introduces a full public uptime status page feature: a background aggregation engine reading
message_requesthistory into Redis snapshots, a public/api/public-statusendpoint, a visual/[locale]/statuspage, and an admin settings tab — all gated behind explicit provider-group configuration. Prior review concerns (input validation, missing TTL) are resolved in the follow-up commits.Two remaining issues worth addressing:
toLocaleString()inPublicStatusTimeline(a\"use client\"component that is still SSR'd) will produce React hydration mismatches when server and browser locales differ.updateSystemSettingsin the save action is committed before the per-group description loop, leaving aggregation settings and group targets inconsistent if any individual group update fails.Confidence Score: 4/5
Safe to merge after fixing the hydration mismatch and the non-atomic save action
Two P1 findings remain: a React hydration error caused by locale-unqualified toLocaleString() in an SSR'd client component, and a partial-write risk in the settings save action when group updates are committed after system settings. Both are straightforward to fix. All prior P0/P1 concerns (missing validation, missing Redis TTL) are already resolved.
src/app/[locale]/status/_components/public-status-timeline.tsx (hydration), src/actions/public-status.ts (non-atomic writes)
Important Files Changed
Sequence Diagram
sequenceDiagram
  participant Admin
  participant SaveAction as savePublicStatusSettings
  participant DB as Database
  participant Service as refreshPublicStatusSnapshot
  participant Redis
  participant PublicAPI as GET /api/public-status
  participant Scheduler as PublicStatusScheduler (30s tick)
  participant Browser as Status Page (/status)
  Admin->>SaveAction: configure groups + settings
  SaveAction->>DB: updateSystemSettings (window/interval)
  SaveAction->>DB: updateProviderGroup × N (descriptions)
  SaveAction->>Service: force refresh
  Service->>DB: aggregatePublicStatusSnapshot
  Service->>Redis: saveSnapshot (EX 7d)
  SaveAction->>Scheduler: startPublicStatusScheduler()
  loop every 30s
    Scheduler->>Service: refreshPublicStatusSnapshot()
    Service->>Redis: check freshness (bucketMinutes interval)
    alt due
      Service->>DB: aggregate
      Service->>Redis: saveSnapshot
    else not due
      Service-->>Scheduler: skipped
    end
  end
  Browser->>PublicAPI: GET /api/public-status
  PublicAPI->>Redis: getPublicStatusSnapshot
  Redis-->>PublicAPI: snapshot or null
  alt snapshot exists
    PublicAPI-->>Browser: 200 + payload
  else no snapshot
    PublicAPI-->>Browser: 404 PUBLIC_STATUS_NOT_CONFIGURED
  end
Reviews (4): Last reviewed commit: "chore: merge public status migrations"