Fix: User concurrent Session limit can be breached under multi-Key concurrency #776
Conversation
📝 Walkthrough: Adds an atomic "check and track" path for Key/User concurrent sessions, moves sessionId generation and the active-session key utilities, and updates the related Lua scripts, services, and tests.
Summary of Changes: Hello @tesgth032, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request fixes the issue where, under multi-Key concurrency, the per-User concurrent session limit could be breached. It introduces an atomic Redis Lua script that checks and tracks Key- and User-level concurrent sessions in a single step, keeping the limits accurate and robust and preventing race-condition overruns at peak load. Highlights
```lua
redis.call('ZADD', global_key, now, session_id)
redis.call('ZADD', key_key, now, session_id)
redis.call('ZADD', user_key, now, session_id)
```
session always tracked in global_key even when both limits are 0
when key_limit <= 0 and user_limit <= 0, the TypeScript code returns early without calling the Lua script (service.ts:574-576). but if only one limit is 0 and the other is positive, the script still executes and unconditionally adds the session to global_key even though it shouldn't participate in limit enforcement per the comment on line 80.
this can cause global_key to accumulate sessions that should only be tracked by SessionTracker.trackSession() when no concurrent limits are active. while global_key is only for observability (Sessions page), this creates inconsistent tracking behavior.
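For illustration, the script's decision flow can be mirrored in TypeScript with plain `Set`s standing in for the ZSETs (the function name and shapes below are assumptions, not the project's actual API); it shows that a session is written to the global set whenever either limit is positive:

```typescript
// Hypothetical in-memory mirror of the Lua script's decision flow (Sets stand
// in for the Redis ZSETs); names here are illustrative, not the project's API.
type Sets = { global: Set<string>; key: Set<string>; user: Set<string> };

function checkAndTrack(s: Sets, sessionId: string, keyLimit: number, userLimit: number): boolean {
  const inKey = s.key.has(sessionId);
  const inUser = s.user.has(sessionId);
  if (keyLimit > 0 && !inKey && s.key.size >= keyLimit) return false;
  if (userLimit > 0 && !(inUser || inKey) && s.user.size >= userLimit) return false;
  // Step 6: once admitted, global is written unconditionally,
  // even when one of the two limits is disabled (the behavior flagged above).
  s.global.add(sessionId);
  s.key.add(sessionId);
  s.user.add(sessionId);
  return true;
}

const sets: Sets = { global: new Set(), key: new Set(), user: new Set() };
checkAndTrack(sets, "sess-1", 0, 5); // key limit disabled, user limit active
console.log(sets.global.has("sess-1")); // true: global still tracked the session
```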
> Path: `src/lib/redis/lua-scripts.ts`, lines 139-141.

```lua
local key_count = current_key_count
local key_tracked = 0
if not is_tracked_key then
  key_count = key_count + 1
  key_tracked = 1
end

local user_count = current_user_count
local user_tracked = 0
if not is_tracked_user then
  user_count = user_count + 1
  user_tracked = 1
end
```
count calculation handles partial tracking correctly - session in key but not user is valid state
> Path: `src/lib/redis/lua-scripts.ts`, lines 151-163.

```ts
const hasConcurrentSessionLimit =
  (session.authState?.key?.limitConcurrentSessions ?? 0) > 0 ||
  (session.authState?.user?.limitConcurrentSessions ?? 0) > 0;
```
logic doesn't account for inherited key limit from user - causes double tracking
when key.limitConcurrentSessions = 0 but user.limitConcurrentSessions > 0, this evaluates to false (no concurrent limit), but rate-limit-guard.ts:123-126 resolves effectiveKeyConcurrentLimit to inherit the user limit. session gets tracked by both SessionTracker.trackSession() here and the Lua script in rate-limit-guard, leading to duplicate entries in Redis ZSETs.
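The inheritance rule this comment refers to can be sketched as a small helper; the name and the exact rule below are assumptions inferred from the comment, not the project's `resolveKeyConcurrentSessionLimit` implementation:

```typescript
// Hypothetical sketch of the inheritance rule described above:
// a key with no explicit concurrent limit inherits the user's limit.
function effectiveKeyConcurrentLimit(
  keyLimit: number | null | undefined,
  userLimit: number | null | undefined
): number {
  const k = keyLimit ?? 0;
  const u = userLimit ?? 0;
  return k > 0 ? k : u > 0 ? u : 0;
}

// key has no limit of its own, user allows 3 -> key is effectively limited to 3
console.log(effectiveKeyConcurrentLimit(0, 3)); // 3
// so "is any concurrent limit active?" must consult the user limit too:
const hasLimit = effectiveKeyConcurrentLimit(0, 3) > 0; // true
```

With this rule, a check that only looks at the key's own `limitConcurrentSessions` disagrees with the guard's effective limit, which is exactly the double-tracking condition described.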
> Path: `src/app/v1/_lib/proxy/session-guard.ts`, lines 142-144.

```diff
@@ -197,6 +210,45 @@ export class ProxyRateLimitGuard {
       null
     );
   }
```
fallback to non-atomic checkSessionLimit() reintroduces race condition
when sessionId is missing, this reverts to the check-only approach (no tracking), which is the race condition this PR fixes. if this path executes, concurrent requests can still bypass limits. also, without tracking, the session won't appear in any Redis ZSETs, creating observability gaps.
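The race being described can be reproduced with a toy async model (illustrative only; an in-memory counter stands in for Redis): separating check from track with an await lets every concurrent request observe the same stale count, while a combined check-and-track does not.

```typescript
// Toy model of the time-of-check/time-of-use race the PR fixes.
function makeCounter(limit: number) {
  let active = 0;
  return {
    // check and track separated by an await: every concurrent caller
    // can observe the same stale count before anyone increments
    async checkThenTrack(): Promise<boolean> {
      if (active >= limit) return false;
      await new Promise<void>((resolve) => setTimeout(resolve, 0));
      active += 1;
      return true;
    },
    // check and track in one synchronous step: no window
    checkAndTrackAtomic(): boolean {
      if (active >= limit) return false;
      active += 1;
      return true;
    },
  };
}

async function demo(): Promise<{ raced: number; atomic: number }> {
  const a = makeCounter(1);
  const racedResults = await Promise.all([
    a.checkThenTrack(),
    a.checkThenTrack(),
    a.checkThenTrack(),
  ]);
  const raced = racedResults.filter(Boolean).length; // all 3 admitted: limit breached

  const b = makeCounter(1);
  const atomic = [b.checkAndTrackAtomic(), b.checkAndTrackAtomic(), b.checkAndTrackAtomic()]
    .filter(Boolean).length; // only 1 admitted: limit holds
  return { raced, atomic };
}

demo().then((r) => console.log(r)); // { raced: 3, atomic: 1 }
```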
> Path: `src/app/v1/_lib/proxy/rate-limit-guard.ts`, lines 176-212.

```ts
const normalizedUserConcurrentLimit =
  typeof user.limitConcurrentSessions === "number" &&
  Number.isFinite(user.limitConcurrentSessions) &&
  user.limitConcurrentSessions > 0
    ? Math.floor(user.limitConcurrentSessions)
    : 0;
```
user limit normalization inconsistent with key limit resolution
normalizedUserConcurrentLimit uses inline validation/normalization, while effectiveKeyConcurrentLimit uses resolveKeyConcurrentSessionLimit(). consider extracting normalization logic to a shared helper for consistency.
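A shared helper along the lines this comment suggests could look like the sketch below (the real `normalizePositiveLimit` in `concurrent-session-limit.ts`, mentioned in a later comment, may differ):

```typescript
// Hypothetical shared helper matching the inline normalization above:
// only finite positive numbers count; everything else normalizes to 0.
function normalizePositiveLimit(value: unknown): number {
  return typeof value === "number" && Number.isFinite(value) && value > 0
    ? Math.floor(value)
    : 0;
}

console.log(normalizePositiveLimit(2.9));      // 2
console.log(normalizePositiveLimit(-1));       // 0
console.log(normalizePositiveLimit(Infinity)); // 0
console.log(normalizePositiveLimit("5"));      // 0 (non-numbers are rejected)
```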
> Path: `src/app/v1/_lib/proxy/rate-limit-guard.ts`, lines 127-132.
src/lib/redis/lua-scripts.ts (Outdated)

```lua
-- 1. Cleanup expired sessions (TTL window ago) - key/user only
local cutoff = now - ttl
redis.call('ZREMRANGEBYSCORE', key_key, '-inf', cutoff)
redis.call('ZREMRANGEBYSCORE', user_key, '-inf', cutoff)
```
cleanup doesn't include global_key - may accumulate stale sessions
expired sessions are removed from key_key and user_key, but global_key (line 139) isn't cleaned up. while it has TTL (line 146), relying solely on TTL without explicit cleanup can cause global:active_sessions to show inflated counts on the Sessions page until expiration.
> Path: `src/lib/redis/lua-scripts.ts`, lines 114-117.
Code Review
This PR introduces a Redis Lua script that atomically "checks and tracks" Key- and User-level concurrent session counts, fixing the root cause of concurrent-limit breaches in high-concurrency scenarios. However, the implementation introduces two potential security issues: the global session set is never cleaned up, which can exhaust Redis memory, and the session ID can be influenced by user input, which risks rate-limit bypass. To address these, add score-based cleanup for the global session set and ensure session IDs are server-generated or strictly validated. In addition, the fallback logic in rate-limit-guard.ts duplicates the error-throwing code for the Key and User concurrent limits; consider refactoring for maintainability.
src/lib/redis/lua-scripts.ts (Outdated)

```lua
-- 1. Cleanup expired sessions (TTL window ago) - key/user only
local cutoff = now - ttl
redis.call('ZREMRANGEBYSCORE', key_key, '-inf', cutoff)
redis.call('ZREMRANGEBYSCORE', user_key, '-inf', cutoff)

-- 2. Check if session is already tracked
local is_tracked_key = redis.call('ZSCORE', key_key, session_id)
local is_tracked_user = redis.call('ZSCORE', user_key, session_id)

-- 3. Get current concurrency counts
local current_key_count = redis.call('ZCARD', key_key)
local current_user_count = redis.call('ZCARD', user_key)

-- 4. Check Key limit (exclude already tracked session)
if key_limit > 0 and not is_tracked_key and current_key_count >= key_limit then
  return {0, 1, current_key_count, 0, current_user_count, 0}
end

-- 5. Check User limit (exclude already tracked session)
-- Self-heal: if the session is already in the key set, treat it as existing for this user, to avoid false rejection
if user_limit > 0 and not (is_tracked_user or is_tracked_key) and current_user_count >= user_limit then
  return {0, 2, current_key_count, 0, current_user_count, 0}
end

-- 6. Track session (ZADD updates timestamp for existing members)
redis.call('ZADD', global_key, now, session_id)
redis.call('ZADD', key_key, now, session_id)
redis.call('ZADD', user_key, now, session_id)
```
The new Lua script CHECK_AND_TRACK_KEY_USER_SESSION implements atomic session tracking but fails to perform score-based cleanup on the global:active_sessions ZSET. While key_key and user_key are cleaned up using ZREMRANGEBYSCORE (lines 116-117), the global_key is only appended to (line 139). Although the key has a TTL (line 146), continuous traffic will keep the key alive while its member count grows indefinitely. An attacker can exploit this by sending a large number of requests with unique session IDs to exhaust Redis memory. Additionally, the lack of length validation on the user-provided session ID (extracted in session-guard.ts) increases the risk of rapid memory consumption.
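One mitigation sketch for the unbounded-growth concern: validate client-supplied session IDs before they reach Redis. The length bound and charset below are assumptions for illustration, not project policy:

```typescript
// Hypothetical validation for client-supplied session ids:
// bound length and restrict charset before they are used as ZSET members.
const MAX_SESSION_ID_LENGTH = 128; // assumed bound
const SESSION_ID_PATTERN = /^[A-Za-z0-9_-]+$/;

function isAcceptableClientSessionId(id: string): boolean {
  return (
    id.length > 0 &&
    id.length <= MAX_SESSION_ID_LENGTH &&
    SESSION_ID_PATTERN.test(id)
  );
}

console.log(isAcceptableClientSessionId("sess_abc-123"));   // true
console.log(isAcceptableClientSessionId("x".repeat(4096))); // false (too long)
console.log(isAcceptableClientSessionId("has spaces"));     // false (bad charset)
```

Rejected IDs could then fall back to a server-generated one, keeping per-member memory in Redis bounded.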
```lua
if key_limit > 0 and not is_tracked_key and current_key_count >= key_limit then
  return {0, 1, current_key_count, 0, current_user_count, 0}
end

-- 5. Check User limit (exclude already tracked session)
-- Self-heal: if the session is already in the key set, treat it as existing for this user, to avoid false rejection
if user_limit > 0 and not (is_tracked_user or is_tracked_key) and current_user_count >= user_limit then
  return {0, 2, current_key_count, 0, current_user_count, 0}
end
```
The rate limiting logic in the Lua script intentionally allows requests for "already tracked" sessions to bypass the concurrency limit (lines 128 and 134) to ensure conversation continuity. However, because the session_id is partially derived from user-controlled input (clientSessionId extracted in src/app/v1/_lib/proxy/session-guard.ts lines 87-92), an attacker can bypass the concurrent session limit by reusing the same session ID for all concurrent requests. This allows a single user to execute an unlimited number of concurrent requests, defeating the purpose of the limitConcurrentSessions control.
src/lib/redis/lua-scripts.ts (Outdated)

```lua
-- 5. Check User limit (exclude already tracked session)
-- Self-heal: if the session is already in the key set, treat it as existing for this user, to avoid false rejection
if user_limit > 0 and not (is_tracked_user or is_tracked_key) and current_user_count >= user_limit then
```
[High] [LOGIC-BUG] Self-heal logic in Lua step 5 incorrectly uses is_tracked_key to bypass user concurrency limit
Why this is a problem: The user limit check uses not (is_tracked_user or is_tracked_key) to decide whether to enforce the user concurrency limit. This means if a session exists in the key ZSET but not in the user ZSET (a data inconsistency from a partial write), the user limit check is bypassed entirely.
Concrete scenario: User has user_limit = 2 and already has 2 active sessions in the user ZSET. A new request arrives with a session that happens to be in the key ZSET (from a previous partial write or because the key/user ZSETs drifted). is_tracked_key is truthy, so the user limit check is skipped, and the session gets tracked in step 6 -- now the user has 3 concurrent sessions against a limit of 2.
The key ZSET and user ZSET are independent dimensions (a user can have multiple keys). Presence in the key set should not grant exemption from the user limit.
Suggested fix:

```lua
-- 5. Check User limit (exclude already tracked session)
if user_limit > 0 and not is_tracked_user and current_user_count >= user_limit then
  return {0, 2, current_key_count, 0, current_user_count, 0}
end
```

Step 6 already unconditionally adds the session to both ZSETs via ZADD, which handles the self-heal (adding the missing user entry) naturally on the next allowed request.
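The effect of the suggested change can be checked with a small TypeScript mirror of the two variants (booleans stand in for ZSCORE results; names are illustrative):

```typescript
// Mirror of the user-limit check in both variants. Returns true if the
// request is admitted, false if rejected by the user limit.
function userCheckWithSelfHeal(
  inUser: boolean, inKey: boolean, userCount: number, userLimit: number
): boolean {
  // original: presence in the key ZSET exempts the session from the user limit
  return !(userLimit > 0 && !(inUser || inKey) && userCount >= userLimit);
}

function userCheckStrict(
  inUser: boolean, inKey: boolean, userCount: number, userLimit: number
): boolean {
  // suggested fix: only presence in the user ZSET exempts
  return !(userLimit > 0 && !inUser && userCount >= userLimit);
}

// drifted state: session is in the key ZSET but not the user ZSET,
// and the user is already at the limit (2 of 2)
console.log(userCheckWithSelfHeal(false, true, 2, 2)); // true  (admitted: limit breached)
console.log(userCheckStrict(false, true, 2, 2));       // false (rejected: limit holds)
```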
Code Review Summary
This PR introduces an atomic Lua script (CHECK_AND_TRACK_KEY_USER_SESSION) to fix a race condition where concurrent requests could bypass user-level session limits. The approach is sound -- combining check + track into a single atomic Redis operation is the correct fix for this class of TOCTOU bug. However, the Lua script contains a logic error in its user limit enforcement that can allow the limit to be bypassed.
PR Size: M
- Lines changed: 437 (391 additions + 46 deletions)
- Files changed: 6
Issues Found
| Category | Critical | High | Medium | Low |
|---|---|---|---|---|
| Logic/Bugs | 0 | 1 | 0 | 0 |
| Security | 0 | 0 | 0 | 0 |
| Error Handling | 0 | 0 | 0 | 0 |
| Types | 0 | 0 | 0 | 0 |
| Comments/Docs | 0 | 0 | 0 | 0 |
| Tests | 0 | 0 | 0 | 0 |
| Simplification | 0 | 0 | 0 | 0 |
High Priority Issues (Should Fix)
`src/lib/redis/lua-scripts.ts:134` -- The self-heal logic in Lua step 5 uses `not (is_tracked_user or is_tracked_key)` to gate the user limit check. The `is_tracked_key` condition allows sessions that exist in the key ZSET but not the user ZSET to bypass the user concurrency limit. Since key and user ZSETs are independent dimensions (one user can have multiple keys), presence in the key set should not exempt a session from the user limit. Fix: change to `not is_tracked_user`.
Review Coverage
- Logic and correctness
- Security (OWASP Top 10)
- Error handling
- Type safety
- Documentation accuracy
- Test coverage
- Code clarity
Automated review by Claude AI
- Lua: add expiry cleanup for `global:active_sessions` to avoid unbounded growth under sustained traffic
- RateLimitGuard: generate a fallback sessionId when it is missing and keep using the atomic check+track path, instead of falling back to the non-atomic one
- Tests: add a missing-sessionId case; stabilize the dynamic import mock in endpoint-circuit-breaker
```ts
if (keyLimit <= 0 && userLimit <= 0) {
  return { allowed: true, keyCount: 0, userCount: 0, trackedKey: false, trackedUser: false };
}
```
early exit when both limits are 0 prevents Lua execution, but this bypasses global:active_sessions tracking entirely. the Lua script (line 142) always tracks to global_key even when limits are disabled, but this early exit means sessions won't appear in the Sessions page when both limits are 0.
this creates observability inconsistency: sessions with disabled concurrent limits won't be visible in the admin UI, while the PR description states "global 仅用于观测(Sessions 页面)". consider removing this early exit and letting the Lua script handle the no-limit case uniformly.
> Path: `src/lib/rate-limit/service.ts`, lines 574-576.

```ts
const hasConcurrentSessionLimit =
  (session.authState?.key?.limitConcurrentSessions ?? 0) > 0 ||
  (session.authState?.user?.limitConcurrentSessions ?? 0) > 0;
```
condition doesn't account for inherited key limits from resolveKeyConcurrentSessionLimit(). when key.limitConcurrentSessions = 0 but user.limitConcurrentSessions > 0, this evaluates to false (no skip), but rate-limit-guard.ts:124-126 resolves effectiveKeyConcurrentLimit to inherit the user limit.
result: SessionTracker.trackSession() fires here (line 147), AND the Lua script tracks again in rate-limit-guard, creating duplicate tracking in both global:active_sessions and the limit-enforcement ZSETs.
```suggestion
const effectiveKeyConcurrentLimit =
  (session.authState?.key?.limitConcurrentSessions ?? 0) > 0
    ? session.authState.key.limitConcurrentSessions
    : session.authState?.user?.limitConcurrentSessions ?? 0;
const hasConcurrentSessionLimit =
  effectiveKeyConcurrentLimit > 0 ||
  (session.authState?.user?.limitConcurrentSessions ?? 0) > 0;
```
> Path: `src/app/v1/_lib/proxy/session-guard.ts`, lines 142-144.

```ts
// In theory the session guard always assigns a sessionId; generate a fallback here to avoid dropping to the non-atomic path
const ensuredSessionId = session.sessionId ?? SessionManager.generateSessionId();
if (!session.sessionId) {
  session.setSessionId(ensuredSessionId);
}
```
generating sessionId here as fallback is good defensive programming, but relying on this fallback undermines the atomic guarantee. if session.sessionId is missing at this point, it means SessionGuard didn't run or failed silently.
the Lua script will track this late-generated sessionId, but any prior request in the same logical session won't be deduplicated (different sessionId = treated as separate session). this can still allow concurrent limit bypass if multiple requests from the same client session arrive before any get a sessionId.
consider logging a warning when fallback triggers to detect upstream issues:
```suggestion
const ensuredSessionId = session.sessionId ?? SessionManager.generateSessionId();
if (!session.sessionId) {
  logger.warn('[RateLimit] SessionId missing in rate-limit-guard, using fallback (potential atomicity gap)');
  session.setSessionId(ensuredSessionId);
}
```
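The bypass scenario described above can be made concrete with a toy sketch (the generator below is a stand-in, not the project's `SessionManager.generateSessionId`): each request that arrives without a sessionId mints its own id, so the "already tracked" dedup never fires.

```typescript
// Toy illustration: requests missing a sessionId each mint a fresh id,
// so none of them dedupe against the others in the tracking set.
let seq = 0;
const generateSessionId = (): string => `fallback-${++seq}`; // stand-in generator

function track(tracked: Set<string>, sessionId: string | undefined): number {
  const id = sessionId ?? generateSessionId();
  tracked.add(id); // ZADD equivalent; a repeated id would not grow the set
  return tracked.size;
}

const tracked = new Set<string>();
for (let i = 0; i < 10; i++) track(tracked, undefined); // 10 requests, all missing sessionId
console.log(tracked.size); // 10 distinct "sessions" counted, not 1
```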
> Path: `src/app/v1/_lib/proxy/rate-limit-guard.ts`, lines 136-140.
src/lib/redis/lua-scripts.ts (Outdated)

```lua
-- Self-heal: if the session is already in the same key's set, it can be treated as an existing session
-- for this user, so a single missing member in the user set does not cause a false rejection
-- (the script backfills the user set via ZADD afterwards).
if user_limit > 0 and not (is_tracked_user or is_tracked_key) and current_user_count >= user_limit then
  return {0, 2, current_key_count, 0, current_user_count, 0}
end
```
self-healing logic treats "session exists in key_key" as equivalent to "exists in user_key" for User limit check. while this prevents false rejections when ZSETs are out of sync, it allows a session tracked only at Key level to bypass User limit enforcement.
scenario: if user_key ZSET is empty (Redis failure, manual edit), all sessions in key_key pass User limit check via line 137, potentially exceeding User limit. this is a trade-off between availability (prevent false blocks) and correctness (strict limit enforcement).
> Path: `src/lib/redis/lua-scripts.ts`, lines 135-139.

- Log a warning when the sessionId is missing at the rate-limit-guard stage, to help trace the anomalous call chain
- Flag the potential atomicity gap (SessionGuard did not assign a sessionId as expected)
```ts
const hasConcurrentSessionLimit =
  (session.authState?.key?.limitConcurrentSessions ?? 0) > 0 ||
  (session.authState?.user?.limitConcurrentSessions ?? 0) > 0;
```
doesn't account for inherited key limits from resolveKeyConcurrentSessionLimit(). when key.limitConcurrentSessions = 0 but user.limitConcurrentSessions > 0, this evaluates to false (no skip), but rate-limit-guard.ts:124-126 inherits the user limit for key.
result: SessionTracker.trackSession() fires on line 147, AND the Lua script tracks again in rate-limit-guard, creating duplicate tracking.
fix: use resolveKeyConcurrentSessionLimit() here to match rate-limit-guard's logic:
```suggestion
const effectiveKeyConcurrentLimit = resolveKeyConcurrentSessionLimit(
  session.authState?.key?.limitConcurrentSessions ?? 0,
  session.authState?.user?.limitConcurrentSessions
);
const hasConcurrentSessionLimit =
  effectiveKeyConcurrentLimit > 0 ||
  (session.authState?.user?.limitConcurrentSessions ?? 0) > 0;
```
> Path: `src/app/v1/_lib/proxy/session-guard.ts`, lines 142-144.

```ts
const normalizedUserConcurrentLimit =
  typeof user.limitConcurrentSessions === "number" &&
  Number.isFinite(user.limitConcurrentSessions) &&
  user.limitConcurrentSessions > 0
    ? Math.floor(user.limitConcurrentSessions)
    : 0;
```
inline normalization here duplicates logic from normalizePositiveLimit() in concurrent-session-limit.ts. inconsistent approach: effectiveKeyConcurrentLimit uses the helper (line 124-126), normalizedUserConcurrentLimit reinvents it.
extract to shared helper for consistency:
```suggestion
const normalizedUserConcurrentLimit = normalizePositiveLimit(user.limitConcurrentSessions);
```
(requires importing normalizePositiveLimit from @/lib/rate-limit/concurrent-session-limit or exporting it)
> Path: `src/app/v1/_lib/proxy/rate-limit-guard.ts`, lines 128-133.

```ts
const ensuredSessionId = session.sessionId ?? SessionManager.generateSessionId();
if (!session.sessionId) {
  logger.warn(
    `[RateLimit] SessionId missing in rate-limit-guard, using fallback: key=${key.id}, user=${user.id} (potential atomicity gap)`
  );
  session.setSessionId(ensuredSessionId);
}
```
fallback sessionId generation undermines atomic guarantee. if SessionGuard didn't run or failed, multiple concurrent requests from the same client session get different sessionIds, treating them as separate sessions.
scenario: 10 concurrent requests arrive, all missing sessionId. each generates unique fallback ID, all pass Lua script's "already tracked" check, bypassing limit (10 sessions counted instead of 1).
current warning is good, but consider logging request metadata (IP, user-agent) to help debug upstream issues.
> Path: `src/app/v1/_lib/proxy/rate-limit-guard.ts`, lines 137-143.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/app/v1/_lib/proxy/rate-limit-guard.ts`:
- Around line 137-143: The nullish coalescing (??) used when computing
ensuredSessionId does not cover empty-string session.sessionId and so can pass
"" into Redis; change the fallback logic to treat empty strings as missing —
e.g. compute ensuredSessionId by checking session.sessionId for falsy/empty
(e.g. session.sessionId && session.sessionId !== "" or use ||) and then call
SessionManager.generateSessionId() when empty, keeping the subsequent
session.setSessionId(ensuredSessionId) and the logger.warn call as-is
(references: session.sessionId, ensuredSessionId,
SessionManager.generateSessionId, session.setSessionId, logger.warn).
In `@src/lib/redis/lua-scripts.ts`:
- Around line 99-169: The CHECK_AND_TRACK_KEY_USER_SESSION Lua script uses three
separate KEYS (global_key, key_key, user_key) which will cause CROSSSLOT errors
on Redis Cluster; update the script callers and key construction so all three
KEYS share the same hash tag (e.g., include a common {...} tag in the KEYS such
as `{session}` or `{active_sessions}`) to force them into the same slot and
document the required key format, and also fix minor indentation by removing the
extra leading spaces before the comments around the cleanup/check sections (the
lines with the extra space before the comment near the current "// 1.
Cleanup..." and the comment before the user-limit self-heal block) so comment
indentation is consistent.
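The empty-string pitfall called out above comes down to how `??` and `||` treat `""`. A minimal sketch (the `generateSessionId` stand-in is invented for illustration; it is not `SessionManager`'s actual implementation):

```typescript
// Sketch of the empty-string fallback pitfall described above.
// `generateSessionId` is an invented stand-in for SessionManager.generateSessionId().
const generateSessionId = (): string => `sess_${Math.random().toString(36).slice(2)}`;

function ensureSessionIdNullish(sessionId: string | null | undefined): string {
  // `??` only falls back on null/undefined, so "" passes straight through to Redis.
  return sessionId ?? generateSessionId();
}

function ensureSessionIdFalsy(sessionId: string | null | undefined): string {
  // `||` also treats the empty string as missing, which is what the fix wants.
  return sessionId || generateSessionId();
}

console.log(ensureSessionIdNullish("") === ""); // true: the empty id survives
console.log(ensureSessionIdFalsy("") === "");   // false: a fallback id was generated
console.log(ensureSessionIdFalsy("sess_x"));    // "sess_x": real ids are untouched
```

The behavioral difference only shows up for `""`; for `null`/`undefined` both operators generate a fallback.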
🧹 Nitpick comments (4)
src/lib/redis/lua-scripts.ts (2)
**115-119: Inconsistent indentation (nit).** Lines 115, 116, and 134-136 have extra leading spaces, inconsistent with the indentation style elsewhere in the script (compare lines 121-127). This does not affect Lua execution, but the formatting should be unified.

Suggested fix:

```diff
-  -- 1. Cleanup expired sessions (TTL window ago)
-  local cutoff = now - ttl
-  redis.call('ZREMRANGEBYSCORE', global_key, '-inf', cutoff)
+-- 1. Cleanup expired sessions (TTL window ago)
+local cutoff = now - ttl
+redis.call('ZREMRANGEBYSCORE', global_key, '-inf', cutoff)
```

The same applies to lines 134-136.
**134-139: The semantic boundary of the self-heal logic deserves attention.** In `not (is_tracked_user or is_tracked_key)`, `is_tracked_key` only checks the current key's ZSET. If the same user established a session via key A and then sends a request via key B, `is_tracked_key` (key B's ZSET) is false, and `is_tracked_user` may also be false (if the user ZSET was never written), so the self-heal does not kick in and the request may be wrongly rejected. The comment already notes the "same key" restriction, but consider adding a sentence explaining that the cross-key case relies on `is_tracked_user` for correctness, so future maintainers don't misread the intent here.

src/app/v1/_lib/proxy/rate-limit-guard.ts (1)
**185-185: Comment numbering skips a step: step 4 is missing (nit).** Line 185 is labeled "5. User RPM", but the former step 4 (the User concurrent-session check) has been merged into step 3's atomic path. Renumber the RPM check as step 4 to keep the comments sequential.
tests/unit/proxy/rate-limit-guard.test.ts (1)
**5-12: The `checkSessionLimit` mock no longer has any caller.** `rateLimitServiceMock.checkSessionLimit` (line 7) and the corresponding `mockResolvedValue` in `beforeEach` (line 123) are still present, but the guard code has fully switched to `checkAndTrackKeyUserSession`, so `checkSessionLimit` is never called. Remove it to reduce maintenance burden and reading noise.

Suggested fix:

```diff
 const rateLimitServiceMock = {
   checkTotalCostLimit: vi.fn(),
-  checkSessionLimit: vi.fn(),
   checkAndTrackKeyUserSession: vi.fn(),
   checkRpmLimit: vi.fn(),
   checkCostLimitsWithLease: vi.fn(),
   checkUserDailyCost: vi.fn(),
 };
```

Also remove from `beforeEach`:

```diff
-  rateLimitServiceMock.checkSessionLimit.mockResolvedValue({ allowed: true });
```
```typescript
const hasConcurrentSessionLimit =
  (session.authState?.key?.limitConcurrentSessions ?? 0) > 0 ||
  (session.authState?.user?.limitConcurrentSessions ?? 0) > 0;
```
doesn't match rate-limit-guard's resolveKeyConcurrentSessionLimit() logic. when key.limitConcurrentSessions = 0 but user.limitConcurrentSessions > 0, this evaluates to false, so SessionTracker.trackSession() fires on line 147. but rate-limit-guard:124-126 inherits the user limit, making the Lua script also track. result: double tracking in ZSETs.
fix: use resolveKeyConcurrentSessionLimit() to match rate-limit-guard:
```suggestion
const effectiveKeyConcurrentLimit = resolveKeyConcurrentSessionLimit(
  session.authState?.key?.limitConcurrentSessions ?? 0,
  session.authState?.user?.limitConcurrentSessions
);
const hasConcurrentSessionLimit = effectiveKeyConcurrentLimit > 0;
```
Path: `src/app/v1/_lib/proxy/session-guard.ts`, lines 142-144

```typescript
if (keyLimit <= 0 && userLimit <= 0) {
  return { allowed: true, keyCount: 0, userCount: 0, trackedKey: false, trackedUser: false };
}
```
early exit breaks observability when both limits are 0. the PR description states "global 仅用于观测(Sessions 页面)", but this prevents Lua script execution, so sessions won't appear in global:active_sessions when concurrent limits are disabled.
result: Sessions page won't show sessions that have no concurrent limits configured, creating inconsistent admin visibility.
consider removing this early exit and letting the Lua script handle the no-limit case (it already guards with line 115-116 and 125-127).
Path: `src/lib/redis/lua-scripts.ts`, lines 146-149

```lua
redis.call('ZADD', global_key, now, session_id)
redis.call('ZADD', key_key, now, session_id)
redis.call('ZADD', user_key, now, session_id)
```
unconditionally tracks to global_key even when both limits are 0, but service.ts:579-581 early-exits and never calls this script when both limits are 0.
this creates path inconsistency: if only one limit > 0, the script executes and adds to global_key. but if both limits are 0, early exit prevents execution and nothing gets tracked.
observability impact: Sessions page won't show sessions with no concurrent limits, which contradicts the stated goal that "global 仅用于观测".
Path: `src/lib/redis/lua-scripts.ts`, lines 146-149

```typescript
const ensuredSessionId = session.sessionId || SessionManager.generateSessionId();
if (!session.sessionId) {
  logger.warn(
    `[RateLimit] SessionId missing in rate-limit-guard, using fallback: key=${key.id}, user=${user.id} (potential atomicity gap)`
  );
  session.setSessionId(ensuredSessionId);
}
```
fallback sessionId generation undermines atomicity guarantee. if multiple concurrent requests from the same client session arrive without sessionId (e.g. SessionGuard failed), each generates unique fallback ID.
scenario: 10 concurrent requests arrive, all missing sessionId. each generates unique ID like sess_gen_1, sess_gen_2, etc. all pass Lua script's "already tracked" check (line 112-113 in lua-scripts.ts), treating them as separate sessions. result: 10 sessions counted instead of 1, bypassing limit.
while the warning on line 139-141 helps debug, this still creates a window for limit bypass.
Path: `src/app/v1/_lib/proxy/rate-limit-guard.ts`, lines 137-143

```typescript
const normalizedUserConcurrentLimit =
  typeof user.limitConcurrentSessions === "number" &&
  Number.isFinite(user.limitConcurrentSessions) &&
  user.limitConcurrentSessions > 0
    ? Math.floor(user.limitConcurrentSessions)
    : 0;
```
duplicates normalizePositiveLimit() logic from concurrent-session-limit.ts:7-13. effectiveKeyConcurrentLimit (line 124) uses resolveKeyConcurrentSessionLimit() which internally calls normalizePositiveLimit(), but this reinvents the same logic.
```suggestion
const normalizedUserConcurrentLimit = normalizePositiveLimit(user.limitConcurrentSessions);
```
(requires import { normalizePositiveLimit } from '@/lib/rate-limit/concurrent-session-limit', but that function needs to be exported first)
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
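For reference, a `normalizePositiveLimit` helper equivalent to the duplicated inline expression might look like this (a sketch inferred from the quoted code; the project's actual helper in `concurrent-session-limit.ts` may differ):

```typescript
// Sketch of a normalizePositiveLimit helper matching the inline expression
// quoted above (inferred for illustration; not the project's actual source).
function normalizePositiveLimit(value: unknown): number {
  return typeof value === "number" && Number.isFinite(value) && value > 0
    ? Math.floor(value)
    : 0;
}

console.log(normalizePositiveLimit(5.9));      // 5: fractional limits round down
console.log(normalizePositiveLimit(-3));       // 0: non-positive means "no limit"
console.log(normalizePositiveLimit(NaN));      // 0
console.log(normalizePositiveLimit(Infinity)); // 0: non-finite values are rejected
console.log(normalizePositiveLimit("7"));      // 0: non-number inputs normalize to 0
```

Centralizing this in one exported helper keeps the guard and the limit-resolution code agreeing on what counts as "no limit".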
Path: `src/app/v1/_lib/proxy/rate-limit-guard.ts`, lines 128-133

```lua
redis.call('ZREMRANGEBYSCORE', user_key, '-inf', cutoff)

-- 2. Check if session is already tracked
```
global_key not cleaned up here but is cleaned at line 107. expired sessions removed from key_key and user_key for limit enforcement, but global_key cleanup depends solely on TTL (line 146).
while TTL prevents indefinite growth, Sessions page could show inflated counts until expiration. relying on line 107's cleanup requires the script to be called again, which won't happen if both limits become 0 (service.ts:579-581 early exit).
Path: `src/lib/redis/lua-scripts.ts`, lines 122-124

```lua
end

-- 5. Check User limit (exclude already tracked session)
```
self-healing allows session in key_key but not in user_key to bypass User limit. while this prevents false rejections when ZSETs are out of sync, it sacrifices strict limit enforcement.
scenario: if user_key ZSET is empty (Redis failure/manual deletion), all sessions tracked in key_key pass User limit check via this condition. if User has limitConcurrentSessions = 5 and 20 sessions exist across different keys in key_key ZSETs, all 20 bypass User limit.
this trades correctness (strict limits) for availability (prevent false blocks). consider logging when self-healing triggers to detect ZSET inconsistencies.
Path: `src/lib/redis/lua-scripts.ts`, lines 135-137
sync: fix pushed, CI all green
|
One additional boundary fix: when a session is terminated, its sessionId is now also removed from the user-level active_sessions ZSET, so the atomic Lua concurrency count (ZCARD) does not keep counting an already-terminated session within the TTL window and wrongly block new sessions. A unit test covering this was added (`SessionManager.terminateSession`).
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/actions/users.ts (1)
**1560-1592: ⚠️ Potential issue | 🟡 Minor: Resetting user statistics should also clean up user-level active_sessions.** Currently only the key-dimension active_sessions are deleted. If the new user-dimension ZSET is not cleaned up, it can still affect concurrency counts or limit decisions for a short while.

Suggested fix:

```diff
-  const { getKeyActiveSessionsKey } = await import("@/lib/redis/active-session-keys");
+  const { getKeyActiveSessionsKey, getUserActiveSessionsKey } = await import(
+    "@/lib/redis/active-session-keys"
+  );
   ...
   for (const keyId of keyIds) {
     pipeline.del(getKeyActiveSessionsKey(keyId));
   }
+  pipeline.del(getUserActiveSessionsKey(userId));
```
🤖 Fix all issues with AI agents
In `@scripts/clear-session-bindings.ts`:
- Around line 35-37: The imports in scripts/clear-session-bindings.ts use
relative paths to src; update them to the project alias by changing the module
references for schema and the active-session-keys module to use the '@/...'
alias (replace imports that reference "../src/drizzle/schema" and
"../src/lib/redis/active-session-keys" with "@/drizzle/schema" and
"@/lib/redis/active-session-keys" respectively) so the file imports the same
symbols (schema, getGlobalActiveSessionsKey, getKeyActiveSessionsKey) via the
path alias.
In `@src/lib/rate-limit/service.ts`:
- Around line 609-623: The returned object currently sets a hardcoded Chinese
reason string; change this to an i18n-friendly error code plus structured
parameters so the caller can localize it. In the block that builds the rejection
response (variables: allowed, rejectTarget, keyLimit, userLimit, keyCount,
userCount, typeLabel), remove the hardcoded reason and instead return a stable
error identifier (e.g. reasonCode: "ERR_CONCURRENT_SESSION_LIMIT") and param
fields (e.g. reasonParams: { target: rejectTarget, typeLabel, count, limit }) or
use the app's i18n helper signature so translation is handled by upper layers;
keep existing fields (allowed, keyCount, userCount, trackedKey, trackedUser,
rejectedBy) intact.
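The structured rejection suggested above could be shaped roughly like this (a sketch only; field names beyond those listed in the instruction, such as the result type itself, are assumptions, not the project's actual types):

```typescript
// Sketch of an i18n-friendly rejection result, per the suggestion above.
// The type name and exact field shapes are assumptions for illustration.
type ConcurrentSessionCheckResult = {
  allowed: boolean;
  keyCount: number;
  userCount: number;
  trackedKey: boolean;
  trackedUser: boolean;
  rejectedBy?: "key" | "user";
  reasonCode?: string;
  reasonParams?: { target: "key" | "user"; count: number; limit: number };
};

function buildRejection(
  target: "key" | "user",
  count: number,
  limit: number
): ConcurrentSessionCheckResult {
  return {
    allowed: false,
    keyCount: target === "key" ? count : 0,
    userCount: target === "user" ? count : 0,
    trackedKey: false,
    trackedUser: false,
    rejectedBy: target,
    // A stable code instead of hardcoded copy; upper layers translate it.
    reasonCode: "ERR_CONCURRENT_SESSION_LIMIT",
    reasonParams: { target, count, limit },
  };
}

console.log(buildRejection("user", 16, 15).reasonCode); // "ERR_CONCURRENT_SESSION_LIMIT"
```

The caller can then map `reasonCode` + `reasonParams` through its i18n layer while the transport keeps only language-neutral data.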
🧹 Nitpick comments (4)
tests/unit/lib/session-ttl-validation.test.ts (1)
**92-163: Test assertions should reuse `getGlobalActiveSessionsKey` to avoid string drift.** If the key scheme changes again later, the tests won't need to be updated in every spot.

Suggested change:

```diff
+import { getGlobalActiveSessionsKey } from "@/lib/redis/active-session-keys";
 ...
-      expect(redisClientRef.zremrangebyscore).toHaveBeenCalledWith(
-        "{active_sessions}:global:active_sessions",
+      const globalKey = getGlobalActiveSessionsKey();
+      expect(redisClientRef.zremrangebyscore).toHaveBeenCalledWith(
+        globalKey,
         "-inf",
         expectedCutoff
       );
```

src/lib/session-tracker.ts (2)
**76-121: The WRONGTYPE auto-repair in `trackSession` has limited scope (pre-existing behavior).** Lines 108-111: when any command in the pipeline hits `WRONGTYPE`, only `initialize()` is called to clean up the global key. If the type conflict is on `keyZSetKey` or `userZSetKey`, it is not repaired and the next call will fail again. This is pre-existing behavior and not a regression introduced by this migration, but consider extending the auto-repair logic later to cover all ZSET keys.
**179-234: `refreshSession` does not refresh the expire on `globalKey` / `keyZSetKey` (pre-existing behavior).** Lines 196-197: ZADD refreshes the timestamps on `globalKey` and `keyZSetKey`, but `expire` is not set the way `trackSession` does (lines 88, 92). `providerZSetKey` does get one (line 200). While `trackSession` re-sets the TTL as long as the session stays active, if only `refreshSession` is ever called these keys may lack a fallback TTL and live forever. This predates the migration; consider fixing it in a follow-up.

src/app/v1/_lib/proxy/rate-limit-guard.ts (1)
**36-48: The step numbering in the top comment has drifted from the actual logic.** Lines 39-41 describe the check order as "3-5. Key concurrency → User concurrency → User RPM", but in the implementation Key and User concurrency are now merged into a single atomic `checkAndTrackKeyUserSession` call (line 145), no longer two independent checks. The later RPM comment (line 185) is also labeled "4." rather than "5.". Update the comments so the numbering matches the implementation.
Per the AI review suggestion, Lua step 5 has been adjusted: the User concurrency decision now only considers whether the session is already tracked in the user ZSET (the `is_tracked_key` self-heal branch was removed), preventing a session that exists in the key ZSET but is missing from the user ZSET from bypassing the user concurrency cap.
```typescript
const normalizedUserConcurrentLimit =
  typeof user.limitConcurrentSessions === "number" &&
  Number.isFinite(user.limitConcurrentSessions) &&
  user.limitConcurrentSessions > 0
    ? Math.floor(user.limitConcurrentSessions)
    : 0;
```
inline normalization duplicates normalizePositiveLimit() from concurrent-session-limit.ts:7-13
consider using the same helper for consistency with how effectiveKeyConcurrentLimit is resolved
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Path: `src/app/v1/_lib/proxy/rate-limit-guard.ts`, lines 128-132

- resetUserAllStatistics also cleans up user-level active_sessions
- checkAndTrackKeyUserSession now returns reasonCode/reasonParams (i18n-friendly)
- Updated related unit tests and script import paths
The changes requested by CodeRabbit have been addressed.

Local verification:
```typescript
if (keyLimit <= 0 && userLimit <= 0) {
  return { allowed: true, keyCount: 0, userCount: 0, trackedKey: false, trackedUser: false };
}
```
consider observability impact when both limits are 0
when concurrent limits are disabled (keyLimit <= 0 && userLimit <= 0), this returns immediately without calling the Lua script. the Lua script tracks to global:active_sessions, but this early exit means sessions won't appear on the Sessions page when both limits are 0.
if global tracking is needed for observability regardless of limits, consider letting the Lua script handle the no-limit case (it already guards limit enforcement with if key_limit > 0 checks)
Path: `src/lib/rate-limit/service.ts`, lines 581-583
Background / Problem

After #772 landed, a Key's `limitConcurrentSessions` inherits the User's concurrency cap by default, which fixed the config/display inconsistency. But concurrent sessions were still handled by a two-phase "check first (read-only count) → track later (async ZADD)" flow: with multiple Keys creating new sessions under high concurrency, a race window appears and the User-level concurrency cap can be briefly exceeded.
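The race window can be illustrated with a miniature model (purely illustrative; `checkThenTrackLater` and `checkAndTrack` are invented stand-ins, not the project's code):

```typescript
// Miniature model of the check-then-track race described above.
// `activeSessions` stands in for the user-level active_sessions ZSET;
// all names here are invented for illustration.
const LIMIT = 2;
const activeSessions = new Set<string>();

// Old, non-atomic path: read-only check now; tracking happens on a later tick.
async function checkThenTrackLater(sessionId: string): Promise<boolean> {
  if (activeSessions.size >= LIMIT) return false; // read-only check
  await Promise.resolve();                        // yield: the race window
  activeSessions.add(sessionId);                  // fire-and-forget tracking
  return true;
}

// New, atomic path: check and track in one indivisible step
// (the role the Lua script plays on the Redis side).
function checkAndTrack(sessionId: string): boolean {
  if (activeSessions.has(sessionId)) return true; // already tracked
  if (activeSessions.size >= LIMIT) return false;
  activeSessions.add(sessionId);
  return true;
}

async function demo(): Promise<void> {
  const racy = await Promise.all(["s1", "s2", "s3", "s4"].map((id) => checkThenTrackLater(id)));
  console.log(racy.filter(Boolean).length); // 4: all pass the check before any tracks
  activeSessions.clear();
  const atomic = ["s1", "s2", "s3", "s4"].map((id) => checkAndTrack(id));
  console.log(atomic.filter(Boolean).length); // 2: the cap holds
}
demo();
```

In the non-atomic path every request reads the count before any of them writes, so a burst of new sessions all pass a limit of 2; merging check and track into one step closes the window.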
Related:

Root cause

`checkSessionLimit()` only reads the count and is not responsible for atomic tracking. `ProxySessionGuard` tracks fire-and-forget, so multiple concurrent requests can all pass the check and only then each get tracked.

Solution

- New Lua script `CHECK_AND_TRACK_KEY_USER_SESSION`: performs an atomic "clean up expired → check limits → track sessionId" over the `global`/`key`/`user` active_sessions sets.
- `{active_sessions}:global:active_sessions` is only used for observability on the Sessions page and takes no part in limit decisions; the script also cleans up expired entries so it does not bloat under sustained traffic.
- All keys share the `{active_sessions}` hash tag to avoid CROSSSLOT.
- `RateLimitService.checkAndTrackKeyUserSession()` wraps the script call and result parsing (fail open when Redis is unavailable).
- Rejections carry `reasonCode`/`reasonParams` (no hardcoded copy; i18n is handled by upper layers).
- `ProxySessionGuard`: when Key/User concurrency caps are enabled, skip the `SessionTracker.trackSession()` pre-tracking to avoid a "track first, check later" breach.
- `ProxyRateLimitGuard`: the concurrency check now uniformly calls `checkAndTrackKeyUserSession()`; if `sessionId` is unexpectedly missing, a fallback is generated to avoid falling back to the non-atomic path.
- `resetUserAllStatistics`: also cleans up user-level active_sessions so leftovers cannot affect concurrency decisions.

RPM re-check

This PR does not change the RPM logic (`checkRpmLimit`/`checkUserRPM`); RPM is counted per `user:${userId}:rpm_window`, which naturally covers multiple Keys being used at the same time.

Testing

Run locally:

- `npm run build`
- `npm run lint`
- `npm run lint:fix`
- `npm run typecheck`
- `npm run test`

Why this PR is still needed (impact and reproduction)
Even though #772 makes a Key's `limitConcurrentSessions` inherit the User cap by default at the resolution/display level, the old implementation's concurrent-session limiting still had a race window: validation and tracking were two phases (read count → async track), so a strict cap could not be guaranteed while new sessions were created at high concurrency.

Reproduction sketch (old implementation):

- Set the User's `limitConcurrentSessions` to 15;
- Keep each Key's `limitConcurrentSessions` at 0 (inheriting User=15 per the logic of #772, "Fix Key concurrency limit inheriting the User concurrency cap");
- Fire a burst of requests that each create a new session (distinct `metadata.session_id`, or no client-supplied session_id so the server generates different ones);
- They all pass `checkSessionLimit()`'s read-only count check, so more than 15 new sessions are let through; tracking then happens asynchronously one by one, and the user-level active_sessions count momentarily exceeds 15.

Impact:

The "clean up expired → check limit → track" steps therefore must be merged into a single atomic Redis operation to guarantee a strict cap.
Greptile Summary
This PR fixes a race condition in concurrent session limit enforcement that could allow user-level limits to be bypassed when multiple keys are used simultaneously under high concurrency.
Root Cause
The old implementation used a two-phase check-then-track approach:
checkSessionLimit()performed a read-only count check, followed by asynchronousSessionTracker.trackSession(). Multiple concurrent requests could pass the check before any tracking occurred, causing the user concurrent limit to be exceeded temporarily.Solution
Introduced atomic Redis Lua script
CHECK_AND_TRACK_KEY_USER_SESSIONthat combines cleanup, checking, and tracking in a single atomic operation:{active_sessions}hash tag for Redis Cluster compatibilityKey Changes
checkAndTrackKeyUserSession()method with fail-open behaviorresolveKeyUserConcurrentSessionLimits().enabledresetUserAllStatisticsnow clears user-level active_sessions ZSETTest Coverage
Added comprehensive tests for limit resolution logic, Redis interactions, and guard integration covering edge cases like null/undefined/NaN/Infinity values.
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
```mermaid
sequenceDiagram
    participant Client
    participant SessionGuard
    participant RateLimitGuard
    participant RateLimitService
    participant Redis
    participant LuaScript
    Client->>SessionGuard: Request with metadata
    SessionGuard->>SessionGuard: Extract/generate sessionId
    Note over SessionGuard: Check if concurrent limits enabled
    alt Concurrent limits disabled
        SessionGuard->>Redis: SessionTracker.trackSession (async)
    else Concurrent limits enabled
        SessionGuard->>SessionGuard: Skip tracking (defer to atomic path)
    end
    SessionGuard->>RateLimitGuard: Pass session with sessionId
    RateLimitGuard->>RateLimitGuard: Resolve key/user limits
    Note over RateLimitGuard: Fallback generate sessionId if missing
    RateLimitGuard->>RateLimitService: checkAndTrackKeyUserSession(keyId, userId, sessionId, keyLimit, userLimit)
    RateLimitService->>Redis: EVAL CHECK_AND_TRACK_KEY_USER_SESSION
    Redis->>LuaScript: Execute atomic script
    Note over LuaScript: Step 1: Clean expired sessions from all 3 ZSETs
    LuaScript->>LuaScript: ZREMRANGEBYSCORE (global, key, user)
    Note over LuaScript: Step 2: Check if session already tracked
    LuaScript->>LuaScript: ZSCORE key_key, ZSCORE user_key
    Note over LuaScript: Step 3: Get current counts
    LuaScript->>LuaScript: ZCARD key_key, ZCARD user_key
    Note over LuaScript: Step 4-5: Check limits (skip if already tracked)
    alt Key limit exceeded AND not tracked
        LuaScript-->>Redis: {0, 1, keyCount, 0, userCount, 0}
        Redis-->>RateLimitService: Rejected by key
        RateLimitService-->>RateLimitGuard: allowed=false, rejectedBy=key
        RateLimitGuard-->>Client: 429 Key concurrent limit exceeded
    else User limit exceeded AND not tracked
        LuaScript-->>Redis: {0, 2, keyCount, 0, userCount, 0}
        Redis-->>RateLimitService: Rejected by user
        RateLimitService-->>RateLimitGuard: allowed=false, rejectedBy=user
        RateLimitGuard-->>Client: 429 User concurrent limit exceeded
    else Allowed
        Note over LuaScript: Step 6-7: Track session in all 3 ZSETs + set TTL
        LuaScript->>LuaScript: ZADD global/key/user + EXPIRE
        LuaScript-->>Redis: {1, 0, newKeyCount, tracked, newUserCount, tracked}
        Redis-->>RateLimitService: Allowed with counts
        RateLimitService-->>RateLimitGuard: allowed=true
        RateLimitGuard->>RateLimitGuard: Continue with other rate limit checks
        RateLimitGuard-->>Client: Proceed to upstream
    end
```

Last reviewed commit: 16e89d8