test(e2e): harden gateway e2e tests#272
Conversation
📝 WalkthroughWalkthroughIntroduce a centralized E2E test bootstrap module and helpers, switch admin usage tests to an SQLite-backed usage fixture, consolidate request construction and response assertion helpers, strengthen upstream/downstream payload validations across multiple E2E tests, and make release scenario scripts fail-fast on HTTP/JSON errors. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Greptile SummaryThis PR hardens the gateway e2e test suite by centralising server setup helpers into
Confidence Score: 4/5Safe to merge after fixing the double-close of the usage logger in setup_test.go. One P1 finding: the tests/e2e/setup_test.go — the Important Files Changed
Sequence DiagramsequenceDiagram
participant T as Test
participant F as setupSQLiteUsageFixture
participant L as usage.Logger
participant DB as SQLite DB
participant S as e2e Admin Server
T->>F: setupSQLiteUsageFixture(t)
F->>DB: sql.Open(":memory:")
F->>L: usage.NewLogger(store, cfg)
F-->>T: e2eUsageFixture{reader, logger}
Note over F,L: t.Cleanup registered: logger.Close() + db.Close()
T->>S: setupE2EAdminServer(t, opts{usageLogger: logger})
T->>S: POST /v1/chat/completions (x2)
S->>L: Log usage entry (buffered)
T->>F: flush(t) → logger.Close() ← first Close
L->>DB: Flush buffered entries
T->>S: GET /admin/api/v1/usage/summary
S->>DB: Query persisted rows
DB-->>S: {TotalRequests:2, ...}
S-->>T: 200 OK
Note over T,L: Test ends — t.Cleanup runs
T->>L: logger.Close() ← second Close ⚠️ potential double-close
T->>DB: db.Close()
Reviews (1): Last reviewed commit: "test(e2e): harden gateway e2e tests" | Re-trigger Greptile |
| t.Cleanup(func() { | ||
| require.NoError(t, logger.Close()) | ||
| }) | ||
|
|
||
| return &e2eUsageFixture{ | ||
| reader: reader, | ||
| logger: logger, | ||
| } | ||
| } | ||
|
|
||
| func (f *e2eUsageFixture) flush(t *testing.T) { | ||
| t.Helper() | ||
|
|
||
| require.NoError(t, f.logger.Close()) | ||
| } |
There was a problem hiding this comment.
Double
logger.Close() — test cleanup will fail
flush closes the logger, but the t.Cleanup registered on line 131 also calls logger.Close(). Cleanup functions run in LIFO order after the test, so the cleanup will invoke Close() a second time on an already-closed logger. If usage.Logger.Close() is not idempotent (e.g. closes a channel a second time or returns an error), the require.NoError in the cleanup will fail every time flush is called.
The simplest fix is to guard against the double-close with a flag, or remove the automatic cleanup and require callers to manage teardown:
func (f *e2eUsageFixture) flush(t *testing.T) {
t.Helper()
if f.closed {
return
}
require.NoError(t, f.logger.Close())
f.closed = true
}Then the t.Cleanup in setupSQLiteUsageFixture should also check f.closed or be removed.
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/e2e/helpers_test.go (1)
149-219: 🧹 Nitpick | 🔵 TrivialTightening SSE parsing: good change, plus a heads-up on scanner buffer size.
Failing fast on JSON unmarshal errors and asserting
scanner.Err()is the right call — silent corruption of streamed chunks won't go unnoticed anymore.One thing to be aware of:
bufio.Scannerdefaults to a 64KiB max token size. If a future SSE chunk (e.g., a large tool-call argument blob or base64 image) exceeds that,scanner.Err()will surfacebufio.ErrTooLongand tests will fail with a misleading error. For the current mock payloads this is fine, but worth usingscanner.Buffer(...)proactively if upstream payloads grow.♻️ Optional buffer expansion
scanner := bufio.NewScanner(body) + scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/helpers_test.go` around lines 149 - 219, The SSE parsers readStreamingResponse and readResponsesStream currently use bufio.NewScanner with the default 64KiB token limit which can surface bufio.ErrTooLong for large SSE data; fix by calling scanner.Buffer(...) after creating the scanner (in both readStreamingResponse and readResponsesStream) to increase the initial and maximum token size to a safe larger value (e.g., a few megabytes) so large data chunks (base64 blobs, tool args) won’t cause ErrTooLong while preserving the existing JSON unmarshal and scanner.Err() checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/e2e/admin_test.go`:
- Around line 189-194: The double-close occurs because usageFixture.flush(t)
calls logger.Close() and the t.Cleanup registered in setupSQLiteUsageFixture
also calls logger.Close(); modify setupSQLiteUsageFixture (the function that
registers t.Cleanup) and/or flush(t) so logger.Close() is only invoked once —
either remove the redundant Close from flush(t) or make the cleanup registration
skip closing when flush has already closed, or make logger.Close() idempotent;
ensure the unique symbols involved are setupSQLiteUsageFixture, flush (on
usageFixture), and logger.Close so reviewers can locate and apply the
single-close fix.
- Around line 211-238: In the "daily includes persisted usage" test, avoid the
UTC-midnight race and magic numbers: capture the current date once (today :=
time.Now().UTC().Format("2006-01-02")) before issuing any chat/usage requests,
then when locating the usage entry in the daily slice accept either today or
yesterday (compute yesterday := time.Now().UTC().Add(-24*time.Hour).Format(...))
when setting todayEntry; replace literal token counts in the assertions (2, 20,
40, 60) with named constants like expectedRequests, expectedInputTokens,
expectedOutputTokens, expectedTotalTokens and add a short comment that these
derive from the mock provider (10 input + 20 output per request × 2), then
assert against those constants.
In `@tests/e2e/setup_test.go`:
- Around line 95-108: The empty-string branch in setupE2ERegistry causes
inconsistent provider registration; ensure a single default provider type
("test") is always used: update callers (setupE2EServer / setupAdminServer /
setupE2EAdminServer / setupAuthServer) to pass "test" or set providerType =
"test" at the top of setupE2ERegistry, then remove the RegisterProvider(path)
branch and always call registry.RegisterProviderWithType(testProvider,
providerType) so registration behavior is consistent across tests.
- Around line 45-66: The two helpers duplicate behavior; replace the positional
helper by converging on the options-struct API: remove or deprecate
setupAdminServer and update its callers to build an e2eServerOptions (setting
masterKey, adminEndpointsEnabled=endpointsEnabled, adminUIEnabled=uiEnabled,
providerType defaulting to "test") and call setupE2EAdminServer or directly call
setupE2EServer wrapped in httptest.NewServer; alternatively make
setupAdminServer a thin wrapper that constructs e2eServerOptions and returns
setupE2EAdminServer(t, opts). Ensure you reference e2eServerOptions,
setupE2EServer, setupE2EAdminServer and setupAdminServer when making the change.
---
Outside diff comments:
In `@tests/e2e/helpers_test.go`:
- Around line 149-219: The SSE parsers readStreamingResponse and
readResponsesStream currently use bufio.NewScanner with the default 64KiB token
limit which can surface bufio.ErrTooLong for large SSE data; fix by calling
scanner.Buffer(...) after creating the scanner (in both readStreamingResponse
and readResponsesStream) to increase the initial and maximum token size to a
safe larger value (e.g., a few megabytes) so large data chunks (base64 blobs,
tool args) won’t cause ErrTooLong while preserving the existing JSON unmarshal
and scanner.Err() checks.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b6298b07-dacb-42cf-a61e-04a4b2bc5bae
📒 Files selected for processing (8)
tests/e2e/admin_test.gotests/e2e/auditlog_test.gotests/e2e/auth_test.gotests/e2e/chat_test.gotests/e2e/helpers_test.gotests/e2e/release-e2e-scenarios.mdtests/e2e/responses_test.gotests/e2e/setup_test.go
💤 Files with no reviewable changes (1)
- tests/e2e/auth_test.go
| // setupAdminServer creates a new server instance with admin features configured. | ||
| func setupAdminServer(t *testing.T, masterKey string, endpointsEnabled, uiEnabled bool) *httptest.Server { | ||
| t.Helper() | ||
|
|
||
| srv := setupE2EServer(t, e2eServerOptions{ | ||
| masterKey: masterKey, | ||
| adminEndpointsEnabled: endpointsEnabled, | ||
| adminUIEnabled: uiEnabled, | ||
| providerType: "test", | ||
| }) | ||
| return httptest.NewServer(srv) | ||
| } | ||
|
|
||
| func setupE2EAdminServer(t *testing.T, opts e2eServerOptions) *httptest.Server { | ||
| t.Helper() | ||
|
|
||
| opts.adminEndpointsEnabled = true | ||
| if opts.providerType == "" { | ||
| opts.providerType = "test" | ||
| } | ||
| return httptest.NewServer(setupE2EServer(t, opts)) | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
setupAdminServer and setupE2EAdminServer overlap — consider unifying.
Both helpers ultimately call setupE2EServer and wrap in httptest.NewServer, but with different positional/options shapes and different defaults (setupE2EAdminServer force-enables adminEndpointsEnabled; setupAdminServer lets the caller pass endpointsEnabled). The duplication is small but invites drift. Since there's only one external caller of each pattern, prefer collapsing on the options-struct API and removing the 4-arg variant in a follow-up.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/e2e/setup_test.go` around lines 45 - 66, The two helpers duplicate
behavior; replace the positional helper by converging on the options-struct API:
remove or deprecate setupAdminServer and update its callers to build an
e2eServerOptions (setting masterKey, adminEndpointsEnabled=endpointsEnabled,
adminUIEnabled=uiEnabled, providerType defaulting to "test") and call
setupE2EAdminServer or directly call setupE2EServer wrapped in
httptest.NewServer; alternatively make setupAdminServer a thin wrapper that
constructs e2eServerOptions and returns setupE2EAdminServer(t, opts). Ensure you
reference e2eServerOptions, setupE2EServer, setupE2EAdminServer and
setupAdminServer when making the change.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/e2e/helpers_test.go`:
- Around line 71-82: The helper requireErrorResponse leaves the response body
open; after decoding the JSON envelope in requireErrorResponse, drain the
remaining body and close it to make the helper self-contained—use
io.Copy(io.Discard, resp.Body) (ignoring non-fatal copy errors) and then
resp.Body.Close() (or defer closing immediately after entering the function) so
callers don't rely on external cleanup.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: ab0dc196-51ce-416a-b220-b1426eff6687
📒 Files selected for processing (3)
tests/e2e/admin_test.gotests/e2e/helpers_test.gotests/e2e/setup_test.go
| func requireErrorResponse(t *testing.T, resp *http.Response, status int, errorType core.ErrorType, messageContains string) { | ||
| t.Helper() | ||
|
|
||
| require.Equal(t, status, resp.StatusCode) | ||
|
|
||
| var envelope core.OpenAIErrorEnvelope | ||
| require.NoError(t, json.NewDecoder(resp.Body).Decode(&envelope)) | ||
| require.Equal(t, errorType, envelope.Error.Type) | ||
| if messageContains != "" { | ||
| require.Contains(t, envelope.Error.Message, messageContains) | ||
| } | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Optional: consider draining/closing the body inside requireErrorResponse.
The helper decodes the envelope but leaves the body open and partially un-drained. Callers in this codebase typically defer closeBody(resp), so this is fine in practice; just worth flagging in case a future caller forgets — draining via io.Copy(io.Discard, resp.Body) after Decode (or closing here) would make the helper more self-contained.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/e2e/helpers_test.go` around lines 71 - 82, The helper
requireErrorResponse leaves the response body open; after decoding the JSON
envelope in requireErrorResponse, drain the remaining body and close it to make
the helper self-contained—use io.Copy(io.Discard, resp.Body) (ignoring non-fatal
copy errors) and then resp.Body.Close() (or defer closing immediately after
entering the function) so callers don't rely on external cleanup.
Summary
Tests
Summary by CodeRabbit