📝 Walkthrough

This change implements a multi-layered timeout strategy for the backend. The tower-http dependency is updated with the timeout feature, a 10-second HTTP request timeout is configured at the router level, database connections receive a 10-second statement timeout via a pool initialization hook, and block pagination switches from an OFFSET-based to a keyset-based approach.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
backend/crates/atlas-api/src/main.rs (1)
8-11: Align timeout responses with the standard API error envelope.

All handlers return errors through `ApiResult<Json<T>>`, which serializes via `ApiError::into_response()` to produce a consistent JSON envelope: `{"error": "message"}`. However, `TimeoutLayer` at lines 214-217 returns a bare 408 status code without a response body, creating an inconsistency for clients that expect the standard error format. Consider wrapping it with a custom error-handler layer that serializes timeouts into the same envelope as other API errors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/crates/atlas-api/src/main.rs` around lines 8 - 11, TimeoutLayer currently returns a bare 408 response that bypasses our ApiResult/ApiError JSON envelope; modify the timeout handling so timeouts are converted into our standard ApiError and serialized via ApiError::into_response (or wrap TimeoutLayer with a custom layer/handler) so controllers still return ApiResult<Json<T>> style errors. Locate where TimeoutLayer is added (the tower layer setup using TimeoutLayer) and replace or wrap it with a layer that intercepts timeout errors, maps them to an ApiError variant (e.g., ApiError::timeout or ApiError::new with message "request timed out"), and produces the same JSON envelope by calling ApiError::into_response before returning the response. Ensure the handler uses the existing ApiError type and serialization path so clients always receive {"error": "..."} for timeouts.
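A minimal sketch of the suggested wrapper, assuming an axum/tower setup. It swaps tower-http's `TimeoutLayer` (which responds with an empty 408 itself) for tower's error-producing `TimeoutLayer` plus axum's `HandleErrorLayer`; the inline `json!` envelope stands in for the crate's real `ApiError::into_response()` path, which is not shown in the review:

```rust
// Sketch only: convert timeout errors into the {"error": "..."} JSON
// envelope instead of tower-http's bare 408. In the real code the
// closure would construct an ApiError and call into_response().
use std::time::Duration;

use axum::{error_handling::HandleErrorLayer, http::StatusCode, BoxError, Json};
use tower::ServiceBuilder;

let timeout_middleware = ServiceBuilder::new()
    // HandleErrorLayer turns the inner layer's error into a response...
    .layer(HandleErrorLayer::new(|err: BoxError| async move {
        let (status, msg) = if err.is::<tower::timeout::error::Elapsed>() {
            (StatusCode::REQUEST_TIMEOUT, "request timed out")
        } else {
            (StatusCode::INTERNAL_SERVER_ERROR, "internal error")
        };
        (status, Json(serde_json::json!({ "error": msg })))
    }))
    // ...so use tower's TimeoutLayer here, which errors on timeout,
    // rather than tower-http's, which writes an empty 408 directly.
    .layer(tower::timeout::TimeoutLayer::new(Duration::from_secs(10)));
```

The stack would then be attached with `router.layer(timeout_middleware)`; every handler keeps returning `ApiResult<Json<T>>` unchanged.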
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/crates/atlas-api/src/handlers/blocks.rs`:
- Around line 22-36: The cursor calculation for keyset pagination uses
Pagination::offset() which multiplies by the raw, unclamped limit causing
skipped/duplicated pages; fix by using the clamped limit when computing the
cursor (e.g. replace cursor = (total_count - 1) - pagination.offset() with a
calculation that uses pagination.limit(), or change Pagination::offset() to
compute (page.saturating_sub(1)) * self.limit.min(100)); update the cursor
assignment in the handler that computes `cursor` (and/or the Pagination::offset
method) so both use the same clamped limit to keep pages consistent with the SQL
LIMIT binding.
In `@backend/crates/atlas-common/src/db.rs`:
- Around line 4-15: The pool created by create_pool uses after_connect to set a
10s statement_timeout which can cause run_migrations to fail on long
index/insert operations; change the implementation so migrations run with a
pool/connection that does not set the 10s timeout or make the timeout
configurable: add an optional parameter (e.g., statement_timeout:
Option<Duration> or a bool like set_statement_timeout) to create_pool and only
install the after_connect timeout when provided, or add a new helper
create_migration_pool that omits the after_connect handler, and update
run_migrations to use the migration-specific pool/connection instead of the
timeout-bearing pool.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- backend/Cargo.toml
- backend/crates/atlas-api/src/handlers/blocks.rs
- backend/crates/atlas-api/src/main.rs
- backend/crates/atlas-common/src/db.rs
```diff
  // Convert page-based navigation to a keyset cursor using block numbers.
  // Blocks are sequential so: cursor = max_block - (page - 1) * limit
  // WHERE number <= cursor is O(log N) via primary key; OFFSET was O(N).
  let limit = pagination.limit();
  let cursor = (total_count - 1) - pagination.offset();

  let blocks: Vec<Block> = sqlx::query_as(
      "SELECT number, hash, parent_hash, timestamp, gas_used, gas_limit, transaction_count, indexed_at
       FROM blocks
       WHERE number <= $2
       ORDER BY number DESC
-      LIMIT $1 OFFSET $2"
+      LIMIT $1"
  )
- .bind(pagination.limit())
- .bind(pagination.offset())
+ .bind(limit)
+ .bind(cursor)
```
🧩 Analysis chain
🏁 Script executed:

```shell
rg -n "struct Pagination|impl Pagination|fn offset" backend/crates/atlas-common/src/types.rs
```

Repository: evstack/atlas
Length of output: 144

🏁 Script executed:

```shell
sed -n '358,400p' backend/crates/atlas-common/src/types.rs
```

Length of output: 974
Pagination keyset cursor calculation has a critical bug: `offset()` uses the unclamped limit.

The `offset()` method in `Pagination` returns `(page - 1) * self.limit` using the raw client-provided limit, while `limit()` applies a cap of 100. This mismatch breaks the cursor calculation in the keyset query.

Example: a client requesting `page=2, limit=200` will get:

- `offset()` = (2-1) × 200 = 200 (using the raw limit)
- `limit()` = min(200, 100) = 100 (clamped)
- `cursor` = total_count - 1 - 200
- The query fetches only 100 rows but skips 200, causing duplicated/skipped pages.

Fix: either apply the same clamping in `offset()` — `(self.page.saturating_sub(1)) * self.limit.min(100)` — or recompute the cursor using the clamped limit directly: `cursor = (total_count - 1) - (pagination.page.saturating_sub(1)) * pagination.limit()`.
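The first suggested fix can be sketched as follows. The `page`/`limit` field names come from the review's description of `Pagination`; the concrete integer types are assumptions:

```rust
// Sketch of the fix: clamp the limit in one place and derive both
// limit() and offset() from it, so the keyset cursor math always
// agrees with the SQL LIMIT binding. Field types are assumed.
const MAX_LIMIT: u32 = 100;

pub struct Pagination {
    pub page: u32,
    pub limit: u32,
}

impl Pagination {
    /// Clamped page size, used for the SQL LIMIT binding.
    pub fn limit(&self) -> u32 {
        self.limit.min(MAX_LIMIT)
    }

    /// Rows to skip, computed from the *clamped* limit so it stays
    /// consistent with limit() even when a client sends limit > 100.
    pub fn offset(&self) -> u32 {
        self.page.saturating_sub(1) * self.limit()
    }
}
```

With `page=2, limit=200` the two methods now agree: `limit()` and `offset()` both use 100, so the cursor becomes `(total_count - 1) - 100` and consecutive pages neither skip nor repeat rows.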
```rust
/// Create a database connection pool.
/// Sets statement_timeout = 10s on every connection to prevent slow queries
/// from exhausting the pool.
pub async fn create_pool(database_url: &str, max_connections: u32) -> Result<PgPool, sqlx::Error> {
    PgPoolOptions::new()
        .max_connections(max_connections)
        .after_connect(|conn, _meta| {
            Box::pin(async move {
                conn.execute("SET statement_timeout = '10s'").await?;
                Ok(())
            })
        })
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Step 1: Find migration files
find . -path "*/migrations/*" -name "*.sql" -o -path "*/.migrations/*" -name "*.sql" | head -20
```

Length of output: 443

🏁 Script executed:

```shell
#!/bin/bash
# Step 2: Check directory structure for migrations
ls -la backend/ 2>/dev/null || echo "backend/ not found"
find . -type d -name migrations | head -10
```

Length of output: 544

🏁 Script executed:

```shell
#!/bin/bash
# Step 3: Search for run_migrations function to see how the pool is used
rg -A 10 "fn run_migrations|async fn run_migrations" --type rust
```

Length of output: 315

🏁 Script executed:

```shell
#!/bin/bash
# Step 4: Check the current file to see the full pool creation and migration usage
cat -n backend/crates/atlas-common/src/db.rs | head -50
```

Length of output: 1010

🏁 Script executed:

```shell
#!/bin/bash
# Read all migration files to check for long-running operations
for file in ./backend/migrations/*.sql; do
  echo "=== $(basename $file) ==="
  cat "$file"
  echo ""
done
```

Length of output: 15018
Statement timeout will block migrations on startup.
`run_migrations` receives the pool created with `after_connect` setting `statement_timeout = 10s` on every connection. The migration in `20240105` includes `INSERT INTO tx_hash_lookup SELECT ... FROM transactions` and creates GIN trigram indexes, both of which can exceed 10s on large tables. Other migrations also create indexes and run `ALTER TABLE` operations that may time out. Consider either running migrations on a separate connection/pool without the timeout, or making the timeout configurable for migration execution.
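The configurable-timeout variant can be sketched like this. The `statement_timeout: Option<Duration>` parameter is the review's proposed addition, not existing code; `PgPoolOptions` and `after_connect` are real sqlx APIs:

```rust
// Sketch of the review's suggestion: the timeout becomes optional, so
// callers that run migrations can opt out while the API server keeps
// its 10s cap. The extra parameter is hypothetical.
use std::time::Duration;

use sqlx::{postgres::PgPoolOptions, Executor, PgPool};

pub async fn create_pool(
    database_url: &str,
    max_connections: u32,
    statement_timeout: Option<Duration>,
) -> Result<PgPool, sqlx::Error> {
    let mut opts = PgPoolOptions::new().max_connections(max_connections);
    if let Some(timeout) = statement_timeout {
        // Only install the per-connection timeout when one was requested.
        opts = opts.after_connect(move |conn, _meta| {
            Box::pin(async move {
                let stmt = format!("SET statement_timeout = '{}ms'", timeout.as_millis());
                conn.execute(stmt.as_str()).await?;
                Ok(())
            })
        });
    }
    opts.connect(database_url).await
}
```

`run_migrations` would then be handed a pool built with `None`, while the API server passes `Some(Duration::from_secs(10))` and keeps its current behavior.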