
feat: add timeout for http and db requests #15

Open
pthmas wants to merge 1 commit into main from pierrick/fix-last-page-query

Conversation


@pthmas pthmas commented Feb 24, 2026

Overview

  • Replace expensive OFFSET scans on the blocks table with a keyset cursor (WHERE number <= cursor) derived from the block number, cutting last-page query time from 30s+ to <15ms.
  • Add a 10s Postgres statement_timeout on every pool connection and an HTTP TimeoutLayer to prevent any slow query from exhausting the connection pool and crashing the API.

Summary by CodeRabbit

  • New Features
  • Implemented request timeout handling: requests exceeding 10 seconds now return HTTP 408
    • Improved block listing pagination with keyset-based approach for better performance with large datasets
    • Added database query timeout (10 seconds) to prevent queries from running indefinitely


coderabbitai bot commented Feb 24, 2026

📝 Walkthrough

This change implements a multi-layered timeout strategy for the backend. The tower-http dependency is updated to enable its timeout feature, a 10-second HTTP request timeout is configured at the router level, database connections receive a 10-second statement_timeout via a pool-initialization hook, and block pagination switches from an OFFSET-based to a keyset-based approach.

Changes

Cohort / File(s) | Summary

Timeout Infrastructure
backend/Cargo.toml, backend/crates/atlas-api/src/main.rs, backend/crates/atlas-common/src/db.rs
Added the timeout feature to tower-http; configured TimeoutLayer middleware returning HTTP 408 on a 10-second request timeout; implemented a database-level statement timeout via an after_connect hook on PgPoolOptions.

Query Optimization
backend/crates/atlas-api/src/handlers/blocks.rs
Replaced OFFSET-based pagination with keyset pagination using a block-number cursor, improving performance for block listing queries.
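The cursor arithmetic behind the keyset approach can be sketched in plain Rust. This is a minimal model, not the project's handler code: it assumes block numbers are dense and sequential from 0 to total_count - 1 and that pages are listed newest-first, which is what the PR description implies.

```rust
// Minimal model of the keyset-cursor math used for block pagination.
// Assumption: block numbers are dense and sequential (0..total_count-1),
// listed in descending order (newest first).

/// Highest block number shown on `page` (1-based) with `limit` rows per page.
fn keyset_cursor(total_count: i64, page: i64, limit: i64) -> i64 {
    // Page 1 starts at the newest block (total_count - 1); each later page
    // moves the upper bound down by `limit` blocks.
    (total_count - 1) - (page - 1) * limit
}

/// Block numbers a `WHERE number <= cursor ORDER BY number DESC LIMIT limit`
/// query would return under the density assumption above.
fn page_rows(total_count: i64, page: i64, limit: i64) -> Vec<i64> {
    let cursor = keyset_cursor(total_count, page, limit);
    if cursor < 0 {
        return Vec::new();
    }
    (0..=cursor).rev().take(limit as usize).collect()
}

fn main() {
    // 95 blocks (0..=94), 10 per page: page 1 is 94..=85, page 10 is 4..=0.
    assert_eq!(page_rows(95, 1, 10), (85..=94).rev().collect::<Vec<_>>());
    assert_eq!(page_rows(95, 10, 10), vec![4, 3, 2, 1, 0]);
    println!("keyset pages consistent");
}
```

Because the WHERE clause hits the primary-key index, the database seeks directly to the cursor instead of scanning and discarding offset rows, which is the source of the 30s-to-15ms improvement claimed above.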

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 Timeouts tick and cursors hop,
Keyset bounds make queries stop,
Ten seconds, measured, clean and tight,
Database and requests work just right!
No more waiting, code runs true,
Optimization in every queue!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check (Passed): Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check (Passed): The title accurately summarizes the main changes: adding timeout mechanisms for HTTP requests and database operations across multiple files.
  • Docstring Coverage (Passed): Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
backend/crates/atlas-api/src/main.rs (1)

8-11: Align timeout responses with the standard API error envelope.

All handlers return errors through ApiResult<Json<T>>, which serializes via ApiError::into_response() to produce a consistent JSON envelope: {"error": "message"}. However, TimeoutLayer at lines 214-217 returns a bare 408 status code without a response body, creating inconsistency for clients expecting the standard error format.

Consider wrapping with a custom error handler layer to serialize timeouts into the same envelope as other API errors.
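For reference, the envelope the nitpick refers to is the {"error": "message"} shape; a wrapped timeout handler would need to emit the same body. A minimal sketch of that body construction follows, without serde_json (which a real implementation would likely use) and with hand-rolled escaping limited to quotes and backslashes:

```rust
// Sketch of the JSON envelope a timeout handler could produce so clients
// always receive {"error": "..."} rather than a bare 408 status.
// Escaping here is deliberately minimal; use serde_json in real code.

fn error_envelope(message: &str) -> String {
    let escaped = message.replace('\\', "\\\\").replace('"', "\\\"");
    format!("{{\"error\":\"{}\"}}", escaped)
}

fn main() {
    assert_eq!(
        error_envelope("request timed out"),
        r#"{"error":"request timed out"}"#
    );
    println!("{}", error_envelope("request timed out"));
}
```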

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-api/src/main.rs` around lines 8 - 11, TimeoutLayer
currently returns a bare 408 response that bypasses our ApiResult/ApiError JSON
envelope; modify the timeout handling so timeouts are converted into our
standard ApiError and serialized via ApiError::into_response (or wrap
TimeoutLayer with a custom layer/handler) so controllers still return
ApiResult<Json<T>> style errors. Locate where TimeoutLayer is added (the tower
layer setup using TimeoutLayer) and replace or wrap it with a layer that
intercepts timeout errors, maps them to an ApiError variant (e.g.,
ApiError::timeout or ApiError::new with message "request timed out"), and
produces the same JSON envelope by calling ApiError::into_response before
returning the response. Ensure the handler uses the existing ApiError type and
serialization path so clients always receive {"error": "..."} for timeouts.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af3daf4 and 4fb1182.

📒 Files selected for processing (4)
  • backend/Cargo.toml
  • backend/crates/atlas-api/src/handlers/blocks.rs
  • backend/crates/atlas-api/src/main.rs
  • backend/crates/atlas-common/src/db.rs

Comment on lines +22 to +36
// Convert page-based navigation to a keyset cursor using block numbers.
// Blocks are sequential so: cursor = max_block - (page - 1) * limit
// WHERE number <= cursor is O(log N) via primary key; OFFSET was O(N).
let limit = pagination.limit();
let cursor = (total_count - 1) - pagination.offset();

let blocks: Vec<Block> = sqlx::query_as(
    "SELECT number, hash, parent_hash, timestamp, gas_used, gas_limit, transaction_count, indexed_at
     FROM blocks
     WHERE number <= $2
     ORDER BY number DESC
     LIMIT $1"
)
.bind(limit)
.bind(cursor)


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

rg -n "struct Pagination|impl Pagination|fn offset" backend/crates/atlas-common/src/types.rs

Repository: evstack/atlas

Length of output: 144


🏁 Script executed:

sed -n '358,400p' backend/crates/atlas-common/src/types.rs

Repository: evstack/atlas

Length of output: 974


Pagination keyset cursor calculation has a critical bug: offset() uses unclamped limit.

The offset() method in Pagination returns (page - 1) * self.limit using the raw client-provided limit, while limit() applies a cap of 100. This mismatch breaks the cursor calculation in the keyset query.

Example: A client requesting page=2, limit=200 will get:

  • offset() = (2-1) × 200 = 200 (using raw limit)
  • limit() = min(200, 100) = 100 (clamped)
  • cursor = total_count - 1 - 200
  • Query fetches only 100 rows but skips 200, causing duplicate/skipped pages.

Fix: either apply the same clamping in offset() by computing (self.page.saturating_sub(1)) * self.limit.min(100), or recompute the cursor using the clamped limit directly: cursor = (total_count - 1) - pagination.page.saturating_sub(1) * pagination.limit().
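The first variant of the fix can be sketched as follows. The Pagination struct here is a stand-in mirroring the fields described in the review, not the actual type from atlas-common; the 100-row cap matches the clamp the review quotes.

```rust
// Sketch of the suggested fix: offset() and limit() must clamp consistently.
// `Pagination` is a stand-in for the struct described in the review.

const MAX_LIMIT: i64 = 100;

struct Pagination {
    page: i64,
    limit: i64,
}

impl Pagination {
    fn limit(&self) -> i64 {
        self.limit.min(MAX_LIMIT)
    }

    // Buggy version used the raw limit: (self.page - 1) * self.limit.
    // Fixed version uses the clamped limit so the cursor skip matches
    // the LIMIT the query binds.
    fn offset(&self) -> i64 {
        self.page.saturating_sub(1) * self.limit()
    }
}

fn main() {
    let p = Pagination { page: 2, limit: 200 };
    // With the fix, page 2 skips exactly one clamped page (100 rows),
    // matching the 100-row LIMIT bound into the SQL.
    assert_eq!(p.limit(), 100);
    assert_eq!(p.offset(), 100);
    println!("offset/limit consistent");
}
```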

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-api/src/handlers/blocks.rs` around lines 22 - 36, The
cursor calculation for keyset pagination uses Pagination::offset() which
multiplies by the raw, unclamped limit causing skipped/duplicated pages; fix by
using the clamped limit when computing the cursor (e.g. replace cursor =
(total_count - 1) - pagination.offset() with a calculation that uses
pagination.limit(), or change Pagination::offset() to compute
(page.saturating_sub(1)) * self.limit.min(100)); update the cursor assignment in
the handler that computes `cursor` (and/or the Pagination::offset method) so
both use the same clamped limit to keep pages consistent with the SQL LIMIT
binding.

Comment on lines +4 to +15
/// Create a database connection pool.
/// Sets statement_timeout = 10s on every connection to prevent slow queries
/// from exhausting the pool.
pub async fn create_pool(database_url: &str, max_connections: u32) -> Result<PgPool, sqlx::Error> {
    PgPoolOptions::new()
        .max_connections(max_connections)
        .after_connect(|conn, _meta| {
            Box::pin(async move {
                conn.execute("SET statement_timeout = '10s'").await?;
                Ok(())
            })
        })


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Step 1: Find migration files
find . -path "*/migrations/*" -name "*.sql" -o -path "*/.migrations/*" -name "*.sql" | head -20

Repository: evstack/atlas

Length of output: 443


🏁 Script executed:

#!/bin/bash
# Step 2: Check directory structure for migrations
ls -la backend/ 2>/dev/null || echo "backend/ not found"
find . -type d -name migrations | head -10

Repository: evstack/atlas

Length of output: 544


🏁 Script executed:

#!/bin/bash
# Step 3: Search for run_migrations function to see how the pool is used
rg -A 10 "fn run_migrations|async fn run_migrations" --type rust

Repository: evstack/atlas

Length of output: 315


🏁 Script executed:

#!/bin/bash
# Step 4: Check the current file to see the full pool creation and migration usage
cat -n backend/crates/atlas-common/src/db.rs | head -50

Repository: evstack/atlas

Length of output: 1010


🏁 Script executed:

#!/bin/bash
# Read all migration files to check for long-running operations
for file in ./backend/migrations/*.sql; do
  echo "=== $(basename $file) ==="
  cat "$file"
  echo ""
done

Repository: evstack/atlas

Length of output: 15018


Statement timeout will block migrations on startup.

run_migrations receives the pool created with after_connect setting statement_timeout = 10s on every connection. The migration in 20240105 includes INSERT INTO tx_hash_lookup SELECT ... FROM transactions and creates GIN trigram indexes, both of which can exceed 10s on large tables. Other migrations also create indexes and ALTER TABLE operations that may timeout. Consider either running migrations with a separate connection/pool without the timeout, or making the timeout configurable for migration execution.
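The configurable-timeout option can be sketched as a small helper that only produces the SET statement when a timeout is requested, so a migration pool can pass None and skip the after_connect hook entirely. The function name and Option-based API are illustrative assumptions, not the project's code:

```rust
use std::time::Duration;

// Sketch: build the per-connection SET statement only when a timeout is
// configured. An API pool passes Some(10s); a migration pool passes None
// and no statement_timeout is installed.
fn statement_timeout_sql(timeout: Option<Duration>) -> Option<String> {
    timeout.map(|t| format!("SET statement_timeout = '{}ms'", t.as_millis()))
}

fn main() {
    // API pool: 10s cap on every statement.
    assert_eq!(
        statement_timeout_sql(Some(Duration::from_secs(10))).as_deref(),
        Some("SET statement_timeout = '10000ms'")
    );
    // Migration pool: no timeout installed at all.
    assert_eq!(statement_timeout_sql(None), None);
    println!("ok");
}
```

In create_pool, the after_connect hook would then only execute the statement when the helper returns Some, leaving migration connections free to run long index builds and backfills.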

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/crates/atlas-common/src/db.rs` around lines 4 - 15, The pool created
by create_pool uses after_connect to set a 10s statement_timeout which can cause
run_migrations to fail on long index/insert operations; change the
implementation so migrations run with a pool/connection that does not set the
10s timeout or make the timeout configurable: add an optional parameter (e.g.,
statement_timeout: Option<Duration> or a bool like set_statement_timeout) to
create_pool and only install the after_connect timeout when provided, or add a
new helper create_migration_pool that omits the after_connect handler, and
update run_migrations to use the migration-specific pool/connection instead of
the timeout-bearing pool.
