revert(#35): bulk-COPY backfill crash-loops on dup-key after restart #36
Conversation
📝 Walkthrough
This PR refactors the backfill pipeline from a bulk-COPY batched-write strategy to a pipelined concurrent-fetch model with sequential per-height writes. The backfill now concurrently fetches blocks within a bounded sliding window.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
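The pipelined shape the walkthrough describes can be sketched roughly as below, assuming the `futures` crate. `Block`, `fetch_block`, `write_block`, and the window size are illustrative stand-ins, not this repo's actual identifiers:

```rust
use futures::stream::{self, StreamExt, TryStreamExt};

struct Block {
    height: u64,
}

// Stand-in for the RPC fetch.
async fn fetch_block(height: u64) -> anyhow::Result<Block> {
    Ok(Block { height })
}

// Stand-in for the per-height database write.
async fn write_block(block: Block) -> anyhow::Result<()> {
    println!("wrote block {}", block.height);
    Ok(())
}

async fn backfill(start: u64, end: u64, window: usize) -> anyhow::Result<()> {
    stream::iter(start..=end)
        .map(fetch_block)          // one fetch future per height, created lazily
        .buffered(window)          // at most `window` fetches in flight; results stay in height order
        .try_for_each(write_block) // writes remain sequential; the first error stops the pipeline
        .await
}
```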
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (inconclusive)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/sync/src/block_writer.rs`:
- Around line 68-90: The unchecked casts for RawTxRow fields timestamp
(b.block.timestamp), gas_used (t.gas_used), and status (t.status) can silently
wrap on negative values; change them to safe, checked conversions (e.g., use
u64::try_from(b.block.timestamp).ok() /
u64::try_from(t.gas_used.unwrap_or(0)).ok() and u8::try_from(t.status).ok()) and
if any conversion fails emit a tracing::warn (including the offending values)
and return Ok(()) to skip the malformed row, mirroring the defensive pattern
used for block_height (which uses .as_u64()); alternatively, if you can
guarantee non-negativity, add a clear comment above RawTxRow construction
documenting that invariant and assert it with debug_assert!(...) so the intent
is explicit.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 7faa5246-5f09-427a-b0e3-58eaa7ca5084
📒 Files selected for processing (4)
- CHANGELOG.md
- compose.env.example
- crates/sync/src/backfill.rs
- crates/sync/src/block_writer.rs
💤 Files with no reviewable changes (2)
- compose.env.example
- CHANGELOG.md
```rust
let block_height = match b.block.height.as_u64() {
    Some(h) => h,
    None => {
        tracing::warn!(
            height = ?b.block.height,
            "analytics: skipping row — block height not convertible to u64 \
             (cursor sentinel reached writer; this should not happen)"
        );
        return Ok(());
    }
};
// … (elided hunk)
buf.push('\t');
write_text(&mut buf, &t.fee.to_string());
buf.push('\t');
write_int(&mut buf, t.nonce);
buf.push('\t');
match &t.data {
    Some(s) => write_text(&mut buf, s),
    None => buf.push_str("\\N"),
}
buf.push('\t');
write_int(&mut buf, t.status as i64);
buf.push('\t');
match &t.contract_address {
    Some(s) => write_text(&mut buf, s),
    None => buf.push_str("\\N"),
};
// … (elided hunk)
for t in &b.txs {
    let row = RawTxRow {
        block_height,
        timestamp: b.block.timestamp as u64,
        tx_hash: t.hash.clone(),
        from_addr: t.from_addr.clone(),
        to_addr: t.to_addr.clone(),
        value_str: t.value.to_string(),
        fee_str: t.fee.to_string(),
        gas_used: t.gas_used.unwrap_or(0) as u64,
        status: t.status as u8,
        tx_type: t.tx_type.as_str().to_string(),
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail
# Inspect source field types on the domain models and sink field types on RawTxRow.
rg -n -C3 'pub struct Block\b|pub struct Transaction\b|pub struct RawTxRow\b|pub timestamp:|pub gas_used:|pub status:|block_height:' --glob '*.rs'
```

Repository: Sentriscloud/indexer-rs
Length of output: 20841
Confirm source and sink types in the unchecked analytics casts are safe, or apply checked conversions.
Verification confirms the concern:
- `timestamp` casts `i64 → u64` (unchecked, signed to unsigned)
- `gas_used` casts `i64 → u64` (unchecked, signed to unsigned)
- `status` casts `i16 → u8` (unchecked, signed to unsigned and narrowing)

Unlike `block_height`, which defensively uses `.as_u64()` returning `Option`, these three casts are bare `as` operators. They will silently wrap if any source value is negative. While domain semantics (Unix timestamps, gas amounts, success/failure flags) suggest these should always be non-negative, unchecked signed-to-unsigned conversions can corrupt analytics data. Either confirm the source fields are always non-negative with a comment, or switch to checked conversions (e.g., `u64::try_from(…)`) that skip malformed rows with a warning, matching the defensive pattern used for `block_height`.
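A minimal sketch of the suggested warn-and-skip conversions, using the `i64`/`i16` source types confirmed by the script output above. The function shape and tracing fields are illustrative; only the `try_from` pattern is the point:

```rust
fn checked_fields(timestamp: i64, gas_used: Option<i64>, status: i16) -> Option<(u64, u64, u8)> {
    let (Ok(ts), Ok(gas), Ok(st)) = (
        u64::try_from(timestamp),
        u64::try_from(gas_used.unwrap_or(0)),
        u8::try_from(status),
    ) else {
        // Mirror the block_height pattern: log the offending values, skip the row.
        tracing::warn!(
            timestamp,
            gas_used = ?gas_used,
            status = i64::from(status),
            "analytics: skipping malformed row"
        );
        return None;
    };
    Some((ts, gas, st))
}
```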
Revert #35.
Why
The bulk-COPY path drops `ON CONFLICT DO NOTHING` (COPY can't express it). Cursor monotonicity holds within ONE backfill run but NOT across restarts: indexer restart → reads cursor → restarts backfill from cursor+1 → first ~100-block batch may overlap with already-INSERTed rows from a partially flushed prior run → `duplicate key value violates unique constraint "blocks_pkey"` → batch transaction rolls back → backfill loop iterates the same range forever.
Reproduced live on vps4 mainnet 2026-05-14 16:13Z immediately after deploy:
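For contrast, a minimal sketch of the restart-safe per-height write this revert restores, assuming `tokio_postgres` and a stand-in two-column `blocks` schema (only the `blocks_pkey` uniqueness on height matters). Re-running a height that a partially flushed prior run already wrote becomes a no-op instead of a constraint violation, so restarting from cursor+1 cannot crash-loop:

```rust
use tokio_postgres::Client;

async fn write_block(client: &Client, height: i64, hash: &str) -> Result<(), tokio_postgres::Error> {
    client
        .execute(
            // COPY FROM cannot carry this clause, which is what #35 lost.
            "INSERT INTO blocks (height, hash) VALUES ($1, $2)
             ON CONFLICT (height) DO NOTHING",
            &[&height, &hash],
        )
        .await?;
    Ok(())
}
```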
ON CONFLICT DO NOTHING(COPY can't express it). Cursor monotonicity holds within ONE backfill run but NOT across restarts: indexer restart → reads cursor → restarts backfill from cursor+1 → first ~100 block batch may overlap with already-INSERT'd rows from a partially-flushed prior run →duplicate key value violates unique constraint blocks_pkey→ batch transaction rolls back → backfill loop iterates same range forever.Reproduced live on vps4 mainnet 2026-05-14 16:13Z immediately after deploy: