Disk buffer instrumented#540

Merged
xgreenx merged 14 commits into main from disk-buffer-instrumented on Apr 19, 2026

Conversation

xgreenx (Collaborator) commented Apr 19, 2026

No description provided.

mchristopher and others added 14 commits on February 4, 2026 at 10:58
Add instrumentation to track object lifetimes via AtomicI64 counters
that only decrement from Drop impls, proving actual deallocation.

Tracked types: GraphqlFetcher, BlockStream, AvroFileWriters,
AvroFileWriter, FinalizedBatchFiles, plus tokio alive task count.

Counters log every 100 run() iterations — any counter trending
upward over time confirms a leak of that object type.
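The pattern described in the commit message can be sketched as follows. This is a minimal, hypothetical illustration (the names `FETCHER_ALIVE` and `TrackedFetcher` here stand in for the PR's actual tracked types): a static counter is incremented on construction and decremented only from `Drop`, so a value that returns to baseline proves real deallocation rather than mere reference release.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

// Hypothetical counter mirroring the PR's approach: incremented on
// construction, decremented ONLY from Drop, so a flat trend over time
// proves the objects are actually being deallocated.
static FETCHER_ALIVE: AtomicI64 = AtomicI64::new(0);

struct TrackedFetcher;

impl TrackedFetcher {
    fn new() -> Self {
        FETCHER_ALIVE.fetch_add(1, Ordering::Relaxed);
        TrackedFetcher
    }
}

impl Drop for TrackedFetcher {
    fn drop(&mut self) {
        // The decrement lives only here, so the counter can never
        // under-report the number of live objects.
        FETCHER_ALIVE.fetch_sub(1, Ordering::Relaxed);
    }
}

fn main() {
    let a = TrackedFetcher::new();
    let b = TrackedFetcher::new();
    assert_eq!(FETCHER_ALIVE.load(Ordering::Relaxed), 2);
    drop(a);
    drop(b);
    // Counter returns to zero once both objects are dropped.
    assert_eq!(FETCHER_ALIVE.load(Ordering::Relaxed), 0);
    println!("alive={}", FETCHER_ALIVE.load(Ordering::Relaxed));
}
```

Logging such counters periodically (the PR does so every 100 `run()` iterations) turns a suspected leak into a single monotonically increasing number.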

cursor Bot commented Apr 19, 2026

PR Summary

Medium Risk
Changes the core sv-dune ingestion/upload pipeline to use on-disk Avro buffering and new file-streaming S3 uploads, which could affect correctness and retry behavior under failures. Also alters stream reconnection behavior by recreating GraphQL fetchers and tightening channel capacities, which may impact stability if misconfigured.

Overview
Switches sv-dune batch handling from in-memory accumulation to a disk-backed Avro buffer. New DiskBuffer writes blocks/txs/receipts to temporary Avro files (with per-block flush) and finalizes to file paths for upload, then resets to prepare the next batch.
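The write-flush-finalize-reset lifecycle described above can be sketched with plain file I/O. This is an assumption-laden simplification (the struct name `DiskBuffer` matches the summary, but the real implementation writes Avro-encoded blocks/txs/receipts, not raw bytes, and its method names may differ):

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::{Path, PathBuf};

// Simplified sketch of a disk-backed batch buffer: append records to a
// temp file, flush after each block, then finalize by renaming the temp
// file to a stable path for upload. The real buffer writes Avro.
struct DiskBuffer {
    tmp_path: PathBuf,
    file: File,
}

impl DiskBuffer {
    fn new(dir: &Path) -> std::io::Result<Self> {
        let tmp_path = dir.join("batch.avro.tmp");
        let file = File::create(&tmp_path)?;
        Ok(DiskBuffer { tmp_path, file })
    }

    // Per-block flush keeps the on-disk file consistent after each block,
    // bounding memory use to one block at a time.
    fn write_block(&mut self, record: &[u8]) -> std::io::Result<()> {
        self.file.write_all(record)?;
        self.file.flush()
    }

    // Finalize: close the handle, then move the temp file to its upload
    // path. The caller streams the finalized file to S3 and constructs a
    // fresh buffer for the next batch.
    fn finalize(self, out: &Path) -> std::io::Result<PathBuf> {
        let DiskBuffer { tmp_path, file } = self;
        drop(file); // ensure the handle is closed before renaming
        fs::rename(&tmp_path, out)?;
        Ok(out.to_path_buf())
    }
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir();
    let mut buf = DiskBuffer::new(&dir)?;
    buf.write_block(b"block-1")?;
    buf.write_block(b"block-2")?;
    let path = buf.finalize(&dir.join("batch.avro"))?;
    assert_eq!(fs::read(&path)?, b"block-1block-2".to_vec());
    fs::remove_file(&path)?;
    println!("finalized");
    Ok(())
}
```

Consuming `self` in `finalize` enforces at the type level that a finalized buffer cannot be written to again, which matches the "finalizes... then resets" description.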

Adds streaming uploads to avoid loading large Avro files into memory. Processor gains process_data_from_file, S3Storage adds store_from_file (streaming put_object or multipart upload from disk), and the service uploads blocks/transactions/receipts sequentially from the finalized files.

Introduces leak instrumentation and reconnection hardening. Adds alloc_counter plus TrackedFetcher/TrackedStream, recreates GraphqlFetcher on each reconnect with reduced channel capacities, and periodically logs counters (including tokio alive task metrics); also tightens multipart upload handling by requiring ETags and making abort best-effort.

Also updates the dune Docker image to run as nobody and installs extra debugging tools, and bumps a few lockfile dependencies plus adds tempfile for new tests.

Reviewed by Cursor Bugbot for commit c07dea3. Bugbot is set up for automated code reviews on this repo. Configure here.

@xgreenx xgreenx merged commit e4899cb into main Apr 19, 2026
7 of 8 checks passed
@xgreenx xgreenx deleted the disk-buffer-instrumented branch April 19, 2026 01:11

cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.



let file_path = self.create_output(data, &key).await?;
tracing::info!("New file saved: {}", file_path);
Ok(file_path)
}

Unused process_data method is dead code

Low Severity

The newly added process_data method on Processor is never called anywhere in the codebase. Only process_data_from_file is used (by process_finalized_batch in service.rs). This is dead code that adds maintenance burden without providing value.



// Read a chunk
let bytes_read = file.read(&mut buffer).map_err(|e| {
StorageError::StoreError(format!("Failed to read file: {}", e))
})?;

Short reads may produce undersized S3 multipart parts

Low Severity

upload_multipart_from_file uses a single file.read(&mut buffer) call per chunk, which is not guaranteed to fill the buffer. Read::read may return fewer bytes than requested. S3 requires all non-final multipart parts to be at least 5 MiB; an undersized part would cause complete_multipart_upload to fail with EntityTooSmall. A read loop or Read::read_exact (with EOF handling for the last chunk) would be more robust.
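The read loop Bugbot suggests can be sketched as below. This is a hedged illustration, not the PR's code: the helper name `read_full` is hypothetical, but the logic is the standard fix, looping until the buffer is full or EOF so that every non-final part meets S3's 5 MiB minimum even when the underlying reader returns short reads.

```rust
use std::io::{self, Read};

// Fill `buf` completely unless EOF arrives first; returns the number of
// bytes actually read. Only the final part of an upload may be short.
fn read_full<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..])? {
            0 => break, // EOF: this is the last (possibly short) part
            n => filled += n,
        }
    }
    Ok(filled)
}

fn main() -> io::Result<()> {
    // A reader that returns at most 3 bytes per call, simulating the
    // short reads that a single `file.read` per part would mishandle.
    struct Chunky(io::Cursor<Vec<u8>>);
    impl Read for Chunky {
        fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
            let n = buf.len().min(3);
            self.0.read(&mut buf[..n])
        }
    }

    let mut r = Chunky(io::Cursor::new(vec![1u8; 10]));
    let mut part = [0u8; 8];
    let n = read_full(&mut r, &mut part)?;
    assert_eq!(n, 8); // full-size part despite 3-byte short reads
    let n = read_full(&mut r, &mut part)?;
    assert_eq!(n, 2); // final short part at EOF
    println!("ok");
    Ok(())
}
```

`Read::read_exact` would also work for non-final parts, but it returns `UnexpectedEof` on a short final chunk, so the explicit loop handles both cases uniformly.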




3 participants