Conversation
Add instrumentation to track object lifetimes via AtomicI64 counters that only decrement from Drop impls, proving actual deallocation. Tracked types: GraphqlFetcher, BlockStream, AvroFileWriters, AvroFileWriter, FinalizedBatchFiles, plus tokio alive task count. Counters log every 100 run() iterations — any counter trending upward over time confirms a leak of that object type.
PR Summary (Medium Risk)

Overview: Adds streaming uploads to avoid loading large Avro files into memory. Introduces leak instrumentation and reconnection hardening. Also updates the dune Docker image to run as …

Reviewed by Cursor Bugbot for commit c07dea3.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
```rust
let file_path = self.create_output(data, &key).await?;
tracing::info!("New file saved: {}", file_path);
Ok(file_path)
}
```
Unused process_data method is dead code
Low Severity
The newly added process_data method on Processor is never called anywhere in the codebase. Only process_data_from_file is used (by process_finalized_batch in service.rs). This is dead code that adds maintenance burden without providing value.
```rust
// Read a chunk
let bytes_read = file.read(&mut buffer).map_err(|e| {
    StorageError::StoreError(format!("Failed to read file: {}", e))
})?;
```
Short reads may produce undersized S3 multipart parts
Low Severity
upload_multipart_from_file uses a single file.read(&mut buffer) call per chunk, which is not guaranteed to fill the buffer. Read::read may return fewer bytes than requested. S3 requires all non-final multipart parts to be at least 5 MiB; an undersized part would cause complete_multipart_upload to fail with EntityTooSmall. A read loop or Read::read_exact (with EOF handling for the last chunk) would be more robust.

