
Conversation


@JonoPrest JonoPrest commented Aug 29, 2025

Summary by CodeRabbit

  • New Features

    • Client-selectable serialization (JSON or Cap’n Proto)
    • End-to-end Cap’n Proto query/response support and generation tooling
    • New response types: archive height, chain id, rollback metadata
  • Improvements

    • Strongly-typed field selections and join modes for blocks, transactions, logs, traces
    • Safer bloom-filter handling with equality and from-bytes helper
    • Compression and serialization benchmarking added
  • Tests

    • Added Cap’n Proto client streaming test
  • Chores

    • Package version bumps and dependency updates
  • Documentation

    • Removed Dependencies section from README.md

@JonoPrest JonoPrest force-pushed the jp/hack-efficient-queries branch from fa82206 to 91d2fc4 Compare October 24, 2025 16:23
@JonoPrest JonoPrest changed the title Jp/hack efficient queries Capnp encoding for queries Oct 24, 2025
commit f91c16a
Author: Jono Prest <65739024+JonoPrest@users.noreply.github.com>
Date:   Fri Oct 24 14:25:31 2025 +0200

    Allow defining "include" and "exclude" filters on selections in query (#80)

    * Add exclude selections to query

    * Bump version

    * Use empty vecs instead of optionals for exclude selections

    * Change structure to allow and or semantics

    * Add fmt and clippy

    * Bump minor versions since the public api breaks

    * Add backwards compatibility tests

    * Add serialization test for new queries

    * Add pretty assertions

    * Fix serialization compatibility test and improve serialization efficiency

    * Use checksum addresses for passing tests

    * Add rc flag for release

commit 14920fe
Author: Jono Prest <65739024+JonoPrest@users.noreply.github.com>
Date:   Thu Oct 16 17:34:00 2025 +0200

    Handle null effectiveGas for zeta receipts (#81)

commit a1ec5d3
Author: Özgür Akkurt <oezgurmakkurt@gmail.com>
Date:   Wed Sep 10 20:57:05 2025 +0600

    format: make quantity compatible with bincode (#77)

    * format: make quantity compatible with bincode

    * improve code path

    * fix serde expect string

    * version bump

commit 13b5362
Merge: fd505a4 0f67fb5
Author: Dmitry Zakharov <dzakh.dev@gmail.com>
Date:   Thu Sep 4 15:55:11 2025 +0400

    Merge pull request #76 from enviodev/dz/improve-get-events-join-logic

    Improve Get Event join logic

commit 0f67fb5
Author: Dmitry Zakharov <dzakh.dev@gmail.com>
Date:   Thu Sep 4 15:49:57 2025 +0400

    Update EventResponse type

commit 5f6d2a3
Author: Dmitry Zakharov <dzakh.dev@gmail.com>
Date:   Thu Sep 4 15:44:00 2025 +0400

    Fixes after review

commit 8208307
Author: Dmitry Zakharov <dzakh.dev@gmail.com>
Date:   Wed Sep 3 20:19:58 2025 +0400

    Fix clippy

commit 001c2ef
Author: Dmitry Zakharov <dzakh.dev@gmail.com>
Date:   Wed Sep 3 20:05:20 2025 +0400

    Improve Get Event join logic

commit fd505a4
Author: Jason Smythe <jason@wildcards.world>
Date:   Thu Sep 4 13:17:47 2025 +0200

    BlockNumber as integer rather than Hex for Sonic RPC (#72)

    * BlockNumber as integer rather than Hex for Sonic RPC

    Enhance Quantity deserialization to accept numeric values

    Updated the QuantityVisitor to handle both hex strings and numeric values (u64 and i64) for deserialization. Added tests to verify the correct handling of numeric JSON values.

    * bump hypersync format version

    * Improve deserializer for quantity

    * Bump versions again for release

    ---------

    Co-authored-by: Jono Prest <jjprest@gmail.com>
Merge branch 'main' into jp/hack-efficient-queries
@JonoPrest JonoPrest marked this pull request as ready for review October 27, 2025 12:20

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (14)
hypersync-net-types/hypersync_net_types.capnp (3)

212-224: Reorder field declarations to match field numbers.

The joinMode @0 field is declared last (Line 223), while fields @1 through @10 are declared first. While Cap'n Proto allows any declaration order, having the @0 field at the end is confusing and hurts readability.

Consider reordering to match the field numbers:

 struct QueryBody {
+    joinMode @0 :JoinMode;
     logs @1 :List(Selection(LogFilter));
     transactions @2 :List(Selection(TransactionFilter));
     traces @3 :List(Selection(TraceFilter));
     blocks @4 :List(Selection(BlockFilter));
     includeAllBlocks @5 :Bool;
     fieldSelection @6 :FieldSelection;
     maxNumBlocks @7 :OptUInt64;
     maxNumTransactions @8 :OptUInt64;
     maxNumLogs @9 :OptUInt64;
     maxNumTraces @10 :OptUInt64;
-    joinMode @0 :JoinMode;
 }

236-242: Consider documenting the optional value pattern.

The OptUInt64 and OptUInt8 wrapper structs implement optional values in Cap'n Proto. While this pattern is valid, it's worth documenting whether the optionality is determined by:

  • Checking if the struct is present (using Cap'n Proto's has*() methods)
  • Checking if the value field itself is set
  • Some other convention

This clarification would help downstream implementations correctly handle these optional values.
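As a dependency-free sketch of the presence-based convention the comment asks to document: on the Rust side, an absent wrapper struct maps to `None` and a present one to `Some(value)`. The `OptUInt64` shape and `to_option` helper here are illustrative, not the generated capnp types.

```rust
// Hypothetical mirror of the schema's OptUInt64 wrapper. In the real capnp
// reader, presence would be checked via the generated has_*() accessor; here
// Option<OptUInt64> stands in for that presence check.
#[derive(Debug, Clone, Copy)]
struct OptUInt64 {
    value: u64,
}

fn to_option(field: Option<OptUInt64>) -> Option<u64> {
    // Optionality is decided by struct presence, not by the inner value,
    // so 0 stays distinguishable from "unset".
    field.map(|w| w.value)
}
```

The key property this convention preserves is that a stored zero and a missing field remain different states.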


1-242: Add schema-level documentation.

The schema would benefit from comments explaining:

  • The purpose and usage of the schema
  • The semantics of key patterns (Selection, Filter suffix fields, optional types)
  • Examples of common query patterns
  • Version compatibility notes

Consider adding file-level comments at the top and inline comments for complex structures.

hypersync-net-types/Makefile (1)

1-2: Mark the target as .PHONY.

The generate_capnp_types target doesn't produce a file with that name, so it should be declared as a phony target to ensure it always runs when invoked.

Apply this diff:

+.PHONY: generate_capnp_types
 generate_capnp_types:
 	capnp compile ./hypersync_net_types.capnp -o rust:./src/__generated__
hypersync-client/src/lib.rs (2)

389-423: Prefer explicit Accept header and offload parsing to spawn_blocking

  • Add Accept: application/x-capnp to make content negotiation explicit.
  • Use tokio::task::spawn_blocking instead of block_in_place to avoid blocking a runtime worker thread under load.
 async fn get_arrow_impl_json(&self, query: &Query) -> Result<(ArrowResponse, u64)> {
@@
-        if let Some(bearer_token) = &self.bearer_token {
+        if let Some(bearer_token) = &self.bearer_token {
             req = req.bearer_auth(bearer_token);
         }
+        // Response is a packed capnp message containing Arrow data
+        req = req.header("accept", "application/x-capnp");
@@
-        let res = tokio::task::block_in_place(|| {
-            parse_query_response(&bytes).context("parse query response")
-        })?;
+        let bytes_cloned = bytes.clone();
+        let res = tokio::task::spawn_blocking(move || parse_query_response(&bytes_cloned))
+            .await
+            .map_err(|e| anyhow!("spawn_blocking join error: {e}"))??;
@@
-        Ok((res, bytes.len().try_into().unwrap()))
+        Ok((res, u64::try_from(bytes.len()).unwrap_or(u64::MAX)))
 }

424-464: Cap’n Proto path: add Accept header and avoid fallible usize→u64 unwrap

  • Mirror Accept header.
  • Use u64::try_from to avoid unwrap on exotic targets.
 async fn get_arrow_impl_capnp(&self, query: &Query) -> Result<(ArrowResponse, u64)> {
@@
-        let res = req
-            .header("content-type", "application/x-capnp")
+        let res = req
+            .header("content-type", "application/x-capnp")
+            .header("accept", "application/x-capnp")
             .body(query_bytes)
             .send()
             .await
             .context("execute http req")?;
@@
-        let res = tokio::task::block_in_place(|| {
-            parse_query_response(&bytes).context("parse query response")
-        })?;
+        let bytes_cloned = bytes.clone();
+        let res = tokio::task::spawn_blocking(move || parse_query_response(&bytes_cloned))
+            .await
+            .map_err(|e| anyhow!("spawn_blocking join error: {e}"))??;
@@
-        Ok((res, bytes.len().try_into().unwrap()))
+        Ok((res, u64::try_from(bytes.len()).unwrap_or(u64::MAX)))
     }
hypersync-net-types/src/trace.rs (2)

32-114: Avoid unchecked usize→u32 casts when initializing capnp lists

Use u32::try_from(..) and propagate a capnp::Error on overflow. It’s defensive and consistent across builders.

-            let mut from_list = builder.reborrow().init_from(self.from.len() as u32);
+            let mut from_list = builder
+                .reborrow()
+                .init_from(u32::try_from(self.from.len()).map_err(|_| {
+                    capnp::Error::failed("trace.from length exceeds u32".into())
+                })?);
@@
-            let mut to_list = builder.reborrow().init_to(self.to.len() as u32);
+            let mut to_list = builder
+                .reborrow()
+                .init_to(u32::try_from(self.to.len()).map_err(|_| {
+                    capnp::Error::failed("trace.to length exceeds u32".into())
+                })?);
@@
-            let mut addr_list = builder.reborrow().init_address(self.address.len() as u32);
+            let mut addr_list = builder
+                .reborrow()
+                .init_address(u32::try_from(self.address.len()).map_err(|_| {
+                    capnp::Error::failed("trace.address length exceeds u32".into())
+                })?);
@@
-                .init_call_type(self.call_type.len() as u32);
+                .init_call_type(u32::try_from(self.call_type.len()).map_err(|_| {
+                    capnp::Error::failed("trace.call_type length exceeds u32".into())
+                })?);
@@
-                .init_reward_type(self.reward_type.len() as u32);
+                .init_reward_type(u32::try_from(self.reward_type.len()).map_err(|_| {
+                    capnp::Error::failed("trace.reward_type length exceeds u32".into())
+                })?);
@@
-            let mut type_list = builder.reborrow().init_type(self.type_.len() as u32);
+            let mut type_list = builder
+                .reborrow()
+                .init_type(u32::try_from(self.type_.len()).map_err(|_| {
+                    capnp::Error::failed("trace.type length exceeds u32".into())
+                })?);
@@
-            let mut sighash_list = builder.reborrow().init_sighash(self.sighash.len() as u32);
+            let mut sighash_list = builder
+                .reborrow()
+                .init_sighash(u32::try_from(self.sighash.len()).map_err(|_| {
+                    capnp::Error::failed("trace.sighash length exceeds u32".into())
+                })?);

116-256: De-duplicate list parsing with small helpers

from_reader repeats near-identical 20-byte address and 4-byte sighash loops. Extract helpers (e.g., read_addr_list(reader.get_from()?), read_sighash_list(...)) to cut repetition and reduce future bugs.
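One possible shape for such a helper, assuming entries arrive as byte slices (the capnp reader types are elided here, and `read_fixed_list` is a hypothetical name):

```rust
// Hypothetical helper: collect only entries of the expected fixed width,
// centralizing the existing drop-on-wrong-length behavior in one place.
fn read_fixed_list<const N: usize>(items: &[&[u8]]) -> Vec<[u8; N]> {
    items
        .iter()
        .filter_map(|bytes| <[u8; N]>::try_from(*bytes).ok())
        .collect()
}
```

With a helper like this, the 20-byte address loops become `read_fixed_list::<20>(...)` and the 4-byte sighash loops `read_fixed_list::<4>(...)`, so a future change to the skip policy happens in one spot.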

hypersync-net-types/src/response.rs (1)

4-27: DTOs align with client usage; optional: derive Deserialize for RollbackGuard

ArchiveHeight/ChainId are correct for JSON endpoints. If RollbackGuard may ever be JSON-returned, consider adding Deserialize now for symmetry. Otherwise, good as-is.

hypersync-client/src/simple_types.rs (1)

56-65: Document invariants with debug_asserts to guard unwraps in join path

OnlyLogJoinField assumes corresponding log join fields are present and original single-field selections are removed. Add debug_asserts after mutation to make this contract explicit and catch regressions in tests.

     pub(crate) fn add_join_fields_to_selection(&self, field_selection: &mut FieldSelection) {
         match self.block {
             InternalJoinStrategy::NotSelected => (),
             InternalJoinStrategy::OnlyLogJoinField => {
                 field_selection.log.insert(LOG_JOIN_FIELD_WITH_BLOCK);
                 field_selection.block.remove(&BLOCK_JOIN_FIELD);
+                debug_assert!(
+                    field_selection.log.contains(&LOG_JOIN_FIELD_WITH_BLOCK),
+                    "LOG_JOIN_FIELD_WITH_BLOCK must be present for OnlyLogJoinField (block)"
+                );
             }
             InternalJoinStrategy::FullJoin => {
                 field_selection.log.insert(LOG_JOIN_FIELD_WITH_BLOCK);
                 field_selection.block.insert(BLOCK_JOIN_FIELD);
             }
         }
@@
             InternalJoinStrategy::NotSelected => (),
             InternalJoinStrategy::OnlyLogJoinField => {
                 field_selection.log.insert(LOG_JOIN_FIELD_WITH_TX);
                 field_selection.transaction.remove(&TX_JOIN_FIELD);
+                debug_assert!(
+                    field_selection.log.contains(&LOG_JOIN_FIELD_WITH_TX),
+                    "LOG_JOIN_FIELD_WITH_TX must be present for OnlyLogJoinField (tx)"
+                );
             }
             InternalJoinStrategy::FullJoin => {
                 field_selection.log.insert(LOG_JOIN_FIELD_WITH_TX);
                 field_selection.transaction.insert(TX_JOIN_FIELD);
             }
         }

Also applies to: 80-86, 92-98

hypersync-net-types/src/log.rs (3)

22-54: Use u32::try_from for capnp list sizes

Mirror the defensive casting suggestion from trace.rs to avoid unchecked usize→u32 casts.

-            let mut addr_list = builder.reborrow().init_address(self.address.len() as u32);
+            let mut addr_list = builder
+                .reborrow()
+                .init_address(u32::try_from(self.address.len()).map_err(|_| {
+                    capnp::Error::failed("log.address length exceeds u32".into())
+                })?);
@@
-            let mut topics_list = builder.reborrow().init_topics(self.topics.len() as u32);
+            let mut topics_list = builder
+                .reborrow()
+                .init_topics(u32::try_from(self.topics.len()).map_err(|_| {
+                    capnp::Error::failed("log.topics length exceeds u32".into())
+                })?);
@@
-                    .init(i as u32, topic_vec.len() as u32);
+                    .init(
+                        i as u32,
+                        u32::try_from(topic_vec.len()).map_err(|_| {
+                            capnp::Error::failed("log.topic length exceeds u32".into())
+                        })?,
+                    );

80-87: Remove stale commented code

Old TODO/commented lines around address_filter deserialization can be dropped to reduce noise; the real implementation is present below.


91-109: Avoid silent drops with ArrayVec; prefer try_push and log on overflow

If topics exceed capacity (schema change or malformed input), try_push will signal it instead of silently discarding. Optionally log::warn on overflow.

-                if i < 4 && !topic_vec.is_empty() {
-                    topics.push(topic_vec);
-                }
+                if !topic_vec.is_empty() {
+                    if topics.try_push(topic_vec).is_err() {
+                        // Consider logging to surface unexpected over-capacity conditions.
+                        // log::warn!("topics capacity exceeded; extra topics dropped");
+                    }
+                }
hypersync-net-types/src/block.rs (1)

44-80: Consider documenting the silent-drop behavior and adding observability.

The silent dropping of invalid entries (hashes ≠32 bytes, miners ≠20 bytes) is confirmed to be an intentional, consistent pattern across all filter implementations (block.rs, log.rs, trace.rs, transaction.rs). However, the codebase provides no documentation or observability for this behavior.

While this defensive parsing approach provides resilience, consider:

  • Adding a doc comment to explain why invalid entries are dropped (e.g., for malformed Cap'n Proto messages)
  • Including debug logging when entries are skipped (useful for troubleshooting serialization issues)
  • Tracking a metric/counter for monitoring data loss
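A minimal sketch of the observability suggestion, counting skipped entries so callers can log or export a metric; the `HASH_LEN` constant and byte-slice input shape are assumptions for illustration:

```rust
// Hypothetical variant of the parsing loop that surfaces how many entries
// were dropped instead of discarding them silently.
const HASH_LEN: usize = 32;

fn read_hashes(items: &[&[u8]]) -> (Vec<[u8; HASH_LEN]>, usize) {
    let mut skipped = 0;
    let hashes = items
        .iter()
        .filter_map(|bytes| {
            <[u8; HASH_LEN]>::try_from(*bytes)
                .map_err(|_| skipped += 1) // count malformed entries
                .ok()
        })
        .collect();
    (hashes, skipped)
}
```

The caller can then emit a `log::warn!` or bump a counter whenever `skipped > 0`, keeping the lenient parsing behavior while making data loss visible.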
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f91c16a and c3726a8.

⛔ Files ignored due to path filters (2)
  • hypersync-net-types/src/__generated__/hypersync_net_types_capnp.rs is excluded by !**/__generated__/**
  • hypersync-net-types/src/__generated__/mod.rs is excluded by !**/__generated__/**
📒 Files selected for processing (19)
  • hypersync-client/Cargo.toml (2 hunks)
  • hypersync-client/src/config.rs (1 hunks)
  • hypersync-client/src/lib.rs (5 hunks)
  • hypersync-client/src/preset_query.rs (7 hunks)
  • hypersync-client/src/simple_types.rs (4 hunks)
  • hypersync-client/tests/api_test.rs (4 hunks)
  • hypersync-format/src/types/bloom_filter_wrapper.rs (2 hunks)
  • hypersync-net-types/Cargo.toml (2 hunks)
  • hypersync-net-types/Makefile (1 hunks)
  • hypersync-net-types/build.rs (0 hunks)
  • hypersync-net-types/hypersync_net_types.capnp (1 hunks)
  • hypersync-net-types/src/block.rs (1 hunks)
  • hypersync-net-types/src/lib.rs (2 hunks)
  • hypersync-net-types/src/log.rs (1 hunks)
  • hypersync-net-types/src/query.rs (1 hunks)
  • hypersync-net-types/src/response.rs (1 hunks)
  • hypersync-net-types/src/trace.rs (1 hunks)
  • hypersync-net-types/src/transaction.rs (1 hunks)
  • hypersync-net-types/src/types.rs (1 hunks)
💤 Files with no reviewable changes (1)
  • hypersync-net-types/build.rs
🧰 Additional context used
🧬 Code graph analysis (11)
hypersync-net-types/src/types.rs (1)
hypersync-format/src/types/fixed_size_data.rs (1)
  • FixedSizeData (105-105)
hypersync-net-types/src/trace.rs (4)
hypersync-net-types/src/log.rs (8)
  • populate_builder (23-54)
  • from_reader (57-116)
  • cmp (154-156)
  • partial_cmp (160-162)
  • all (166-169)
  • to_capnp (172-191)
  • from_capnp (194-213)
  • schema (226-230)
hypersync-net-types/src/transaction.rs (10)
  • populate_builder (63-84)
  • populate_builder (118-207)
  • from_reader (87-114)
  • from_reader (210-365)
  • cmp (437-439)
  • partial_cmp (443-445)
  • all (449-452)
  • to_capnp (455-566)
  • from_capnp (569-680)
  • schema (693-697)
hypersync-net-types/src/lib.rs (5)
  • populate_builder (52-52)
  • populate_builder (64-79)
  • from_reader (54-56)
  • from_reader (81-91)
  • from (43-48)
hypersync-schema/src/lib.rs (1)
  • trace (131-160)
hypersync-client/src/preset_query.rs (3)
hypersync-net-types/src/block.rs (2)
  • BlockField (264-267)
  • all (145-148)
hypersync-net-types/src/log.rs (2)
  • LogField (231-234)
  • all (166-169)
hypersync-net-types/src/transaction.rs (2)
  • TransactionField (698-701)
  • all (449-452)
hypersync-net-types/src/log.rs (6)
hypersync-net-types/src/block.rs (6)
  • populate_builder (21-42)
  • from_reader (45-79)
  • all (145-148)
  • to_capnp (151-196)
  • from_capnp (199-244)
  • schema (259-263)
hypersync-net-types/src/trace.rs (6)
  • populate_builder (33-113)
  • from_reader (116-256)
  • all (323-326)
  • to_capnp (329-365)
  • from_capnp (368-404)
  • schema (417-421)
hypersync-net-types/src/lib.rs (5)
  • populate_builder (52-52)
  • populate_builder (64-79)
  • from_reader (54-56)
  • from_reader (81-91)
  • from (43-48)
hypersync-format/src/types/bloom_filter_wrapper.rs (3)
  • new (27-29)
  • from (103-105)
  • from_bytes (58-62)
hypersync-net-types/src/query.rs (1)
  • test_query_serde (457-496)
hypersync-schema/src/lib.rs (1)
  • log (113-129)
hypersync-net-types/src/transaction.rs (6)
hypersync-net-types/src/block.rs (6)
  • populate_builder (21-42)
  • from_reader (45-79)
  • all (145-148)
  • to_capnp (151-196)
  • from_capnp (199-244)
  • schema (259-263)
hypersync-net-types/src/trace.rs (6)
  • populate_builder (33-113)
  • from_reader (116-256)
  • all (323-326)
  • to_capnp (329-365)
  • from_capnp (368-404)
  • schema (417-421)
hypersync-net-types/src/lib.rs (5)
  • populate_builder (52-52)
  • populate_builder (64-79)
  • from_reader (54-56)
  • from_reader (81-91)
  • from (43-48)
hypersync-net-types/src/query.rs (1)
  • test_query_serde (457-496)
hypersync-format/src/types/bloom_filter_wrapper.rs (3)
  • from (103-105)
  • new (27-29)
  • from_bytes (58-62)
hypersync-schema/src/lib.rs (1)
  • transaction (61-111)
hypersync-client/src/simple_types.rs (3)
hypersync-net-types/src/block.rs (1)
  • BlockField (264-267)
hypersync-net-types/src/log.rs (1)
  • LogField (231-234)
hypersync-net-types/src/transaction.rs (1)
  • TransactionField (698-701)
hypersync-client/tests/api_test.rs (5)
hypersync-net-types/src/block.rs (2)
  • BlockField (264-267)
  • all (145-148)
hypersync-net-types/src/log.rs (2)
  • LogField (231-234)
  • all (166-169)
hypersync-net-types/src/transaction.rs (2)
  • TransactionField (698-701)
  • all (449-452)
hypersync-client/src/lib.rs (1)
  • new (70-93)
hypersync-client/src/config.rs (2)
  • default (39-42)
  • default (92-94)
hypersync-client/src/lib.rs (1)
hypersync-client/src/parse_response.rs (1)
  • parse_query_response (29-96)
hypersync-net-types/src/query.rs (6)
hypersync-net-types/src/block.rs (3)
  • BlockField (264-267)
  • from_capnp (199-244)
  • from_reader (45-79)
hypersync-net-types/src/log.rs (3)
  • LogField (231-234)
  • from_capnp (194-213)
  • from_reader (57-116)
hypersync-net-types/src/trace.rs (3)
  • TraceField (422-425)
  • from_capnp (368-404)
  • from_reader (116-256)
hypersync-net-types/src/transaction.rs (4)
  • TransactionField (698-701)
  • from_capnp (569-680)
  • from_reader (87-114)
  • from_reader (210-365)
hypersync-client/src/preset_query.rs (2)
  • logs (66-82)
  • transactions (120-133)
hypersync-net-types/src/lib.rs (2)
  • from_reader (54-56)
  • from_reader (81-91)
hypersync-net-types/src/block.rs (6)
hypersync-net-types/src/log.rs (6)
  • populate_builder (23-54)
  • from_reader (57-116)
  • all (166-169)
  • to_capnp (172-191)
  • from_capnp (194-213)
  • schema (226-230)
hypersync-net-types/src/trace.rs (6)
  • populate_builder (33-113)
  • from_reader (116-256)
  • all (323-326)
  • to_capnp (329-365)
  • from_capnp (368-404)
  • schema (417-421)
hypersync-net-types/src/transaction.rs (8)
  • populate_builder (63-84)
  • populate_builder (118-207)
  • from_reader (87-114)
  • from_reader (210-365)
  • all (449-452)
  • to_capnp (455-566)
  • from_capnp (569-680)
  • schema (693-697)
hypersync-net-types/src/lib.rs (5)
  • populate_builder (52-52)
  • populate_builder (64-79)
  • from_reader (54-56)
  • from_reader (81-91)
  • from (43-48)
hypersync-net-types/src/query.rs (1)
  • test_query_serde (457-496)
hypersync-schema/src/lib.rs (1)
  • block_header (27-59)
hypersync-net-types/src/lib.rs (4)
hypersync-net-types/src/block.rs (2)
  • populate_builder (21-42)
  • from_reader (45-79)
hypersync-net-types/src/log.rs (2)
  • populate_builder (23-54)
  • from_reader (57-116)
hypersync-net-types/src/trace.rs (2)
  • populate_builder (33-113)
  • from_reader (116-256)
hypersync-net-types/src/transaction.rs (4)
  • populate_builder (63-84)
  • populate_builder (118-207)
  • from_reader (87-114)
  • from_reader (210-365)
🪛 checkmake (0.2.2)
hypersync-net-types/Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)


[warning] 1-1: Missing required phony target "test"

(minphony)

🪛 GitHub Actions: CI
hypersync-client/Cargo.toml

[error] 1-1: Process completed with exit code 101 during cargo test --release.

hypersync-net-types/Cargo.toml

[error] 1-1: Process completed with exit code 101 during cargo test --release.

hypersync-client/tests/api_test.rs

[error] 563-563: test_api_capnp_client panicked: called Result::unwrap() on an Err value: start inner stream -> Caused by: error sending request for url (http://localhost:1131/height) -> Connection refused (os error 111). The integration test failed due to the local server not starting or being unavailable.

🔇 Additional comments (50)
hypersync-net-types/hypersync_net_types.capnp (5)

26-29: LGTM: Generic Selection pattern.

The generic Selection(T) struct with include/exclude semantics is a clean pattern for filtering.
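A minimal Rust mirror of the include/exclude semantics, under the common reading that an empty include list means "match all" while exclude always removes; this is an illustrative sketch, not the actual type in hypersync-net-types:

```rust
// Hypothetical counterpart of the capnp Selection(T) pattern.
struct Selection<T> {
    include: Vec<T>,
    exclude: Vec<T>,
}

impl<T: PartialEq> Selection<T> {
    // An item matches if it passes the include list (or the list is empty)
    // and does not appear in the exclude list.
    fn matches(&self, item: &T) -> bool {
        (self.include.is_empty() || self.include.contains(item))
            && !self.exclude.contains(item)
    }
}
```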


32-74: Clarify the "Filter" suffix field pattern.

Several filter structs have dual patterns: list fields (e.g., address @0 :List(Data)) and singular "Filter" suffix fields (e.g., addressFilter @1 :Data). The semantic difference between these patterns is not documented in the schema.

For example, LogFilter has both address (list) and addressFilter (singular data), while BlockFilter only has list fields. This inconsistency and lack of documentation could lead to incorrect usage.

Consider adding comments to explain:

  • What the "Filter" suffix fields represent (bloom filters? bitmap filters?)
  • When to use list fields vs. filter fields
  • Why BlockFilter doesn't follow this pattern

76-81: LGTM: Field selection pattern.

The FieldSelection struct with lists of field enums enables efficient field-level filtering to reduce payload size.


83-87: LGTM: JoinMode enum.

Clear enumeration of join modes for query behavior.


89-210: LGTM: Comprehensive field enums.

The field enums (BlockField, TransactionField, LogField, TraceField) provide extensive coverage of blockchain data fields, including L1/L2 optimizations and EIP-4844 blob fields. Sequential numbering is correct.

hypersync-format/src/types/bloom_filter_wrapper.rs (2)

18-24: LGTM: Equality implementation is correct.

The PartialEq and Eq implementations correctly compare bloom filters by their byte representations, which is the appropriate way to check equality for this wrapper type.
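The approach can be sketched as byte-wise equality on a wrapper type; the `Vec<u8>` inner representation here is an assumption for illustration, not the real `BloomFilterWrapper` internals:

```rust
// Minimal sketch: equality is defined over the serialized byte
// representation, which is the stable identity for a bloom filter.
struct BloomFilterWrapper(Vec<u8>);

impl PartialEq for BloomFilterWrapper {
    fn eq(&self, other: &Self) -> bool {
        self.0 == other.0
    }
}

impl Eq for BloomFilterWrapper {}
```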


58-62: LGTM: Deserialization method correctly implemented.

The from_bytes method provides a clean constructor for deserializing bloom filters from raw bytes, properly handling errors via the Error::BloomFilterFromBytes variant.

hypersync-client/Cargo.toml (2)

3-3: LGTM: Version bump aligns with release candidate progression.


50-50: LGTM: Dependency version aligned with net-types changes.

hypersync-net-types/src/types.rs (1)

1-3: LGTM: Type alias improves clarity and type safety.

The Sighash type alias correctly represents 4-byte Ethereum function signatures using FixedSizeData<4>, providing better semantic meaning than using the raw type directly.

hypersync-net-types/Cargo.toml (3)

3-3: LGTM: Version bump aligns with release candidate progression.


14-16: LGTM: Dependencies support enum-based field selections.

The addition of schemars, strum, and strum_macros appropriately supports the migration to strongly-typed field enums (BlockField, TransactionField, LogField) introduced in this PR.


20-20: LGTM: Dev dependency added for testing.

hypersync-client/src/config.rs (2)

24-27: LGTM: New configuration field enables serialization format selection.

The serialization_format field properly uses #[serde(default)] to ensure backward compatibility with existing configurations that don't specify this field.


29-43: LGTM: Well-designed enum with backward-compatible default.

The SerializationFormat enum provides a clear choice between JSON and Cap'n Proto serialization. The default implementation maintains backward compatibility while the comment explains the transitional nature, which is helpful for future maintainers.
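The described shape can be sketched as follows (serde derives and the `#[serde(default)]` wiring are elided to keep the sketch dependency-free; variant names are taken from the review, the derive set is an assumption):

```rust
// Defaulting to Json means configs written before this field existed
// keep their old behavior unchanged.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum SerializationFormat {
    #[default]
    Json,
    Capnp,
}
```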

hypersync-client/src/preset_query.rs (6)

6-8: LGTM: Imports support type-safe field selections.


42-46: LGTM: Consistent use of typed field enums.


67-67: LGTM: Using all() method from LogField enum.


98-98: LGTM: Consistent enum usage for log fields.


121-121: LGTM: Transaction field enum usage.


144-144: LGTM: Consistent pattern throughout the file.

hypersync-client/tests/api_test.rs (3)

5-12: LGTM: Updated imports support type-safe field selections and Cap'n Proto testing.


21-23: LGTM: Migration to type-safe BlockField enum.


453-455: LGTM: Consistent use of TransactionField enum.

hypersync-client/src/lib.rs (3)

38-38: Public re-export looks good

Re-exporting SerializationFormat at the crate root is a helpful API addition for callers. No issues.


64-66: Config wiring for serialization_format is correct

Field added and initialized from ClientConfig.serialization_format; no behavior risk detected.

Also applies to: 91-91


466-472: Dispatching by SerializationFormat is clear

Match on SerializationFormat with dedicated impls keeps concerns separated. LGTM.

hypersync-net-types/src/trace.rs (3)

7-30: TraceFilter shape and serde look correct

Fields and serde(rename = "type") align with hypersync-schema::trace(). No issues.


259-405: TraceField enum and capnp mappings are comprehensive

Variant set matches schema, with stable ordering and to_capnp/from_capnp symmetry. LGTM.


407-482: Tests meaningfully guard schema/mapping/serde

The schema parity check and round-trip tests are valuable here. Nice coverage.

hypersync-client/src/simple_types.rs (1)

9-11: Move to typed field enums eliminates string drift

Importing BlockField/TransactionField/LogField and using typed consts is a solid improvement.

Also applies to: 31-34

hypersync-net-types/src/log.rs (2)

119-214: LogField enum and capnp mappings look correct

Full coverage with stable ordering and symmetric conversions. LGTM.


216-295: Tests cover schema parity and serde round-trips

Good unit coverage and realistic full-value case.

hypersync-net-types/src/block.rs (4)

6-18: LGTM!

The BlockFilter structure is well-documented and follows consistent patterns with the other filter types (LogFilter, TransactionFilter) in the codebase. The use of empty vectors to represent "match all" semantics is clearly documented.


21-42: LGTM!

The populate_builder implementation correctly follows the Cap'n Proto builder pattern and is consistent with similar implementations in log.rs and trace.rs. The use of reborrow() prevents borrow checker issues when populating multiple fields.


82-245: LGTM!

The BlockField enum and its Cap'n Proto conversion methods are well-implemented:

  • Exhaustive matches in to_capnp() and from_capnp() ensure compile-time safety if the schema changes
  • The Ord implementation based on string representation provides consistent ordering
  • Using strum's IntoEnumIterator for the all() method is idiomatic and maintainable

247-308: LGTM!

The test suite provides good coverage:

  • Schema alignment verification ensures BlockField variants match the Arrow schema
  • Serde/strum consistency check prevents serialization mismatches
  • Round-trip test with actual data validates the Cap'n Proto implementation
hypersync-net-types/src/query.rs (4)

108-128: LGTM!

The Cap'n Proto serialization methods follow standard patterns and use packed serialization for efficiency.


130-255: LGTM!

The populate_capnp_query method correctly populates all fields and uses proper reborrow patterns. The comment "Hehe" at line 142 is a bit informal but doesn't affect functionality.


344-446: LGTM!

The parsing logic for max limits, join mode, and selections correctly propagates errors and follows consistent patterns.


457-627: LGTM!

The test suite provides comprehensive coverage including edge cases (default values, explicit defaults, large payloads). The benchmarking output is useful for performance analysis, though printing in tests is unconventional.

hypersync-net-types/src/lib.rs (3)

1-49: LGTM!

The module organization is clean and the re-exports maintain backward compatibility. The Selection<T> struct with include/exclude semantics is well-documented.


51-57: LGTM!

The BuilderReader trait provides a clean abstraction for Cap'n Proto serialization/deserialization and is appropriately scoped as pub(crate).


59-92: LGTM!

The BuilderReader implementation for Selection<T> correctly handles the include/exclude pattern with proper borrow management using scoped blocks and reborrow().

hypersync-net-types/src/transaction.rs (6)

1-58: LGTM!

The AuthorizationSelection and TransactionFilter structures are well-documented and follow consistent patterns with other filter types. The use of FilterWrapper for probabilistic filtering is clearly documented.


60-115: LGTM with caveat.

The BuilderReader implementation for AuthorizationSelection follows the same pattern as other filter types. Note that it shares the same behavior of silently dropping invalid address entries (line 105), as discussed in the block.rs review.


117-207: LGTM!

The populate_builder implementation for TransactionFilter correctly handles all fields including nested AuthorizationSelection entries.


209-366: LGTM with caveat.

The from_reader implementation for TransactionFilter correctly handles all fields including nested authorization lists. It shares the same silent-drop behavior for invalid entries as noted in earlier reviews. The error handling for FilterWrapper deserialization is appropriate.


368-681: LGTM!

The TransactionField enum and conversion methods are well-implemented with exhaustive matches ensuring compile-time safety against schema changes.


683-824: LGTM!

The test suite provides comprehensive coverage including nested AuthorizationSelection and various edge cases. The schema alignment test ensures consistency with the Arrow schema.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c3726a8 and fe5874c.

⛔ Files ignored due to path filters (1)
  • hypersync-net-types/src/__generated__/hypersync_net_types_capnp.rs is excluded by !**/__generated__/**
📒 Files selected for processing (2)
  • hypersync-client/tests/api_test.rs (4 hunks)
  • hypersync-net-types/Makefile (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • hypersync-client/tests/api_test.rs
🧰 Additional context used
🪛 checkmake (0.2.2)
hypersync-net-types/Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)


[warning] 1-1: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test_release
  • GitHub Check: lint
🔇 Additional comments (1)
hypersync-net-types/Makefile (1)

2-2: All concerns in the review comment have been verified and resolved; no changes needed.

Based on verification of the codebase:

  • The capnp compiler generates files following the pattern {schema_name}_capnp.rs, and the file hypersync_net_types_capnp.rs correctly follows this convention.
  • The schema file hypersync_net_types.capnp exists at hypersync-net-types/hypersync_net_types.capnp, which is in the correct working directory relative to the Makefile.
  • The output directory ./src/__generated__ exists at the specified path and correctly receives the generated artifacts.
  • The capnp compiler generates source code in the specified directory with a default of the current directory, and the command correctly specifies ./src/__generated__ for output.

The Makefile command is properly configured and all path references are accurate.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
hypersync-net-types/src/query.rs (1)

320-332: Do not silently substitute enum values on decode; propagate errors.

Using .get(i).ok().unwrap_or(...) masks schema/versioning/corruption issues and can return incorrect data. Replace with proper error propagation.

Apply this diff pattern to all four sets:

-                    (0..block_list.len())
-                        .map(|i| {
-                            BlockField::from_capnp(
-                                block_list
-                                    .get(i)
-                                    .ok()
-                                    .unwrap_or(hypersync_net_types_capnp::BlockField::Number),
-                            )
-                        })
-                        .collect::<BTreeSet<_>>()
+                    (0..block_list.len())
+                        .map(|i| block_list.get(i).map(BlockField::from_capnp))
+                        .collect::<Result<BTreeSet<_>, capnp::Error>>()?
@@
-                        (0..tx_list.len())
-                            .map(|i| {
-                                TransactionField::from_capnp(tx_list.get(i).ok().unwrap_or(
-                                    hypersync_net_types_capnp::TransactionField::BlockHash,
-                                ))
-                            })
-                            .collect::<BTreeSet<_>>()
+                        (0..tx_list.len())
+                            .map(|i| tx_list.get(i).map(TransactionField::from_capnp))
+                            .collect::<Result<BTreeSet<_>, capnp::Error>>()?
@@
-                        (0..log_list.len())
-                            .map(|i| {
-                                LogField::from_capnp(log_list.get(i).ok().unwrap_or(
-                                    hypersync_net_types_capnp::LogField::TransactionHash,
-                                ))
-                            })
-                            .collect::<BTreeSet<_>>()
+                        (0..log_list.len())
+                            .map(|i| log_list.get(i).map(LogField::from_capnp))
+                            .collect::<Result<BTreeSet<_>, capnp::Error>>()?
@@
-                        (0..trace_list.len())
-                            .map(|i| {
-                                TraceField::from_capnp(trace_list.get(i).ok().unwrap_or(
-                                    hypersync_net_types_capnp::TraceField::TransactionHash,
-                                ))
-                            })
-                            .collect::<BTreeSet<_>>()
+                        (0..trace_list.len())
+                            .map(|i| trace_list.get(i).map(TraceField::from_capnp))
+                            .collect::<Result<BTreeSet<_>, capnp::Error>>()?

Also applies to: 336-348, 350-362, 364-376

🧹 Nitpick comments (9)
hypersync-client/src/lib.rs (1)

389-423: Be explicit about wire format: add Accept and zstd Content‑Encoding.

To make proxies/load‑balancers behave consistently and document expectations to the server:

  • For JSON path, declare you expect a Cap’n Proto response.
  • For Cap’n Proto path, also declare zstd encoding for the request body and expected response.

Confirm the server expects zstd-compressed Cap’n Proto requests at /query/arrow-ipc/capnp and returns Cap’n Proto for both endpoints.

Apply this diff:

@@
-        let res = req.json(&query).send().await.context("execute http req")?;
+        let res = req
+            .header("accept", "application/x-capnp")
+            .json(&query)
+            .send()
+            .await
+            .context("execute http req")?;
@@
-        let res = req
-            .header("content-type", "application/x-capnp")
+        let res = req
+            .header("content-type", "application/x-capnp")
+            .header("content-encoding", "zstd")
+            .header("accept", "application/x-capnp")
             .body(query_bytes)
             .send()
             .await
             .context("execute http req")?;

Optional: the two impls duplicate URL/headers/error handling; consider a small helper to build the URL and common headers, and pass a serializer (JSON vs Cap’n Proto).

Also applies to: 424-465

hypersync-net-types/src/query.rs (4)

186-193: Remove stray “Hehe” comment.

Keep comments purposeful.

-        // Hehe
         let mut body_builder = query.reborrow().init_body();

124-129: Bound decompression to mitigate zip‑bomb risk.

from_bytes unconditionally fully decompresses input. Add a reasonable cap (configurable) to prevent unbounded allocations.

-    pub fn from_bytes(bytes: &[u8]) -> Result<Self, Box<dyn std::error::Error>> {
-        // Check compression.rs benchmarks
-        let decompressed_bytes = zstd::decode_all(bytes)?;
-        let query = Query::from_capnp_bytes(&decompressed_bytes)?;
-        Ok(query)
-    }
+    pub fn from_bytes(bytes: &[u8]) -> Result<Self, Box<dyn std::error::Error>> {
+        use std::io::Read;
+        // Cap decompression to a sane upper bound (tune as needed).
+        const MAX_DECOMPRESSED: u64 = 16 * 1024 * 1024; // 16 MiB
+        let mut decoder = zstd::stream::read::Decoder::new(bytes)?;
+        let mut limited = decoder.take(MAX_DECOMPRESSED);
+        let mut decompressed_bytes = Vec::new();
+        limited.read_to_end(&mut decompressed_bytes)?;
+        let query = Query::from_capnp_bytes(&decompressed_bytes)?;
+        Ok(query)
+    }

47-61: Consider u64 for max_num_* fields to avoid truncation on 32-bit.

The max_num_* fields are Option<usize> but serialized as u64. Casting back via as usize may truncate on 32-bit targets. Align the types to u64 end-to-end or validate ranges on decode.

Please confirm supported platforms; if 32-bit targets are in scope, changing these to Option<u64> is safer.

Also applies to: 388-416


108-122: Minor: unify error types for to_bytes/from_bytes.

to_bytes returns Result<Vec<u8>, String> while from_bytes returns Result<Self, Box<dyn std::error::Error>>. Consider a single error type for consistency.

hypersync-net-types/benches/compression.rs (4)

368-373: Remove duplicate JSON encode/decode run (dead work).

This Encoding::new(... "json" ...) result is unused; it just burns time.

-    Encoding::new(
-        &query,
-        "json".to_string(),
-        |q| serde_json::to_vec(q).unwrap(),
-        |bytes| serde_json::from_slice(bytes).unwrap(),
-    );

140-147: Be explicit about ArrayVec capacity (readability, easier refactors).

Topics are 0..=3 on EVM; make CAP=4 explicit to document intent and avoid inference surprises.

-        let mut topics = ArrayVec::new();
+        let mut topics: ArrayVec<Vec<FixedSizeData<32>>, 4> = ArrayVec::new();

288-307: Prefer expect(...) over unwrap() in bench I/O for clearer failures.

Unwrap panics hide the failing step. Using expect("context") improves debuggability without changing behavior.

Example:

-                enc.write_all(bytes).unwrap();
-                enc.finish().unwrap()
+                enc.write_all(bytes).expect("zlib write_all");
+                enc.finish().expect("zlib finish")

Apply similarly to zlib/zstd/lz4 decode paths.

Also applies to: 316-321


381-387: Nit: idiomatic tail return.

You can return the vector without return.

-    println!("Benchmark {label}\n{table}\n");
-    return table_rows;
+    println!("Benchmark {label}\n{table}\n");
+    table_rows
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ae9ee15 and 5f7ccee.

📒 Files selected for processing (4)
  • hypersync-client/src/lib.rs (5 hunks)
  • hypersync-net-types/Cargo.toml (2 hunks)
  • hypersync-net-types/benches/compression.rs (1 hunks)
  • hypersync-net-types/src/query.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • hypersync-net-types/Cargo.toml
🧰 Additional context used
🧬 Code graph analysis (3)
hypersync-net-types/benches/compression.rs (2)
hypersync-net-types/src/log.rs (1)
  • LogField (231-234)
hypersync-net-types/src/query.rs (5)
  • default (91-93)
  • to_capnp_bytes_packed (152-161)
  • from_capnp_bytes_packed (164-172)
  • to_capnp_bytes (132-141)
  • from_capnp_bytes (144-150)
hypersync-net-types/src/query.rs (4)
hypersync-net-types/src/block.rs (3)
  • BlockField (264-267)
  • from_capnp (199-244)
  • from_reader (45-79)
hypersync-net-types/src/log.rs (3)
  • LogField (231-234)
  • from_capnp (194-213)
  • from_reader (57-116)
hypersync-net-types/src/trace.rs (3)
  • TraceField (422-425)
  • from_capnp (368-404)
  • from_reader (116-256)
hypersync-net-types/src/transaction.rs (4)
  • TransactionField (698-701)
  • from_capnp (569-680)
  • from_reader (87-114)
  • from_reader (210-365)
hypersync-client/src/lib.rs (1)
hypersync-client/src/parse_response.rs (1)
  • parse_query_response (29-96)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test_release
  • GitHub Check: lint
🔇 Additional comments (3)
hypersync-client/src/lib.rs (2)

64-66: Good addition: runtime‑selectable serialization.

Field and initialization look correct; defaulting via ClientConfig keeps API predictable.

Also applies to: 91-91


466-473: Dispatch looks clean and future‑proof.

Simple and extensible switch over SerializationFormat.

hypersync-net-types/benches/compression.rs (1)

11-114: Bench configuration is already correct; harness = false is properly set.

Verification confirms the [[bench]] section in hypersync-net-types/Cargo.toml (lines 28–30) already specifies name = "compression" and harness = false. The concern raised is not applicable.

Likely an incorrect or invalid review comment.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
hypersync-net-types/benches/compression.rs (2)

216-223: Guard percentage formatting against 0 baseline to avoid "inf%".

If lowest.as_micros() is 0, division by zero yields "inf%". This issue was previously flagged and remains unresolved.

Apply this diff to guard against zero baseline:

 fn add_percentage(val: Duration, lowest: Duration) -> String {
     if val == lowest {
         format!("{val:?}")
     } else {
-        let percentage = percentage_incr(val.as_micros() as f64, lowest.as_micros() as f64);
-        format!("{val:?} ({percentage}%)")
+        let base = lowest.as_micros() as f64;
+        if base == 0.0 {
+            return format!("{val:?} (+N/A)");
+        }
+        let percentage = percentage_incr(val.as_micros() as f64, base);
+        format!("{val:?} ({percentage}%)")
     }
 }

314-319: LZ4 decode can OOM/panic: don't pass u32::MAX as decompressed size.

lz4_flex::decompress(bytes, u32::MAX as usize) may attempt multi-GB allocation and crash. This issue was previously flagged and remains unresolved.

Apply this diff to use size-prepended block APIs:

 fn lz4_encode(bytes: &[u8], _level: u32) -> Vec<u8> {
-    lz4_flex::compress(bytes)
+    lz4_flex::block::compress_prepend_size(bytes)
 }
 fn lz4_decode(bytes: &[u8], _level: u32) -> Vec<u8> {
-    lz4_flex::decompress(bytes, u32::MAX as usize).unwrap()
+    lz4_flex::block::decompress_size_prepended(bytes).expect("lz4 decode")
 }
🧹 Nitpick comments (3)
hypersync-net-types/benches/compression.rs (3)

18-18: Fix typo in comment.

"len fist" should be "len first".

-    // sort by len fist
+    // sort by len first

90-92: Consider removing or documenting commented-out benchmark.

If this "huge payload less contracts" benchmark is intentionally disabled, add a comment explaining why. Otherwise, remove it to reduce clutter.


17-53: Consider extracting rank-computation logic to reduce duplication.

The three ranking blocks (bytes, decode time, encode time) follow the same pattern. You could extract a generic helper function to reduce duplication.

Example refactor:

fn update_ranks<T: Ord + Copy + Default>(
    encodings: &[Encoding],
    key_fn: impl Fn(&Encoding) -> T,
    ranks: &mut HashMap<String, usize>,
) {
    // Sort references so Encoding does not need to be Clone,
    // and borrow key_fn so it can be reused in the loop below.
    let mut sorted: Vec<&Encoding> = encodings.iter().collect();
    sorted.sort_by_key(|e| key_fn(e));
    let mut prev_val = T::default();
    let mut current_pos = 0;
    for encoding in sorted {
        let val = key_fn(encoding);
        if val > prev_val {
            current_pos += 1;
            prev_val = val;
        }
        let current = ranks.get(&encoding.name).copied().unwrap_or(0);
        ranks.insert(encoding.name.clone(), current + current_pos);
    }
}

Then call it three times:

update_ranks(&encodings, |e| e.bytes.len(), &mut bytes_ranks);
update_ranks(&encodings, |e| e.decode_time, &mut decode_time_ranks);
update_ranks(&encodings, |e| e.encode_time, &mut encode_time_ranks);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5f7ccee and 8e4e85a.

📒 Files selected for processing (2)
  • README.md (1 hunks)
  • hypersync-net-types/benches/compression.rs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
hypersync-net-types/benches/compression.rs (2)
hypersync-net-types/src/log.rs (1)
  • LogField (231-234)
hypersync-net-types/src/query.rs (5)
  • default (91-93)
  • to_capnp_bytes_packed (152-161)
  • from_capnp_bytes_packed (164-172)
  • to_capnp_bytes (132-141)
  • from_capnp_bytes (144-150)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: lint
  • GitHub Check: test_dev
  • GitHub Check: test_release
🔇 Additional comments (2)
README.md (2)

5-5: Indentation formatting change.

Line 5 shows a modification to the crates.io badge image tag. The current indentation appears appropriate for nested HTML elements. However, verify that this formatting aligns with the project's markdown conventions and that the visual rendering of the badge is not affected.


1-11: Verify that removal of Dependencies section is intentional and complete.

According to the AI summary, the Dependencies section (including installation instructions for Linux, Windows, and macOS) has been removed. While this may be intentional given the PR's architectural changes, ensure that critical setup instructions are still available to users elsewhere in the documentation (e.g., linked docs pages or a new setup section).

Consider whether any essential getting-started information should be retained or relocated within the README.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (2)
hypersync-net-types/src/query.rs (2)

38-38: Fix documentation typos.

The following typos remain from previous reviews:

  • Line 38: "Weather" → "Whether"
  • Line 66: "Applites similarly" → "Applies similarly"
  • Line 84-85: "Applites similarly" → "Applies similarly"

Also applies to: 66-66, 84-85


321-377: Silent substitution of fallback field values masks errors.

Lines 324-331, 340-346, 354-360, and 369-374 use .ok().unwrap_or(...) with hardcoded fallback enum values when deserializing field selections. This silently substitutes arbitrary fields when the Cap'n Proto enum value is unknown, the data is corrupted, or deserialization fails. This could result in queries returning incorrect data without any indication to the user.

🧹 Nitpick comments (1)
hypersync-net-types/src/query.rs (1)

110-130: Consider consistent error types for symmetry.

The to_bytes method returns anyhow::Result<Vec<u8>> while from_bytes returns Result<Self, Box<dyn std::error::Error>>. For API consistency, consider using the same error type for both methods.

Apply this diff for consistency:

-    pub fn from_bytes(bytes: &[u8]) -> Result<Self, Box<dyn std::error::Error>> {
+    pub fn from_bytes(bytes: &[u8]) -> anyhow::Result<Self> {
         // Check compression.rs benchmarks
         let decompressed_bytes = zstd::decode_all(bytes)?;
         let query = Query::from_capnp_bytes(&decompressed_bytes)?;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8e4e85a and abcee8a.

📒 Files selected for processing (3)
  • hypersync-net-types/Cargo.toml (2 hunks)
  • hypersync-net-types/Makefile (1 hunks)
  • hypersync-net-types/src/query.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • hypersync-net-types/Cargo.toml
🧰 Additional context used
🧬 Code graph analysis (1)
hypersync-net-types/src/query.rs (4)
hypersync-net-types/src/block.rs (3)
  • BlockField (264-267)
  • from_capnp (199-244)
  • from_reader (45-79)
hypersync-net-types/src/log.rs (3)
  • LogField (231-234)
  • from_capnp (194-213)
  • from_reader (57-116)
hypersync-net-types/src/trace.rs (3)
  • TraceField (422-425)
  • from_capnp (368-404)
  • from_reader (116-256)
hypersync-net-types/src/transaction.rs (4)
  • TransactionField (698-701)
  • from_capnp (569-680)
  • from_reader (87-114)
  • from_reader (210-365)
🪛 checkmake (0.2.2)
hypersync-net-types/Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)


[warning] 1-1: Missing required phony target "test"

(minphony)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test_release
  • GitHub Check: test_dev
🔇 Additional comments (7)
hypersync-net-types/Makefile (1)

1-8: LGTM! Makefile targets properly address previous feedback.

The .PHONY declaration and specialized clean target (clean_generated_capnp_types) appropriately address the previous review suggestions. The specialized naming is reasonable for a project-specific Makefile focused solely on Cap'n Proto generation. The static analysis warnings about missing conventional targets (all, clean, test) can be safely ignored in this context.

hypersync-net-types/src/query.rs (6)

73-77: Clean helper function implementation.

The is_default helper provides a clean way to skip serializing default-valued fields in serde, improving payload efficiency.
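The helper being praised boils down to a small generic function; this is a sketch of the pattern, which the crate pairs with serde's skip_serializing_if attribute:

```rust
/// Returns true when a value equals its type's Default, so serde can
/// skip it via: #[serde(default, skip_serializing_if = "is_default")]
fn is_default<T: Default + PartialEq>(value: &T) -> bool {
    *value == T::default()
}

fn main() {
    println!("{}", is_default(&0u64));             // true: 0 is u64's default
    println!("{}", is_default(&42u64));            // false
    println!("{}", is_default(&Vec::<u8>::new())); // true: empty vec is default
}
```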


79-95: Well-structured JoinMode enum.

Clear documentation and appropriate Default implementation.


97-107: Good field selection structure.

Using BTreeSet provides ordering and deduplication, while skip_serializing_if optimizes the serialized payload.


132-173: Proper Cap'n Proto serialization implementation.

The standard and packed serialization methods correctly follow Cap'n Proto patterns. Clear labeling of packed versions as "for testing" is helpful.


175-300: Comprehensive serialization with proper Cap'n Proto patterns.

The method correctly handles all fields, properly uses reborrow(), and appropriately manages optional values. The implementation is thorough and correct.


493-575: Excellent test coverage.

The test module provides comprehensive coverage across multiple serialization formats (Cap'n Proto standard, packed, and JSON) and properly tests default, explicit default, and non-default scenarios. The use of pretty_assertions will aid debugging.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
hypersync-net-types/src/query.rs (1)

507-551: Consider adding tests with non-empty selections.

The current tests verify serialization formats but all use empty vectors for logs, transactions, traces, and blocks. Adding a test case with populated selections would improve coverage of the nested serialization logic in populate_capnp_query and from_capnp_query.

Example test case:

#[test]
pub fn test_query_serde_with_selections() {
    let query = Query {
        from_block: 100,
        to_block: Some(200),
        logs: vec![LogSelection::default()],
        transactions: vec![TransactionSelection::default()],
        field_selection: FieldSelection {
            block: BlockField::all().into_iter().take(3).collect(),
            log: LogField::all().into_iter().take(3).collect(),
            ..Default::default()
        },
        ..Default::default()
    };
    test_query_serde(query, "query with selections");
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between abcee8a and 4d866f3.

📒 Files selected for processing (1)
  • hypersync-net-types/src/query.rs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
hypersync-net-types/src/query.rs (5)
hypersync-net-types/src/block.rs (3)
  • BlockField (264-267)
  • from_capnp (199-244)
  • from_reader (45-79)
hypersync-net-types/src/log.rs (3)
  • LogField (231-234)
  • from_capnp (194-213)
  • from_reader (57-116)
hypersync-net-types/src/trace.rs (3)
  • TraceField (422-425)
  • from_capnp (368-404)
  • from_reader (116-256)
hypersync-net-types/src/transaction.rs (4)
  • TransactionField (698-701)
  • from_capnp (569-680)
  • from_reader (87-114)
  • from_reader (210-365)
hypersync-client/src/preset_query.rs (2)
  • logs (66-82)
  • transactions (120-133)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test_dev
🔇 Additional comments (4)
hypersync-net-types/src/query.rs (4)

12-107: Well-structured Query model with proper serde attributes.

The struct definitions are clean and well-documented. The use of skip_serializing_if for optional and default fields will keep serialized payloads minimal. The is_default helper is a good pattern for skipping default-valued fields.


109-173: Good serialization strategy with appropriate compression.

The use of zstd compression at level 6 with regular Cap'n Proto bytes (rather than packed) is well-justified by the comment referencing benchmarks. The separate packed methods for testing provide useful flexibility.


175-300: Comprehensive Cap'n Proto builder population.

The method correctly handles all Query fields, including optional values and nested structures. The use of reborrow() is appropriate for Cap'n Proto's builder pattern.


320-354: Silent substitution issue correctly fixed.

The field selection deserialization now properly propagates errors using .map().collect::<Result<_, _>>()? instead of silently substituting fallback values. This ensures that unknown or corrupted enum values will result in an error rather than incorrect query data.
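The collect-into-Result pattern behind that fix can be shown with std alone:

```rust
use std::num::ParseIntError;

// Parsing every item either yields all values or surfaces the first
// error; no element is silently replaced with a fallback value.
fn parse_all(items: &[&str]) -> Result<Vec<u32>, ParseIntError> {
    items.iter().map(|s| s.parse::<u32>()).collect()
}

fn main() {
    assert_eq!(parse_all(&["1", "2", "3"]), Ok(vec![1, 2, 3]));
    assert!(parse_all(&["1", "oops", "3"]).is_err());
    println!("error propagation ok");
}
```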

Comment on lines +367 to +394
let max_num_blocks = if body.has_max_num_blocks() {
let max_blocks_reader = body.get_max_num_blocks()?;
let value = max_blocks_reader.get_value();
Some(value as usize)
} else {
None
};
let max_num_transactions = if body.has_max_num_transactions() {
let max_tx_reader = body.get_max_num_transactions()?;
let value = max_tx_reader.get_value();
Some(value as usize)
} else {
None
};
let max_num_logs = if body.has_max_num_logs() {
let max_logs_reader = body.get_max_num_logs()?;
let value = max_logs_reader.get_value();
Some(value as usize)
} else {
None
};
let max_num_traces = if body.has_max_num_traces() {
let max_traces_reader = body.get_max_num_traces()?;
let value = max_traces_reader.get_value();
Some(value as usize)
} else {
None
};

⚠️ Potential issue | 🟡 Minor

Consider overflow protection for u64 to usize conversions.

On 32-bit systems, casting u64 to usize (lines 370, 377, 384, 391) could overflow if the values exceed usize::MAX. While unlikely in practice for query limits, consider using TryFrom or adding explicit bounds checking.

Apply this pattern for safer conversions:

 let max_num_blocks = if body.has_max_num_blocks() {
     let max_blocks_reader = body.get_max_num_blocks()?;
     let value = max_blocks_reader.get_value();
-    Some(value as usize)
+    Some(usize::try_from(value).map_err(|_| capnp::Error::failed("max_num_blocks exceeds usize::MAX".to_string()))?)
 } else {
     None
 };

Apply similar fixes for max_num_transactions, max_num_logs, and max_num_traces.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let max_num_blocks = if body.has_max_num_blocks() {
let max_blocks_reader = body.get_max_num_blocks()?;
let value = max_blocks_reader.get_value();
Some(value as usize)
} else {
None
};
let max_num_transactions = if body.has_max_num_transactions() {
let max_tx_reader = body.get_max_num_transactions()?;
let value = max_tx_reader.get_value();
Some(value as usize)
} else {
None
};
let max_num_logs = if body.has_max_num_logs() {
let max_logs_reader = body.get_max_num_logs()?;
let value = max_logs_reader.get_value();
Some(value as usize)
} else {
None
};
let max_num_traces = if body.has_max_num_traces() {
let max_traces_reader = body.get_max_num_traces()?;
let value = max_traces_reader.get_value();
Some(value as usize)
} else {
None
};
let max_num_blocks = if body.has_max_num_blocks() {
let max_blocks_reader = body.get_max_num_blocks()?;
let value = max_blocks_reader.get_value();
Some(usize::try_from(value).map_err(|_| capnp::Error::failed("max_num_blocks exceeds usize::MAX".to_string()))?)
} else {
None
};
let max_num_transactions = if body.has_max_num_transactions() {
let max_tx_reader = body.get_max_num_transactions()?;
let value = max_tx_reader.get_value();
Some(usize::try_from(value).map_err(|_| capnp::Error::failed("max_num_transactions exceeds usize::MAX".to_string()))?)
} else {
None
};
let max_num_logs = if body.has_max_num_logs() {
let max_logs_reader = body.get_max_num_logs()?;
let value = max_logs_reader.get_value();
Some(usize::try_from(value).map_err(|_| capnp::Error::failed("max_num_logs exceeds usize::MAX".to_string()))?)
} else {
None
};
let max_num_traces = if body.has_max_num_traces() {
let max_traces_reader = body.get_max_num_traces()?;
let value = max_traces_reader.get_value();
Some(usize::try_from(value).map_err(|_| capnp::Error::failed("max_num_traces exceeds usize::MAX".to_string()))?)
} else {
None
};
🤖 Prompt for AI Agents
In hypersync-net-types/src/query.rs around lines 367 to 394, the code casts u64
values to usize directly, which can overflow on 32-bit targets; replace the
direct casts with fallible conversions (e.g., use usize::try_from(value).ok(),
or explicitly check value <= usize::MAX as u64 before casting) so each
max_num_* assignment uses TryFrom or a bounds check and either maps to
Some(usize) on success or returns None/an error on overflow, depending on
desired behavior; apply the same pattern for max_num_blocks,
max_num_transactions, max_num_logs, and max_num_traces.

@JonoPrest JonoPrest merged commit d7de2d3 into main Oct 29, 2025
4 checks passed