fix parquet table scans #473
Conversation
|
@adsharma could you PTAL? |
|
benchmark is hanging. Lemme fix it |
|
minimal_test succeeds every time locally. Not sure why it fails in CI. Lemme take a look |
The CI tests run as relwithdebinfo with assertions enabled. Perhaps you're running the release builds locally? |
Understood. I am running test-build |
|
The latest commit should fix the build |
adsharma
left a comment
I'll go ahead and merge this one because it fixes many problems.
Let's continue fixing the more complex query cases where there may be duplicate tuples in the input/bound side in future PRs.
  ParquetRelTableScanState& parquetRelScanState,
  const std::vector<uint64_t>& rowGroupsToProcess,
- const std::unordered_set<common::offset_t>& boundNodeOffsets);
+ const std::unordered_map<common::offset_t, common::sel_t>& boundNodeOffsets);
A node can appear multiple times in a factorized/vectorized input because the same bound node may be paired with different upstream values. Example:
MATCH (seed:user), (a:user {id: 100})
WITH seed, a
MATCH (a)-[:follows]->(b)
RETURN seed.name, b.name
Here a is the same node in every tuple, but each tuple has a different seed. If the scanner stores only:
unordered_map<offset_t, sel_t>
then duplicate a.offset values collapse to one sel_t.
This is not a problem for native tables. Should we use the same method?
Native rel table scan does not use unordered_map<offset_t, vector<sel_t>>.
It uses the original selection vector directly and processes bound-node positions in order:
- RelTableScanState::cachedBoundNodeSelVector stores the input sel_ts.
- RelTableScanState::currBoundNodeIdx is an index into that cached selection vector.
- Native CSR scan repeatedly reads:
cachedBoundNodeSelVector[currBoundNodeIdx]
then uses that selected row to get the bound node offset.
Relevant files:
- src/include/storage/table/rel_table.h:22
- src/storage/table/rel_table.cpp:91
- src/storage/table/csr_node_group.cpp:181
So duplicates are naturally preserved because the scan state is position-driven, not offset-keyed. If the same
node offset appears in three input rows, it appears three times in cachedBoundNodeSelVector, and native scan can
flatten back to each sel_t separately.
The parquet path’s unordered_map<offset_t, sel_t> is different: it indexes by node offset, so duplicate offsets
collapse. That is less general than native CSR behavior. A closer parquet equivalent would be either:
unordered_map<offset_t, vector<sel_t>>
or, more native-like, avoid offset-key ownership and drive scanning from cachedBoundNodeSelVector /
currBoundNodeIdx, preserving each input position independently.
drive scanning from cachedBoundNodeSelVector
This is the plan. I have already implemented this in my previous PR but had to revert it to avoid risk. It also helps us use indptr for fwd scans at least
A node can appear multiple times in a factorized/vectorized input because the same bound node may be paired with different upstream values
Right now it doesn't cause issues because parquet_node_table always sends one offset at a time. We might want to revisit this if node_table behaviour changes, although I'm not sure if there's aggregation going on at the higher (operator) level
Our test coverage doesn't exercise many such self-join and complex code paths that would catch a less general implementation. We might have to create 3-hop tests that specifically catch these cases before our users do.
|
benchmark times for first 10 queries. Full suite has been running for more than half a day now |
|
Anything that takes that long is hard to reproduce and will slow us down. How about we look at query plans for: and compare them to native tables to see if there is a slowdown in the parquet path? I would disable hash indices for that work so disk usage/access is comparable. |
I think parquet would be slower for sure because we are effectively not using indptrs, and parquet doesn't have an index, so some queries might be even slower. I will run and see the difference |
|
Ran into a lot of issues while running a disable_hash_index build, so I went ahead with the 0.16.1 official release. The times are not even comparable: db size was 1.2 GB, whereas the icebug-disk parquet files were |
|
aha! failing on large queries, even with 32 GB max_db_size |
|
Without the hash_index it's even faster? lbdb size is 1 GB now. Will run the bigger queries tomorrow |
|
Same as native with hash_index: failed at query 24 with a Buffer Manager exception. But still way faster than parquet |
|
@adsharma how can I verify whether hash_index is created or not? Size on disk? If you look above, the queries w/o index are faster |
|
Run CALL show_indexes() RETURN *; and check the size on disk. |
|
Good to see benchmarks are better without hash indexes! |
|
4M * 8
it returns empty on both 0.16.1 and latest |
|
I remember fixing it, but maybe the PR didn't go out. Let me push the ART index branch, which could help you. Still validating it. |
|
Probably easier to create a second demo db with the patterns you want. No one has seen this commit; we could just reset HEAD to the previous version. |