PACKED logical and physical operators by adsharma · Pull Request #584 · LadybugDB/ladybug

adsharma · 2026-06-12T18:17:43Z

Context: https://amine.io/papers/2026-sigmod-ffx.pdf

Based on the cited work, we implement the following operators:

PACKED_EXTEND (logical and physical)
PACKED_FILTER_COUNT (physical)

More physical operators can be implemented in the future, depending on which optimizations are needed.

On this query:

  CALL enable_packed_path_extend=true;
  MATCH
    (:Leaf)-[:R1]->(c:Center)<-[:R2]-(l2:Leaf),
    (c)<-[:R3]-(l3:Leaf)
  WITH c.grp AS grp, l2.id AS l2id, l3.id AS l3id
  WHERE (l2id + l3id) % 10 = 0
  RETURN grp, count(*) AS cnt
  ORDER BY cnt DESC;

using the following synthetic graph:

from pathlib import Path

out = Path("/tmp/ffx_synth_csv")
out.mkdir(parents=True, exist_ok=True)  # create the output directory if it is missing
centers = 1000
fanout = 200

with (out / "center.csv").open("w") as f:
    for c in range(centers):
        f.write(f"{c},{c % 10}\n")

with (out / "leaf.csv").open("w") as f:
    for i in range(centers * fanout):
        f.write(f"{i}\n")

for rel in ["r1", "r2", "r3"]:
    with (out / f"{rel}.csv").open("w") as f:
        for c in range(centers):
            base = c * fanout
            for j in range(fanout):
                f.write(f"{base + j},{c}\n")

we got:

baseline 0.282s ±0.1306,
packed 0.072s ±0.0056
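
As a sanity check on the benchmark, the expected result of the query can be derived directly from the generator parameters. The sketch below is not part of the PR; it assumes the three pattern edges bind independently (so the R1 leaf contributes a plain factor of `fanout`), and the function names are made up for illustration. It also shows why counting in factorized form is attractive here: the flat result would have 1000 × 200³ = 8 × 10⁹ rows before the filter.

```python
from collections import Counter


def expected_counts(centers: int = 1000, fanout: int = 200) -> Counter:
    """Count matches per grp without enumerating the flat join result."""
    counts = Counter()
    for c in range(centers):
        base = c * fanout
        # Residues mod 10 of the leaf ids attached to this center.
        residues = Counter(leaf % 10 for leaf in range(base, base + fanout))
        # (l2, l3) pairs with (l2 + l3) % 10 == 0, counted by residue class.
        passing_pairs = sum(residues[r] * residues[(10 - r) % 10] for r in range(10))
        # The R1 leaf is unconstrained, so it multiplies the count by fanout.
        counts[c % 10] += fanout * passing_pairs
    return counts


def brute_force_counts(centers: int, fanout: int) -> Counter:
    """Flat enumeration, only feasible for tiny parameters."""
    counts = Counter()
    for c in range(centers):
        leaves = range(c * fanout, (c + 1) * fanout)
        for _l1 in leaves:
            for l2 in leaves:
                for l3 in leaves:
                    if (l2 + l3) % 10 == 0:
                        counts[c % 10] += 1
    return counts
```

With the parameters from the script (1000 centers, fanout 200), every one of the ten groups should come out to 80,000,000 under these assumptions.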

queryproc

I skimmed through the code and had a few comments. The first point reflects my understanding, so please correct me if I am mistaken; the second and third are comments on what to address and what to keep in mind, respectively.

My understanding is that you are reusing the adjacency-list scan for a physical packed extend. In a scan -> extend pipeline, the extend operator performs lookups over the input, processes multiple input nodes in a single processNextChunk call, and writes the offsets for each list of matches. Once the output vector is full, it sends the results to the next operator.

Furthermore, you do not need to maintain explicit start and end positions in the state. For example, suppose you are processing 2048 input elements and each expands to roughly 10 outputs. What you need is to maintain an active slice over the input. Since you already track the parent positions and write the offsets in a packed fashion, the active slice is implicitly available.
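
To make the two paragraphs above concrete, here is a minimal Python sketch of a packed extend over an adjacency map. It is an illustration of the idea only, not the code in this PR: `PackedSlices`, `packed_extend`, and `active_slice` are hypothetical names, the adjacency structure is a plain dict rather than a CSR, and a list that is larger than the output capacity is emitted whole instead of being split.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class PackedSlices(NamedTuple):
    parent_positions: List[int]  # input positions that produced at least one child
    offsets: List[int]           # offsets[k]:offsets[k+1] is the child range of parent k
    children: List[int]          # flat output vector


def packed_extend(adj: Dict[int, List[int]], batch: List[int],
                  start: int, capacity: int) -> Tuple[PackedSlices, int]:
    """Extend several input nodes in one call; return the output and the resume position."""
    parent_positions: List[int] = []
    offsets: List[int] = [0]
    children: List[int] = []
    pos = start
    while pos < len(batch):
        nbrs = adj.get(batch[pos], [])
        # Stop before a parent whose list would overflow the output vector.
        # (Simplification: an oversized list is still emitted if the output is empty.)
        if children and len(children) + len(nbrs) > capacity:
            break
        if nbrs:
            parent_positions.append(pos)
            children.extend(nbrs)
            offsets.append(len(children))
        pos += 1
    return PackedSlices(parent_positions, offsets, children), pos


def active_slice(s: PackedSlices) -> Optional[Tuple[int, int]]:
    """The covered input range falls out of the parent positions; no extra state needed."""
    if not s.parent_positions:
        return None
    return s.parent_positions[0], s.parent_positions[-1] + 1


# 2048 input nodes with 10 neighbours each, output capacity 2048.
adj = {n: [n * 10 + j for j in range(10)] for n in range(2048)}
batch = list(range(2048))
first, resume_at = packed_extend(adj, batch, 0, 2048)
second, _ = packed_extend(adj, batch, resume_at, 2048)
```

In this example the first call packs 204 parents (2040 children) and the second call resumes at input position 204, so the active slice of each output is recoverable from `parent_positions` alone.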

I assume you are building this progressively, which is why there is not yet a notion of cascade updates. Otherwise, cascade updates are necessary for correctness. For example, consider expanding from a set of a nodes to b nodes, and then from those b nodes to c nodes. If some b nodes do not have matching c nodes, this needs to be propagated upward so that the corresponding a entries are also invalidated when all of their b matches fail to produce c matches. This is discussed in Section 5.2 of the paper.
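
The invariant behind cascade updates can be sketched in a few lines of Python. This is only an illustration of the a, b, c example above, not the algorithm from Section 5.2 of the paper or anything in this PR; `cascade_valid` is a hypothetical helper, and for simplicity the offsets here are dense (one range per parent, possibly empty) rather than the sparse parent-position form.

```python
from typing import List


def cascade_valid(offsets: List[int], child_valid: List[bool]) -> List[bool]:
    """Parent k stays valid only if some child in offsets[k]:offsets[k+1] is valid."""
    return [any(child_valid[offsets[k]:offsets[k + 1]])
            for k in range(len(offsets) - 1)]


# a -> b packing: a0 -> {b0, b1}, a1 -> {b2}, a2 -> {b3, b4}
a_to_b_offsets = [0, 2, 3, 5]
# b -> c packing: b1 has one c match, b3 has two, the other b nodes have none.
b_to_c_offsets = [0, 0, 1, 1, 3, 3]
c_valid = [True, True, True]

# Propagate upward: first invalidate b nodes without c matches,
# then invalidate a nodes whose b matches all failed.
b_valid = cascade_valid(b_to_c_offsets, c_valid)
a_valid = cascade_valid(a_to_b_offsets, b_valid)
```

Here `a1` is dropped because its only b match (`b2`) has no c matches, which is exactly the case that would produce wrong results without propagation.
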
Eventually, one major difficulty beyond expand operators is making expression evaluation aware of the indirection introduced by offsets and optimizing for it. In particular, expression evaluation needs to correctly align the operand vectors before applying binary operations. We have some work on this over the summer, and I will keep you in the loop.
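
As a rough illustration of the alignment problem, consider a binary operation between a parent-side value and a packed child vector. The sketch below is an assumption-laden toy, not the planned work mentioned above: the function names are hypothetical, and it aligns operands the naive way by materializing a broadcast copy of the parent values, which an optimized evaluator would presumably avoid.

```python
from operator import add
from typing import Callable, List


def align_parent_to_children(parent_vals: List[int], parent_positions: List[int],
                             offsets: List[int]) -> List[int]:
    """Repeat each parent's value once per child so both operands line up."""
    aligned: List[int] = []
    for k, pos in enumerate(parent_positions):
        aligned.extend([parent_vals[pos]] * (offsets[k + 1] - offsets[k]))
    return aligned


def eval_binary(op: Callable[[int, int], int], parent_vals: List[int],
                parent_positions: List[int], offsets: List[int],
                child_vals: List[int]) -> List[int]:
    """Apply op(parent, child) element-wise after resolving the offset indirection."""
    aligned = align_parent_to_children(parent_vals, parent_positions, offsets)
    if len(aligned) != len(child_vals):
        raise ValueError("offsets do not cover the child vector")
    return [op(p, c) for p, c in zip(aligned, child_vals)]


# Parents at input positions 0 and 2 produced 2 and 3 children respectively;
# the parent at position 1 had no matches and is skipped.
sums = eval_binary(add, [10, 20, 30], [0, 2], [0, 2, 5], [1, 2, 3, 4, 5])
```

Applying the operator directly to the two raw vectors would pair `20` with a child of the first parent, which is the misalignment the comment warns about.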

queryproc · 2026-06-17T13:01:41Z

 class LBUG_API DataChunkState {
 public:
+    struct PackedChildSlices {
+        std::vector<sel_t> parentPositions;


nitpick: The max sizes are known at initialization time, so I would use an array instead of a vector and allocate them.

Going with vector::reserve() instead to optimize allocations. Arrays could overflow the stack, and it is not clear that we know the sizes at compile time.

queryproc · 2026-06-17T13:02:40Z

+        DASSERT(packedChildSlices.has_value());
+        return *packedChildSlices;
+    }
+    void setPackedChildSlices(std::vector<sel_t> parentPositions, std::vector<sel_t> offsets) {


Unsure whether this move is what updates the DataChunk slice. To get back to.

Use nodeIDVector selVector[0], not currBoundNodeIdx

ScanRelTable::updatePackedChildSlices previously read the parent position from cachedBoundNodeSelVector[currBoundNodeIdx], but currBoundNodeIdx can advance past the current parent within a single scan() call (it is incremented when a parent's CSR list is fully consumed). This produced a wrong parentPos, breaking the factorized correlation between extends and yielding 0 results for multi-hop packed-extend queries.

Restore the original, correct logic: the CSR scan sets nodeIDVector to flat with selVector[0] pointing at the actual parent whose children are in the output (via setNodeIDVectorToFlat). Use that as parentPos and set a single-parent slice (overwrite), matching the one-parent-per-batch scan architecture.

Also remove the clear-at-start added in getNextTuplesInternal, since setSingleParentPackedChildSlice already overwrites. The appendPackedChildSlice API on DataChunkState is retained for future multi-parent packing. Fix the drops-parents test to iterate via hasNext().

adsharma · 2026-06-18T00:14:37Z

@queryproc Thanks for the review!

Commit 5139223 addresses it
Added a unittest to verify
Looking forward to seeing your work this summer and learning from it

docs/multi_parent_lifetime.md contains some of what we could be doing, but deferred for now to keep this PR manageable.

adsharma force-pushed the ffx branch from 3d76fef to 5e7e0e4 Compare June 12, 2026 18:22

queryproc reviewed Jun 17, 2026

adsharma added 5 commits June 17, 2026 11:20

Add opt-in packed path extend planning

11d06f1

Add packed extend physical operator

7f9a649

Track packed child slices on packed extend

caf83c5

Add packed filtered count operator

779eac9

DataChunkState: add appendPackedChildSlice and PackedChildSlices::append

5139223

Also use append in ScanRelTable

adsharma force-pushed the ffx branch from 5e7e0e4 to eb9c9c5 Compare June 17, 2026 19:08

adsharma added 2 commits June 17, 2026 12:24

Add unit test for appendPackedChildSlice; test zero-sized append beha…

d5fe9d8

…vior

adsharma force-pushed the ffx branch from eb9c9c5 to b4cb711 Compare June 17, 2026 20:15

adsharma added 2 commits June 17, 2026 16:42

docs: add multi-parents lifetime plan

e94633a

packed extend: optimize memory allocation for vectors

e5d7c0b

adsharma merged commit 34b3b3f into main Jun 18, 2026
4 checks passed

adsharma deleted the ffx branch June 18, 2026 01:27

adsharma merged 9 commits into main from ffx
