Conversation
queryproc
left a comment
There was a problem hiding this comment.
I skimmed through the code and had a few comments. The first point reflects my understanding, so please correct me if I am mistaken; the second and third are comments on what to address and what to keep in mind, respectively.
- My understanding is that you are reusing the adjacency-list scan for a physical packed extend. In a
scan -> extendpipeline, the extend operator performs lookups over the input, processes multiple input nodes in a single processNextChunk call, and writes the offsets for each list of matches. Once the output vector is full, it sends the results to the next operator.
Furthermore, you do not need to maintain explicit start and end positions in the state. For example, suppose you are processing 2048 input elements and each expands to roughly 10 outputs. What you need is to maintain an active slice over the input. Since you already track the parent positions and write the offsets in a packed fashion, the active slice is implicitly available.
-
I assume you are building this progressively, which is why there is not yet a notion of cascade updates. Otherwise, cascade updates are necessary for correctness. For example, consider expanding from a set of
anodes tobnodes, and then from thosebnodes tocnodes. If somebnodes do not have matchingcnodes, this needs to be propagated upward so that the correspondingaentries are also invalidated when all of theirbmatches fail to producecmatches. This is discussed in Section 5.2 of the paper. -
Eventually, one major difficulty beyond expand operators is making expression evaluation aware of the indirection introduced by offsets and optimizing for it. In particular, expression evaluation needs to correctly align the operand vectors before applying binary operations. We have some work on this over the summer, and I will keep you in the loop.
| class LBUG_API DataChunkState { | ||
| public: | ||
| struct PackedChildSlices { | ||
| std::vector<sel_t> parentPositions; |
There was a problem hiding this comment.
nitpick: The max sizes are known at initialization time, I would use an array instead of vector and allocate them.
There was a problem hiding this comment.
Going with vector::reserve() instead to optimize allocations. arrays could overflow stack and not clear we know the sizes at compile time.
| DASSERT(packedChildSlices.has_value()); | ||
| return *packedChildSlices; | ||
| } | ||
| void setPackedChildSlices(std::vector<sel_t> parentPositions, std::vector<sel_t> offsets) { |
There was a problem hiding this comment.
unsure if this move is what updates the Datachunk slice. To get back to.
Use nodeIDVector selVector[0], not currBoundNodeIdx ScanRelTable::updatePackedChildSlices previously read the parent position from cachedBoundNodeSelVector[currBoundNodeIdx], but currBoundNodeIdx can advance past the current parent within a single scan() call (it is incremented when a parent's CSR list is fully consumed). This produced a wrong parentPos, breaking the factorized correlation between extends and yielding 0 results for multi-hop packed-extend queries. Restore the original, correct logic: the CSR scan sets nodeIDVector to flat with selVector[0] pointing at the actual parent whose children are in the output (via setNodeIDVectorToFlat). Use that as parentPos and set a single-parent slice (overwrite), matching the one-parent-per-batch scan architecture. Also remove the clear-at-start added in getNextTuplesInternal since setSingleParentPackedChildSlice already overwrites. The appendPackedChildSlice API on DataChunkState is retained for future multi-parent packing. Fix the drops-parents test to iterate via hasNext().
|
@queryproc Thanks for the review!
|
Context: https://amine.io/papers/2026-sigmod-ffx.pdf
Based on the cited work, we implement:
operators. More physical operators can be implemented in the future based on the desired optimization.
On this query:
using the following synthetic graph:
we got:
baseline 0.282s ±0.1306,
packed 0.072s ±0.0056