Optimize element_at for maps with complex type keys #7365
@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
How is this related to #7191? CC: @laithsakka
velox/functions/lib/SubscriptUtil.h
size_t offset = rawOffsets[mapIndex];
// Fast path for the case of a single map. It may be constant or dictionary
// encoded. Sort map keys, then use binary search.
if (baseMap->size() == 1) {
@laithsakka How does this criterion compare to your method for checking identity between batches?
@mbasmanova Laith is using a method that checks whether the map vectors in different batches are the same. Your method looks simpler and less error-prone, while his method covers more cases.
Also, #7191 applies to cases where there are many maps, not just one.
I think it's possible to expand this one to all key types and use Laith's triggering criteria to detect MapVector identity. Then we can use one single optimization to cover all cases.
@Yuhta @laithsakka Indeed, it should be straightforward to extend this optimization to primitive types. We can also optimize Nested Loop Join to detect the case of single-row build side and use Constant encoding for build-side columns + sort build-side maps. If we do that, we'll have maps coming to element_at already sorted and that will make this optimization even more effective.
Laith, what do you think?
@Yuhta @laithsakka There is another pattern where the map is a constant defined in the query. In this case it is not produced by the NLJ. We can add logic to always sort constant map literals so that element_at receives sorted maps in that case too. If we do that, we won't need to make element_at stateful and can keep it stateless and relatively simple.
I thought about both ideas initially and discussed them with @oerling and @kevinwilfong.
My concern was that we would have to sort for every batch, because I am not confident that the isSorted flag is reset whenever a map is updated. We could play the same game of holding a reference to make sure the map is not updated, and hence assert that if it was sorted in one batch it will always be sorted when received.
I do not have a strong opinion either way. Maybe run this optimization on the query I am working on as well and check the results?
If you strongly feel this is better, you can take over and abandon the other PR, but make sure that you double-check with @oerling.
If we just sort everything (not just complex typed keys) and binary search if the input is a single map, how does the result compare to the one in #7191?
One more thing I recall from the discussions with Orri is that those base maps can be shared across threads, so we would need to add locks around the sorting if we do it in the functions, unless we sort before distributing the vector.
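To illustrate the locking concern, here is a minimal standalone sketch, not Velox code. `SharedKeys` is a hypothetical stand-in for a base map that several threads may reach; it assumes plain integer keys and guards a one-time in-place sort with a mutex so that concurrent callers never observe a half-sorted array.

```cpp
#include <algorithm>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical container for map keys that may be shared across threads.
class SharedKeys {
 public:
  explicit SharedKeys(std::vector<int64_t> keys) : keys_(std::move(keys)) {}

  // Sorts the keys in place at most once. The mutex serializes callers, so
  // only the first one sorts and the rest see the finished result.
  const std::vector<int64_t>& sorted() {
    std::lock_guard<std::mutex> guard(mutex_);
    if (!isSorted_) {
      std::sort(keys_.begin(), keys_.end());
      isSorted_ = true;
    }
    return keys_;
  }

 private:
  std::vector<int64_t> keys_;
  std::mutex mutex_;
  bool isSorted_{false};
};
```

This only works because every access goes through `sorted()`. Any reader that touches the keys directly would still race with the sort, which is the argument for sorting before the vector is distributed.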
@Yuhta I tried this on Laith's query and it didn't work very well. In his case, the map has 70K entries and each batch has only 1K rows. It seems that the cost of sorting a 70K-entry map doesn't get amortized over 1K binary searches, and the cached hash table works better.
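A back-of-envelope model makes the amortization argument concrete. The sketch below is an assumption-laden estimate, not a measurement: it counts key comparisons only, takes sorting as roughly n * log2(n) comparisons and each binary search as roughly log2(n), and assumes the sort is repeated for every batch. The function names are hypothetical.

```cpp
#include <cmath>

// Approximate comparisons needed to sort n keys.
inline double sortCost(double n) {
  return n * std::log2(n);
}

// Approximate comparisons for `lookups` binary searches over n sorted keys.
inline double binarySearchCost(double n, double lookups) {
  return lookups * std::log2(n);
}

// Fraction of the per-batch work spent on sorting rather than searching.
inline double sortShare(double n, double lookups) {
  const double sort = sortCost(n);
  return sort / (sort + binarySearchCost(n, lookups));
}
```

In this model the ratio of sort work to search work is simply n / lookups, so with 70K entries and 1K rows per batch the sort costs about 70 times as much as the searches it enables. A hash table that is built once and reused across batches avoids paying that cost per batch.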
velox/functions/lib/SubscriptUtil.h
@@ -23,6 +23,22 @@

namespace facebook::velox::functions {

namespace {
No anonymous namespace in a header. Either put it in detail or in the main class SubscriptImpl.
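Both alternatives can be sketched as follows. This is an illustration only: the `example` namespace, `isValidIndex`, and `SubscriptSketch` are hypothetical names, not the contents of SubscriptUtil.h. The underlying issue is that an unnamed namespace in a header gives every translation unit that includes it its own separate copy of the declarations.

```cpp
#include <cstdint>

// `example` is a placeholder namespace, not the real Velox one.
namespace example {

// Option 1: a named `detail` namespace for header-only helpers.
namespace detail {
inline bool isValidIndex(int64_t index, int64_t size) {
  return index >= 0 && index < size;
}
} // namespace detail

// Option 2: keep the helper as a private static member of the main class.
class SubscriptSketch {
 public:
  // Returns true if the 1-based `index` addresses an element of an array
  // with `size` elements.
  static bool inBounds(int64_t index, int64_t size) {
    return detail::isValidIndex(toZeroBased(index), size);
  }

 private:
  static int64_t toZeroBased(int64_t index) {
    return index - 1;
  }
};

} // namespace example
```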
velox/functions/lib/SubscriptUtil.h
const vector_size_t baseIndex;
const vector_size_t index;

static bool lessThen(const MapKey& left, const MapKey& right) {
lessThan, or you can just overload operator< since it's a custom struct already.
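A minimal sketch of the operator< suggestion, under simplifying assumptions: this `MapKey` is a hypothetical reduction of the struct in the diff, with keys stored in a plain `std::vector<int64_t>` rather than a Velox vector, and with non-const members so that `std::sort` can move elements.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical simplified key that points at one entry of a shared key array.
struct MapKey {
  const std::vector<int64_t>* baseVector;
  int32_t baseIndex; // position in baseVector used for comparison
  int32_t index; // original position of the entry within the map

  // Replaces a static lessThan(left, right) helper.
  bool operator<(const MapKey& other) const {
    return (*baseVector)[baseIndex] < (*other.baseVector)[other.baseIndex];
  }
};

// With operator< defined, std::sort needs no explicit comparator.
inline void sortKeys(std::vector<MapKey>& keys) {
  std::sort(keys.begin(), keys.end());
}
```

The same overload also lets `std::lower_bound` and other ordered algorithms work on `MapKey` without a comparator argument.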
dc5b60f to 7c768af
@mbasmanova merged this pull request in 34aa5ac.
Conbench analyzed the 1 benchmark run on commit There were no benchmark performance regressions. 🎉 The full Conbench report has more details.
The implementation of element_at(map, search) uses linear search to find the
matching entry. This is slow when maps are large (hundreds of entries). This
optimization sorts the map keys and uses binary search, producing a 4x speedup
on a real-world workload. The optimization applies only if there is a single
map (constant or dictionary encoded), since in this case the cost of sorting
the map is amortized across many searches.
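The idea can be sketched in standalone C++ as follows. This is not the Velox implementation: `FlatMap`, `buildSortedIndex`, and `findValue` are hypothetical names, and a complex-type key is modeled as an array of integers. The sort runs once per map, and each lookup then costs O(log n) key comparisons instead of O(n).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <optional>
#include <vector>

// A "complex" key modeled as an array of integers; std::vector compares
// lexicographically.
using Key = std::vector<int64_t>;

// Hypothetical stand-in for a single map: parallel key and value arrays.
struct FlatMap {
  std::vector<Key> keys;
  std::vector<int64_t> values;
};

// Returns entry indices ordered by key. Computed once per map.
inline std::vector<size_t> buildSortedIndex(const FlatMap& map) {
  std::vector<size_t> order(map.keys.size());
  std::iota(order.begin(), order.end(), 0);
  std::sort(order.begin(), order.end(), [&](size_t a, size_t b) {
    return map.keys[a] < map.keys[b];
  });
  return order;
}

// Binary search over the sorted index. Returns std::nullopt if the key is
// not present.
inline std::optional<int64_t> findValue(
    const FlatMap& map,
    const std::vector<size_t>& order,
    const Key& search) {
  auto it = std::lower_bound(
      order.begin(),
      order.end(),
      search,
      [&](size_t entry, const Key& key) { return map.keys[entry] < key; });
  if (it != order.end() && map.keys[*it] == search) {
    return map.values[*it];
  }
  return std::nullopt;
}
```

Sorting an index array rather than the entries themselves leaves the original key and value arrays untouched, which matters when the map may be referenced elsewhere.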
There is a standard query optimization technique that generates element_at
over a single map.
A simple join with a small lookup table u:
Can be written as
This ensures that records from u will be collected into a single map and
broadcast to all workers to be "joined" with t. The original join query may
run as a partitioned join (if the optimizer is not smart enough) and be very slow.