Skip to content

Commit

Permalink
Change the intermediate type of approx_percentile to ROW (#18386)
Browse files Browse the repository at this point in the history
Summary:
X-link: prestodb/presto#18386

Pull Request resolved: facebookincubator#2621

Before this change, intermediate aggregation node for `approx_percentile` in worker only sees a type signature of `VARBINARY -> VARBINARY`, thus not be able to figure out the actual value type and cannot perform the merge properly when the value type is not `DOUBLE`.

This fix changes the intermediate result type to `ROW` so that we can get the value type from it and instantiate accumulator with proper type.

Fix facebookincubator#2430

Reviewed By: mbasmanova

Differential Revision: D39733152

fbshipit-source-id: f7476a9a4a2e92d7fdc976f388a07692787cada8
  • Loading branch information
Yuhta authored and facebook-github-bot committed Sep 28, 2022
1 parent 0d0eb30 commit d8c377c
Show file tree
Hide file tree
Showing 5 changed files with 274 additions and 196 deletions.
15 changes: 15 additions & 0 deletions velox/docs/develop/aggregate-functions.rst
Expand Up @@ -580,3 +580,18 @@ To confirm that aggregate function works end to end as part of query, update tes
.. code-block:: java
assertQuery("SELECT orderkey, array_agg(linenumber) FROM lineitem GROUP BY 1");
Overwrite Intermediate Type in Presto
-------------------------------------

Sometimes we need to change the intermediate type of aggregation function in
Presto, due to the differences in implementation or in the type information
worker node receives. This is done in Presto class
``com.facebook.presto.metadata.BuiltInTypeAndFunctionNamespaceManager``. When
``FeaturesConfig.isUseAlternativeFunctionSignatures()`` is enabled, we can
register a different set of function signatures used specifically by Velox. An
example of how to create such alternative function signatures from scratch can
be found in
``com.facebook.presto.operator.aggregation.AlternativeApproxPercentile``. An
example pull request can be found at
https://github.com/prestodb/presto/pull/18386.
44 changes: 23 additions & 21 deletions velox/functions/lib/KllSketch.h
Expand Up @@ -140,26 +140,6 @@ struct KllSketch {
/// Get frequencies of items being tracked. The result is sorted by item.
std::vector<std::pair<T, uint64_t>> getFrequencies() const;

private:
KllSketch(const Allocator&, uint32_t seed);
void doInsert(T);
uint32_t insertPosition();
int findLevelToCompact() const;
void addEmptyTopLevelToCompletelyFullSketch();
void shiftItems(uint32_t delta);

uint8_t numLevels() const {
return levels_.size() - 1;
}

uint32_t getNumRetained() const {
return levels_.back() - levels_[0];
}

uint32_t safeLevelSize(uint8_t level) const {
return level < numLevels() ? levels_[level + 1] - levels_[level] : 0;
}

struct View {
uint32_t k;
size_t n;
Expand All @@ -179,10 +159,32 @@ struct KllSketch {
void deserialize(const char* FOLLY_NONNULL);
};

void mergeViews(const folly::Range<const View*>& views);

View toView() const;

private:
KllSketch(const Allocator&, uint32_t seed);
void doInsert(T);
uint32_t insertPosition();
int findLevelToCompact() const;
void addEmptyTopLevelToCompletelyFullSketch();
void shiftItems(uint32_t delta);

uint8_t numLevels() const {
return levels_.size() - 1;
}

uint32_t getNumRetained() const {
return levels_.back() - levels_[0];
}

uint32_t safeLevelSize(uint8_t level) const {
return level < numLevels() ? levels_[level + 1] - levels_[level] : 0;
}

static KllSketch<T, Allocator, Compare>
fromView(const View&, const Allocator&, uint32_t seed);
void mergeViews(const folly::Range<const View*>& views);

using AllocU32 = typename std::allocator_traits<
Allocator>::template rebind_alloc<uint32_t>;
Expand Down

0 comments on commit d8c377c

Please sign in to comment.