New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document reduce_agg aggregate function #12195

Merged
merged 1 commit into from Jan 22, 2019

Conversation

Projects
None yet
7 participants
@wenleix
Copy link
Contributor

wenleix commented Jan 8, 2019

No description provided.

@wenleix wenleix force-pushed the wenleix:reduce_agg_doc branch from 92e00bd to 1d2a324 Jan 8, 2019

``combineFunction`` takes two states and returns the new state.
The final state is returned.

``redcue_agg`` supports state with native container type long, double or boolean.

This comment has been minimized.

@findepi

findepi Jan 8, 2019

Contributor

does it mean that S must be one of these?

(just curious, why so?)

This comment has been minimized.

@mbasmanova

mbasmanova Jan 8, 2019

Contributor

typo: redcue_agg -> reduce_agg

Could you add an example of usage?

This comment has been minimized.

@wenleix

wenleix Jan 9, 2019

Contributor

@findepi : Correct, see the comments in code:

// State with Slice or Block as native container type is intentionally not supported yet,
// as it may result in excessive JVM memory usage of remembered set.
// See JDK-8017163.
throw new UnsupportedOperationException();

Also see #9553

.. function:: reduce_agg(inputValue T, initialState S, inputFunction(S, T, S), combineFunction(S, S, S)) -> S

Returns a single value reduced from input values. ``inputFunction`` and ``combineFunction`` will be invoked.
``inputFunction`` takes the current state (initially ``initialState``) and the input value, and returns the new state.

This comment has been minimized.

@findepi

findepi Jan 8, 2019

Contributor

does it mean reduce_agg runs on single node?
the wording suggests so; but existence of combineFunction suggests it is parallel

This comment has been minimized.

@wenleix

wenleix Jan 9, 2019

Contributor

No it runs in parallel. Any thoughts about how to revise the wording? -- Since when I read it, I didn't see it will run on single node, but I am obviously biased...

This comment has been minimized.

@dain

dain Jan 9, 2019

Contributor

@findepi, I don't think the name combineFunction implies parallelism.

This comment has been minimized.

@findepi

findepi Jan 9, 2019

Contributor

@dain not the name -- the existence of it.
It it was single-threaded, there would be just initialSate and inputFunction (aka "accumulator" or "accumulatingFunction").
The need for combineFunction arises only when you want to merge two branches of processing (ie parallel computations).

Show resolved Hide resolved presto-docs/src/main/sphinx/functions/aggregate.rst

@wenleix wenleix force-pushed the wenleix:reduce_agg_doc branch from 1d2a324 to 49d0764 Jan 9, 2019

@wenleix

This comment has been minimized.

Copy link
Contributor

wenleix commented Jan 9, 2019

Thanks @mbasmanova and @findepi for reviewing. I added examples, here is how it looks like:

screen shot 2019-01-08 at 10 06 18 pm

-- (1, 12)
-- (2, 30)

``reduce_agg`` supports state with native container type long, double or boolean.

This comment has been minimized.

@dain

dain Jan 9, 2019

Contributor

I don't think any end user will understand what this means. Maybe state this as reduce_agg currently does not support variable width types, container types, or inexact types.

This comment has been minimized.

@wenleix

wenleix Jan 9, 2019

Contributor

@dain : Maybe I will just say reduce_agg currently supports boolean, integer (TINYINT, SMALLINT, INTEGER, BIGINT) and floating-point (REAL, DOUBLE) types .

This comment has been minimized.

@findepi

findepi Jan 9, 2019

Contributor

It also supports dates, times, timestamps, right?

@wenleix wenleix force-pushed the wenleix:reduce_agg_doc branch from 49d0764 to d111ce9 Jan 11, 2019

@nezihyigitbasi

This comment has been minimized.

Copy link
Contributor

nezihyigitbasi commented Jan 19, 2019

@findepi @dain @mbasmanova Is there anything else that's needed from this PR? We need to merge this and update the release notes PR (#12194) with a link to this new doc.

electrum added a commit to electrum/presto that referenced this pull request Jan 22, 2019

@wenleix wenleix force-pushed the wenleix:reduce_agg_doc branch from d111ce9 to d4de05b Jan 22, 2019

@wenleix

This comment has been minimized.

Copy link
Contributor

wenleix commented Jan 22, 2019

Thanks @electrum for reviewing. Comments addressed.

electrum added a commit to electrum/presto that referenced this pull request Jan 22, 2019

electrum added a commit to electrum/presto that referenced this pull request Jan 22, 2019

@mbasmanova

This comment has been minimized.

Copy link
Contributor

mbasmanova commented Jan 22, 2019

@electrum David, is this ready to merge? Could you approve?

Returns a single value reduced from input values. ``inputFunction`` and ``combineFunction`` will be invoked.
``inputFunction`` takes the current state (initially ``initialState``) and the input value, and returns the new state.
``combineFunction`` takes two states and returns the new state.
The value of final state is returned::

This comment has been minimized.

@mbasmanova

mbasmanova Jan 22, 2019

Contributor

typo: there should be only one colon, e.g. "returned: "

@wenleix wenleix merged commit 559aaa9 into prestodb:master Jan 22, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@wenleix wenleix deleted the wenleix:reduce_agg_doc branch Jan 22, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment