Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PostgreSQL statistical aggregate function pushdown #6731

Conversation

MiguelWeezardo
Copy link
Member

No description provided.

@cla-bot cla-bot bot added the cla-signed label Jan 26, 2021
@MiguelWeezardo MiguelWeezardo marked this pull request as draft January 26, 2021 22:05
@MiguelWeezardo MiguelWeezardo marked this pull request as ready for review January 27, 2021 08:54
@MiguelWeezardo MiguelWeezardo force-pushed the postgresql_aggregate_function_pushdown branch 2 times, most recently from 65176e8 to 454d021 Compare January 27, 2021 22:33
Copy link
Member

@mosabua mosabua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc changes look good.

Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostgreSQL supports covariance functions too. Maybe those should be added too? You might find the patterns added in #6734 to be useful for that.

https://www.postgresql.org/docs/9.6/functions-aggregate.html#FUNCTIONS-AGGREGATE-STATISTICS-TABLE

LGTM otherwise.

@findepi findepi changed the title Postgresql aggregate function pushdown PostgreSQL statistical aggregate function pushdown Jan 28, 2021
assertThat(query("SELECT min(short_decimal), min(long_decimal) FROM " + testTable.getName())).isFullyPushedDown();
assertThat(query("SELECT max(short_decimal), max(long_decimal) FROM " + testTable.getName())).isFullyPushedDown();
assertThat(query("SELECT sum(short_decimal), sum(long_decimal) FROM " + testTable.getName())).isFullyPushedDown();
assertThat(query("SELECT avg(short_decimal), avg(long_decimal) FROM " + testTable.getName())).isFullyPushedDown();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what has changed here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the withTable() approach to using TestTable since it's more robust.

@findepi
Copy link
Member

findepi commented Jan 28, 2021

mind CI red

@MiguelWeezardo
Copy link
Member Author

mind CI red

It's Error committing write to Hive again 😞 . The tests were run before #6711 got merged.

@MiguelWeezardo
Copy link
Member Author

MiguelWeezardo commented Jan 29, 2021

@findepi Please merge if you think it's OK. This code would really benefit from something like #6739 but for mutiple parameters. Do you think I should add a multiparameter version of fluent definitions to your draft and rebase on it?

@MiguelWeezardo MiguelWeezardo force-pushed the postgresql_aggregate_function_pushdown branch 2 times, most recently from 84c2ae7 to 1f78ec6 Compare January 30, 2021 14:02
Comment on lines 64 to 65
JdbcColumnHandle columnHandle1 = (JdbcColumnHandle) context.getAssignments().get(inputs.get(0).getName());
verifyNotNull(columnHandle1, "Unbound variable: %s", inputs.get(0));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use context.getAssignment and then no need for verify

same in other places

Comment on lines 70 to 71
Set<Type> types = ImmutableSet.of(aggregateFunction.getOutputType(), columnHandle1.getColumnType(), columnHandle2.getColumnType());
verify(types.equals(ImmutableSet.of(REAL)) || types.equals(ImmutableSet.of(DOUBLE)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know that columnHandle1.getColumnType(), columnHandle2.getColumnType() are the same, so it's an indirect way of saying "verify outputtype equals columnHandle1.getColumnType()"
let's make it explicit

}

@Test
public void testRegrPushdown()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's testRegrPushdown -> testRegrAggregationPushdown (same for others , also those existing ones)

-- as we're getting more test methods like that, this naming pattern will make them easier to find

// test some values for which the aggregate functions return whole numbers
try (TestTable testTable = new TestTable(
postgreSqlServer::execute,
getSession().getSchema().orElseThrow() + ".test_regr_pushdown",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the schema known to be 'tpch'?
i guess other tests already assme that & making the assumption yields to a more readable code.

@MiguelWeezardo
Copy link
Member Author

@findepi please merge if this version is OK

@findepi
Copy link
Member

findepi commented Feb 2, 2021

@MiguelWeezardo are you sure you did push your changes?

@MiguelWeezardo MiguelWeezardo force-pushed the postgresql_aggregate_function_pushdown branch from 1f78ec6 to 72c7463 Compare February 2, 2021 14:31
@MiguelWeezardo
Copy link
Member Author

@MiguelWeezardo are you sure you did push your changes?

Sorry, I forgot to run git add

@findepi findepi merged commit 5c12df4 into trinodb:master Feb 3, 2021
@findepi
Copy link
Member

findepi commented Feb 3, 2021

Merged, thanks!

@findepi findepi added this to the 352 milestone Feb 3, 2021
@findepi findepi mentioned this pull request Feb 3, 2021
10 tasks
@MiguelWeezardo MiguelWeezardo deleted the postgresql_aggregate_function_pushdown branch February 4, 2021 22:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants