New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HAWQ-1409. Send AGG-TYPE header to PXF #1189
Conversation
This change is mean to be a proof of concept that pushing down aggregate function information from HAWQ to the underlying external storage layer does indeed improve performance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just check the indentation in some of the changes.
@@ -110,6 +110,20 @@ void build_http_header(PxfInputData *input) | |||
else | |||
churl_headers_append(headers, "X-GP-HAS-FILTER", "0"); | |||
|
|||
/* Aggregate information */ | |||
if (input->agg_type) { | |||
char agg_groups_str[sizeof(int32)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this var used?
src/backend/executor/nodeAgg.c
Outdated
AggrefExprState *aggrefstate = (AggrefExprState *) lfirst(l); | ||
Aggref *aggref = (Aggref *) aggrefstate->xprstate.expr; | ||
//Only dealing with one agg | ||
if (aggref->aggfnoid == 2147 || aggref->aggfnoid == 2803) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be better use COUNT_ANY_OID
and COUNT_STAR_OID
...
src/backend/access/external/fileam.c
Outdated
* If we have an agg type then our parent is an Agg node | ||
*/ | ||
rootPlan = state->state->es_plannedstmt->planTree; | ||
if (rootPlan->type == T_Agg && es_state->parent_agg_type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommend to use IsA(rootPlan, Agg)
src/backend/executor/nodeAgg.c
Outdated
@@ -1950,6 +1950,21 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) | |||
* initialize child nodes | |||
*/ | |||
outerPlan = outerPlan(node); | |||
if (outerPlan->type == T_ExternalScan) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better use IsA(outerPlan, ExternalScan)
src/backend/executor/nodeAgg.c
Outdated
/* | ||
* Hack to indicate to PXF when there is an external scan | ||
*/ | ||
if (aggstate->aggs && list_length(aggstate->aggs) == 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need check aggstate->aggs
is null or not? if it is null, list_length
will return 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok, I thought list_length()
on a null value would cause an error
src/backend/executor/nodeAgg.c
Outdated
* Hack to indicate to PXF when there is an external scan | ||
*/ | ||
if (aggstate->aggs && list_length(aggstate->aggs) == 1) { | ||
foreach(l, aggstate->aggs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the size of aggstate->aggs
is already 1, do we need use foreach
to iterate?
src/backend/executor/nodeAgg.c
Outdated
*/ | ||
if (aggstate->aggs && list_length(aggstate->aggs) == 1) { | ||
foreach(l, aggstate->aggs) { | ||
AggrefExprState *aggrefstate = (AggrefExprState *) lfirst(l); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just by changing lfirst(l)
to linitial(aggstate->aggs)
, you can remove foreach
loop.
This change is mean to be a proof of concept that pushing down
aggregate function information from HAWQ to the underlying external
storage layer does indeed improve performance