Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generic expression aggregation check all children are -> any child is #482

Merged
merged 7 commits into from
Apr 28, 2023

Conversation

CircArgs
Copy link
Contributor

Summary

changes the generic logic checking if an expression is an aggregation to just checking if there is any child which aggregates rather than all children aggregating

this would be a problem because of the binop default is_aggregation:

curl -X POST http://localhost:8000/nodes/metric/ \
-H 'Content-Type: application/json' \
-d '{
    "name": "num_repair_orders_x_2",
    "description": "Number of repair orders",
    "mode": "published",
    "query": "SELECT 2.0 * count(repair_order_id) as num_repair_orders FROM repair_orders"
}'

which would succeed after this change. Now, 2.0 * count(repair_order_id) is identified as an aggregation

Test Plan

  • PR has an associated issue: #
  • make check passes
  • make test shows 100% unit test coverage

changes the generic logic checking if an expression is an aggregation to just checking if there is any child which aggregates rather than all children aggregating
@shangyian
Copy link
Contributor

Hmm what about something like:

SELECT count(repair_order_id) + repair_order_id AS num_repair_orders FROM repair_orders

That wouldn't be an aggregation, but it would still pass this check right?

@samredai
Copy link
Contributor

@shangyian is that a valid query? If not then is it reasonable to rely on the underlying query engine to catch that?

Copy link
Contributor

@shangyian shangyian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samredai It's not a valid query, but then we'd be storing an invalid query in a Metric node that's marked as "valid". Not sure if this is an issue though, since arguably we don't have to check for aggregations at all.

@samredai
Copy link
Contributor

but then we'd be storing an invalid query in a Metric node that's marked as "valid"

Hmm, good point...

@CircArgs
Copy link
Contributor Author

@shangyian I made another change now and reverted the original. This needed to be the case anyways, but now values are aggregations. That should work right? So, rather than 2.0 * count(repair_order_id) being an aggregation because BinaryOp has some child (count) that is an aggregation, all the children including Number(2.0) is naturally an aggregation being a scalar value

Copy link
Contributor

@shangyian shangyian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me! Thanks @CircArgs

@agorajek
Copy link
Member

SELECT count(repair_order_id) + repair_order_id AS num_repair_orders FROM repair_orders

That wouldn't be an aggregation, but it would still pass this check right?

@shangyian I may be missing your point, but isn't this a completely separate problem? Meaning "whether the query is valid SQL" from "whether any given expression is an aggregate expression" ?

@samredai
Copy link
Contributor

@shangyian I may be missing your point, but isn't this a completely separate problem? Meaning "whether the query is valid SQL" from "whether any given expression is an aggregate expression" ?

@agorajek this was exactly my original thought but I think the update @CircArgs made is an even clearer approach that also solves the problem @shangyian raised. The updated check is more robust in that it gets to the core of the requirement which is that all elements resolve to some scalar value that can co-exist in a single expression.

@CircArgs CircArgs merged commit 5843d14 into DataJunction:main Apr 28, 2023
6 checks passed
youngman-droid pushed a commit to youngman-droid/dj that referenced this pull request Aug 26, 2023
…DataJunction#482)

* generic expression aggregation check all children are -> any child is

changes the generic logic checking if an expression is an aggregation to just checking if there is any child which aggregates rather than all children aggregating

* values are aggregations

* fix make check

* remove now incorrect test

* cov

* fix check

* trying to fix parse error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants