Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust distinct aggregation expansion on Mongo #44442

Merged
merged 9 commits into from
Jun 24, 2024
Merged

Conversation

lbrdnk
Copy link
Contributor

@lbrdnk lbrdnk commented Jun 19, 2024

Closes #35425

Description

This PR modifies expressions that are used in $addFields stage in aggregation transformation. Specifically, names that are result of :distinct aggregation are wrapped in $size to produce actual counts.

How to verify

  • Existing test was modified to check for the new behavior.
  • New test was added to check compilation of more complex case.
  • New test was added to ensure querying works as expected even more complex case.

Checklist

  • Tests have been added/updated to cover changes in this PR

@lbrdnk lbrdnk added Database/Mongo backport Automatically create PR on current release branch on merge .Team/QueryProcessor :hammer_and_wrench: labels Jun 19, 2024
@lbrdnk lbrdnk requested a review from camsaul as a code owner June 19, 2024 20:28
Copy link

replay-io bot commented Jun 19, 2024

Status In Progress ↗︎ 59 / 60
Commit ed977c2
Results
⚠️ 5 Flaky
2680 Passed

Comment on lines -49 to -51
(def ^:private $addFields
:$addFields)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Operators are defined as strings. This removal produces consistent compilation results in modified test, ie. also $addFields is string in compiled query.

@lbrdnk lbrdnk requested a review from a team June 20, 2024 15:32
Copy link
Contributor

@bshepherdson bshepherdson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff, just some Clojure code polishing.

Comment on lines 1133 to 1134
If there is a need for more _pre-post_ transformations (as per results of [[expand-aggregations]]), this function
should be generalized."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear what this means. I'd either turn this into a TODO comment that calls out a specific weakness to be fixed later, or just remove this sentence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By pre-post transformations I mean transformations that are necessary on :post key of expand-aggregations return value. Even if meaning of the comment was evident, its added value is questionable. Removed.

Comment on lines 1143 to 1149
aggr-expr' (walk/postwalk (fn [x]
(if (and (string? x)
(distinct-vals x))
{$size x}
x))
aggr-expr)]
[aggr-expr' mappings]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for an extra let binding here; this can be inlined into the output clause.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In context of the following function, expand-aggregations, I've deliberately decided to have an explicit var aggr-expr', to signal on the first sight that returned value has a same shape as what's used in the other function. I've also decided to bind the distinct-keys var, despite it could have been omitted by moving filter into transducer composition on the next line. I, subjectively, find this form easier on eyes of future reader.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But as I'm not the only future reader I did as suggested :)

(deftest ^:parallel multiple-aggregations-with-distinct-count-expression-test
(mt/test-driver
:mongo
(testing "Should generate correct queries for `:distinct` in expressions ()"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing the bug number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@lbrdnk lbrdnk merged commit ba42baa into master Jun 24, 2024
109 checks passed
@lbrdnk lbrdnk deleted the mongo-distinct-esc branch June 24, 2024 10:10
github-automation-metabase pushed a commit that referenced this pull request Jun 24, 2024
This commit adds `$size` operator wrapping to `$addToSet`. That enables use of `:distinct`
clause in expressions on Mongo.
@lbrdnk
Copy link
Contributor Author

lbrdnk commented Jun 24, 2024

@metabase-bot backport release-x.49.x

Copy link

@lbrdnk something went wrong while creating a backport [Logs]

lbrdnk added a commit that referenced this pull request Jun 24, 2024
iethree pushed a commit that referenced this pull request Jun 25, 2024
This commit adds `$size` operator wrapping to `$addToSet`. That enables use of `:distinct`
clause in expressions on Mongo.
@github-actions github-actions bot added this to the 0.49.18 milestone Jun 25, 2024
github-automation-metabase added a commit that referenced this pull request Jun 25, 2024
This commit adds `$size` operator wrapping to `$addToSet`. That enables use of `:distinct`
clause in expressions on Mongo.

Co-authored-by: lbrdnk <lbrdnk@users.noreply.github.com>
Copy link

🚀 This should also be released by v0.50.7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport Automatically create PR on current release branch on merge Database/Mongo .Team/QueryProcessor :hammer_and_wrench:
Projects
None yet
2 participants