Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cross-db support for any_value() #497

Closed
joellabes opened this issue Feb 14, 2022 · 5 comments
Closed

Add cross-db support for any_value() #497

joellabes opened this issue Feb 14, 2022 · 5 comments
Labels
enhancement New feature or request good first issue

Comments

@joellabes
Copy link
Contributor

Describe the feature

To quote the Snowflake docs:

ANY_VALUE can simplify and optimize the performance of GROUP BY statements. A common problem for many queries is that the result of a query with a GROUP BY clause can only contain expressions used in the GROUP BY clause itself, or results of aggregate functions.

Grouping by is unnecessary when the column is known to be unique, and can add cost complexity. Adding any other aggregate (e.g. min) solves the problem (and is the only option for Postgres) but also can have a performance impact.

It would be useful to be able to use any_value when available, and fall back to min when not.

Describe alternatives you've considered

Using min all the time.

Additional context

The three core cloud warehouses supported by this package support any_value (Snowflake, BQ, Redshift). Postgres doesn't and would have to use min.

Who will this benefit?

Me!

Are you interested in contributing this feature?

Yes!

@joellabes joellabes added enhancement New feature or request triage labels Feb 14, 2022
@joellabes
Copy link
Contributor Author

@amychen1776 offhand, do you know whether other adapters support this? I'd like the default implementation to be any_value and then have a postgres-specific implementation that does min, but if these three are the outliers and other CDWs don't have it then I guess maybe min should be the default.

@amychen1776
Copy link

@joellabes Are there specific ones you are curious about? I ask this because the dbt_utils package doesn't work fully for many of our community/vendor-supported adapters (to the point that some of them have made their own)

@joellabes
Copy link
Contributor Author

joellabes commented Feb 15, 2022

Not specifically, just if there were high-usage ones you knew of that didn't have support.

to the point that some of them have made their own

you mean the shim packages, right? that's what I was thinking about here, was making it straightforward for them to overwrite.

Thinking about it more, I'm going to make the default any_value and just have a carved out exception for Postgres. It'd be weird for the default any_value behaviour to be something that isn't any_value 😬

@amychen1776
Copy link

Yes, that's correct and yeah...that is odd?

@joellabes joellabes mentioned this issue Feb 17, 2022
15 tasks
@dbeatty10
Copy link
Contributor

Resolved by #501

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue
Projects
None yet
Development

No branches or pull requests

3 participants