Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redshift avg returning 1 or 0 #81

Closed
stevenconnorg opened this issue Sep 8, 2023 · 3 comments · Fixed by #83
Closed

Redshift avg returning 1 or 0 #81

stevenconnorg opened this issue Sep 8, 2023 · 3 comments · Fixed by #83

Comments

@stevenconnorg
Copy link
Contributor

Hey there,

I seem to be having issues getting correct avg values returned in Redshift.

When we try to get the avg of a binary metric, Redshift returns either 1 or 0. Ideally, these metrics should be stored as a boolean value; however, I think we'd still have an issue since the measure_avg casts to 1 or 0 for boolean values, instead of 1.0 or 0.0.

Related stackoverflow thread: Redshift Avg not returning decimals

Is there some way that we can something like this so it works in Redshift?:


{%- if dbt_profiler.is_numeric_dtype(data_type) and not dbt_profiler.is_struct_dtype(data_type) -%}
    avg({{ adapter.quote(column_name) }} :: float)
{%- elif dbt_profiler.is_logical_dtype(data_type) -%}
    avg(case when {{ adapter.quote(column_name) }} then 1.0 else 0.0 end)
{%- else -%}
    cast(null as {{ dbt.type_numeric() }})
{%- endif -%}

Maybe create a different version of this macro to use only with the dbt-redshift adapter? I'm not sure how to do that, but would help in contributing with some guidance!

@stumelius
Copy link
Contributor

@stevenconnorg Hi! Thanks for reporting this. There is indeed a way to create a custom version of measure_avg for Redshift only. Macros that work differently in different databases are called cross-database macros. Under the hood, dbt dispatches macro calls to the database adapter (e.g., redshift) you are using; if there's an adapter specific version of the macro available then that is used but if not, a default implementation for the macro is used. Here's an example: https://docs.getdbt.com/reference/dbt-jinja-functions/dispatch#a-simple-example

So, what we need here is a redshift__measure_avg macro that customizes the avg logic for Redshift. I'd very much like if you could contribute this yourself and I'll help you out as needed. What do you think? :)

@stevenconnorg
Copy link
Contributor Author

@stumelius -- thanks for the help! I've linked a PR that will add this functionality. Let me know if you need anything else from me atm!

@stumelius
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants