Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support for cut() function in dbplyr (snowflake) #697

Closed
aris-hastings opened this issue Aug 18, 2021 · 0 comments · Fixed by #801
Closed

support for cut() function in dbplyr (snowflake) #697

aris-hastings opened this issue Aug 18, 2021 · 0 comments · Fixed by #801

Comments

@aris-hastings
Copy link

Hi,

I'm using dbplyr to interface with Snowflake, and a very common task is to group numeric columns by bands to summarise output. With classic dplyr one would do group_by(cut(factor, c(1, 2, 3))) and it works wonders, but dbplyr does not appear to support it (at least in snowflake).

So this is a post of two halves:

  1. Could we please build in support for cut() function in SQL translations?
  2. In the mean time, what would be the best workaround for this?

e.g. i can build a function that creates an appropriate sql phrase, but how do I get it to be evaluated properly?

db_cut <- function(fact, bins) {
  k = length(bins)
  string = paste0("CASE WHEN ", fact, " < ", bins[1], " THEN '<", bins[1], "' ")   
  for (i in 2:k) {
    string = paste(string,
                   paste0("WHEN ", fact, " < ", bins[i], " THEN '[",
                          bins[i-1], "-", bins[i], ")' "))}
  string = paste0(string, " END")
  return(string)
}

this does not work...:
data %>%
  mutate(bands = db_cut(column_name, c(1, 2, 3))) %>%
  group_by(bands) %>%
  summarise(n()) 

am I missing something simple, or going down the wrong path?

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant