Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add additional lubridate translations for PostgreSQL #333

Closed
bkkkk opened this issue Jul 9, 2019 · 5 comments
Closed

Add additional lubridate translations for PostgreSQL #333

bkkkk opened this issue Jul 9, 2019 · 5 comments
Labels
feature a feature request or enhancement func trans 🌍 Translation of individual functions to SQL wip work in progress

Comments

@bkkkk
Copy link
Contributor

bkkkk commented Jul 9, 2019

We make extensive use of dbplyr and lubridate and it would be great to have functions like

  • lubridate::weeks, lubridate::months, etc... translated into '1 weeks' :: INTERVAL to enable easy date math
  • floor_date(date, 'month') translated into DATE_TRUNC('month', date)

These two sets of functions are used all the time all the time in our codebase and it's a shame to lose our ability to test with local tibbles because of calls to sql during a dplyr verb chain.

Submitted PR here: #335

@hadley hadley added feature a feature request or enhancement func trans 🌍 Translation of individual functions to SQL labels Dec 13, 2019
@cassiopagnoncelli
Copy link

Hello @bkkkk, how are you using month grouping in dbplyr+pg provided there's still no support for that?

@bkkkk
Copy link
Contributor Author

bkkkk commented May 7, 2020

@cassiopagnoncelli We use raw SQL. We go for one of the following:

  1. Do the grouping in SQL and pass that query in the call to tbl:
query <- "SELECT
    date_trunc('month', created_at) :: DATE AS created_at,
    count(*)
FROM table
GROUP BY 1"

tbl(conn, sql(query))
  1. We mutate a new month column using sql inline:
my_tbl %>%
  mutate(created_month = sql("date_trunc('month', created_at)")) %>%
  group_by(created_month) %>%
  summarize(total_amount = sum(amount))
  1. Create a materialized view/view on the server and just pull that.

@cassiopagnoncelli
Copy link

Brilliant, thanks for sharing. I'm wondering to which point lubridate is integrated to dbplyr, gonna test it today. I hope it does all the maths behind and translate timezones.

@cassiopagnoncelli
Copy link

You're right, that's the way to do now. In addition, flooring date to first moment in month shifts back to last month if timezone is relatively before to UTC so I had to mutate(datetime = timezone('UTC', datetime)) in order to get the right time when converting dataset to tibble.

@bkkkk
Copy link
Contributor Author

bkkkk commented May 8, 2020

@cassiopagnoncelli I've never tested this very thoroughly since I usually convert everything to just dates to avoid these kinds of issues. The timezone of the connection can also have an impact. I think the interaction between lubridate, dbplyr, and timezones is something that could use some love.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature a feature request or enhancement func trans 🌍 Translation of individual functions to SQL wip work in progress
Projects
None yet
Development

No branches or pull requests

3 participants