Better support combining for non-base types #2432

Open
hadley opened this Issue Feb 16, 2017 · 9 comments

Member

hadley commented Feb 16, 2017

This is a meta issue for all bind/grouped-mutate/join/filter issues related to custom S3 and S4 classes.

See also r-lib/vctrs#27

Member

krlmlr commented Mar 21, 2017

Should be implemented in vctrs.

fabian-s commented Mar 10, 2018

We're trying to build a package for "tidy" functional data analysis which defines new data types for function-valued data, and we are not sure how to proceed since dplyr's behavior for non-base columns is ... mysterious, especially with grouping.

Can anyone offer some advice on how to design classes for new data types so that they work with current (and future) versions of dplyr?

Member

krlmlr commented Mar 12, 2018

Looks like we still need to wrap our heads around how this will be supported consistently in future dplyr versions. The vctrs repo contains some pointers, but nothing production-ready yet, and no clean write-up as far as I remember.

fabian-s commented Mar 12, 2018

@krlmlr
Thanks for the quick reply, but to be honest, any pointers I might be able to extract from the vctrs code won't really help to make our stuff work in the near term, since development there seems to have been stalled for a long time.
I know it is a lot to ask, but would you be willing to answer some specific questions about how dplyr's grouped_df table verbs evaluate expressions if I write up some code snippets where I don't understand what's happening? These summarize and mutate calls are really hard to debug or inspect since they call C++, and I suspect my idea of defining new mean, sum, sd, etc. methods for our functional-data S3 classes might clash with dplyr's hybrid evaluation scheme...?
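
To make the question concrete, here is a minimal base-R sketch of the kind of class being described. The class name `fvec`, its constructor `new_fvec`, and the list-of-numeric-vectors representation are all hypothetical and chosen only for illustration; they are not taken from the package under discussion, from dplyr, or from vctrs. The sketch only shows that S3 dispatch picks up a custom `[` and `mean` method in plain R; it makes no claim about what a grouped dplyr verb does with such a column.

```r
# Hypothetical S3 class "fvec": a list of numeric vectors, each holding one
# function's values on a shared grid. Illustration only, not a dplyr/vctrs API.
new_fvec <- function(x = list()) {
  stopifnot(is.list(x))
  structure(x, class = "fvec")
}

# Subsetting method: subset the underlying list, then re-attach the class.
`[.fvec` <- function(x, i, ...) {
  new_fvec(unclass(x)[i])
}

# Pointwise mean over all functions in the vector; returns a length-1 fvec.
mean.fvec <- function(x, ...) {
  values <- do.call(rbind, unclass(x))
  new_fvec(list(colMeans(values)))
}

f <- new_fvec(list(c(1, 2, 3), c(3, 4, 5), c(5, 6, 7)))
first_two <- f[1:2]   # dispatches to `[.fvec`
m <- mean(first_two)  # dispatches to mean.fvec
```

Calling `mean()` directly, as above, goes through ordinary S3 dispatch. Whether the same method is reached inside a grouped `summarise()` or `mutate()` is exactly the open question in this thread, so it should be checked against the dplyr version in use.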

Member

krlmlr commented Mar 16, 2018

@fabian-s: Sure, if the interface or the behavior is too complicated and not obvious from the docs, this qualifies as a bug ;-)

tomwwagstaff commented Apr 16, 2018

I am currently experiencing a problem with lubridate intervals when using dplyr joins. I calculate the intervals first in one table, where each entity appears once, and then use an inner_join to a larger table where each entity appears in multiple rows. Now in the new table, the date interval shifts around apparently randomly from row to row for the same entity (although the length of the interval seems to be consistent and correct).

I can't tell from the series of issues I've clicked through if this precise problem has been raised before - but it's a current problem, and I would definitely call this a bug.

Member

krlmlr commented Apr 16, 2018

@tomwwagstaff: Can you please double-check with the development version of dplyr? I assume you'll be seeing an error, because we don't support the Interval class from lubridate. The problem you describe doesn't seem to be related to this issue; a new issue (with a reprex) is preferred.

fabian-s commented Dec 5, 2018

Now that vctrs is on CRAN,

  • is there a road map for when and how these issues will get resolved?
  • what version of dplyr will incorporate vctrs?
  • is there any discussion or documentation of what specific properties non-standard data types will need to have in order to work with dplyr going forward?

vctrs does look promising, but I'm hesitant to refactor existing code only to then play catch-up if specs change.

Member

hadley commented Dec 5, 2018

We’re planning to integrate in 0.9.0. We do not have a specific timeline.
