Use serialization for columns of type "list" #2355

Closed
krlmlr opened this issue Jan 7, 2017 · 8 comments
Labels
feature a feature request or enhancement vctrs ↗️
@krlmlr
Member

krlmlr commented Jan 7, 2017

This applies to join (#2194) and distinct (#2222) operations.

We should be able to serialize all elements of a list efficiently by calling R_Serialize() for each. A specialized version of JoinVisitorImpl and VectorVisitor would then operate on the serializations. Perhaps hashing the serialization will be good enough.
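The idea can be tried out at the R level before touching the C++ visitors. The following is only a minimal sketch under stated assumptions: it uses base R's `serialize()` rather than calling `R_Serialize()` from C, and the helper names `key_elements()` and `distinct_list()` are hypothetical, not part of dplyr. Each element is serialized to a raw vector, the bytes are collapsed into a string key, and the keys are then compared in place of the elements.

```r
# Hypothetical helper: one string key per list element, built from the
# hex representation of the element's serialized bytes.
key_elements <- function(x) {
  vapply(
    x,
    function(el) {
      bytes <- serialize(el, connection = NULL)  # raw vector
      paste(as.character(bytes), collapse = "")  # e.g. "580a0000..."
    },
    character(1)
  )
}

# Hypothetical helper: keep the first occurrence of each distinct element.
# duplicated() on a character vector hashes the keys internally.
distinct_list <- function(x) {
  x[!duplicated(key_elements(x))]
}

x <- list(c(1L, 2L), "a", c(1L, 2L), list(b = TRUE), "a")
length(distinct_list(x))
# 3

# Join-style lookup: position of each element of y in x, via the keys.
y <- list("a", c(3L, 4L))
match(key_elements(y), key_elements(x))
# 2 NA
```

Byte equality of serializations can be stricter than `identical()`: objects that compare equal may serialize differently depending on their internal representation, so a real implementation would need to decide whether that is acceptable.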

@romainfrancois
Member

We can look at what R does in unique.c about lists.
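For reference, base R's `duplicated()` and `unique()` already accept lists, which is the user-visible behaviour this comment points to. The snippet below only illustrates that behaviour from the R prompt; it does not reproduce the internals of unique.c.

```r
# A list with repeated elements, including two integer vectors that are
# identical() but were created differently (1:2 vs. c(1L, 2L)).
x <- list(1:2, c(1L, 2L), "a", "a", list(1))

duplicated(x)
# FALSE  TRUE FALSE  TRUE FALSE

length(unique(x))
# 3
```

Here `1:2` and `c(1L, 2L)` are treated as the same element, which matches what a hashing approach for list columns would be expected to do.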

@romainfrancois
Member

Pretty sure this is a vctrs thing now.

@hadley I'm not sure yet at what level we'll use vctrs, but if it is at a low level, then a C API function for hash_scalar would help.

@romainfrancois
Member

> vctrs:::vec_hash(list(1:2,c(1L,2L)))
[1] "656ac61" "656ac61"

@hadley
Member

hadley commented Sep 14, 2018

I think we can probably close this issue; once dplyr can use vctrs, it can use the standard hashing approach defined there.

@krlmlr
Member Author

krlmlr commented Sep 15, 2018

Let's keep this open to remind us that we still need to enable it here once we have it in vctrs.
