Would it make sense to go ahead and vec_slice() by using ki$key, and use that as the return value in vec_duplicate_split()? In both vec_split() and @romainfrancois's PR that seems to be the first thing you do. Plus, the $key information is technically redundant because it is all there in $idx already. I probably would have done this already, but I don't think vec_slice() had a C implementation at the time.
I think it would be nice to even return a data frame. Then vec_split() would just switch out the idx column with the val column. It would make it more familiar to anyone used to the vec_split() output.
With this in mind it could also be called vec_split_id(), because we would have an id column (I'd change it from idx -> id). But I'm also fine with vec_split_info().
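To make the proposed shape concrete, here is a rough sketch in base R. It is not the vctrs implementation: vec_split_id_sketch() is a hypothetical name, it only handles atomic vectors for by, and the real version would presumably be done in C. It returns a data frame with a key column (the unique values, already sliced) and an id list-column (the locations of each value), and then shows how a vec_split()-style result could swap id for val.

```r
# Hypothetical base R sketch of the proposed return value.
# Assumes `by` is an atomic vector; data frame `by` is not handled here.
vec_split_id_sketch <- function(by) {
  # For every element, the location of the first occurrence of its value
  first <- match(by, by)
  # One location per unique value, in order of first appearance
  key_loc <- unique(first)
  # For each unique value, every location where it occurs
  id <- lapply(key_loc, function(k) which(first == k))

  # `key` holds the sliced values themselves rather than their locations
  out <- data.frame(key = by[key_loc], stringsAsFactors = FALSE)
  out$id <- id
  out
}

by <- c("a", "b", "a", "c", "b")
info <- vec_split_id_sketch(by)
info$key  # "a" "b" "c"
info$id   # list(c(1, 3), c(2, 5), 4)

# A vec_split(x, by)-style result would replace the id column with val,
# obtained by slicing x at each set of locations.
x <- c(10, 20, 30, 40, 50)
val <- lapply(info$id, function(i) x[i])
```

With this layout the only difference between the two outputs is the second column, which is the familiarity argument made above.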
Would it make sense to document vec_split_id() on the same page as vec_split(), and give it an argument vec_split_id(by) rather than vec_split_id(x)? If I put it on a different page then I would probably use x as the argument, but on the same page it kind of makes sense to be by, since if we called it x it would complicate the documentation.
(This might go against the general vctrs API, since we use x as the first argument for pretty much everything.)
For e.g. tidyverse/dplyr#4504
vec_duplicate_split() gives exactly what is needed for a vctrs based implementation of
key gives the indices I can use to vec_slice() the data to get the first columns
idx is exactly the
tidyverse/dplyr#4504 then goes a bit further to reveal empty groups when .drop = FALSE but that does not need to be