Multiple Columns as Group ID #3
Comments
Good (and interesting) point. Thank you for taking the time to think about it and share it here. As you commented, the current control table definition treats the first column in a special way: it is the record portion key. You are correct that more could be done if some set of columns in the control table were allowed to be the record portion keys. This is an asymmetry: column names are single strings (though I think [...]). Your current workaround is in fact what I would have suggested. But perhaps we can automate this.
Thanks. I use [...]. I have not thought of any consequences yet, but I hope there is nothing major and that the benefits will outweigh them.
I'll probably add the feature soon. The nice thing is if [...]. I have started to lay down infrastructure to handle the feature and entered a (failing) test to track progress on the feature.
Got it! (In the development version, for local ops, will probably get to the database versions later.)
library(cdata)
library(rqdatatable)
#> Loading required package: rquery
d <- iris
d$id <- seq_len(nrow(d))
control_table <- qchar_frame(
  Part,  Measure, Value        |
  Sepal, Length,  Sepal.Length |
  Sepal, Width,   Sepal.Width  |
  Petal, Length,  Petal.Length |
  Petal, Width,   Petal.Width  )
d %.>%
  rowrecs_to_blocks(
    .,
    control_table,
    controlTableKeys = c("Part", "Measure"),
    columnsToCopy = c("id", "Species")) %.>%
  orderby(., c("id", "Part", "Measure")) %.>%
  head(.)
#>    id Species  Part Measure Value
#> 1:  1  setosa Petal  Length   1.4
#> 2:  1  setosa Petal   Width   0.2
#> 3:  1  setosa Sepal  Length   5.1
#> 4:  1  setosa Sepal   Width   3.5
#> 5:  2  setosa Petal  Length   1.4
#> 6:  2  setosa Petal   Width   0.2
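For readers who want to check the inverse direction, here is a minimal sketch of a round trip. It assumes that blocks_to_rowrecs() accepts the same controlTableKeys argument as rowrecs_to_blocks() (as in released cdata versions); the plain data.frame control table and the names long and wide are illustrative choices, not part of the example above.

```r
library(cdata)

d <- iris
d$id <- seq_len(nrow(d))

# Same control table as above, written as a plain data.frame (assumption:
# equivalent to the qchar_frame() form for this purpose).
control_table <- data.frame(
  Part    = c("Sepal", "Sepal", "Petal", "Petal"),
  Measure = c("Length", "Width", "Length", "Width"),
  Value   = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
  stringsAsFactors = FALSE
)

# Wide -> long, with two control-table key columns.
long <- rowrecs_to_blocks(
  d,
  control_table,
  controlTableKeys = c("Part", "Measure"),
  columnsToCopy = "id"
)

# Long -> wide, using the same keys; "id" identifies each original row.
wide <- blocks_to_rowrecs(
  long,
  keyColumns = "id",
  controlTable = control_table,
  controlTableKeys = c("Part", "Measure")
)

# Row order is not guaranteed, so sort before comparing with the input.
wide <- wide[order(wide$id), ]
```

If the assumption holds, wide should carry the four measurement columns back with the same values as d, which is a quick sanity check that the multi-column keys are handled consistently in both directions.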
Wow. What a quick update. Will try it out as soon as I can, although I may wait for the CRAN update. Thanks!
I found it an interesting possibility. Your example is now a vignette here.
I wanted to do something like this
But I get this error
checkControlTable() assumes that the first column is always the one and only id column, so when the first column does not have distinct values it throws an error. I think cdata should be able to support a combination of multiple columns as ids. In my example above, the combination of Part and Measure constitutes a group id. Maybe an extra argument to specify the id columns in the control table can make this work? Something like this?
keyColumns (like the one in blocks_to_rowrecs()) can also take a vector of column indices to specify the columns to take as group ids. The default should be 1 to keep the current behavior.
Here is a workaround with {data.table}
Splitting columns is easy, but I was just wondering if it can be avoided.
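The idea of the workaround (fuse the two ids into one key column, convert, then split the key again) can be sketched as below. This is not the snippet from the original post: it uses base R string splitting instead of {data.table}, and the column name PartMeasure and the "." separator are assumptions made for illustration.

```r
library(cdata)

d <- iris
d$id <- seq_len(nrow(d))

# Control table with a single, distinct key column that fuses Part and Measure.
# "PartMeasure" is a hypothetical name chosen for this sketch.
measure_cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
control_table <- data.frame(
  PartMeasure = measure_cols,
  Value       = measure_cols,
  stringsAsFactors = FALSE
)

# The first column is distinct, so the default single-key check passes.
long <- rowrecs_to_blocks(
  d,
  control_table,
  columnsToCopy = c("id", "Species")
)

# Split the fused key back into the two intended id columns.
parts <- strsplit(as.character(long$PartMeasure), ".", fixed = TRUE)
long$Part    <- vapply(parts, `[`, character(1), 1)
long$Measure <- vapply(parts, `[`, character(1), 2)
long$PartMeasure <- NULL
```

The extra split step is exactly what a multi-column key argument would make unnecessary.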