R/Rstudio Crashes after running filter #602
I get this:
|
Try with the question mark at the end of the URL. |
@avwilliams that's unlikely to work - |
I did have that issue initially, but switching to http did not work either.
But this is what I'm getting now:
Data: Alternative fuel locations
These work:
These crash R/RStudio
Data Usage example 2
This gives this error
|
Can you:
|
Sorry, just realised view is using markdown |
Functions working fine with built-in dataset
min age
max age
###med - mean ages
|
I can reproduce this now, with a bit of editing of the code. This consistently crashes my session indeed:
However, I only get the segfault on printing the data, so if I do:
it's fine, but then it breaks when I run:
And that's what
|
It also happens if I print |
One of the problems, I guess, is that some of the columns of
R's
@hadley do we want this, or should we just forbid weird data structures like this? It does not look tidy to me. |
Here's a minimal example:
afl <- jsonlite::fromJSON("https://data.cityofchicago.org/resource/alternative-fuel-locations.json")
res <- dplyr::filter(afl, fuel_type_code %in% "LPG")
I'd say this is technically a bug in dplyr, since This is relatively low priority since it's a pretty esoteric feature of data frame subsetting, but it would be nice if it worked. |
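For anyone who wants to look at this kind of structure without hitting the Chicago endpoint, here is a small base R sketch. The `location` column and all of its values are made up for illustration; the only point is that a data frame can carry another data frame as a single column, and that such columns can be detected because they have dimensions.

```r
# Hypothetical stand-in for the nested result; names and values are illustrative.
afl_like <- data.frame(
  fuel_type_code = c("LPG", "ELEC", "LPG", "CNG"),
  stringsAsFactors = FALSE
)

# Assigning a data frame with `$<-` stores it as ONE column of the host.
afl_like$location <- data.frame(
  latitude  = c(41.9, 41.8, 41.7, 41.6),
  longitude = c(-87.6, -87.7, -87.8, -87.9)
)

ncol(afl_like)                    # 2 columns, not 3
is.data.frame(afl_like$location)  # TRUE

# Flag every column that carries dimensions (inner data frames or matrices).
has_dim <- vapply(afl_like, function(col) !is.null(dim(col)), logical(1))
names(afl_like)[has_dim]
```

Running the `has_dim` check on the real `afl` object should show which columns are the nested ones.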
And the matrix would have to have the same number of rows as the host data frame? And have colnames? I think this would mean updating some code in the For matrices, I could either be lazy and copy columns into a brand new vector, or come up with some sort of virtual vector. Does not sound that horrible to implement. I'll marinate this a bit. |
Yes, the internal matrix/data frame should have |
I've put some code in to support data frames as columns of data frames. So the reprex gives:
Matrices next, when I understand what to do with them |
So some explanation of how I treated inner data frames (support might still be a bit rough). Essentially what I do is treat columns of the inner df as if it were a column of the host data frame.
So I just pretend that
Another question, I've used the inner names as the names of the output df. Is that alright or should I somehow produce names based on the name of the host column and the column names of the inner df, e.g. I think this is a pretty esoteric case anyway, so I'm happy with what I have done. Will add tests for various verbs and with groups, etc ... Part of me feels that this warrants an update of |
About matrices:
Based on what
I can see two ways of doing this:
Thinking about this, perhaps option 2 could also be used for inner data frames instead of what I did so far. Anyway, over to you @hadley. Tell me what you think of all this. |
It's not clear to me what
I think the main thing with filtering when one column is a data frame, is correctly subsetting that data frame when you're filtering on a regular column. i.e. what's important to me is that |
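As a reference point for the behaviour described above, this is a sketch of what base R row subsetting does with an inner data frame and a matrix column. It uses base `[` only, not dplyr, and the column names are the same toy names used elsewhere in this thread plus a hypothetical `m` matrix column.

```r
df <- data.frame(a = 1:10, c = 1:10)
df$b <- data.frame(x = 1:10, y = 1:10)  # data frame stored as one column
df$m <- matrix(1:20, nrow = 10)         # matrix stored as one column

# Filter on a regular column; base R subsets the inner structures row-wise too.
keep <- df$a < 5
res <- df[keep, ]

nrow(res)     # 4
nrow(res$b)   # 4, inner data frame stays aligned with the host rows
dim(res$m)    # 4 2, matrix keeps its structure
```

The inner data frame and the matrix both keep their structure, and their rows stay aligned with the rows selected by the condition on `a`.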
Alright, that's not what I've put in. To make Making I'm just a tiny bit more concerned about making That answers the question about handling of matrices too, I'll try to keep the structure intact. |
I'd say:
|
generates an error. (0.3.1 part of #602).
OK, I've now put various protections in place to forbid data frame and matrix columns.
So I'm promoting this to 0.4 now. Or perhaps we should create a new issue. We also need to have a look at how this affects other verbs, e.g. how do we The bulk of work for this is on the 0.4 item, once we have this in place, it should be relatively easy to allow |
…rVisitor` concept for a column that is a `data.frame`. Part of #602.
Can now preserve
|
`VectorVisitor` concept for a column that is a `matrix`. Part of #602
And now also matrices:
|
Next I guess is dealing with other kinds of visitors, i.e. |
I'm now getting:
because:
ping @hadley. |
I've put some simple change in the So I can get to:
> df <- data.frame( a = 1:10, b = 1:10, c = 1:10 )
> df$b <- data.frame( x = 1:10, y = 1:10 )
> res <- df %>% filter( a < 5 )
> str(res)
'data.frame': 4 obs. of 3 variables:
$ a: int 1 2 3 4
$ b:'data.frame': 4 obs. of 2 variables:
..$ x: int 1 2 3 4
..$ y: int 1 2 3 4
$ c: int 1 2 3 4
So the internal code does the right thing. However, we can't print those objects:
Apparently this happens in |
I think this is now correct - I get "data_frames can only contain 1d atomic vectors and lists " - the first column of this data frame is another data frame, which dplyr does not support. |
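For readers who hit that error and just want to get on with filtering, one possible workaround is to expand the nested columns into ordinary ones first. The helper below is a hypothetical sketch in base R, not something dplyr provides; it assumes the data frame only contains atomic vectors, inner data frames, and matrices (no list columns), and it names the new columns `host.inner`, which is just one of the naming options raised earlier in this thread. `jsonlite::flatten()` is an existing alternative for data frames coming from `fromJSON`.

```r
# Hypothetical helper: expand columns that have dimensions into plain columns.
flatten_df_cols <- function(df) {
  pieces <- lapply(names(df), function(nm) {
    col <- df[[nm]]
    if (!is.null(dim(col))) {
      # Inner data frame or matrix: turn into a data frame, prefix the names.
      col <- as.data.frame(col, stringsAsFactors = FALSE)
      names(col) <- paste(nm, names(col), sep = ".")
      col
    } else {
      # Ordinary vector: wrap it so every piece is a data frame.
      out <- data.frame(col, stringsAsFactors = FALSE)
      names(out) <- nm
      out
    }
  })
  do.call(cbind, pieces)
}

df <- data.frame(a = 1:3)
df$b <- data.frame(x = 4:6, y = 7:9)

flat <- flatten_df_cols(df)
names(flat)  # "a" "b.x" "b.y"
```

After flattening, every column is a plain vector, so the result should no longer trigger the check on column dimensions.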
System:
Distributor ID: Ubuntu
Description: Ubuntu 12.04.5 LTS
Release: 12.04
Codename: precise
Kernel
3.8.0-32-lowlatency #24-Ubuntu SMP PREEMPT x86_64 GNU/Linux
Mem: 6112764
R version 3.1.1 (2014-07-10)
Platform: x86_64-pc-linux-gnu (64-bit)
RStudio
Version 0.98.932
Every time I run this command, R/RStudio crashes
These work
Data Coerced Source:
socrata.afl.dt <- tbl_dt(socrata.afl)
This works:
filter(socrata.afl.dt, fuel_type_code == "LPG")
This does not work, but R/RStudio does not crash.
subset(socrata.afl.dt, fuel_type_code == "LPG")
Error in FUN(X[[2L]], ...) :
Invalid column: it has dimensions. Can't format it. If it's the result of data.table(table()), use as.data.table(table()) instead.
Any help kindly appreciated.