dplyr 0.6.0 join problem with CRAN version of sparklyr 0.5.5 #2825
Thanks for reporting this @JohnMount, really appreciated.
The problem here is that in order to support joins in
I think the best path here is to push a patch for
@JohnMount if you could try out this
@hadley could you ping me on Slack when you submit
Thanks @javierluraschi, it looks like
suppressPackageStartupMessages(library('dplyr'))
library('sparklyr')
sc <- spark_connect(version = '2.0.2',
                    master = "local")
d1 <- copy_to(sc, data.frame(x=1:3, y=4:6), 'd1')
d2 <- copy_to(sc, data.frame(x=1:3, y=7:9), 'd2')
left_join(d1, d2, by='x')
#> Error: Column `y` must have a unique name
# print versions
packageVersion("dplyr")
#> [1] '0.7.0'
packageVersion("sparklyr")
#> [1] '0.5.5'
if (requireNamespace("dbplyr", quietly = TRUE)) {
  packageVersion("dbplyr")
}
#> [1] '1.0.0'
R.Version()$version.string
#> [1] "R version 3.4.0 (2017-04-21)"
# cleanup
spark_disconnect(sc)
suppressPackageStartupMessages(library('dplyr'))
library('sparklyr')
sc <- spark_connect(version = '2.0.2',
                    master = "local")
d1 <- copy_to(sc, data.frame(x=1:3, y=4:6), 'd1')
d2 <- copy_to(sc, data.frame(x=1:3, y=7:9), 'd2')
left_join(d1, d2, by='x')
#> # Source:   lazy query [?? x 3]
#> # Database: spark_connection
#>       x   y.x   y.y
#>   <int> <int> <int>
#> 1     1     4     7
#> 2     2     5     8
#> 3     3     6     9
# print versions
packageVersion("dplyr")
#> [1] '0.7.0'
packageVersion("sparklyr")
#> [1] '0.5.5.9002'
if (requireNamespace("dbplyr", quietly = TRUE)) {
  packageVersion("dbplyr")
}
#> [1] '0.0.0.9001'
R.Version()$version.string
#> [1] "R version 3.4.0 (2017-04-21)"
# cleanup
spark_disconnect(sc)
We can probably ask people to "go to the dev version of Sparklyr", but for confidence it would be good to have some assurance that a given tag or branch is stable, and to know exactly which versions of everything are needed. Hopefully CRAN will let you push a
@JohnMount on CRAN now.
JohnMount commented May 29, 2017
The current (5-28-2017) dev version of dplyr 0.6.0 appears to not allow joins with common column names with the current CRAN version of sparklyr 0.5.5. This means if this version of dplyr becomes current on CRAN before sparklyr also updates on CRAN, then production user code will break on bulk update (such as update.packages()). As a sparklyr user I would suggest this be treated as an important dependent package (sparklyr) breaking on dplyr proposed CRAN update (regardless of the automatic check status of sparklyr 0.5.5).

The problem appears to go away if we move up to the dev version of sparklyr 0.5.5.9000.

I am re-filing the issue as I have improved the reprexes, and tested and documented more combinations of package versions. I am re-filing it here as this issue seems relevant to dplyr itself (especially as sparklyr appears to already have a fix that just needs to percolate up to CRAN).

Failing and succeeding reprexes below.