If you use a variable in dplyr::mutate() against a sparklyr data source, the lazy evaluation captures references to user variables. Changing the values of those variables implicitly changes the mutate, and therefore the values seen in the sparklyr result (which is itself a query). This can be worked around by dropping in dplyr::compute(), but it seems like it can produce a lot of incorrect calculations. Below is a small example and a lot of information on the versions of everything being run.
Notice s1 changed its value (likely due to lazy evaluation and having captured a reference to v).
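The capture-by-reference behaviour described above can be mimicked in plain R, without Spark. This is only a sketch of the suspected mechanism, not the sparklyr reproduction: `lazy_query` and `frozen_query` are hypothetical names, `quote()` stands in for the lazily built query, and `bquote()` stands in (loosely) for fixing the value when the query is built, which is the effect dplyr::compute() is used for in the workaround.

```r
# Plain-R sketch (assumption: this only imitates the lazy capture, no Spark involved).
x <- c(1, 2)
v <- 10

# The expression is stored unevaluated: it holds the *name* v, not the value 10.
lazy_query <- quote(x + v)

first <- eval(lazy_query)   # evaluated while v is 10
v <- 20
second <- eval(lazy_query)  # same stored expression, now sees v == 20

# Substituting the current value of v when the expression is built
# removes the dependence on later changes to v.
v <- 10
frozen_query <- bquote(x + .(v))
v <- 20
third <- eval(frozen_query)

print(first)
print(second)
print(third)
```

Here `first` and `second` differ even though `lazy_query` itself was never modified, while `third` keeps the result based on the value `v` had when `frozen_query` was built.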
    version
    #                _
    # platform       x86_64-apple-darwin13.4.0
    # arch           x86_64
    # os             darwin13.4.0
    # system         x86_64, darwin13.4.0
    # status
    # major          3
    # minor          3.2
    # year           2016
    # month          10
    # day            31
    # svn rev        71607
    # language       R
    # version.string R version 3.3.2 (2016-10-31)
    # nickname       Sincere Pumpkin Patch
Copied over from sparklyr/sparklyr#503.
Altering captured reference damages spark results.
OSX 10.11.6. Spark installed as described at http://spark.rstudio.com