Strings with "NA" still handled incorrectly in sdf_copy_to #2031
Comments
Just wanted to say that I'm facing the same issue, and there don't seem to be any arguments that affect this behavior.
Data is copied to Spark in two steps: it is first written to a temporary CSV file, and that file is then loaded into Spark. Line 167 in f82d3c0
sparklyr/java/spark-1.5.2/utils.scala Lines 270 to 304 in f82d3c0
After the CSV write, any |
Problem:
When copying a data frame into Spark, sdf_copy_to (or Spark itself?) treats character or factor values equal to the string "NA" as missing (NA). In R, at least, "NA" is not the same as NA, so the copy in Spark may contain more NAs than the original data frame.
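To illustrate how the string "NA" and a missing value can become indistinguishable, here is a minimal sketch in base R only. It does not call sparklyr and is not sparklyr's actual serialization code. It assumes an intermediate CSV file with unquoted fields in which missing values are written as the literal text NA. The names `df`, `tmp`, `lines`, and `back` are made up for the illustration.

```r
# A character column holding the string "NA", a real missing value, and "a".
df <- data.frame(x = c("NA", NA, "a"), stringsAsFactors = FALSE)

# In R the two are distinct: only the second element is missing.
is.na(df$x)

# Assumption: the intermediate file is written unquoted, with NA as "NA".
tmp <- tempfile(fileext = ".csv")
write.csv(df, tmp, row.names = FALSE, quote = FALSE)

# Both rows now contain the same text, so the distinction is gone on disk.
lines <- readLines(tmp)
lines

# Reading the file back with the default na.strings = "NA" turns both into NA.
back <- read.csv(tmp, stringsAsFactors = FALSE)
is.na(back$x)
```

If the reader side has no way to tell the two apart, one possible workaround (again an assumption, not something confirmed in this thread) is to recode the literal string to a different sentinel before copying and map it back afterwards.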
This appears related to #1854.
Example: