[SPARK-17902][R] Revive stringsAsFactors option for collect() in SparkR
## What changes were proposed in this pull request?

This PR revives the `stringsAsFactors` option in the `collect` API, which was mistakenly removed in 71a138c.

During primitive type conversion, it casts a `character` vector to `factor` when the condition `stringsAsFactors && is.character(vec)` holds.
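
The condition can be illustrated in base R without a Spark session. This is only a sketch: `convert_column` is a hypothetical helper written for illustration, not part of SparkR, and it mirrors just the check added to `collect`.

```r
# Hypothetical helper (not part of SparkR) that mirrors the check
# added to collect(): only character vectors are converted, and only
# when stringsAsFactors is TRUE.
convert_column <- function(vec, stringsAsFactors = FALSE) {
  if (is.character(vec) && stringsAsFactors) {
    vec <- as.factor(vec)
  }
  vec
}

species <- c("setosa", "versicolor", "setosa")

class(convert_column(species))                           # "character"
class(convert_column(species, stringsAsFactors = TRUE))  # "factor"
class(convert_column(1:3, stringsAsFactors = TRUE))      # "integer"
```

Non-character columns pass through unchanged, so enabling the option affects only string columns.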

## How was this patch tested?

Unit test in `R/pkg/tests/fulltests/test_sparkSQL.R`.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19551 from HyukjinKwon/SPARK-17902.
HyukjinKwon committed Oct 26, 2017
1 parent 3073344 commit a83d8d5
Showing 2 changed files with 9 additions and 0 deletions.
3 changes: 3 additions & 0 deletions R/pkg/R/DataFrame.R
@@ -1191,6 +1191,9 @@ setMethod("collect",
vec <- do.call(c, col)
stopifnot(class(vec) != "list")
class(vec) <- PRIMITIVE_TYPES[[colType]]
if (is.character(vec) && stringsAsFactors) {
vec <- as.factor(vec)
}
df[[colIndex]] <- vec
} else {
df[[colIndex]] <- col
6 changes: 6 additions & 0 deletions R/pkg/tests/fulltests/test_sparkSQL.R
@@ -499,6 +499,12 @@ test_that("create DataFrame with different data types", {
expect_equal(collect(df), data.frame(l, stringsAsFactors = FALSE))
})

test_that("SPARK-17902: collect() with stringsAsFactors enabled", {
df <- suppressWarnings(collect(createDataFrame(iris), stringsAsFactors = TRUE))
expect_equal(class(iris$Species), class(df$Species))
expect_equal(iris$Species, df$Species)
})

test_that("SPARK-17811: can create DataFrame containing NA as date and time", {
df <- data.frame(
id = 1:2,