[SPARK-39372][R] Support R 4.2.0 #36758

HyukjinKwon · 2022-06-03T08:47:04Z

What changes were proposed in this pull request?

This PR proposes:

Updates AppVeyor to use the latest R version 4.2.0.

Uses the correct way of checking if an object is a matrix: is.matrix.
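From R 4.2.0, the check class(upperBoundsOnCoefficients) != "matrix" fails. Since R 4.0.0 the class of a matrix is c("matrix", "array"), so the comparison returns two values, and R 4.2.0 turns an if condition of length greater than one into an error: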

-- 1. Error (test_mllib_classification.R:245:3): spark.logit -------------------
Error in `if (class(upperBoundsOnCoefficients) != "matrix") {
    stop("upperBoundsOnCoefficients must be a matrix.")
}`: the condition has length > 1

This fixes spark.logit when lowerBoundsOnCoefficients or upperBoundsOnCoefficients is specified.
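The difference between the two checks can be reproduced outside SparkR. The following is a minimal sketch, assuming R 4.0.0 or later; check_bounds is a hypothetical helper written for illustration and is not the actual SparkR code.

```r
# A stand-in for the bounds argument passed to spark.logit (illustrative values).
upperBoundsOnCoefficients <- matrix(c(1.0, 2.0, 3.0, 4.0), nrow = 2)

# On R >= 4.0.0 a matrix carries two classes, so this comparison is a
# logical vector with one entry per class rather than a single value.
comparison <- class(upperBoundsOnCoefficients) != "matrix"

# Hypothetical helper (not from SparkR): is.matrix() always returns a single
# TRUE or FALSE, so it is safe to use as an if() condition.
check_bounds <- function(bounds) {
  if (!is.matrix(bounds)) {
    stop("upperBoundsOnCoefficients must be a matrix.")
  }
  invisible(TRUE)
}

# A matrix passes the check.
accepted <- isTRUE(check_bounds(upperBoundsOnCoefficients))

# A plain numeric vector is rejected with the error message above.
rejected <- inherits(try(check_bounds(c(1, 2)), silent = TRUE), "try-error")
```

Using comparison directly as an if condition is what raises the error on R 4.2.0, which is why the sketch only stores it in a variable.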

Explicitly uses the first element in the is.na comparison. From R 4.2.0, an if condition of length greater than one throws an error, as below:
```
Error in if (is.na(c(1, 2))) print("abc") : the condition has length > 1
```
Previously it was a warning.

This fixes createDataFrame or as.DataFrame when the data type is a nested complex type.

Why are the changes needed?

To support and test the latest R. The R community tends to adopt the latest versions aggressively.

Does this PR introduce any user-facing change?

Yes, after this PR, we officially support R 4.2.0 in SparkR.

How was this patch tested?

CI in this PR should test it out.

dongjoon-hyun

+1, LGTM.

HyukjinKwon · 2022-06-03T10:06:29Z

dev/appveyor-install-dependencies.ps1

@@ -129,7 +129,7 @@ $env:PATH = "$env:HADOOP_HOME\bin;" + $env:PATH
 Pop-Location

 # ========================== R
-$rVer = "4.0.2"
+$rVer = "4.2.0"
 $rToolsVer = "4.0.2"


While the tests passed, it failed to download RTools 4.2.0. I reverted the RTools upgrade here for now.

HyukjinKwon · 2022-06-03T12:48:14Z

R/pkg/R/serialize.R

@@ -58,7 +58,12 @@ writeObject <- function(con, object, writeType = TRUE) {
  # Checking types is needed here, since 'is.na' only handles atomic vectors,
  # lists and pairlists
  if (type %in% c("integer", "character", "logical", "double", "numeric")) {
-    if (is.na(object)) {
+    if (is.na(object[[1]])) {


R 4.1 and below:

Warning in if (is.na(c(1, 2))) print("abc") : the condition has length > 1 and only the first element will be used

R 4.2+:

Error in if (is.na(c(1, 2))) print("abc") : the condition has length > 1
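The effect of indexing the first element can be sketched in plain R. The helper first_is_na below is hypothetical and only mirrors the shape of the condition in the diff above; it is not the SparkR writeObject implementation.

```r
# is.na() is vectorised: for a length-2 input it returns two values,
# which is what if() rejects on R 4.2.0 and later.
vectorised <- is.na(c(1, 2))

# Hypothetical helper (not from SparkR): [[1]] extracts the first element,
# so the if() condition always has length one.
# Note: a zero-length input would fail at object[[1]] with a subscript error.
first_is_na <- function(object) {
  if (is.na(object[[1]])) {
    return(TRUE)
  }
  FALSE
}

no_na_first  <- first_is_na(c(1, 2))        # first element is 1
na_first     <- first_is_na(c(NA, 2))       # first element is NA
na_character <- first_is_na(NA_character_)  # single missing string
```

As on R 4.1 and below, where only the first element was used with a warning, the helper looks only at the first element, so an NA later in the vector is not detected.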

HyukjinKwon · 2022-06-03T12:49:20Z

Tests should pass now.

HyukjinKwon · 2022-06-03T12:59:30Z

R/pkg/R/serialize.R

+      # Uses the first element for now to keep the behavior same as R before
+      # 4.2.0. This is wrong because we should differenciate c(NA) from a
+      # single NA as the former means array(null) and the latter means null
+      # in Spark SQL. However, it requires non-trivial comparison to distinguish


For example, we would need to check whether the input is a vector, list, array, etc., which is exactly what getSerdeType does. However, this comparison is (to the best of my knowledge) a shortcut to avoid the overhead of getSerdeType, so I decided to leave it as is for now.

HyukjinKwon · 2022-06-03T13:01:27Z

cc @felixcheung @shivaram @viirya too FYI

dongjoon-hyun

+1, LGTM. Merged to master.

viirya

lgtm

HyukjinKwon · 2022-06-03T22:42:32Z

Thanks!!!

Update R version to 4.2.0 in AppVeyor

932da74

github-actions bot added the BUILD label Jun 3, 2022

dongjoon-hyun approved these changes Jun 3, 2022

Keep the Rtools version as 4.0 for now

239a275

Use is.matrix to check if the input is a matrix

e398d4d

github-actions bot added ML R labels Jun 3, 2022

HyukjinKwon changed the title ~~[SPARK-39372][INFRA][R] Update R version to 4.2.0 in AppVeyor~~ [SPARK-39372][R] Support R 4.2.0 Jun 3, 2022

HyukjinKwon and others added 2 commits June 3, 2022 20:29

Apply suggestions from code review

3ec7fca

Fix

ddda616

dongjoon-hyun approved these changes Jun 3, 2022

dongjoon-hyun closed this in c63e37e Jun 3, 2022

viirya reviewed Jun 3, 2022

HyukjinKwon deleted the upgrade-r-appveyor branch January 15, 2024 00:53
