[SPARK-26014][R] Deprecate R prior to version 3.4 in SparkR #23012

Closed
wants to merge 3 commits into from


HyukjinKwon
Member

@HyukjinKwon HyukjinKwon commented Nov 12, 2018

What changes were proposed in this pull request?

This PR proposes to raise the minimum supported R version from 3.1 to 3.4.

R 3.1.x is too old: it was released 4.5 years ago, while R 3.4.0 was released 1.5 years ago. Given the timing of Spark 3.0, deprecating the lower versions and bumping the minimum R version to 3.4 seems like a reasonable option.

It should be fine to deprecate and then drop support for R versions below 3.4.

How was this patch tested?

Jenkins tests.

@HyukjinKwon
Member Author

HyukjinKwon commented Nov 12, 2018

Tests will probably fail since this change now produces warnings. Once we upgrade R to 3.4, they should no longer fail.

cc @felixcheung, @shaneknapp, @viirya, @shivaram, @falaki, @mengxr, @yanboliang FYI.

This PR is made per http://apache-spark-developers-list.1001551.n3.nabble.com/discuss-SparkR-CRAN-feasibility-check-server-problem-td25605.html

@HyukjinKwon
Member Author

adding @srowen too.

@SparkQA

SparkQA commented Nov 12, 2018

Test build #98718 has finished for PR 23012 at commit dc2dbd9.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@srowen srowen left a comment

I'll put a note in the Docs Text field in the JIRA for release notes.

@felixcheung
Member

felixcheung commented Nov 12, 2018 via email

@felixcheung
Member

felixcheung commented Nov 12, 2018 via email

@HyukjinKwon
Member Author

Yea, will take a look and address that. But about documenting it as unsupported: if we are explicitly going to say it's unsupported and dropped, we should, for instance, remove the compatibility change (https://github.com/apache/spark/blob/master/R/pkg/src-native/string_hash_code.c), and I assume previous versions don't work. A deprecation step might be more conservative and more consistent with the steps used to drop support in the other language APIs.

@SparkQA

SparkQA commented Nov 12, 2018

Test build #98736 has finished for PR 23012 at commit 3ec34f1.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Member

felixcheung commented Nov 13, 2018 via email

@HyukjinKwon
Member Author

HyukjinKwon commented Nov 13, 2018

Nice. Thanks! I just saw SPARKR-7839. BTW Felix, are you perhaps worried that if we upgrade the R version in Jenkins to 3.4, we could break support for the lower, deprecated R versions in Spark 3.0?

If so, let me put the version check in both places, general.R and shell.R. That way, both the shell and submit paths still show the warning, but the tests will pass with deprecated R versions.
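To make the "warn instead of fail" idea concrete, here is a minimal sketch in Python. It is not the SparkR code from this PR: the names `parse_version` and `check_r_version`, the `MIN_SUPPORTED` constant, and the message text are all hypothetical and only illustrate a version check that emits a warning and lets startup continue.

```python
import warnings

# Hypothetical threshold, taken from the PR title: R older than 3.4 is deprecated.
MIN_SUPPORTED = (3, 4, 0)


def parse_version(text):
    """Turn a dotted version string such as "3.1.1" into a 3-tuple of ints."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))  # pad "3.4" to (3, 4, 0)
    return tuple(parts[:3])


def check_r_version(version_text):
    """Warn, but do not raise, when the version is below the threshold.

    Returns True if a warning was emitted, False otherwise.
    """
    if parse_version(version_text) < MIN_SUPPORTED:
        warnings.warn("Support for R prior to version 3.4 is deprecated")
        return True
    return False
```

Under this sketch, each entry point (the shell startup and the submit path) would call `check_r_version` once with the detected version string. Because the check only warns, a run on R 3.1.1 still proceeds, which is what lets tests keep passing on a deprecated version.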

@HyukjinKwon
Member Author

This way, we could postpone the R upgrade in Jenkins until after the Spark 3.0.0 release, and could still test the deprecated R version 3.1.

@felixcheung
Member

I think it's easier to say unsupported if we are not testing it in Jenkins or AppVeyor. I don't know if we have any coverage at release for older R versions anyway, so it's better to mark it unsupported than to deprecate it.

but agree maybe the way to do this is deprecate without updating R in jenkins

@shaneknapp
Contributor

howdy howdy!

unless we dockerize spark builds (someday!), we're going to be stuck w/testing against one version of R on the jenkins workers... i've been looking in to packrat to help manage packages, but having more than one version of R will require me manually building and disting it out. and i really, truly, don't want to do that.

let me know how you think i should proceed.

@felixcheung
Member

felixcheung commented Nov 14, 2018 via email

@HyukjinKwon
Member Author

@shaneknapp, do you roughly know how difficult it is (and do you have some time shortly) to upgrade R from 3.1 to 3.4? I am asking this because I had some difficulties when I tried to manually upgrade from a certain low version to another non-latest version.

If it's expected to take a while, let's go with the deprecation step.
If it's expected to be less difficult, let's go with declaring it unsupported.

Does this sound okay to you @felixcheung?

@shaneknapp
Contributor

TL;DR: let's go w/deprecation.

still TL;DR: if i never have to install or manage R again, i will be a happy person!

@HyukjinKwon upgrading R is easy. getting the right mix of R and all of the associated packages working "as expected" is a nightmare.

the biggest problem i foresee is if we upgrade R (and all other packages) on the workers, every version of spark will be tested against this... and there will be bugs, test failures, and other time consuming (and obtuse) problems to debug. multiply this by every branch, and you can see the rabbit hole you've just entered.

for example, a month ago when i finally had time to dive back in to the ubuntu port, after finally figuring out how to install R+friends on ubuntu in an identical way to the centos workers, i STILL was finding problems w/lintr (see: #22896).

anyways: i'm more than happy to upgrade R and all the packages to something much more recent, but i will definitely appreciate some help in the game of test-failure whack-a-mole.

@HyukjinKwon
Member Author

Ah .. right makes sense to me. Thanks @shaneknapp. +1

@felixcheung
Member

felixcheung commented Nov 14, 2018 via email

@shaneknapp
Contributor

@felixcheung @HyukjinKwon

yes: deprecation in this case means we test against R-3.1.1

@HyukjinKwon
Member Author

Yup, will address the other comments and update the PR accordingly.

@SparkQA

SparkQA commented Nov 15, 2018

Test build #98848 has finished for PR 23012 at commit f153413.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

Merged to master.

Thanks @felixcheung.

@asfgit asfgit closed this in d4130ec Nov 15, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
Closes apache#23012 from HyukjinKwon/SPARK-26014.

Authored-by: hyukjinkwon <gurwls223@apache.org>
Signed-off-by: hyukjinkwon <gurwls223@apache.org>
@HyukjinKwon HyukjinKwon deleted the SPARK-26014 branch March 3, 2020 01:20