Fixes SPARK-12910: R version for installing sparkR #10836
Closed
5 commits
9a41d47 Fixes SPARK-12910: R version for installing sparkR (napsternxg)
17fe916 Merge branch 'master' of github.com:apache/spark (napsternxg)
dfe0c82 Merge branch 'master' of github.com:apache/spark (napsternxg)
70782ab Fixes SPARK-12910: R version for installing sparkR (napsternxg)
0b3960a Merge branch 'master' of github.com:apache/spark (napsternxg)

If one were running from RStudio with the steps in "Using SparkR from RStudio" below, they wouldn't have to install or run install-dev.sh, though. Could we clarify that?

I am running this on a machine with no X server, and hence no RStudio, so this kind of functionality will be needed for users like me.
Even for RStudio, I feel SparkR needs to be compiled with R >= 3.0.

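The R >= 3.0 floor mentioned above could be checked before building, roughly as follows. This is only a sketch: `r_version_ok` is a hypothetical helper, not something shipped with Spark, and it looks only at the major component of a version string supplied by the caller.

```shell
# Hypothetical helper (not part of Spark): succeeds when a version string
# such as "3.2.1" has a major component of at least 3.
r_version_ok() {
  local major="${1%%.*}"            # text before the first dot
  case "$major" in
    ''|*[!0-9]*) return 2 ;;        # empty or non-numeric major component
  esac
  [ "$major" -ge 3 ]
}

# Possible usage, assuming Rscript is on the PATH:
#   r_version_ok "$(Rscript -e 'cat(as.character(getRversion()))')" \
#     || echo "R is older than 3.0"
```

The usage line is left commented out because it depends on which R installation is picked up by default, which is the very issue under discussion.
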
To clarify, I'm referring to the part about:
"sparkR need to be created in `$SPARK_HOME/R/lib`. This can be done by running the script `$SPARK_HOME/R/install-dev.sh`"

I think this is needed, as mentioned in the section on running SparkR from RStudio. The script there tries to access the lib location, which might not be present in the SparkR folder if the wrong version of R is selected by default.

Actually, this line
`.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))`
adds SPARK_HOME/R/lib to R's library path and allows R, whatever version is running, to load the SparkR package from there. The SparkR package does not need to be installed with `R CMD INSTALL` (in install-dev.sh) at all.

I am sorry if I was not clear before, but what I mean is the following:
In order for the code
`.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))`
to work, the directory `$SPARK_HOME/R/lib` needs to exist. When we build Spark using `build/mvn -DskipTests clean package`, this directory is not created by default. Hence we have to run `install-dev.sh` in order to use SparkR from an R shell.
If we look at the code in `install-dev.sh`, the following lines actually create the `lib` directory.

OK, I think I get your point now.
I guess we are saying this README.md is more for developers, so I'm OK with what you have here.
There are users who are not building Spark from source and are running the binary release, in which case SPARK_HOME/R/lib is already there and they would not need to install the SparkR package. Similarly, when running SparkR with a cluster manager, SparkR would not need to be installed on the worker nodes either. I agree these cases are possibly outside the scope of this file.

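The two cases discussed in this thread (a binary release where SPARK_HOME/R/lib already exists, and a source build where it does not) could be told apart with a check along these lines. It is a sketch under assumptions: `sparkr_lib_status` is a hypothetical helper, not part of Spark, and it only tests for the `R/lib` directory and the `R/install-dev.sh` script named in the comments above.

```shell
# Hypothetical helper (not part of Spark): reports whether the directory
# that the .libPaths() line relies on exists under the given Spark home.
sparkr_lib_status() {
  local spark_home="$1"
  local lib_dir="$spark_home/R/lib"
  if [ -d "$lib_dir" ]; then
    # Binary release, or a source build where install-dev.sh already ran
    echo "present"
  elif [ -x "$spark_home/R/install-dev.sh" ]; then
    # Source build: the lib directory still has to be created
    echo "missing: run $spark_home/R/install-dev.sh"
  else
    echo "missing: no install-dev.sh found"
  fi
}

# Possible usage:
#   sparkr_lib_status "$SPARK_HOME"
```

The function only reports a status and never runs install-dev.sh itself, so it is safe to call on worker nodes or on a binary release.
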
Great. So is this PR ready to be merged into master?