From a0485741f86656c0f4c5a588fd69598f04f49cd1 Mon Sep 17 00:00:00 2001
From: felixcheung
Date: Sun, 1 Nov 2015 14:22:28 -0800
Subject: [PATCH 1/2] Add doc for running from RStudio

---
 docs/sparkr.md | 46 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/docs/sparkr.md b/docs/sparkr.md
index 497a276679f3..5ccaff292a1b 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -30,24 +30,62 @@ The entry point into SparkR is the `SparkContext` which connects your R program
 You can create a `SparkContext` using `sparkR.init` and pass in options such as the application name
 , any spark packages depended on, etc. Further, to work with DataFrames we will need a `SQLContext`,
 which can be created from the SparkContext. If you are working from the `sparkR` shell, the
-`SQLContext` and `SparkContext` should already be created for you.
+`SQLContext` and `SparkContext` should already be created for you, and you would not need to call
+`sparkR.init`.
 
+
 {% highlight r %}
 sc <- sparkR.init()
 sqlContext <- sparkRSQL.init(sc)
 {% endhighlight %}
+
+
+## Starting Up from RStudio
 
-In the event you are creating `SparkContext` instead of using `sparkR` shell or `spark-submit`, you
-could also specify certain Spark driver properties. Normally these
+You can also start SparkR from RStudio. You can connect your R program to a Spark cluster from
+RStudio, R shell, Rscript or other R IDEs. In addition to calling `sparkR.init`, you could also
+specify certain Spark driver properties. Normally these
 [Application properties](configuration.html#application-properties) and
 [Runtime Environment](configuration.html#runtime-environment) cannot be set programmatically, as the
 driver JVM process would have been started, in this case SparkR takes care of this for you. To set
 them, pass them as you would other configuration properties in the `sparkEnvir` argument to
 `sparkR.init()`.
 
+
 {% highlight r %}
-sc <- sparkR.init("local[*]", "SparkR", "/home/spark", list(spark.driver.memory="2g"))
+if (nchar(Sys.getenv("SPARK_HOME")) < 1) {
+  Sys.setenv(SPARK_HOME = "/home/spark")
+}
+library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
+sc <- sparkR.init(master = "local[*]", sparkEnvir = list(spark.driver.memory="2g"))
 {% endhighlight %}
+
+
+The following options can be set in `sparkEnvir` with `sparkR.init` from RStudio:
+
+<table>
+  <tr><th>Property Name</th><th>Property group</th><th>spark-submit equivalent</th></tr>
+  <tr>
+    <td>spark.driver.memory</td>
+    <td>Application Properties</td>
+    <td>--driver-memory</td>
+  </tr>
+  <tr>
+    <td>spark.driver.extraClassPath</td>
+    <td>Runtime Environment</td>
+    <td>--driver-class-path</td>
+  </tr>
+  <tr>
+    <td>spark.driver.extraJavaOptions</td>
+    <td>Runtime Environment</td>
+    <td>--driver-java-options</td>
+  </tr>
+  <tr>
+    <td>spark.driver.extraLibraryPath</td>
+    <td>Runtime Environment</td>
+    <td>--driver-library-path</td>
+  </tr>
+</table>
From 929c8c4582a0a657a3175edd917c38d7094d1d95 Mon Sep 17 00:00:00 2001
From: felixcheung
Date: Sun, 1 Nov 2015 16:46:29 -0800
Subject: [PATCH 2/2] slight update with steps

---
 docs/sparkr.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/sparkr.md b/docs/sparkr.md
index 5ccaff292a1b..437bd4756c27 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -43,8 +43,10 @@ sqlContext <- sparkRSQL.init(sc)
 ## Starting Up from RStudio
 
 You can also start SparkR from RStudio. You can connect your R program to a Spark cluster from
-RStudio, R shell, Rscript or other R IDEs. In addition to calling `sparkR.init`, you could also
-specify certain Spark driver properties. Normally these
+RStudio, R shell, Rscript or other R IDEs. To start, make sure SPARK_HOME is set in environment
+(you can check [Sys.getenv](https://stat.ethz.ch/R-manual/R-devel/library/base/html/Sys.getenv.html)),
+load the SparkR package, and call `sparkR.init` as below. In addition to calling `sparkR.init`, you
+could also specify certain Spark driver properties. Normally these
 [Application properties](configuration.html#application-properties) and
 [Runtime Environment](configuration.html#runtime-environment) cannot be set programmatically, as the
 driver JVM process would have been started, in this case SparkR takes care of this for you. To set