[SPARK-16381][SQL][SparkR] Update SQL examples and programming guide for R language binding #14082

keypointt · 2016-07-06T23:53:58Z

https://issues.apache.org/jira/browse/SPARK-16381

What changes were proposed in this pull request?

Update SQL examples and programming guide for R language binding.

Following the approach in master...liancheng:example-snippet-extraction, I created a separate R file to store all the example code.

How was this patch tested?

Manual test on my local machine.
Screenshot as below:

SparkQA · 2016-07-07T00:25:25Z

Test build #61881 has finished for PR 14082 at commit e7def7c.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

liancheng · 2016-07-07T04:21:42Z

@shivaram @mengxr It would be nice if either of you could help review this one, thanks!

shivaram · 2016-07-07T22:21:05Z

I'll take a look at this today. Also cc @felixcheung

felixcheung · 2016-07-07T22:52:55Z

docs/sql-programming-guide.md

-## 30   1
-
-{% endhighlight %}
+{% include_example untyped_transformations r/RSparkSQLExample.R %}


This is just an internal name, but "untyped_transformations" is a bit odd. Shouldn't we call this "dataframe_operations" or something?

felixcheung · 2016-07-07T23:01:55Z

looks good, only this minor comment: #14082 (comment)
and some improvement suggestions to example code.

keypointt · 2016-07-07T23:16:14Z

thanks a lot @felixcheung , working on it

SparkQA · 2016-07-08T01:20:22Z

Test build #61946 has finished for PR 14082 at commit 1af09f3.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-07-08T07:18:35Z

Test build #61970 has finished for PR 14082 at commit 828b2cf.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-07-08T07:30:05Z

Test build #61971 has finished for PR 14082 at commit 7dca42d.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

shivaram · 2016-07-08T18:42:11Z

examples/src/main/r/RSparkSQLExample.R

+library(SparkR)
+
+# $example on:init_session$
+sparkR.session()


The Python code snippet shows how to set appName, options, etc. Could we do something similar here? For example:

sparkR.session(appName='MyApp', sparkConfig=list(spark.executor.memory="1g"))

keypointt: Sure, I'll add it.


shivaram · 2016-07-08T18:59:00Z

examples/src/main/r/RSparkSQLExample.R

+head(teenagers)
+## name
+## 1 Justin
+


It would be good to add a comment explaining what the following code block does. Something like:

"We can also run custom R-UDFs on Spark DataFrames. Here we prefix all the names with "Name:""

SparkQA · 2016-07-08T19:24:37Z

Test build #61993 has finished for PR 14082 at commit cd184b3.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-07-08T19:50:07Z

Test build #61995 has finished for PR 14082 at commit d5b0b7f.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

shivaram · 2016-07-08T21:44:33Z

Thanks @keypointt -- Changes look good to me. @felixcheung any other comments ?

felixcheung · 2016-07-08T22:40:35Z

examples/src/main/r/RSparkSQLExample.R

+library(SparkR)
+
+# $example on:init_session$
+sparkR.session(appName='MyApp', sparkConfig=list(spark.executor.memory="1g"))


It would be great if you could run lint-r on this. Our R style typically puts spaces around =, like this:

sparkR.session(appName = "MyApp", sparkConfig = list(spark.executor.memory = "1g"))

You might also want to use double quotes (") throughout, to be consistent.

keypointt: I just found some inconsistencies in the existing examples, shown below. I'll follow the style you suggested.

no space: https://github.com/keypointt/spark/blob/d5b0b7f111a28c63ca6e501ff0017af64881f0b4/examples/src/main/r/ml.R#L25

with space: https://github.com/keypointt/spark/blob/d5b0b7f111a28c63ca6e501ff0017af64881f0b4/examples/src/main/r/ml.R#L34

felixcheung: We should probably update those too.

keypointt: OK, I'll just submit a quick minor patch.

felixcheung · 2016-07-08T22:42:46Z

looks good except for one comment. thanks for putting this together!

felixcheung · 2016-07-08T23:26:25Z

LGTM

keypointt · 2016-07-08T23:56:22Z

I just did a quick scan: the no-space style, like session(appName='MyApp', sparkConfig=list(spark.executor.memory="1g")), is found only in the examples/ folder (in the above commit), and all R files under R/pkg/ already follow the spaced style.

shivaram · 2016-07-08T23:57:45Z

Thanks @keypointt -- LGTM. Will merge this once Jenkins passes

keypointt · 2016-07-09T00:17:39Z

Thank you for reviewing :)

SparkQA · 2016-07-09T00:23:58Z

Test build #62005 has finished for PR 14082 at commit 7195750.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-07-09T01:00:31Z

Test build #62004 has finished for PR 14082 at commit a1eca2b.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

liancheng · 2016-07-11T12:02:29Z

Since both @shivaram and @felixcheung have signed off on this, I'm merging it to master and branch-2.0.

Thanks @keypointt for working on this and @shivaram and @felixcheung for the review!

…for R language binding

https://issues.apache.org/jira/browse/SPARK-16381

## What changes were proposed in this pull request?

Update SQL examples and programming guide for R language binding.

Here I just follow example master...liancheng:example-snippet-extraction, created a separate R file to store all the example code.

## How was this patch tested?

Manual test on my local machine.
Screenshot as below:

![screen shot 2016-07-06 at 4 52 25 pm](https://cloud.githubusercontent.com/assets/3925641/16638180/13925a58-439a-11e6-8d57-8451a63dcae9.png)

Author: Xin Ren <iamshrek@126.com>

Closes #14082 from keypointt/SPARK-16381.

(cherry picked from commit 9cb1eb7)
Signed-off-by: Cheng Lian <lian@databricks.com>

[SPARK-16381] move example code to a separate R file

e7def7c

felixcheung reviewed Jul 7, 2016

[SPARK-16381] some fixes, more to come

1af09f3

keypointt added 4 commits July 7, 2016 23:11

[SPARK-16381] make schema merge example runnable

9ac6a70

[SPARK-16381] make sql_query example runnable

05ee46b

[SPARK-16381] make load_programmatically example runnable

828b2cf

[SPARK-16381] replace last showDF()

7dca42d

shivaram reviewed Jul 8, 2016

keypointt added 3 commits July 8, 2016 11:49

Merge branch 'master' into SPARK-16381

34ca57c

Merge branch 'SPARK-16381' of https://github.com/keypointt/spark into…

af2365c

… SPARK-16381

[SPARK-16381] minor fix

cd184b3

shivaram reviewed Jul 8, 2016

keypointt added 2 commits July 8, 2016 12:01

[SPARK-16381] make it verbose

5e95fdd

[SPARK-16381] remove code duplicate etc

d5b0b7f

felixcheung reviewed Jul 8, 2016

[SPARK-16381] style fix

a1eca2b

[SPARK-16381] fix space style in other r examples

7195750

asfgit closed this in 9cb1eb7 Jul 11, 2016
