[SPARK-16381][SQL][SparkR] Update SQL examples and programming guide for R language binding #14082
Test build #61881 has finished for PR 14082 at commit
I'll take a look at this today. Also cc @felixcheung
## 30 1

{% endhighlight %}
{% include_example untyped_transformations r/RSparkSQLExample.R %}
This is just internal stuff, but "untyped_transformations" is a bit odd. Shouldn't we call this "dataframe_operations" or something?
looks good, only this minor comment: #14082 (comment)
thanks a lot @felixcheung, working on it
Test build #61946 has finished for PR 14082 at commit
Test build #61970 has finished for PR 14082 at commit
Test build #61971 has finished for PR 14082 at commit
library(SparkR)

# $example on:init_session$
sparkR.session()
The Python code snippet shows how to set appName, options, etc. Could we do something similar here? I.e., something like:
sparkR.session(appName='MyApp', sparkConfig=list(spark.executor.memory="1g"))
sure I'll add it
head(teenagers)
## name
## 1 Justin
It would be good to add a comment explaining what the following code block does. Something like: We can also run custom R-UDFs on Spark DataFrames. Here we prefix all the names with "Name:"
sure
Test build #61993 has finished for PR 14082 at commit
Test build #61995 has finished for PR 14082 at commit
Thanks @keypointt -- Changes look good to me. @felixcheung any other comments?
library(SparkR)

# $example on:init_session$
sparkR.session(appName='MyApp', sparkConfig=list(spark.executor.memory="1g"))
I think it'd be great if you could run lint-r on this. Typically, our R style would be something like this, with extra spaces:
sparkR.session(appName = "MyApp", sparkConfig = list(spark.executor.memory = "1g"))
- you might also want to use " (double quotes) to be consistent.
I just found some inconsistency like below, and I'll follow the style you suggested.
we should probably update those too..
ok I'll just submit a quick minor patch
looks good except for one comment. thanks for putting this together!
LGTM
I just did a quick scan, the no-space style like
Thanks @keypointt -- LGTM. Will merge this once Jenkins passes
Thank you for reviewing :)
Test build #62005 has finished for PR 14082 at commit
Test build #62004 has finished for PR 14082 at commit
Since both @shivaram and @felixcheung signed this off, I'm merging this to master and branch-2.0. Thanks @keypointt for working on this and @shivaram and @felixcheung for the review!
…for R language binding

https://issues.apache.org/jira/browse/SPARK-16381

## What changes were proposed in this pull request?

Update SQL examples and programming guide for R language binding.

Here I just follow example master...liancheng:example-snippet-extraction, created a separate R file to store all the example code.

## How was this patch tested?

Manual test on my local machine.

Screenshot as below:

![screen shot 2016-07-06 at 4 52 25 pm](https://cloud.githubusercontent.com/assets/3925641/16638180/13925a58-439a-11e6-8d57-8451a63dcae9.png)

Author: Xin Ren <iamshrek@126.com>

Closes #14082 from keypointt/SPARK-16381.

(cherry picked from commit 9cb1eb7)
Signed-off-by: Cheng Lian <lian@databricks.com>