[SPARK-16381][SQL][SparkR] Update SQL examples and programming guide for R language binding #14082

Closed
wants to merge 13 commits into from


keypointt
Contributor

https://issues.apache.org/jira/browse/SPARK-16381

What changes were proposed in this pull request?

Update SQL examples and programming guide for R language binding.

Here I follow the example in master...liancheng:example-snippet-extraction and create a separate R file to store all the example code.

How was this patch tested?

Manual test on my local machine.
Screenshot as below:

[Image: screen shot 2016-07-06 at 4 52 25 pm]

@SparkQA

SparkQA commented Jul 7, 2016

Test build #61881 has finished for PR 14082 at commit e7def7c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng
Contributor

@shivaram @mengxr It would be nice if either of you could help review this one, thanks!

@shivaram
Contributor

shivaram commented Jul 7, 2016

I'll take a look at this today. Also cc @felixcheung

## 30 1

{% endhighlight %}
{% include_example untyped_transformations r/RSparkSQLExample.R %}

Member

this is just internal stuff, but "untyped_transformations" is a bit odd? shouldn't we call this "dataframe_operations" or something?

@felixcheung
Member

looks good, only this minor comment: #14082 (comment)
and some improvement suggestions to example code.

@keypointt
Contributor Author

thanks a lot @felixcheung , working on it

@SparkQA

SparkQA commented Jul 8, 2016

Test build #61946 has finished for PR 14082 at commit 1af09f3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 8, 2016

Test build #61970 has finished for PR 14082 at commit 828b2cf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 8, 2016

Test build #61971 has finished for PR 14082 at commit 7dca42d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

library(SparkR)

# $example on:init_session$
sparkR.session()

Contributor

The Python code snippet shows how to set appName, options, etc. Could we do something similar here? I.e., something like:

sparkR.session(appName='MyApp', sparkConfig=list(spark.executor.memory="1g"))

Contributor Author

Sure, I'll add it.

head(teenagers)
## name
## 1 Justin

Contributor

It would be good to add a comment explaining what the following code block does. Something like: We can also run custom R-UDFs on Spark DataFrames. Here we prefix all the names with "Name:".
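
A minimal sketch of what such a commented block could look like, assuming the example applies an R function to each partition through SparkR's `dapply`. The names `prefix_names` and `people` are hypothetical and do not come from the PR. The function is exercised on a local data.frame so that it runs without a Spark session; the `dapply` call is an assumed usage and appears only as a comment.

```r
# We can also run custom R-UDFs on Spark DataFrames.
# Here we prefix all the names with "Name:".

# Hypothetical helper: receives one partition as a local data.frame
# and returns a data.frame with a single string column `name`.
prefix_names <- function(p) {
  data.frame(name = sprintf("Name: %s", p$name), stringsAsFactors = FALSE)
}

# Local check on a plain data.frame (no Spark session needed).
people <- data.frame(name = c("Justin", "Andy"), stringsAsFactors = FALSE)
print(prefix_names(people))

# With SparkR (assumed usage, not run here), the same function would be
# passed to dapply together with the output schema:
#   schema <- structType(structField("name", "string"))
#   prefixed <- dapply(df, prefix_names, schema)
#   head(collect(prefixed))
```

Keeping the per-partition function separate from the `dapply` call makes it possible to check the transformation locally before running it on a cluster.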

Contributor Author

sure

@SparkQA

SparkQA commented Jul 8, 2016

Test build #61993 has finished for PR 14082 at commit cd184b3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 8, 2016

Test build #61995 has finished for PR 14082 at commit d5b0b7f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Contributor

shivaram commented Jul 8, 2016

Thanks @keypointt -- changes look good to me. @felixcheung, any other comments?

library(SparkR)

# $example on:init_session$
sparkR.session(appName='MyApp', sparkConfig=list(spark.executor.memory="1g"))

Member

I think it would be great if you could run lint-r on this. Typically, our R style would be something like this, with extra spaces:

sparkR.session(appName = "MyApp", sparkConfig = list(spark.executor.memory = "1g"))

You might also want to use " (double quotes) to be consistent.

Contributor Author

Member

we should probably update those too..

Contributor Author

ok I'll just submit a quick minor patch

@felixcheung
Member

looks good except for one comment. thanks for putting this together!

@felixcheung
Member

LGTM

@keypointt
Contributor Author

I just did a quick scan: the no-space style, like session(appName='MyApp', sparkConfig=list(spark.executor.memory="1g")), is only found in the example/ folder (in the above commit), and all R files under R/pkg/ follow the spaced style.

@shivaram
Contributor

shivaram commented Jul 8, 2016

Thanks @keypointt -- LGTM. Will merge this once Jenkins passes

@keypointt
Contributor Author

Thank you for reviewing :)

@SparkQA

SparkQA commented Jul 9, 2016

Test build #62005 has finished for PR 14082 at commit 7195750.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 9, 2016

Test build #62004 has finished for PR 14082 at commit a1eca2b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng
Contributor

Since both @shivaram and @felixcheung signed off on this, I'm merging this to master and branch-2.0.

Thanks @keypointt for working on this and @shivaram and @felixcheung for the review!

asfgit pushed a commit that referenced this pull request Jul 11, 2016
…for R language binding

https://issues.apache.org/jira/browse/SPARK-16381

## What changes were proposed in this pull request?

Update SQL examples and programming guide for R language binding.

Here I follow the example in master...liancheng:example-snippet-extraction and create a separate R file to store all the example code.

## How was this patch tested?

Manual test on my local machine.
Screenshot as below:

![screen shot 2016-07-06 at 4 52 25 pm](https://cloud.githubusercontent.com/assets/3925641/16638180/13925a58-439a-11e6-8d57-8451a63dcae9.png)

Author: Xin Ren <iamshrek@126.com>

Closes #14082 from keypointt/SPARK-16381.

(cherry picked from commit 9cb1eb7)
Signed-off-by: Cheng Lian <lian@databricks.com>
@asfgit asfgit closed this in 9cb1eb7 Jul 11, 2016