[SPARK-6806] [SPARKR] [DOCS] Add a new SparkR programming guide #6490

shivaram · 2015-05-29T06:37:13Z

This PR adds a new SparkR programming guide at the top level. This will be useful for R users because our APIs don't directly match the Scala/Python APIs, and because we need to explain SparkR without using RDDs as examples.

cc @rxin @davies @pwendell

cc @cafreeman -- Would be great if you could also take a look at this!

This PR also updates write.df and read.df to handle defaults better.

concretevitamin · 2015-05-29T07:38:13Z

docs/sparkr.md

"supports operations similar to R data frames, [dplyr]"

Add "etc." or "e.g." here?

SparkQA · 2015-05-29T08:28:08Z

Test build #33721 has finished for PR 6490 at commit d09703c.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

davies · 2015-05-29T16:16:41Z

docs/sparkr.md

Can we hide sparkR.init() and call it in sparkRSQL.init() internally?

Yeah, I was thinking that having these two init calls is wasteful. But longer term, when we want to introduce, say, ML functionality that requires the SparkContext, it might be good to familiarize users with the idea of having a SparkContext around.

We can definitely do an implicit sparkR.init, though, if we find that no SparkContext exists, using something like the logic at line 105 of spark/R/pkg/R/sparkR.R (in e7b6177):

if (exists(".sparkRjsc", envir = .sparkREnv)) {

Yes, it seems reasonable since we only support one SparkContext at a time.
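For illustration, the implicit-init idea discussed above could look roughly like the following. This is a minimal sketch in plain R, not the actual SparkR change: the .sparkREnv environment here is a local stand-in for the package-private one, and createContext, getOrCreateContext, initSQL, and callLog are hypothetical names introduced only for this example. Only the exists() check mirrors the line quoted from sparkR.R.

```r
# Local stand-in for SparkR's package-private environment (assumption).
.sparkREnv <- new.env()

# Hypothetical call counter so the caching behaviour can be observed.
callLog <- new.env()
callLog$count <- 0

# Hypothetical placeholder for the real SparkContext setup.
createContext <- function() {
  callLog$count <- callLog$count + 1
  list(id = callLog$count)
}

# Return the cached context if one exists; otherwise create and cache it.
getOrCreateContext <- function() {
  if (exists(".sparkRjsc", envir = .sparkREnv)) {
    return(get(".sparkRjsc", envir = .sparkREnv))
  }
  ctx <- createContext()
  assign(".sparkRjsc", ctx, envir = .sparkREnv)
  ctx
}

# Hypothetical SQL-context initializer that obtains the context implicitly,
# so the caller never has to create it first.
initSQL <- function() {
  list(sc = getOrCreateContext())
}

first <- initSQL()
second <- initSQL()
```

Because only one SparkContext is supported at a time, caching a single instance in the environment is enough: the second initSQL() call reuses the context created by the first.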

davies · 2015-05-29T18:42:16Z

This looks good to me. Can we have a section about Hive, covering HiveContext and loading a table from Hive?

shivaram · 2015-05-29T19:53:11Z

Thanks @davies for taking a look. I've now added a section on HiveContext.

shivaram · 2015-05-29T20:08:20Z

BTW @davies, I think we can add the automatic SparkContext creation in a separate PR. I'd like to keep the amount of code change in this PR small.

SparkQA · 2015-05-29T20:12:23Z

Test build #33759 has finished for PR 6490 at commit 408dce5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

davies · 2015-05-29T21:11:17Z

@shivaram LGTM, merging this into master and 1.4.

This PR adds a new SparkR programming guide at the top-level. This will be useful for R users as our APIs don't directly match the Scala/Python APIs and as we need to explain SparkR without using RDDs as examples etc.

cc rxin davies pwendell

cc cafreeman -- Would be great if you could also take a look at this !

Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #6490 from shivaram/sparkr-guide and squashes the following commits:

d5ff360 [Shivaram Venkataraman] Add a section on HiveContext, HQL queries
408dce5 [Shivaram Venkataraman] Fix link
dbb86e3 [Shivaram Venkataraman] Fix minor typo
9aff5e0 [Shivaram Venkataraman] Address comments, use dplyr-like syntax in example
d09703c [Shivaram Venkataraman] Fix default argument in read.df
ea816a1 [Shivaram Venkataraman] Add a new SparkR programming guide
    Also update write.df, read.df to handle defaults better

(cherry picked from commit 5f48e5c)
Signed-off-by: Davies Liu <davies@databricks.com>

SparkQA · 2015-05-29T22:29:48Z

Test build #33768 timed out for PR 6490 at commit d5ff360 after a configured wait of 150m.


shivaram added 2 commits May 28, 2015 23:32

Add a new SparkR programming guide

ea816a1

Also update write.df, read.df to handle defaults better

Fix default argument in read.df

d09703c

concretevitamin reviewed May 29, 2015

davies reviewed May 29, 2015

shivaram mentioned this pull request May 29, 2015

added warning on new spark release amplab-extras/SparkR-pkg#251

Merged

shivaram added 3 commits May 29, 2015 11:06

Address comments, use dplyr-like syntax in example

9aff5e0

Fix minor typo

dbb86e3

Fix link

408dce5

Add a section on HiveContext, HQL queries

d5ff360

asfgit closed this in 5f48e5c May 29, 2015
