@shivaram
Contributor

This PR adds a new top-level SparkR programming guide. It will be useful for R users because our APIs don't directly match the Scala/Python APIs, and because we need to explain SparkR without using RDDs as examples.

cc @rxin @davies @pwendell

cc @cafreeman -- It would be great if you could also take a look at this!

shivaram added 2 commits May 28, 2015 23:32
Also update write.df, read.df to handle defaults better
docs/sparkr.md Outdated
Contributor

"supports operations similar to R data frames, [dplyr]"

add "etc"/"e.g."?

Contributor Author

Done now

@SparkQA

SparkQA commented May 29, 2015

Test build #33721 has finished for PR 6490 at commit d09703c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

Can we hide sparkR.init() and call it in sparkRSQL.init() internally?

Contributor Author

Yeah, I was thinking that having these two init calls is wasteful. But longer term, when we want to introduce, say, ML functionality that requires the SparkContext, it might be good to familiarize users with the idea of having a SparkContext around.

We can definitely do an implicit sparkR.init, though, if we find that no SparkContext exists (something like the logic we use in

if (exists(".sparkRjsc", envir = .sparkREnv)) {
)

Contributor

Yes, it seems reasonable since we only support one SparkContext at a time.
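
The implicit-initialization idea discussed in this thread could look roughly like the sketch below. It does not use SparkR itself: `.demoEnv`, `demoInit` and `demoSQLInit` are hypothetical stand-ins for `.sparkREnv`, `sparkR.init` and `sparkRSQL.init`, and the "context" is just a list. Only the `exists(".sparkRjsc", envir = ...)` check is taken from the comment above; everything else is an assumption made for illustration.

```r
# Hypothetical stand-in for SparkR's internal .sparkREnv (not the real one).
.demoEnv <- new.env()
assign("initCount", 0L, envir = .demoEnv)

# Stand-in for sparkR.init(): create the context once and cache it.
demoInit <- function(master = "local") {
  if (exists(".sparkRjsc", envir = .demoEnv)) {
    # Only one context is supported at a time, so reuse the cached one.
    return(get(".sparkRjsc", envir = .demoEnv))
  }
  count <- get("initCount", envir = .demoEnv) + 1L
  assign("initCount", count, envir = .demoEnv)
  sc <- list(master = master, id = count)
  assign(".sparkRjsc", sc, envir = .demoEnv)
  sc
}

# Stand-in for sparkRSQL.init(): fall back to an implicit demoInit()
# when the caller does not pass a context.
demoSQLInit <- function(sc = NULL) {
  if (is.null(sc)) {
    sc <- demoInit()
  }
  list(sparkContext = sc)
}

# The user makes a single call; the context is created behind the scenes.
sqlContext <- demoSQLInit()
```

Calling `demoSQLInit()` again reuses the cached context rather than creating a second one, which matches the one-SparkContext-at-a-time constraint mentioned in the reply above.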

@davies
Contributor

davies commented May 29, 2015

This looks good to me. Can we have a section about Hive, covering HiveContext and how to load a table from Hive?

@shivaram
Contributor Author

Thanks @davies for taking a look. I've now added the section on HiveContext.

@shivaram
Contributor Author

BTW @davies, I think we can add the automatic SparkContext creation in a separate PR. I'd like to keep the amount of code change in this PR small.

@SparkQA

SparkQA commented May 29, 2015

Test build #33759 has finished for PR 6490 at commit 408dce5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Contributor

davies commented May 29, 2015

@shivaram LGTM, merging this into master and 1.4.

asfgit pushed a commit that referenced this pull request May 29, 2015
This PR adds a new SparkR programming guide at the top-level. This will be useful for R users as our APIs don't directly match the Scala/Python APIs and as we need to explain SparkR without using RDDs as examples etc.

cc rxin davies pwendell

cc cafreeman -- Would be great if you could also take a look at this !

Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #6490 from shivaram/sparkr-guide and squashes the following commits:

d5ff360 [Shivaram Venkataraman] Add a section on HiveContext, HQL queries
408dce5 [Shivaram Venkataraman] Fix link
dbb86e3 [Shivaram Venkataraman] Fix minor typo
9aff5e0 [Shivaram Venkataraman] Address comments, use dplyr-like syntax in example
d09703c [Shivaram Venkataraman] Fix default argument in read.df
ea816a1 [Shivaram Venkataraman] Add a new SparkR programming guide Also update write.df, read.df to handle defaults better

(cherry picked from commit 5f48e5c)
Signed-off-by: Davies Liu <davies@databricks.com>
@asfgit asfgit closed this in 5f48e5c May 29, 2015
@SparkQA

SparkQA commented May 29, 2015

Test build #33768 timed out for PR 6490 at commit d5ff360 after a configured wait of 150m.

jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015