[SPARK-9317] [SPARKR] Change show to print DataFrame entries #8360

Closed

felixcheung
Member

Small update to DataFrame API in SparkR

@shivaram

@shivaram
Contributor

Jenkins, ok to test

@SparkQA

SparkQA commented Aug 21, 2015

Test build #41375 has finished for PR 8360 at commit e3ae104.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Contributor

davies commented Aug 21, 2015

@felixcheung show() will be called when the result of an expression is a DataFrame, so show() should be cheap. But showDF() is not, especially when you have a large dataset and multiple stages (groupBy or join).
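
The cost difference described above can be illustrated with a minimal sketch. This is not SparkR code: the `LazyFrame` class, its `compute` callback, and the `schema_repr` and `show_rows` methods are hypothetical names invented for illustration. It assumes a frame whose schema is available as metadata, while fetching rows requires running the whole job.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Row = Tuple[object, ...]


@dataclass
class LazyFrame:
    """Hypothetical stand-in for a distributed DataFrame (not the SparkR API)."""

    columns: List[str]
    compute: Callable[[], List[Row]]  # pretend this launches a multi-stage job
    jobs_run: int = 0                 # counts how often the job was triggered

    def schema_repr(self) -> str:
        # Cheap: uses metadata only and never calls compute().
        return "DataFrame[" + ", ".join(self.columns) + "]"

    def show_rows(self, n: int = 20) -> str:
        # Expensive: runs the job before the first n rows can be rendered.
        self.jobs_run += 1
        rows = self.compute()[:n]
        lines = [" ".join(self.columns)]
        lines += [" ".join(str(value) for value in row) for row in rows]
        return "\n".join(lines)


df = LazyFrame(["name", "age"], lambda: [("a", 1), ("b", 2)])
print(df.schema_repr())  # no job is run
print(df.show_rows(1))   # one job is run
```

If printing a frame implicitly called something like `show_rows`, every expression that evaluates to a frame at the prompt would trigger a job, which is the concern raised here.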

@shivaram
Contributor

Hmm, I can see both sides of this. The problem is that R users are used to seeing the first few rows of a DataFrame when they just print its name, and this change would restore that behavior.

cc @rxin

@felixcheung
Member Author

From what I can infer from the original JIRA, we are trying to match R data.frame behavior.
I think it is handy. It is easy to think of alternative ways to do this (head(df), showDF(df)), but those need to be learned.

@shivaram
Contributor

@rxin Any thoughts on this? If we want to change show, it might be a good idea to get this into 1.5.

@rxin
Contributor

rxin commented Aug 25, 2015

Unfortunately I don't think we should make this change right now.

We can, however, make a change in the future to automatically show if the dataframe is local (or small). I've been thinking about that -- it'd be great to do that for Python, Scala, and R.

@shivaram
Contributor

Alright, I'll update the JIRA, and since I guess we want this for all languages, I'll mark it as a SQL issue as well.

@felixcheung I think we can close this PR for now and revisit this when we have support from the Scala backend for this.

@felixcheung
Member Author

OK, closing. Let me know what I can help with!

5 participants