Spark 7998 freq item api #6919

Closed
dwmclary wants to merge 11 commits

Conversation

dwmclary
Contributor

Here's "a better frequent item API" which provides a DataFrame with each ArrayBuffer expanded into a column. There's surely some improvement that could be done here, but I think this is in the spirit of what the JIRA was asking for.
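To make the shape change concrete, here is a minimal plain-Python sketch of the reshaping described above. It is not the code from this PR. It assumes the existing result is a single row holding one array of frequent items per input column, in fields named `<col>_freqItems`, and that "expanded into a column" means one output column per input column, with shorter arrays padded with `None`. The function name `expand_freq_items` and the padding rule are hypothetical.

```python
from itertools import zip_longest


def expand_freq_items(single_row):
    """Reshape {"a_freqItems": [...], "b_freqItems": [...]} into (columns, rows).

    `single_row` is a plain dict standing in for the one-row result
    (assumed layout: one array of frequent items per input column).
    """
    suffix = "_freqItems"
    # Recover the original column names by stripping the assumed suffix.
    columns = [
        name[: -len(suffix)] if name.endswith(suffix) else name
        for name in single_row
    ]
    # zip_longest pads the shorter arrays with None so all rows have equal width.
    rows = list(zip_longest(*single_row.values()))
    return columns, rows


columns, rows = expand_freq_items(
    {"a_freqItems": [1, 2, 3], "b_freqItems": ["x", "y"]}
)
print(columns)
for row in rows:
    print(row)
```

One consequence of this layout is that values in the same output row are unrelated to each other: each column is an independent list of frequent items, so a row is only a positional pairing.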

@davies
Contributor

davies commented Jun 21, 2015

ok to test

@SparkQA

SparkQA commented Jun 21, 2015

Test build #35377 has finished for PR 6919 at commit 8dec609.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 21, 2015

Test build #35379 has finished for PR 6919 at commit d0ce7d5.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 21, 2015

Test build #35380 has finished for PR 6919 at commit 6f25872.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 21, 2015

Test build #35382 has finished for PR 6919 at commit ff676c9.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 21, 2015

Test build #35396 has finished for PR 6919 at commit 7ce1248.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 21, 2015

Test build #35397 has finished for PR 6919 at commit 8c239c7.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 21, 2015

Test build #35398 has finished for PR 6919 at commit 80c841b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dwmclary
Contributor Author

So, I'm wondering if the Scala-specific method actually needs to re-implement, or if it would be cleaner to just call mutable.copyToArray and pass it to the agnostic function. Any thoughts @rxin?

@dwmclary
Contributor Author

@davies any review comments?

@davies
Contributor

davies commented Jun 30, 2015

ping @rxin

@dwmclary
Contributor Author

dwmclary commented Jul 6, 2015

ping @rxin ?

@rxin
Contributor

rxin commented Jul 8, 2015

Sorry need some time to think about this.

@dwmclary
Contributor Author

dwmclary commented Jul 8, 2015

No problem -- just wanted to make sure it was on your radar.

@rxin
Contributor

rxin commented Aug 5, 2015

@dwmclary do you mind closing this pull request?

Discussed with Xiangrui Meng and Burak Yavuz offline. We are not going to change the API, but just update the documentation to explain more clearly the schema and how to get the frequent values out.
In 1.6, we will likely create a frequent items UDAF.
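For readers looking for "how to get the frequent values out" of the existing API, a small sketch follows. It assumes the result is a single row whose fields are named `<col>_freqItems`, each holding an array of frequent items; the helper name `frequent_values` is hypothetical, and a plain dict stands in for the result row so the snippet runs without Spark.

```python
def frequent_values(row, column):
    """Return the frequent items reported for `column` as a Python list.

    `row` is anything indexable by field name, such as a dict or a
    PySpark Row. The "<col>_freqItems" field name is an assumption
    about the result schema.
    """
    return list(row[column + "_freqItems"])


# A dict standing in for the single result row.
result_row = {"a_freqItems": [1, 3], "b_freqItems": ["x"]}
print(frequent_values(result_row, "a"))
print(frequent_values(result_row, "b"))
```

With PySpark, the row would presumably come from something like `df.stat.freqItems(["a", "b"], 0.4).first()`, where `0.4` is an example support threshold.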

dwmclary closed this Aug 5, 2015

@dwmclary
Contributor Author

dwmclary commented Aug 5, 2015

Closed.
