Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-28962][SQL][FOLLOW-UP] Add the parameter description for the Scala function API filter #27336

Closed
wants to merge 3 commits into from

Conversation

gatorsmile
Copy link
Member

What changes were proposed in this pull request?

This PR is a follow-up PR #25666 for adding the description and example for the Scala function API filter.

Why are the changes needed?

It is hard to tell which parameter is the index column.

Does this PR introduce any user-facing change?

No

How was this patch tested?

N/A

@gatorsmile
Copy link
Member Author

@SparkQA
Copy link

SparkQA commented Jan 23, 2020

Test build #117290 has finished for PR 27336 at commit 59860e7.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

All test passed, the R failure is a known flaky incoming feasibility test. cc @viirya

* checking CRAN incoming feasibility ...Error in .check_package_CRAN_incoming(pkgdir) : 
  dims [product 26] do not match the length of object [0]

@viirya
Copy link
Member

viirya commented Jan 23, 2020

@dongjoon-hyun Thanks for pinging me. Requested help from CRAN.

@SparkQA
Copy link

SparkQA commented Jan 23, 2020

Test build #117318 has finished for PR 27336 at commit ac1c3c0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Merged to master.

*
* @param column: the input array column
* @param f: (col, index) => predicate, the boolean predicate to filter the input column
* given the index. Indices start at 0.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this consistent within Spark that the indices parameter starts with 0 in higher-order functions? @ueshin @HyukjinKwon

Copy link
Member

@ueshin ueshin Feb 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ArrayTransform with index argument starts with 0.
We might need to change it from 1 (with a legacy config and migration guide)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the behavior of presto?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually presto's transform or filter don't take index argument.

I remember we had a discussion about the index argument in the PR for zip_with_index (#21121 (comment)).
And for filter, it was done later separately, but seems like the similar context https://issues.apache.org/jira/browse/SPARK-28962.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's too late to change now, let's keep using 0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
8 participants