Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-8232][SQL] Add sort_array support #7581

Closed
wants to merge 5 commits into from

Conversation

chenghao-intel
Copy link
Contributor

Add expression sort_array support.

* the array elements and returns it.
*/
case class SortArray(child: Expression)
extends UnaryExpression with ExpectsInputTypes with CodegenFallback {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use CodegenFallback,as the sorting in codegen is quite complex, as we need to generate the java code to access the catalyst data type. (e.g. Seq)

@SparkQA
Copy link

SparkQA commented Jul 22, 2015

Test build #38018 has finished for PR 7581 at commit f7974ce.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class SortArray(child: Expression)

@chenghao-intel
Copy link
Contributor Author

cc @rxin

@chenghao-intel
Copy link
Contributor Author

@rxin, can you review this for me?

def sort_array(col):
"""
Collection function: sorts the input array for the given column in ascending order.
:param col: name of column or expression
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a new line here, or the parameter can not be rendered correctly in doc.

@SparkQA
Copy link

SparkQA commented Jul 27, 2015

Test build #38489 has finished for PR 7581 at commit ceb8a73.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class SortArray(child: Expression)

override def dataType: DataType = child.dataType
override def inputTypes: Seq[AbstractDataType] = Seq(ArrayType)

override def checkInputDataTypes(): TypeCheckResult = child.dataType match {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we have checkInputDataTypes, do we still need ExpectsInputTypes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we probably need it, as the ExpectsInputTypes does not support the input type as ArrayType(AtomicType, _)

@SparkQA
Copy link

SparkQA commented Jul 29, 2015

Test build #38798 has finished for PR 7581 at commit c725f6f.

  • This patch fails Python style tests.
  • This patch does not merge cleanly.
  • This patch adds the following public classes (experimental):
    • case class SortArray(base: Expression, ascendingOrder: Expression)

@SparkQA
Copy link

SparkQA commented Jul 29, 2015

Test build #38802 has finished for PR 7581 at commit 384e373.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class SortArray(base: Expression, ascendingOrder: Expression)

@chenghao-intel
Copy link
Contributor Author

@davies any more comments?


(left, right) => {
if (left == null && right == null) {
true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

@SparkQA
Copy link

SparkQA commented Jul 30, 2015

Test build #39033 has finished for PR 7581 at commit 25fbede.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class SortArray(base: Expression, ascendingOrder: Expression)

@chenghao-intel
Copy link
Contributor Author

Thank you @davies I've updated the code by using the ArrayData.

@SparkQA
Copy link

SparkQA commented Jul 31, 2015

Test build #39163 has finished for PR 7581 at commit 664c960.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class SortArray(base: Expression, ascendingOrder: Expression)

@davies
Copy link
Contributor

davies commented Aug 1, 2015

LGTM, will fix the conflict while merging.

@davies
Copy link
Contributor

davies commented Aug 1, 2015

Merged into master, thanks!

@asfgit asfgit closed this in 67ad4e2 Aug 1, 2015
@davies
Copy link
Contributor

davies commented Aug 1, 2015

There are logical conflicts, reverted, will fix conflict again.

asfgit pushed a commit that referenced this pull request Aug 1, 2015
This PR is based on #7581 , just fix the conflict.

Author: Cheng Hao <hao.cheng@intel.com>
Author: Davies Liu <davies@databricks.com>

Closes #7851 from davies/sort_array and squashes the following commits:

a80ef66 [Davies Liu] fix conflict
7cfda65 [Davies Liu] Merge branch 'master' of github.com:apache/spark into sort_array
664c960 [Cheng Hao] update the sort_array by using the ArrayData
276d2d5 [Cheng Hao] add empty line
0edab9c [Cheng Hao] Add asending/descending support for sort_array
80fc0f8 [Cheng Hao] Add type checking
a42b678 [Cheng Hao] Add sort_array support
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants