[SPARK-26955][CORE] Align Spark's TimSort to jdk11 implementation #23858

Closed
wants to merge 1 commit into from

MaxGekk
Member

@MaxGekk MaxGekk commented Feb 21, 2019

What changes were proposed in this pull request?

Spark's TimSort deviates from the JDK 11 TimSort in a couple of places:

  • the JDK increased stackLen
  • the JDK added a break condition in mergeCollapse: n < 0

This PR aligns Spark's TimSort with the JDK implementation.
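
To make the mergeCollapse change easier to follow, here is a minimal, self-contained sketch of the run-stack logic, modeled on the JDK 11 form of the method as far as I understand it. It tracks run lengths only and does no actual merging of data. The class name MergeCollapseSketch, the fixed capacity of 64, and the helpers pushRun, runs and collapse are hypothetical and exist only for illustration; they are not part of Spark or the JDK.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeCollapseSketch {
    // Hypothetical fixed capacity; the real stackLen depends on the array length.
    private final int[] runLen = new int[64];
    private int stackSize = 0;

    // Push a run length and restore the stack invariants.
    void pushRun(int len) {
        runLen[stackSize++] = len;
        mergeCollapse();
    }

    // Merge adjacent runs until, for the runs near the top of the stack,
    // runLen[i-2] > runLen[i-1] + runLen[i] and runLen[i-1] > runLen[i].
    private void mergeCollapse() {
        while (stackSize > 1) {
            int n = stackSize - 2;
            if (n > 0 && runLen[n - 1] <= runLen[n] + runLen[n + 1] ||
                n > 1 && runLen[n - 2] <= runLen[n] + runLen[n - 1]) {
                if (runLen[n - 1] < runLen[n + 1]) {
                    n--;
                }
            } else if (n < 0 || runLen[n] > runLen[n + 1]) {
                break; // invariant is established
            }
            mergeAt(n);
        }
    }

    // Lengths-only stand-in for the real merge of runs n and n + 1.
    private void mergeAt(int n) {
        runLen[n] += runLen[n + 1];
        if (n == stackSize - 3) {
            runLen[n + 1] = runLen[n + 2];
        }
        stackSize--;
    }

    // Snapshot of the current run lengths, bottom of the stack first.
    List<Integer> runs() {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < stackSize; i++) {
            out.add(runLen[i]);
        }
        return out;
    }

    // Push the given run lengths in order and return the resulting stack.
    static List<Integer> collapse(int... lens) {
        MergeCollapseSketch s = new MergeCollapseSketch();
        for (int len : lens) {
            s.pushRun(len);
        }
        return s.runs();
    }

    public static void main(String[] args) {
        System.out.println(collapse(100, 40, 20, 10));    // [100, 40, 20, 10]
        System.out.println(collapse(120, 80, 25, 20, 30)); // [275]
    }
}
```

In this sketch the loop guard stackSize > 1 keeps n at zero or above, and n is only decremented when it is at least 1, so the n < 0 test never fires here; it reads as a defensive check.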

How was this patch tested?

With the existing test suites, in particular SorterSuite.

@MaxGekk MaxGekk changed the title from "Align Spark's TimSort to jdk11 implementation" to "[SPARK-26955][CORE] Align Spark's TimSort to jdk11 implementation" Feb 21, 2019
*/
  private void mergeCollapse() {
    while (stackSize > 1) {
      int n = stackSize - 2;
-     if ( (n >= 1 && runLen[n-1] <= runLen[n] + runLen[n+1])
-       || (n >= 2 && runLen[n-2] <= runLen[n] + runLen[n-1])) {
+     if (n > 0 && runLen[n-1] <= runLen[n] + runLen[n+1] ||
Member

Are you sure about removing the parentheses? That changes the semantics, I believe, to (((a and b) or c) and d). What's the implementation you're copying here, just to be sure?

Member Author

Member

Looks good. TIL that && has higher precedence than ||, so this doesn't change things. Actually, the whole change on this line doesn't cause any change in behavior, which is good; aligning the implementation exactly is useful, though.
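
The precedence point can be checked exhaustively. The following is a small sketch, not code from the PR: it compares the unparenthesized form with the explicitly grouped form over all 16 boolean inputs, and also counts how often it differs from the left-to-right reading (((a and b) or c) and d) raised in the review. The class and method names are made up for illustration.

```java
public class PrecedenceSketch {
    // The shape of the new condition: no explicit grouping.
    static boolean unparenthesized(boolean a, boolean b, boolean c, boolean d) {
        return a && b || c && d;
    }

    // The shape of the old condition: each conjunction grouped explicitly.
    static boolean parenthesized(boolean a, boolean b, boolean c, boolean d) {
        return (a && b) || (c && d);
    }

    // The strictly left-to-right reading the reviewer was worried about.
    static boolean leftToRight(boolean a, boolean b, boolean c, boolean d) {
        return ((a && b) || c) && d;
    }

    // True if the first two forms agree on every one of the 16 inputs.
    static boolean allInputsAgree() {
        for (int bits = 0; bits < 16; bits++) {
            boolean a = (bits & 1) != 0, b = (bits & 2) != 0;
            boolean c = (bits & 4) != 0, d = (bits & 8) != 0;
            if (unparenthesized(a, b, c, d) != parenthesized(a, b, c, d)) {
                return false;
            }
        }
        return true;
    }

    // Number of inputs on which the left-to-right reading gives another result.
    static int differencesFromLeftToRight() {
        int count = 0;
        for (int bits = 0; bits < 16; bits++) {
            boolean a = (bits & 1) != 0, b = (bits & 2) != 0;
            boolean c = (bits & 4) != 0, d = (bits & 8) != 0;
            if (unparenthesized(a, b, c, d) != leftToRight(a, b, c, d)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(allInputsAgree());             // true
        System.out.println(differencesFromLeftToRight()); // 2
    }
}
```

The two differing inputs are those with a and b true and d false, which is why the left-to-right reading would have been a real semantic change had Java parsed the expression that way.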

@SparkQA

SparkQA commented Feb 21, 2019

Test build #102582 has finished for PR 23858 at commit 0bd8075.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MaxGekk
Member Author

MaxGekk commented Feb 21, 2019

jenkins, retest this, please

@SparkQA

SparkQA commented Feb 22, 2019

Test build #102603 has finished for PR 23858 at commit 0bd8075.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Feb 22, 2019

Merged to master

@srowen srowen closed this in 1304974 Feb 22, 2019
@MaxGekk MaxGekk deleted the timsort-java-alignment branch September 18, 2019 15:56