Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-22848][SQL] Eliminate mutable state from Stack #20035

Closed
wants to merge 4 commits into from

Conversation

kiszk
Copy link
Member

@kiszk kiszk commented Dec 20, 2017

What changes were proposed in this pull request?

This PR eliminates mutable states from the generated code for Stack.

How was this patch tested?

Existing test suites

s"""
|$code
|$wrapperClass<InternalRow> ${ev.value} = $wrappedArray;
""".stripMargin, isNull = "false")
Copy link
Member Author

@kiszk kiszk Dec 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change does not use inline = true at ctx.addMutableState for correct code generation.

@kiszk kiszk changed the title Enable to execute generated code of function in Generator [SPARK-22848][SQL] Enable to execute generated code of function in Generator Dec 20, 2017
@kiszk kiszk changed the title [SPARK-22848][SQL] Enable to execute generated code of function in Generator [SPARK-22848][SQL] Eliminate mutable state from Stack Dec 20, 2017
@kiszk
Copy link
Member Author

kiszk commented Dec 20, 2017

This PR comes from this discussion.

@SparkQA
Copy link

SparkQA commented Dec 20, 2017

Test build #85189 has finished for PR 20035 at commit 2b65856.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

ev.copy(code = code, isNull = "false")
ev.copy(code =
s"""
|InternalRow[] $rowData = new InternalRow[$numRows];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this creates a large array every time, and I don't think we have data copy issues for generator expressions...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I did not imagine that numRows is large.
I will revert the code for rowData.

@SparkQA
Copy link

SparkQA commented Dec 20, 2017

Test build #85191 has finished for PR 20035 at commit 81306c4.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 20, 2017

Test build #85194 has finished for PR 20035 at commit d31ccd7.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 20, 2017

Test build #85198 has finished for PR 20035 at commit f0163e7.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member Author

kiszk commented Dec 21, 2017

Jenkins, retest this please

@cloud-fan
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented Dec 21, 2017

Test build #85223 has finished for PR 20035 at commit f0163e7.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member

viirya commented Dec 21, 2017

LGTM

@kiszk
Copy link
Member Author

kiszk commented Dec 21, 2017

Jenkins, retest this please

@viirya
Copy link
Member

viirya commented Dec 21, 2017

I think the test failure is not related to this change, but the ongoing work to upgrade pyarrow.

@cloud-fan
Copy link
Contributor

yea it's failing globally, I'm merging this PR, thanks!

@asfgit asfgit closed this in cb9fc8d Dec 21, 2017
@SparkQA
Copy link

SparkQA commented Dec 21, 2017

Test build #85237 has finished for PR 20035 at commit f0163e7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member

viirya commented Dec 21, 2017

I ran the test (HDFSMetadataLogSuite) locally. It should be fine as it passes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants