[MINOR][SQL][PYTHON] Make projection creation separate of the output in FlatMapGroupsInPandasExec #23739

Closed
HyukjinKwon wants to merge 1 commit into apache:master from HyukjinKwon:SPARK-26832

@HyukjinKwon (Member) commented Feb 6, 2019

What changes were proposed in this pull request?

The projection creation in the output path looks confusing. At first glance it appears to create a new projection for each record, although it actually does not. We should pull it out into a separate step.
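
The point above can be illustrated with a small sketch. This is not the Spark code itself; it uses Python stand-ins, and `create_projection`, `rows`, and `columns` are hypothetical names chosen for illustration. It assumes the confusing form is a factory call written inline as the argument to a map, which is evaluated once before any record is processed:

```python
from typing import Callable, Dict, List, Tuple

# Counts how many times a projection is built (illustration only).
creations = 0


def create_projection(columns: List[str]) -> Callable[[Dict[str, object]], Tuple]:
    """Hypothetical stand-in for building a row projection."""
    global creations
    creations += 1

    def project(row: Dict[str, object]) -> Tuple:
        # Pick the requested columns out of the row, in order.
        return tuple(row[c] for c in columns)

    return project


rows = [{"id": 1, "v": 10.0}, {"id": 2, "v": 20.0}, {"id": 3, "v": 30.0}]
columns = ["id", "v"]

# Inline form: reads as if a projection were created per record, but the
# argument expression is evaluated once, before map starts iterating.
inline_result = list(map(create_projection(columns), rows))
inline_creations = creations

# Hoisted form: same behavior, but the single creation is explicit.
creations = 0
projection = create_projection(columns)
hoisted_result = list(map(projection, rows))
hoisted_creations = creations
```

Both forms build the projection exactly once and produce the same rows; the hoisted form only makes that easier to see when reading the code.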

How was this patch tested?

Existing tests should cover this.

SparkQA commented Feb 6, 2019

Test build #102059 has finished for PR 23739 at commit b9eed23.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

retest this please

@HeartSaVioR (Contributor) left a comment

LGTM. Nice finding.

SparkQA commented Feb 6, 2019

Test build #102060 has finished for PR 23739 at commit b9eed23.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

retest this please

SparkQA commented Feb 6, 2019

Test build #102062 has finished for PR 23739 at commit b9eed23.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

Thanks, @HeartSaVioR. Adding @BryanCutler and @icexelloss as well.

@HyukjinKwon changed the title from "[SPARK-26832][SQL][PYTHON] Avoid project creation per record at Python's grouped vectorized UDF" to "[MINOR][SQL][PYTHON] Make projection creation separate of the output in FlatMapGroupsInPandasExec" on Feb 7, 2019

@HyukjinKwon
Member Author

D'oh, sorry guys, I updated the JIRA and PR. I misread it.

@HyukjinKwon
Member Author

Hm, let me just leave this closed. Minor style fixes are not really encouraged anyway. I will likely touch this code soon since I am working on vectorized R native UDFs (gapply and dapply). Let me fix it if I happen to touch this code later.

@HyukjinKwon closed this on Feb 7, 2019

@HyukjinKwon
Member Author

BTW, thanks for reviewing this, guys.

@HyukjinKwon deleted the SPARK-26832 branch on March 3, 2020 01:19