Skip to content

Conversation

@MrBago
Copy link
Contributor

@MrBago MrBago commented Nov 9, 2019

What changes were proposed in this pull request?

On the worker we express lambda functions as strings and then eval them to create a "mapper" function. This make the code hard to read & limits the # of arguments a udf can support to 256 for python <= 3.6.

This PR rewrites the mapper functions as nested functions instead of "lambda strings" and allows passing in more than 255 args.

Why are the changes needed?

The jira ticket associated with this issue describes how MLflow uses udfs to consume columns as features. This pattern isn't unique and a limit of 255 features is quite low.

Does this PR introduce any user-facing change?

Users can now pass more than 255 cols to a udf function.

How was this patch tested?

Added a unit test for passing in > 255 args to udf.

@mengxr mengxr self-requested a review November 9, 2019 00:20
@mengxr
Copy link
Contributor

mengxr commented Nov 9, 2019

LGTM. Code is much easier to read now.

@SparkQA
Copy link

SparkQA commented Nov 9, 2019

Test build #113481 has finished for PR 26442 at commit 14c6432.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 9, 2019

Test build #113482 has finished for PR 26442 at commit e1bcda1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 9, 2019

Test build #113485 has finished for PR 26442 at commit db40059.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@asfgit asfgit closed this in 8152a87 Nov 9, 2019
@mengxr
Copy link
Contributor

mengxr commented Nov 9, 2019

Merged into master. Thanks!

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

late LGTM too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants