Conversation

@gaogaotiantian
Contributor

What changes were proposed in this pull request?

For iterator-based UDFs, the memory profiler currently tracks the function f directly. However, f is not necessarily the function that does the work: it may simply return another generator. We should instead call f(), take its return value, and track the code object of that returned generator.
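A minimal standalone sketch of the problem, using only the Python standard library. The names `inner`, `f`, and `code_to_profile` are hypothetical and are not taken from the patch; the sketch only illustrates why `f.__code__` can be the wrong code object and how the returned generator's `gi_code` attribute points at the right one.

```python
import types


def inner(batches):
    # The generator that actually does the per-batch work.
    for batch in batches:
        yield [x * 2 for x in batch]


def f(batches):
    # A plain function, not a generator function: it only delegates.
    return inner(batches)


def code_to_profile(func, *args):
    """Hypothetical helper: call func and pick the code object to track.

    If the result is a generator, its gi_code is the code that runs on
    iteration; otherwise fall back to the code of func itself.
    """
    result = func(*args)
    if isinstance(result, types.GeneratorType):
        return result, result.gi_code
    return result, func.__code__


result, code = code_to_profile(f, iter([[1, 2], [3]]))
print(code is inner.__code__)  # True: the work happens in inner
print(code is f.__code__)      # False: f only returns the generator
print(list(result))            # [[2, 4], [6]]
```

Tracking `f.__code__` here would attribute almost nothing to the lines that allocate memory, since `f` returns before any batch is processed.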

Why are the changes needed?

For Python data sources, we can't track the correct function. Users may also write UDFs with a similar structure, which we can't track either.

Does this PR introduce any user-facing change?

No

How was this patch tested?

A new test is added; it failed before the fix and passes after it.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions

JIRA Issue Information

=== Bug SPARK-55171 ===
Summary: Memory profiler for iter based UDF tracks the wrong function
Assignee: None
Status: Open
Affected: ["4.2.0"]


This comment was automatically generated by GitHub Actions

@HyukjinKwon
Member

Merged to master.
