[SPARK-43302][SQL][FOLLOWUP] Code cleanup for PythonUDAF #41142

Closed
cloud-fan wants to merge 8 commits


cloud-fan
Contributor

What changes were proposed in this pull request?

This is a follow-up to #40739 that does some code cleanup:

  1. Remove the pattern PYTHON_UDAF, as it is not used by any rule.
  2. Add PythonFuncExpression.evalType for convenience: Catalyst rules (including third-party extensions) may want to get the eval type of a Python function, regardless of whether it is a UDF or a UDAF.
  3. Update the Python profiler to use PythonUDAF.resultId instead of AggregateExpression.resultId, to be consistent with PythonUDF.
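
To illustrate items 2 and 3, here is a minimal Python sketch of the idea. The actual change lives in Spark's Scala Catalyst code; the class names below mirror the names in this PR, but the fields, the `describe` helper, and the eval type values are assumptions made up for illustration, not Spark's API.

```python
from dataclasses import dataclass

# Illustrative eval type values; placeholders for this sketch only.
SQL_BATCHED_UDF = 100
SQL_GROUPED_AGG_PANDAS_UDF = 202


@dataclass(frozen=True)
class PythonFuncExpression:
    """Hypothetical shared base: every Python function expression
    exposes its eval type and its own result id."""
    name: str
    eval_type: int
    result_id: int


class PythonUDF(PythonFuncExpression):
    """Stand-in for a scalar Python UDF expression."""


class PythonUDAF(PythonFuncExpression):
    """Stand-in for a Python aggregate function expression."""


def describe(expr: PythonFuncExpression) -> str:
    # A rule (or the profiler) reads eval_type and result_id from the shared
    # base, without checking whether expr is a UDF or a UDAF.
    return f"{expr.name}: eval_type={expr.eval_type}, result_id={expr.result_id}"


exprs = [
    PythonUDF("plus_one", SQL_BATCHED_UDF, 1),
    PythonUDAF("mean_udf", SQL_GROUPED_AGG_PANDAS_UDF, 2),
]
descriptions = [describe(e) for e in exprs]
```

With a shared accessor, callers need no type-specific branch, and a profiler keyed on the function's own result id treats UDFs and UDAFs the same way.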

Why are the changes needed?

code cleanup

Does this PR introduce any user-facing change?

no

How was this patch tested?

existing tests

@cloud-fan
Contributor Author

cc @HyukjinKwon

            sc.profiler_collector.add_profiler(id, memory_profiler)
        else:
            judf = self._judf
            jPythonUDF = judf.apply(_to_seq(sc, cols, _to_java_column))
        return Column(jPythonUDF)

    def _get_UDF_id(self, jexpr: JavaObject) -> int:
Member

I think this should be lowercased.

Suggested change
- def _get_UDF_id(self, jexpr: JavaObject) -> int:
+ def _get_udf_id(self, jexpr: JavaObject) -> int:

python/pyspark/sql/udf.py (outdated)
@yaooqinn
Member

Thanks, merged to master.

@yaooqinn yaooqinn closed this in fddf25a May 17, 2023
3 participants