[SPARK-46413][PYTHON] Validate returnType of Arrow Python UDF #44362

Closed
xinrong-meng wants to merge 3 commits

@xinrong-meng (Member) commented Dec 14, 2023

What changes were proposed in this pull request?

Validate the returnType of an Arrow Python UDF, and raise an error when the type is not supported in conversion to Arrow.

Why are the changes needed?

Better error handling and consistency with other types of UDFs.

Does this PR introduce any user-facing change?

Yes, now we raise an error when the given returnType is not supported.

>>> from pyspark.sql.functions import udf
>>> from pyspark.sql.types import VarcharType
>>> udf(lambda x: x, returnType=VarcharType(10), useArrow=True)
Traceback (most recent call last):
...
pyspark.errors.exceptions.base.PySparkTypeError: [UNSUPPORTED_DATA_TYPE_FOR_ARROW_CONVERSION] VarcharType(10) is not supported in conversion to Arrow.
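
The example above suggests the check happens when the UDF is created, not when it first runs. A minimal sketch of that idea in plain Python follows. It is a toy model and not PySpark's implementation: `IntegerType` and `VarcharType` here are hypothetical stand-ins for the real classes, and `ARROW_CONVERTIBLE`, `UnsupportedReturnTypeError`, and `arrow_udf` are names invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


# Hypothetical stand-ins for Spark data types (not the real pyspark classes).
@dataclass(frozen=True)
class IntegerType:
    pass


@dataclass(frozen=True)
class VarcharType:
    length: int


# Assumption: in this toy model only IntegerType converts to Arrow.
ARROW_CONVERTIBLE = (IntegerType,)


class UnsupportedReturnTypeError(TypeError):
    """Hypothetical error standing in for PySparkTypeError."""


def arrow_udf(func: Callable[..., Any], return_type: Any) -> Callable[..., Any]:
    """Return func unchanged, but reject an unsupported return type up front."""
    if not isinstance(return_type, ARROW_CONVERTIBLE):
        # Fail at definition time so the user sees the problem immediately,
        # instead of hitting a conversion error later during execution.
        raise UnsupportedReturnTypeError(
            "[UNSUPPORTED_DATA_TYPE_FOR_ARROW_CONVERSION] "
            f"{return_type!r} is not supported in conversion to Arrow."
        )
    return func
```

In this sketch, `arrow_udf(lambda x: x, IntegerType())` returns a callable, while `arrow_udf(lambda x: x, VarcharType(10))` raises immediately, before any data is processed.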

How was this patch tested?

Unit tests.

Was this patch authored or co-authored using generative AI tooling?

No.

@xinrong-meng (Member Author) commented

ERROR:  Error installing bundler:
	The last version of bundler (>= 0) to support your Ruby & RubyGems was 2.4.22. Try installing it with `gem install bundler -v 2.4.22`
	bundler requires Ruby version >= 3.0.0. The current ruby version is 2.7.0.0.
Error: Process completed with exit code 1.

@xinrong-meng (Member Author) commented

@ueshin @HyukjinKwon may I get a review please?

@HyukjinKwon (Member) commented

Merged to master.
