[SPARK-46820][PYTHON] Fix error message regression by restoring new_msg #44859

Closed
wants to merge 12 commits into from

Contributor

@itholic itholic commented Jan 24, 2024

What changes were proposed in this pull request?

This PR proposes to fix an error message regression by restoring new_msg.

Why are the changes needed?

In the past few PRs, we mistakenly removed new_msg, which introduced an error message regression.
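
For context, here is a minimal sketch of what a helper like new_msg does, assuming it simply prefixes a message with the field name when one is known. `make_new_msg` is a hypothetical name used only for illustration, and the actual helper in PySpark may differ in detail.

```python
from typing import Callable, Optional


def make_new_msg(name: Optional[str]) -> Callable[[str], str]:
    """Build a message formatter (hypothetical helper, not the PySpark source).

    With a field name, the formatter prefixes the message with it.
    Without one, the message is returned unchanged.
    """
    if name is None:
        return lambda msg: msg
    return lambda msg: "%s: %s" % (name, msg)


new_msg = make_new_msg("field age")
print(new_msg("This field is not nullable, but got None"))
# prints: field age: This field is not nullable, but got None
```

Dropping this prefix step is what would leave the user with a message that no longer names the offending field.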

Does this PR introduce any user-facing change?

No API change, but the user-facing error message is improved.

Before

>>> from pyspark.sql.types import StructType, StructField, StringType, IntegerType
>>> schema = StructType([
...     StructField("name", StringType(), nullable=True),
...     StructField("age", IntegerType(), nullable=False)
... ])
>>> df = spark.createDataFrame([(["asd", None])], schema)
pyspark.errors.exceptions.base.PySparkValueError: [CANNOT_BE_NONE] Argument `obj` cannot be None.

After

>>> from pyspark.sql.types import StructType, StructField, StringType, IntegerType
>>> schema = StructType([
...     StructField("name", StringType(), nullable=True),
...     StructField("age", IntegerType(), nullable=False)
... ])
>>> df = spark.createDataFrame([(["asd", None])], schema)
pyspark.errors.exceptions.base.PySparkValueError: field age: This field is not nullable, but got None

How was this patch tested?

The existing CI should pass

Was this patch authored or co-authored using generative AI tooling?

No.

@itholic changed the title from "[SPARK-46820][PYTHON] Improve error message when createDataFrame fails nullability check" to "[SPARK-46820][PYTHON] Fix error message regression by restoring new_msg" on Jan 24, 2024
Contributor Author

itholic commented Jan 24, 2024

Thanks @HyukjinKwon for reviewing. I just fixed the regressions from the past few PRs and updated the PR title and description accordingly.

@itholic changed the title from "[SPARK-46820][PYTHON] Fix error message regression by restoring new_msg" to "[WIP][SPARK-46820][PYTHON] Fix error message regression by restoring new_msg" on Jan 24, 2024
@itholic marked this pull request as draft on January 24, 2024 08:26
@itholic changed the title from "[WIP][SPARK-46820][PYTHON] Fix error message regression by restoring new_msg" to "[SPARK-46820][PYTHON] Fix error message regression by restoring new_msg" on Feb 13, 2024
@itholic marked this pull request as ready for review on February 13, 2024 10:53
message_parameters={"arg_name": "obj"},
error_class="FIELD_NOT_NULLABLE",
message_parameters={
"field_name": name if name is not None else "",
Member


Seems like the error message would look weird if this is an empty string.

Contributor Author


Separated into two error classes, FIELD_NOT_NULLABLE and FIELD_NOT_NULLABLE_WITH_NAME. Please let me know if you have any other suggestions!
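
A rough sketch of the split, for illustration only. It assumes the second class is named FIELD_NOT_NULLABLE_WITH_NAME and that the class is chosen by whether a field name is available. `TEMPLATES`, `FieldNullabilityError`, and `check_not_none` are hypothetical names, and the template wording is an assumption modeled on the "After" output in the description, not the actual PySpark code.

```python
from typing import Any, Dict, Optional

# Hypothetical message templates; the exact wording in PySpark may differ.
TEMPLATES: Dict[str, str] = {
    "FIELD_NOT_NULLABLE": "Field is not nullable, but got None.",
    "FIELD_NOT_NULLABLE_WITH_NAME": (
        "{field_name}: This field is not nullable, but got None."
    ),
}


class FieldNullabilityError(ValueError):
    """Stand-in for a PySpark error that carries an error class and parameters."""

    def __init__(self, error_class: str, message_parameters: Dict[str, str]) -> None:
        self.error_class = error_class
        self.message_parameters = message_parameters
        message = TEMPLATES[error_class].format(**message_parameters)
        super().__init__("[%s] %s" % (error_class, message))


def check_not_none(obj: Any, nullable: bool, name: Optional[str] = None) -> None:
    """Raise if obj is None for a non-nullable field, choosing the class by name."""
    if obj is not None or nullable:
        return
    if name is None:
        # No field name known: use the class whose template has no placeholder,
        # so the message never starts with an empty name.
        raise FieldNullabilityError("FIELD_NOT_NULLABLE", {})
    raise FieldNullabilityError(
        "FIELD_NOT_NULLABLE_WITH_NAME", {"field_name": name}
    )
```

Keeping the nameless case in its own class avoids formatting an empty string into a template that expects a field name, which is the concern raised in the review comment above.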

@HyukjinKwon
Member

Merged to master.
