[BUG] Failed to convert string with invisible characters to float #10724

Closed
thirtiseven opened this issue Apr 18, 2024 · 0 comments · Fixed by NVIDIA/spark-rapids-jni#1978
Labels
bug Something isn't working

Describe the bug
If a numeric string starts or ends with invisible characters, Spark converts it to double/float normally, but the plugin returns null.

Steps/Code to reproduce bug
The “value” column is of StringType. Each value has length 5 and contains an invisible character; the actual contents are as follows:

1234^@   ---\u0000
1000^A   ---\u0001

native Spark:

scala> val df = spark.read.parquet("nullcharacter_test_parquet")
df: org.apache.spark.sql.DataFrame = [value: string]

scala> df.selectExpr("value", "length(value)", "cast(value as float)").show(false)
+-----+-------------+------+
|value|length(value)|value |
+-----+-------------+------+
|1234|5            |1234.0|
|1000|5            |1000.0|
+-----+-------------+------+

spark-rapids:

scala> val df = spark.read.parquet("nullcharacter_test_parquet")
df: org.apache.spark.sql.DataFrame = [value: string]

scala> df.selectExpr("value", "length(value)", "cast(value as float)").show(false)


+-----+-------------+-----+
|value|length(value)|value|
+-----+-------------+-----+
|1234|5            |null |
|1000|5            |null |
+-----+-------------+-----+
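
One possible explanation for the CPU result, offered as an assumption rather than a confirmed root cause: if the CPU cast ultimately goes through the JVM's float parsing, then leading and trailing characters at or below U+0020 are ignored, because `java.lang.Float.parseFloat` strips whitespace as if by `String.trim()`. The sketch below uses plain Scala with no Spark dependency. The object name `InvisibleCharCast` and the helper `parse` are hypothetical names introduced for illustration, and the sample strings mirror the two rows above.

```scala
object InvisibleCharCast {
  // Hypothetical inputs mirroring the issue: four digits plus one control character.
  val samples: Seq[String] = Seq("1234\u0000", "1000\u0001")

  // java.lang.Float.parseFloat ignores leading/trailing characters <= U+0020,
  // as if String.trim() had been applied first.
  def parse(s: String): Float = java.lang.Float.parseFloat(s)

  def main(args: Array[String]): Unit = {
    samples.foreach { s =>
      // The length still counts the invisible character; parsing ignores it.
      println(s"length=${s.length} parsed=${parse(s)}")
    }
  }
}
```

Both samples report a length of 5 and parse to 1234.0 and 1000.0 respectively, which matches the native Spark output. A GPU implementation that only strips ordinary spaces would reject the same inputs.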

thirtiseven added the "bug" (Something isn't working) and "? - Needs Triage" (Need team to review and classify) labels on Apr 18, 2024
thirtiseven self-assigned this on Apr 19, 2024
mattahrens removed the "? - Needs Triage" (Need team to review and classify) label on Apr 24, 2024