Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-29854]lpad and rpad built in function should throw Error or Exception for invalid length value #27024

Closed
wants to merge 1 commit into from

Conversation

07ARB
Copy link
Contributor

@07ARB 07ARB commented Dec 27, 2019

What changes were proposed in this pull request?

lpad and rpad built in function should throw Error or Exception for invalid length value Instead of empty string.

Why are the changes needed?

We should throw Error or Exception message, if user trying to perform LPAD/RPAD operation using invalid argument.

SELECT lpad('hi', 'ankit', '?') => ""   <previous behaviours>
SELECT lpad('hi', 'ankit', '?') => "Error in query: Invalid argument, TypeCheckFailure(argument 2 requires int type, however, ''ankit'' is of string type.);"   <After this PR>

SELECT rpad('hi', 'raj', '?') => ""   <previous behaviours>
SELECT lpad('hi', 'raj', '?') => "Error in query: Invalid argument, TypeCheckFailure(argument 2 requires int type, however, ''raj'' is of string type.);"   <After this PR>

Does this PR introduce any user-facing change?

YES
Screenshot 2019-12-27 at 4 09 46 PM

Screenshot 2019-12-27 at 4 15 55 PM
Screenshot 2019-12-27 at 4 16 52 PM

Screenshot 2019-12-27 at 4 17 05 PM

How was this patch tested?

Added new test case to check invalid length value.

@07ARB
Copy link
Contributor Author

07ARB commented Dec 27, 2019

@cloud-fan and @srowen , please review this PR.

@maropu
Copy link
Member

maropu commented Dec 28, 2019

ok to test

@07ARB
Copy link
Contributor Author

07ARB commented Dec 28, 2019

@maropu , Thank you

@maropu
Copy link
Member

maropu commented Dec 28, 2019

But, there are many functions having the same behaivour? e.g.,

scala> sql("SELECT substring_index('www.apache.org', '.', 'Must be integer')").show()
+----------------------------------------------------------------+
|substring_index(www.apache.org, ., CAST(Must be integer AS INT))|
+----------------------------------------------------------------+
|                                                            null|
+----------------------------------------------------------------+

Actually, implicit casts do so.

@07ARB
Copy link
Contributor Author

07ARB commented Dec 28, 2019

yes , we need to discuss whether we need to handle or not.
but what i feel, we should show error message, so that end user will get some meaningful information.

@maropu
Copy link
Member

maropu commented Dec 28, 2019

Yea, I think this behaivour is weird, but we just follow the hive behviour;
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala#L948-L950

We are planning to support ansi-compatible type coercion, so I think this issue will be resolved if that supported. cc: @gengliangwang @cloud-fan

@SparkQA
Copy link

SparkQA commented Dec 28, 2019

Test build #115885 has finished for PR 27024 at commit 2793ac6.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@07ARB
Copy link
Contributor Author

07ARB commented Dec 28, 2019

in Hive : SELECT lpad('hihhhhhhhhhhhhhhhhhhhhhhh', 'Expected int', '????????????');
Error: Error while compiling statement: FAILED: SemanticException [Error 10016]: Line 1:67 Argument type mismatch ''????????????'': lpad only takes INT/SHORT/BYTE types as 2-ths argument, got DECIMAL (state=42000,code=10016)

@maropu
Copy link
Member

maropu commented Dec 28, 2019

I just meant not the lpad behaivour but the type coercion behaivour;

hive> SELECT '1' + 1;
2.0

If you want to fix this, I think we need to update the implicit cast logics instead of the lpad logic.

@07ARB
Copy link
Contributor Author

07ARB commented Dec 28, 2019

oh , ok got it.

@github-actions
Copy link

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label May 16, 2020
@github-actions github-actions bot closed this May 17, 2020
@cloud-fan cloud-fan removed the Stale label May 17, 2020
@cloud-fan cloud-fan reopened this May 17, 2020
@cloud-fan
Copy link
Contributor

This behavior LGTM under the ansi mode. We should also update docs/sql-ref-ansi-compliance.md

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@maropu
Copy link
Member

maropu commented May 17, 2020

@07ARB Are you still here? Could you update this PR based on the @cloud-fan suggestion above?

@maropu maropu closed this in 7ca73f0 May 22, 2020
maropu added a commit that referenced this pull request May 22, 2020
…tion for invalid length input

### What changes were proposed in this pull request?

This PR intends to add trivial tests to check #27024 has already been fixed in the master.

Closes #27024

### Why are the changes needed?

For test coverage.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added tests.

Closes #28604 from maropu/SPARK-29854.

Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
(cherry picked from commit 7ca73f0)
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
6 participants