coalesce with string and int failed #11860

jayzhan211 · 2024-04-28T00:58:33Z

What happens?

I'm not sure if the result is expected, but it seems weird to me

D select coalesce('a', 1);
Error: Conversion Error: Could not convert string 'a' to INT32
D select coalesce(1, 'a');
┌──────────────────┐
│ COALESCE(1, 'a') │
│      int32       │
├──────────────────┤
│                1 │
└──────────────────┘

I expect either they have errors or expected results, 'a' and 1

To Reproduce

D select coalesce('a', 1);
Error: Conversion Error: Could not convert string 'a' to INT32
D select coalesce(1, 'a');
┌──────────────────┐
│ COALESCE(1, 'a') │
│      int32       │
├──────────────────┤
│                1 │
└──────────────────┘

OS:

Macos M2

DuckDB Version:

v0.10.0 20b1486

DuckDB Client:

cli

Full Name:

Jay Zhan

Affiliation:

None

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

Yes, I have

The text was updated successfully, but these errors were encountered:

ewencp · 2024-05-06T04:14:48Z

The difference in behavior here is due to a combination of type resolution and short circuiting in execution of the COALESCE such that after finding a non-null value, no other values are converted. In the type resolution, the string literal is just being treated as an unknown. DuckDB is lax when resolving types for string literals with a second operand and seems to just assume the string can convert to the normalized type of the second operand.

Then the execution of the COALESCE works from left to right on the arguments and once it finds a non-NULL, short-circuits and doesn't evaluate the rest of them, including converting the literal to an integer. This explains why the first example works, but in the second example the first check on an argument to COALESCE needs to convert the literal and fails.

The "right" fix for this kinda depends.

In, e.g., Postgres, the implicit conversion for string literals to non-string-like types isn't permitted -- it'd only map to string-like types implicitly and require an explicit cast. Arguably even if you wanted to be more lax in DuckDB, this seems like something that should be caught during constant folding in the optimizer. Alternatively, the type resolution code could be adjusted to check that the literal is actually convertible to the type it's assuming. Or, despite being a bit redundant, possibly just doing both (the latter happens earlier, only once per query/planning, and could potentially give more useful error messages, the former just seems worthwhile to avoid redundant work (which I think is being performed, but haven't verified).

This could also be considered "correct" behavior. But doesn't seem great since it's confusing and if the string literal argument was a column reference to a textual type it generates an error re: types and requiring an explicit cast.

On test coverage, from a quick review it seems like the relevant tests only use explicit casts and probably aren't covering quite enough ordering cases as the error tests are primarily focused (for good reason) on NULL tests.

ewencp · 2024-05-06T20:27:14Z

Actually it looks like the constant folding rule in the optimizer is trying to evaluate the cast to simplify, but since the same execution code is used, the optimization is also short circuited. It might be possible to catch/rethrow ConversionException specifically in ExpressionExecutor to make the error in the first case be caught earlier (currently it just fails to get optimized, then fails again in query evaluation). But to force different behavior in the optimizer to catch the type error, the ExpressionExecutor would need more context about the phase it's being executed in (optimization or execution).

Arguably this simply isn't a bug since the lax implicit conversion rules (which are specified in the SQL docs) combined with an expectation of lazy evaluation would mean the current behavior makes sense and is expected. But I'm not one of the duckdb devs, would be good to have one of them weigh in.

jayzhan211 added the needs triage label Apr 28, 2024

szarnyasg added the reproduced label Apr 28, 2024

duckdblabs-bot removed the needs triage label Apr 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

coalesce with string and int failed #11860

coalesce with string and int failed #11860

jayzhan211 commented Apr 28, 2024

ewencp commented May 6, 2024

ewencp commented May 6, 2024

coalesce with string and int failed #11860

coalesce with string and int failed #11860

Comments

jayzhan211 commented Apr 28, 2024

What happens?

To Reproduce

OS:

DuckDB Version:

DuckDB Client:

Full Name:

Affiliation:

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

Did you include all relevant data sets for reproducing the issue?

Did you include all code required to reproduce the issue?

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

ewencp commented May 6, 2024

ewencp commented May 6, 2024