Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

caseWithExpression (case + toInt16OrNull) unexpected behaviour (All CH versions) #9596

Closed
den-crane opened this issue Mar 10, 2020 · 6 comments · Fixed by #50833
Closed

caseWithExpression (case + toInt16OrNull) unexpected behaviour (All CH versions) #9596

den-crane opened this issue Mar 10, 2020 · 6 comments · Fixed by #50833
Labels
bug Confirmed user-visible misbehaviour in official release comp-nullable Nulls related st-fixed unexpected behaviour warmup task The task for new ClickHouse team members. Low risk, moderate complexity, no urgency.

Comments

@den-crane
Copy link
Contributor

den-crane commented Mar 10, 2020

SELECT
    d,
    toInt16OrNull(d),
    caseWithExpression(d, 'a', 3, toInt16OrZero(d)) AS case_zero,
    caseWithExpression(d, 'a', 3, toInt16OrNull(d)) AS case_null,
    if(d = 'a', 3, toInt16OrZero(d)) AS if_zero,
    if(d = 'a', 3, toInt16OrNull(d)) AS if_null
FROM
(
    SELECT arrayJoin(['', '1', 'a']) AS d
)
ORDER BY
    case_zero ASC,
    d ASC

case_null is wrong for 'a'
┌─d─┬─toInt16OrNull(d)─┬─case_zero─┬─case_null─┬─if_zero─┬─if_null─┐
│   │             ᴺᵁᴸᴸ │         0 │      ᴺᵁᴸᴸ │       0 │    ᴺᵁᴸᴸ │
│ 1 │                1 │         1 │         1 │       1 │       1 │
│ a │             ᴺᵁᴸᴸ │         3 │      ᴺᵁᴸᴸ │       3 │       3 │
└───┴──────────────────┴───────────┴───────────┴─────────┴─────────┘

expected
┌─d─┬─toInt16OrNull(d)─┬─case_zero─┬─case_null─┬─if_zero─┬─if_null─┐
│   │             ᴺᵁᴸᴸ │         0 │      ᴺᵁᴸᴸ │       0 │    ᴺᵁᴸᴸ │
│ 1 │                1 │         1 │         1 │       1 │       1 │
│ a │             ᴺᵁᴸᴸ │         3 │         3 │       3 │       3 │
└───┴──────────────────┴───────────┴───────────┴─────────┴─────────┘

@den-crane den-crane added bug Confirmed user-visible misbehaviour in official release unexpected behaviour labels Mar 10, 2020
@den-crane den-crane changed the title caseWithExpression unexpected behaviour (All CH versions) caseWithExpression (case + toInt16OrNull) unexpected behaviour (All CH versions) Mar 10, 2020
@alexey-milovidov
Copy link
Member

Known issue: lack of proper support for Nullable in function transform.

@den-crane
Copy link
Contributor Author

den-crane commented Mar 10, 2020

I remember that I saw something similar but I cannot find the existing issue.

@stavrolia stavrolia self-assigned this Mar 10, 2020
@alexey-milovidov
Copy link
Member

alexey-milovidov commented Mar 10, 2020

caseWithExpression is implemented with function transform
See caseWithExpression.cpp, executeImpl.

The function transform is using "default implementation for nulls", that means if one of the arguments is NULL then the result is also NULL.
See transform.cpp, and
virtual bool useDefaultImplementationForNulls() const { return true; } is by default for IFunction.

It cannot be easily extended for proper support of Nullable, because it is implemented only for a few carefully optimized cases (num -> num, num -> string, string -> num, string -> string) but lacks the generic implementation that can work for all possible types, including Nullable.

Let's note that the behaviour of transform function is very similar to Join (see Join.h, Join.cpp), more specifically - ANY LEFT JOIN. We create a dictionary, then search and substitute the values from this dictionary. We can use Join to generalize the implementation of transform function.

(for this step, ask @amosbird (Telegram: @amosbird) or @4ertus2)

The dictionary inside the function transform is build lazily inside the initialize method, under mutex. It will be worth doing to rewrite it with IFunctionOverloadResolver that will initialize it from constant arguments and return IExecutableFunction.

(for this step, ask @KochetovNicolai)

@alexey-milovidov
Copy link
Member

Also it's a good idea to use ColumnUnique to implement generic version of function transform.

We can convert source array elements to ColumnUnique to represent them as a dictionary to offsets in the destination array.
Then we will use this dictionary for lookups and copy elements with IColumn::insertFrom.

If the types of source column and source array to transform or destination array and destination default value are different, we can convert array to getLeastSupertype in advance.

We can also support non-constant destination array easily.

@alexey-milovidov alexey-milovidov added warmup task The task for new ClickHouse team members. Low risk, moderate complexity, no urgency. and removed help wanted labels Jan 24, 2022
@den-crane
Copy link
Contributor Author

@antonio2368 check this

@alexey-milovidov
Copy link
Member

Fixed in 23.5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Confirmed user-visible misbehaviour in official release comp-nullable Nulls related st-fixed unexpected behaviour warmup task The task for new ClickHouse team members. Low risk, moderate complexity, no urgency.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants