-
Notifications
You must be signed in to change notification settings - Fork 6.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix possible clickhouse-local abort on JSONEachRow schema inference #46731
Conversation
Big diff is just a test file |
for (auto & new_name_and_type : new_names_and_types) | ||
{ | ||
auto & name = new_name_and_type.name; | ||
auto & new_type = new_name_and_type.type; | ||
if (names_set.contains(name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there really difference between these syntax versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like whil using for auto & [name, new_type] : new_names_and_types
name
and new_type
are not references but copied objects.
Maybe because we have only specialization for const const NameAndTypePair &
here:
ClickHouse/src/Core/NamesAndTypes.h
Lines 53 to 62 in b0427c2
/// This needed to use structured bindings for NameAndTypePair | |
/// const auto & [name, type] = name_and_type | |
template <int I> | |
decltype(auto) get(const NameAndTypePair & name_and_type) | |
{ | |
if constexpr (I == 0) | |
return name_and_type.name; | |
else if constexpr (I == 1) | |
return name_and_type.type; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to add specialization for NameAndTypePair & name_and_type
but it didn't help for some reason...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And we want to change the new_type
in chooseResultColumnType
, ok.
Why do we need such a big test case for this, why it can't be reproduced with smaller one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was reproduced only on large file. But I just realized that with address sanitizer it will reproduce with 1 row. Let me update the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A Fuzzer of data formats will be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to add specialization for NameAndTypePair & name_and_type but it didn't help for some reason...
To make it work you have to make sure that:
- Your return type is reference e.g.
return std::ref(name_and_type.name);
(or just avoid usingdecltype(auto)
) - Tuple also holds reference, not value
template <> struct tuple_element<0, DB::NameAndTypePair> { using type = DB::String&; };
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've found a better solution. You can actually keep tuple of values (as it is now), but just change return type from decltype(auto)
to std::tuple_element_t<I, NameAndTypePair>&
or auto&
and of course provide non-const overload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Let me try it
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix possible clickhouse-local abort on JSONEachRow schema inference