Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix possible clickhouse-local abort on JSONEachRow schema inference #46731

Merged
merged 4 commits into from
Feb 24, 2023

Conversation

Avogar
Copy link
Member

@Avogar Avogar commented Feb 22, 2023

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in official stable or prestable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix possible clickhouse-local abort on JSONEachRow schema inference

@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-bugfix Pull request with bugfix, not backported by default label Feb 22, 2023
@Avogar
Copy link
Member Author

Avogar commented Feb 22, 2023

Big diff is just a test file

Comment on lines 226 to 230
for (auto & new_name_and_type : new_names_and_types)
{
auto & name = new_name_and_type.name;
auto & new_type = new_name_and_type.type;
if (names_set.contains(name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there really difference between these syntax versions?

Copy link
Member Author

@Avogar Avogar Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like whil using for auto & [name, new_type] : new_names_and_types
name and new_type are not references but copied objects.
Maybe because we have only specialization for const const NameAndTypePair & here:

/// This needed to use structured bindings for NameAndTypePair
/// const auto & [name, type] = name_and_type
template <int I>
decltype(auto) get(const NameAndTypePair & name_and_type)
{
if constexpr (I == 0)
return name_and_type.name;
else if constexpr (I == 1)
return name_and_type.type;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to add specialization for NameAndTypePair & name_and_type but it didn't help for some reason...

Copy link
Member

@vdimir vdimir Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we want to change the new_type in chooseResultColumnType, ok.

Why do we need such a big test case for this, why it can't be reproduced with smaller one?

Copy link
Member Author

@Avogar Avogar Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was reproduced only on large file. But I just realized that with address sanitizer it will reproduce with 1 row. Let me update the test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A Fuzzer of data formats will be helpful.

Copy link
Member

@serxa serxa Feb 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to add specialization for NameAndTypePair & name_and_type but it didn't help for some reason...

To make it work you have to make sure that:

  1. Your return type is reference e.g. return std::ref(name_and_type.name); (or just avoid using decltype(auto))
  2. Tuple also holds reference, not value template <> struct tuple_element<0, DB::NameAndTypePair> { using type = DB::String&; };

Copy link
Member

@serxa serxa Feb 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've found a better solution. You can actually keep tuple of values (as it is now), but just change return type from decltype(auto) to std::tuple_element_t<I, NameAndTypePair>& or auto& and of course provide non-const overload.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Let me try it

@vdimir vdimir self-assigned this Feb 22, 2023
@Avogar Avogar merged commit ee754b9 into ClickHouse:master Feb 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-bugfix Pull request with bugfix, not backported by default
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants