Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More convenient usage of big integers and ORDER BY WITH FILL. #46152

Merged
merged 12 commits into from
Feb 10, 2023

Conversation

alexey-milovidov
Copy link
Member

@alexey-milovidov alexey-milovidov commented Feb 8, 2023

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

More convenient usage of big integers and ORDER BY WITH FILL. Allow using plain integers for start and end points in WITH FILL when ORDER BY big (128-bit and 256-bit) integers. Fix the wrong result for big integers with negative start or end points. This closes #16733

See also the assertion about unexpected type of exception in debug build https://s3.amazonaws.com/clickhouse-test-reports/46149/a69f9c05a981a6c95af496b47413e540c87e2c09/fuzzer_astfuzzerdebug/report.html

@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-improvement Pull request with some product improvements label Feb 8, 2023
@alexey-milovidov
Copy link
Member Author

I also noticed that the code around type conversion is stupid.

@yakov-olkhovskiy yakov-olkhovskiy self-assigned this Feb 8, 2023
@alexey-milovidov
Copy link
Member Author

@yakov-olkhovskiy I want WITH FILL to use getLeastSuperType to determine the resulting type.
Currently, the logic is strange.

@alexey-milovidov
Copy link
Member Author

But I don't understand - whether we can change the types after WITH FILL.

@alexey-milovidov
Copy link
Member Author

Looks like no.

@alexey-milovidov
Copy link
Member Author

And there are some complications with STEP.

@yakov-olkhovskiy
Copy link
Member

Currently, the logic is strange.

I think it's done this way to simplify step function generation

@alexey-milovidov
Copy link
Member Author

Ready for review.

@@ -53,7 +52,6 @@ class FieldVisitorConvertToNumber : public StaticVisitor<T>

T operator() (const UInt64 & x) const { return T(x); }
T operator() (const Int64 & x) const { return T(x); }
T operator() (const Int128 & x) const { return T(x); }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here were leftovers from when we used UInt128 for UUID a long time ago. I've cleaned it up. Some bugs can be automatically fixed by this change...

@@ -45,28 +45,28 @@ class FieldVisitorAccurateEquals : public StaticVisitor<bool>
if constexpr (std::is_same_v<T, U>)
return l == r;

if constexpr (std::is_arithmetic_v<T> && std::is_arithmetic_v<U>)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enables comparison for big integers.

@@ -49,7 +49,7 @@ bool FillingRow::next(const FillingRow & to_row)
size_t pos = 0;

/// Find position we need to increment for generating next row.
for (; pos < size(); ++pos)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a minor change, but the compiler cannot typically optimize size calls out of the loop due to not enough strict-aliasing rules in C++ (in comparison to Rust).

@@ -27,7 +27,7 @@ Block FillingTransform::transformHeader(Block header, const SortDescription & so
/// Columns which are not from sorting key may not be constant anymore.
for (auto & column : header)
if (column.column && isColumnConst(*column.column) && !sort_keys.contains(column.name))
column.column = column.type->createColumn();
column.column = column.column->convertToFullColumnIfConst();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more natural way to do the same.

@@ -61,7 +61,7 @@ static bool tryConvertFields(FillColumnDescription & descr, const DataTypePtr &
{
WhichDataType which_from(descr.fill_from_type);
if ((which_from.isDateOrDate32() || which_from.isDateTime() || which_from.isDateTime64()) &&
!descr.fill_from_type->equals(*type))
!descr.fill_from_type->equals(*removeNullable(type)))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enables WITH FILL when the type is Nullable, see the test.

@@ -73,8 +73,17 @@ static bool tryConvertFields(FillColumnDescription & descr, const DataTypePtr &
return false;
}

/// TODO Wrong results for big integers.
if (isInteger(type) || which.isDate() || which.isDate32() || which.isDateTime())
if (which.isInt128() || which.isUInt128())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented TODO.
The code remains tricky, maybe it's worth simplifying later.

descr.fill_from = convertFieldToType(descr.fill_from, *to_type);
descr.fill_to = convertFieldToType(descr.fill_to, *to_type);
descr.fill_step = convertFieldToType(descr.fill_step, *to_type);
if (!descr.fill_from.isNull())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much better - we are explicitly asking if the field is convertible, instead of having a risk it is confused with NULL, which has another meaning for the WITH FILL logic.

@@ -184,18 +196,20 @@ FillingTransform::FillingTransform(
fill_column_positions.push_back(block_position);

auto & descr = filling_row.getFillDescription(i);
const auto & type = header_.getByPosition(block_position).type;

const Block & output_header = getOutputPort().getHeader();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking it from the output header is unneeded, as it is currently identical to the input header by the data types, but I was not sure about it and added this code. It is more natural, so worth keeping this change.

const auto & type = header_.getByPosition(block_position).type;

const Block & output_header = getOutputPort().getHeader();
const DataTypePtr & type = removeNullable(output_header.getByPosition(block_position).type);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support for Nullable.


if (type->isValueRepresentedByUnsignedInteger() &&
if (isUnsignedInteger(type) &&
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support for big integers.

@@ -213,7 +227,7 @@ FillingTransform::FillingTransform(
input_positions.emplace_back(idx, p->second);

if (!is_fill_column[idx] && !(interpolate_description && interpolate_description->result_columns_set.contains(column.name)))
other_column_positions.push_back(idx);
other_column_positions.push_back(idx);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style.

@@ -335,8 +349,8 @@ void FillingTransform::transform(Chunk & chunk)
interpolate();
while (filling_row.next(next_row))
{
insertFromFillingRow(res_fill_columns, res_interpolate_columns, res_other_columns, filling_row, interpolate_block);
interpolate();
insertFromFillingRow(res_fill_columns, res_interpolate_columns, res_other_columns, filling_row, interpolate_block);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style.

@alexey-milovidov alexey-milovidov merged commit d5d87bc into master Feb 10, 2023
@alexey-milovidov alexey-milovidov deleted the with-fill-bigint branch February 10, 2023 17:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-improvement Pull request with some product improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feature request: WITH FILL expression values compatible with column type Nullable
3 participants