Interpolate feature #35349

Merged
35 commits
7bb66e6
added INTERPOLATE extension for ORDER BY WITH FILL
yakov-olkhovskiy Mar 17, 2022
0070098
style fix
yakov-olkhovskiy Mar 17, 2022
a8e1671
type match check for INTERPOLATE expressions added, bugfix, printout …
yakov-olkhovskiy Mar 18, 2022
ecf05ec
tests are added, bugfix
yakov-olkhovskiy Mar 19, 2022
90888ea
Update index.md
yakov-olkhovskiy Mar 19, 2022
40c91c3
Update index.md
yakov-olkhovskiy Mar 19, 2022
4f892dc
Update order-by.md
yakov-olkhovskiy Mar 19, 2022
f9ed659
Update order-by.md
yakov-olkhovskiy Mar 19, 2022
5c8a77d
Update order-by.md
yakov-olkhovskiy Mar 19, 2022
b01f965
Update order-by.md
yakov-olkhovskiy Mar 19, 2022
5ae6f80
Update order-by.md
yakov-olkhovskiy Mar 19, 2022
eb7474e
Merge branch 'master' into interpolate-feature
yakov-olkhovskiy Mar 19, 2022
481ee8a
Update FillingTransform.cpp
yakov-olkhovskiy Mar 19, 2022
c4daf51
Update InterpreterSelectQuery.cpp
yakov-olkhovskiy Mar 19, 2022
83f406b
optimization, INTERPOLATE without expr. list, any column is allowed e…
yakov-olkhovskiy Mar 24, 2022
adefcfd
Merge branch 'master' into interpolate-feature
yakov-olkhovskiy Mar 24, 2022
5a4694f
major refactoring, simplified, optimized, bugs fixed
yakov-olkhovskiy Mar 27, 2022
615efa1
aliases processing fixed
yakov-olkhovskiy Mar 28, 2022
6a1e116
refactoring
yakov-olkhovskiy Mar 30, 2022
b5682c1
minor refactoring
yakov-olkhovskiy Mar 31, 2022
a159963
bugfix - columns order tracking
yakov-olkhovskiy Mar 31, 2022
538373a
style fix
yakov-olkhovskiy Mar 31, 2022
0116233
allow INTERPOLATE to reference optimized out columns
yakov-olkhovskiy Apr 1, 2022
ec0ad88
style fix
yakov-olkhovskiy Apr 2, 2022
95ad1bf
use aliases if exist for original_select_set
yakov-olkhovskiy Apr 4, 2022
ff4d295
style fix
yakov-olkhovskiy Apr 4, 2022
e0d6033
all columns can participate in interpolate expression despite if they…
yakov-olkhovskiy Apr 5, 2022
90c4cd3
Merge branch 'master' into interpolate-feature
yakov-olkhovskiy Apr 5, 2022
6b9a349
Update SortDescription.h
yakov-olkhovskiy Apr 5, 2022
ac441b9
compiler suggestions
yakov-olkhovskiy Apr 6, 2022
7dbe8bc
major bugs fixed, tests added, docs updated
yakov-olkhovskiy Apr 7, 2022
64dcddc
fixed ASTInterpolateElement::clone, fixed QueryNormalizer to exclude …
yakov-olkhovskiy Apr 7, 2022
87c2b3e
fixed Nullable, tests added
yakov-olkhovskiy Apr 8, 2022
7293e01
some comments added
yakov-olkhovskiy Apr 11, 2022
2588f80
comment fix
yakov-olkhovskiy Apr 11, 2022
61 changes: 61 additions & 0 deletions src/Core/InterpolateDescription.cpp
@@ -0,0 +1,61 @@
+#include <Core/Block.h>
+#include <IO/Operators.h>
+#include <Common/JSONBuilder.h>
+#include <Core/InterpolateDescription.h>
+
+namespace DB
+{
+
+void dumpInterpolateDescription(const InterpolateDescription & description, const Block & /*header*/, WriteBuffer & out)
+{
+    bool first = true;
+
+    for (const auto & desc : description)
+    {
+        if (!first)
+            out << ", ";
+        first = false;
+
+        if (desc.column.name.empty())
+            out << "?";
+        else
+            out << desc.column.name;
+    }
+}
+
+void InterpolateColumnDescription::interpolate(Field & field) const
+{
+    if (field.isNull())
+        return;
+    Block expr_columns;
+    expr_columns.insert({column.type->createColumnConst(1, field), column.type, column.name});
+    actions->execute(expr_columns);
+    expr_columns.getByPosition(0).column->get(0, field);
+}
+
+void InterpolateColumnDescription::explain(JSONBuilder::JSONMap & map, const Block & /*header*/) const
+{
+    map.add("Column", column.name);
+}
+
+std::string dumpInterpolateDescription(const InterpolateDescription & description)
+{
+    WriteBufferFromOwnString wb;
+    dumpInterpolateDescription(description, Block{}, wb);
+    return wb.str();
+}
+
+JSONBuilder::ItemPtr explainInterpolateDescription(const InterpolateDescription & description, const Block & header)
+{
+    auto json_array = std::make_unique<JSONBuilder::JSONArray>();
+    for (const auto & descr : description)
+    {
+        auto json_map = std::make_unique<JSONBuilder::JSONMap>();
+        descr.explain(*json_map, header);
+        json_array->add(std::move(json_map));
+    }
+
+    return json_array;
+}
+
+}
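
The `interpolate` method in this file wraps the current value in a one-row block, runs the expression actions on it, and reads the result back, leaving NULL values untouched. A minimal standalone sketch of that idea follows. It assumes `std::optional<int>` in place of `Field` and a `std::function` in place of `ExpressionActionsPtr`; `InterpolateSketch` and its members are illustrative names, not ClickHouse types.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical stand-in for InterpolateColumnDescription: the real struct holds
// a ColumnWithTypeAndName and an ExpressionActionsPtr; here, a name and a callable.
struct InterpolateSketch
{
    std::string column_name;
    std::function<int(int)> expression;

    // Same shape as interpolate(Field &): NULL is passed through unchanged,
    // otherwise the value is replaced in place by expression(value).
    void interpolate(std::optional<int> & field) const
    {
        if (!field.has_value())
            return;
        field = expression(*field);
    }
};
```

With `expression` set to `v + 1`, each call advances the stored value by one, while an empty optional stays empty.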
70 changes: 70 additions & 0 deletions src/Core/InterpolateDescription.h
@@ -0,0 +1,70 @@
+#pragma once
+
+#include <vector>
+#include <memory>
+#include <cstddef>
+#include <string>
+#include <Core/Field.h>
+#include <Core/SettingsEnums.h>
+#include <Common/IntervalKind.h>
+#include <Parsers/ASTOrderByElement.h>
+#include <Parsers/ASTInterpolateElement.h>
+#include <Functions/FunctionsMiscellaneous.h>
+
+class Collator;
+
+namespace DB
+{
+
+namespace JSONBuilder
+{
+    class JSONMap;
+    class IItem;
+    using ItemPtr = std::unique_ptr<IItem>;
+}
+
+class Block;
+
+
+/// Interpolate description
+struct InterpolateColumnDescription
+{
+    using Signature = ExecutableFunctionExpression::Signature;
+
+    ColumnWithTypeAndName column;
+    ExpressionActionsPtr actions;
+
+    explicit InterpolateColumnDescription(const ColumnWithTypeAndName & column_, ExpressionActionsPtr actions_) :
+        column(column_), actions(actions_) {}
+
+    bool operator == (const InterpolateColumnDescription & other) const
+    {
+        return column == other.column;
+    }
+
+    bool operator != (const InterpolateColumnDescription & other) const
+    {
+        return !(*this == other);
+    }
+
+    void interpolate(Field & field) const;
+
+    std::string dump() const
+    {
+        return fmt::format("{}", column.name);
+    }
+
+    void explain(JSONBuilder::JSONMap & map, const Block & header) const;
+};
+
+/// Description of interpolation for several columns.
+using InterpolateDescription = std::vector<InterpolateColumnDescription>;
+
+/// Outputs user-readable description into `out`.
+void dumpInterpolateDescription(const InterpolateDescription & description, const Block & header, WriteBuffer & out);
+
+std::string dumpInterpolateDescription(const InterpolateDescription & description);
+
+JSONBuilder::ItemPtr explainInterpolateDescription(const InterpolateDescription & description, const Block & header);
+
+}
55 changes: 31 additions & 24 deletions src/Interpreters/FillingRow.cpp
@@ -19,26 +19,30 @@ bool equals(const Field & lhs, const Field & rhs)
 }


-FillingRow::FillingRow(const SortDescription & sort_description) : description(sort_description)
+FillingRow::FillingRow(const SortDescription & sort_description_, const InterpolateDescription & interpolate_description_)
+    : sort{*this}
+    , interpolate{*this}
+    , sort_description(sort_description_)
+    , interpolate_description(interpolate_description_)
 {
-    row.resize(description.size());
+    row.resize(sort_description.size() + interpolate_description.size());
 }

 bool FillingRow::operator<(const FillingRow & other) const
 {
-    for (size_t i = 0; i < size(); ++i)
+    for (size_t i = 0; i < sort.size(); ++i)
     {
-        if (row[i].isNull() || other[i].isNull() || equals(row[i], other[i]))
+        if (sort[i].isNull() || other.sort[i].isNull() || equals(sort[i], other.sort[i]))
             continue;
-        return less(row[i], other[i], getDirection(i));
+        return less(sort[i], other.sort[i], getDirection(i));
     }
     return false;
 }

 bool FillingRow::operator==(const FillingRow & other) const
 {
-    for (size_t i = 0; i < size(); ++i)
-        if (!equals(row[i], other[i]))
+    for (size_t i = 0; i < sort.size(); ++i)
+        if (!equals(sort[i], other.sort[i]))
             return false;
     return true;
 }
@@ -47,49 +51,52 @@ bool FillingRow::next(const FillingRow & to_row)
 {
     size_t pos = 0;

+    for (size_t i = 0; i < to_row.interpolate.size(); ++i)
+        interpolate[i] = to_row.interpolate[i];
+
     /// Find position we need to increment for generating next row.
-    for (; pos < row.size(); ++pos)
-        if (!row[pos].isNull() && !to_row[pos].isNull() && !equals(row[pos], to_row[pos]))
+    for (; pos < sort.size(); ++pos)
+        if (!sort[pos].isNull() && !to_row.sort[pos].isNull() && !equals(sort[pos], to_row.sort[pos]))
             break;

-    if (pos == row.size() || less(to_row[pos], row[pos], getDirection(pos)))
+    if (pos == sort.size() || less(to_row.sort[pos], sort[pos], getDirection(pos)))
         return false;

     /// If we have any 'fill_to' value at position greater than 'pos',
     /// we need to generate rows up to 'fill_to' value.
-    for (size_t i = row.size() - 1; i > pos; --i)
+    for (size_t i = sort.size() - 1; i > pos; --i)
     {
-        if (getFillDescription(i).fill_to.isNull() || row[i].isNull())
+        if (getFillDescription(i).fill_to.isNull() || sort[i].isNull())
             continue;

-        auto next_value = row[i];
+        auto next_value = sort[i];
         getFillDescription(i).step_func(next_value);
         if (less(next_value, getFillDescription(i).fill_to, getDirection(i)))
         {
-            row[i] = next_value;
+            sort[i] = next_value;
             initFromDefaults(i + 1);
             return true;
         }
     }

-    auto next_value = row[pos];
+    auto next_value = sort[pos];
     getFillDescription(pos).step_func(next_value);

-    if (less(to_row[pos], next_value, getDirection(pos)))
+    if (less(to_row.sort[pos], next_value, getDirection(pos)))
         return false;

-    row[pos] = next_value;
-    if (equals(row[pos], to_row[pos]))
+    sort[pos] = next_value;
+    if (equals(sort[pos], to_row.sort[pos]))
     {
         bool is_less = false;
-        for (size_t i = pos + 1; i < size(); ++i)
+        for (size_t i = pos + 1; i < sort.size(); ++i)
         {
             const auto & fill_from = getFillDescription(i).fill_from;
             if (!fill_from.isNull())
-                row[i] = fill_from;
+                sort[i] = fill_from;
             else
-                row[i] = to_row[i];
-            is_less |= less(row[i], to_row[i], getDirection(i));
+                sort[i] = to_row.sort[i];
+            is_less |= less(sort[i], to_row.sort[i], getDirection(i));
         }

         return is_less;
@@ -101,8 +108,8 @@ bool FillingRow::next(const FillingRow & to_row)

 void FillingRow::initFromDefaults(size_t from_pos)
 {
-    for (size_t i = from_pos; i < row.size(); ++i)
-        row[i] = getFillDescription(i).fill_from;
+    for (size_t i = from_pos; i < sort.size(); ++i)
+        sort[i] = getFillDescription(i).fill_from;
 }
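
After this change the comparison operators in `FillingRow.cpp` look only at the sort part of the row, and `operator<` skips any position where either side is NULL or both sides are equal. A rough sketch of that rule follows. It assumes `std::optional<int>` in place of `Field` and a per-column direction of +1 or -1; `SortRow` and `lessSketch` are names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

using SortRow = std::vector<std::optional<int>>;

// Returns true if lhs sorts strictly before rhs. Positions where either value
// is NULL, or where both values are equal, are skipped; the first remaining
// position decides. directions[i] is +1 for ascending and -1 for descending.
// Assumes lhs, rhs and directions all have the same length.
inline bool lessSketch(const SortRow & lhs, const SortRow & rhs, const std::vector<int> & directions)
{
    for (std::size_t i = 0; i < lhs.size(); ++i)
    {
        if (!lhs[i] || !rhs[i] || *lhs[i] == *rhs[i])
            continue;
        return directions[i] > 0 ? *lhs[i] < *rhs[i] : *lhs[i] > *rhs[i];
    }
    return false;
}
```

Because the loop runs over the sort columns only, values carried for interpolation never influence the ordering of two rows in this sketch.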


29 changes: 25 additions & 4 deletions src/Interpreters/FillingRow.h
@@ -1,5 +1,6 @@
 #pragma once
 #include <Core/SortDescription.h>
+#include <Core/InterpolateDescription.h>
 #include <Columns/IColumn.h>


@@ -17,7 +18,25 @@ bool equals(const Field & lhs, const Field & rhs);
 class FillingRow
 {
 public:
-    FillingRow(const SortDescription & sort_description);
+    struct
+    {
+        FillingRow & filling_row;
+
+        Field & operator[](size_t index) { return filling_row.row[index]; }
+        const Field & operator[](size_t index) const { return filling_row.row[index]; }
+        size_t size() const { return filling_row.sort_description.size(); }
+    } sort;
+
+    struct
+    {
+        FillingRow & filling_row;
+
+        Field & operator[](size_t index) { return filling_row.row[filling_row.sort_description.size() + index]; }
+        const Field & operator[](size_t index) const { return filling_row.row[filling_row.sort_description.size() + index]; }
+        size_t size() const { return filling_row.interpolate_description.size(); }
+    } interpolate;
+public:
+    FillingRow(const SortDescription & sort_description, const InterpolateDescription & interpolate_description);

     /// Generates next row according to fill 'from', 'to' and 'step' values.
     bool next(const FillingRow & to_row);
@@ -30,12 +49,14 @@ class FillingRow
     bool operator<(const FillingRow & other) const;
     bool operator==(const FillingRow & other) const;

-    int getDirection(size_t index) const { return description[index].direction; }
-    FillColumnDescription & getFillDescription(size_t index) { return description[index].fill_description; }
+    int getDirection(size_t index) const { return sort_description[index].direction; }
+    FillColumnDescription & getFillDescription(size_t index) { return sort_description[index].fill_description; }
+    InterpolateColumnDescription & getInterpolateDescription(size_t index) { return interpolate_description[index]; }

 private:
     Row row;
-    SortDescription description;
+    SortDescription sort_description;
+    InterpolateDescription interpolate_description;
 };

 void insertFromFillingRow(MutableColumns & filling_columns, MutableColumns & other_columns, const FillingRow & filling_row);
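
The header keeps sort and interpolate values in a single `Row` and exposes them through two small accessor structs, with interpolate indices offset by the number of sort columns. A simplified sketch of that layout follows. It assumes plain `int` values and uses member functions rather than the reference-holding anonymous structs; `TwoPartRow` and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One contiguous vector holds [sort values..., interpolate values...].
class TwoPartRow
{
public:
    TwoPartRow(std::size_t sort_count, std::size_t interpolate_count)
        : sort_size(sort_count), row(sort_count + interpolate_count, 0)
    {
    }

    // Sort values occupy indices [0, sort_size).
    int & sortAt(std::size_t i) { return row[i]; }
    // Interpolate values follow, so their index is shifted by sort_size.
    int & interpolateAt(std::size_t i) { return row[sort_size + i]; }

    std::size_t sortSize() const { return sort_size; }
    std::size_t interpolateSize() const { return row.size() - sort_size; }
    const std::vector<int> & raw() const { return row; }

private:
    std::size_t sort_size;
    std::vector<int> row;
};
```

Using member functions avoids storing a reference to the owning object, so a copied row does not refer back to the object it was copied from. Whether that matters for `FillingRow` depends on how it is copied in practice.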
21 changes: 20 additions & 1 deletion src/Interpreters/InterpreterSelectQuery.cpp
@@ -5,6 +5,7 @@
 #include <Parsers/ASTIdentifier.h>
 #include <Parsers/ASTLiteral.h>
 #include <Parsers/ASTOrderByElement.h>
+#include <Parsers/ASTInterpolateElement.h>
 #include <Parsers/ASTSelectWithUnionQuery.h>
 #include <Parsers/ASTSelectIntersectExceptQuery.h>
 #include <Parsers/ASTTablesInSelectQuery.h>
@@ -827,6 +828,23 @@ static SortDescription getSortDescription(const ASTSelectQuery & query, ContextP
     return order_descr;
 }

+static InterpolateDescription getInterpolateDescription(const ASTSelectQuery & query, Block block, ContextPtr context)
+{
+    InterpolateDescription interpolate_descr;
+    interpolate_descr.reserve(query.interpolate()->children.size());
+
+    for (const auto & elem : query.interpolate()->children)
+    {
+        auto interpolate = elem->as<ASTInterpolateElement &>();
+        auto syntax_result = TreeRewriter(context).analyze(interpolate.expr, block.getNamesAndTypesList());
+        ExpressionAnalyzer analyzer(interpolate.expr, syntax_result, context);
+        ExpressionActionsPtr actions = analyzer.getActions(true, true, CompileExpressions::yes);

Review comment (Member):
You can call analyzer.getActionsDAG here; it will return just the DAG, without the actions.

+        interpolate_descr.emplace_back(block.findByName(interpolate.column->getColumnName())->cloneEmpty(), actions);
+    }
+
+    return interpolate_descr;
+}
+
 static SortDescription getSortDescriptionFromGroupBy(const ASTSelectQuery & query)
 {
     SortDescription order_descr;
@@ -2498,7 +2516,8 @@ void InterpreterSelectQuery::executeWithFill(QueryPlan & query_plan)
         if (fill_descr.empty())
             return;

-        auto filling_step = std::make_unique<FillingStep>(query_plan.getCurrentDataStream(), std::move(fill_descr));
+        InterpolateDescription interpolate_descr = getInterpolateDescription(query, source_header, context);
+        auto filling_step = std::make_unique<FillingStep>(query_plan.getCurrentDataStream(), std::move(fill_descr), std::move(interpolate_descr));
         query_plan.addStep(std::move(filling_step));
     }
 }
15 changes: 15 additions & 0 deletions src/Parsers/ASTInterpolateElement.cpp
@@ -0,0 +1,15 @@
+#include <Columns/Collator.h>
+#include <Parsers/ASTInterpolateElement.h>
+#include <Common/SipHash.h>
+#include <IO/Operators.h>
+
+
+namespace DB
+{
+
+/// TODO JOO
+void ASTInterpolateElement::formatImpl(const FormatSettings & /*settings*/, FormatState & /*state*/, FormatStateStacked /*frame*/) const
+{
+}
+
+}
29 changes: 29 additions & 0 deletions src/Parsers/ASTInterpolateElement.h
@@ -0,0 +1,29 @@
+#pragma once
+
+#include <Parsers/IAST.h>
+
+
+namespace DB
+{
+
+class ASTInterpolateElement : public IAST
+{
+public:
+    ASTPtr column;
+    ASTPtr expr;
+
+    String getID(char) const override { return "InterpolateElement"; }
+
+    ASTPtr clone() const override
+    {
+        auto clone = std::make_shared<ASTInterpolateElement>(*this);
+        clone->cloneChildren();

Review comment (Member):
Hm, I think you need to fix the pointer for clone->expr; otherwise it still points to the old AST.

Review comment (Member):
Maybe like here:

ASTPtr ASTColumnDeclaration::clone() const
{
    const auto res = std::make_shared<ASTColumnDeclaration>(*this);
    res->children.clear();
    if (type)
    {
        // Type may be an ASTFunction (e.g. `create table t (a Decimal(9,0))`),
        // so we have to clone it properly as well.
        res->type = type->clone();
        res->children.push_back(res->type);
    }
    if (default_expression)
    {
        res->default_expression = default_expression->clone();
        res->children.push_back(res->default_expression);
    }
    if (comment)
    {
        res->comment = comment->clone();
        res->children.push_back(res->comment);
    }
    if (codec)
    {
        res->codec = codec->clone();
        res->children.push_back(res->codec);
    }
    if (ttl)
    {
        res->ttl = ttl->clone();
        res->children.push_back(res->ttl);
    }
    return res;
}

Reply (Member Author):
Did something, but it is probably still wrong. Will revisit tomorrow...

+        return clone;
+    }
+
+
+protected:
+    void formatImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const override;
+};
+
+}
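
The review thread on `clone()` in this file notes that copying the node copies the `column` and `expr` pointers as they are, so they keep referring to the original subtree unless they are re-pointed at freshly cloned children. A self-contained sketch of that pattern follows. It assumes a toy `Node` base class rather than ClickHouse's `IAST`; all type names are illustrative, and this is not necessarily what the PR ended up doing.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<Node>;

// Toy AST node: `children` owns the subtree.
struct Node
{
    std::string name;
    std::vector<NodePtr> children;

    explicit Node(std::string name_) : name(std::move(name_)) {}
    virtual ~Node() = default;

    virtual NodePtr clone() const
    {
        auto res = std::make_shared<Node>(*this);
        for (auto & child : res->children)
            child = child->clone();
        return res;
    }
};

// Node whose named members alias entries of `children`.
struct InterpolateNode : Node
{
    NodePtr column;
    NodePtr expr;

    InterpolateNode(NodePtr column_, NodePtr expr_)
        : Node("InterpolateElement"), column(std::move(column_)), expr(std::move(expr_))
    {
        children = {column, expr};
    }

    NodePtr clone() const override
    {
        auto res = std::make_shared<InterpolateNode>(*this);
        // Rebuild children from clones and re-point the named members,
        // so the copy shares no nodes with the original.
        res->children.clear();
        res->column = column->clone();
        res->children.push_back(res->column);
        res->expr = expr->clone();
        res->children.push_back(res->expr);
        return res;
    }
};
```

The key property is that after cloning, `expr` and `column` in the copy point at the same objects as the copy's own `children`, not at nodes of the source tree.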
1 change: 1 addition & 0 deletions src/Parsers/ASTOrderByElement.h
@@ -37,4 +37,5 @@ class ASTOrderByElement : public IAST
 protected:
     void formatImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const override;
 };
+
 }