Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interpolate feature #35349

Merged
Merged
Show file tree
Hide file tree
Changes from 14 commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
7bb66e6
added INTERPOLATE extension for ORDER BY WITH FILL
yakov-olkhovskiy Mar 17, 2022
0070098
style fix
yakov-olkhovskiy Mar 17, 2022
a8e1671
type match check for INTERPOLATE expressions added, bugfix, printout …
yakov-olkhovskiy Mar 18, 2022
ecf05ec
tests are added, bugfix
yakov-olkhovskiy Mar 19, 2022
90888ea
Update index.md
yakov-olkhovskiy Mar 19, 2022
40c91c3
Update index.md
yakov-olkhovskiy Mar 19, 2022
4f892dc
Update order-by.md
yakov-olkhovskiy Mar 19, 2022
f9ed659
Update order-by.md
yakov-olkhovskiy Mar 19, 2022
5c8a77d
Update order-by.md
yakov-olkhovskiy Mar 19, 2022
b01f965
Update order-by.md
yakov-olkhovskiy Mar 19, 2022
5ae6f80
Update order-by.md
yakov-olkhovskiy Mar 19, 2022
eb7474e
Merge branch 'master' into interpolate-feature
yakov-olkhovskiy Mar 19, 2022
481ee8a
Update FillingTransform.cpp
yakov-olkhovskiy Mar 19, 2022
c4daf51
Update InterpreterSelectQuery.cpp
yakov-olkhovskiy Mar 19, 2022
83f406b
optimization, INTERPOLATE without expr. list, any column is allowed e…
yakov-olkhovskiy Mar 24, 2022
adefcfd
Merge branch 'master' into interpolate-feature
yakov-olkhovskiy Mar 24, 2022
5a4694f
major refactoring, simplified, optimized, bugs fixed
yakov-olkhovskiy Mar 27, 2022
615efa1
aliases processing fixed
yakov-olkhovskiy Mar 28, 2022
6a1e116
refactoring
yakov-olkhovskiy Mar 30, 2022
b5682c1
minor refactoring
yakov-olkhovskiy Mar 31, 2022
a159963
bugfix - columns order tracking
yakov-olkhovskiy Mar 31, 2022
538373a
style fix
yakov-olkhovskiy Mar 31, 2022
0116233
allow INTERPOLATE to reference optimized out columns
yakov-olkhovskiy Apr 1, 2022
ec0ad88
style fix
yakov-olkhovskiy Apr 2, 2022
95ad1bf
use aliases if exist for original_select_set
yakov-olkhovskiy Apr 4, 2022
ff4d295
style fix
yakov-olkhovskiy Apr 4, 2022
e0d6033
all columns can participate in interpolate expression despite if they…
yakov-olkhovskiy Apr 5, 2022
90c4cd3
Merge branch 'master' into interpolate-feature
yakov-olkhovskiy Apr 5, 2022
6b9a349
Update SortDescription.h
yakov-olkhovskiy Apr 5, 2022
ac441b9
compiler suggestions
yakov-olkhovskiy Apr 6, 2022
7dbe8bc
major bugs fixed, tests added, docs updated
yakov-olkhovskiy Apr 7, 2022
64dcddc
fixed ASTInterpolateElement::clone, fixed QueryNormalizer to exclude …
yakov-olkhovskiy Apr 7, 2022
87c2b3e
fixed Nullable, tests added
yakov-olkhovskiy Apr 8, 2022
7293e01
some comments added
yakov-olkhovskiy Apr 11, 2022
2588f80
comment fix
yakov-olkhovskiy Apr 11, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion docs/en/sql-reference/statements/select/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ SELECT [DISTINCT [ON (column1, column2, ...)]] expr_list
[WHERE expr]
[GROUP BY expr_list] [WITH ROLLUP|WITH CUBE] [WITH TOTALS]
[HAVING expr]
[ORDER BY expr_list] [WITH FILL] [FROM expr] [TO expr] [STEP expr]
[ORDER BY expr_list] [WITH FILL] [FROM expr] [TO expr] [STEP expr] [INTERPOLATE (expr_list)]
[LIMIT [offset_value, ]n BY columns]
[LIMIT [n, ]m] [WITH TIES]
[SETTINGS ...]
Expand Down
60 changes: 60 additions & 0 deletions docs/en/sql-reference/statements/select/order-by.md
Original file line number Diff line number Diff line change
Expand Up @@ -280,13 +280,15 @@ To fill multiple columns, add `WITH FILL` modifier with optional parameters afte

``` sql
ORDER BY expr [WITH FILL] [FROM const_expr] [TO const_expr] [STEP const_numeric_expr], ... exprN [WITH FILL] [FROM expr] [TO expr] [STEP numeric_expr]
[INTERPOLATE (col AS expr, ... colN AS exprN)]
KochetovNicolai marked this conversation as resolved.
Show resolved Hide resolved
```

`WITH FILL` can be applied for fields with Numeric (all kinds of float, decimal, int) or Date/DateTime types. When applied for `String` fields, missed values are filled with empty strings.
When `FROM const_expr` not defined sequence of filling use minimal `expr` field value from `ORDER BY`.
When `TO const_expr` not defined sequence of filling use maximum `expr` field value from `ORDER BY`.
When `STEP const_numeric_expr` defined then `const_numeric_expr` interprets `as is` for numeric types, as `days` for Date type, as `seconds` for DateTime type. It also supports [INTERVAL](https://clickhouse.com/docs/en/sql-reference/data-types/special-data-types/interval/) data type representing time and date intervals.
When `STEP const_numeric_expr` omitted then sequence of filling use `1.0` for numeric type, `1 day` for Date type and `1 second` for DateTime type.
`INTERPOLATE` can be applied to columns not participating in `ORDER BY WITH FILL`. Such columns are filled based on previous field value by applying `expr`.

Example of a query without `WITH FILL`:

Expand Down Expand Up @@ -483,4 +485,62 @@ Result:
└────────────┴────────────┴──────────┘
```

Example of a query without `INTERPOLATE`:

``` sql
SELECT n, source, inter FROM (
SELECT toFloat32(number % 10) AS n, 'original' AS source, number as inter
FROM numbers(10) WHERE number % 3 = 1
) ORDER BY n WITH FILL FROM 0 TO 5.51 STEP 0.5;
```

Result:

``` text
┌───n─┬─source───┬─inter─┐
│ 0 │ │ 0 │
│ 0.5 │ │ 0 │
│ 1 │ original │ 1 │
│ 1.5 │ │ 0 │
│ 2 │ │ 0 │
│ 2.5 │ │ 0 │
│ 3 │ │ 0 │
│ 3.5 │ │ 0 │
│ 4 │ original │ 4 │
│ 4.5 │ │ 0 │
│ 5 │ │ 0 │
│ 5.5 │ │ 0 │
│ 7 │ original │ 7 │
└─────┴──────────┴───────┘
```

Same query after applying `INTERPOLATE`:

``` sql
SELECT n, source, inter FROM (
SELECT toFloat32(number % 10) AS n, 'original' AS source, number as inter
FROM numbers(10) WHERE number % 3 = 1
) ORDER BY n WITH FILL FROM 0 TO 5.51 STEP 0.5 INTERPOLATE (inter AS inter + 1);
yakov-olkhovskiy marked this conversation as resolved.
Show resolved Hide resolved
```

Result:

``` text
┌───n─┬─source───┬─inter─┐
│ 0 │ │ 0 │
│ 0.5 │ │ 0 │
│ 1 │ original │ 1 │
│ 1.5 │ │ 2 │
│ 2 │ │ 3 │
│ 2.5 │ │ 4 │
│ 3 │ │ 5 │
│ 3.5 │ │ 6 │
│ 4 │ original │ 4 │
│ 4.5 │ │ 5 │
│ 5 │ │ 6 │
│ 5.5 │ │ 7 │
│ 7 │ original │ 7 │
└─────┴──────────┴───────┘
```

[Original article](https://clickhouse.com/docs/en/sql-reference/statements/select/order-by/) <!--hide-->
2 changes: 1 addition & 1 deletion docs/ru/sql-reference/statements/select/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ SELECT [DISTINCT [ON (column1, column2, ...)]] expr_list
[WHERE expr]
[GROUP BY expr_list] [WITH ROLLUP|WITH CUBE] [WITH TOTALS]
[HAVING expr]
[ORDER BY expr_list] [WITH FILL] [FROM expr] [TO expr] [STEP expr]
[ORDER BY expr_list] [WITH FILL] [FROM expr] [TO expr] [STEP expr] [INTERPOLATE (expr_list)]
[LIMIT [offset_value, ]n BY columns]
[LIMIT [n, ]m] [WITH TIES]
[SETTINGS ...]
Expand Down
58 changes: 58 additions & 0 deletions docs/ru/sql-reference/statements/select/order-by.md
Original file line number Diff line number Diff line change
Expand Up @@ -280,6 +280,7 @@ SELECT * FROM collate_test ORDER BY s ASC COLLATE 'en';

```sql
ORDER BY expr [WITH FILL] [FROM const_expr] [TO const_expr] [STEP const_numeric_expr], ... exprN [WITH FILL] [FROM expr] [TO expr] [STEP numeric_expr]
[INTERPOLATE (col AS expr, ... colN AS exprN)]
```

`WITH FILL` может быть применен к полям с числовыми (все разновидности float, int, decimal) или временными (все разновидности Date, DateTime) типами. В случае применения к полям типа `String` недостающие значения заполняются пустой строкой.
Expand All @@ -289,6 +290,8 @@ ORDER BY expr [WITH FILL] [FROM const_expr] [TO const_expr] [STEP const_numeric_

Когда `STEP const_numeric_expr` не указан, тогда используется `1.0` для числовых типов, `1 день` для типа Date и `1 секунда` для типа DateTime.

`INTERPOLATE` может быть применен к колонкам, не участвующим в `ORDER BY WITH FILL`. Такие колонки заполняются значениями, вычисляемыми применением `expr` к предыдущему значению.

Пример запроса без использования `WITH FILL`:
```sql
SELECT n, source FROM (
Expand Down Expand Up @@ -395,3 +398,58 @@ ORDER BY
│ 1970-03-12 │ 1970-01-08 │ original │
└────────────┴────────────┴──────────┘
```

Пример запроса без `INTERPOLATE`:

``` sql
SELECT n, source, inter FROM (
SELECT toFloat32(number % 10) AS n, 'original' AS source, number as inter
FROM numbers(10) WHERE number % 3 = 1
) ORDER BY n WITH FILL FROM 0 TO 5.51 STEP 0.5;
```

Результат:
``` text
┌───n─┬─source───┬─inter─┐
│ 0 │ │ 0 │
│ 0.5 │ │ 0 │
│ 1 │ original │ 1 │
│ 1.5 │ │ 0 │
│ 2 │ │ 0 │
│ 2.5 │ │ 0 │
│ 3 │ │ 0 │
│ 3.5 │ │ 0 │
│ 4 │ original │ 4 │
│ 4.5 │ │ 0 │
│ 5 │ │ 0 │
│ 5.5 │ │ 0 │
│ 7 │ original │ 7 │
└─────┴──────────┴───────┘
```

Тот же запрос с `INTERPOLATE`:

``` sql
SELECT n, source, inter FROM (
SELECT toFloat32(number % 10) AS n, 'original' AS source, number as inter
FROM numbers(10) WHERE number % 3 = 1
) ORDER BY n WITH FILL FROM 0 TO 5.51 STEP 0.5 INTERPOLATE (inter AS inter + 1);
```

Результат:
``` text
┌───n─┬─source───┬─inter─┐
│ 0 │ │ 0 │
│ 0.5 │ │ 0 │
│ 1 │ original │ 1 │
│ 1.5 │ │ 2 │
│ 2 │ │ 3 │
│ 2.5 │ │ 4 │
│ 3 │ │ 5 │
│ 3.5 │ │ 6 │
│ 4 │ original │ 4 │
│ 4.5 │ │ 5 │
│ 5 │ │ 6 │
│ 5.5 │ │ 7 │
│ 7 │ original │ 7 │
└─────┴──────────┴───────┘
63 changes: 63 additions & 0 deletions src/Core/InterpolateDescription.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
#include <Core/Block.h>
#include <IO/Operators.h>
#include <Common/JSONBuilder.h>
#include <Core/InterpolateDescription.h>
#include <Interpreters/convertFieldToType.h>

namespace DB
{

void dumpInterpolateDescription(const InterpolateDescription & description, const Block & /*header*/, WriteBuffer & out)
{
bool first = true;

for (const auto & desc : description)
{
if (!first)
out << ", ";
first = false;

if (desc.column.name.empty())
out << "?";
else
out << desc.column.name;
}
}

void InterpolateColumnDescription::interpolate(Field & field) const
yakov-olkhovskiy marked this conversation as resolved.
Show resolved Hide resolved
{
if (field.isNull())
return;
Block expr_columns;
Field column_field = convertFieldToType(field, *column.type.get());
expr_columns.insert({column.type->createColumnConst(1, column_field), column.type, column.name});
actions->execute(expr_columns);
field = convertFieldToType((*expr_columns.getByPosition(0).column)[0], *column.type.get());
}

void InterpolateColumnDescription::explain(JSONBuilder::JSONMap & map, const Block & /*header*/) const
{
map.add("Column", column.name);
}

std::string dumpInterpolateDescription(const InterpolateDescription & description)
{
WriteBufferFromOwnString wb;
dumpInterpolateDescription(description, Block{}, wb);
return wb.str();
}

JSONBuilder::ItemPtr explainInterpolateDescription(const InterpolateDescription & description, const Block & header)
{
auto json_array = std::make_unique<JSONBuilder::JSONArray>();
for (const auto & descr : description)
{
auto json_map = std::make_unique<JSONBuilder::JSONMap>();
descr.explain(*json_map, header);
json_array->add(std::move(json_map));
}

return json_array;
}

}
70 changes: 70 additions & 0 deletions src/Core/InterpolateDescription.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
#pragma once

#include <vector>
#include <memory>
#include <cstddef>
#include <string>
#include <Core/Field.h>
#include <Core/SettingsEnums.h>
#include <Common/IntervalKind.h>
#include <Parsers/ASTOrderByElement.h>
#include <Parsers/ASTInterpolateElement.h>
#include <Functions/FunctionsMiscellaneous.h>

class Collator;

namespace DB
{

namespace JSONBuilder
{
class JSONMap;
class IItem;
using ItemPtr = std::unique_ptr<IItem>;
}

class Block;


/// Interpolate description
struct InterpolateColumnDescription
{
using Signature = ExecutableFunctionExpression::Signature;
yakov-olkhovskiy marked this conversation as resolved.
Show resolved Hide resolved

ColumnWithTypeAndName column;
ExpressionActionsPtr actions;
yakov-olkhovskiy marked this conversation as resolved.
Show resolved Hide resolved

explicit InterpolateColumnDescription(const ColumnWithTypeAndName & column_, ExpressionActionsPtr actions_) :
column(column_), actions(actions_) {}

bool operator == (const InterpolateColumnDescription & other) const
{
return column == other.column;
}

bool operator != (const InterpolateColumnDescription & other) const
{
return !(*this == other);
}

void interpolate(Field & field) const;

std::string dump() const
{
return fmt::format("{}", column.name);
}

void explain(JSONBuilder::JSONMap & map, const Block & header) const;
};

/// Description of interpolation for several columns.
using InterpolateDescription = std::vector<InterpolateColumnDescription>;

/// Outputs user-readable description into `out`.
void dumpInterpolateDescription(const InterpolateDescription & description, const Block & header, WriteBuffer & out);

std::string dumpInterpolateDescription(const InterpolateDescription & description);

JSONBuilder::ItemPtr explainInterpolateDescription(const InterpolateDescription & description, const Block & header);

}
Loading