Skip to content

Commit

Permalink
DOCS-510: runningAccumulate (#12061)
Browse files Browse the repository at this point in the history
* asiana21-DOCSUP-797 (#117)

* docs(runningAccumulate): the function description is added

* docs(runningAccumulate): the function description is modified

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* docs(runningAccumulate): some changes

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* docs(runningAccumulate): added ru translation

Co-authored-by: asiana21 <asiana21@yandex-team.ru>
Co-authored-by: BayoNet <da-daos@yandex.ru>

* CLICKHOUSEDOCS-510: Minor fix.

* Update docs/en/sql-reference/functions/other-functions.md

Co-authored-by: Ivan Blinkov <github@blinkov.ru>

* CLICKHOUSEDOCS-510: Fixed links.

Co-authored-by: AsiaKorushkina <43650329+AsiaKorushkina@users.noreply.github.com>
Co-authored-by: asiana21 <asiana21@yandex-team.ru>
Co-authored-by: Sergei Shtykov <bayonet@yandex-team.ru>
Co-authored-by: Ivan Blinkov <github@blinkov.ru>
  • Loading branch information
5 people committed Jul 1, 2020
1 parent 03a643e commit 41fed28
Show file tree
Hide file tree
Showing 5 changed files with 210 additions and 10 deletions.
2 changes: 1 addition & 1 deletion docs/en/development/architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -120,7 +120,7 @@ There are ordinary functions and aggregate functions. For aggregate functions, s

Ordinary functions don’t change the number of rows – they work as if they are processing each row independently. In fact, functions are not called for individual rows, but for `Block`’s of data to implement vectorized query execution.

There are some miscellaneous functions, like [blockSize](../sql-reference/functions/other-functions.md#function-blocksize), [rowNumberInBlock](../sql-reference/functions/other-functions.md#function-rownumberinblock), and [runningAccumulate](../sql-reference/functions/other-functions.md#function-runningaccumulate), that exploit block processing and violate the independence of rows.
There are some miscellaneous functions, like [blockSize](../sql-reference/functions/other-functions.md#function-blocksize), [rowNumberInBlock](../sql-reference/functions/other-functions.md#function-rownumberinblock), and [runningAccumulate](../sql-reference/functions/other-functions.md#runningaccumulatexploit block processing and violate the independence of rows.

ClickHouse has strong typing, so there’s no implicit type conversion. If a function doesn't support a specific combination of types, it throws an exception. But functions can work (be overloaded) for many different combinations of types. For example, the `plus` function (to implement the `+` operator) works for any combination of numeric types: `UInt8` + `Float32`, `UInt16` + `Int8`, and so on. Also, some variadic functions can accept any number of arguments, such as the `concat` function.

Expand Down
2 changes: 1 addition & 1 deletion docs/en/sql-reference/aggregate-functions/combinators.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ To work with these states, use:

- [AggregatingMergeTree](../../engines/table-engines/mergetree-family/aggregatingmergetree.md) table engine.
- [finalizeAggregation](../../sql-reference/functions/other-functions.md#function-finalizeaggregation) function.
- [runningAccumulate](../../sql-reference/functions/other-functions.md#function-runningaccumulate) function.
- [runningAccumulate](../../sql-reference/functions/other-functions.md#runningaccumulate) function.
- [-Merge](#aggregate_functions_combinators-merge) combinator.
- [-MergeState](#aggregate_functions_combinators-mergestate) combinator.

Expand Down
107 changes: 103 additions & 4 deletions docs/en/sql-reference/functions/other-functions.md
Original file line number Diff line number Diff line change
Expand Up @@ -1054,11 +1054,110 @@ Result:

Takes state of aggregate function. Returns result of aggregation (finalized state).

## runningAccumulate {#function-runningaccumulate}
## runningAccumulate {#runningaccumulate}

Takes the states of the aggregate function and returns a column with values, are the result of the accumulation of these states for a set of block lines, from the first to the current line.
For example, takes state of aggregate function (example runningAccumulate(uniqState(UserID))), and for each row of block, return result of aggregate function on merge of states of all previous rows and current row.
So, result of function depends on partition of data to blocks and on order of data in block.
Accumulates states of an aggregate function for each row of a data block.

!!! warning "Warning"
The state is reset for each new data block.

**Syntax**

```sql
runningAccumulate(agg_state[, grouping]);
```

**Parameters**

- `agg_state` — State of the aggregate function. [AggregateFunction](../../sql-reference/data-types/aggregatefunction.md#data-type-aggregatefunction).
- `grouping` — Grouping key. Optional. The state of the function is reset if the `grouping` value is changed. It can be any of the [supported data types](../../sql-reference/data-types/index.md) for which the equality operator is defined.

**Returned value**

- Each resulting row contains a result of the aggregate function, accumulated for all the input rows from 0 to the current position. `runningAccumulate` resets states for each new data block or when the `grouping` value changes.

Type depends on the aggregate function used.

**Examples**

Consider how you can use `runningAccumulate` to find the cumulative sum of numbers without and with grouping.

Query:

```sql
SELECT k, runningAccumulate(sum_k) AS res FROM (SELECT number as k, sumState(k) AS sum_k FROM numbers(10) GROUP BY k ORDER BY k);
```

Result:

```text
┌─k─┬─res─┐
│ 0 │ 0 │
│ 1 │ 1 │
│ 2 │ 3 │
│ 3 │ 6 │
│ 4 │ 10 │
│ 5 │ 15 │
│ 6 │ 21 │
│ 7 │ 28 │
│ 8 │ 36 │
│ 9 │ 45 │
└───┴─────┘
```

The subquery generates `sumState` for every number from `0` to `9`. `sumState` returns the state of the [sum](../aggregate-functions/reference/sum.md) function that contains the sum of a single number.

The whole query does the following:

1. For the first row, `runningAccumulate` takes `sumState(0)` and returns `0`.
2. For the second row, the function merges `sumState(0)` and `sumState(1)` resulting in `sumState(0 + 1)`, and returns `1` as a result.
3. For the third row, the function merges `sumState(0 + 1)` and `sumState(2)` resulting in `sumState(0 + 1 + 2)`, and returns `3` as a result.
4. The actions are repeated until the block ends.

The following example shows the `groupping` parameter usage:

Query:

```sql
SELECT
grouping,
item,
runningAccumulate(state, grouping) AS res
FROM
(
SELECT
toInt8(number / 4) AS grouping,
number AS item,
sumState(number) AS state
FROM numbers(15)
GROUP BY item
ORDER BY item ASC
);
```

Result:

```text
┌─grouping─┬─item─┬─res─┐
│ 0 │ 0 │ 0 │
│ 0 │ 1 │ 1 │
│ 0 │ 2 │ 3 │
│ 0 │ 3 │ 6 │
│ 1 │ 4 │ 4 │
│ 1 │ 5 │ 9 │
│ 1 │ 6 │ 15 │
│ 1 │ 7 │ 22 │
│ 2 │ 8 │ 8 │
│ 2 │ 9 │ 17 │
│ 2 │ 10 │ 27 │
│ 2 │ 11 │ 38 │
│ 3 │ 12 │ 12 │
│ 3 │ 13 │ 25 │
│ 3 │ 14 │ 39 │
└──────────┴──────┴─────┘
```

As you can see, `runningAccumulate` merges states for each group of rows separately.

## joinGet {#joinget}

Expand Down
2 changes: 1 addition & 1 deletion docs/ru/sql-reference/aggregate-functions/combinators.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@

- Движок таблиц [AggregatingMergeTree](../../engines/table-engines/mergetree-family/aggregatingmergetree.md).
- Функция [finalizeAggregation](../../sql-reference/aggregate-functions/combinators.md#function-finalizeaggregation).
- Функция [runningAccumulate](../../sql-reference/aggregate-functions/combinators.md#function-runningaccumulate).
- Функция [runningAccumulate](../../sql-reference/aggregate-functions/combinators.md#runningaccumulate).
- Комбинатор [-Merge](#aggregate_functions_combinators-merge).
- Комбинатор [-MergeState](#aggregate_functions_combinators-mergestate).

Expand Down
107 changes: 104 additions & 3 deletions docs/ru/sql-reference/functions/other-functions.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Прочие функции {#prochie-funktsii}
# Прочие функции {#other-functions}

## hostName() {#hostname}

Expand Down Expand Up @@ -1036,9 +1036,110 @@ SELECT formatReadableSize(filesystemCapacity()) AS "Capacity", toTypeName(filesy

Принимает состояние агрегатной функции. Возвращает результат агрегирования.

## runningAccumulate {#function-runningaccumulate}
## runningAccumulate {#runningaccumulate}

Принимает на вход состояния агрегатной функции и возвращает столбец со значениями, которые представляют собой результат мёржа этих состояний для выборки строк из блока от первой до текущей строки. Например, принимает состояние агрегатной функции (например, `runningAccumulate(uniqState(UserID))`), и для каждой строки блока возвращает результат агрегатной функции после мёржа состояний функции для всех предыдущих строк и текущей. Таким образом, результат зависит от разбиения данных по блокам и от порядка данных в блоке.
Накапливает состояния агрегатной функции для каждой строки блока данных.

!!! warning "Warning"
Функция обнуляет состояние для каждого нового блока.

**Синтаксис**

```sql
runningAccumulate(agg_state[, grouping]);
```

**Параметры**

- `agg_state` — Состояние агрегатной функции. [AggregateFunction](../../sql-reference/data-types/aggregatefunction.md#data-type-aggregatefunction).
- `grouping` — Ключ группировки. Опциональный параметр. Состояние функции обнуляется, если значение `grouping` меняется. Параметр может быть любого [поддерживаемого типа данных](../../sql-reference/data-types/index.md), для которого определен оператор равенства.

**Возвращаемое значение**

- Каждая результирующая строка содержит результат агрегатной функции, накопленный для всех входных строк от 0 до текущей позиции. `runningAccumulate` обнуляет состояния для каждого нового блока данных или при изменении значения `grouping`.

Тип зависит от используемой агрегатной функции.

**Примеры**

Рассмотрим примеры использования `runningAccumulate` для нахождения кумулятивной суммы чисел без и с группировкой.

Запрос:

```sql
SELECT k, runningAccumulate(sum_k) AS res FROM (SELECT number as k, sumState(k) AS sum_k FROM numbers(10) GROUP BY k ORDER BY k);
```

Результат:

```text
┌─k─┬─res─┐
│ 0 │ 0 │
│ 1 │ 1 │
│ 2 │ 3 │
│ 3 │ 6 │
│ 4 │ 10 │
│ 5 │ 15 │
│ 6 │ 21 │
│ 7 │ 28 │
│ 8 │ 36 │
│ 9 │ 45 │
└───┴─────┘
```

Подзапрос формирует `sumState` для каждого числа от `0` до `9`. `sumState` возвращает состояние функции [sum](../../sql-reference/aggregate-functions/reference.md#agg_function-sum), содержащее сумму одного числа.

Весь запрос делает следующее:

1. Для первой строки `runningAccumulate` берет `sumState(0)` и возвращает `0`.
2. Для второй строки функция объединяет `sumState (0)` и `sumState (1)`, что приводит к `sumState (0 + 1)`, и возвращает в результате `1`.
3. Для третьей строки функция объединяет `sumState (0 + 1)` и `sumState (2)`, что приводит к `sumState (0 + 1 + 2)`, и в результате возвращает `3`.
4. Действия повторяются до тех пор, пока не закончится блок.

В следующем примере показано использование параметра `grouping`:

Запрос:

```sql
SELECT
grouping,
item,
runningAccumulate(state, grouping) AS res
FROM
(
SELECT
toInt8(number / 4) AS grouping,
number AS item,
sumState(number) AS state
FROM numbers(15)
GROUP BY item
ORDER BY item ASC
);
```

Результат:

```text
┌─grouping─┬─item─┬─res─┐
│ 0 │ 0 │ 0 │
│ 0 │ 1 │ 1 │
│ 0 │ 2 │ 3 │
│ 0 │ 3 │ 6 │
│ 1 │ 4 │ 4 │
│ 1 │ 5 │ 9 │
│ 1 │ 6 │ 15 │
│ 1 │ 7 │ 22 │
│ 2 │ 8 │ 8 │
│ 2 │ 9 │ 17 │
│ 2 │ 10 │ 27 │
│ 2 │ 11 │ 38 │
│ 3 │ 12 │ 12 │
│ 3 │ 13 │ 25 │
│ 3 │ 14 │ 39 │
└──────────┴──────┴─────┘
```

Как вы можете видеть, `runningAccumulate` объединяет состояния для каждой группы строк отдельно.

## joinGet {#joinget}

Expand Down

0 comments on commit 41fed28

Please sign in to comment.