Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORDER BY WITH FILL with last values #34903

Closed
infisamk opened this issue Feb 25, 2022 · 4 comments · Fixed by #35349
Closed

ORDER BY WITH FILL with last values #34903

infisamk opened this issue Feb 25, 2022 · 4 comments · Fixed by #35349
Assignees
Labels
feature warmup task The task for new ClickHouse team members. Low risk, moderate complexity, no urgency.

Comments

@infisamk
Copy link

There is a table with balances. All balances are included in this table. Some balances can change several times a day, and some balances can change at all once in all time.

create table aaa (
    d Date,
    dd DateTime,
    balanceAccountId UInt64,
    activeBalance Int64,
    inactiveBalance Int64
) ENGINE Memory();

insert into aaa (d, dd, balanceAccountId, activeBalance, inactiveBalance) VALUES
('2022-01-03', '2022-01-03 15:44:21', 1, 1000, 0),
('2022-01-05', '2022-01-05 10:01:56', 2, 2000, 100),
('2022-01-26', '2022-01-26 08:30:23', 3, 0, 500),
('2022-01-26', '2022-01-26 16:25:41', 2, 1000, 0),
('2022-01-26', '2022-01-26 17:09:54', 3, 100, 500),
('2022-02-10', '2022-02-10 21:01:33', 1, 2000, 0);

At the same time, for example, for any day you need to receive the amount for all balances. You also need to receive balance sheet amounts by day for the last month, for example. And so on.

select any(d) d, sum(aBalance)active, sum(iBalance) inactive, active + inactive balance
from (
    with date(dateSub(DAY, 15, NOW())) as current_date
    select current_date d, argMax(activeBalance, dd) aBalance, argMax(inactiveBalance, dd) iBalance
    from aaa
    where dd <= current_date
    group by balanceAccountId
);

+----------+------+--------+-------+
|d         |active|inactive|balance|
+----------+------+--------+-------+
|2022-02-09|2100  |500     |2600   |
+----------+------+--------+-------+

select any(d) d, sum(aBalance)active, sum(iBalance) inactive, active + inactive balance
from (
    with date(dateSub(DAY, 10, NOW())) as current_date
    select current_date d, argMax(activeBalance, dd) aBalance, argMax(inactiveBalance, dd) iBalance
    from aaa
    where dd <= current_date
    group by balanceAccountId
);
+----------+------+--------+-------+
|d         |active|inactive|balance|
+----------+------+--------+-------+
|2022-02-14|3100  |500     |3600   |
+----------+------+--------+-------+

If there was an analogue of ORDER BY WITH FILL, but with the saving of data on the last records, and not with default, as is now possible, then it would probably be possible to solve this problem gracefully. But maybe there are other ways?

@alexey-milovidov
Copy link
Member

Makes sense and is easy to implement, we only need to come up with a syntax for it.

@alexey-milovidov alexey-milovidov added the warmup task The task for new ClickHouse team members. Low risk, moderate complexity, no urgency. label Feb 25, 2022
@den-crane
Copy link
Contributor

den-crane commented Feb 25, 2022

we only need to come up with a syntax for it.

SELECT
    n,
    source,
    x
FROM
(
    SELECT
        toFloat32(number % 10) AS n,
        concat('original', toString(number)) AS source,
        number AS x
    FROM numbers(10)
    WHERE (number % 3) = 1
)
ORDER BY n ASC WITH FILL FROM 0 TO 5.51 STEP 0.5

Query id: aa165b61-bd57-4ba4-8522-ff2e82d10110

┌───n─┬─source────┬─x─┐
│   0 │           │ 0 │
│ 0.5 │           │ 0 │
│   1 │ original1 │ 1 │
│ 1.5 │           │ 0 │
│   2 │           │ 0 │
│ 2.5 │           │ 0 │
│   3 │           │ 0 │
│ 3.5 │           │ 0 │
│   4 │ original4 │ 4 │
│ 4.5 │           │ 0 │
│   5 │           │ 0 │
│ 5.5 │           │ 0 │
│   7 │ original7 │ 7 │
└─────┴───────────┴───┘

something like

SELECT
    n,
    source,
    x
FROM
(
    SELECT
        toFloat32(number % 10) AS n, 'original'||toString(number) AS source, number x
    FROM numbers(10)
    WHERE (number % 3) = 1
)
ORDER BY n ASC WITH FILL FROM 0 TO 5.51 STEP 0.5 
INTERPOLATE (source PREVIOUS VALUE source||'xxx', x PREVIOUS VALUE x+1);

┌───n─┬─source────┬─x─┐
│   0 │           │ 0 │
│ 0.5 │           │ 1 │
│   1 │ original1 │ 1 │
│ 1.5 │ original1 │ 2 │
│   2 │ original1 │ 3 │
│ 2.5 │ original1 │ 4 │
│   3 │ original1 │ 5 │
│ 3.5 │ original1 │ 6 │
│   4 │ original4 │ 4 │
│ 4.5 │ original4 │ 5 │
│   5 │ original4 │ 6 │
│ 5.5 │ original4 │ 7 │
│   7 │ original7 │ 7 │
└─────┴───────────┴───┘

@infisamk
Copy link
Author

And if you simplify before adding only one INTERPOLATE statement before FILL?

SELECT
    n,
    source,
    x
FROM
(
    SELECT
        toFloat32(number % 10) AS n,
        concat('original', toString(number)) AS source,
        number AS x
    FROM numbers(10)
    WHERE (number % 3) = 1
)
ORDER BY n ASC WITH INTERPOLATE FILL FROM 0 TO 5.51 STEP 0.5

@yakov-olkhovskiy
Copy link
Member

I will take it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature warmup task The task for new ClickHouse team members. Low risk, moderate complexity, no urgency.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants