Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Different results are returned at every execution by executing same query which contains several UNIONs and JOINs in version 20.3.3. #9826

Closed
lu-yuki-matsumoto opened this issue Mar 23, 2020 · 8 comments · Fixed by #10099
Assignees
Labels
bug Confirmed user-visible misbehaviour in official release comp-joins JOINs comp-processors st-need-info We need extra data to continue

Comments

@lu-yuki-matsumoto
Copy link

lu-yuki-matsumoto commented Mar 23, 2020

Overview

Different results are returned as result of executing the same query which commands joining several tables and union each tables.

Sometimes results of executing the query lacking some joined tables.

When I execute it several times, joined tables which are heavier and/or in latter position of query are lacked in more trial.

It seems that a bug occurs on executing parallel processing, and it maybe union executing before fetching all intermediate tables.

This occurs on clickhouse version20.3.3.

How to Reproduce

I execute below query.
This query is expected returning 8 tableNames, but it returns different number of tableNames at every execution.

Executing Query

SELECT
    tableName
FROM
    (
    queryNotJoin('T1_notJoin1')    
    union ALL
    queryFilteredAfterJoin('T2_filteredAfterJoin1', '2020-01-01')
    union ALL
    queryFilteredAfterJoin('T3_filteredAfterJoin2', '2020-01-02')
    union ALL
    queryFilteredBeforeJoin('T4_filteredBeforeJoin1', '2020-01-01')
    union ALL
    queryFilteredBeforeJoin('T5_filteredBeforeJoin2', '2020-01-02')
    union all
    queryFilteredAfterJoin('T6_filteredAfterJoin3', '2020-01-03')
    union all
    queryNotJoin('T7_notJoin2')
    union ALL
    queryFilteredBeforeJoin('T8_filteredBeforeJoin3', '2020-01-03')
    ) as a
group BY 
    tableName
order BY
    tableName

In above query, using below function-like table aliases to make it easier to read.

queryNotJoin(name)
 select
     col1, 
     '{name}' as tableName,
     count(*) as c
 from
     tableA

queryFilteringAfterJoin(name, date)
 select
     a.col1,
     '{name}' as tableName,
     count(*) as c
 from
     tableB as b
     inner JOIN
     tableA as a
     on a.id = b.Aid
 where
     a.colDate = '{date}'
 group by
     a.col1
   
queryFilteringBeforeJoin(name, date)
 select
     a.col1,
     '{name}' as tableName,
     count(*) as c
 from
     tableB as b
     inner JOIN
     (select col1, id from tableA WHERE colDate = '{date}') as a
     on a.id = b.Aid
 group by
     a.col1

tableA has  9.6G columns.
tableB has 59.9K columns.

Testing and consideration

I executed the query 100 times and counted values in tableName rows.

'T1_notJoin1'           : 100
'T2_filteredAfterJoin1' :  59
'T3_filteredAfterJoin2' :  47
'T4_filteredBeforeJoin1': 100
'T5_filteredBeforeJoin2':  95
'T6_filteredAfterJoin3' :  11
'T7_notJoin2'           : 100
'T8_filteredBeforeJoin3' : 54

*When executing alone, queryNotJoin, queryFilteredBeforeJoin and queryFilteredAfterJoin are spent less time in this order.

  • Comparing same joined table query, result of in latter position of query result is more lacked.

  • Comparing different joined table query, result of slower query is more lacked.

I guess that this problem is caused by that the UNION is executed without waiting to done processes of fetching joined table.

@lu-yuki-matsumoto lu-yuki-matsumoto added the bug Confirmed user-visible misbehaviour in official release label Mar 23, 2020
@lu-yuki-matsumoto lu-yuki-matsumoto changed the title Different results at every execution are returned by executing a query which contains several UNIONs and JOINs in version 20.3.3. Different results are returned at every execution by executing same query which contains several UNIONs and JOINs in version 20.3.3. Mar 23, 2020
@qoega
Copy link
Member

qoega commented Mar 24, 2020

@KochetovNicolai can you check it is not related to processors?

@lu-yuki-matsumoto Can you check that it is reproducing even with --experimental_use_processors 0 setting?

@lu-yuki-matsumoto
Copy link
Author

lu-yuki-matsumoto commented Mar 24, 2020

@qoega

I checked the query with --experimental_use_processors 0 20 times, Clickhouse returned 8 values in every execution.
It is not reproducing with --experimental_use_processors 0 setting.

Thank you,

@KochetovNicolai
Copy link
Member

Hi @lu-yuki-matsumoto
I am trying to reproduce this bug, but have not succeed so far.

Could you please show me the whole query and create table statements?
If you could also send me table's data, it would be very helpful.

Or, probably, find a smaller reproducible example (like, create some tables and insert toy data into them).

@filimonov filimonov added comp-processors st-need-info We need extra data to continue labels Mar 31, 2020
@KochetovNicolai
Copy link
Member

Things I tried:

create table tableA (id UInt64, col1 UInt64, colDate Date) engine = MergeTree order by colDate;
create table tableB (Aid UInt64) engine = MergeTree order by tuple();

insert into tableA select number, number % 10, '2020-01-01' from numbers(10000000);
insert into tableA select number, number % 100000, '2020-01-02' from numbers(10000000);
insert into tableA select number, number, '2020-01-03' from numbers(10000000);

insert into tableB select number * 2 from numbers(100000);

set max_memory_usage = 40000000000;

SELECT tableName
FROM 
(
    SELECT 
        col1, 
        'T1_notJoin1' AS tableName, 
        count(*) AS c
    FROM tableA
    GROUP BY col1
    UNION ALL
    SELECT 
        a.col1, 
        'T2_filteredAfterJoin1' AS tableName, 
        count(*) AS c
    FROM tableB AS b
    INNER JOIN tableA AS a ON a.id = b.Aid
    WHERE a.colDate = '2020-01-01'
    GROUP BY a.col1
    UNION ALL
    SELECT 
        a.col1, 
        'T3_filteredAfterJoin2' AS tableName, 
        count(*) AS c
    FROM tableB AS b
    INNER JOIN tableA AS a ON a.id = b.Aid
    WHERE a.colDate = '2020-01-02'
    GROUP BY a.col1
    UNION ALL
    SELECT 
        a.col1, 
        'T4_filteredBeforeJoin1' AS tableName, 
        count(*) AS c
    FROM tableB AS b
    INNER JOIN 
    (
        SELECT 
            col1, 
            id
        FROM tableA
        WHERE colDate = '2020-01-01'
    ) AS a ON a.id = b.Aid
    GROUP BY a.col1
    UNION ALL
    SELECT 
        a.col1, 
        'T5_filteredBeforeJoin2' AS tableName, 
        count(*) AS c
    FROM tableB AS b
    INNER JOIN 
    (
        SELECT 
            col1, 
            id
        FROM tableA
        WHERE colDate = '2020-01-02'
    ) AS a ON a.id = b.Aid
    GROUP BY a.col1
    UNION ALL
    SELECT 
        a.col1, 
        'T6_filteredAfterJoin3' AS tableName, 
        count(*) AS c
    FROM tableB AS b
    INNER JOIN tableA AS a ON a.id = b.Aid
    WHERE a.colDate = '2020-01-03'
    GROUP BY a.col1
    UNION ALL
    SELECT 
        col1, 
        'T7_notJoin2' AS tableName, 
        count(*) AS c
    FROM tableA
    GROUP BY col1
    UNION ALL
    SELECT 
        a.col1, 
        'T8_filteredBeforeJoin3' AS tableName, 
        count(*) AS c
    FROM tableB AS b
    INNER JOIN 
    (
        SELECT 
            col1, 
            id
        FROM tableA
        WHERE colDate = '2020-01-03'
    ) AS a ON a.id = b.Aid
    GROUP BY a.col1
) AS a
GROUP BY tableName
ORDER BY tableName ASC

┌─tableName──────────────┐
│ T1_notJoin1            │
│ T2_filteredAfterJoin1  │
│ T3_filteredAfterJoin2  │
│ T4_filteredBeforeJoin1 │
│ T5_filteredBeforeJoin2 │
│ T6_filteredAfterJoin3  │
│ T7_notJoin2            │
│ T8_filteredBeforeJoin3 │
└────────────────────────┘

Still can't reproduce.

@lu-yuki-matsumoto
Copy link
Author

lu-yuki-matsumoto commented Apr 3, 2020

@KochetovNicolai

I'm sorry to be very late.
In our case,

CREATE TABLE tableB
(`id` UInt64,
 `colA` UInt32,
 `colB` UInt32,
 `colC` Enum8(2 vals),
 `Aid` UInt32,
 `colD` Enum8(8 vals),
 `colE` UInt8,
 `colF` UInt8,
 `colG` UInt8,
 `colH` DateTime,
 `colDate` Date
 ) ENGINE = ReplacingMergeTree(colDate, id, 8192)
 
CREATE TABLE tableA
 (`id` UInt32,
  `colJ` UInt32,
  `colK` String,
  `colL` Enum8(11 vals),
  `colM` Enum8(12 vals),
  `colN` DateTime,
  `colO` DateTime,
  `colP` String,
  `colQ` String,
  `colR` Enum8(4 vals),
  `colS` DateTime,
  `colT` DateTime,
  `colU` DateTime,
  `colV` Date,
  `colW` UInt32) ENGINE = ReplacingMergeTree(colV, id, 8192)

max_memory_usage = 100000000000;
max_memory_usage_for_all_queries =  130000000000;  

@KochetovNicolai
Copy link
Member

@lu-yuki-matsumoto thank you.
Is the query I used (in my previous comment) the same as the your query?

I still can't reproduce the issue on my artificial generated data.
You may help me to investigate this issue:

  • Try to reduce the query. Usually, most parts of the query can be removed, until you get a small query which returns incorrect result (at least sometimes)
  • Make completely reproducible example. Create empty tables, insert artificially generated data into them, and try to reproduce over it.
  • Or send your real data, if you can. (Only parts for affected period are needed. They should start from 202001 for your tables.)

It will help me a lot if you do anything of that.

@lu-yuki-matsumoto
Copy link
Author

lu-yuki-matsumoto commented Apr 6, 2020

@KochetovNicolai

I should apologize for you about mistake I made in my query.
Group column is in tableA.

Below query can reproduce only case of returning 7 values.
(I have not find artificial data to reproduce cases of returning 4-6 values yet.)

create table tableA (id UInt64, col1 UInt64, colDate Date) engine = ReplacingMergeTree(colDate, id, 8192);
create table tableB (id UInt64, Aid UInt64, colDate Date) engine = ReplacingMergeTree(colDate, id, 8192);

insert into tableA select number, number % 10, addDays(toDate('2020-01-01'), - number % 1000) from numbers(100000);
insert into tableB select number, number % 100000, addDays(toDate('2020-01-01'), number % 90) from numbers(50000000);

SELECT tableName
FROM 
    (
        SELECT 
            col1, 
            'T1_notJoin1' AS tableName, 
            count(*) AS c
        FROM tableA
        GROUP BY col1
        UNION ALL
        SELECT 
            a.col1, 
            'T2_filteredAfterJoin1' AS tableName, 
            count(*) AS c
        FROM tableB AS b
        INNER JOIN tableA AS a ON a.id = b.Aid
        WHERE b.colDate = '2020-01-01'
        GROUP BY a.col1
        UNION ALL
        SELECT 
            a.col1, 
            'T3_filteredAfterJoin2' AS tableName, 
            count(*) AS c
        FROM tableB AS b
            INNER JOIN
            tableA AS a
            ON a.id = b.Aid
        WHERE b.colDate = '2020-01-02'
        GROUP BY a.col1
        UNION ALL
        SELECT 
            a.col1, 
            'T4_filteredBeforeJoin1' AS tableName, 
            count(*) AS c
        FROM tableA AS a
        INNER JOIN 
        (
            SELECT 
                Aid
            FROM tableB
            WHERE colDate = '2020-01-01'
        ) AS b ON a.id = b.Aid
        GROUP BY a.col1
        UNION ALL
        SELECT 
            a.col1, 
            'T5_filteredBeforeJoin2' AS tableName, 
            count(*) AS c
        FROM tableA AS a
        INNER JOIN 
        (
            SELECT 
                Aid
            FROM tableB
            WHERE colDate = '2020-01-02'
        ) AS b ON a.id = b.Aid
        GROUP BY a.col1
        UNION ALL
        SELECT 
            a.col1, 
            'T6_filteredAfterJoin3' AS tableName, 
            count(*) AS c
        FROM tableB AS b
        INNER JOIN tableA AS a ON a.id = b.Aid
        WHERE b.colDate = '2020-01-03'
        GROUP BY a.col1
        UNION ALL
        SELECT 
            col1, 
            'T7_notJoin2' AS tableName, 
            count(*) AS c
        FROM tableA
        GROUP BY col1
        UNION ALL
        SELECT 
            a.col1, 
            'T8_filteredBeforeJoin3' AS tableName, 
            count(*) AS c
        FROM tableA AS a
        INNER JOIN 
        (
            SELECT 
                Aid
            FROM tableB
            WHERE colDate = '2020-01-03'
        ) AS b ON a.id = b.Aid
        GROUP BY a.col1
    ) AS a
GROUP BY tableName
ORDER BY tableName ASC;

@KochetovNicolai
Copy link
Member

@lu-yuki-matsumoto thank you so much!
Your test perfectly reproduces the case you have described.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Confirmed user-visible misbehaviour in official release comp-joins JOINs comp-processors st-need-info We need extra data to continue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants