Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loss of interactivity with a join #7577

Closed
philip-stoev opened this issue Jul 28, 2021 · 4 comments
Closed

Loss of interactivity with a join #7577

philip-stoev opened this issue Jul 28, 2021 · 4 comments
Assignees
Labels
A-compute Area: compute C-bug Category: something is broken
Projects
Milestone

Comments

@philip-stoev
Copy link
Contributor

philip-stoev commented Jul 28, 2021

What version of Materialize are you using?

v0.8.4-dev (f98fbfdc9)

What was the issue?

Loss of interactivity is observed when the following large join is processed:

CREATE MATERIALIZED VIEW v1 AS
SELECT c_nationkey AS col1138 , c_nationkey AS col1139
FROM lineitem
JOIN orders ON ( l_orderkey = o_orderkey )
JOIN customer ON ( o_custkey = c_custkey )
WHERE c_custkey = ALL (
  SELECT o_custkey FROM orders
  WHERE l_receiptDATE = o_orderdate
  AND l_receiptDATE = o_orderdate - INTERVAL ' 1 MONTHS '
  OR l_receiptDATE <> o_orderdate
  AND  EXISTS (
    SELECT  o_custkey FROM lineitem
    JOIN orders ON ( l_orderkey = o_orderkey )
    WHERE o_totalprice = 62
    OR o_totalprice BETWEEN 3 AND 174 + 22
    AND l_linenumber != 0
  )
)
OR c_acctbal - o_totalprice = 0
AND l_commitDATE <= o_orderdate
AND c_comment NOT LIKE CONCAT( 'l' , '%' )
AND o_orderdate < l_receiptDATE - INTERVAL ' 7 MONTHS ' ;

This is against a very small TPC-C schema and the predicates in this query should all be reasonable. There are realistic join conditions throughout.

Is the issue reproducible? If so, please provide reproduction instructions.

  1. Load this dataset: gh7577.sql.zip
  2. Run the query above
  3. Observe loss of interactivity

Please attach any applicable log files.

The mz_scheduling_histogram report:

materialize=> select 
    mz_catalog.mz_scheduling_histogram.id, 
    mz_catalog.mz_scheduling_histogram.worker, 
    name, 
    dataflow_id, 
    count, 
    duration_ns / 1000000 as duration_ms 
from 
    mz_catalog.mz_scheduling_histogram, 
    mz_catalog.mz_dataflow_operator_dataflows 
where
    mz_catalog.mz_scheduling_histogram.id = mz_catalog.mz_dataflow_operator_dataflows.id and
    mz_catalog.mz_scheduling_histogram.worker = mz_catalog.mz_dataflow_operator_dataflows.worker
 order by 
    duration_ms desc limit 30;
  id  | worker |                    name                     | dataflow_id | count | duration_ms 
------+--------+---------------------------------------------+-------------+-------+-------------
 2505 |      1 | Join                                        |        1991 |     1 |      274877
 2620 |      1 | Dataflow: materialize.public.v1_primary_idx |        1991 |     1 |      274877
 1991 |      1 | Dataflow: materialize.public.v1_primary_idx |        1991 |     1 |      274877
 2620 |      1 | Dataflow: materialize.public.v1_primary_idx |        1991 |     1 |         134
 1991 |      1 | Dataflow: materialize.public.v1_primary_idx |        1991 |     1 |         134
 2285 |      0 | delta query                                 |        1991 |     1 |          67
 2285 |      1 | delta query                                 |        1991 |     1 |          67
 2285 |      2 | delta query                                 |        1991 |     1 |          67
 2285 |      3 | delta query                                 |        1991 |     1 |          67
 2159 |      0 | delta path 0                                |        1991 |     1 |          67
 2159 |      1 | delta path 0                                |        1991 |     1 |          67
 2159 |      2 | delta path 0                                |        1991 |     1 |          67
 2159 |      3 | delta path 0                                |        1991 |     1 |          67
 2620 |      0 | Dataflow: materialize.public.v1_primary_idx |        1991 |     1 |          67
 2620 |      2 | Dataflow: materialize.public.v1_primary_idx |        1991 |     1 |          67
 2620 |      3 | Dataflow: materialize.public.v1_primary_idx |        1991 |     1 |          67
 1991 |      0 | Dataflow: materialize.public.v1_primary_idx |        1991 |     1 |          67
 1991 |      2 | Dataflow: materialize.public.v1_primary_idx |        1991 |     1 |          67
 1991 |      3 | Dataflow: materialize.public.v1_primary_idx |        1991 |     1 |          67
 2145 |      0 | DeltaJoinFinalization                       |        1991 |     1 |          33
 2145 |      1 | DeltaJoinFinalization                       |        1991 |     1 |          33
 2145 |      3 | DeltaJoinFinalization                       |        1991 |     1 |          33
 2620 |      0 | Dataflow: materialize.public.v1_primary_idx |        1991 |     4 |          33
 2620 |      1 | Dataflow: materialize.public.v1_primary_idx |        1991 |     5 |          33
 2620 |      2 | Dataflow: materialize.public.v1_primary_idx |        1991 |     4 |          33
 2620 |      3 | Dataflow: materialize.public.v1_primary_idx |        1991 |     4 |          33
 1991 |      0 | Dataflow: materialize.public.v1_primary_idx |        1991 |     4 |          33
 1991 |      1 | Dataflow: materialize.public.v1_primary_idx |        1991 |     5 |          33
 1991 |      2 | Dataflow: materialize.public.v1_primary_idx |        1991 |     4 |          33
 1991 |      3 | Dataflow: materialize.public.v1_primary_idx |        1991 |     4 |          33

Perf report:

-   52.19%     0.00%  timely:work-1    [unknown]           [.] 0x1056394808568b48                                                              ◆
   - 0x1056394808568b48                                                                                                                        ▒
        22.39% <repr::row::DatumListIter as core::iter::traits::iterator::Iterator>::next                                                      ▒
        15.01% repr::row::read_datum                                                                                                           ▒
        10.39% dataflow::render::join::linear_join::<impl dataflow::render::context::Context<G,repr::row::Row,T>>::differential_join_inner::{{c▒
        2.74% dec::decimal::Decimal<_>::from_raw_parts                                                                                         ▒
        1.07% repr::row::read_lengthed_datum                                                                                                   ▒
+   25.08%    25.05%  timely:work-1    materialized        [.] <repr::row::DatumListIter as core::iter::traits::iterator::Iterator>::next      ▒
+   21.33%    21.31%  timely:work-1    materialized        [.] expr::scalar::func::BinaryFunc::eval                                            ▒
+   16.11%    16.10%  timely:work-1    materialized        [.] repr::row::read_datum                                                           ▒
+   15.36%    15.35%  timely:work-1    materialized        [.] dataflow::render::join::linear_join::<impl dataflow::render::context::Context<G,▒
+    9.46%     0.00%  timely:work-1    [unknown]           [.] 0x0000000000000004                                                              ▒
+    9.43%     0.00%  timely:work-1    [unknown]           [k] 0x0000000000000007                                                              ▒
+    7.81%     0.00%  timely:work-1    [unknown]           [.] 0000000000000000                                                                ▒
+    5.24%     5.23%  timely:work-1    materialized        [.] expr::scalar::MirScalarExpr::eval                                               ▒
+    3.45%     0.00%  timely:work-1    [unknown]           [.] 0x0000000000010033                                                              ▒
     3.45%     3.45%  timely:work-1    materialized        [.] differential_dataflow::operators::join::Deferred<K,V1,V2,T,R1,R2,R3,C1,C2,D>::wo▒
+    2.74%     2.74%  timely:work-1    materialized        [.] dec::decimal::Decimal<_>::from_raw_parts                                        ▒
+    2.51%     2.50%  timely:work-1    materialized        [.] expr::linear::plan::SafeMfpPlan::evaluate_inner                                 ▒
+    2.37%     0.00%  timely:work-1    [unknown]           [.] 0x0000000000010000                                                              ▒
+    2.02%     0.00%  timely:work-1    [unknown]           [.] 0x0000000000010032                                                              ▒
+    1.75%     1.75%  timely:work-1    libc-2.31.so        [.] __memcmp_avx2_movbe                                                             ▒
+    1.51%     0.00%  timely:work-1    [unknown]           [.] 0x0000000000000001                                                              ▒
+    1.07%     1.07%  timely:work-1    materialized        [.] repr::row::read_lengthed_datum                                                  ▒
@philip-stoev philip-stoev added the C-bug Category: something is broken label Jul 28, 2021
@uce
Copy link
Contributor

uce commented Jul 28, 2021

We need to look into the plan to understand whether this is actually expected (e.g. X cross-joins) or not.

@philip-stoev
Copy link
Contributor Author

The query did not initially have cross joins, so if any are present they are due to transformations. Either way, even a query with a cross join in it should not cause a liveness problem.

Here is the plan, it seems it has 1 cross join in it:

                                                     Optimized Plan                                                      
-------------------------------------------------------------------------------------------------------------------------
 %0 =                                                                                                                   +
 | Get materialize.public.lineitem (u64)                                                                                +
 | ArrangeBy (#0)                                                                                                       +
                                                                                                                        +
 %1 =                                                                                                                   +
 | Get materialize.public.orders (u60)                                                                                  +
 | ArrangeBy (#0) (#1)                                                                                                  +
                                                                                                                        +
 %2 =                                                                                                                   +
 | Get materialize.public.customer (u56)                                                                                +
 | ArrangeBy (#0)                                                                                                       +
                                                                                                                        +
 %3 = Let l0 =                                                                                                          +
 | Join %0 %1 %2 (= #0 #16) (= #17 #25)                                                                                 +
 | | implementation = DeltaQuery                                                                                        +
 | |   delta %0 %1.(#0) %2.(#0)                                                                                         +
 | |   delta %1 %0.(#0) %2.(#0)                                                                                         +
 | |   delta %2 %1.(#1) %0.(#0)                                                                                         +
 | | demand = (#11, #12, #17, #19, #20, #28, #30, #32)                                                                  +
                                                                                                                        +
 %4 = Let l1 =                                                                                                          +
 | Get %3 (l0)                                                                                                          +
 | Distinct group=(#12, #17)                                                                                            +
                                                                                                                        +
 %5 =                                                                                                                   +
 | Get materialize.public.lineitem (u64)                                                                                +
 | ArrangeBy (#0)                                                                                                       +
                                                                                                                        +
 %6 =                                                                                                                   +
 | Get materialize.public.orders (u60)                                                                                  +
 | ArrangeBy (#0)                                                                                                       +
                                                                                                                        +
 %7 = Let l2 =                                                                                                          +
 | Join %5 %6 (= #0 #16)                                                                                                +
 | | implementation = DeltaQuery                                                                                        +
 | |   delta %5 %6.(#0)                                                                                                 +
 | |   delta %6 %5.(#0)                                                                                                 +
 | | demand = (#3, #19)                                                                                                 +
 | Filter ((((#19 <= 401) && (#19 >= 3)) && (#3 != 0)) || (#19 = 62))                                                   +
 | Distinct group=()                                                                                                    +
                                                                                                                        +
 %8 =                                                                                                                   +
 | Get %4 (l1)                                                                                                          +
 | ArrangeBy ()                                                                                                         +
                                                                                                                        +
 %9 =                                                                                                                   +
 | Get materialize.public.orders (u60)                                                                                  +
 | ArrangeBy ()                                                                                                         +
                                                                                                                        +
 %10 =                                                                                                                  +
 | Get %7 (l2)                                                                                                          +
 | Map true                                                                                                             +
                                                                                                                        +
 %11 =                                                                                                                  +
 | Get %7 (l2)                                                                                                          +
 | Negate                                                                                                               +
                                                                                                                        +
 %12 =                                                                                                                  +
 | Constant ()                                                                                                          +
                                                                                                                        +
 %13 =                                                                                                                  +
 | Union %11 %12                                                                                                        +
 | Map false                                                                                                            +
                                                                                                                        +
 %14 =                                                                                                                  +
 | Union %10 %13                                                                                                        +
                                                                                                                        +
 %15 = Let l3 =                                                                                                         +
 | Join %8 %9 %14                                                                                                       +
 | | implementation = Differential %14 %8.() %9.()                                                                      +
 | | demand = (#0, #1, #3, #6, #11)                                                                                     +
 | Filter ((#11 && (#0 != #6)) || ((#0 = #6) && (datetots(#0) = (#6 - 1 month)))), (#1 != #3)                           +
 | Distinct group=(#0, #1)                                                                                              +
                                                                                                                        +
 %16 =                                                                                                                  +
 | Get %3 (l0)                                                                                                          +
 | ArrangeBy (#12, #17)                                                                                                 +
                                                                                                                        +
 %17 =                                                                                                                  +
 | Get %15 (l3)                                                                                                         +
 | Map true                                                                                                             +
                                                                                                                        +
 %18 =                                                                                                                  +
 | Get %15 (l3)                                                                                                         +
 | Negate                                                                                                               +
                                                                                                                        +
 %19 =                                                                                                                  +
 | Union %18 %4                                                                                                         +
 | Map false                                                                                                            +
                                                                                                                        +
 %20 =                                                                                                                  +
 | Union %17 %19                                                                                                        +
                                                                                                                        +
 %21 =                                                                                                                  +
 | Join %16 %20 (= #12 #33) (= #17 #34)                                                                                 +
 | | implementation = Differential %20 %16.(#12, #17)                                                                   +
 | | demand = (#11, #12, #19, #20, #28, #30, #32, #35)                                                                  +
 | Filter (!(#35) || ((!("^l.*$" ~(#32)) && ((0 = (#30 - #19)) && (#11 <= #20))) && (datetots(#20) < (#12 - 7 months))))+
 | Project (#28, #28)                                                                                                   +
 
(1 row)

@teskje
Copy link
Contributor

teskje commented Apr 27, 2023

@philip-stoev Do you think we can close this issue? It's not quite clear anymore what it should be tracking, since the loss of interactivity problem doesn't exist anymore with the platform's decoupling of adapter and compute.

If we take "loss of interactivity" as meaning "queries to a cluster don't return" then I think this issue might still be valid. But with the case reported in this issue, it's actually not too bad either. On my local machine:

materialize=> select * from mz_internal.mz_dataflows limit 1;
 id  | local_id |        name
-----+----------+--------------------
 858 | 24       | Dataflow: u1.u6.v1
(1 row)

Time: 2091.092 ms (00:02.091)
materialize=> select * from mz_internal.mz_dataflows limit 1;
 id  | local_id |        name
-----+----------+--------------------
 858 | 24       | Dataflow: u1.u6.v1
(1 row)

Time: 1870.903 ms (00:01.871)
materialize=> select * from mz_internal.mz_dataflows limit 1;
 id  | local_id |        name
-----+----------+--------------------
 858 | 24       | Dataflow: u1.u6.v1
(1 row)

The problematic cross join here is a delta join, and for those the fueling works reasonably well. This is not the case for differential cross joins (which are essentially unfueled), so maybe we should open a ticket for those instead.

@teskje
Copy link
Contributor

teskje commented May 2, 2023

Im closing this in favor of #19093. Feel free to reopen if you disagree, @philip-stoev.

@teskje teskje closed this as completed May 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-compute Area: compute C-bug Category: something is broken
Projects
No open projects
Compute
Icebox
Development

No branches or pull requests

3 participants