Generated left-join is too complicated for Postgres to plan well #72

mitchellwrosen · 2021-06-22T20:03:19Z

Hi folks,

My colleague @tstat and I were playing around with the library for the first time and we ran into a performance issue with a generated query that may or may not be totally expected. I understand that translating monadic syntax to SQL queries is perhaps going to be unavoidably "for-loopy" at times, so please feel free to close this issue as a #wontfix, although if that's the case I think we could put together some documentation that mention some known performance gotchas so that users won't necessarily have to examine plan output themselves.

Anyways, the query we were interested in writing with rel8 was a simple left-join. The schema is unimportant here, but let me know if it would be helpful to provide a repo that reproduces this issue exactly. (I think it will be clear enough from the following high-level description).

We have some students table, and students have first and last names, and we're interested in pairing students whose first name maybe matches the last name of another student:

select *
from students s1
left join students s2 on s1.first = s2.last;

This generates the following plan:

 Hash Left Join  (cost=2.35..4.78 rows=60 width=114)
   Hash Cond: (s1.first = s2.last)
   ->  Seq Scan on students s1  (cost=0.00..1.60 rows=60 width=57)
   ->  Hash  (cost=1.60..1.60 rows=60 width=57)
         ->  Seq Scan on students s2  (cost=0.00..1.60 rows=60 width=57)

In rel8, we translated this as:

s1 <- each studentsSchema
s2 <- optional do
  s2 <- each studentsSchema
  where_ (first s1 ==. last s2)
  pure s2
pure (s1, s2)

which generated a messier version of the following query:

select *
from students s1
cross join lateral (
  select *
  from (select 0) t1
  left outer join (
    select *
    from students s2
    where s1.first = s2.last
  ) t2 on true
) t2;

Unfortunately, Postgres' optimizer does not seem to consider a hash join in this case, and falls back to a much more expensive nested loop join:

 Nested Loop  (cost=0.00..109.00 rows=60 width=118)
   ->  Seq Scan on students s1  (cost=0.00..1.60 rows=60 width=57)
   ->  Nested Loop Left Join  (cost=0.00..1.77 rows=1 width=61)
         ->  Result  (cost=0.00..0.01 rows=1 width=0)
         ->  Seq Scan on students s2  (cost=0.00..1.75 rows=1 width=57)
               Filter: (s1.first = last)

Curiously, this is not merely due to the existence of a lateral join with a pushed-in where-clause. We tried rewriting the original SQL in this style:

select *
from students s1
left join lateral (
  select *
  from students s2
  where s1.first = s2.last
) t2 on true;

and in this case Postgres did again consider the hash join:

 Hash Left Join  (cost=2.35..4.78 rows=60 width=114)
   Hash Cond: (s1.first = s2.last)
   ->  Seq Scan on students s1  (cost=0.00..1.60 rows=60 width=57)
   ->  Hash  (cost=1.60..1.60 rows=60 width=57)
         ->  Seq Scan on students s2  (cost=0.00..1.60 rows=60 width=57)

The text was updated successfully, but these errors were encountered:

ocharles · 2021-06-22T23:22:33Z

Very interesting! I imagine there are no indices here. In practice we tend to index these join conditions, at which point we usually see a nested loop and index scan - but I don't think we ever looked at an un-indexed join. Will look into this!

ocharles · 2021-06-22T23:23:07Z

Could you share select version(), too?

ocharles · 2021-06-23T09:40:28Z

Ok, we think this is a slightly deeper problem in Opaleye itself. At work, we use Opaleye with tomjaguarpaw/haskell-opaleye#480 applied, which produces the plan that you're expecting. There are a few solutions (one is to just use that PR, another is to write smarter code). We'll see where this PR goes first, but we at least do have a handle on the problem now.

Thanks for reporting this!

0.7.3 has important changes to where lateral joins are introduced, and fixes #72.

tstat · 2021-06-29T17:32:21Z

I believe that the original query was an example of the following problem: a lateral subquery that does not get pulled up constrains the planner to do a nested loop to perform the join. and a nested loop isn't the most efficient way to do every join. This problem is described a bit here.

While the changes to opaleye simplified the original example enough to allow it to be pulled up It is probably worth noting that there remain many examples where the subquery would not be pulled up and the original problem as described above returns. For example, I think this query would be natural to write with rel8's api:

    s1 <- each studentsSchema
    s2 <- many $ do
      s2 <- each studentsSchema
      where_ $ stlastname s1 ==. stlastname s2
      pure (stfirstname s2)
    pure (stlastname s1, s2)

The array_agg in the lateral subquery prevents it from getting pulled up, and the lateral forces a nested loop, resulting in the following plan:

: Nested Loop Left Join  (cost=0.00..6643.83 rows=430 width=64)
:   ->  Seq Scan on students "T1"  (cost=0.00..14.30 rows=430 width=32)
:   ->  Subquery Scan on "T1_1"  (cost=0.00..15.41 rows=1 width=33)
:         ->  GroupAggregate  (cost=0.00..15.40 rows=1 width=36)
:               Group Key: 0
:               ->  Seq Scan on students "T1_2"  (cost=0.00..15.38 rows=2 width=36)
:                     Filter: ("T1".stlastname = stlastname)

compared to a non-lateral variant:

  explain
  select
  s1.stlastname,
  s2.first_names
  from students s1
  left outer join (
   select array_agg(s2.stfirstname) as first_names,
   s2.stlastname
   from students s2
   group by s2.stlastname
  ) s2 using (stlastname)

which allows postgres to consider a hash join:

: Hash Left Join  (cost=23.45..38.90 rows=430 width=64)
:   Hash Cond: (s1.stlastname = s2.stlastname)
:   ->  Seq Scan on students s1  (cost=0.00..14.30 rows=430 width=32)
:   ->  Hash  (cost=20.95..20.95 rows=200 width=64)
:         ->  HashAggregate  (cost=16.45..18.95 rows=200 width=64)
:               Group Key: s2.stlastname
:               ->  Seq Scan on students s2  (cost=0.00..14.30 rows=430 width=64)

ocharles · 2021-07-01T13:22:51Z

Thanks @tstat, this is a great point. I'm not entirely sure what to do, but two options come to mind:

Do nothing. This is really an appeal to hope PostgreSQL gets smarter and can optimize what we give it into something better.
Try and do the "pulling up" ourselves, if possible. This is basically getting into writing an optimizer, which we've been trying very hard to avoid as it's really hard to do right.

The possibility of introducing a special non-lateral leftJoin function is hard. The obvious leftJoin :: Query a -> (a -> Expr Bool) -> Query (MaybeTable a) isn't enough - that Query a could still refers to other bound variables in Query. It's a specific goal of Rel8 to not need ST like tricks to deal with scoping. That said, maybe doing a LATERAL LEFT JOIN would be fine, and if it's not actually LATERAL, PostgreSQL will be smart enough to figure out what's going on. Either way, I'm not very excited about this because it doesn't play well with "join functions" that I advocate for in my Rel8 talk.

I dunno if @shane-circuithub has any ideas here.

shane-circuithub · 2021-07-08T00:11:55Z

I appreciate all the input from @tstat and and @mitchellwrosen here. I still don't have a good intuition for this but hopefully that will come!

Having said that, I don't think of the examples in @tstat's comment as being the same query (the Rel8 version and the Postgres version). If I were to translate the Postgres version into Rel8, I would use Rel8.Tabulate (which unfortunately we haven't released yet).

studentsByLastname :: Tabulation (Expr Text) (Student Expr)
studentsByLastname = fromQuery $ do
  student@Student {lastname <- each studentsSchema
  pure (lastname, student)

example :: Tabulation (Expr Text) (ListTable Expr (Expr Text))
example = do
  _ <- studentsByLastname
  manyTabulation $ firstname <$> studentsByLastname

I haven't tested this (if you could post the SQL to create this students table and any indices, that would be helpful), but I'm almost certain that would produce the same query plan as the handwritten SQL.

ocharles · 2021-07-08T12:31:20Z

@shane-circuithub has asked me to elaborate on the above.

First, I can reproduce the bug/poor query plan using just values:

students :: Query (Expr Text, Expr  Text)
students = values [(lit firstName, lit lastName) | firstName <- ["A", "B", "C"], lastName <- [ "D", "E"] ]

query1 = do
    s1 <- students
    s2 <- many $ do
      s2 <- students
      where_ $ snd s1 ==. snd s2
      pure (fst s2)
    pure (snd s1, s2)

query1 produces:

SELECT
CAST("values1_1" AS text) as "_1",
CAST(CASE WHEN ("rebind0_7") IS NULL THEN CAST(ARRAY[] AS text[]) ELSE "result0_6" END AS text[]) as "_2"
FROM (SELECT *
      FROM
      (SELECT
       *
       FROM (SELECT "column1" as "values0_1",
                    "column2" as "values1_1"
             FROM
             (VALUES
              (CAST(E'A' AS text),CAST(E'D' AS text)),
              (CAST(E'A' AS text),CAST(E'E' AS text)),
              (CAST(E'B' AS text),CAST(E'D' AS text)),
              (CAST(E'B' AS text),CAST(E'E' AS text)),
              (CAST(E'C' AS text),CAST(E'D' AS text)),
              (CAST(E'C' AS text),CAST(E'E' AS text))) as "V") as "T1") as "T1"
      LEFT OUTER JOIN
      LATERAL
      (SELECT
       TRUE as "rebind0_7",
       *
       FROM (SELECT
             *
             FROM (SELECT
                   ARRAY_AGG("inner0_6") as "result0_6"
                   FROM (SELECT
                         "values0_3" as "inner0_6",
                         *
                         FROM (SELECT
                               *
                               FROM (SELECT "column1" as "values0_3",
                                            "column2" as "values1_3"
                                     FROM
                                     (VALUES
                                      (CAST(E'A' AS text),CAST(E'D' AS text)),
                                      (CAST(E'A' AS text),CAST(E'E' AS text)),
                                      (CAST(E'B' AS text),CAST(E'D' AS text)),
                                      (CAST(E'B' AS text),CAST(E'E' AS text)),
                                      (CAST(E'C' AS text),CAST(E'D' AS text)),
                                      (CAST(E'C' AS text),CAST(E'E' AS text))) as "V") as "T1"
                               WHERE (("values1_1") = ("values1_3"))) as "T1") as "T1"
                   GROUP BY COALESCE(0)) as "T1") as "T1") as "T2"
      ON
      TRUE) as "T1"

which has the plan

 Nested Loop Left Join  (cost=0.00..0.84 rows=6 width=64)
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.08 rows=6 width=32)
   ->  Subquery Scan on "T1"  (cost=0.00..0.12 rows=1 width=33)
         ->  GroupAggregate  (cost=0.00..0.11 rows=1 width=36)
               Group Key: 0
               ->  Values Scan on "*VALUES*_1"  (cost=0.00..0.09 rows=1 width=36)
                     Filter: ("*VALUES*".column2 = column2)
(7 rows)

If we use the upcoming Rel8.Tabulate interface, we can write:

query2 = toQuery do
  _ <- studentsByLastName
  manyTabulation (fst <$> studentsByLastName)
  where
    studentsByLastName :: Tabulation (Expr Text) (Expr Text, Expr Text)
    studentsByLastName = fromQuery do
      s <- students
      return (snd s, s)

which produces

SELECT
CAST("values1_1" AS text) as "_1",
CAST(CASE WHEN ("rebind0_13") IS NULL THEN CAST(ARRAY[] AS text[]) ELSE "result1_6" END AS text[]) as "_2"
FROM (SELECT *
      FROM
      (SELECT
       *
       FROM (SELECT "column1" as "values0_1",
                    "column2" as "values1_1"
             FROM
             (VALUES
              (CAST(E'A' AS text),CAST(E'D' AS text)),
              (CAST(E'A' AS text),CAST(E'E' AS text)),
              (CAST(E'B' AS text),CAST(E'D' AS text)),
              (CAST(E'B' AS text),CAST(E'E' AS text)),
              (CAST(E'C' AS text),CAST(E'D' AS text)),
              (CAST(E'C' AS text),CAST(E'E' AS text))) as "V") as "T1") as "T1"
      LEFT OUTER JOIN
      LATERAL
      (SELECT
       TRUE as "rebind0_13",
       *
       FROM (SELECT
             *
             FROM (SELECT
                   "inner0_6" as "result0_6",
                   ARRAY_AGG("inner1_6") as "result1_6"
                   FROM (SELECT
                         "values1_4" as "inner0_6",
                         "values0_4" as "inner1_6",
                         *
                         FROM (SELECT
                               *
                               FROM (SELECT "column1" as "values0_4",
                                            "column2" as "values1_4"
                                     FROM
                                     (VALUES
                                      (CAST(E'A' AS text),CAST(E'D' AS text)),
                                      (CAST(E'A' AS text),CAST(E'E' AS text)),
                                      (CAST(E'B' AS text),CAST(E'D' AS text)),
                                      (CAST(E'B' AS text),CAST(E'E' AS text)),
                                      (CAST(E'C' AS text),CAST(E'D' AS text)),
                                      (CAST(E'C' AS text),CAST(E'E' AS text))) as "V") as "T1") as "T1") as "T1"
                   GROUP BY "inner0_6") as "T1"
             WHERE (("values1_1") = ("result0_6"))) as "T1") as "T2"
      ON
      TRUE) as "T1"

which has the plan

 Hash Right Join  (cost=0.26..0.47 rows=6 width=64)
   Hash Cond: ("T1".result0_6 = "*VALUES*".column2)
   ->  Subquery Scan on "T1"  (cost=0.11..0.24 rows=6 width=65)
         ->  HashAggregate  (cost=0.11..0.18 rows=6 width=64)
               Group Key: "*VALUES*_1".column2
               ->  Values Scan on "*VALUES*_1"  (cost=0.00..0.08 rows=6 width=64)
   ->  Hash  (cost=0.08..0.08 rows=6 width=32)
         ->  Values Scan on "*VALUES*"  (cost=0.00..0.08 rows=6 width=32)

There's nothing really special about Rel8.Tabulate, it's just a convenient way to write these types of queries. We think of a Tabulation k v as like a Map k v. The monad instance for a Tabulation is intersection.

tstat · 2021-07-13T14:32:48Z

Interesting, it looks ilke in the Tabulate plan that the left outer join lateral doesn't contain an array_agg because that is pushed into another subquery within the lateral subquery, and this inner subquery doesn't contain any lateral references. This results in the inner subquery not being pulled up (due to the aggregate), but the lateral subquery is pulled up and the lateral is eliminated.

As far as what to do about the problem: I think some documentation about performance considerations would be helpful to the users of rel8.

We have already discussed how lateral queries force a particular join method, but another constraint to be mindful of is that they force a join order. Consider the following example:

  create table s (
    x int8 not null primary key,
    y int8 not null
  );
  
  insert into s
  select n, mod(n, 2)
  from generate_series(1, 5) as s(n)
  order by random();
  
  create table t (
    x int8 not null primary key,
    y int8 not null
  );

  insert into t
  select n, mod(n, 5)
  from generate_series(1, 1000000) as t(n)
  order by random();

  create table u (
    x int8 not null primary key,
    y int8 not null
  );
  
  insert into u
  select n, mod(n, 5)
  from generate_series(1, 1000000) as t(n)
  order by random();
  
  analyze s, t, u;

So s has 5 rows while t and u both have 1,000,000. If we want to join them all by their x columns then we could write:

  select *
  from t
  inner join u on t.x = u.x
  inner join s on s.x = u.x

and postgres will consider different join methods and join orders (because inner join is commutative and associative)

Here is what it planned on my machine

QUERY PLAN
Nested Loop  (cost=1.96..4.27 rows=5 width=48) (actual time=0.025..0.071 rows=5 loops=1)
  ->  Merge Join  (cost=1.53..1.88 rows=5 width=32) (actual time=0.021..0.044 rows=5 loops=1)
        Merge Cond: (u.x = s.x)
        ->  Index Scan using u_pkey on u  (cost=0.42..51341.77 rows=1000000 width=16) (actual time=0.005..0.026 rows=6 loops=1)
        ->  Sort  (cost=1.11..1.12 rows=5 width=16) (actual time=0.013..0.013 rows=5 loops=1)
              Sort Key: s.x
              Sort Method: quicksort  Memory: 25kB
              ->  Seq Scan on s  (cost=0.00..1.05 rows=5 width=16) (actual time=0.001..0.002 rows=5 loops=1)
  ->  Index Scan using t_pkey on t  (cost=0.42..0.48 rows=1 width=16) (actual time=0.005..0.005 rows=1 loops=5)
        Index Cond: (x = u.x)
Planning Time: 0.515 ms
Execution Time: 0.103 ms

The statistics postgres keeps on every table helps inform postgres that it can use a merge join to read 6 rows of u's 1,000,000 since there are only 5 rows in s and the primary key btree on u is in u.x order. Postgres expects only 5 rows to result from this join (the statistics indicate that both s.x and u.x are unique). A nested loop with the 5 row outer and an index scan inner is seen as the cheapest way to join to t as index scans are cheap and we are only doing 5 loops.

If I rewrote this to use lateral joins like this (using offset 0 to prevent subquery pullup):

  select *
  from t
  , lateral (
    select *
    from u
    where u.x = t.x
    offset 0
  ) as u,
  lateral (
    select *
    from s
    where u.x = s.x
    offset 0
  ) as s

then I force the join order (t <> u) <> s and the join method to nested loop and get a terrible plan:

QUERY PLAN
Nested Loop  (cost=0.42..9560406.00 rows=1000000 width=48) (actual time=386.602..2168.792 rows=5 loops=1)
  ->  Nested Loop  (cost=0.42..8477906.00 rows=1000000 width=32) (actual time=0.050..1595.613 rows=1000000 loops=1)
        ->  Seq Scan on t  (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.006..50.673 rows=1000000 loops=1)
        ->  Index Scan using u_pkey on u  (cost=0.42..8.44 rows=1 width=16) (actual time=0.001..0.001 rows=1 loops=1000000)
              Index Cond: (x = t.x)
  ->  Seq Scan on s  (cost=0.00..1.06 rows=1 width=16) (actual time=0.000..0.000 rows=0 loops=1000000)
        Filter: (u.x = x)
        Rows Removed by Filter: 5
Planning Time: 0.288 ms
Execution Time: 2168.823 ms

Although all of the joined columns are indexed, doing 2,000,000 index scans is expensive and not necessary.

If I reorder my lateral query to read from s first:

  select *
  from s
  , lateral (
    select *
    from t
    where s.x = t.x
    offset 0
  ) as t,
  lateral (
    select *
    from u
    where u.x = t.x
    offset 0
  ) as u

Then I will read the 5 rows from s , do 5 index scans to join to t, and 5 index scans to join to u, resulting in a much nicer plan:

QUERY PLAN
Nested Loop  (cost=0.85..85.67 rows=5 width=48) (actual time=0.053..0.100 rows=5 loops=1)
  ->  Nested Loop  (cost=0.42..43.36 rows=5 width=32) (actual time=0.039..0.068 rows=5 loops=1)
        ->  Seq Scan on s  (cost=0.00..1.05 rows=5 width=16) (actual time=0.005..0.006 rows=5 loops=1)
        ->  Index Scan using t_pkey on t  (cost=0.42..8.44 rows=1 width=16) (actual time=0.012..0.012 rows=1 loops=5)
              Index Cond: (x = s.x)
  ->  Index Scan using u_pkey on u  (cost=0.42..8.44 rows=1 width=16) (actual time=0.006..0.006 rows=1 loops=5)
        Index Cond: (x = t.x)
Planning Time: 0.295 ms
Execution Time: 0.124 ms

So, when writing queries with rel8, if the subquery is not simple (and thus cannot be pulled up) then one should be mindful of the bind order.

ocharles · 2021-07-13T15:11:31Z

Edit: I was mistaken, you can ignore my original post - but it's here if you want to see it!

I was ready to say "yea, we should just write documentation", but I think you might have found another query that's not currently expressible by Rel8. I wanted to express

  select *
  from s
  , lateral (
    select *
    from t
    where s.x = t.x
    offset 0
  ) as t,
  lateral (
    select *
    from u
    where u.x = t.x
    offset 0
  ) as u

In Rel8 but without LATERAL references, and wanted SQL like:

select * from t, (select * from u offset 0) u, (select * from s offset 0) as s where t.x = u.x and s.x = u.x;

This has a good plan:

                                                       QUERY PLAN                                                        
-------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.59..29159.65 rows=5 width=48) (actual time=7.064..88.236 rows=5 loops=1)
   ->  Hash Join  (cost=1.16..29157.21 rows=5 width=32) (actual time=7.059..88.215 rows=5 loops=1)
         Hash Cond: (u.x = s.x)
         ->  Seq Scan on u  (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.019..44.101 rows=1000000 loops=1)
         ->  Hash  (cost=1.10..1.10 rows=5 width=16) (actual time=0.004..0.004 rows=5 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 9kB
               ->  Seq Scan on s  (cost=0.00..1.05 rows=5 width=16) (actual time=0.002..0.002 rows=5 loops=1)
   ->  Index Scan using t_pkey on t  (cost=0.42..0.48 rows=1 width=16) (actual time=0.002..0.003 rows=1 loops=5)
         Index Cond: (x = u.x)
 Planning time: 0.188 ms
 Execution time: 88.255 ms
(11 rows)

However, I don't seem to be able to express this. The obvious

putStrLn $ showQuery $ liftA3 (,,) (offset 0 (each tSchema)) (offset 0 (each sSchema)) (offset 0 (each uSchema)) >>= \(t,s,u) -> where_ (fst u ==. fst t &&. fst u ==. fst s) >> return (t, s, u)

produces

explain analyze SELECT                                                                                                       
CAST("x0_1" AS int8) as "_1/_1",
CAST("y1_1" AS int8) as "_1/_2",
CAST("x0_3" AS int8) as "_2/_1",
CAST("y1_3" AS int8) as "_2/_2",
CAST("x0_6" AS int8) as "_3/_1",
CAST("y1_6" AS int8) as "_3/_2"
FROM (SELECT
      *
      FROM (SELECT
            *
            FROM (SELECT
                  *
                  FROM (SELECT
                        "x" as "x0_1",
                        "y" as "y1_1"
                        FROM "t" as "T1") as "T1") as "T1"
            OFFSET 0) as "T1",
           LATERAL
           (SELECT
            *
            FROM (SELECT
                  *
                  FROM (SELECT
                        "x" as "x0_3",
                        "y" as "y1_3"
                        FROM "s" as "T1") as "T1") as "T1"
            OFFSET 0) as "T2",
           LATERAL
           (SELECT
            *
            FROM (SELECT
                  *
                  FROM (SELECT
                        "x" as "x0_6",
                        "y" as "y1_6"
                        FROM "u" as "T1") as "T1") as "T1"
            OFFSET 0) as "T3"
      WHERE ((("x0_6") = ("x0_1")) AND (("x0_6") = ("x0_3")))) as "T1"

But this doesn't plan as well:

                                                              QUERY PLAN                                                              
--------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=29157.27..58313.32 rows=5 width=48) (actual time=96.610..177.602 rows=5 loops=1)
   Hash Cond: ("T1".x = "T1_1".x)
   ->  Seq Scan on u "T1"  (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.023..44.073 rows=1000000 loops=1)
   ->  Hash  (cost=29157.21..29157.21 rows=5 width=32) (actual time=89.632..89.632 rows=5 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Hash Join  (cost=1.16..29157.21 rows=5 width=32) (actual time=6.532..89.631 rows=5 loops=1)
               Hash Cond: ("T1_1".x = "T1_2".x)
               ->  Seq Scan on t "T1_1"  (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.007..45.434 rows=1000000 loops=1)
               ->  Hash  (cost=1.10..1.10 rows=5 width=16) (actual time=0.005..0.005 rows=5 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 9kB
                     ->  Seq Scan on s "T1_2"  (cost=0.00..1.05 rows=5 width=16) (actual time=0.002..0.002 rows=5 loops=1)
 Planning time: 0.174 ms
 Execution time: 177.623 ms

I wonder if a custom Applicative instance is in order that doesn't use LATERAL?

I made a mistake in my Rel8 query. I wrote

liftA3 (,,) (offset 0 (each tSchema)) (offset 0 (each sSchema)) (offset 0 (each uSchema))

But the SQL I wanted is actually produced by

liftA3 (,,) (each tSchema) (offset 0 (each sSchema)) (offset 0 (each uSchema))

This does have the correct query plan.

So at this point, I think it's just a case of taking care when writing Rel8 queries, and we should write documentation as to what this care is. Fortunately there's a goldmine of information in this issue, so we can be a lot more precise than before!

ocharles mentioned this issue Jun 23, 2021

Make all queries LATERAL all the time tomjaguarpaw/haskell-opaleye#480

Closed

ocharles added a commit that referenced this issue Jun 28, 2021

Require Opaleye >= 0.7.3

060178f

0.7.3 has important changes to where lateral joins are introduced, and fixes #72.

ocharles mentioned this issue Jun 28, 2021

Require Opaleye >= 0.7.3 #81

Merged

shane-circuithub closed this as completed in #81 Jun 28, 2021

shane-circuithub pushed a commit that referenced this issue Jun 28, 2021

Require Opaleye >= 0.7.3 (#81)

bf19395

0.7.3 has important changes to where lateral joins are introduced, and fixes #72.

shane-circuithub reopened this Jun 29, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generated left-join is too complicated for Postgres to plan well #72

Generated left-join is too complicated for Postgres to plan well #72

mitchellwrosen commented Jun 22, 2021 •

edited

Loading

ocharles commented Jun 22, 2021

ocharles commented Jun 22, 2021

ocharles commented Jun 23, 2021 •

edited

Loading

tstat commented Jun 29, 2021

ocharles commented Jul 1, 2021 •

edited

Loading

shane-circuithub commented Jul 8, 2021 •

edited

Loading

ocharles commented Jul 8, 2021 •

edited

Loading

tstat commented Jul 13, 2021

ocharles commented Jul 13, 2021 •

edited

Loading

Generated left-join is too complicated for Postgres to plan well #72

Generated left-join is too complicated for Postgres to plan well #72

Comments

mitchellwrosen commented Jun 22, 2021 • edited Loading

ocharles commented Jun 22, 2021

ocharles commented Jun 22, 2021

ocharles commented Jun 23, 2021 • edited Loading

tstat commented Jun 29, 2021

ocharles commented Jul 1, 2021 • edited Loading

shane-circuithub commented Jul 8, 2021 • edited Loading

ocharles commented Jul 8, 2021 • edited Loading

tstat commented Jul 13, 2021

ocharles commented Jul 13, 2021 • edited Loading

mitchellwrosen commented Jun 22, 2021 •

edited

Loading

ocharles commented Jun 23, 2021 •

edited

Loading

ocharles commented Jul 1, 2021 •

edited

Loading

shane-circuithub commented Jul 8, 2021 •

edited

Loading

ocharles commented Jul 8, 2021 •

edited

Loading

ocharles commented Jul 13, 2021 •

edited

Loading