Generated left-join is too complicated for Postgres to plan well #72
Comments
Very interesting! I imagine there are no indices here. In practice we tend to index these join conditions, at which point we usually see a nested loop and index scan - but I don't think we ever looked at an un-indexed join. Will look into this! |
Could you share |
Ok, we think this is a slightly deeper problem in Opaleye itself. At work, we use Opaleye with tomjaguarpaw/haskell-opaleye#480 applied, which produces the plan that you're expecting. There are a few solutions (one is to just use that PR, another is to write smarter code). We'll see where this PR goes first, but we at least do have a handle on the problem now. Thanks for reporting this! |
0.7.3 has important changes to where lateral joins are introduced, and fixes #72.
I believe that the original query was an example of the following problem: a lateral subquery that does not get pulled up constrains the planner to do a nested loop to perform the join, and a nested loop isn't the most efficient way to do every join. This problem is described a bit here. While the changes to opaleye simplified the original example enough to allow it to be pulled up, it is probably worth noting that there remain many examples where the subquery would not be pulled up and the original problem as described above returns. For example, I think this query would be natural to write with rel8's api:
s1 <- each studentsSchema
s2 <- many $ do
  s2 <- each studentsSchema
  where_ $ stlastname s1 ==. stlastname s2
  pure (stfirstname s2)
pure (stlastname s1, s2)
The : Nested Loop Left Join (cost=0.00..6643.83 rows=430 width=64)
: -> Seq Scan on students "T1" (cost=0.00..14.30 rows=430 width=32)
: -> Subquery Scan on "T1_1" (cost=0.00..15.41 rows=1 width=33)
: -> GroupAggregate (cost=0.00..15.40 rows=1 width=36)
: Group Key: 0
: -> Seq Scan on students "T1_2" (cost=0.00..15.38 rows=2 width=36)
: Filter: ("T1".stlastname = stlastname)
compared to a non-lateral variant:
explain
select
s1.stlastname,
s2.first_names
from students s1
left outer join (
select array_agg(s2.stfirstname) as first_names,
s2.stlastname
from students s2
group by s2.stlastname
) s2 using (stlastname)
which allows postgres to consider a hash join:
: Hash Left Join (cost=23.45..38.90 rows=430 width=64)
: Hash Cond: (s1.stlastname = s2.stlastname)
: -> Seq Scan on students s1 (cost=0.00..14.30 rows=430 width=32)
: -> Hash (cost=20.95..20.95 rows=200 width=64)
: -> HashAggregate (cost=16.45..18.95 rows=200 width=64)
: Group Key: s2.stlastname
: -> Seq Scan on students s2 (cost=0.00..14.30 rows=430 width=64) |
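To make the difference between the two plans above concrete, here is a minimal sketch in plain Haskell, with no rel8 or database involved. The names `Student`, `lateralJoin` and `groupedJoin` are hypothetical, and a `Data.Map` merely stands in for Postgres's hash table. It models only the shape of the work each plan does, not the planner itself.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for a row of the students table: (first name, last name).
type Student = (String, String)

-- Lateral shape: for every outer row, rescan the whole table and aggregate.
-- This mirrors the Nested Loop Left Join plan: one inner scan per outer row.
lateralJoin :: [Student] -> [(String, [String])]
lateralJoin students =
  [ (lastName, [ f | (f, l) <- students, l == lastName ])
  | (_, lastName) <- students
  ]

-- Non-lateral shape: aggregate once into a map, then probe it per outer row.
-- This mirrors HashAggregate followed by Hash Left Join.
groupedJoin :: [Student] -> [(String, [String])]
groupedJoin students =
  [ (lastName, Map.findWithDefault [] lastName byLastName)
  | (_, lastName) <- students
  ]
  where
    -- flip (++) appends new names after existing ones, keeping input order.
    byLastName = Map.fromListWith (flip (++)) [ (l, [f]) | (f, l) <- students ]
```

Both functions return the same rows. The difference is that `lateralJoin` rescans the whole list for each outer row, while `groupedJoin` builds the grouping once and then does one lookup per row.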
Thanks @tstat, this is a great point. I'm not entirely sure what to do, but two options come to mind:
The possibility of introducing a special non-lateral I dunno if @shane-circuithub has any ideas here. |
I appreciate all the input from @tstat and @mitchellwrosen here. I still don't have a good intuition for this but hopefully that will come! Having said that, I don't think of the examples in @tstat's comment as being the same query (the Rel8 version and the Postgres version). If I were to translate the Postgres version into Rel8, I would use
studentsByLastname :: Tabulation (Expr Text) (Student Expr)
studentsByLastname = fromQuery $ do
  student@Student {lastname} <- each studentsSchema
  pure (lastname, student)
example :: Tabulation (Expr Text) (ListTable Expr (Expr Text))
example = do
  _ <- studentsByLastname
  manyTabulation $ firstname <$> studentsByLastname
I haven't tested this (if you could post the SQL to create this students table and any indices, that would be helpful), but I'm almost certain that would produce the same query plan as the handwritten SQL. |
@shane-circuithub has asked me to elaborate on the above. First, I can reproduce the bug/poor query plan using just
students :: Query (Expr Text, Expr Text)
students = values [(lit firstName, lit lastName) | firstName <- ["A", "B", "C"], lastName <- ["D", "E"]]
query1 = do
  s1 <- students
  s2 <- many $ do
    s2 <- students
    where_ $ snd s1 ==. snd s2
    pure (fst s2)
  pure (snd s1, s2)
which has the plan
If we use the upcoming
query2 = toQuery do
  _ <- studentsByLastName
  manyTabulation (fst <$> studentsByLastName)
  where
    studentsByLastName :: Tabulation (Expr Text) (Expr Text, Expr Text)
    studentsByLastName = fromQuery do
      s <- students
      return (snd s, s)
which produces
which has the plan
There's nothing really special about |
Interesting, it looks like in the As far as what to do about the problem: I think some documentation about performance considerations would be helpful to the users of rel8. We have already discussed how lateral queries force a particular join method, but another constraint to be mindful of is that they force a join order. Consider the following example:
create table s (
x int8 not null primary key,
y int8 not null
);
insert into s
select n, mod(n, 2)
from generate_series(1, 5) as s(n)
order by random();
create table t (
x int8 not null primary key,
y int8 not null
);
insert into t
select n, mod(n, 5)
from generate_series(1, 1000000) as t(n)
order by random();
create table u (
x int8 not null primary key,
y int8 not null
);
insert into u
select n, mod(n, 5)
from generate_series(1, 1000000) as t(n)
order by random();
analyze s, t, u;
So
select *
from t
inner join u on t.x = u.x
inner join s on s.x = u.x
and postgres will consider different join methods and join orders (because inner join is commutative and associative). Here is what it planned on my machine:
QUERY PLAN
Nested Loop (cost=1.96..4.27 rows=5 width=48) (actual time=0.025..0.071 rows=5 loops=1)
-> Merge Join (cost=1.53..1.88 rows=5 width=32) (actual time=0.021..0.044 rows=5 loops=1)
Merge Cond: (u.x = s.x)
-> Index Scan using u_pkey on u (cost=0.42..51341.77 rows=1000000 width=16) (actual time=0.005..0.026 rows=6 loops=1)
-> Sort (cost=1.11..1.12 rows=5 width=16) (actual time=0.013..0.013 rows=5 loops=1)
Sort Key: s.x
Sort Method: quicksort Memory: 25kB
-> Seq Scan on s (cost=0.00..1.05 rows=5 width=16) (actual time=0.001..0.002 rows=5 loops=1)
-> Index Scan using t_pkey on t (cost=0.42..0.48 rows=1 width=16) (actual time=0.005..0.005 rows=1 loops=5)
Index Cond: (x = u.x)
Planning Time: 0.515 ms
Execution Time: 0.103 ms
The statistics postgres keeps on every table help inform postgres that it can use a merge join to read 6 rows of If I rewrote this to use lateral joins like this (using
select *
from t
, lateral (
select *
from u
where u.x = t.x
offset 0
) as u,
lateral (
select *
from s
where u.x = s.x
offset 0
) as s
then I force the join order (t <> u) <> s and the join method to nested loop and get a terrible plan:
QUERY PLAN
Nested Loop (cost=0.42..9560406.00 rows=1000000 width=48) (actual time=386.602..2168.792 rows=5 loops=1)
-> Nested Loop (cost=0.42..8477906.00 rows=1000000 width=32) (actual time=0.050..1595.613 rows=1000000 loops=1)
-> Seq Scan on t (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.006..50.673 rows=1000000 loops=1)
-> Index Scan using u_pkey on u (cost=0.42..8.44 rows=1 width=16) (actual time=0.001..0.001 rows=1 loops=1000000)
Index Cond: (x = t.x)
-> Seq Scan on s (cost=0.00..1.06 rows=1 width=16) (actual time=0.000..0.000 rows=0 loops=1000000)
Filter: (u.x = x)
Rows Removed by Filter: 5
Planning Time: 0.288 ms
Execution Time: 2168.823 ms
Although all of the joined columns are indexed, doing 2,000,000 index scans is expensive and not necessary. If I reorder my lateral query to read from
select *
from s
, lateral (
select *
from t
where s.x = t.x
offset 0
) as t,
lateral (
select *
from u
where u.x = t.x
offset 0
) as u
Then I will read the 5 rows from
QUERY PLAN
Nested Loop (cost=0.85..85.67 rows=5 width=48) (actual time=0.053..0.100 rows=5 loops=1)
-> Nested Loop (cost=0.42..43.36 rows=5 width=32) (actual time=0.039..0.068 rows=5 loops=1)
-> Seq Scan on s (cost=0.00..1.05 rows=5 width=16) (actual time=0.005..0.006 rows=5 loops=1)
-> Index Scan using t_pkey on t (cost=0.42..8.44 rows=1 width=16) (actual time=0.012..0.012 rows=1 loops=5)
Index Cond: (x = s.x)
-> Index Scan using u_pkey on u (cost=0.42..8.44 rows=1 width=16) (actual time=0.006..0.006 rows=1 loops=5)
Index Cond: (x = t.x)
Planning Time: 0.295 ms
Execution Time: 0.124 ms
So, when writing queries with rel8, if the subquery is not simple (and thus cannot be pulled up) then one should be mindful of the bind order. |
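The effect of bind order can be sketched with a small model in plain Haskell, again with no rel8 or database involved. Everything here is an assumption for illustration: `Table`, `mkTable` and `joinFrom` are hypothetical names, a `Data.Map` plays the role of a primary-key index, and the tables are scaled down to 1,000 rows. The function counts key lookups when the join is forced to run in the written order, as a chain of non-pulled-up lateral subqueries does.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical table keyed by x with a payload y, like s, t and u above.
type Table = Map.Map Int Int

mkTable :: Int -> Table
mkTable n = Map.fromList [ (x, x `mod` 5) | x <- [1 .. n] ]

-- Drive the join from `outer`, probing `mid` and then `inner` by key.
-- Returns the matching keys and the number of key lookups performed.
joinFrom :: Table -> Table -> Table -> ([Int], Int)
joinFrom outer mid inner = (matches, probes)
  where
    -- One lookup into `mid` for every outer row.
    midHits = [ x | x <- Map.keys outer, Map.member x mid ]
    -- One lookup into `inner` for every row that survived the first join.
    matches = [ x | x <- midHits, Map.member x inner ]
    probes  = Map.size outer + length midHits
```

With `s = mkTable 5` and `t = u = mkTable 1000`, starting from the large table (`joinFrom t u s`) performs 2,000 lookups, while starting from the small one (`joinFrom s t u`) performs 10, and both return the same five keys.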
Edit: I was mistaken, you can ignore my original post - but it's here if you want to see it! I was ready to say "yea, we should just write documentation", but I think you might have found another query that's not currently expressible by Rel8. I wanted to express
select *
from s
, lateral (
select *
from t
where s.x = t.x
offset 0
) as t,
lateral (
select *
from u
where u.x = t.x
offset 0
) as u
In Rel8 but without
select * from t, (select * from u offset 0) u, (select * from s offset 0) as s where t.x = u.x and s.x = u.x;
This has a good plan:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=1.59..29159.65 rows=5 width=48) (actual time=7.064..88.236 rows=5 loops=1)
-> Hash Join (cost=1.16..29157.21 rows=5 width=32) (actual time=7.059..88.215 rows=5 loops=1)
Hash Cond: (u.x = s.x)
-> Seq Scan on u (cost=0.00..15406.00 rows=1000000 width=16) (actual time=0.019..44.101 rows=1000000 loops=1)
-> Hash (cost=1.10..1.10 rows=5 width=16) (actual time=0.004..0.004 rows=5 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on s (cost=0.00..1.05 rows=5 width=16) (actual time=0.002..0.002 rows=5 loops=1)
-> Index Scan using t_pkey on t (cost=0.42..0.48 rows=1 width=16) (actual time=0.002..0.003 rows=1 loops=5)
Index Cond: (x = u.x)
Planning time: 0.188 ms
Execution time: 88.255 ms
(11 rows)
However, I don't seem to be able to express this. The obvious
putStrLn $ showQuery $ liftA3 (,,) (offset 0 (each tSchema)) (offset 0 (each sSchema)) (offset 0 (each uSchema)) >>= \(t,s,u) -> where_ (fst u ==. fst t &&. fst u ==. fst s) >> return (t, s, u)
produces
explain analyze SELECT
CAST("x0_1" AS int8) as "_1/_1",
CAST("y1_1" AS int8) as "_1/_2",
CAST("x0_3" AS int8) as "_2/_1",
CAST("y1_3" AS int8) as "_2/_2",
CAST("x0_6" AS int8) as "_3/_1",
CAST("y1_6" AS int8) as "_3/_2"
FROM (SELECT
*
FROM (SELECT
*
FROM (SELECT
*
FROM (SELECT
"x" as "x0_1",
"y" as "y1_1"
FROM "t" as "T1") as "T1") as "T1"
OFFSET 0) as "T1",
LATERAL
(SELECT
*
FROM (SELECT
*
FROM (SELECT
"x" as "x0_3",
"y" as "y1_3"
FROM "s" as "T1") as "T1") as "T1"
OFFSET 0) as "T2",
LATERAL
(SELECT
*
FROM (SELECT
*
FROM (SELECT
"x" as "x0_6",
"y" as "y1_6"
FROM "u" as "T1") as "T1") as "T1"
OFFSET 0) as "T3"
WHERE ((("x0_6") = ("x0_1")) AND (("x0_6") = ("x0_3")))) as "T1"
But this doesn't plan as well:
I wonder if a custom I made a mistake in my Rel8 query. I wrote
liftA3 (,,) (offset 0 (each tSchema)) (offset 0 (each sSchema)) (offset 0 (each uSchema))
But the SQL I wanted is actually produced by
liftA3 (,,) (each tSchema) (offset 0 (each sSchema)) (offset 0 (each uSchema))
This does have the correct query plan. So at this point, I think it's just a case of taking care when writing Rel8 queries, and we should write documentation as to what this care is. Fortunately there's a goldmine of information in this issue, so we can be a lot more precise than before! |
Hi folks,
My colleague @tstat and I were playing around with the library for the first time and we ran into a performance issue with a generated query that may or may not be totally expected. I understand that translating monadic syntax to SQL queries is perhaps going to be unavoidably "for-loopy" at times, so please feel free to close this issue as a #wontfix, although if that's the case I think we could put together some documentation that mentions some known performance gotchas so that users won't necessarily have to examine plan output themselves.
Anyways, the query we were interested in writing with rel8 was a simple left-join. The schema is unimportant here, but let me know if it would be helpful to provide a repo that reproduces this issue exactly. (I think it will be clear enough from the following high-level description).
We have some students table, and students have first and last names, and we're interested in pairing students whose first name maybe matches the last name of another student:
This generates the following plan:
In rel8, we translated this as:
which generated a messier version of the following query:
Unfortunately, Postgres' optimizer does not seem to consider a hash join in this case, and falls back to a much more expensive nested loop join:
Curiously, this is not merely due to the existence of a lateral join with a pushed-in where-clause. We tried rewriting the original SQL in this style:
and in this case Postgres did again consider the hash join: