-
Notifications
You must be signed in to change notification settings - Fork 393
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support DML in IF/THEN/ELSE expressions #6181
Conversation
79f0e3c
to
bef861e
Compare
As we've discussed (and, tragically, recommended to users), we can rewrite the conditionals into for loops: for b in COND union ( { (for _ in (select () filter b) union (IF_BRANCH)), (for _ in (select () filter not b) union (ELSE_BRANCH)), } ) The main fiddly part is preserving the correct cardinality/multiplicity inference. I did this by adding a card_inference_override field to Set that specifies another set that should determine the cardinality. I don't love this, but I haven't thought of a cleaner approach that doesn't give up the benefits of the desugaring approach. We need more testing but I wanted to get something up for people to look at / we can catch up on testing after the feature freeze if needed. Fixes #4437.
Because of how casts are inserted in the func code, this required some tweaking to casts: - The empty array cast needs to be able to look through a Set in order to be efficient - The empty set to object cast needs to be able to look through a Set in order to not generate an IR cast that messes things up because it doesn't provide a source and thus causes the #3030 overlay issue to pop up.
Follow-up tasks:
|
bef861e
to
97e265e
Compare
edb/edgeql/compiler/expr.py
Outdated
def _compile_dml_ifelse( | ||
ir: irast.Set, *, ctx: context.ContextLevel) -> irast.Set: | ||
"""Transform an IF/ELSE that contains DML into FOR loops | ||
|
||
The basic approach is to extract the pieces from the if/then/else and | ||
rewrite them into: | ||
for b in COND union ( | ||
{ | ||
(for _ in (select () filter b) union (IF_BRANCH)), | ||
(for _ in (select () filter not b) union (ELSE_BRANCH)), | ||
} | ||
) | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting approach. At first it seemed really hacky, but when thinking about it, this will probably compile to very similar SQL as it would if implemented iterator expressions in pg compiler for IfElse too.
Can you post some generated SQL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is select if <bool>$0 then (insert X { foo := "!" }) else {};
:
WITH
"iter~1" AS
((
FROM LATERAL (
SELECT/*<pg.SelectStmt at 0x7fe9b4544e10>*/
($1)::bool AS "expr~12_value~1",
edgedb.uuid_generate_v4() AS "expr~12_identity~1"
) AS "bool~2"
SELECT/*<pg.SelectStmt at 0x7fe9b447e7d0>*/
"bool~2"."expr~12_identity~1" AS "bool_identity~1",
"bool~2"."expr~12_value~1" AS "bool_value~1"
)),
"iter~4" AS
((
FROM "iter~1" AS "iter~3"
CROSS JOIN LATERAL (
FROM LATERAL (
SELECT/*<pg.SelectStmt at 0x7fe9b44a93d0>*/
"iter~3"."bool_identity~1" AS "bool_identity~2"
WHERE
("iter~3"."bool_identity~1" IS NOT NULL)
) AS "expr-15~2"
SELECT/*<pg.SelectStmt at 0x7fe9b44ab750>*/
edgedb.uuid_generate_v4() AS "tuple_identity~1"
WHERE
("iter~3"."bool_value~1" AND
("expr-15~2"."bool_identity~2" IS NOT NULL))
) AS "tuple<>~2"
SELECT/*<pg.SelectStmt at 0x7fe9b447f310>*/
"iter~3"."bool_identity~1" AS "bool_identity~3",
"tuple<>~2"."tuple_identity~1" AS "tuple_identity~2"
)),
"ins_contents~1" AS
((
FROM "iter~4" AS "iter~6"
CROSS JOIN LATERAL (
SELECT/*<pg.SelectStmt at 0x7fe9b44cb590>*/
($2)::text AS "expr~1_value~1"
) AS "str_foo~2"
CROSS JOIN LATERAL (
FROM LATERAL (
FROM LATERAL (
FROM LATERAL (
SELECT/*<pg.SelectStmt at 0x7fe9b4532fd0>*/
edgedb.uuid_generate_v1mc() AS "expr~3_value~1"
WHERE
("iter~6"."tuple_identity~2" IS NOT NULL)
) AS "q~1"
SELECT/*<pg.SelectStmt at 0x7fe9b450aa10>*/
"q~1"."expr~3_value~1" AS "expr~3_value~2"
) AS "expr-3~2"
SELECT/*<pg.SelectStmt at 0x7fe9b44e6ad0>*/
"expr-3~2"."expr~3_value~2" AS "expr~3_value~3"
) AS "expr-4~2"
SELECT/*<pg.SelectStmt at 0x7fe9b44e5250>*/
"expr-4~2"."expr~3_value~3" AS "expr~4_value~1"
) AS "uuid_id~2"
SELECT/*<pg.SelectStmt at 0x7fe9b44c8cd0>*/
"str_foo~2"."expr~1_value~1" AS "foo",
"uuid_id~2"."expr~4_value~1" AS id,
"iter~6"."tuple_identity~2" AS "tuple_identity~3"
)),
"ins~1" AS
(INSERT INTO edgedbpub."default::X" AS "X~1"
("foo", id)
((
FROM "ins_contents~1" AS "ins_contents~2"
SELECT/*<pg.SelectStmt at 0x7fe9b44e5210>*/
"foo",
id
))
RETURNING
"X~1".id AS "X_identity~1"),
"m~1" AS
((
FROM "iter~4" AS "iter~7"
CROSS JOIN "ins~1" AS "ins~2"
JOIN "ins_contents~1" AS "ins_contents~2"
ON (("ins~2"."X_identity~1" = "ins_contents~2".id) AND
("iter~7"."tuple_identity~2" = "ins_contents~2"."tuple_identity~3"))
SELECT/*<pg.SelectStmt at 0x7fe9b450a310>*/
"iter~7"."tuple_identity~2" AS "tuple_identity~4",
"ins~2"."X_identity~1" AS "X_value~1"
))
FROM (
FROM LATERAL (
FROM "iter~1" AS "iter~2"
CROSS JOIN LATERAL (
FROM "iter~4" AS "iter~5"
CROSS JOIN LATERAL (
FROM LATERAL (
FROM LATERAL (
FROM "m~1" AS "m~2"
SELECT/*<pg.SelectStmt at 0x7fe9b454ec50>*/
"m~2"."X_value~1" AS "X_value~2",
"iter~5"."tuple_identity~2" AS "tuple_identity~5"
WHERE
("iter~5"."tuple_identity~2" = "m~2"."tuple_identity~4")
) AS "expr-10~2"
SELECT/*<pg.SelectStmt at 0x7fe9b44b3590>*/
"expr-10~2"."X_value~2" AS "expr~10_value~1",
json_build_object('id', "expr-10~2"."X_value~2") AS "expr~10_serialized~1"
WHERE
("iter~5"."tuple_identity~2" = "expr-10~2"."tuple_identity~5")
) AS "expr-11~2"
SELECT/*<pg.SelectStmt at 0x7fe9b44b0e10>*/
"expr-11~2"."expr~10_value~1" AS "expr~11_value~1",
"expr-11~2"."expr~10_serialized~1" AS "expr~11_serialized~1"
) AS "expr-16~2"
SELECT/*<pg.SelectStmt at 0x7fe9b4544990>*/
"expr-16~2"."expr~11_value~1" AS "expr~16_value~1",
"expr-16~2"."expr~11_serialized~1" AS "expr~16_serialized~1"
WHERE
("iter~2"."bool_identity~1" = "iter~5"."bool_identity~3")
) AS "expr-17~2"
SELECT/*<pg.SelectStmt at 0x7fe9b447d810>*/
"expr-17~2"."expr~16_value~1" AS "expr~17_value~1",
"expr-17~2"."expr~16_serialized~1" AS "expr~17_serialized~1"
) AS "expr-18~2"
SELECT/*<pg.SelectStmt at 0x7fe9b447cf90>*/
to_json("expr-18~2"."expr~17_serialized~1") AS "expr~18_serialized~1"
) AS "aggw~1"
SELECT/*<pg.SelectStmt at 0x7fe9b447c590>*/
COALESCE(json_agg("expr~18_serialized~1"), '[]')
iter~1
is the outer iter that computes the condition and iter~4
uses it to generate an iterator that fires once if the condition is true and zero times otherwise.
The big suboptimality in this generated code is the iterators get joined into the main result query, when really it shouldn't need to be.
This is because only DML and the IN
clauses of DML-containing FOR
s get compiled into CTEs, and a FOR
might contain both a DML and some non-DML that references the iterator, like:
for x in expr union (x, (insert Foo { x := x}))
It should be possible to do an optimization to elide joining in the iterator CTEs in the main body when they aren't needed, though. We could probably do something hacky where these desugaring passes set a flag in the iterator, but it should also be possible to do an analysis to see if the iterator is used outside of DML in the body. (But probably that optimization isn't happening for 4.x)
The basic approach is to extract the pieces from the ?? and rewrite them into: for optional x in (LHS,) union ( { x.0, (for _ in (select () filter not exists x) union (RHS)), } ) Optional for is needed because the LHS needs to be bound in a for in order to get put in a CTE and only executed once, but the RHS needs to be dependent on the LHS being empty. We hackily wrap the LHS in a 1-ary tuple and then project it back out because the OPTIONAL FOR implementation doesn't properly handle object-type iterators. OPTIONAL FOR relies on having a non-NULL identity ref but objects use their actual id, which will be NULL. Follow-up to #6181. Like #6181, needs more testing. One of the tests I wrote failed because of #6057, which I'm working on fixing also.
The basic approach is to extract the pieces from the ?? and rewrite them into: for optional x in (LHS,) union ( { x.0, (for _ in (select () filter not exists x) union (RHS)), } ) Optional for is needed because the LHS needs to be bound in a for in order to get put in a CTE and only executed once, but the RHS needs to be dependent on the LHS being empty. We hackily wrap the LHS in a 1-ary tuple and then project it back out because the OPTIONAL FOR implementation doesn't properly handle object-type iterators. OPTIONAL FOR relies on having a non-NULL identity ref but objects use their actual id, which will be NULL. Follow-up to #6181. Like #6181, needs more testing. One of the tests I wrote failed because of #6057, which I'm working on fixing also.
As we've discussed (and, tragically, recommended to users), we can
rewrite the conditionals into for loops:
The main fiddly part is preserving the correct
cardinality/multiplicity inference. I did this by adding a
card_inference_override field to Set that specifies another set that
should determine the cardinality.
I don't love this, but I haven't thought of a cleaner approach that
doesn't give up the benefits of the desugaring approach.
We need more testing but I wanted to get something up for people to
look at / we can catch up on testing after the feature freeze if
needed.
Fixes #4437.