Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: Support AS MATERIALIZED option in CTEs #45863

Closed
andy-kimball opened this issue Mar 9, 2020 · 9 comments · Fixed by #47341
Closed

opt: Support AS MATERIALIZED option in CTEs #45863

andy-kimball opened this issue Mar 9, 2020 · 9 comments · Fixed by #47341
Assignees
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) good first issue

Comments

@andy-kimball
Copy link
Contributor

PG 12 supports AS MATERIALIZED and AS NOT MATERIALIZED syntax for user controller of CTE materialization. For example:

WITH x AS MATERIALIZED (SELECT * FROM very_large_table) SELECT * FROM x;

With this syntax, the user can override the optimizer's default decision to not materialize that table (since it's only referenced once).

WITH x AS NOT MATERIALIZED (SELECT * FROM very_small_table) SELECT * FROM x UNION ALL SELECT * FROM x;

Similarly, this syntax allows the user to override the optimizer's default decision to materialize the table (since it's referenced multiple times).

Adding this option requires adding a flag to the generated WithPrivate struct, and then updating the parser to parse the new syntax, and set the flag. The optimizer's InlineWith rule (see with.opt) then changes to take into account the new flag.

@andy-kimball andy-kimball added this to Triage in BACKLOG, NO NEW ISSUES: SQL Optimizer via automation Mar 9, 2020
@awoods187 awoods187 added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Mar 9, 2020
@Anthuang
Copy link
Contributor

Anthuang commented Apr 7, 2020

First time here, I'm happy to attempt this. Do you have more info to help me start out?

@RaduBerinde
Copy link
Member

Thanks @Anthuang. Here's a list of things that need to be done:

  • add a new field to tree.CTE and add syntax for "WITH AS [NOT] MATERIALIZED" to pkg/sql/parser.sql.y, with corresponding tests to parse_test.go.

  • add the same field to the optimizer with operator: WithPrivate

  • modify the optbuilder code which constructs the With: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/optbuilder/select.go#L752 to populate the new field. The opt/memo/expr_format.go code should be updated to show this field when it is set, and then some testcases under optbuilder/with can be added.

  • understand the InlineWith rule and how it works. Run make build and then look at pkg/sql/opt/norm/factory.og.go to see the generated code for that rule. Then look at the corresponding testcases in sql/opt/norm/testdata/rules for the rule.

  • modify the rule to check the new field; if "AS MATERIALIZED" was used, we should let the rule run regardless of what CanInlineWith checks currently; if "AS NOT MATERIALIZED" was used, we should not let the rule run.

@Anthuang
Copy link
Contributor

Anthuang commented Apr 10, 2020

Thanks for the detailed steps!

Do you have any docs for how these tests work (syntax, etc.)?

And also what should I be achieving with changing opt/memo/expr_format.go?

@RaduBerinde
Copy link
Member

Each testcase is a directive (usually norm; in the tests you'd add it would be either norm expect=InlineWith or norm format=show-all expect-not=InlineWith), then a SQL statement, then the ---- delimiter. The output is the normalized tree. You can automatically generate the output of any new tests by passing TESTFLAGS=-rewrite to make test.

And also what should I be achieving with changing opt/memo/expr_format.go?

Just printing the new field that you add so we can make sure it's set as expected.

@Anthuang
Copy link
Contributor

Anthuang commented Apr 10, 2020

I'm really close. I just can't figure out why the parser tests I wrote aren't passing. It seems as if the pretty printing function is omitting "[NOT] MATERIALIZE", and thus VerifyStatementPrettyRoundtrip fails. Do you know if there's anything else I need to add?

I've also included logic to parse the added "[NOT] MATERIALIZE" syntax in func (node *With) Format(ctx *FmtCtx).

@RaduBerinde
Copy link
Member

I assume you modified the Format method for a tree struct. There is a pretty printing code that needs to be updated as well - there is a file pretty.go or similar, grep for the type in there. IMO these should live next to Format but that's for another day :)

@Anthuang
Copy link
Contributor

Ah I missed that, thanks!

@RaduBerinde
Copy link
Member

As far as I can tell, postgres still forces materialization if the cte contains volatile functions:

radu=> explain with cte as not materialized (select random()) select * from cte union select * from cte;
                                QUERY PLAN                                 
---------------------------------------------------------------------------
 Unique  (cost=0.09..0.10 rows=2 width=8)
   CTE cte
     ->  Result  (cost=0.00..0.01 rows=1 width=8)
   ->  Sort  (cost=0.08..0.09 rows=2 width=8)
         Sort Key: cte.random
         ->  Append  (cost=0.00..0.07 rows=2 width=8)
               ->  CTE Scan on cte  (cost=0.00..0.02 rows=1 width=8)
               ->  CTE Scan on cte cte_1  (cost=0.00..0.02 rows=1 width=8)
(8 rows)

radu=> explain with cte as not materialized (select 1) select * from cte union select * from cte;
                         QUERY PLAN                         
------------------------------------------------------------
 Unique  (cost=0.06..0.07 rows=2 width=4)
   ->  Sort  (cost=0.06..0.07 rows=2 width=4)
         Sort Key: (1)
         ->  Append  (cost=0.00..0.05 rows=2 width=4)
               ->  Result  (cost=0.00..0.01 rows=1 width=4)
               ->  Result  (cost=0.00..0.01 rows=1 width=4)

Not sure if we want the same semantics. In our case it may be useful to allow overriding our side-effect logic when it is too conservative. @andy-kimball what do you think?

@andy-kimball
Copy link
Contributor Author

Sorry, I didn't see this question until now.

I agree that it seems better to allow a forced override rather than ignoring explicit instructions from the user, as Postgres does.

@RaduBerinde RaduBerinde moved this from Triage to New features in BACKLOG, NO NEW ISSUES: SQL Optimizer Apr 18, 2020
@craig craig bot closed this as completed in a661d13 Apr 24, 2020
BACKLOG, NO NEW ISSUES: SQL Optimizer automation moved this from New features to Done Apr 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) good first issue
Development

Successfully merging a pull request may close this issue.

4 participants