Update CBO docs to list more supported statements
A bit more work to address #3998.

Summary of changes:

- Edit the 'Supported statements' section of the ['Cost-based
  Optimizer'][1] page as follows:

  - Add `DELETE` per cockroachdb/cockroach#34522

  - Add `INSERT .. ON CONFLICT` variants per cockroachdb/cockroach#33339

  - Add `SELECT`, `VALUES`, and `UNION` statements that do not include
    window functions

  - Add `FILTER` clause on aggregate functions per
    cockroachdb/cockroach#34077

  - Remove `experimental_optimizer_updates` cluster setting

[1]: http://www.cockroachlabs.com/docs/v2.2/cost-based-optimizer.html
rmloveland committed Feb 11, 2019
1 parent 170bd3e commit 7fbce7b
Showing 2 changed files with 15 additions and 21 deletions.
3 changes: 1 addition & 2 deletions _includes/v2.2/sql/settings/settings.md
@@ -62,7 +62,6 @@
<tr><td><code>sql.defaults.default_int_size</code></td><td>integer</td><td><code>8</code></td><td>the size, in bytes, of an INT type</td></tr>
<tr><td><code>sql.defaults.distsql</code></td><td>enumeration</td><td><code>1</code></td><td>default distributed SQL execution mode [off = 0, auto = 1, on = 2, 2.0-off = 3, 2.0-auto = 4]</td></tr>
<tr><td><code>sql.stats.experimental_automatic_collection.enabled</code></td><td>boolean</td><td><code>true</code></td><td>If <code>true</code>, turn on the experimental <a href="create-statistics.html#automatic-table-statistics">automatic statistics collection</a>.</td></tr>
-<tr><td><code>sql.defaults.experimental_optimizer_mutations</code></td><td>boolean</td><td><code>false</code></td><td>default experimental_optimizer_mutations mode</td></tr>
<tr><td><code>sql.defaults.experimental_vectorize</code></td><td>enumeration</td><td><code>0</code></td><td>default experimental_vectorize mode [off = 0, on = 1, always = 2]</td></tr>
<tr><td><code>sql.defaults.optimizer</code></td><td>enumeration</td><td><code>1</code></td><td>default cost-based optimizer mode [off = 0, on = 1, local = 2]</td></tr>
<tr><td><code>sql.defaults.results_buffer.size</code></td><td>byte size</td><td><code>16 KiB</code></td><td>size of the buffer that accumulates results for a statement or a batch of statements before they are sent to the client. Note that auto-retries generally only happen while no results have been delivered to the client, so reducing this size can increase the number of retriable errors a client receives. On the other hand, increasing the buffer size can increase the delay until the client receives the first result row. Updating the setting only affects new connections. Setting to 0 disables any buffering.</td></tr>
@@ -92,6 +91,6 @@
<tr><td><code>trace.debug.enable</code></td><td>boolean</td><td><code>false</code></td><td>if set, traces for recent requests can be seen in the /debug page</td></tr>
<tr><td><code>trace.lightstep.token</code></td><td>string</td><td><code></code></td><td>if set, traces go to Lightstep using this token</td></tr>
<tr><td><code>trace.zipkin.collector</code></td><td>string</td><td><code></code></td><td>if set, traces go to the given Zipkin instance (example: '127.0.0.1:9411'); ignored if trace.lightstep.token is set.</td></tr>
-<tr><td><code>version</code></td><td>custom validation</td><td><code>2.1-4</code></td><td>set the active cluster version in the format '<major>.<minor>'.</td></tr>
+<tr><td><code>version</code></td><td>custom validation</td><td><code>2.1-4</code></td><td>set the active cluster version in the format '{major}.{minor}'.</td></tr>
</tbody>
</table>
33 changes: 14 additions & 19 deletions v2.2/cost-based-optimizer.md
@@ -27,13 +27,13 @@ The most important factor in determining the quality of a plan is cardinality (i

## View query plan

-To see whether a query will be run with the cost-based optimizer, run the query with [`EXPLAIN (OPT)`](explain.html#opt-option). The `OPT` option displays a query plan tree, along with some information that was used to plan the query. If the query is unsupported (i.e., it returns an error like `pq: unsupported statement: *tree.Insert` or `pq: aggregates with FILTER are not supported yet`), the query will not be run with the cost-based optimizer and will be run with the legacy heuristic planner.
+To see whether a query will be run with the cost-based optimizer, run the query with [`EXPLAIN (OPT)`](explain.html). The `OPT` option displays a query plan tree, along with some information that was used to plan the query. If the query is unsupported (i.e., it returns an error message that starts with e.g., `pq: unsupported statement` or `pq: aggregates with FILTER are not supported yet`), the query will not be run with the cost-based optimizer and will be run with the legacy heuristic planner.

For example, the following query (which uses [CockroachDB's TPC-H data set](https://github.com/cockroachdb/cockroach/tree/b1a57102d8e99b301b74c97527c1b8ffd4a4f3f1/pkg/workload/tpch)) returns the query plan tree, which means that it will be run with the cost-based optimizer:

{% include copy-clipboard.html %}
~~~ sql
-> EXPLAIN(OPT) SELECT l_shipmode, avg(l_extendedprice) from lineitem GROUP BY l_shipmode;
+> EXPLAIN (OPT) SELECT l_shipmode, avg(l_extendedprice) from lineitem GROUP BY l_shipmode;
~~~

~~~
@@ -58,32 +58,27 @@ group-by
(16 rows)
~~~

-In contrast, this query returns `pq: unsupported statement: *tree.Insert`, which means that it will use the legacy heuristic planner instead of the cost-based optimizer:
-
-{% include copy-clipboard.html %}
-~~~ sql
-> EXPLAIN (OPT) INSERT INTO l_shipmode VALUES ("truck");
-~~~
-
-~~~
-pq: unsupported statement: *tree.Insert
-~~~
+In contrast, queries that are not supported by the cost-based optimizer return errors that begin with the string `pq: unsupported statement: ...` or specific messages like `pq: aggregates with FILTER are not supported yet`. Such queries will use the legacy heuristic planner instead of the cost-based optimizer.

## Types of statements supported by the cost-based optimizer

The cost-based optimizer supports most SQL statements. Specifically, the following types of statements are supported:

 - [`CREATE TABLE`](create-table.html)
-- [`INSERT`](insert.html)
-- [Sequences](create-sequence.html)
-- [Views](views.html)
-
-The following additional statements are supported by the optimizer if you set the `experimental_optimizer_updates` [cluster setting](set-cluster-setting.html) to `true`:
-
 - [`UPDATE`](update.html)
+- [`INSERT`](insert.html), including:
+  - `INSERT .. ON CONFLICT DO NOTHING`
+  - `INSERT .. ON CONFLICT .. DO UPDATE`
 - [`UPSERT`](upsert.html)
+- [`DELETE`](delete.html)
+- `FILTER` clauses on [aggregate functions](functions-and-operators.html#aggregate-functions)
+- [Sequences](create-sequence.html)
+- [Views](views.html)
+- All [`SELECT`](select.html) statements that do not include window functions
+- All `UNION` statements that do not include window functions
+- All `VALUES` statements that do not include window functions

-For instructions showing how to check whether a particular query will be run with the cost-based optimizer, see the [View query plan](#view-query-plan) section.
+This is not meant to be an exhaustive list. To check whether a particular query will be run with the cost-based optimizer, follow the instructions in the [View query plan](#view-query-plan) section.

## Table statistics


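The updated "View query plan" wording says that an unsupported query returns an error that starts with a known string. As a rough illustration only, that check could be sketched in Python as below. The function name and the prefix tuple are hypothetical; the two message strings are the ones quoted in the docs text, and the docs present them as examples, so the optimizer may return other messages that this sketch does not recognize.

```python
# Illustrative sketch only: the helper name and prefix tuple are assumptions.
# The two strings are the example messages quoted in the docs text; the docs
# do not claim they are the only messages the optimizer can return.
UNSUPPORTED_PREFIXES = (
    "pq: unsupported statement",
    "pq: aggregates with FILTER are not supported yet",
)


def falls_back_to_heuristic_planner(explain_opt_error):
    """Return True if an EXPLAIN (OPT) error message indicates that the
    query will run with the legacy heuristic planner.

    Pass None when EXPLAIN (OPT) returned a plan tree instead of an error.
    """
    if explain_opt_error is None:
        return False
    # str.startswith accepts a tuple and matches if any prefix matches.
    return explain_opt_error.strip().startswith(UNSUPPORTED_PREFIXES)
```

A caller would run `EXPLAIN (OPT) <statement>` through its own database driver and pass any resulting error text to this helper. Unrelated errors, such as a missing table, return `False` here and need separate handling.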