Merge #34522 #34529
34522: opt: Prune Update and Upsert input columns r=andy-kimball a=andy-kimball

Prune input columns that are not needed by Update and Upsert operators.
Needed columns include returned columns and columns needed to formulate
KV Put and Delete operations to implement the mutations. All other
columns can be pruned.

This commit contains new code to enable the Upsert fast path with the
CBO. The fast path uses the KV Put method to blindly insert new rows
or overwrite existing rows. Because any existing data is ignored, it is
not necessary to fetch existing rows, as is normally required. This
also allows a much simpler logical plan.
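
The difference between the two paths can be sketched with a toy in-memory store. This is an illustration under stated assumptions, not CockroachDB's KV API: `kvStore`, `upsertWithFetch`, and `upsertBlind` are hypothetical, and the `reads` counter exists only to show that the blind path never fetches the existing row.

```go
package main

import "fmt"

// kvStore is a toy stand-in for the KV layer (hypothetical, for illustration).
type kvStore struct {
	data  map[string]string
	reads int // number of Get calls, to make the saved fetch visible
}

func newKVStore() *kvStore { return &kvStore{data: make(map[string]string)} }

// Get fetches a value and counts the read.
func (s *kvStore) Get(key string) (string, bool) {
	s.reads++
	v, ok := s.data[key]
	return v, ok
}

// Put writes a value, overwriting whatever was stored under the key.
func (s *kvStore) Put(key, value string) { s.data[key] = value }

// upsertWithFetch models the general path: read the existing row first,
// then write. It reports whether a row already existed.
func upsertWithFetch(s *kvStore, key, value string) (existed bool) {
	_, existed = s.Get(key)
	s.Put(key, value)
	return existed
}

// upsertBlind models the fast path: a single Put that inserts or overwrites
// without looking at existing data.
func upsertBlind(s *kvStore, key, value string) { s.Put(key, value) }

func main() {
	s := newKVStore()
	upsertBlind(s, "/t/1", "a") // inserts a new row
	upsertBlind(s, "/t/1", "b") // overwrites the existing row
	fmt.Println(s.data["/t/1"], s.reads) // prints b 0
}
```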

After this change, optimizer-planned updates and upserts no longer need
to stay behind a feature flag, as all fast paths should now work at
least as well as they do with the heuristic planner.

Release note (sql change): Delete, Update, and Upsert statements are
now planned by the cost-based optimizer.

34529: sql: enable automatic statistics by default r=rytaft a=rytaft

This commit changes the default value of the cluster setting
`sql.stats.experimental_automatic_collection.enabled` to true.
As a result, automatic statistics collection is now enabled by
default. It can still be disabled by setting
`sql.stats.experimental_automatic_collection.enabled=false`.

Release note (sql change): Enabled automatic statistics collection.

Co-authored-by: Andrew Kimball <andyk@cockroachlabs.com>
Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
3 people committed Feb 5, 2019
3 parents 0529ceb + cabe058 + cf6eb36 commit e3138ac
Showing 73 changed files with 2,058 additions and 895 deletions.
3 changes: 1 addition & 2 deletions docs/generated/settings/settings.html
@@ -64,7 +64,6 @@
<tr><td><code>server.web_session_timeout</code></td><td>duration</td><td><code>168h0m0s</code></td><td>the duration that a newly created web session will be valid</td></tr>
<tr><td><code>sql.defaults.default_int_size</code></td><td>integer</td><td><code>8</code></td><td>the size, in bytes, of an INT type</td></tr>
<tr><td><code>sql.defaults.distsql</code></td><td>enumeration</td><td><code>1</code></td><td>default distributed SQL execution mode [off = 0, auto = 1, on = 2]</td></tr>
<tr><td><code>sql.defaults.experimental_optimizer_mutations</code></td><td>boolean</td><td><code>false</code></td><td>default experimental_optimizer_mutations mode</td></tr>
<tr><td><code>sql.defaults.experimental_vectorize</code></td><td>enumeration</td><td><code>0</code></td><td>default experimental_vectorize mode [off = 0, on = 1, always = 2]</td></tr>
<tr><td><code>sql.defaults.optimizer</code></td><td>enumeration</td><td><code>1</code></td><td>default cost-based optimizer mode [off = 0, on = 1, local = 2]</td></tr>
<tr><td><code>sql.defaults.results_buffer.size</code></td><td>byte size</td><td><code>16 KiB</code></td><td>default size of the buffer that accumulates results for a statement or a batch of statements before they are sent to the client. This can be overridden on an individual connection with the 'results_buffer_size' parameter. Note that auto-retries generally only happen while no results have been delivered to the client, so reducing this size can increase the number of retriable errors a client receives. On the other hand, increasing the buffer size can increase the delay until the client receives the first result row. Updating the setting only affects new connections. Setting to 0 disables any buffering.</td></tr>
@@ -84,7 +83,7 @@
<tr><td><code>sql.metrics.statement_details.threshold</code></td><td>duration</td><td><code>0s</code></td><td>minimum execution time to cause statistics to be collected</td></tr>
<tr><td><code>sql.parallel_scans.enabled</code></td><td>boolean</td><td><code>true</code></td><td>parallelizes scanning different ranges when the maximum result size can be deduced</td></tr>
<tr><td><code>sql.query_cache.enabled</code></td><td>boolean</td><td><code>false</code></td><td>enable the query cache</td></tr>
<tr><td><code>sql.stats.experimental_automatic_collection.enabled</code></td><td>boolean</td><td><code>false</code></td><td>experimental automatic statistics collection mode</td></tr>
<tr><td><code>sql.stats.experimental_automatic_collection.enabled</code></td><td>boolean</td><td><code>true</code></td><td>experimental automatic statistics collection mode</td></tr>
<tr><td><code>sql.tablecache.lease.refresh_limit</code></td><td>integer</td><td><code>50</code></td><td>maximum number of tables to periodically refresh leases for</td></tr>
<tr><td><code>sql.trace.log_statement_execute</code></td><td>boolean</td><td><code>false</code></td><td>set to true to enable logging of executed statements</td></tr>
<tr><td><code>sql.trace.session_eventlog.enabled</code></td><td>boolean</td><td><code>false</code></td><td>set to true to enable session tracing</td></tr>
2 changes: 1 addition & 1 deletion pkg/server/updates_test.go
@@ -680,13 +680,13 @@ func TestReportUsage(t *testing.T) {
`[false,false,false] SET application_name = $1`,
`[false,false,false] SET application_name = DEFAULT`,
`[false,false,false] SET application_name = _`,
`[false,false,false] UPDATE _ SET _ = _ + _`,
`[true,false,false] CREATE TABLE _ (_ INT8, CONSTRAINT _ CHECK (_ > _))`,
`[true,false,false] INSERT INTO _ SELECT unnest(ARRAY[_, _, __more2__])`,
`[true,false,false] INSERT INTO _ VALUES (_), (__more2__)`,
`[true,false,false] INSERT INTO _ VALUES (length($1::STRING)), (__more1__)`,
`[true,false,false] INSERT INTO _(_, _) VALUES (_, _)`,
`[true,false,false] SELECT (_, _, __more2__) = (SELECT _, _, _, _ FROM _ LIMIT _)`,
`[true,false,false] UPDATE _ SET _ = _ + _`,
`[true,false,true] CREATE TABLE _ (_ INT8 PRIMARY KEY, _ INT8, INDEX (_) INTERLEAVE IN PARENT _ (_))`,
`[true,false,true] SELECT _ / $1`,
`[true,false,true] SELECT _ / _`,
7 changes: 3 additions & 4 deletions pkg/sql/descriptor_mutation_test.go
@@ -334,10 +334,9 @@ CREATE INDEX allidx ON t.test (k, v);
// The default value of "i" for column "i" is written.
afterInsert = [][]string{{"a", "z", "q"}, {"c", "x", "i"}}
if useUpsert {
// Update is not a noop for column "i". Column "i" gets updated
// with its default value (#9474).
afterUpdate = [][]string{{"a", "u", "i"}, {"c", "x", "i"}}
afterPKUpdate = [][]string{{"a", "u", "i"}, {"d", "x", "i"}}
// Update is not a noop for column "i".
afterUpdate = [][]string{{"a", "u", "q"}, {"c", "x", "i"}}
afterPKUpdate = [][]string{{"a", "u", "q"}, {"d", "x", "i"}}
} else {
// Update is a noop for column "i".
afterUpdate = [][]string{{"a", "u", "q"}, {"c", "x", "i"}}
12 changes: 0 additions & 12 deletions pkg/sql/exec_util.go
@@ -127,14 +127,6 @@ var OptimizerClusterMode = settings.RegisterEnumSetting(
},
)

// OptimizerMutationsClusterMode controls the cluster default for when the cost-
// based optimizer is planning mutation statements.
var OptimizerMutationsClusterMode = settings.RegisterBoolSetting(
"sql.defaults.experimental_optimizer_mutations",
"default experimental_optimizer_mutations mode",
false,
)

// VectorizeClusterMode controls the cluster default for when automatic
// vectorization is enabled.
var VectorizeClusterMode = settings.RegisterEnumSetting(
@@ -1676,10 +1668,6 @@ func (m *sessionDataMutator) SetOptimizerMode(val sessiondata.OptimizerMode) {
m.data.OptimizerMode = val
}

func (m *sessionDataMutator) SetOptimizerMutations(val bool) {
m.data.OptimizerMutations = val
}

func (m *sessionDataMutator) SetSerialNormalizationMode(val sessiondata.SerialNormalizationMode) {
m.data.SerialNormalizationMode = val
}
22 changes: 12 additions & 10 deletions pkg/sql/logictest/logic.go
@@ -389,6 +389,8 @@ type testClusterConfig struct {
overrideDistSQLMode string
// if non-empty, overrides the default experimental_vectorize mode.
overrideExpVectorize string
// if non-empty, overrides the default automatic statistics mode.
overrideAutoStats string
// if set, queries using distSQL processors that can fall back to disk do
// so immediately, using only their disk-based implementation.
distSQLUseDisk bool
@@ -427,18 +429,18 @@ var logicTestConfigs = []testClusterConfig{
serverVersion: roachpb.Version{Major: 1, Minor: 1},
disableUpgrade: true,
},
{name: "local-opt", numNodes: 1, overrideDistSQLMode: "off", overrideOptimizerMode: "on"},
{name: "local-opt", numNodes: 1, overrideDistSQLMode: "off", overrideOptimizerMode: "on", overrideAutoStats: "false"},
{name: "local-parallel-stmts", numNodes: 1, parallelStmts: true, overrideDistSQLMode: "off", overrideOptimizerMode: "off"},
{name: "local-vec", numNodes: 1, overrideOptimizerMode: "off", overrideExpVectorize: "on"},
{name: "fakedist", numNodes: 3, useFakeSpanResolver: true, overrideDistSQLMode: "on", overrideOptimizerMode: "off"},
{name: "fakedist-opt", numNodes: 3, useFakeSpanResolver: true, overrideDistSQLMode: "on", overrideOptimizerMode: "on"},
{name: "fakedist-opt", numNodes: 3, useFakeSpanResolver: true, overrideDistSQLMode: "on", overrideOptimizerMode: "on", overrideAutoStats: "false"},
{name: "fakedist-metadata", numNodes: 3, useFakeSpanResolver: true, overrideDistSQLMode: "on", overrideOptimizerMode: "off",
distSQLMetadataTestEnabled: true, skipShort: true},
{name: "fakedist-disk", numNodes: 3, useFakeSpanResolver: true, overrideDistSQLMode: "on", overrideOptimizerMode: "off",
distSQLUseDisk: true, skipShort: true},
{name: "5node-local", numNodes: 5, overrideDistSQLMode: "off", overrideOptimizerMode: "off"},
{name: "5node-dist", numNodes: 5, overrideDistSQLMode: "on", overrideOptimizerMode: "off"},
{name: "5node-dist-opt", numNodes: 5, overrideDistSQLMode: "on", overrideOptimizerMode: "on"},
{name: "5node-dist-opt", numNodes: 5, overrideDistSQLMode: "on", overrideOptimizerMode: "on", overrideAutoStats: "false"},
{name: "5node-dist-metadata", numNodes: 5, overrideDistSQLMode: "on", distSQLMetadataTestEnabled: true,
skipShort: true, overrideOptimizerMode: "off"},
{name: "5node-dist-disk", numNodes: 5, overrideDistSQLMode: "on", distSQLUseDisk: true, skipShort: true,
@@ -933,13 +935,6 @@ func (t *logicTest) setUser(user string) func() {
if _, err := db.Exec(fmt.Sprintf("SET OPTIMIZER = %s;", optMode)); err != nil {
t.Fatal(err)
}

// Use the cost-based-optimizer for planning mutation statements.
if optMode == "on" {
if _, err := db.Exec("SET experimental_optimizer_mutations = true"); err != nil {
t.Fatal(err)
}
}
}
// The default value for extra_float_digits assumed by tests is
// 0. However, lib/pq by default configures this to 2 during
@@ -1075,6 +1070,13 @@ func (t *logicTest) setup(cfg testClusterConfig) {
t.Fatal(err)
}
}
if cfg.overrideAutoStats != "" {
if _, err := t.cluster.ServerConn(0).Exec(
"SET CLUSTER SETTING sql.stats.experimental_automatic_collection.enabled = $1::bool", cfg.overrideAutoStats,
); err != nil {
t.Fatal(err)
}
}

// db may change over the lifetime of this function, with intermediate
// values cached in t.clients and finally closed in t.close().
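
The override convention used in this file (an empty string leaves the cluster default alone, a non-empty string is applied as the setting's value) can be sketched in isolation. This is a simplified illustration, not the test harness itself: `clusterConfig` and `autoStatsOverride` are hypothetical names, and the function only builds the statement and argument rather than executing them.

```go
package main

import "fmt"

// clusterConfig is a trimmed-down, hypothetical version of a test
// configuration; only the field relevant to this sketch is kept.
type clusterConfig struct {
	name              string
	overrideAutoStats string // "" means: keep the cluster default
}

const autoStatsSetting = "sql.stats.experimental_automatic_collection.enabled"

// autoStatsOverride returns the statement and its single argument to run for
// cfg. ok is false when the config does not override the setting.
func autoStatsOverride(cfg clusterConfig) (stmt string, arg string, ok bool) {
	if cfg.overrideAutoStats == "" {
		return "", "", false
	}
	return "SET CLUSTER SETTING " + autoStatsSetting + " = $1::bool", cfg.overrideAutoStats, true
}

func main() {
	configs := []clusterConfig{
		{name: "local"},
		{name: "local-opt", overrideAutoStats: "false"},
	}
	for _, cfg := range configs {
		stmt, arg, ok := autoStatsOverride(cfg)
		fmt.Printf("%s: override=%t stmt=%q arg=%q\n", cfg.name, ok, stmt, arg)
	}
}
```

Using the empty string as "no override" keeps existing configurations unchanged: only the entries that set the field explicitly alter the cluster setting.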
7 changes: 6 additions & 1 deletion pkg/sql/logictest/testdata/logic_test/alter_table
@@ -32,6 +32,7 @@ ALTER TABLE t ADD CONSTRAINT foo UNIQUE (b)
query TTTTRT
SELECT job_type, description, user_name, status, fraction_completed, error
FROM crdb_internal.jobs
WHERE job_type = 'SCHEMA CHANGE'
ORDER BY created DESC
LIMIT 1
----
@@ -71,6 +72,7 @@ ALTER TABLE t ADD CONSTRAINT bar UNIQUE (c)
query TTTTTRT
SELECT job_type, regexp_replace(description, 'JOB \d+', 'JOB ...'), user_name, status, running_status, fraction_completed::decimal(10,2), error
FROM crdb_internal.jobs
WHERE job_type = 'SCHEMA CHANGE'
ORDER BY created DESC
LIMIT 2
----
@@ -181,6 +183,7 @@ DROP INDEX foo CASCADE
query TTTTTRT
SELECT job_type, description, user_name, status, running_status, fraction_completed, error
FROM crdb_internal.jobs
WHERE job_type = 'SCHEMA CHANGE'
ORDER BY created DESC
LIMIT 1
----
@@ -264,6 +267,7 @@ DROP INDEX t@t_f_idx
query TTTTTRT
SELECT job_type, description, user_name, status, running_status, fraction_completed, error
FROM crdb_internal.jobs
WHERE job_type = 'SCHEMA CHANGE'
ORDER BY created DESC
LIMIT 1
----
@@ -650,7 +654,8 @@ SELECT count(distinct a) FROM impure

# No orphaned schema change jobs.
query I
SELECT count(*) FROM crdb_internal.jobs WHERE status = 'pending' OR status = 'started'
SELECT count(*) FROM crdb_internal.jobs
WHERE job_type = 'SCHEMA CHANGE' AND (status = 'pending' OR status = 'started')
----
0
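
A filter that mixes AND with OR is sensitive to operator precedence: in SQL, AND binds more tightly than OR, so `A AND B OR C` is evaluated as `(A AND B) OR C`. The sketch below illustrates the difference in Go, whose `&&` and `||` have the same relative precedence. The `job` type and the "OTHER" job type are hypothetical values chosen for the example.

```go
package main

import "fmt"

// job is a hypothetical, minimal view of a row in a jobs table.
type job struct {
	jobType string
	status  string
}

// ungrouped mirrors a filter written as
//   job_type = 'SCHEMA CHANGE' AND status = 'pending' OR status = 'started'
// which parses as (type AND pending) OR started.
func ungrouped(j job) bool {
	return j.jobType == "SCHEMA CHANGE" && j.status == "pending" || j.status == "started"
}

// grouped mirrors the same filter with the status checks parenthesized:
//   job_type = 'SCHEMA CHANGE' AND (status = 'pending' OR status = 'started')
func grouped(j job) bool {
	return j.jobType == "SCHEMA CHANGE" && (j.status == "pending" || j.status == "started")
}

func main() {
	// A started job of some other type slips through the ungrouped filter.
	j := job{jobType: "OTHER", status: "started"}
	fmt.Println(ungrouped(j), grouped(j)) // prints true false
}
```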

4 changes: 4 additions & 0 deletions pkg/sql/logictest/testdata/logic_test/delete
@@ -231,6 +231,10 @@ COMMIT

subtest regression_33361

# Disable automatic stats to avoid flakiness (sometimes causes retry errors).
statement ok
SET CLUSTER SETTING sql.stats.experimental_automatic_collection.enabled = false

statement ok
CREATE TABLE t33361(x INT PRIMARY KEY, y INT UNIQUE, z INT); INSERT INTO t33361 VALUES (1, 2, 3)

4 changes: 4 additions & 0 deletions pkg/sql/logictest/testdata/logic_test/distsql_stats
@@ -1,5 +1,9 @@
# LogicTest: 5node-dist 5node-dist-opt 5node-dist-metadata

# Disable automatic stats.
statement ok
SET CLUSTER SETTING sql.stats.experimental_automatic_collection.enabled = false

statement ok
CREATE TABLE data (a INT, b INT, c FLOAT, d DECIMAL, PRIMARY KEY (a, b, c, d), INDEX c_idx (c, d))

2 changes: 1 addition & 1 deletion pkg/sql/logictest/testdata/logic_test/drop_table
@@ -26,7 +26,7 @@ statement ok
DROP TABLE a

query TT
SELECT status, running_status FROM [SHOW JOBS]
SELECT status, running_status FROM [SHOW JOBS] WHERE job_type = 'SCHEMA CHANGE'
----
running waiting for GC TTL

1 change: 1 addition & 0 deletions pkg/sql/logictest/testdata/logic_test/event_log
@@ -342,6 +342,7 @@ SELECT "targetID", "reportingID", "info"
FROM system.eventlog
WHERE "eventType" = 'set_cluster_setting'
AND info NOT LIKE '%version%' AND info NOT LIKE '%sql.defaults.distsql%' AND info NOT LIKE '%cluster.secret%'
AND info NOT LIKE '%sql.stats.experimental_automatic_collection.enabled%'
ORDER BY "timestamp"
----
0 1 {"SettingName":"diagnostics.reporting.enabled","Value":"true","User":"root"}
4 changes: 4 additions & 0 deletions pkg/sql/logictest/testdata/logic_test/fk
@@ -1,5 +1,9 @@
# LogicTest: local local-opt local-parallel-stmts fakedist fakedist-opt fakedist-metadata

# Disable automatic stats to avoid flakiness.
statement ok
SET CLUSTER SETTING sql.stats.experimental_automatic_collection.enabled = false

statement ok
CREATE TABLE customers (id INT PRIMARY KEY, email STRING UNIQUE)

17 changes: 0 additions & 17 deletions pkg/sql/logictest/testdata/logic_test/optimizer
@@ -90,10 +90,6 @@ SELECT * FROM tview
3 30
4 40

query I rowsort
SELECT job_id FROM crdb_internal.jobs
----

statement ok
SET OPTIMIZER = ALWAYS

@@ -115,16 +111,3 @@ SELECT * FROM test.t
2 20
3 30
4 40

# Test the experimental_optimizer_mutations flag.
statement ok
SET experimental_optimizer_mutations = false

statement error pq: no data source matches prefix: t
UPDATE t SET v=(SELECT v+1 FROM t AS t2 WHERE t2.k=t.k)

statement ok
SET experimental_optimizer_mutations = true

statement ok
UPDATE t SET v=(SELECT v+1 FROM t AS t2 WHERE t2.k=t.k)
5 changes: 2 additions & 3 deletions pkg/sql/logictest/testdata/logic_test/pg_catalog
@@ -1326,7 +1326,7 @@ SELECT
FROM
pg_catalog.pg_settings
WHERE
name != 'optimizer' AND name != 'crdb_version' AND name != 'experimental_optimizer_mutations'
name != 'optimizer' AND name != 'crdb_version'
----
name setting category short_desc extra_desc vartype
application_name · NULL NULL NULL string
@@ -1372,7 +1372,7 @@ SELECT
FROM
pg_catalog.pg_settings
WHERE
name != 'optimizer' AND name != 'crdb_version' AND name != 'experimental_optimizer_mutations'
name != 'optimizer' AND name != 'crdb_version'
----
name setting unit context enumvals boot_val reset_val
application_name · NULL user NULL · ·
@@ -1429,7 +1429,6 @@ default_transaction_read_only NULL NULL NULL NULL NULL
distsql NULL NULL NULL NULL NULL
experimental_enable_zigzag_join NULL NULL NULL NULL NULL
experimental_force_split_at NULL NULL NULL NULL NULL
experimental_optimizer_mutations NULL NULL NULL NULL NULL
experimental_reorder_joins_limit NULL NULL NULL NULL NULL
experimental_serial_normalization NULL NULL NULL NULL NULL
experimental_vectorize NULL NULL NULL NULL NULL
4 changes: 4 additions & 0 deletions pkg/sql/logictest/testdata/logic_test/privileges_table
@@ -1,5 +1,9 @@
# LogicTest: local local-opt local-parallel-stmts fakedist fakedist-opt fakedist-metadata

# Disable automatic stats to avoid flakiness.
statement ok
SET CLUSTER SETTING sql.stats.experimental_automatic_collection.enabled = false

# Test default table-level permissions.
# Default user is root.
statement ok
9 changes: 7 additions & 2 deletions pkg/sql/logictest/testdata/logic_test/schema_change_in_txn
@@ -1,5 +1,9 @@
# LogicTest: local local-opt local-parallel-stmts fakedist fakedist-opt fakedist-metadata

# Disable automatic stats to avoid flakiness (sometimes causes retry errors).
statement ok
SET CLUSTER SETTING sql.stats.experimental_automatic_collection.enabled = false

subtest create_and_add_fk_in_same_txn

statement ok
@@ -718,7 +722,7 @@ query TTT
SELECT status,
running_status,
regexp_replace(description, 'ROLL BACK JOB \d+.*', 'ROLL BACK JOB') as description
FROM [SHOW JOBS] ORDER BY job_id DESC LIMIT 2
FROM [SHOW JOBS] WHERE job_type = 'SCHEMA CHANGE' ORDER BY job_id DESC LIMIT 2
----
running waiting for GC TTL ROLL BACK JOB
failed NULL ALTER TABLE test.public.customers ADD COLUMN i INT8 DEFAULT 5;ALTER TABLE test.public.customers ADD COLUMN j INT8 DEFAULT 4;ALTER TABLE test.public.customers ADD COLUMN l INT8 DEFAULT 3;ALTER TABLE test.public.customers ADD COLUMN m CHAR;ALTER TABLE test.public.customers ADD COLUMN n CHAR DEFAULT 'a';CREATE INDEX j_idx ON test.public.customers (j);CREATE INDEX l_idx ON test.public.customers (l);CREATE INDEX m_idx ON test.public.customers (m);CREATE UNIQUE INDEX i_idx ON test.public.customers (i);CREATE UNIQUE INDEX n_idx ON test.public.customers (n)
@@ -748,7 +752,8 @@ x 5 4 15 19
z 5 4 15 19

query TT
SELECT status, description FROM [SHOW JOBS] ORDER BY job_id DESC LIMIT 1
SELECT status, description FROM [SHOW JOBS]
WHERE job_type = 'SCHEMA CHANGE' ORDER BY job_id DESC LIMIT 1
----
succeeded ALTER TABLE test.public.customers ADD COLUMN i INT8 DEFAULT 5;ALTER TABLE test.public.customers ADD COLUMN j INT8 AS (i - 1) STORED;ALTER TABLE test.public.customers ADD COLUMN d INT8 DEFAULT 15, ADD COLUMN e INT8 AS (d + j) STORED

2 changes: 1 addition & 1 deletion pkg/sql/logictest/testdata/logic_test/show_source
@@ -23,7 +23,7 @@ UTF8 1
query TT colnames
SELECT *
FROM [SHOW ALL]
WHERE variable != 'optimizer' AND variable != 'crdb_version' AND variable != 'experimental_optimizer_mutations'
WHERE variable != 'optimizer' AND variable != 'crdb_version'
----
variable value
application_name ·
4 changes: 3 additions & 1 deletion pkg/sql/logictest/testdata/logic_test/system
@@ -442,6 +442,7 @@ query T
SELECT name
FROM system.settings
WHERE name != 'sql.defaults.distsql'
AND name != 'sql.stats.experimental_automatic_collection.enabled'
ORDER BY name
----
cluster.secret
@@ -456,7 +457,8 @@ INSERT INTO system.settings (name, value) VALUES ('somesetting', 'somevalue')
query TT
SELECT name, value
FROM system.settings
WHERE name NOT IN ('version', 'sql.defaults.distsql', 'cluster.secret')
WHERE name NOT IN ('version', 'sql.defaults.distsql', 'cluster.secret',
'sql.stats.experimental_automatic_collection.enabled')
ORDER BY name
----
diagnostics.reporting.enabled true
2 changes: 1 addition & 1 deletion pkg/sql/logictest/testdata/logic_test/truncate
@@ -51,7 +51,7 @@ SELECT * FROM kview
----

query TT
SELECT status, running_status FROM [SHOW JOBS]
SELECT status, running_status FROM [SHOW JOBS] WHERE job_type = 'SCHEMA CHANGE'
----
running waiting for GC TTL
