Skip to content

sql: fast-path blind UPSERT does not have distinct check #146759

@michae2

Description

@michae2

(Thanks to @fabiog1901 for finding this!)

#45372 added a distinct check to UPSERT and INSERT ON CONFLICT that prevents the same row from being modified twice by the statement. But it looks like the fast-path blind upsert does not have this same distinct check, leading to inconsistent behavior for UPSERT statements.

Here's a repro using current tip of master (59c35f8):

CREATE TABLE t (id INT PRIMARY KEY, v STRING);

-- INSERT ON CONFLICT has a distinct check
EXPLAIN INSERT INTO t VALUES (5, '1'), (5, '2') ON CONFLICT (id) DO UPDATE SET v = EXCLUDED.v;
--   • upsert
--   │ into: t(id, v)
--   │ auto commit
--   │ arbiter indexes: t_pkey
--
--   └── • lookup join (left outer)
--       │ estimated row count: 2
--       │ table: t@t_pkey
--       │ equality: (column1) = (id)
--       │ equality cols are key
--
--       └── • distinct
--           │ estimated row count: 2
--           │ distinct on: column1
--           │ nulls are distinct
--           │ error on duplicate
--
--           └── • values
--                 size: 2 columns, 2 rows

INSERT INTO t VALUES (5, '1'), (5, '2') ON CONFLICT (id) DO UPDATE SET v = EXCLUDED.v;
-- ERROR: UPSERT or INSERT...ON CONFLICT command cannot affect row a second time
-- SQLSTATE: 21000

-- the fast-path blind upsert plan is missing this distinct check
EXPLAIN UPSERT INTO t VALUES (5, '1'), (5, '2');
--   • upsert
--   │ into: t(id, v)
--   │ auto commit
--
--   └── • values
--         size: 2 columns, 2 rows

UPSERT INTO t VALUES (5, '1'), (5, '2');
-- succeeds

SELECT * FROM t;
--   id | v
-- -----+----
--    5 | 2

CREATE INDEX ON t (v);

-- after we create a secondary index, the fast-path no longer applies
EXPLAIN UPSERT INTO t VALUES (6, '1'), (6, '2');
--   • upsert
--   │ into: t(id, v)
--   │ auto commit
--   │ arbiter indexes: t_pkey
--
--   └── • lookup join (left outer)
--       │ estimated row count: 2
--       │ table: t@t_pkey
--       │ equality: (column1) = (id)
--       │ equality cols are key
--
--       └── • distinct
--           │ estimated row count: 2
--           │ distinct on: column1
--           │ nulls are distinct
--           │ error on duplicate
--
--           └── • values
--                 size: 2 columns, 2 rows

-- and now this matches the behavior of INSERT ON CONFLICT
UPSERT INTO t VALUES (6, '1'), (6, '2');
-- ERROR: UPSERT or INSERT...ON CONFLICT command cannot affect row a second time
-- SQLSTATE: 21000

I don't think this fast-path UPSERT can cause corruption, as in #70731, so the only problem here is the confusing difference in behavior between plans for the same statement, depending on the indexes.

Jira issue: CRDB-50690

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-sql-optimizerSQL logical planning and optimizations.C-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.T-sql-queriesSQL Queries Teambranch-release-20.1Used to mark GA and release blockers, technical advisories, and bugs for 20.1

    Type

    No type

    Projects

    Status

    Bugs to Fix

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions