
sql: fix the batching behavior of mutation statements, especially with RETURNING #23373

Merged
merged 12 commits into cockroachdb:master from 20180305-delete-fixups on Apr 21, 2018

Conversation

knz
Contributor

@knz knz commented Mar 5, 2018

Fixes #22304.
Fixes #23156.
Fixes #24928.

To review:

  • @RaduBerinde for the physical property propagation. Check the order_by test, confirm that this enables optimizations of DELETE in a WITH clause or [ ... ]
  • @jordanlewis for overall design of the new batchedPlanNode interface.
  • @andreimatei @jordanlewis for correctness of the batching behavior (like we discussed - I think this does everything you want)


@knz knz changed the title from "20180305 delete fixups" to "sql: fix the batching behavior of DELETE, especially with RETURNING" on Mar 5, 2018
@knz knz added the do-not-merge label (bors won't merge a PR with this label) on Mar 5, 2018
@knz
Contributor Author

knz commented Mar 5, 2018

I am willing to extend this PR with the two subsequent commits that fix UPDATE and INSERT accordingly; or create separate PRs, as you recommend.

In any case I'd like to start the review with the DELETE changes in here, which show the general idea I want to apply to the other things afterwards.

@knz knz force-pushed the 20180305-delete-fixups branch 2 times, most recently from 93853ad to bf938e8 on March 5, 2018 20:26
@knz knz removed the do-not-merge label (bors won't merge a PR with this label) on Mar 5, 2018
@knz knz force-pushed the 20180305-delete-fixups branch 3 times, most recently from b0b2d74 to 799746d on March 5, 2018 20:44
@RaduBerinde
Member

Phys props stuff LGTM


Review status: 0 of 19 files reviewed at latest revision, all discussions resolved, some commit checks pending.



@knz knz force-pushed the 20180305-delete-fixups branch 2 times, most recently from 080ec1b to a52a052 on March 5, 2018 20:52
@andreimatei
Contributor

Review status: 0 of 19 files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


pkg/sql/plan_batch.go, line 23 at r1 (raw file):

)

// batchedPlanNode defines the interface for executing a query

if this extends planNode, then please call this out in the comment.

But I don't really get it - why extend planNode, if Values() is invalid? Also comments like "foo is contractually re-defined here to..." don't really make sense; that's not how interfaces work :). The whole point is that no implementer or derived interface can change any contract (or at least not in ways that can break unsuspecting clients of the interface).

I think we either extend planNode, in which case Values() needs to work and the implementers need to buffer rows, or we don't. You can extract all the planNode methods except Next() and Values() into a planNodeBase that both planNode and batchedPlanNode extend.
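Something like this rough sketch of the split (all names made up, just to illustrate the idea):

```go
// Rough sketch of the suggested factoring; all names here are made up.
package sketch

import "context"

// Stand-ins for the real runParams and tree.Datums types.
type runParams struct{}
type datums []interface{}

// planNodeBase holds everything except row iteration.
type planNodeBase interface {
	startExec(params runParams) error
	Close(ctx context.Context)
}

// planNode keeps the existing row-at-a-time contract.
type planNode interface {
	planNodeBase
	Next(params runParams) (bool, error)
	Values() datums
}

// batchedPlanNode gets its own batch-oriented methods instead of
// Next()/Values(), so no contract needs to be "re-defined".
type batchedPlanNode interface {
	planNodeBase
	// batch-advancing methods would go here
}
```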


pkg/sql/plan_batch.go, line 29 at r1 (raw file):

	// The planNode.Next() method is contractually (re-)defined here to
	// advance a full batch. Its first (boolean) return value is

I think "advance a full batch" here and "the number of rows processed in the last call to Next()" in the comment on BatchedCount don't mean much.


pkg/sql/plan_batch.go, line 33 at r1 (raw file):

	//
	// Additionally, there is no open KV batch (in particular, no
	// uncommitted in-progress KV write operation), nor unchecked or

"uncommitted" is the wrong term to use.

Generally, I don't like this comment very much. It tries to be very general, but I think that a) it will not make much sense to a reader and b) it's too general. It's unclear what "inconsistent" means, it's unclear what "observable" means...
I think we should put more narrow guidance for how writes behave in the face of constraints. Like:

NOTE: Nodes that perform writes and also return rows (e.g. INSERT... RETURNING...) do not return rows before checking foreign key, uniqueness and other CHECK constraints. The implementations execute the KV requests for writing the respective rows (which implicitly checks for some of the constraints) and otherwise perform all other checks before returning a row.

pkg/sql/plan_batch.go, line 51 at r1 (raw file):

var _ batchedPlanNode = &deleteNode{}

// serializeNode serializes the results of a batchedPlanNode into a

nit: I think batchNodeAdapter would be a better name.



@andreimatei
Contributor

Review status: 0 of 19 files reviewed at latest revision, 9 unresolved discussions, some commit checks failed.


pkg/sql/delete.go, line 161 at r1 (raw file):

	rowsNeeded rowsNeeded

	// fastPath is computed during startExec.

please also say what this fastPath means


pkg/sql/delete.go, line 178 at r1 (raw file):

// single DELETE batch. This combines with maxDeleteBatchSize below
// to limit the amount of work done in each DELETE batch.
const maxDeleteBatchRowCount = 1000

why do we have both these limits? Would one (either one) do by itself?


pkg/sql/returning.go, line 27 at r1 (raw file):

)

// resultsNeeded determines whether a statement that has a RETURNING

nit: this comment is confusing because it reads as if it was already determined that a RETURNING clause is specified. Say something like "a statement that might have a returning clause"...


pkg/sql/returning.go, line 41 at r1 (raw file):

}

type rowsNeeded bool

nit: my preference is to only use bool enums for input arguments. As far as I see, this rowsNeeded type is only returned or stored, never passed around, so I think it does more harm than good; a bool would be clearer and would lead to shorter ifs.


pkg/sql/returning.go, line 56 at r1 (raw file):

) (planNode, error) {
	// serialize the data-modifying plan to ensure that no data is
	// observed that hasn't been committed/checked first.

"committed" is the wrong word



@knz
Contributor Author

knz commented Mar 7, 2018

Review status: 0 of 19 files reviewed at latest revision, 9 unresolved discussions.


pkg/sql/delete.go, line 161 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

please also say what this fastPath means

Done.


pkg/sql/delete.go, line 178 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

why do we have both these limits? Would one (either one) do by itself?

Yeah I guess just the batch size is important. The row count is constrained by the max batch size anyways. Done.


pkg/sql/plan_batch.go, line 23 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

if this extends planNode, then please call this out in the comment.

But I don't really get it - why extend planNode, if Values() is invalid? Also comments like "foo is contractually re-defined here to..." don't really make sense; that's not how interfaces work :). The whole point is that no implementer or derived interface can change any contract (or at least not in ways that can break unsuspecting clients of the interface).

I think we either extend planNode, in which case Values() needs to work and the implementers need to buffer rows, or we don't. You can extract all the planNode methods except Next() and Values() into a planNodeBase that both planNode and batchedPlanNode extend.

Yes I fully agree with you on both points.
I have tried to implement this but it makes the patch much more invasive and I dislike this amount of changes so late in the 2.0 release. Consider:

  • if we keep the base interface called "planNode" used in logical planning, and introduce a new "planNodeRun" that provides Next/Values, I'd need to introduce downcasts from planNode to planNodeRun in all the Next() methods for every planNode implementer.
  • if we introduce a new interface "planNodeBase" like you suggest, the local execution code stays unchanged but I need to rewrite all the logical planning code to use that instead.

In both cases the change is rather invasive. I will do it on master in a follow-up PR but for this current PR instead I am leaving a TODO and a reference to the new tracking issue #23522.

Also I agree that it is confusing that batchedPlanNode does implement Next() but with a different contract. I have decided to simply make the base method unavailable and instead define a new BatchedNext().
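Roughly, the resulting contract looks like this (a simplified sketch, not the literal code in the PR):

```go
// Sketch of the batch-oriented contract; simplified, not the literal code.
package sketch

// Stand-in for the real runParams type.
type runParams struct{}

// batchedPlanNode is the batch-oriented counterpart of planNode: callers
// do not use the row-at-a-time Next()/Values() on it; they drive it
// through BatchedNext()/BatchedCount() instead.
type batchedPlanNode interface {
	// BatchedNext processes the next batch of rows, returning false once
	// no batch remains. By the time it returns, the KV writes for the
	// batch have been issued and FK/uniqueness/CHECK constraints verified,
	// so no constraint-violating result becomes observable.
	BatchedNext(params runParams) (bool, error)

	// BatchedCount returns the number of rows processed by the last
	// BatchedNext() call.
	BatchedCount() int
}
```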


pkg/sql/plan_batch.go, line 29 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I think "advance a full batch" here and "the number of rows processed in the last call to Next()" in the comment on BatchedCount don't mean much.

Yes, agreed. Rephrased.


pkg/sql/plan_batch.go, line 33 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

"uncommitted" is the wrong term to use.

Generally, I don't like this comment very much. It tries to be very general, but I think that a) it will not make much sense to a reader and b) it's too general. It's unclear what "inconsistent" means, it's unclear what "observable" means...
I think we should put more narrow guidance for how writes behave in the face of constraints. Like:

NOTE: Nodes that perform writes and also return rows (e.g. INSERT... RETURNING...) do not return rows before checking foreign key, uniqueness and other CHECK constraints. The implementations execute the KV requests for writing the respective rows (which implicitly checks for some of the constraints) and otherwise perform all other checks before returning a row.

Thanks for the suggestion. Adopted. I extended it a bit because the reasoning goes further and also includes row counts, not just result rows.


pkg/sql/plan_batch.go, line 51 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: I think batchNodeAdapter would be a better name.

  1. We have two such adapters (see the code below) - would be nice to distinguish them.

  2. I prefer names that (also) describe what the thing does. "xxxAdapter" doesn't say what xxx is adapted to nor how.


pkg/sql/returning.go, line 27 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: this comment is confusing because it reads as if it was already determined that a RETURNING clause is specified. Say something like "a statement that might have a returning clause"...

Done


pkg/sql/returning.go, line 41 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: my preference is to only use bool enums for input arguments. As far as I see, this rowsNeeded type is only returned or stored, never passed around, so I think it does more harm than good; a bool would be clearer and would lead to shorter ifs.

Thanks for the advice. I wasn't sure. Done.


pkg/sql/returning.go, line 56 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

"committed" is the wrong word

Done.



@andreimatei
Contributor

:lgtm:

Please run the delete benchmark just in case


Review status: 0 of 19 files reviewed at latest revision, 6 unresolved discussions, some commit checks pending.


pkg/sql/delete.go, line 38 at r2 (raw file):

type deleteNode struct {
	source  planNode
	columns sqlbase.ResultColumns

please put a comment on this columns field and say something about RETURNING, since otherwise this field is pretty surprising.


pkg/sql/plan_batch.go, line 23 at r1 (raw file):

Previously, knz (kena) wrote…

Yes I fully agree with you on both points.
I have tried to implement this but it makes the patch much more invasive and I dislike this amount of changes so late in the 2.0 release. Consider:

  • if we keep the base interface called "planNode" used in logical planning, and introduce a new "planNodeRun" that provides Next/Values, I'd need to introduce downcasts from planNode to planNodeRun in all the Next() methods for every planNode implementer.
  • if we introduce a new interface "planNodeBase" like you suggest, the local execution code stays unchanged but I need to rewrite all the logical planning code to use that instead.

In both cases the change is rather invasive. I will do it on master in a follow-up PR but for this current PR instead I am leaving a TODO and a reference to the new tracking issue #23522.

Also I agree that it is confusing that batchedPlanNode does implement Next() but with a different contract. I have decided to simply make the base method unavailable and instead define a new BatchedNext().

cool


pkg/sql/plan_batch.go, line 51 at r1 (raw file):

Previously, knz (kena) wrote…
  1. We have two such adapters (see the code below) - would be nice to distinguish them.

  2. I prefer names that (also) describe what the thing does. "xxxAdapter" doesn't say what xxx is adapted to nor how.

ok


pkg/sql/tablewriter.go, line 737 at r2 (raw file):

}

func (td *tableDeleter) rotateBatch(ctx context.Context) error {

runBatch() ?



knz added 8 commits April 21, 2018 12:21
This patch applies the new pattern introduced by DELETE in the
previous patch and introduces proper batching behavior for INSERT.

Perf difference:
```
name                                             old time/op    new time/op    delta
SQL/Cockroach/Insert/count=1-16                     286µs ± 4%     290µs ± 3%   ~     (p=0.548 n=5+5)
SQL/Cockroach/Insert/count=10-16                    627µs ±22%     628µs ±24%   ~     (p=1.000 n=5+5)
SQL/Cockroach/Insert/count=100-16                  5.44ms ± 9%    5.75ms ±10%   ~     (p=0.310 n=5+5)
SQL/Cockroach/Insert/count=1000-16                 60.7ms ± 3%    59.8ms ± 8%   ~     (p=0.730 n=4+5)
SQL/Cockroach/InsertDistinct/count=1-16            3.44ms ±11%    3.62ms ± 3%   ~     (p=0.151 n=5+5)
SQL/Cockroach/InsertDistinct/count=10-16           1.25ms ±69%    1.06ms ±42%   ~     (p=0.548 n=5+5)
SQL/Cockroach/InsertDistinct/count=100-16          1.53ms ±15%    1.49ms ± 2%   ~     (p=0.841 n=5+5)
SQL/Cockroach/InsertFK/count=1/nFks=1-16            492µs ±11%     499µs ±12%   ~     (p=0.841 n=5+5)
SQL/Cockroach/InsertFK/count=1/nFks=5-16            689µs ± 5%     680µs ± 3%   ~     (p=0.548 n=5+5)
SQL/Cockroach/InsertFK/count=1/nFks=10-16          1.06ms ± 2%    1.08ms ± 6%   ~     (p=0.548 n=5+5)
SQL/Cockroach/InsertFK/count=10/nFks=1-16          1.71ms ±11%    1.65ms ± 4%   ~     (p=0.690 n=5+5)
SQL/Cockroach/InsertFK/count=10/nFks=5-16          3.84ms ± 2%    3.83ms ± 3%   ~     (p=1.000 n=4+5)
SQL/Cockroach/InsertFK/count=10/nFks=10-16         5.71ms ±16%    5.85ms ±13%   ~     (p=0.841 n=5+5)
SQL/Cockroach/InsertFK/count=100/nFks=1-16         11.4ms ±10%    11.0ms ±11%   ~     (p=0.548 n=5+5)
SQL/Cockroach/InsertFK/count=100/nFks=5-16         34.2ms ±12%    33.6ms ±14%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertFK/count=100/nFks=10-16        51.7ms ± 7%    54.6ms ±25%   ~     (p=0.690 n=5+5)
SQL/Cockroach/InsertFK/count=1000/nFks=1-16         121ms ± 8%     120ms ± 6%   ~     (p=0.841 n=5+5)
SQL/Cockroach/InsertFK/count=1000/nFks=5-16         357ms ±10%     406ms ±38%   ~     (p=0.310 n=5+5)
SQL/Cockroach/InsertFK/count=1000/nFks=10-16        603ms ± 9%     610ms ±21%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertSecondaryIndex/count=1-16       571µs ± 1%     605µs ± 8%   ~     (p=0.730 n=4+5)
SQL/Cockroach/InsertSecondaryIndex/count=10-16     3.63ms ± 8%    3.42ms ±10%   ~     (p=0.413 n=4+5)
SQL/Cockroach/InsertSecondaryIndex/count=100-16    31.5ms ±17%    34.3ms ± 2%   ~     (p=0.800 n=3+2)

name                                             old alloc/op   new alloc/op   delta
SQL/Cockroach/Insert/count=1-16                    39.8kB ± 1%    39.9kB ± 0%   ~     (p=0.548 n=5+5)
SQL/Cockroach/Insert/count=10-16                   88.7kB ± 0%    88.7kB ± 2%   ~     (p=0.690 n=5+5)
SQL/Cockroach/Insert/count=100-16                   567kB ± 2%     574kB ± 3%   ~     (p=0.222 n=5+5)
SQL/Cockroach/Insert/count=1000-16                 5.33MB ± 3%    5.26MB ± 2%   ~     (p=0.690 n=5+5)
SQL/Cockroach/InsertDistinct/count=1-16             356kB ± 2%     358kB ± 5%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertDistinct/count=10-16            366kB ± 5%     363kB ± 1%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertDistinct/count=100-16           439kB ± 5%     432kB ± 2%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertFK/count=1/nFks=1-16           57.2kB ± 3%    57.8kB ± 4%   ~     (p=0.690 n=5+5)
SQL/Cockroach/InsertFK/count=1/nFks=5-16            126kB ± 2%     125kB ± 4%   ~     (p=0.690 n=5+5)
SQL/Cockroach/InsertFK/count=1/nFks=10-16           230kB ± 9%     232kB ±11%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertFK/count=10/nFks=1-16           327kB ±25%     300kB ±44%   ~     (p=0.548 n=5+5)
SQL/Cockroach/InsertFK/count=10/nFks=5-16          1.03MB ±45%    0.97MB ±19%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertFK/count=10/nFks=10-16         1.92MB ±18%    2.02MB ±31%   ~     (p=0.841 n=5+5)
SQL/Cockroach/InsertFK/count=100/nFks=1-16         3.19MB ±37%    2.99MB ±44%   ~     (p=0.548 n=5+5)
SQL/Cockroach/InsertFK/count=100/nFks=5-16         14.0MB ±20%    11.8MB ±23%   ~     (p=0.151 n=5+5)
SQL/Cockroach/InsertFK/count=100/nFks=10-16        24.9MB ±17%    26.9MB ±27%   ~     (p=0.421 n=5+5)
SQL/Cockroach/InsertFK/count=1000/nFks=1-16        55.1MB ±12%    53.3MB ±32%   ~     (p=0.841 n=5+5)
SQL/Cockroach/InsertFK/count=1000/nFks=5-16         189MB ± 7%     190MB ± 6%   ~     (p=0.841 n=5+5)
SQL/Cockroach/InsertFK/count=1000/nFks=10-16        330MB ± 5%     321MB ± 7%   ~     (p=0.222 n=5+5)
SQL/Cockroach/InsertSecondaryIndex/count=1-16       279kB ±16%     290kB ±12%   ~     (p=0.841 n=5+5)
SQL/Cockroach/InsertSecondaryIndex/count=10-16     1.69MB ± 6%    1.64MB ± 7%   ~     (p=0.548 n=5+5)
SQL/Cockroach/InsertSecondaryIndex/count=100-16    15.9MB ±10%    16.7MB ± 1%   ~     (p=0.800 n=3+2)

name                                             old allocs/op  new allocs/op  delta
SQL/Cockroach/Insert/count=1-16                       311 ± 2%       313 ± 1%   ~     (p=0.143 n=5+5)
SQL/Cockroach/Insert/count=10-16                      478 ± 7%       474 ± 5%   ~     (p=0.730 n=5+5)
SQL/Cockroach/Insert/count=100-16                   2.04k ±10%     2.40k ±29%   ~     (p=0.421 n=5+5)
SQL/Cockroach/Insert/count=1000-16                  22.0k ±45%     18.0k ±22%   ~     (p=0.841 n=5+5)
SQL/Cockroach/InsertDistinct/count=1-16             2.09k ±23%     2.03k ±17%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertDistinct/count=10-16            2.07k ±23%     2.00k ± 2%   ~     (p=0.421 n=5+5)
SQL/Cockroach/InsertDistinct/count=100-16           2.58k ± 8%     2.54k ± 3%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertFK/count=1/nFks=1-16              479 ± 6%       496 ± 3%   ~     (p=0.310 n=5+5)
SQL/Cockroach/InsertFK/count=1/nFks=5-16              935 ± 3%       960 ± 6%   ~     (p=0.222 n=5+5)
SQL/Cockroach/InsertFK/count=1/nFks=10-16           1.68k ±18%     1.70k ±18%   ~     (p=0.841 n=5+5)
SQL/Cockroach/InsertFK/count=10/nFks=1-16           2.83k ±39%     2.67k ±73%   ~     (p=0.690 n=5+5)
SQL/Cockroach/InsertFK/count=10/nFks=5-16           8.19k ±72%     7.36k ±29%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertFK/count=10/nFks=10-16          15.2k ±26%     16.2k ±43%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertFK/count=100/nFks=1-16          28.7k ±47%     25.9k ±57%   ~     (p=0.548 n=5+5)
SQL/Cockroach/InsertFK/count=100/nFks=5-16           126k ±25%      100k ±33%   ~     (p=0.151 n=5+5)
SQL/Cockroach/InsertFK/count=100/nFks=10-16          215k ±23%      240k ±35%   ~     (p=0.690 n=5+5)
SQL/Cockroach/InsertFK/count=1000/nFks=1-16          554k ±14%      528k ±38%   ~     (p=1.000 n=5+5)
SQL/Cockroach/InsertFK/count=1000/nFks=5-16         1.75M ± 9%     1.77M ± 7%   ~     (p=0.690 n=5+5)
SQL/Cockroach/InsertFK/count=1000/nFks=10-16        2.85M ± 8%     2.72M ±10%   ~     (p=0.222 n=5+5)
SQL/Cockroach/InsertSecondaryIndex/count=1-16       2.49k ±19%     2.63k ±15%   ~     (p=0.841 n=5+5)
SQL/Cockroach/InsertSecondaryIndex/count=10-16      14.4k ± 9%     13.9k ± 9%   ~     (p=0.421 n=5+5)
SQL/Cockroach/InsertSecondaryIndex/count=100-16      121k ±13%      129k ± 1%   ~     (p=0.800 n=3+2)
```

Release note (bug fix): Removed a limitation where `INSERT` would fail
if the number of inserted rows was too large.

Release note (bug fix): Fixed a bug where `SELECT * FROM [INSERT
... RETURNING ...] LIMIT 1` or `WITH d AS (INSERT ... RETURNING ...)
SELECT * FROM d LIMIT 1` would fail to properly update some rows.

This patch applies the new pattern introduced by DELETE in a previous
patch and introduces proper batching behavior for UPSERT.

Perf difference:
```
name                                old time/op    new time/op    delta
SQL/Cockroach/Upsert/count=1-16        830µs ±16%     805µs ± 3%   ~     (p=1.000 n=5+5)
SQL/Cockroach/Upsert/count=10-16      1.11ms ±12%    1.10ms ± 7%   ~     (p=1.000 n=5+5)
SQL/Cockroach/Upsert/count=100-16     4.03ms ±11%    3.95ms ±14%   ~     (p=0.690 n=5+5)
SQL/Cockroach/Upsert/count=1000-16    27.8ms ±15%    25.4ms ± 2%   ~     (p=0.286 n=5+4)

name                                old alloc/op   new alloc/op   delta
SQL/Cockroach/Upsert/count=1-16        112kB ± 1%     112kB ± 1%   ~     (p=0.222 n=5+5)
SQL/Cockroach/Upsert/count=10-16       154kB ± 1%     154kB ± 1%   ~     (p=0.690 n=5+5)
SQL/Cockroach/Upsert/count=100-16      710kB ± 1%     708kB ± 1%   ~     (p=1.000 n=5+5)
SQL/Cockroach/Upsert/count=1000-16    6.22MB ± 2%    6.23MB ± 2%   ~     (p=1.000 n=5+5)

name                                old allocs/op  new allocs/op  delta
SQL/Cockroach/Upsert/count=1-16          988 ± 1%       994 ± 1%   ~     (p=0.175 n=5+5)
SQL/Cockroach/Upsert/count=10-16       1.34k ± 2%     1.35k ± 2%   ~     (p=0.317 n=5+5)
SQL/Cockroach/Upsert/count=100-16      5.27k ±16%     5.09k ± 9%   ~     (p=0.841 n=5+5)
SQL/Cockroach/Upsert/count=1000-16     44.4k ±14%     43.8k ±15%   ~     (p=1.000 n=5+5)
```

(negligible)

This also enables deleting the `returningHelper` and the `editNode`
infrastructure, which are no longer needed.

Release note (bug fix): Removed a limitation where `UPSERT` would fail
if the number of modified rows was too large.

Release note (bug fix): `UPSERT` now properly reports the count of
modified rows when `RETURNING` is not specified.

This patch introduces the logic test syntax `statement count N` that
is like `statement ok` but also checks that the final RowsAffected
count is equal to N.

It also introduces the relevant DELETE/UPSERT/INSERT/UPDATE tests,
which exercise this code path.
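For example, a `DELETE` expected to remove exactly two rows can be written under a `statement count 2` directive; the test then fails if the reported row count differs.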

Release note: None

Prior to this patch, the spool stage, if needed, would be added just
before the mutation stage that requires it. Then a separate condition
would remove the spool stage if it was at the top level of a logical
plan.

With the new handling of RETURNING, however, this would cause the
unfortunate situation where a top-level mutation _statement_ with
RETURNING would get a spool although it is not needed (reminder:
top-level mutations do not need a spool; a spool is only required if
there is further processing by a surrounding query). This is because
RETURNING was recently modified to be handled by a regular render
(projection) stage placed after the mutation stage, instead of being
executed inside the mutation stage itself. So a plan would look like
returning - spool - mutation, whereas we'd really like to see
returning - mutation directly.

This patch solves the problem by introducing three plan transforms that
"pull up" a spool:

- render - spool is replaced by spool - render
- filter - spool is replaced by spool - filter
- distinct - spool is replaced by spool - distinct

The first one addresses the immediate non-optimal situation with
RETURNING that required this patch. The other two are generally safe,
generally desirable transforms that reduce the number of rows stored
inside the spool.
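As a rough illustration of the rewrite (node and field names are made up
for the example, not the actual code in this patch):

```go
package sketch

// planNode stands in for the real plan node interface.
type planNode interface{}

// Illustrative node types; only the source edge matters here.
type spoolNode struct{ source planNode }
type renderNode struct{ source planNode }
type filterNode struct{ source planNode }
type distinctNode struct{ source planNode }

// pullUpSpool moves a spool above a render, filter or distinct node that
// sits directly on top of it. The spool then buffers the (typically
// smaller) post-projection/post-filter row set and, when it ends up at
// the top level of a RETURNING plan, can be elided entirely.
func pullUpSpool(n planNode) planNode {
	switch t := n.(type) {
	case *renderNode:
		t.source = pullUpSpool(t.source)
		if s, ok := t.source.(*spoolNode); ok {
			t.source, s.source = s.source, t // swap: spool moves on top
			return s
		}
	case *filterNode:
		t.source = pullUpSpool(t.source)
		if s, ok := t.source.(*spoolNode); ok {
			t.source, s.source = s.source, t
			return s
		}
	case *distinctNode:
		t.source = pullUpSpool(t.source)
		if s, ok := t.source.(*spoolNode); ok {
			t.source, s.source = s.source, t
			return s
		}
	}
	return n
}
```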

Release note (sql change): A minor optimization is introduced that
reduces the amount of RAM used when a query performs further
computations on the result of a mutation statement
(INSERT/DELETE/UPDATE/UPSERT) combined with RETURNING.
@knz
Contributor Author

knz commented Apr 21, 2018

There we go, fixed it.

@knz
Contributor Author

knz commented Apr 21, 2018

Also now with a regression test for the original issue (#22304) that motivated all this work.

As detected in issue cockroachdb#22304, a mutation with RETURNING could
let values that violate constraints go through to the client. This
was fixed by a combination of previous patches; this commit adds the
corresponding regression test.

Release note: None
@knz
Contributor Author

knz commented Apr 21, 2018

bors r+

craig bot pushed a commit that referenced this pull request Apr 21, 2018
23373: sql: fix the batching behavior of mutation statements, especially with RETURNING r=knz a=knz

Fixes #22304.
Fixes #23156.
Fixes #24928.

To review: 
- @RaduBerinde for the physical property propagation. Check the order_by test, confirm that this enables optimizations of DELETE in a WITH clause or [ ... ]
- @jordanlewis for overall design of the new `batchedPlanNode` interface.
- @andreimatei @jordanlewis for correctness of the batching behavior (like we discussed - I think this does everything you want)


Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
@craig
Contributor

craig bot commented Apr 21, 2018

Build succeeded

@craig craig bot merged commit 6dced3e into cockroachdb:master Apr 21, 2018
@andreimatei
Contributor

@knz I'm ecstatic to see this merged! Would you mind updating this doc? https://www.cockroachlabs.com/docs/dev/known-limitations.html#write-and-update-limits-for-a-single-statement

@knz
Contributor Author

knz commented Apr 23, 2018

done, thanks!
