Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: avoid immediately launching goroutine for txn heartbeat loop #35009

Closed
nvanbenschoten opened this issue Feb 15, 2019 · 7 comments
Closed
Assignees
Labels
A-kv-client Relating to the KV client and the KV interface. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team

Comments

@nvanbenschoten
Copy link
Member

nvanbenschoten commented Feb 15, 2019

We currently launch a goroutine immediately for any transaction that doesn't send an EndTransaction in the same batch as its first write. This is a little crazy because very few transactions will ever actually run long enough to send even their first heartbeat after 1s. As an non-invasive way to improve this, we could introduce a new Stopper.RunDelayedAsyncTask method which waits for some specified period of time (in a single goroutine) before launching the async task. While waiting, the goroutine coordinating this could be monitoring context cancellation to remove queued async tasks before they fire off their own goroutine.

Using this approach, we could trigger a delayed async task in the heartbeater when we see the first write and delay the goroutine for an entire heartbeat interval, only paying the cost of the goroutine for transactions that will send at least one heartbeat.

An alternative to this is to create a single goroutine that handles transmitting all transaction heartbeats, but as has been discussed before, this will require protocol changes and migrations. If we're planning on taking something that large on, we should first consider alternative means of tracking transaction liveness, as we discuss here:

// NOTE: there are other mechanisms by which concurrent actors could determine
// the liveness of transactions. One proposal is to have concurrent actors
// communicate directly with transaction coordinators themselves. This would
// avoid the need for transaction heartbeats and the PENDING transaction state
// entirely. Another proposal is to detect abandoned transactions and failed
// coordinators at an entirely different level - by maintaining a node health
// plane. This would function under the idea that if the node a transaction's
// coordinator is running on is alive then that transaction is still in-progress
// unless it specifies otherwise. These are both approaches we could consider in
// the future.

Interestingly, there's already another case that would benefit from this in the txnPipeliner. We've always wanted to add a background job that periodically "proves" in-flight writes. Using a Stopper.RunDelayedAsyncTask method would allow us to add in this job without seeing a performance hit for short-running transactions. This would help us stay beneath the max inflight-write bytes limit we'll need to add for #32522.

cc. @ajwerner. I expect that the internal mechanism here will look very similar to what you added in requestbatcher.

Jira issue: CRDB-4614

@nvanbenschoten nvanbenschoten added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-client Relating to the KV client and the KV interface. labels Feb 15, 2019
@ajwerner
Copy link
Contributor

Does the txnCoordSender have a defined lifecycle? The reason I ask is about when cleanup happens for the DelayedAsyncTasks. It is true that we could do it lazily based on context cancellation but it might be nice to have a mechanism to remove tasks from the delay queue eagerly.

@ajwerner
Copy link
Contributor

Ignore the above comment, I found what I was looking for

@nvanbenschoten
Copy link
Member Author

nvanbenschoten commented Feb 15, 2019

Yeah, I was thinking we would remove txnEnd and instead replace it with a context cancellation function tied to the existing hbCtx.

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Feb 15, 2019
Fixes cockroachdb#32522.

This change creates a new cluster setting called:
```
kv.transaction.write_pipelining_max_outstanding_size
```

It limits the size in bytes that can be dedicated to tracking in-flight
writes that have been pipelined. Once this limit is hit, not more writes
will be pipelined by a transaction until some of the writes are proven
and removed from the outstanding write set.

This change once again illustrates the need for periodic proving of
outstanding writes. We touch upon that in the type definition's comment
and in cockroachdb#35009.

Release note: None
ajwerner added a commit to ajwerner/cockroach that referenced this issue Feb 19, 2019
Before this PR every transaction would start a goroutine for its heartbeat loop
immediately upon creation. Most transactions never need to heartbeat so this
goroutine is wasteful. This change delays the start of the heartbeat loop until
after the first heartbeatInterval has passed by using the new DelayedStopper.

Fixes cockroachdb#35009

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Feb 26, 2019
Fixes cockroachdb#32522.

This change creates a new cluster setting called:
```
kv.transaction.write_pipelining_max_outstanding_size
```

It limits the size in bytes that can be dedicated to tracking in-flight
writes that have been pipelined. Once this limit is hit, not more writes
will be pipelined by a transaction until some of the writes are proven
and removed from the outstanding write set.

This change once again illustrates the need for periodic proving of
outstanding writes. We touch upon that in the type definition's comment
and in cockroachdb#35009.

Release note: None
craig bot pushed a commit that referenced this issue Feb 26, 2019
35014: kv: introduce new "max outstanding size" setting to txnPipeliner r=nvanbenschoten a=nvanbenschoten

Fixes #32522.

This change creates a new cluster setting called:
```
kv.transaction.write_pipelining_max_outstanding_size
```

It limits the size in bytes that can be dedicated to tracking in-flight
writes that have been pipelined. Once this limit is hit, not more writes
will be pipelined by a transaction until some of the writes are proven
and removed from the outstanding write set.

This change once again illustrates the need for periodic proving of
outstanding writes. We touch upon that in the type definition's comment
and in #35009.

Release note: None

35199: log: fix remaining misuse of runtime.Callers/runtime.FuncForPC r=nvanbenschoten a=nvanbenschoten

Fixes #17770.

This commit fixes the last user of `runtime.Callers` that misused
the stdlib function by translating the PC values it returned directly
into symbolic information (see https://golang.org/pkg/runtime/#Callers) [1].
Go's documentation warns that this is a recipe for disaster when mixed
with mid-stack inlining.

The other concern in #17770 was this comment: #17770 (comment).
This was discussed in golang/go#29582 and addressed in golang/go@956879d.

An alternative would be to use `runtime.Caller` here, but that would
force an allocation that was hard-earned in #29017. Instead, this commit
avoids any performance hit.

```
name                 old time/op    new time/op    delta
Header-4                267ns ± 1%     268ns ± 0%    ~     (p=0.584 n=10+10)
VDepthWithVModule-4     260ns ± 3%     255ns ± 1%  -1.87%  (p=0.018 n=10+9)

name                 old alloc/op   new alloc/op   delta
Header-4                0.00B          0.00B         ~     (all equal)
VDepthWithVModule-4     0.00B          0.00B         ~     (all equal)

name                 old allocs/op  new allocs/op  delta
Header-4                 0.00           0.00         ~     (all equal)
VDepthWithVModule-4      0.00           0.00         ~     (all equal)
```

[1] I went through and verified that this was still correct.

Release note: None

35203: closedts: react faster to config changes r=danhhz a=tbg

This approximately halves the duration of

`make test PKG=./pkg/ccl/changefeedccl TESTS=/sinkless`,

from ~60s to ~30s.

Touches #34455.

Release note: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
@github-actions
Copy link

github-actions bot commented Jun 5, 2021

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
5 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

@knz
Copy link
Contributor

knz commented Jun 5, 2021

still relevant

@jlinder jlinder added the T-kv KV Team label Jun 16, 2021
@ajwerner
Copy link
Contributor

I'm taking this off my list.

@ajwerner ajwerner removed their assignment Mar 23, 2022
@nvanbenschoten
Copy link
Member Author

This was addressed by #67215.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-client Relating to the KV client and the KV interface. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants