-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stats: deep-copy datums during sampling to allow GC of kv batch #42372
Conversation
LGTM I don't know that we confirmed this is the underlying cause for #42331? (although it's definitely the only hypothesis right now). Maybe we should see if they are interested in trying to turn them back on with a custom build. Do you think this is also the reason for the continued OOMs in #33660? (or it's just the batch size itself) |
Added an explicit check for the encoding so we only copy for key-encoded datums. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @justinj, @RaduBerinde, and @rytaft)
pkg/sql/stats/row_sampling.go, line 246 at r1 (raw file):
// deepCopyDatum performs a deep copy for datums such as DString to remove any // references to the kv batch and allow the batch to be garbage collected. func deepCopyDatum(evalCtx *tree.EvalContext, d tree.Datum) tree.Datum {
[nit] Add a note here that it's only called for key encodings, in case someone adds another type in the future (which might be need deep-copy with value encoding)
Previously, the SampleReservoir would call EnsureDecoded on each EncDatum from an input row, and then copy only the Datum to a new EncDatumRow to remove the reference to the encoded data from the kv batch. It turns out this was not enough to fully remove all references to the kv batch, however. For certain data types (sepecifically key-encoded strings and collated strings), it was possible that the decoded Datum referenced the original encoded kv batch, thus preventing the garbage collector from deleting it. This commit fixes the problem by performing a deep copy for certain key-encoded datums such as DString to remove any references to the kv batch and allow the batch to be garbage collected. Informs cockroachdb#42331 Informs cockroachdb#33660 Release note (bug fix): Fixed an out-of-memory error that could occur when collecting statistics on tables with a string index column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @justinj)
pkg/sql/stats/row_sampling.go, line 246 at r1 (raw file):
Previously, RaduBerinde wrote…
[nit] Add a note here that it's only called for key encodings, in case someone adds another type in the future (which might be need deep-copy with value encoding)
Done.
42372: stats: deep-copy datums during sampling to allow GC of kv batch r=rytaft a=rytaft Previously, the `SampleReservoir` would call `EnsureDecoded` on each `EncDatum` from an input row, and then copy only the `Datum` to a new `EncDatumRow` to remove the reference to the encoded data from the kv batch. It turns out this was not enough to fully remove all references to the kv batch, however. For certain data types (sepecifically key-encoded strings and collated strings), it was possible that the decoded `Datum` referenced the original encoded kv batch, thus preventing the garbage collector from deleting it. This commit fixes the problem by performing a deep copy for certain key-encoded datums such as `DString` to remove any references to the kv batch and allow the batch to be garbage collected. Informs #42331 Informs #33660 Release note (bug fix): Fixed an out-of-memory error that could occur when collecting statistics on tables with a string index column. Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
Build succeeded |
Previously, the
SampleReservoir
would callEnsureDecoded
on eachEncDatum
froman input row, and then copy only the
Datum
to a newEncDatumRow
to remove thereference to the encoded data from the kv batch. It turns out this was not
enough to fully remove all references to the kv batch, however. For certain
data types (sepecifically key-encoded strings and collated strings), it was
possible that the decoded
Datum
referenced the original encoded kv batch, thuspreventing the garbage collector from deleting it.
This commit fixes the problem by performing a deep copy for certain key-encoded
datums such as
DString
to remove any references to the kv batch and allow thebatch to be garbage collected.
Informs #42331
Informs #33660
Release note (bug fix): Fixed an out-of-memory error that could occur when
collecting statistics on tables with a string index column.