stats,distsqlrun: reduce memory usage of CREATE STATISTICS #32614

Merged
merged 1 commit into cockroachdb:master from rytaft:stats-fix on Nov 27, 2018

Conversation

@rytaft
Contributor

rytaft commented Nov 26, 2018

Prior to this patch, running CREATE STATISTICS on a large table
could cause the memory utilization of the database to grow so large
that it crashed the server. This was due to the way garbage collection
works in Go. As described in [this blog post](https://blog.golang.org/go-slices-usage-and-internals):

> Re-slicing a slice doesn't make a copy of the underlying array. The
> full array will be kept in memory until it is no longer referenced.
> Occasionally this can cause the program to hold all the data in memory
> when only a small piece of it is needed.
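
For illustration, here is a minimal standalone Go program (sizes are arbitrary, not taken from the actual code) that reproduces the retention behavior the post describes:

```go
package main

import "fmt"

// sampleFromBatch simulates sampling one encoded datum out of a scanned
// batch: the returned 1-byte sub-slice shares the batch's backing array,
// so the entire 10,000-byte allocation stays reachable for as long as
// the sample is held.
func sampleFromBatch() []byte {
	batch := make([]byte, 10000)
	return batch[:1]
}

func main() {
	var samples [][]byte
	for i := 0; i < 1000; i++ {
		// Roughly 10 MB of backing arrays are still live here, even
		// though the samples themselves total only ~1 KB.
		samples = append(samples, sampleFromBatch())
	}
	fmt.Println("held samples:", len(samples))
}
```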

This is exactly what was happening with `CREATE STATISTICS`. The
`CREATE STATISTICS` command performs a full table scan and randomly
samples rows in order to build a histogram for a column. When rows are
scanned in the KV layer, they are scanned in batches of ~10,000, creating
arrays of encoded datums of length 10,000. Since the `EncDatum` object
passed to the sampler contains a slice referencing that underlying array,
the entire scanned batch would remain in memory as long as a single datum
from that slice was included in the sample. With random sampling, most
batches are likely to have at least one sampled row, meaning almost the
entire table stays in memory until the histogram is built.

The fix is to copy the `EncDatum` to the sampler with only the decoded
`Datum`, setting the encoded `[]byte` slice to nil.
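
A minimal sketch of that idea; the type and field names below are stand-ins, not the actual `sqlbase` definitions:

```go
// encDatum is a stand-in for the real EncDatum: an encoded form that
// aliases the scanned batch, plus an optionally-populated decoded value.
type encDatum struct {
	encoded []byte      // slice into the scanned batch's backing array
	decoded interface{} // decoded value (a tree.Datum in CockroachDB)
}

// copyForSample returns a value that is safe to retain in the sample
// reservoir: only the decoded datum is kept, so the reference into the
// scanned batch is dropped and the whole batch can be collected.
func copyForSample(src encDatum) encDatum {
	return encDatum{decoded: src.decoded} // encoded is left nil
}
```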

Release note (bug fix): Fixed an issue where calling CREATE STATISTICS
on a large table could cause the server to crash due to running out of
memory.

@rytaft rytaft requested review from RaduBerinde and andy-kimball Nov 26, 2018

@rytaft rytaft requested review from cockroachdb/distsql-prs as code owners Nov 26, 2018

@cockroach-teamcity

Member

cockroach-teamcity commented Nov 26, 2018

This change is Reviewable

@rytaft rytaft force-pushed the rytaft:stats-fix branch from eb10a21 to a49214e Nov 26, 2018

@RaduBerinde
Member

RaduBerinde left a comment

Very nice find! :lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/sql/stats/row_sampling.go, line 46 at r1 (raw file):

```go
type SampleReservoir struct {
	samples  []SampledRow
	outTypes []sqlbase.ColumnType
```

[nit] colTypes (there isn't anything specific to "output" here)

@rytaft rytaft force-pushed the rytaft:stats-fix branch from a49214e to 137ca29 Nov 26, 2018

@rytaft
Contributor

rytaft left a comment

TFTR!

I'm currently testing that `CREATE STATISTICS` now works with TPC-C at 400 warehouses on a cluster of 3 nodes with 4 cores each. I'll post an update when that test completes.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/stats/row_sampling.go, line 46 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] colTypes (there isn't anything specific to "output" here)

Done.

@RaduBerinde
Member

RaduBerinde left a comment

It would be good to backport this to 2.1.x, in case we need stats in a user deployment.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

@rytaft

Contributor

rytaft commented Nov 26, 2018

Will do -- and I can do 2.0.x as well just in case

@rytaft rytaft force-pushed the rytaft:stats-fix branch 2 times, most recently from 74d1229 to f79ce60 Nov 26, 2018

@rytaft

Contributor

rytaft commented Nov 26, 2018

Confirmed that memory utilization barely changes when CREATE STATISTICS is called on the TPC-C tables with 400 warehouses. This previously caused the server to crash.

Also added error handling to fix a lint error.

stats,distsqlrun: reduce memory usage of CREATE STATISTICS

@rytaft rytaft force-pushed the rytaft:stats-fix branch from f79ce60 to c255cb0 Nov 26, 2018

@jordanlewis

Member

jordanlewis commented Nov 26, 2018

This is a nice find indeed - and I bet it has implications for processors besides just `CREATE STATISTICS`, don't you think? For example, imagine a query that filters a scan and then sorts the result of the filter, like `SELECT * FROM t WHERE attr1 > 10 ORDER BY attr2`. The filter will remove `EncDatumRow`s from the flow, so the sort processor will have to look at fewer rows. But since the sort processor has to spool all rows to do its job, it has the potential to severely undercount the amount of memory it's using. If the filter filters out one out of every 10k rows, then we could be undercounting by up to 10,000x, right?

Yikes!

Not to beat a dead horse, but I think this is another argument for why we should be removing the lazy-decode thing that EncDatum is supposed to get us. It might save a little bit of up-front time, but it seems to have lots of gotchas.

@jordanlewis

Member

jordanlewis commented Nov 26, 2018

I meant, if the filter retains one out of every 10k, not filters out.
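
To put rough numbers on that scenario, a back-of-the-envelope sketch in Go (the row size and batch shape here are illustrative, not measured):

```go
package main

import "fmt"

func main() {
	const (
		batchRows    = 10000 // rows per scanned batch
		retainedRows = 1     // rows per batch that survive the filter
		rowBytes     = 100   // illustrative average row size
	)
	// What the sorter's memory accounting sees: just the retained rows.
	accounted := retainedRows * rowBytes
	// What the Go heap actually holds: each retained row pins its batch.
	pinned := batchRows * rowBytes
	fmt.Printf("accounted %d B, pinned %d B, undercount %dx\n",
		accounted, pinned, pinned/accounted)
}
```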

@rytaft
Contributor

rytaft left a comment

Thanks, @jordanlewis! I think you're right that this could have implications for other processors, and in pathological cases you could be undercounting memory usage by 10,000x -- it's probably worth doing a quick audit of other processors to see which ones could encounter this issue.

I don't think I have the full backstory about the EncDatum debate, but I can see arguments for both sides. This memory issue did take a long time to track down, but now that we know it exists, maybe we can avoid some of the biggest gotchas?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

@rytaft

Contributor

rytaft commented Nov 27, 2018

bors r+

craig bot pushed a commit that referenced this pull request Nov 27, 2018

Merge #32614
32614: stats,distsqlrun: reduce memory usage of CREATE STATISTICS r=rytaft a=rytaft


Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>
@jordanlewis

Member

jordanlewis commented Nov 27, 2018

Yes, we should be able to avoid the gotchas - I guess it's just a matter of nilling out the encoded part everywhere, just like you're doing. Thanks for your hard work tracking this down.

@craig


craig bot commented Nov 27, 2018

Build succeeded

@craig craig bot merged commit c255cb0 into cockroachdb:master Nov 27, 2018

3 checks passed

- GitHub CI (Cockroach): TeamCity build finished
- bors: Build succeeded
- license/cla: Contributor License Agreement is signed
@RaduBerinde

Member

RaduBerinde commented Nov 27, 2018

> we should be removing the lazy-decode thing that EncDatum is supposed to get us

@jordanlewis I agree, I really regret introducing that abstraction.

We have a similar (but much more limited) problem with DatumAlloc: holding on to a row keeps alive whichever chunks of datums it was allocated from. That retention is bounded by a known factor, though (16).
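
For contrast, a sketch of that bounded retention; the chunk mechanics below are an assumption modeled on the factor of 16 mentioned above, not the real DatumAlloc code:

```go
// datum and datumAlloc are stand-ins. Datums are handed out from shared
// chunks of 16, so retaining one datum keeps at most its whole chunk
// reachable: a bounded 16x overhead, unlike the batch-sized retention
// fixed by this PR.
type datum struct{ v int64 }

type datumAlloc struct{ buf []datum }

func (a *datumAlloc) newDatum(v int64) *datum {
	if len(a.buf) == 0 {
		a.buf = make([]datum, 16) // allocate one shared chunk
	}
	d := &a.buf[0]
	a.buf = a.buf[1:] // hand out the next slot from the chunk
	d.v = v
	return d
}
```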

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment