row: fix intra-query memory leak in kvfetcher
The KVFetcher is the piece of code that does the first level of decoding of a KV batch response, doling out slices of keys and values to higher-level code that further decodes the key-values into formats that the SQL engine can operate on. The KVFetcher uses a slice into the batch response to keep track of where it is during the decoding process. Once the slice is empty, it's finished until someone asks it for a new batch.

However, the KVFetcher used to keep that empty slice pointer around for its lifetime, or until it was asked for a new batch. This makes the batch response un-garbage-collectable, since there is still a slice pointing at it, even though the slice is empty. As a result, queries use up to 2x their accounted-for batch memory: the memory accounting system assumes that once data is transferred out of the batch response into the SQL representation, the batch response is freed - it assumes there's just 1 "copy" of this batch response memory. This is especially problematic for long queries, since they will not allow that KVFetcher memory to be freed until they're finished.

In effect, this causes 1 extra batch per KVFetcher per query to be retained in memory. This doesn't sound too bad, since a batch is of fixed size. But the max batch size is 1 megabyte, so with 1024 concurrent queries, each with 3 KVFetchers, like we see in a TPCH workload with 1024 concurrent query 18s, that's 1024 * 1MB * 3 = 3GB of unaccounted-for memory. This is easily enough memory to push a node over and cause it to OOM.

This patch nils the batch response pointer once the KVFetcher is finished decoding it, which allows it to be garbage collected as soon as possible. In practice, this seems to allow at least a single-node, concurrency-1024 query 18 TPCH workload to survive indefinitely (all queries return out-of-budget errors) without OOMing.
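The mechanism behind the leak can be sketched in a few lines of Go. The names below (`batchFetcher`, `next`) are hypothetical stand-ins, not the actual KVFetcher API: the key point is that a zero-length slice still holds a pointer to its backing array, so the whole batch stays reachable until the slice variable itself is set to nil.

```go
package main

import "fmt"

// batchFetcher is a simplified stand-in for the pattern described above:
// it doles out entries from a large batch response one at a time.
type batchFetcher struct {
	// remaining is a slice into the batch response. Even once it has
	// length 0, its header still points at the backing array, keeping
	// the entire batch alive for the garbage collector.
	remaining [][]byte
}

// next returns the next entry, or nil once the batch is exhausted.
func (f *batchFetcher) next() []byte {
	if len(f.remaining) == 0 {
		// The fix: drop the reference entirely, so the batch response
		// becomes collectable as soon as the last entry is consumed.
		// Without this line, f.remaining would pin the batch until the
		// fetcher is handed a new one.
		f.remaining = nil
		return nil
	}
	kv := f.remaining[0]
	f.remaining = f.remaining[1:]
	return kv
}

func main() {
	batch := [][]byte{[]byte("k1"), []byte("k2")}
	f := &batchFetcher{remaining: batch}
	for kv := f.next(); kv != nil; kv = f.next() {
		fmt.Printf("%s\n", kv)
	}
	// After exhaustion, remaining is nil rather than a zero-length
	// slice still pointing into batch.
	fmt.Println(f.remaining == nil)
}
```

Before the fix, the fetcher's slice after exhaustion was the moral equivalent of `f.remaining = f.remaining[len(f.remaining):]` - length zero, but still referencing the 1MB backing array.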
Release note (bug fix): queries use up to 1MB less actual system memory per scan, lookup join, index join, zigzag join, or inverted join in their query plans. This will result in improved memory performance for workloads with concurrent OLAP-style queries.