mv: adjust the overhead estimation for view updates

In order to avoid running out of memory, we can't underestimate the memory used when processing a view update. Particularly, we need to handle the remote view updates well, because we may create many of them at the same time in contrast to local updates which are processed synchronously.

After investigating a coredump generated in a crash caused by running out of memory due to these remote view updates, we found that the current estimation is much lower than what we observed in practice; we identified overhead of up to 2288 bytes for each remote view update. The overhead consists of:
- 512 bytes - a write_response_handler
- less than 512 bytes - excessive memory allocation for the mutation in bytes_ostream
- 448 bytes - the apply_to_remote_endpoints coroutine started in mutate_MV()
- 192 bytes - a continuation to the coroutine above
- 320 bytes - the coroutine in result_parallel_for_each started in mutate_begin()
- 112 bytes - a continuation to the coroutine above
- 192 bytes - 5 unspecified allocations of 32, 32, 32, 48 and 48 bytes

This patch changes the previous overhead estimate of 256 bytes to 2288 bytes, which should take into account all allocations in the current version of the code. It's worth noting that changes in the related pieces of code may result in a different overhead.

The allocations seem to be mostly captures for the background tasks. Coroutines seem to allocate extra, however testing shows that replacing a coroutine with continuations may result in generating a few smaller futures/continuations with a larger total size. Besides that, considering that we're waiting for a response for each remote view update, we need the relatively large write_response_handler, which also includes the mutation in case we needed to reuse it.

The change should not majorly affect workloads with many local updates because we don't keep many of them at the same time anyway, and an added benefit of correct memory utilization estimation is avoiding evictions of other memory that would be otherwise necessary to handle the excessive memory used by view updates.

Fixes #17364

Closes #17420
scylladb · Feb 20, 2024 · 4c767c3
1 parent e63d8ae
commit 4c767c3
Showing 1 changed file with 3 additions and 2 deletions.
diff --git a/replica/table.cc b/replica/table.cc
@@ -2467,8 +2467,9 @@ std::vector<view_ptr> table::affected_views(shared_ptr<db::view::view_update_gen
 }
 
 static size_t memory_usage_of(const utils::chunked_vector<frozen_mutation_and_schema>& ms) {
-    // Overhead of sending a view mutation, in terms of data structures used by the storage_proxy.
-    constexpr size_t base_overhead_bytes = 256;
+    // Overhead of sending a view mutation, in terms of data structures used by the storage_proxy, as well as possible background tasks
+    // allocated for a remote view update.
+    constexpr size_t base_overhead_bytes = 2288;
     return boost::accumulate(ms | boost::adaptors::transformed([] (const frozen_mutation_and_schema& m) {
         return m.fm.representation().size();
     }), size_t{base_overhead_bytes * ms.size()});
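
The arithmetic behind the new constant can be checked with a small standalone sketch. It is not code from the patch: `overhead_components`, `sum_components` and `estimate_memory_usage` are hypothetical names, and a plain vector of serialized sizes stands in for `utils::chunked_vector<frozen_mutation_and_schema>`. The component values are the ones listed in the commit message, with the bytes_ostream entry taken at its stated upper bound of 512 bytes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Per-update overhead components from the commit message, in bytes.
constexpr std::array<std::size_t, 7> overhead_components = {
    512, // write_response_handler
    512, // upper bound on the excess bytes_ostream allocation for the mutation
    448, // apply_to_remote_endpoints coroutine
    192, // continuation to that coroutine
    320, // coroutine in result_parallel_for_each
    112, // continuation to that coroutine
    192, // five small allocations: 32 + 32 + 32 + 48 + 48
};

// Sums the components at compile time.
constexpr std::size_t sum_components() {
    std::size_t total = 0;
    for (std::size_t i = 0; i < overhead_components.size(); ++i) {
        total += overhead_components[i];
    }
    return total;
}

constexpr std::size_t base_overhead_bytes = sum_components();
static_assert(base_overhead_bytes == 2288, "components must add up to the new estimate");

// Hypothetical stand-in for memory_usage_of(): each element of
// mutation_sizes plays the role of m.fm.representation().size().
std::size_t estimate_memory_usage(const std::vector<std::size_t>& mutation_sizes) {
    return std::accumulate(mutation_sizes.begin(), mutation_sizes.end(),
                           base_overhead_bytes * mutation_sizes.size());
}
```

Under these assumptions, three mutations of 100, 200 and 300 bytes are estimated at 3 * 2288 + 600 = 7464 bytes, compared with 3 * 256 + 600 = 1368 bytes under the previous constant.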