Shrink table_mutable #490

Open · jamii opened this issue Feb 22, 2023 · 4 comments

jamii (Contributor) commented Feb 22, 2023

Currently the load factor is set to 50%, which is fairly wasteful. Unfortunately value_count_max is just below a power of two, so even load factors as high as 99% result in https://github.com/ziglang/zig/blob/6d44a6222d6eba600deb7f16c124bfa30628fb60//lib/std/hash_map.zig#L876 rounding the allocated capacity up to the next power of two, leaving an effective load factor of just under 50%.
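
For illustration, here is a minimal sketch of that arithmetic, assuming the standard library computes capacity as `size * 100 / max_load_percentage` rounded up to the next power of two. The `value_count_max` below is a made-up example, not the real constant:

```zig
const std = @import("std");

pub fn main() void {
    // Hypothetical value_count_max just below a power of two (2^20 - 64).
    const value_count_max: u64 = (1 << 20) - 64;
    const load_factors = [_]u64{ 50, 80, 99 };
    for (load_factors) |lf| {
        // Assumed capacity formula: divide by the load factor, then round up
        // to the next power of two (as std's capacityForSize appears to do).
        const required = value_count_max * 100 / lf + 1;
        const capacity = std.math.ceilPowerOfTwo(u64, required) catch unreachable;
        const effective = value_count_max * 100 / capacity;
        std.debug.print(
            "requested load factor {}% -> capacity {} -> effective {}%\n",
            .{ lf, capacity, effective },
        );
    }
}
```

With these numbers, every requested load factor from 50% to 99% ends up at the same 2^21 capacity, so the effective load factor stays at ~49%.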

bors bot added a commit that referenced this issue Feb 22, 2023
492: Increase lsm_batch_multiple to 64. r=jamii a=jamii

This is roughly the value we plan to use in production, so we should be benchmarking this configuration as early as possible.

This increases memory usage and increases the latency spike from the final phase of compaction, but we need to deal with both regardless.

#491 and #490 can help with memory usage.

#269 will reduce the latency spike, as will any optimization that improves compaction performance in general.

## Pre-merge checklist

Performance:

* [x] Compare `zig benchmark` on Linux before and after this PR.
```
lbm=4
1643655168 allocated
1223 batches in 116.60 s
load offered = 1000000 tx/s
load accepted = 85766 tx/s
batch latency p00 = 5 ms
batch latency p10 = 26 ms
batch latency p20 = 27 ms
batch latency p30 = 32 ms
batch latency p40 = 36 ms
batch latency p50 = 42 ms
batch latency p60 = 112 ms
batch latency p70 = 142 ms
batch latency p80 = 172 ms
batch latency p90 = 217 ms
batch latency p100 = 411 ms

lbm=64
2989330432 bytes allocated
1223 batches in 93.51 s
load offered = 1000000 tx/s
load accepted = 106942 tx/s
batch latency p00 = 5 ms
batch latency p10 = 26 ms
batch latency p20 = 26 ms
batch latency p30 = 27 ms
batch latency p40 = 31 ms
batch latency p50 = 31 ms
batch latency p60 = 32 ms
batch latency p70 = 36 ms
batch latency p80 = 37 ms
batch latency p90 = 42 ms
batch latency p100 = 3624 ms
```
OR
* [ ] I am very sure this PR could not affect performance.


Co-authored-by: Jamie Brandon <jamie@scattered-thoughts.net>

jorangreef (Member) commented

For hash tables that do linear probing, a load factor of 50% is important to avoid excessive collisions, clustering and probing.

I think we could do 60% or perhaps 70%, given that Zig's implementation is pretty good, but that risks clustering and more CPU-intensive probing, so we should also verify that these higher load factors don't cause excessive probing.
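
For a rough sense of the cost, Knuth's standard analysis of linear probing puts the expected probes per successful lookup at about (1 + 1/(1−α))/2 and per unsuccessful lookup at about (1 + 1/(1−α)²)/2 at load factor α. A small sketch of those numbers (illustrative only, not tied to Zig's actual probing scheme):

```zig
const std = @import("std");

// Knuth's approximations for linear probing at load factor a:
//   successful lookup:   ~ (1 + 1/(1 - a)) / 2 probes
//   unsuccessful lookup: ~ (1 + 1/(1 - a)^2) / 2 probes
pub fn main() void {
    const load_factors = [_]f64{ 0.50, 0.60, 0.70, 0.90 };
    for (load_factors) |a| {
        const hit = 0.5 * (1.0 + 1.0 / (1.0 - a));
        const miss = 0.5 * (1.0 + 1.0 / ((1.0 - a) * (1.0 - a)));
        std.debug.print(
            "load {d:.2}: ~{d:.1} probes/hit, ~{d:.1} probes/miss\n",
            .{ a, hit, miss },
        );
    }
}
```

At 50% this predicts roughly 1.5 probes per hit and 2.5 per miss; at 70% it is roughly 2.2 and 6.1, which is where the "verify we don't probe excessively" concern comes from.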

sentientwaffle (Member) commented

For object trees only:

If we're willing to pay the extra indirection cost and complexity, the hash map could map timestamp → usize instead of Value (with implied timestamp key) → void, and store the values themselves in a fixed-capacity arraylist indexed by that usize. We would still pay the 50% hashmap load-factor overhead, but we would pay it on @sizeOf(timestamp) + @sizeOf(usize) instead of @sizeOf(Value): 16 bytes vs 128 bytes.
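
A minimal sketch of that layout, with hypothetical names and `std` containers standing in for whatever we would actually use (the hash map pays its overhead on a 16-byte timestamp/index entry, while the 128-byte values sit densely in the array list):

```zig
const std = @import("std");

// Hypothetical sketch of the proposed indirection (illustrative only, not
// TigerBeetle's actual API): the hash map keys on the 8-byte timestamp and
// stores an index, while the full values live in a pre-allocated,
// fixed-capacity array list that can later be sorted and swapped wholesale.
fn TableMutableSketch(comptime Value: type) type {
    return struct {
        const Self = @This();

        values: std.ArrayList(Value),
        index: std.AutoHashMap(u64, u32), // timestamp -> position in `values`

        fn init(allocator: std.mem.Allocator, value_count_max: u32) !Self {
            var values = std.ArrayList(Value).init(allocator);
            try values.ensureTotalCapacityPrecise(value_count_max);
            var index = std.AutoHashMap(u64, u32).init(allocator);
            try index.ensureTotalCapacity(value_count_max);
            return .{ .values = values, .index = index };
        }

        fn put(self: *Self, timestamp: u64, value: Value) void {
            const gop = self.index.getOrPutAssumeCapacity(timestamp);
            if (gop.found_existing) {
                self.values.items[gop.value_ptr.*] = value; // overwrite in place
            } else {
                gop.value_ptr.* = @intCast(u32, self.values.items.len);
                self.values.appendAssumeCapacity(value);
            }
        }
    };
}
```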

sentientwaffle (Member) commented

> store the values themselves in a fixed-capacity arraylist

Which we could then sort in-place and swap with the immutable table's arraylist when we transition mutable → immutable to avoid copying all of the values out.
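
Continuing the hypothetical sketch above, the mutable → immutable transition could then sort the backing array in place and swap the two array lists, rather than copying values out one by one:

```zig
const std = @import("std");

// Hypothetical transition step: sort the mutable table's values in place by
// key and hand the whole array list to the immutable table by swapping the
// two lists, so no value is copied out individually. The hash map index
// would be cleared separately.
fn sortAndSwap(
    comptime Value: type,
    comptime lessThan: fn (void, Value, Value) bool,
    mutable_values: *std.ArrayList(Value),
    immutable_values: *std.ArrayList(Value),
) void {
    std.sort.sort(Value, mutable_values.items, {}, lessThan);
    std.mem.swap(std.ArrayList(Value), mutable_values, immutable_values);
    mutable_values.clearRetainingCapacity();
}
```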

jorangreef (Member) commented

This is a neat trick—and a "super idea" for both reasons.
