Users have mentioned that our performance dips during checkpoints. We need to run experiments to understand why that is, but here are two theories:
Assuming experiments verify that these are issues, here are things we can do to address them:
I cannot devise an experiment that shows rebalancing of a leaf node is an issue. On mork, neither serial insertion tests with perf_insert nor regular perf_iibench tests show a performance drop during a checkpoint. This is likely because the machine has spare resources to handle the background work of the checkpoint. So the checkpoint does not seem to block client threads directly, and dips in performance are more likely due to the resources (memory, CPU, I/O) that checkpoints consume. I will focus on changing internal nodes to quicklz and see what the experiments show with that.
I am not sure whether that is related to checkpoints, but I do observe this problem, and it is rather severe.
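To put a number on "performance drop during a checkpoint", here is a minimal sketch of how a run could be scored. It assumes throughput samples and checkpoint begin/end timestamps have already been extracted from the benchmark output and the server log. The function name `checkpoint_dip` and the input shapes are hypothetical and are not part of perf_insert or perf_iibench.

```python
from statistics import mean


def checkpoint_dip(samples, checkpoints):
    """Return the fractional throughput drop inside checkpoint windows.

    samples:     list of (timestamp_sec, ops_per_sec) pairs (assumed format)
    checkpoints: list of (begin_sec, end_sec) intervals, end exclusive (assumed format)

    Returns 1 - mean(inside) / mean(outside), or None when either group is empty.
    """
    def in_checkpoint(t):
        return any(begin <= t < end for begin, end in checkpoints)

    inside = [ops for t, ops in samples if in_checkpoint(t)]
    outside = [ops for t, ops in samples if not in_checkpoint(t)]
    if not inside or not outside:
        return None  # nothing to compare against
    return 1.0 - mean(inside) / mean(outside)


if __name__ == "__main__":
    # Made-up numbers purely to show the call shape, not measured results.
    samples = [(0, 1000), (1, 1000), (2, 800), (3, 800), (4, 1000)]
    print(checkpoint_dip(samples, [(2, 4)]))
```

A value near zero would match what I see on mork. A clearly positive value on a more constrained machine would support the resource-cost explanation.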
We need to investigate. We do not know yet.
I don't think I want to mess around with the rebalancing. I cannot find a test that shows it is an issue, and clients never write nodes; background threads do. Writing a node to disk does require a lock, but that lock is grabbed by the background thread that is writing out the clone. So I don't think there is an issue here.
We have no evidence of what the cause of Vadim's symptoms is. We need to investigate.
Here is an experiment: append an auto-increment column to Vadim's original primary key and run with unique checks off for the primary key.