Previously, the check that counts wasted (fragmented) memory via
zmalloc_get_allocator_wasted_blocks was relatively slow: it took 10ms or more in production.
The reason is that it iterates over all the memory pages of a single shard
in a single call.
Now we implement an incremental version that processes a single page queue of the heap
per call. Once the process starts, we keep aggregating stats across the page queues
of the heap until we reach the last one, and only then decide whether defragmentation is needed.
This should reduce the duration of a single call to EngineShard::DefragTaskState::CheckRequired
by about 70x (the number of page queues in the heap).
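
For illustration only, the incremental scheme can be sketched roughly as below. This is
not the actual code of this change: PageQueueStats, IncrementalWasteCheck, Step and the
waste threshold are hypothetical names, and the heap is modelled as a plain vector of
per-queue counters instead of the real allocator structures.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for the stats of one page queue in the heap.
struct PageQueueStats {
  std::size_t committed = 0;  // bytes committed by the pages in this queue
  std::size_t used = 0;       // bytes occupied by live blocks
};

// Hypothetical incremental checker: every Step() call visits one queue only,
// so the cost of a single call is bounded by the size of that queue.
class IncrementalWasteCheck {
 public:
  explicit IncrementalWasteCheck(double waste_threshold) : threshold_(waste_threshold) {}

  // Returns std::nullopt while the scan is still in progress and the final
  // verdict (is defragmentation required?) after the last queue was visited.
  std::optional<bool> Step(const std::vector<PageQueueStats>& heap) {
    if (next_queue_ < heap.size()) {
      committed_ += heap[next_queue_].committed;
      used_ += heap[next_queue_].used;
      ++next_queue_;
    }
    if (next_queue_ < heap.size())
      return std::nullopt;  // more queues left, continue on the next call

    // All queues were aggregated: compute the wasted ratio and decide.
    std::size_t wasted = committed_ > used_ ? committed_ - used_ : 0;
    bool required =
        committed_ > 0 && static_cast<double>(wasted) / committed_ > threshold_;

    // Reset the state so that the next call starts a fresh scan.
    next_queue_ = committed_ = used_ = 0;
    return required;
  }

 private:
  double threshold_;
  std::size_t next_queue_ = 0;
  std::size_t committed_ = 0;
  std::size_t used_ = 0;
};
```

With N page queues, a caller under these assumptions gets a verdict on every N-th call
and spends roughly 1/N of the full scan time on each call.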
Signed-off-by: Roman Gershman <roman@dragonflydb.io>