Description
Currently, our per-type memory accounting relies on the CompactObject::MallocUsed method and assumes that every memory change is tracked and properly reflected in the per-type stats.
This is a fragile and error-prone approach, because each data type is responsible for tracking its own usage.
Sometimes this is not even possible, because the malloc library can allocate blocks larger than were requested. In addition, it is very hard to track memory reserved by complicated data types such as JSON, where various internal data structures may reserve more than they use or account for.
Our goal is to provide insight into per-type memory usage; we do not really need byte-precise numbers.
The critical code is in the AccountObjectMemory function, which uses CompactObject::MallocUsed to count per-object deltas.
Design goals
Have a central approach to tracking memory, without writing complicated code in each data structure.
Proposal
Instead of relying on MallocUsed, we can track changes in EngineShard::UsedMemory(), which increases or decreases as the underlying objects mutate. If we know the type of the object that is being added or mutated, we can attribute the change to the correct type. There are a few caveats that need to be handled:
- Possible preemptions due to replication or snapshotting between the "before" and "after" measurements of EngineShard::UsedMemory().
- Cases where we overwrite existing objects (MOVE, COPY, SET, etc.).
- Underflows (per-type counters going negative) should be clamped to 0, at least in release mode.
EngineShard::UsedMemory() also includes dash table memory, which should not be accounted for when tracking object memory usage.
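The proposal above could be sketched roughly as a scoped guard that samples the shard counter before and after a mutation. This is only an illustration under stated assumptions: ShardStats, ObjType, ApplyDelta and ScopedTypeAccounting are hypothetical names, and the used_memory field merely stands in for EngineShard::UsedMemory(); none of this reflects the actual code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical object types, for illustration only.
enum class ObjType : std::size_t { kString = 0, kList, kJson, kCount };

// Hypothetical stand-in for the shard-level counters.
struct ShardStats {
  std::size_t used_memory = 0;   // plays the role of EngineShard::UsedMemory()
  std::size_t table_memory = 0;  // dash table memory, excluded from object stats
  std::array<std::size_t, static_cast<std::size_t>(ObjType::kCount)> per_type{};

  // Memory attributable to objects: total usage minus table memory.
  std::size_t ObjectMemory() const {
    return used_memory > table_memory ? used_memory - table_memory : 0;
  }
};

// Applies a signed delta to a counter, clamping at 0 instead of wrapping.
inline void ApplyDelta(std::size_t& counter, std::int64_t delta) {
  if (delta >= 0) {
    counter += static_cast<std::size_t>(delta);
  } else {
    const std::size_t dec = static_cast<std::size_t>(-delta);
    counter = dec > counter ? 0 : counter - dec;
  }
}

// Measures object memory on construction and destruction and attributes the
// difference to one type. The guarded scope must not contain a preemption
// point, otherwise unrelated allocations would leak into the delta.
class ScopedTypeAccounting {
 public:
  ScopedTypeAccounting(ShardStats& stats, ObjType type)
      : stats_(stats), type_(type), before_(stats.ObjectMemory()) {
    assert(type != ObjType::kCount);
  }

  ~ScopedTypeAccounting() {
    const auto after = static_cast<std::int64_t>(stats_.ObjectMemory());
    const auto before = static_cast<std::int64_t>(before_);
    ApplyDelta(stats_.per_type[static_cast<std::size_t>(type_)], after - before);
  }

  ScopedTypeAccounting(const ScopedTypeAccounting&) = delete;
  ScopedTypeAccounting& operator=(const ScopedTypeAccounting&) = delete;

 private:
  ShardStats& stats_;
  ObjType type_;
  std::size_t before_;
};
```

In this sketch, a command that overwrites an existing object of a different type would need two guarded scopes, one attributing the freed memory to the old type and one attributing the new allocation to the new type.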
Further enhancements
It would be nice to track key-related memory usage (i.e. the strings used for keys) separately, as type_used_memory_keys.