Improve memory instrumentation #1790
This increases the accuracy of crash dumps and the upcoming allocation tagging feature.
This may be of interest in crash dumps and allows the upcoming allocation tagging feature to track allocations on a per-NIF basis. Note that this is only updated when user code calls a NIF; it's not altered when the emulator calls NIFs during code upgrades or tracing.
This commit replaces the old memory instrumentation with a new implementation that scans carriers instead of wrapping erts_alloc/erts_free. The old implementation could not extract information without halting the emulator, had considerable runtime overhead, and the memory maps it produced were noisy and lacked critical information.

Since the new implementation walks through existing data structures, there's no longer a need to start the emulator with special flags to get information about carrier utilization/fragmentation. Memory fragmentation is also easier to diagnose, as it's presented on a per-carrier basis, which eliminates the need to account for "holes" between mmap segments.

To help track allocations, each allocation can now be tagged with what it is and who allocated it, at the cost of one extra word per allocation. This is controlled on a per-allocator basis with the +M<S>atags option, and is enabled by default for binary_alloc and driver_alloc (which is also used by NIFs).
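The "one extra word per allocation" tagging scheme described above can be sketched in plain C. This is only an illustration of the idea, not the actual ERTS allocator code; `tag_alloc`, `tag_of`, and `tag_free` are hypothetical names.

```c
#include <stdlib.h>
#include <stdint.h>

/* Hypothetical sketch of per-allocation tagging: one hidden word in
 * front of every block records what the block is / who allocated it.
 * Illustrative only -- not the real ERTS implementation. */
typedef uintptr_t alloc_tag_t;

/* Allocate `size` bytes plus one hidden word holding the tag. */
static void *tag_alloc(size_t size, alloc_tag_t tag) {
    alloc_tag_t *p = malloc(sizeof(alloc_tag_t) + size);
    if (!p) return NULL;
    p[0] = tag;        /* who/what allocated this block */
    return p + 1;      /* the caller only ever sees the payload */
}

/* Recover the tag from a pointer returned by tag_alloc. */
static alloc_tag_t tag_of(void *payload) {
    return ((alloc_tag_t *)payload)[-1];
}

static void tag_free(void *payload) {
    free((alloc_tag_t *)payload - 1);
}
```

A carrier scanner can then read the tag word of every live block without ever interposing on the allocation path itself, which is why the overhead is a single word rather than a wrapper around every call.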
I just ended up reaching for this and it turns out it's gone :(
What this new rewrite lacks is the most important feature: the ability to search for which terms are in memory and how many references they have. That makes it quick to identify what garbage is lying about; for example, this module helped me identify the binary heap leak inside jiffy. Now it just gives you summary information but nothing detailed. (You got a pointer to where the term was in memory, so you could just read it from the process's memory.)
This new version completely removes this feature.
Any plans to perhaps add it back?
No, you've always needed a debugger to do what you've described and the rewrite merely changes how you use it.
The gdb commands below will get you more or less the same information as before, and it's easily extended to include a lot more than the old instrument module did.
```
# printf at the start of the block scan loop in gather_ahist_scan.
# Your line numbers may differ.

## *After* `block = SBC2BLK(allocator, carrier);`
dprintf erl_alloc_util.c:7638,"SB: %p, %p\n", (((char*)(block)) + sizeof(Block_t)), block->bhdr

## At `UWord block_size = MBC_BLK_SZ(block);`
dprintf erl_alloc_util.c:7653,"MB: %p, %p\n", (((char*)(block)) + sizeof(Block_t)), block->bhdr

# Log the above to a file
set logging overwrite on
set logging file allocations.txt
set logging redirect on
set logging on
set pagination off

# Break out of the debugger, and then run instrument:allocations().
# Don't forget to turn off logging and/or redirect when you're done.
```
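Once the log exists it can be post-processed outside gdb. Here's a minimal sketch in C that tallies the single-block ("SB:") and multi-block ("MB:") lines written by the dprintf commands above; the file name and line prefixes come from the gdb script, everything else is illustrative.

```c
#include <stdio.h>
#include <string.h>

/* Count single-block carrier ("SB:") and multi-block carrier ("MB:")
 * entries in the allocations.txt log produced by the gdb script above.
 * Sketch only -- extend as needed, e.g. to bucket by block size. */
static void count_blocks(const char *path, long *sbc, long *mbc) {
    char line[256];
    FILE *f = fopen(path, "r");
    *sbc = *mbc = 0;
    if (!f) return;
    while (fgets(line, sizeof line, f)) {
        if (!strncmp(line, "SB:", 3))
            (*sbc)++;
        else if (!strncmp(line, "MB:", 3))
            (*mbc)++;
    }
    fclose(f);
}
```

An unusually high SB count relative to MB, for instance, can hint that many allocations are large enough to get their own carrier.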
Thanks, I will check that out, but that's not correct. You did not need a debugger, because you could just read back from /proc/mem and use os:getpid/0 to get the runtime pid. And memory_data provided the reference counts as well as the addresses. So I don't see the need to use a debugger to extract this information, unless I'm misunderstanding something.
I'd say that's a debugger by a different name, you're reading the raw memory of another process without its consent.
It didn't as far as I can tell, maybe I'm missing something:
In either case it's rather trivial to modify the above example to provide the reference count for binaries etc. The allocation tag tells you what type the object is and where it came from, so you can just set a breakpoint after the tag is read and print the reference count if it's a binary.
Still, doing this inside managed code makes things much easier, and you also get the actual pid that did the allocation. Using GDB instead not only raises the skill level needed to debug memory issues much higher, it also creates the requirement to write a bunch of custom tooling around GDB if you want to get more information out of the running system (vs. just dumping clear-text logs).
Before this was all accessible from inside the emulator, without leaving it.
I have not used it in a while; maybe it gave you the address of the allocated block the term was in (for shared binary heap terms, at least)?
Yeah, but it's a very useful tool as it was before, IMO, for figuring out the hard stuff. Things like Recon (and this new instrument module) do not really cut it: they tell you there is a problem, but you cannot really pinpoint what exact code is causing it.
So you need to drop down to GDB. But maybe that makes sense: pinpointing issues like this is a task for GDB, otherwise it gets hard to maintain, and it's arguably out of scope for managed code to be able to do stuff like this. GDB also seems to be the go-to tool for other issues, like scheduler locking around ets tables.
I don't get it. It's like using a fingernail to drive in a screw, why not use the right tool for the job?
In return you had no idea where it originated and had to resort to reading block contents with a debugger to figure that out, which is pretty much only feasible with binaries that contain recognizable data.
The pids weren't particularly useful either since they were only present when a process was directly responsible for the allocation (ports, system tasks, async jobs, etc were hopeless), and they could be long dead by the time you inspected them.
It doesn't take long to pick up the basics and there are countless tutorials on the net to help you. I don't think it's too much to ask.
No, it just prints the block address, type, and size. It doesn't say anything about the contents nor do they have anything to do with Erlang terms.
For example, the
It's just a pile of bytes and you can't even say what the reference count is without knowing the struct layout. With
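To make the "pile of bytes" point concrete: without knowing the struct layout, you cannot even locate the reference count inside a block. The `demo_bin` layout below is made up for illustration; it is not the real ERTS Binary struct.

```c
#include <stdint.h>
#include <string.h>

/* A made-up block layout, standing in for whatever struct actually
 * lives in the block. Only someone who knows this layout can say
 * which word is the reference count. */
typedef struct {
    uintptr_t flags;
    uintptr_t refc;     /* the reference count lives at a fixed offset */
    char      data[];   /* payload follows the header */
} demo_bin;

/* Interpret a raw block only once the layout is known. */
static uintptr_t demo_refc(const void *raw_block) {
    demo_bin b;
    memcpy(&b, raw_block, sizeof b);
    return b.refc;
}
```

Handed the same bytes without the struct definition, the second word is just a number; the layout is what turns it into a refcount.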
This basically sums it up. For what it's worth, I'm sorry this PR derailed your workflow.
Would it be possible to extend this module so it can show all the terms in a process's eheap?
For example, I know I have a memory issue: the eheap grows insanely fast if some unhappy path is taken. Up to 10MB is allocated for the eheap, then it stops growing (maybe this is some kind of internal limit after which GC is forced?). If the unhappy path, which has maybe a 1% chance of being hit, is not taken, everything is fine. Great, so how do I now find the unhappy path that is triggering the allocations?
One way would be if I could dump all the terms in the eheap, or at least get a pointer to it; that way I could easily and quickly see what's there. Of course, the best way would be to tie every term in the eheap to the line number in the source code that allocated it, but that might be complex due to all the optimizations with memory copying?