New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve memory instrumentation #1790

Merged
merged 6 commits into from Apr 24, 2018

Conversation

Projects
None yet
1 participant
@jhogberg
Contributor

jhogberg commented Apr 19, 2018

This PR replaces the old memory instrumentation with a new implementation that scans carriers instead of wrapping erts_alloc/erts_free. The old implementation could not extract information without halting the emulator, had considerable runtime overhead, and the memory maps it produced were noisy and lacked critical information.

Since the new implementation walks through existing data structures there's no longer a need to start the emulator with special flags to get information about carrier utilization/fragmentation. Memory fragmentation is also easier to diagnose as it's presented on a per-carrier basis which eliminates the need to account for "holes" between mmap segments.

To help track allocations, each allocation can now be tagged with what it is and who allocated it at the cost of one extra word per allocation. This is controlled on a per-allocator basis with the +M<S>atags option, and is enabled by default for binary_alloc and driver_alloc (which is also used by NIFs).

The old interface has been removed since it can't be adapted to the new implementation and the module has always been experimental. It could coexist with the new one if we kept the old implementation around, but it will be removed unless someone makes a strong case for keeping it.

There's still some work left on the documentation, but with rc1 coming soon I figured it's best to get feedback as soon as possible.


Examples:

instrument:carriers/0-1 return a list with information about each carrier's total size, combined allocation size, allocation count, whether it's in the migration pool, and a histogram of free block sizes (log2, the first interval ends at 512 by default).

In the example below, ll_alloc carrier has no free blocks at all, the binary_alloc and eheap_alloc ones look healthy with a few very large blocks, and the fix_alloc carrier is somewhat fragmented with 22 free blocks smaller than 512 bytes (although this is not a problem for that allocator type).

1> instrument:carriers().
{ok,{512,
     [{ll_alloc,1048576,0,1048344,71,false,{0,0,0,0,0,0,0,0,0,0,0,0,0,0}},
      {binary_alloc,1048576,0,324640,13,false,{3,0,0,1,0,0,0,2,0,0,0,0,0,0}},
      {eheap_alloc,2097152,0,1037200,45,false,{2,1,1,3,4,3,2,2,0,0,0,0,0,0}},
      {fix_alloc,32768,0,29544,82,false,{22,0,0,0,0,0,0,0,0,0,0,0,0,0}},
      {...}|...]}}

instrument:allocations/0-1 return an allocation summary with histograms of allocated block sizes grouped by their origin and allocation type. The example below is taken with +Muatags true to track allocations on all allocator types.

2> instrument:allocations()
{ok,{128,0,
     #{udp_inet =>
           #{driver_event_state => {0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0}},
       tty_sl =>
           #{io_queue => {0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
             drv_internal => {0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0}},
       system =>
           #{db_segment => {0,0,0,0,0,18,0,0,1,0,0,0,0,0,0,0,0,0},
             heap => {0,0,0,0,20,4,2,2,2,3,0,1,0,0,1,0,0,0},
             thr_prgr_data => {38,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
             db_term => {271,3,1,52,80,1,0,0,0,0,0,0,0,0,0,0,0,0},
             code => {0,0,0,5,3,6,11,22,19,20,10,2,1,0,0,0,0,0},
             binary => {18,0,0,0,7,0,0,1,0,0,0,0,0,0,0,0,0,0},
             atom_entry => {8681,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
             message => {0,40,78,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0},
             ... }
       spawn_forker =>
           #{driver_select_data_state =>
                 {1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}},
       ram_file_drv => #{drv_binary => {0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0}},
       prim_file =>
           #{process_specific_data => {2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
             nif_trap_export_entry => {0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
             monitor_extended => {0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
             drv_binary => {0,0,0,0,0,0,1,0,3,5,0,0,0,1,0,0,0,0},
             binary => {0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}},
       prim_buffer =>
           #{nif_internal => {0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
             binary => {0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}}}}}

instrument:allocations/1 and instrument:carriers/1 let you tweak the histograms and which allocators to look in.

jhogberg added some commits Apr 10, 2018

erts: Update esdp->current_port on immediate port calls
This increases the accuracy of crash dumps and the upcoming
allocation tagging feature.
erts: Keep track of which NIF a scheduler is executing
This may be of interest in crash dumps and allows the upcoming
allocation tagging feature to track allocations on a per-NIF basis.

Note that this is only updated when user code calls a NIF; it's not
altered when the emulator calls NIFs during code upgrades or
tracing.

@jhogberg jhogberg self-assigned this Apr 19, 2018

@jhogberg jhogberg added the team:VM label Apr 19, 2018

erts: Rewrite memory instrumentation
This commit replaces the old memory instrumentation with a new
implementation that scans carriers instead of wrapping
erts_alloc/erts_free. The old implementation could not extract
information without halting the emulator, had considerable runtime
overhead, and the memory maps it produced were noisy and lacked
critical information.

Since the new implementation walks through existing data structures
there's no longer a need to start the emulator with special flags to
get information about carrier utilization/fragmentation. Memory
fragmentation is also easier to diagnose as it's presented on a
per-carrier basis which eliminates the need to account for "holes"
between mmap segments.

To help track allocations, each allocation can now be tagged with
what it is and who allocated it at the cost of one extra word per
allocation. This is controlled on a per-allocator basis with the
+M<S>atags option, and is enabled by default for binary_alloc and
driver_alloc (which is also used by NIFs).

@jhogberg jhogberg merged commit e2a760e into erlang:master Apr 24, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
license/cla Contributor License Agreement is signed.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment