vmhealth op that spits out some stats #536

Open
wants to merge 9 commits into
from

@timo
Member
timo commented Feb 12, 2017

I've implemented an op that you can use to query assorted information about the state of the running MoarVM. Here's some example output:

:nursery_bytes((77568,))
:jitframes_produced(790)
:gc_seqnr(115)
:num_threads(1)
:gen2_pagecounts(((0, 0, 14, 843, 25, 207, 425, 4, 164, 5, 52, 2, 9, 45, 1, 1, 42, 1, 22, 3, 1, 0, 1, 0, 14, 0, 1, 0, 0, 1, 0, 0, 75, 23, 0, 0, 0, 0, 0, 0),))
:fsa_free_elems((82, 3, 50, 54, 106, 82, 105, 51, 117, 48, 53, 103, 0, 93, 119, 77, 111, 70, 65, 74, 119, 98, 122, 108, 120, 88, 115, 105, 107, 117, 125, 117, 123, 121, 121, 117, 125, 109, 126, 122, 125, 120, 127, 121, 125, 63, 126, 114, 117, 110, 124, 108, 119, 116, 115, 78, 124, 124, 127, 123, 115, 121, 123, 126, 126, 126, 123, 121, 127, 126, 114, 123, 122, 126, 118, 125, 125, 77, 120, 122, 123, 93, 124, 115, 119, 122, 123, 113, 118, 124, 125, 121, 124, 115, 126, 125))
:fsa_pagecounts((16, 8, 8, 10, 13, 800, 32, 20, 4, 5, 2, 2, 2, 1, 2, 5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1))
:gc_bytes_promoted_since_full(14900680)
:speshframes_produced(1111)
:gen2_free_elems(((0, 0, 7, 149, 224, 90, 94, 19, 241, 70, 216, 33, 110, 37, 50, 7, 238, 87, 220, 31, 1, 0, 47, 0, 68, 0, 4, 0, 0, 168, 0, 0, 99, 100, 0, 0, 0, 0, 0, 0),))
  • nursery_bytes gives the number of nursery bytes in use in each thread
  • jitframes_produced is the JIT sequence number
  • speshframes_produced is the spesh sequence number (the same number that's used for the spesh limit)
  • gc_seqnr is basically the number of GC runs so far
  • num_threads is how many threads are currently in the "started" state
  • gc_bytes_promoted_since_full (a mouthful!) is how many bytes moar thinks have been promoted to gen2 since the last full collection
  • gen2_pagecounts gives a list of the gen2 pages allocated in each size bucket, for each thread
  • gen2_free_elems gives a list of how far the allocation pointer is from the allocation limit (counted in elements) in each size class, for each thread
  • fsa_pagecounts and fsa_free_elems work the same as gen2_*, but since the FSA is shared between threads, each has just one list where the gen2_* entries have one list per thread
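
To make the shape of the result a bit more concrete, here is a minimal Python sketch of how a consumer might collapse the per-bucket lists into a few totals. The dict is a shortened, made-up stand-in for the real output (the real lists are much longer), and `summarize` is a hypothetical helper, not something this PR provides.

```python
# Shortened, made-up stand-in for a vmhealth result. Assumption: same nesting
# as the example output, i.e. one inner list per thread for the gen2_* keys
# and a single flat list for the fsa_* keys.
sample = {
    "nursery_bytes": [77568, 0],
    "gen2_pagecounts": [[0, 14, 843], [0, 1, 1]],
    "gen2_free_elems": [[0, 7, 149], [0, 1, 13]],
    "fsa_pagecounts": [16, 8, 8],
}


def summarize(stats):
    """Collapse the per-bucket lists into a handful of totals."""
    per_thread = [sum(buckets) for buckets in stats["gen2_pagecounts"]]
    return {
        "nursery_bytes_total": sum(stats["nursery_bytes"]),
        "gen2_pages_per_thread": per_thread,
        "gen2_pages_total": sum(per_thread),
        "fsa_pages_total": sum(stats["fsa_pagecounts"]),
    }


summary = summarize(sample)
print(summary)
```

Page counts alone don't give bytes; that would also need the element size of each bucket, which the op doesn't report in the output shown here.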

I'm opening this pull request for discussion. Especially:

  • What other stats would be interesting to probe regularly, for example in a monitoring system like prometheus?
  • How do I make this a bit safer? Currently it follows some pointers that other threads may update at some point, and that could explode if another thread frees the thing in question.
  • Should the API change? Should the user be allowed to pass a bitmask selecting the stats they want and leaving out the ones they don't?
  • Is it acceptable for the contents of the results to change from version to version? Should we warn users that they shouldn't rely on the output being compatible? Add a version parameter for backwards compat? Just yolo it?
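
As a starting point for the API questions, here is a rough Python sketch of what a bitmask selector plus a version marker could look like from the caller's side. Everything in it is hypothetical: the `Section` flags, the `format_version` key, and the `select` function are made up for illustration, and only the stat key names come from the example output above.

```python
from enum import IntFlag


class Section(IntFlag):
    """Hypothetical selector bits; the PR defines no such flags."""
    NURSERY = 1
    GC = 2
    GEN2 = 4
    FSA = 8
    SPESH_JIT = 16


# Which keys each bit would unlock (key names taken from the example output).
KEYS = {
    Section.NURSERY: ["nursery_bytes"],
    Section.GC: ["gc_seqnr", "gc_bytes_promoted_since_full"],
    Section.GEN2: ["gen2_pagecounts", "gen2_free_elems"],
    Section.FSA: ["fsa_pagecounts", "fsa_free_elems"],
    Section.SPESH_JIT: ["speshframes_produced", "jitframes_produced"],
}

FORMAT_VERSION = 1  # hypothetical; would be bumped when keys change meaning


def select(stats, mask):
    """Return only the requested sections, plus a format version marker."""
    out = {"format_version": FORMAT_VERSION}
    for bit, keys in KEYS.items():
        if mask & bit:
            for key in keys:
                if key in stats:
                    out[key] = stats[key]
    return out


full = {"nursery_bytes": [77568], "gc_seqnr": 115, "jitframes_produced": 790,
        "speshframes_produced": 1111, "fsa_pagecounts": [16, 8, 8]}
cheap = select(full, Section.GC | Section.SPESH_JIT)
print(cheap)
```

A mask like this would let a frequent poller skip the long per-bucket lists, and a version marker in the result is one way for consumers to notice when the format changes.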

One thing I'd like to see is how many major collections have been done so far; we don't count that yet. We have the number of gen2 roots in the profiler, but not yet here. Maybe that's interesting, too. In combination with the size buckets, it'd be pretty neat if you could ask an object what its size in the nursery/gen2 will be, so you can correlate consumption of gen2 space with the objects that are likely to be the cause.

Have at it!

@moritz
Contributor
moritz commented Feb 12, 2017

For monitoring an application, I'd want to know things like

  • memory used
  • number of GC runs
  • number of active threads
  • number of waiting threads
  • some performance counters, like percentage of time spent in the GC so far.

And of course I'd want that to be a stable interface.
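
For the monitoring use case, a small Python sketch of how such stats could be rendered in the Prometheus text exposition format may help the discussion. The `moarvm_` prefix, the `thread` label, and the choice of which keys to export are assumptions made for illustration; the input dict reuses key names from the example output in this thread.

```python
# Keys exported as plain samples and as per-thread samples. Which keys belong
# in a stable monitoring interface is exactly the open question; this split is
# only an assumption for the sketch.
SCALAR_KEYS = ["gc_seqnr", "gc_bytes_promoted_since_full",
               "jitframes_produced", "speshframes_produced"]
PER_THREAD_KEYS = ["nursery_bytes"]


def to_prometheus(stats, prefix="moarvm_"):
    """Render selected stats as Prometheus text-format sample lines."""
    lines = []
    for key in SCALAR_KEYS:
        if key in stats:
            lines.append(f"{prefix}{key} {stats[key]}")
    for key in PER_THREAD_KEYS:
        # One sample per thread, distinguished by a label.
        for i, value in enumerate(stats.get(key, [])):
            lines.append(f'{prefix}{key}{{thread="{i}"}} {value}')
    return "\n".join(lines) + "\n"


text = to_prometheus({"gc_seqnr": 5747,
                      "nursery_bytes": [435736, 0],
                      "jitframes_produced": 1035})
print(text, end="")
```

Exporting under fixed metric names like this is one way to keep the monitoring side stable even if the underlying result hash changes between versions.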

Now the question is, is that actually the use case you're going for? Or do you want to debug MoarVM internals?

timo added some commits Feb 10, 2017
@timo timo Count spesh_produced even without spesh_limit. 56644d6
@timo timo introduce vmhealth op
gives moar-specific information about current state:

memory in use by nursery and FSA, count of gc runs so far,
number of frames speshed and jitted, bytes promoted since
last full collection etc.
48c0ce3
@timo timo also report page counts and free items in thread's gen2 61c3bed
@timo timo committing WIP on "blocked threads" and "per-stage" threads be7899e
@timo timo WIP: add timings for GC and also major gc count cf14339
@timo timo set maj/min flag in gc prof when finished
instead of when starting. this lets us move the start
up a bit further before a little section that coordinates
with other threads, so any delay in there can also be
measured.
0a56801
@timo timo WIP on threads-per-stage and threads blocked e244999
@timo timo vmhealth: guard against lack of event loop thread 808561b
@timo timo hook up threads_per_state list to vmhealth hash a00f0a8
@timo
Member
timo commented Feb 20, 2017

Here's the output in the current format from my shooter game (with a start { } added at the beginning to force creation of a second thread):

:nursery_bytes((435736, 0))
:jitframes_produced(1035)
:gc_major_seqnr(1)
:gc_seqnr(5747)
:num_threads((0, 0, 0, 2, 0, 0, 0))
:gen2_pagecounts(((0, 0, 14, 954, 296, 275, 741, 10, 219, 6, 60, 4, 10, 49, 7, 1, 43, 1, 23, 3, 1, 0, 3, 0, 14, 0, 1, 0, 0, 1, 0, 0, 75, 82, 0, 0, 0, 0, 0, 0), (0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0)))
:fsa_free_elems((4, 75, 127, 49, 115, 60, 1, 74, 106, 53, 1, 105, 53, 59, 86, 45, 109, 72, 64, 62, 117, 97, 121, 106, 117, 84, 117, 108, 107, 117, 125, 103, 122, 120, 121, 117, 123, 109, 126, 122, 126, 116, 126, 121, 125, 58, 126, 111, 116, 109, 125, 107, 119, 116, 124, 122, 82, 125, 115, 117, 115, 122, 124, 126, 126, 126, 123, 121, 114, 126, 126, 124, 122, 126, 118, 125, 124, 122, 78, 122, 122, 93, 123, 115, 119, 122, 123, 113, 118, 124, 125, 121, 124, 115, 126, 122))
:fsa_pagecounts((15, 19, 31, 12, 13, 1133, 33, 33, 5, 14, 19, 4, 1, 1, 1, 5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1))
:gc_bytes_promoted_since_full(261037)
:speshframes_produced(1389)
:gen2_free_elems(((0, 0, 36, 46, 109, 231, 196, 137, 44, 10, 230, 77, 186, 173, 184, 7, 171, 121, 52, 52, 101, 0, 91, 0, 95, 0, 4, 0, 0, 168, 0, 0, 198, 49, 0, 0, 0, 0, 0, 0), (0, 0, 1, 13, 4, 6, 47, 0, 65, 0, 0, 0, 0, 0, 0, 0, 29, 0, 10, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 11, 0, 0, 0, 0, 0, 0)))
:gc_timings_minor(1867338)
:gc_timings_major(51076178)