Add RTS monitor #353
Merged
a2e930c to 3aeec4c
plajjan commented Nov 15, 2021
plajjan commented Nov 15, 2021
45dd9fc to 4396dc4
nordlander approved these changes Nov 16, 2021
4396dc4 to 733d341
This adds statistics per RTS worker thread and the ability to expose these statistics over a UNIX domain socket. It is enabled with --rts-mon PATH to listen on the socket PATH. The protocol used is a simple ASCII line-based protocol. A client can send "WTS" (currently the only supported command), to which the RTS will respond with a dump of the worker thread statistics (WTS) as a JSON blob on one line.

There's a new thread to deal with the monitor socket. We only accept a single connection at a time. I want the monitoring thread to be completely standalone, i.e. I do not want to mix in listening on various fds into our normal eventloop. This is deliberately kept simple.

To encode data in JSON we use the yyjson library, which is a wicked fast JSON encoding and decoding library. It only consists of a single .h and .c file, so they are now included in the deps directory. Not sure if this is how we want to deal with dependencies in general, but it is there for now.

The statistics are:
- program name (argv[0])
- PID
- current worker state
  - one of no-exist, worker, idle or sleeping
- number of times a thread has gone to sleep
- number of executed continuations
- sum of time spent executing continuations, in nanoseconds
- buckets with execution time & "bookkeeping" of continuations
  - for example, if a continuation takes less than 100ns to run, it is counted in the 100ns bucket, if it is between 100ns and 1us it is counted in the 1us bucket and so forth
  - bookkeeping includes flushing outgoing queues with the locking etc involved; if the distributed database is used, then its interaction is included
  - the buckets are:
    - < 100ns
    - < 1us
    - < 10us
    - < 100us
    - < 1ms
    - < 10ms
    - < 100ms
    - < 1s
    - < 10s
    - < 100s
    - < +Inf
  - 100ns is more on the level of measurement overhead so a 10ns or 1ns bucket would not yield useful information

It should be possible to graph the ratio between our bookkeeping and the actual execution of threads to show the sort of run time overhead of the Acton system. Going to be particularly interesting to see for the distributed database!

There is a new utility in utils/actonmon which can connect to the monitoring socket and display thread stats. It supports three different modes:
- simple
- rich
- prometheus

The rich interface (started with --rich) uses the rich library to render a table with the worker thread statistics. It updates in place and looks pretty good. The update interval can be set with --interval, in seconds.

The simple mode just prints some statistics to the screen in a long log at every --interval. It doesn't have any dependencies and so is useful when you don't have rich installed. It's currently very sparse.

prometheus (--prom) mode starts to listen on http://localhost:8000 and will answer GET queries with the statistics that it collects from the RTS.

To start examples/count with RTS monitoring enabled:

    examples/count --rts-mon ~/act_mon_socket

Then check the mon utility with the rich interface:

    utils/actonmon ~/act_mon_socket --interval 0.2 --rich

Or to enable prometheus export:

    utils/actonmon ~/act_mon_socket --prom
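To make the bucket scheme from the description concrete, here is a minimal Python sketch of how a measured duration could be mapped to one of the listed buckets. It is an illustration only, not the RTS code (which is written in C). The names `BUCKET_BOUNDS_NS`, `BUCKET_LABELS`, `bucket_for` and `histogram` are hypothetical, and placing a duration of exactly 100ns in the 1us bucket is an assumption based on the "less than 100ns" wording.

```python
import bisect
from collections import Counter

# Upper bounds in nanoseconds for every bucket except the final "+Inf" one.
# The bounds follow the list in the PR description; the names are hypothetical.
BUCKET_BOUNDS_NS = [10**2, 10**3, 10**4, 10**5, 10**6,
                    10**7, 10**8, 10**9, 10**10, 10**11]
BUCKET_LABELS = ["100ns", "1us", "10us", "100us", "1ms",
                 "10ms", "100ms", "1s", "10s", "100s", "+Inf"]


def bucket_for(duration_ns):
    """Return the label of the first bucket whose bound exceeds duration_ns."""
    # bisect_right makes the bounds strict: exactly 100 ns is NOT "< 100ns"
    # (assumption: the description says "less than 100ns").
    return BUCKET_LABELS[bisect.bisect_right(BUCKET_BOUNDS_NS, duration_ns)]


def histogram(durations_ns):
    """Count how many durations fall into each bucket."""
    counts = Counter(bucket_for(d) for d in durations_ns)
    return {label: counts.get(label, 0) for label in BUCKET_LABELS}
```

For example, `bucket_for(50)` yields `"100ns"` and `bucket_for(500)` yields `"1us"`, matching the example given in the description.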
I have now fixed the last couple of things, including properly freeing a temporary variable :P @nordlander approved, so going to merge on CI success. Yay.
This adds a monitoring function to the runtime system. It is enabled with --rts-mon /tmp/mon_socket, after which the RTS will listen on the UNIX domain socket /tmp/mon_socket. For any incoming request, we will dump the current stats for all worker threads.
There's a new thread to deal with the monitor socket. We only accept a single connection at a time. I want the monitoring thread to be completely standalone, i.e. I do not want to mix in listening on various fds into our normal eventloop. This is simple, which is good.
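The design described above (a standalone thread that owns the monitor socket and serves one connection at a time) can be sketched in Python for illustration. This is not the RTS implementation, which is in C; `serve_monitor`, `get_stats` and `stop` are hypothetical names, and the newline-terminated "WTS" command is an assumption drawn from the protocol being line based.

```python
import json
import os
import socket
import threading


def serve_monitor(path, get_stats, stop):
    """Answer "WTS" lines on a UNIX socket, one client at a time."""
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket file from an earlier run
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(path)
        srv.listen(1)
        srv.settimeout(0.1)  # wake up regularly to check the stop flag
        while not stop.is_set():
            try:
                conn, _ = srv.accept()
            except socket.timeout:
                continue
            conn.settimeout(None)  # block on this client until it leaves
            # Serve this client to completion before accepting the next one.
            with conn, conn.makefile("rb") as rfile:
                try:
                    for raw in rfile:
                        if raw.strip() == b"WTS":
                            reply = json.dumps(get_stats()) + "\n"
                            conn.sendall(reply.encode("ascii"))
                except OSError:
                    pass  # client went away mid-request; wait for the next
    os.unlink(path)


# Usage sketch (hypothetical stats callback):
#   stop = threading.Event()
#   threading.Thread(target=serve_monitor,
#                    args=("/tmp/mon_socket", lambda: {"pid": os.getpid()}, stop),
#                    daemon=True).start()
```

Because the thread blocks only on its own socket, nothing has to be registered with the main event loop, which is the property the description asks for.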
The protocol is a quick hack of a line protocol. The client sends "WTS" (worker thread stats) and gets a single line reply, which is a JSON document. To encode data we use yyjson, which is now included in our code base.
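As a rough illustration of the client side of this line protocol, the following Python sketch connects to the monitor socket, sends "WTS", and decodes the single-line JSON reply. `fetch_worker_stats` is a hypothetical helper, not code from utils/actonmon, and terminating the command with a newline is an assumption. The structure of the returned JSON is not specified here, so the sketch simply returns whatever was decoded.

```python
import json
import socket


def fetch_worker_stats(socket_path, timeout=2.0):
    """Send "WTS" to the monitor socket and return the decoded JSON reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        # Assumption: commands are newline-terminated ASCII lines.
        sock.sendall(b"WTS\n")
        # The reply is a single line, so read up to the first newline.
        with sock.makefile("rb") as reply:
            line = reply.readline()
    if not line:
        raise ConnectionError("monitor socket closed without a reply")
    return json.loads(line)
```

Calling `fetch_worker_stats("/tmp/mon_socket")` against a program started with --rts-mon /tmp/mon_socket would then return the parsed statistics document.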
There is a new utility in utils/actonmon which can connect to the monitoring socket and display thread stats. It does so continuously at 0.5 second intervals. The interval can be set with --interval. It requires the path to the monitoring socket as an argument.
To start examples/count with RTS monitoring enabled:
Then check the mon utility:
There's also a --rich option to actonmon to tell it to use the rich library to render a very pretty looking table that is then updated in place.