Commit 6e724a4

stall-detection: README and improved script
1 parent db63fda commit 6e724a4

2 files changed: +33 -1 lines changed

test/stall_detection/README.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+# Stall detector
+
+Not really a test, more of a demo of live-monitoring runtime-events to detect
+stalls (extended periods of time when the Lwt main loop is not entered).
+
+Stalls happen when too many of the promises make too many blocking calls. In the
+example here (`Stallerlib.stall`) it's one promise which does increasingly long
+blocking calls to `Unix.sleepf`. But it's also possible to stall with many short
+blocking calls (e.g., many promises logging to a file in a blocking way). Stalls
+can also happen if promises execute long computations.
+
+(Stalls can be lengthened by the GC. Execution of the GC is the same as a
+blocking call as far as Lwt is concerned.)
+
+## Detection
+
+The detection relies on two mechanisms. The first mechanism polls the runtime
+event ring to record the latest occurrence of the Lwt scheduler's events. The
+second mechanism checks that this latest occurrence is recent enough.
+
+## selfdetector
+
+The `selfdetector.ml` file shows how to run the detector in-process using a
+separate domain to run the detection in parallel.
+
+## staller + detector
+
+The `stall-detect.sh` script runs separate processes for a program which stalls
+for increasingly long times and a program which monitors the events of the first
+process.
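
The README's "Detection" section describes two cooperating mechanisms: record the time of the latest scheduler event, then check that it is recent enough. The following is only a shell analogue of that idea, not the detector from this commit: the real detector is an OCaml program reading the runtime event ring, whereas this sketch swaps the ring for a plain file holding a timestamp. The names `HEARTBEAT_FILE`, `record_event`, and `is_stalled`, as well as the 2-second threshold, are hypothetical.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical stand-in for the event ring: a file holding the time
# (seconds since the epoch) of the latest observed scheduler event.
HEARTBEAT_FILE="$(mktemp)"
trap 'rm -f "$HEARTBEAT_FILE"' EXIT

# Mechanism 1 (stand-in): record the time of the latest observed event.
# $1: timestamp in seconds
record_event() {
  printf '%s\n' "$1" > "$HEARTBEAT_FILE"
}

# Mechanism 2: succeed (exit status 0) when the latest recorded event is
# older than the threshold, i.e. when a stall should be reported.
# $1: current time in seconds, $2: threshold in seconds
is_stalled() {
  local now="$1" threshold="$2" last
  last="$(cat "$HEARTBEAT_FILE")"
  (( now - last > threshold ))
}

record_event 100
if is_stalled 101 2; then echo "stall"; else echo "ok"; fi   # 1 s since last event: prints "ok"
if is_stalled 105 2; then echo "stall"; else echo "ok"; fi   # 5 s since last event: prints "stall"
```

Timestamps are passed in explicitly so that the check is deterministic; a live monitor would presumably use the current time (for instance `date +%s`) and run both mechanisms in a loop.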

stall-detect.sh renamed to test/stall_detection/stall-detect.sh

Lines changed: 3 additions & 1 deletion
@@ -1,7 +1,7 @@
 #!/bin/bash
 set -euo pipefail
 
-PROJECT_ROOT="$(git rev-parse --show-toplevel)"
+PROJECT_ROOT=./"$(git rev-parse --show-cdup)"
 dune build "$PROJECT_ROOT/_build/default/test/stall_detection/staller.exe"
 dune build "$PROJECT_ROOT/_build/default/test/stall_detection/detector.exe"
 
@@ -18,3 +18,5 @@ echo "detector started"
 
 # Optional: wait for both processes to finish
 wait
+
+rm -f "$STALLER_PID.events"
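
The second hunk removes the staller's `.events` file once both processes have finished. Because the script runs under `set -e`, a command that fails before the final `rm` would leave the file behind. One possible variation, which is not what this commit does, is to register the cleanup in an `EXIT` trap. The sketch below uses `sleep` as a hypothetical stand-in for `staller.exe` and creates the `.events` file by hand in a scratch directory, since no OCaml runtime is involved here; `EVENTS_DIR`, `EVENTS_FILE`, and `cleanup` are illustrative names.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical scratch directory for the stand-in events file.
EVENTS_DIR="$(mktemp -d)"

# Stand-in for staller.exe: a short-lived background job.
sleep 1 &
STALLER_PID=$!

# Stand-in for the ring file; in the real setup the monitored process's
# runtime creates it, the script only deletes it.
EVENTS_FILE="$EVENTS_DIR/$STALLER_PID.events"
touch "$EVENTS_FILE"

# Remove the file whenever the script exits, including early exits
# triggered by `set -e`.
cleanup() { rm -rf "$EVENTS_DIR"; }
trap cleanup EXIT

# Wait for the background job, as the original script does.
wait
echo "staller finished"
```

Capturing `$!` immediately after launching the background job matters: it is the only point at which the PID of that job, and therefore the file name to remove later, is known.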
