Skip to content
Newer
Older
100644 416 lines (341 sloc) 17.3 KB
0721ac4 @slfritchie Add DTrace support for OS X, Solaris, and Linux (via SystemTap), 1/4
slfritchie authored Nov 17, 2011
1 DTrace and Erlang/OTP
2 =====================
3
4 History
5 -------
6
7 The first implementation of DTrace probes for the Erlang virtual
8 machine was presented at the [2008 Erlang User Conference] [4]. That
9 work, based on the Erlang/OTP R12 release, was discontinued due to
10 what appears to be miscommunication with the original developers.
11
12 Several users have created Erlang port drivers, linked-in drivers, or
13 NIFs that allow Erlang code to try to activate a probe,
14 e.g. `foo_module:dtrace_probe("message goes here!")`.
15
16 Goals
17 -----
18
19 1. Annotate as much of the Erlang VM as is practical.
20 * The initial goal is to trace file I/O operations.
21 2. Support all platforms that implement DTrace: OS X, Solaris,
22 and (I hope) FreeBSD and NetBSD.
23 3. To the extent that it's practical, support SystemTap on Linux
24 via DTrace provider compatibility.
25 4. Allow Erlang code to supply annotations.
26
27 Supported platforms
28 -------------------
29
0fd4e39 @psyeugenic Update dtrace for changes in R15
psyeugenic authored Feb 7, 2012
30 * OS X 10.6.x / Snow Leopard. It should also work for 10.7 / Lion,
31 but I haven't personally tested it.
32 * Solaris 10. I have done limited testing on Solaris 11 and
33 OpenIndiana release 151a, and both appear to work.
34 * FreeBSD 9.0, though please see the "FreeBSD 9.0 Release Notes"
35 section below!
36 * Linux via SystemTap compatibility. Please see the file
37 `README.systemtap.md` for more details.
0721ac4 @slfritchie Add DTrace support for OS X, Solaris, and Linux (via SystemTap), 1/4
slfritchie authored Nov 17, 2011
38
0fd4e39 @psyeugenic Update dtrace for changes in R15
psyeugenic authored Feb 7, 2012
39 Just add the `--enable-dtrace` option to your command when you run the
40 `configure` script.
0721ac4 @slfritchie Add DTrace support for OS X, Solaris, and Linux (via SystemTap), 1/4
slfritchie authored Nov 17, 2011
41
42 Contributions
43 -------------
44
45 Code contributions are welcome! This is a side project for me (SLF),
46 so things would go faster if other people are willing to pitch in.
47 Please use the GitHub pull request mechanism or send me an email
48 message.
49
50 To build from scratch, use this recipe. If you're an experienced Git
51 user and wish to add my repository as a remote repository, be my
52 guest. Just resume the recipe at command #4.
53
0fd4e39 @psyeugenic Update dtrace for changes in R15
psyeugenic authored Feb 7, 2012
54 NOTE: The `dtrace-experiment+michal2` branch is used for changes that
55 can be applied to both R14B and R15 releases.
56
0721ac4 @slfritchie Add DTrace support for OS X, Solaris, and Linux (via SystemTap), 1/4
slfritchie authored Nov 17, 2011
57 % git clone git://github.com/slfritchie/otp.git
58 % cd otp
0fd4e39 @psyeugenic Update dtrace for changes in R15
psyeugenic authored Feb 7, 2012
59 % git checkout -b dtrace-experiment+michal2 origin/dtrace-experiment+michal2
0721ac4 @slfritchie Add DTrace support for OS X, Solaris, and Linux (via SystemTap), 1/4
slfritchie authored Nov 17, 2011
60 % env ERL_TOP=`pwd` ./otp_build autoconf
61 % env ERL_TOP=`pwd` ./configure --enable-dtrace + whatever args you need
62 % env ERL_TOP=`pwd` make
63
64 Then `make install` and then start an Erlang shell via
65 `/path/to/installed/bin/erl +A 8`. The Erlang shell's banner should
66 include `[dtrace]`.
67
68 Try using this (ugly) DTrace command to watch file I/O probes in use
69 (tested on OS X only, sorry):
70
71 dtrace -Z -n 'erlang*:::efile_drv-entry {printf("%d %d %s | %d | %s %s , %d %d %d", arg0, arg1, arg2 == NULL ? "" : copyinstr(arg2), arg3, arg4 == NULL ? "" : copyinstr(arg4), arg5 == NULL ? "" : copyinstr(arg5), arg6, arg7, arg8)} erlang*:::efile_drv-int* {printf("%d %d %d | %d", arg0, arg1, arg2, arg3);} erlang*:::efile_drv-return {printf("%d %d %s | %d | %d %d %d", arg0, arg1, arg2 == NULL ? "" : copyinstr(arg2), arg3, arg4, arg5, arg6 ) ; }'
72
73 Implementation summary
74 ----------------------
75
76 So far, most effort has been focused on the `efile_drv.erl` code,
77 which implements most file I/O on behalf of the Erlang virtual
78 machine. This driver also presents a big challenge: its use of an I/O
79 worker pool (enabled by using the `erl +A 8` flag, for example) makes
80 it much more difficult to trace I/O activity because each of the
81 following may be executed in a different Pthread:
82
83 * I/O initiation (Erlang code)
84 * I/O proxy process handling, e.g. read/write when file is not opened
85 in `raw` mode, operations executed by the code & file server processes.
86 (Erlang code)
87 * `efile_drv` command setup (C code)
88 * `efile_drv` command execution (C code)
89 * `efile_drv` status return (C code)
90
91 **TODO: keep this description up-to-date.**
92
93 Example output from `lib/dtrace/examples/efile_drv.d` while executing
94 `file:rename("old-name", "new-name")`:
95
96 efile_drv enter tag={3,84} user tag some-user-tag | RENAME (12) | args: old-name new-name , 0 0 (port #Port<0.59>)
97 async I/O worker tag={3,83} | RENAME (12) | efile_drv-int_entry
98 async I/O worker tag={3,83} | RENAME (12) | efile_drv-int_return
99 efile_drv return tag={3,83} user tag | RENAME (12) | errno 2
100
101 ... where the following key can help decipher the output:
102
103 * `{3,83}` is the Erlang scheduler thread number (3) and operation
104 counter number (83) assigned to this I/O operation. Together,
105 these two numbers form a unique ID for the I/O operation.
106 * `12` is the command number for the rename operation. See the
107 definition for `FILE_RENAME` in the source code file `efile_drv.c`
108 or the `BEGIN` section of the D script `lib/dtrace/examples/efile_drv.d`.
109 * `old-name` and `new-name` are the two string arguments for the
110 source and destination of the `rename(2)` system call.
111 The two integer arguments are unused; the simple formatting code
112 prints the arguments anyway, 0 and 0.
113 * The worker pool code was called on behalf of Erlang port `#Port<0.59>`.
114 * The system call failed with a POSIX errno value of 2: `ENOENT`,
115 because the path `old-name` does not exist.
116 * The `efile_drv-int_entry` and `efile_drv_int_return` probes are
117 provided in case the user is
118 interested in measuring only the latency of code executed by
119 `efile_drv` asynchronous functions by I/O worker pool threads
120 and the OS system call that they encapsulate.
121
122 So, where does the `some-user-tag` string come from?
123
124 At the moment, the user tag comes from code like the following:
125
126 put(dtrace_utag, "some-user-tag"),
127 file:rename("old-name", "new-name").
128
129 This method of tagging I/O at the Erlang level is subject to change.
130
131 Example DTrace probe specification
132 ----------------------------------
133
134 /**
135 * Fired when a message is sent from one local process to another.
136 *
137 * NOTE: The 'size' parameter is in machine-dependent words and
138 * that the actual size of any binary terms in the message
139 * are not included.
140 *
141 * @param sender the PID (string form) of the sender
142 * @param receiver the PID (string form) of the receiver
143 * @param size the size of the message being delivered (words)
144 * @param token_label for the sender's sequential trace token
145 * @param token_previous count for the sender's sequential trace token
146 * @param token_current count for the sender's sequential trace token
147 */
148 probe message__send(char *sender, char *receiver, uint32_t size,
149 int token_label, int token_previous, int token_current);
150
151 /**
152 * Fired when a message is sent from a local process to a remote process.
153 *
154 * NOTE: The 'size' parameter is in machine-dependent words and
155 * that the actual size of any binary terms in the message
156 * are not included.
157 *
158 * @param sender the PID (string form) of the sender
159 * @param node_name the Erlang node name (string form) of the receiver
160 * @param receiver the PID/name (string form) of the receiver
161 * @param size the size of the message being delivered (words)
162 * @param token_label for the sender's sequential trace token
163 * @param token_previous count for the sender's sequential trace token
164 * @param token_current count for the sender's sequential trace token
165 */
166 probe message__send__remote(char *sender, char *node_name, char *receiver,
167 uint32_t size,
168 int token_label, int token_previous, int token_current);
169
170 /**
171 * Fired when a message is queued to a local process. This probe
172 * will not fire if the sender's pid == receiver's pid.
173 *
174 * NOTE: The 'size' parameter is in machine-dependent words and
175 * that the actual size of any binary terms in the message
176 * are not included.
177 *
178 * @param receiver the PID (string form) of the receiver
179 * @param size the size of the message being delivered (words)
180 * @param queue_len length of the queue of the receiving process
181 * @param token_label for the sender's sequential trace token
182 * @param token_previous count for the sender's sequential trace token
183 * @param token_current count for the sender's sequential trace token
184 */
185 probe message__queued(char *receiver, uint32_t size, uint32_t queue_len,
186 int token_label, int token_previous, int token_current);
187
188 /**
189 * Fired when a message is 'receive'd by a local process and removed
190 * from its mailbox.
191 *
192 * NOTE: The 'size' parameter is in machine-dependent words and
193 * that the actual size of any binary terms in the message
194 * are not included.
195 *
196 * @param receiver the PID (string form) of the receiver
197 * @param size the size of the message being delivered (words)
198 * @param queue_len length of the queue of the receiving process
199 * @param token_label for the sender's sequential trace token
200 * @param token_previous count for the sender's sequential trace token
201 * @param token_current count for the sender's sequential trace token
202 */
203 probe message__receive(char *receiver, uint32_t size, uint32_t queue_len,
204 int token_label, int token_previous, int token_current);
205
206 /* ... */
207
208 /* Async driver pool */
209
210 /**
211 * Show the post-add length of the async driver thread pool member's queue.
212 *
213 * NOTE: The port name is not available: additional lock(s) must
214 * be acquired in order to get the port name safely in an SMP
215 * environment. The same is true for the aio__pool_get probe.
216 *
217 * @param port the Port (string form)
218 * @param new queue length
219 */
220 probe aio_pool__add(char *, int);
221
222 /**
223 * Show the post-get length of the async driver thread pool member's queue.
224 *
225 * @param port the Port (string form)
226 * @param new queue length
227 */
228 probe aio_pool__get(char *, int);
229
230 /* Probes for efile_drv.c */
231
232 /**
233 * Entry into the efile_drv.c file I/O driver
234 *
235 * For a list of command numbers used by this driver, see the section
236 * "Guide to probe arguments" in ../../../README.md. That section
237 * also contains explanation of the various integer and string
238 * arguments that may be present when any particular probe fires.
239 *
240 * TODO: Adding the port string, args[10], is a pain. Making that
241 * port string available to all the other efile_drv.c probes
242 * will be more pain. Is the pain worth it? If yes, then
243 * add them everywhere else and grit our teeth. If no, then
244 * rip it out.
245 *
246 * @param thread-id number of the scheduler Pthread arg0
247 * @param tag number: {thread-id, tag} uniquely names a driver operation
248 * @param user-tag string arg2
249 * @param command number arg3
250 * @param string argument 1 arg4
251 * @param string argument 2 arg5
252 * @param integer argument 1 arg6
253 * @param integer argument 2 arg7
254 * @param integer argument 3 arg8
255 * @param integer argument 4 arg9
256 * @param port the port ID of the busy port args[10]
257 */
258 probe efile_drv__entry(int, int, char *, int, char *, char *,
259 int64_t, int64_t, int64_t, int64_t, char *);
260
261 /**
262 * Entry into the driver's internal work function. Computation here
263 * is performed by a async worker pool Pthread.
264 *
265 * @param thread-id number
266 * @param tag number
267 * @param command number
268 */
269 probe efile_drv__int_entry(int, int, int);
270
271 /**
272 * Return from the driver's internal work function.
273 *
274 * @param thread-id number
275 * @param tag number
276 * @param command number
277 */
278 probe efile_drv__int_return(int, int, int);
279
280 /**
281 * Return from the efile_drv.c file I/O driver
282 *
283 * @param thread-id number arg0
284 * @param tag number arg1
285 * @param user-tag string arg2
286 * @param command number arg3
287 * @param Success? 1 is success, 0 is failure arg4
288 * @param If failure, the errno of the error. arg5
289 */
290 probe efile_drv__return(int, int, char *, int, int, int);
291
292 Guide to efile_drv.c probe arguments
293 ------------------------------------
294
295 /* Driver op code: used by efile_drv-entry arg3 */
296 /* used by efile_drv-int_entry arg3 */
297 /* used by efile_drv-int_return arg3 */
298 /* used by efile_drv-return arg3 */
299
300 #define FILE_OPEN 1 (probe arg3)
301 probe arg6 = C driver dt_i1 = flags;
302 probe arg4 = C driver dt_s1 = path;
303
304 #define FILE_READ 2 (probe arg3)
305 probe arg6 = C driver dt_i1 = fd;
306 probe arg7 = C driver dt_i2 = flags;
307 probe arg8 = C driver dt_i3 = size;
308
309 #define FILE_LSEEK 3 (probe arg3)
310 probe arg6 = C driver dt_i1 = fd;
311 probe arg7 = C driver dt_i2 = offset;
312 probe arg8 = C driver dt_i3 = origin;
313
314 #define FILE_WRITE 4 (probe arg3)
315 probe arg6 = C driver dt_i1 = fd;
316 probe arg7 = C driver dt_i2 = flags;
317 probe arg8 = C driver dt_i3 = size;
318
319 #define FILE_FSTAT 5 (probe arg3)
320 probe arg6 = C driver dt_i1 = fd;
321
322 #define FILE_PWD 6 (probe arg3)
323 none
324
325 #define FILE_READDIR 7 (probe arg3)
326 probe arg4 = C driver dt_s1 = path;
327
328 #define FILE_CHDIR 8 (probe arg3)
329 probe arg4 = C driver dt_s1 = path;
330
331 #define FILE_FSYNC 9 (probe arg3)
332 probe arg6 = C driver dt_i1 = fd;
333
334 #define FILE_MKDIR 10 (probe arg3)
335 probe arg4 = C driver dt_s1 = path;
336
337 #define FILE_DELETE 11 (probe arg3)
338 probe arg4 = C driver dt_s1 = path;
339
340 #define FILE_RENAME 12 (probe arg3)
341 probe arg4 = C driver dt_s1 = old_name;
342 probe arg5 = C driver dt_s2 = new_name;
343
344 #define FILE_RMDIR 13 (probe arg3)
345 probe arg4 = C driver dt_s1 = path;
346
347 #define FILE_TRUNCATE 14 (probe arg3)
348 probe arg6 = C driver dt_i1 = fd;
349 probe arg7 = C driver dt_i2 = flags;
350
351 #define FILE_READ_FILE 15 (probe arg3)
352 probe arg4 = C driver dt_s1 = path;
353
354 #define FILE_WRITE_INFO 16 (probe arg3)
355 probe arg6 = C driver dt_i1 = mode;
356 probe arg7 = C driver dt_i2 = uid;
357 probe arg8 = C driver dt_i3 = gid;
358
359 #define FILE_LSTAT 19 (probe arg3)
360 probe arg4 = C driver dt_s1 = path;
361
362 #define FILE_READLINK 20 (probe arg3)
363 probe arg4 = C driver dt_s1 = path;
364
365 #define FILE_LINK 21 (probe arg3)
366 probe arg4 = C driver dt_s1 = existing_path;
367 probe arg5 = C driver dt_s2 = new_path;
368
369 #define FILE_SYMLINK 22 (probe arg3)
370 probe arg4 = C driver dt_s1 = existing_path;
371 probe arg5 = C driver dt_s2 = new_path;
372
373 #define FILE_CLOSE 23 (probe arg3)
374 probe arg6 = C driver dt_i1 = fd;
375 probe arg7 = C driver dt_i2 = flags;
376
377 #define FILE_PWRITEV 24 (probe arg3)
378 probe arg6 = C driver dt_i1 = fd;
379 probe arg7 = C driver dt_i2 = flags;
380 probe arg8 = C driver dt_i3 = size;
381
382 #define FILE_PREADV 25 (probe arg3)
383 probe arg6 = C driver dt_i1 = fd;
384 probe arg7 = C driver dt_i2 = flags;
385 probe arg8 = C driver dt_i3 = size;
386
387 #define FILE_SETOPT 26 (probe arg3)
388 probe arg6 = C driver dt_i1 = opt_name;
389 probe arg7 = C driver dt_i2 = opt_specific_value;
390
391 #define FILE_IPREAD 27 (probe arg3)
392 probe arg6 = C driver dt_i1 = fd;
393 probe arg7 = C driver dt_i2 = flags;
394 probe arg8 = C driver dt_i3 = offsets[0];
395 probe arg9 = C driver dt_i4 = size;
396
397 #define FILE_ALTNAME 28 (probe arg3)
398 probe arg4 = C driver dt_s1 = path;
399
400 #define FILE_READ_LINE 29 (probe arg3)
401 probe arg6 = C driver dt_i1 = fd;
402 probe arg7 = C driver dt_i2 = flags;
403 probe arg8 = C driver dt_i3 = read_offset;
404 probe arg9 = C driver dt_i4 = read_ahead;
405
406 #define FILE_FDATASYNC 30 (probe arg3)
407 probe arg6 = C driver dt_i1 = fd;
408
409 #define FILE_FADVISE 31 (probe arg3)
410 probe arg6 = C driver dt_i1 = fd;
411 probe arg7 = C driver dt_i2 = offset;
412 probe arg8 = C driver dt_i3 = length;
413 probe arg9 = C driver dt_i4 = advise_type;
414
415 [1]: http://www.erlang.org/euc/08/
Something went wrong with that request. Please try again.