Put more stuff from the blog post in the README.
nnethercote committed Sep 14, 2020
1 parent 3afe1ba commit 67179b8
Showing 1 changed file (README.md) with 79 additions and 0 deletions.
@@ -54,6 +54,66 @@ such integer). On the same input, `counts -w` produces the following output.
The total and per-line counts are now weighted; the output incorporates both
frequency and a measure of magnitude.
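
To make the weighting concrete, here is a minimal Rust sketch of the idea: each distinct line is weighted by the last integer that appears on it (or 1 if there is none), and lines are ranked by their cumulative weight. This is only an illustration of the concept, not the actual implementation of `counts`, and it assumes input arrives on stdin.
```rust
use std::collections::HashMap;
use std::io::{self, BufRead};

// Sketch of weighted counting: group identical lines, weight each line by the
// last integer appearing on it (or 1 if there is none), then rank groups by
// their cumulative weight. Illustrative only; not the real `counts` code.
fn main() {
    let stdin = io::stdin();
    let mut weights: HashMap<String, u64> = HashMap::new();
    let mut total: u64 = 0;

    for line in stdin.lock().lines() {
        let line = line.expect("read error");
        // Take the last whitespace-separated token that parses as an integer,
        // after stripping surrounding punctuation such as parentheses.
        let weight = line
            .split_whitespace()
            .rev()
            .filter_map(|tok| {
                tok.trim_matches(|c: char| !c.is_ascii_digit())
                    .parse::<u64>()
                    .ok()
            })
            .next()
            .unwrap_or(1);
        *weights.entry(line).or_insert(0) += weight;
        total += weight;
    }

    // Sort by descending weight and print counts with per-line and
    // cumulative percentages, roughly in the style shown above.
    let mut entries: Vec<(String, u64)> = weights.into_iter().collect();
    entries.sort_by(|a, b| b.1.cmp(&a.1));

    println!("{} counts:", total);
    let mut cumulative: u64 = 0;
    for (rank, (line, weight)) in entries.iter().enumerate() {
        cumulative += *weight;
        println!(
            "({:3}) {:8} ({:4.1}%, {:4.1}%): {}",
            rank + 1,
            weight,
            100.0 * (*weight as f64) / (total as f64),
            100.0 * (cumulative as f64) / (total as f64),
            line
        );
    }
}
```
Grouping identical lines and summing their trailing integers is the essence of what the weighted mode does.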

# An example

As an example, I added print statements to Firefox's heap allocator so it
prints a line for every allocation that shows its category, requested size, and
actual size. A short run of Firefox with this instrumentation produced a 77 MB
file containing 5.27 million lines. `counts` produced the following output for
this file.
```
5270459 counts:
( 1) 576937 (10.9%, 10.9%): small 32 (32)
( 2) 546618 (10.4%, 21.3%): small 24 (32)
( 3) 492358 ( 9.3%, 30.7%): small 64 (64)
( 4) 321517 ( 6.1%, 36.8%): small 16 (16)
( 5) 288327 ( 5.5%, 42.2%): small 128 (128)
( 6) 251023 ( 4.8%, 47.0%): small 512 (512)
( 7) 191818 ( 3.6%, 50.6%): small 48 (48)
( 8) 164846 ( 3.1%, 53.8%): small 256 (256)
( 9) 162634 ( 3.1%, 56.8%): small 8 (8)
( 10) 146220 ( 2.8%, 59.6%): small 40 (48)
( 11) 111528 ( 2.1%, 61.7%): small 72 (80)
( 12) 94332 ( 1.8%, 63.5%): small 4 (8)
( 13) 91727 ( 1.7%, 65.3%): small 56 (64)
( 14) 78092 ( 1.5%, 66.7%): small 168 (176)
( 15) 64829 ( 1.2%, 68.0%): small 96 (96)
( 16) 60394 ( 1.1%, 69.1%): small 88 (96)
( 17) 58414 ( 1.1%, 70.2%): small 80 (80)
( 18) 53193 ( 1.0%, 71.2%): large 4096 (4096)
( 19) 51623 ( 1.0%, 72.2%): small 1024 (1024)
( 20) 45979 ( 0.9%, 73.1%): small 2048 (2048)
```
Unsurprisingly, small allocations dominate. But what happens if we weight each
entry by its size? `counts -w` produced the following output.
```
2554515775 counts:
( 1) 501481472 (19.6%, 19.6%): large 32768 (32768)
( 2) 217878528 ( 8.5%, 28.2%): large 4096 (4096)
( 3) 156762112 ( 6.1%, 34.3%): large 65536 (65536)
( 4) 133554176 ( 5.2%, 39.5%): large 8192 (8192)
( 5) 128523776 ( 5.0%, 44.6%): small 512 (512)
( 6) 96550912 ( 3.8%, 48.3%): large 3072 (4096)
( 7) 94164992 ( 3.7%, 52.0%): small 2048 (2048)
( 8) 52861952 ( 2.1%, 54.1%): small 1024 (1024)
( 9) 44564480 ( 1.7%, 55.8%): large 262144 (262144)
( 10) 42200576 ( 1.7%, 57.5%): small 256 (256)
( 11) 41926656 ( 1.6%, 59.1%): large 16384 (16384)
( 12) 39976960 ( 1.6%, 60.7%): large 131072 (131072)
( 13) 38928384 ( 1.5%, 62.2%): huge 4864000 (4866048)
( 14) 37748736 ( 1.5%, 63.7%): huge 2097152 (2097152)
( 15) 36905856 ( 1.4%, 65.1%): small 128 (128)
( 16) 31510912 ( 1.2%, 66.4%): small 64 (64)
( 17) 24805376 ( 1.0%, 67.3%): huge 3097600 (3100672)
( 18) 23068672 ( 0.9%, 68.2%): huge 1048576 (1048576)
( 19) 22020096 ( 0.9%, 69.1%): large 524288 (524288)
( 20) 18980864 ( 0.7%, 69.9%): large 5432 (8192)
```
This shows that the cumulative count of allocated bytes (2.55 GB) is dominated
by a mixture of larger allocation sizes.
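
For reference, the instrumentation that feeds `counts` can be as simple as one print statement per event. The following is a hypothetical Rust sketch; Firefox's real allocator is C++ and its size classes are only loosely imitated here. All that matters is that each allocation emits one line in the `category requested-size (actual-size)` format seen above.
```rust
// Hypothetical sketch of allocator instrumentation: one log line per
// allocation, recording a category, the requested size, and the actual
// (rounded-up) size. The bucketing is made up for illustration and does not
// match Firefox's allocator; only the one-line-per-event format matters.
fn log_alloc(requested: usize) {
    // Crude size rounding, purely illustrative.
    let actual = requested.next_power_of_two().max(8);
    let category = if actual <= 2048 {
        "small"
    } else if actual < 1024 * 1024 {
        "large"
    } else {
        "huge"
    };
    // Unbuffered stderr output, so no lines are lost if the process dies.
    eprintln!("{} {} ({})", category, requested, actual);
}

fn main() {
    for &requested in &[24usize, 32, 100, 4096, 3_000_000] {
        log_alloc(requested);
    }
}
```
Collecting a few million such lines into a file and running `counts` or `counts -w` on it yields summaries like the ones above.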

This example gives just a taste of what `counts` can do.

# Typical uses

This technique is often useful when you already know something -- e.g. a
@@ -110,6 +170,25 @@ might sometimes end up with output files that are gigabytes in size. But
`counts` is fast because it's so simple. Let the machine do the work for you.
(It does help if you have a machine with an SSD.)

# Ad hoc profiling

For a long time I have, in my own mind, used the term ad hoc profiling to
describe this combination of logging print statements and frequency-based
post-processing. Wikipedia defines “ad hoc” as follows.

> In English, it generally signifies a solution designed for a specific problem
> or task, non-generalizable, and not intended to be able to be adapted to
> other purposes.

The process of writing custom code to collect this kind of profiling data — in
the manner I disparaged in the previous section — truly matches this definition
of "ad hoc".

But `counts` is valuable specifically because it makes this type of custom
profiling less ad hoc and more repeatable. I should arguably call it
"generalized ad hoc profiling" or "not so ad hoc profiling", but those names
don’t have quite the same ring to them.

# Tips

Use unbuffered output for the print statements. In C and C++ code, use
