test code against a certain rate of production traffic
Loops a task function, for a given duration, across multiple threads.
A test is deemed successful if it ends without creating a cycle backlog.
example: benchmark a recursive fibonacci function across 4 threads
```js
// benchmark.js

import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() {
  // <benchmarked-code>
  function fibonacci(n) {
    return n < 1 ? 0
      : n <= 2 ? 1
      : fibonacci(n - 1) + fibonacci(n - 2)
  }

  fibonacci(35)
  // </benchmarked-code>
}, {
  // test parameters
  parameters: { cyclesPerSecond: 100, threads: 4, durationMs: 5 * 1000 },

  // log live stats
  onTick: list => {
    console.clear()
    console.table(list().primary().pick('count'))
    console.table(list().threads().pick('mean'))
  }
})
```

run it:

```bash
node benchmark.js
```

logs:
cycle stats

```
┌─────────┬────────┬───────────┬─────────┐
│  uptime │ issued │ completed │ backlog │
├─────────┼────────┼───────────┼─────────┤
│  4      │ 100    │ 95        │ 5       │
└─────────┴────────┴───────────┴─────────┘
```

average timings/durations, in ms

```
┌─────────┬───────────┬────────┐
│  thread │  evt_loop │  cycle │
├─────────┼───────────┼────────┤
│ '46781' │ 10.47     │ 10.42  │
│ '46782' │ 10.51     │ 10.30  │
│ '46783' │ 10.68     │ 10.55  │
│ '46784' │ 10.47     │ 10.32  │
└─────────┴───────────┴────────┘
```

install:

```bash
npm i @nicholaswmin/dyno
```

`npx init` creates a preconfigured sample `benchmark.js`.

Run it:

```bash
node benchmark.js
```

```js
import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() {
  // add benchmarked task
  // code in this block runs in its own thread
}, {
  parameters: {
    // add test parameters
  },
  onTick: list => {
    // build logging from the provided measurements
  }
})
```

| name | type | default | description |
|---|---|---|---|
| `cyclesPerSecond` | `Number` | `50` | global cycle issue rate |
| `durationMs` | `Number` | `5000` | how long the test should run |
| `threads` | `Number` | `auto` | number of spawned threads |

`auto` means it detects the available cores, but it can be overridden.

These parameters are user-configurable on test startup.
The primary spawns the benchmarked code as task threads.
Then, it starts issuing cycle commands to each one, in round-robin,
at a set rate, for a set duration.
The task threads must execute their tasks faster than the time it takes for
their next cycle command to come through, otherwise the test will start
accumulating a cycle backlog.
When that happens, the test stops; the configured cycle rate is deemed
the current breaking point of the benchmarked code.
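The round-robin issuing described above can be sketched as follows. This is an illustrative sketch of the dispatch order only, not dyno's actual implementation; the `threads` array and its names are made up:

```javascript
// illustrative sketch only, not dyno's actual implementation:
// issue cycle commands across a thread pool in round-robin order
const threads = ['thread-1', 'thread-2', 'thread-3']
const issued = []
let next = 0

function issueCycle() {
  // pick the next thread, wrapping around the pool
  const thread = threads[next++ % threads.length]
  issued.push(thread)
}

// issue 6 cycles; each thread receives exactly 2, in turn
for (let i = 0; i < 6; i++) issueCycle()

console.log(issued)
// → ['thread-1', 'thread-2', 'thread-3', 'thread-1', 'thread-2', 'thread-3']
```

In the real benchmarker this dispatch would run on a timer at the configured cycle rate; a backlog forms when a thread's previous cycle has not completed by the time its next one is issued.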
An example:

A benchmark configured with `threads: 4` and `cyclesPerSecond: 4`.

Each task thread must execute its own code in < 1 second, since this
is the rate at which it receives cycle commands.
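Sketched as a `dyno` call; the empty task body and the `durationMs` value here are placeholders, not part of the example above:

```javascript
import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() {
  // with threads: 4 and cyclesPerSecond: 4, each thread receives
  // ~1 cycle command per second, so this body must complete in
  // < 1000 ms to avoid accumulating a backlog
}, {
  parameters: { threads: 4, cyclesPerSecond: 4, durationMs: 10 * 1000 }
})
```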
glossary:

- **primary**: the main process. Orchestrates the test and the spawned
  task threads.
- **task thread**: the benchmarked code, running in its own separate
  process. Receives cycle commands from the primary, executes its code
  and records its timings.
- **task**: the benchmarked code.
- **cycle command**: a command that signals a task thread to execute its
  code.
- **cycle rate**: the rate at which the primary sends cycle commands to
  the task threads.
- **cycle timing**: the amount of time it takes a task thread to execute
  its own code.
- **backlog**: count of cycle commands that have been issued/sent but
  not yet executed.
This is how the process model would look, if sketched out.
```
// assume `fib()` is the benchmarked code

Primary 0: cycles issued: 100, finished: 93, backlog: 7
│
│
├── Thread 1
│     └── function fib(n) {
│           ├── return n < 1 ? 0
│           └──   : n <= 2 ? 1 : fib(n - 1) + fib(n - 2)}
│
├── Thread 2
│     └── function fib(n) {
│           ├── return n < 1 ? 0
│           └──   : n <= 2 ? 1 : fib(n - 1) + fib(n - 2)}
│
└── Thread 3
      └── function fib(n) {
            ├── return n < 1 ? 0
            └──   : n <= 2 ? 1 : fib(n - 1) + fib(n - 2)}
```

The benchmarker comes with a statistical measurement system that can
optionally be used to diagnose bottlenecks.

Some metrics are recorded by default; others can be recorded by the user
within a task thread.
Every recorded value is tracked as a Metric, represented as a histogram
with the following properties:
| name | description |
|---|---|
| `count` | number of values/samples |
| `min` | minimum value |
| `mean` | mean/average of values |
| `max` | maximum value |
| `stddev` | standard deviation between values |
| `last` | last value |
| `snapshots` | last 50 states |

Timing metrics are collected in milliseconds.
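For intuition, these statistics can be computed by hand for a handful of samples. This is an illustrative sketch using the population standard deviation, not dyno's internal code:

```javascript
// illustrative: compute the statistics a Metric histogram tracks,
// by hand, for a handful of timing samples (in ms)
const samples = [4, 8, 6, 10, 7]

const count = samples.length                             // 5
const min = Math.min(...samples)                         // 4
const max = Math.max(...samples)                         // 10
const last = samples[count - 1]                          // 7
const mean = samples.reduce((a, b) => a + b, 0) / count  // 35 / 5 = 7
const stddev = Math.sqrt(                                // sqrt(20 / 5) = 2
  samples.reduce((acc, v) => acc + (v - mean) ** 2, 0) / count
)

console.log({ count, min, mean, max, stddev, last })
// → { count: 5, min: 4, mean: 7, max: 10, stddev: 2, last: 7 }
```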
Metrics can be queried from the `list` argument of the `onTick` callback.
```js
// ...
onTick: list => {
  // primary metrics
  console.log(list().primary())

  // task thread metrics
  console.log(list().threads())
}
```

get all primary/main metrics:

```js
// log all primary metrics
console.log(list().primary())
```

get all metrics, for each task thread:

```js
// log all metrics of every task-thread
console.log(list().threads())
```

reduce all metrics to a single histogram property:

```js
list().threads().pick('min')
// from this: [{ cycle: { min: 4, max: 5 }, evt_loop: { min: 2, max: 8 } ...
// to this  : [{ cycle: 4, evt_loop: 2 ...
```

available: `min`, `mean`, `max`, `stddev`, `snapshots`, `count`, `last`

- `stddev`: standard deviation between recorded values
- `last`: last recorded value
- `count`: number of recorded values
reduce metrics that have been `pick`-ed into an array of histograms,
to an array of single histogram values:

```js
list().primary().pick('snapshots').of('max')
// from this: [{ cycle: [{ ... max: 5 }, { ... max: 3 }, { ... max: 2 }] } ...
// to this  : [{ cycle: [5, 3, 2 ...] } ...
```

note: only makes sense if it comes after `.pick('snapshots')`

get specific metric(s) instead of all of them:

```js
const loopMetrics = list().threads().metrics('evt_loop', 'fibonacci')
// only the `evt_loop` and `fibonacci` metrics
```

sort by a specific metric:

```js
const sorted = list().threads().pick('min').sort('cycle', 'desc')
// sort by descending min 'cycle' durations
```

available: `desc`, `asc`
get the result as an `Object`, like `Object.groupBy`,
with the metric name used as the key:

```js
const obj = list().threads().pick('snapshots').of('mean').group()
```

The following metrics are collected by default:
| name | description |
|---|---|
| `issued` | count of issued cycles |
| `completed` | count of completed cycles |
| `backlog` | size of the cycle backlog |
| `uptime` | seconds since test start |
| name | description |
|---|---|
| `cycles` | cycle timings |
| `evt_loop` | event loop timings |

Any custom metrics will appear here.
Custom metrics can be recorded with either:

- `performance.timerify`
- `performance.measure`

Both are native extensions of the User Timing APIs.

The metrics collector records their timings and attaches the tracked
Metric histogram to its corresponding task thread.
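For reference, a minimal standalone sketch of the underlying `performance.mark`/`performance.measure` calls, outside dyno; `hot-loop` is an illustrative metric name, not one dyno defines:

```javascript
import { performance } from 'node:perf_hooks'

performance.mark('begin')                 // start marker
for (let i = 0; i < 1e6; i++);            // stand-in for benchmarked work
performance.measure('hot-loop', 'begin')  // duration since 'begin'

const [entry] = performance.getEntriesByName('hot-loop')
console.log(entry.entryType, entry.duration >= 0)
// → 'measure' true
```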
example: instrumenting a function using `performance.timerify`:
```js
// performance.timerify example

import { dyno } from '@nicholaswmin/dyno'

await dyno(async function cycle() {
  performance.timerify(function fibonacci(n) {
    return n < 1 ? 0
      : n <= 2 ? 1
      : fibonacci(n - 1) + fibonacci(n - 2)
  })(30)
}, {
  parameters: { cyclesPerSecond: 20 },
  onTick: list => {
    console.log(list().threads().metrics().pick('mean'))
  }
})

// logs
// ┌─────────┬───────────┐
// │  cycle  │ fibonacci │
// ├─────────┼───────────┤
// │  7      │ 7         │
// │  11     │ 5         │
// │  11     │ 5         │
// └─────────┴───────────┘
```

note: the stats collector uses the function name for the metric name,
so named functions should be preferred to anonymous arrow functions.
Each metric contains up to 50 snapshots of its past states.
This allows plotting them as a timeline, using the `console.plot` module.

The following example benchmarks 2 `sleep` functions & plots their
timings as an ASCII chart:
```js
// Requires:
// `npm i @nicholaswmin/console-plot --no-save`

import { dyno } from '@nicholaswmin/dyno'
import console from '@nicholaswmin/console-plot'

await dyno(async function cycle() {
  await performance.timerify(function sleepRandom1(ms) {
    return new Promise(r => setTimeout(r, Math.random() * ms))
  })(Math.random() * 20)

  await performance.timerify(function sleepRandom2(ms) {
    return new Promise(r => setTimeout(r, Math.random() * ms))
  })(Math.random() * 20)
}, {
  parameters: { cyclesPerSecond: 15, durationMs: 20 * 1000 },
  onTick: list => {
    console.clear()
    console.plot(list().threads().pick('snapshots').of('mean').group(), {
      title: 'Plot',
      subtitle: 'mean durations (ms)'
    })
  }
})
```

which logs:
```
Plot

-- sleepRandom1  -- cycle  -- sleepRandom2  -- evt_loop

11.75 ┤╭╮
11.28 ┼─────────────────────────────────────────────────────────────────────╮
10.82 ┤│╰───╮ ╭╯ ╰╮ │╰╮ ╭─────────╯╰──────────╮ ╭─────────────────╯ ╰───────────╮╭─╮ ╭──────────
10.35 ┼╯ ╰╮╭╮╭╯ ╰───╯ ╰──╯ ╰─╯ ╰╯ ╰────╯
 9.88 ┤ ╰╯╰╯
 9.42 ┤
 8.95 ┤
 8.49 ┤
 8.02 ┤
 7.55 ┤
 7.09 ┤╭╮
 6.62 ┼╯╰───╮ ╭─────────╮ ╭──╮
 6.16 ┤ ╰╮╭──╯ ╰───╯ ╰───────────────────────╮ ╭─────────────────────╮╭───╮ ╭─────────
 5.69 ┤╭╮ ╰╯ ╭───────────╮ ╭╮╭──────╮ ╰╯ ╰──╭╮╭─╮╭─────
 5.22 ┤│╰╮╭─╮ ╭──╮ ╭───╮╭─╮ ╭────────────────────╯ ╰──╯╰╯ ╰────────────────╯╰╯ ╰╯
 4.76 ┤│ ╰╯ ╰───╯ ╰─────╯ ╰╯ ╰─╯
 4.29 ┼╯

mean durations (ms)

- last: 100 items
```

Using lambdas/arrow functions means the metrics collector has no
function name to use for the metric. By their own definition, they are
anonymous.
Change this:

```js
const foo = () => {
  // test code
}

performance.timerify(foo)()
```

to this:

```js
function foo() {
  // test code
}

performance.timerify(foo)()
```

The benchmark file self-forks itself. 👀
This means that any code that exists outside the dyno block will also
run in multiple threads.
This is a design tradeoff, made to provide the ability to create simple,
single-file benchmarks, but it can create issues if you intend to run
code after `dyno()` resolves/ends, or when running this as part of an
automated test suite.

In this example, `'done'` is logged 3 times instead of 1:
```js
import { dyno } from '@nicholaswmin/dyno'

const result = await dyno(async function cycle() {
  // task code, expected to run 3 times ...
}, { threads: 3 })

console.log('done')
// 'done'
// 'done'
// 'done'
```

To work around this, the before/after hooks can be used for setup and
teardown, like so:
```js
await dyno(async function cycle() {
  console.log('task')
}, {
  parameters: { durationMs: 5 * 1000 },

  before: async parameters => {
    console.log('before')
  },

  after: async parameters => {
    console.log('after')
  }
})

// "before"
// ...
// "task"
// "task"
// "task"
// "task"
// ...
// "after"
```

Alternatively, the task function can be extracted to its own file.
```js
// task.js

import { task } from '@nicholaswmin/dyno'

task(async function task(parameters) {
  // task code ...

  // `benchmark.js` test parameters are
  // available here.
})
```

then referenced as a path in `benchmark.js`:
```js
// benchmark.js

import { join } from 'node:path'
import { dyno } from '@nicholaswmin/dyno'

const result = await dyno(join(import.meta.dirname, './task.js'), {
  threads: 5
})

console.log('done')
// 'done'
```

This should be the preferred method when running this as part of a test
suite.
This is not a stress-testing tool.
Stress-tests are far more complex and require a near-perfect
replication of an actual production environment.
This is a prototyping tool that helps test whether a prototype idea is
worth proceeding with, or whether it has unworkable scalability issues.

Its multi-threaded model is meant to mimic the execution model of
horizontally-scalable, share-nothing services.

Its original purpose was benchmarking a module prototype that heavily
interacts with a data store over a network.

It's not meant for side-by-side benchmarking of synchronous code;
Google's Tachometer is a much better fit for that.
install deps:

```bash
npm ci
```

unit & integration tests:

```bash
npm test
```

test coverage:

```bash
npm run test:coverage
```

note: the parameter prompt is suppressed when `NODE_ENV=test`

meta checks:

```bash
npm run checks
```

generate a sample benchmark:

```bash
npx init
```

generate a Heroku-deployable benchmark:

```bash
npx init-cloud
```

Todos are available here.

update README.md code snippets:

```bash
npm run examples:update
```

source examples are located in: `/bin/example`