This repository was archived by the owner on Nov 28, 2020. It is now read-only.

[Update] Benchmarking Microtests: State of the union of node/benchmark tests and where to go next #25

@ThePrimeagen

Description


The goal of this issue is to demonstrate the following:

  • Why microbenchmarking is important.
  • Why Benchmark.js is a good library to use.
  • Where I am going to start and the path forward.

Why micro benchmark?

Micro benchmarking is arguably more important from a library's standpoint than application / integration level benchmarking (I have also heard the term macro benchmarking). Micro benchmarks will quickly flag slowdowns in the system with little noise, which helps diagnose issues with little to no investigation needed.

Arguments against micro benchmarks

  • Can be less reliable.
    • This is addressed below in more detail (linked article). It can be measured accurately*.
  • Application / integration benchmarks are more meaningful measurements.
    • Correct and incorrect. They are meaningful as an estimate of the performance of the specific application / integration being measured, but that does not mean the library will be as performant in my application, since our calling patterns could differ and therefore have different performance characteristics.
    • Second, there is no reasonable / practical way to determine where performance issues are arising from if the granularity of performance tests is at the application level. The noise is too loud.
    • Finally, with application-level measurements, some operations can become 2 - 3x slower and still be eclipsed by the performance of the application itself. The thousand paper cuts of slowdown accumulate over time with no one able to determine where or when each one happened.

* measured accurately: if a stable platform and multiple runs are used, one gets the most consistent measurements possible from JavaScript measuring JavaScript.

Where are the current tests at and where do we go?

Overview

After reviewing the set of tests in nodejs/node/benchmark, I see an awesome set of micro benchmarks. It really is a great place to start. It appears that the represented set of node-specific libraries is here.

Why not just use those tests?

The primary reason why the existing tests are invalid forms of measurement can be found here (for the TL;DR portion, read how options A and D work). Second, using a well-known performance-measuring library would reduce potential bugs and the learning curve. Especially since the custom benchmark harness suffers from all the same downfalls as benchmark.js, and then some.
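For reference, a benchmark.js micro benchmark looks something like the following. This is only a sketch: the suite name and the measured calls are illustrative, not actual tests from node/benchmark, and it assumes benchmark.js has been installed (`npm install benchmark`).

```javascript
// Minimal benchmark.js sketch; the compared snippets are made up.
const Benchmark = require('benchmark');
const path = require('path');

const suite = new Benchmark.Suite('path.basename');

suite
  .add('posix', () => {
    path.posix.basename('/foo/bar/baz.txt');
  })
  .add('win32', () => {
    path.win32.basename('C:\\foo\\bar\\baz.txt');
  })
  .on('cycle', (event) => {
    // benchmark.js reports ops/sec with a relative margin of error
    console.log(String(event.target));
  })
  .on('complete', function () {
    console.log('Fastest is ' + this.filter('fastest').map('name'));
  })
  .run();
```

Because benchmark.js runs each snippet until the sample is statistically stable and reports a margin of error, we get that behavior for free instead of maintaining it in a custom harness.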

Downfalls of Benchmark.js and their workarounds.

The main downfall of benchmark.js is that it is JavaScript measuring JavaScript (a downfall it shares with node/benchmark/common.js). The operating system can do who knows what during performance runs and cause incorrect measurements. Thus a more consistent platform (EC2, as an example) can make results more stable. Multiple runs (say 10), tossing out the high / low, will help remove issues from v8 optimizing mid-run, OS context switching, Wednesday's bad weather, etc.
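The multiple-runs idea can be sketched in plain JavaScript. The `trimmedMean` helper and the timing values below are made up for illustration; in practice the numbers would be the results of repeated benchmark runs.

```javascript
// "Multiple runs, toss out the high/low" as a trimmed mean.
// `runs` would be per-run timings (ms); these values are fabricated.
function trimmedMean(runs) {
  if (runs.length < 3) throw new Error('need at least 3 runs to trim');
  const sorted = [...runs].sort((a, b) => a - b);
  const trimmed = sorted.slice(1, -1); // drop the single high and low outlier
  return trimmed.reduce((sum, t) => sum + t, 0) / trimmed.length;
}

// One outlier run (GC pause, OS context switch, ...) no longer skews the result:
const runs = [101, 99, 100, 240, 98, 102, 100, 97, 103, 100];
console.log(trimmedMean(runs)); // 100.375, versus a plain mean inflated to 114
```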

What about flamegraphs?

Flamegraphs do not give absolute numbers; they give relative numbers. Flamegraphs are amazing for understanding what is taking the most time within a library, not the performance of the library itself.

A side note: this would be a very interesting tool to use for charting performance over time. One could use the % of samples as an indicator of growth in running time. If all tests were measured for a long enough period of time, a complete picture could be established and used build over build / day over day / at some frequency. The only issues I see with this are that there is no out-of-the-box solution for it, and that writing such a library would be a feat in itself. So we will defer discussion / implementation of this for a later time, or never.

Where to go from here?

  • Talk to @mhdawson about where to commit these tests to.
  • Now that we have a baseline of where to start, I'll create a set of tests for require. It may be impossible to test requiring a new (uncached) module with benchmark.js due to require's caching (it really depends on whether I can muck with the cache or not). It will be trivial to test require's cached-result retrieval with benchmark.js.
  • I'll talk to @mhdawson and learn how to integrate the results into the already built charting / storage system.
  • I'll start building a suite of tests using benchmark.js for each of node's subsystems. This would be buffer, path, urlparse, etc. I would follow suit with node/benchmark.
