JavaScript Performance For Madmen

JavaScript performance is terrifying and unpredictable. If you want things to run fast, you'll need a dowsing rod, but these test cases might help:

Performance minutiae

Most JS runtimes

  • Wrapping integer arithmetic expressions in ( ) | 0 allows the runtime to be sure that you're doing integer arithmetic instead of floating-point arithmetic. This allows it to avoid checking for overflow and produce faster code in many cases.
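
    For example (a minimal sketch; sumOfSquares is just an illustrative name):

    function sumOfSquares(a, b) {
      // | 0 coerces each intermediate result to int32, signalling to the runtime
      // that this is integer arithmetic and no floating-point path is needed.
      var aa = (a * a) | 0;
      var bb = (b * b) | 0;
      return (aa + bb) | 0;
    }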

  • See http://people.mozilla.com/~dmandelin/KnowYourEngines_Velocity2011.pdf for performance guidance for various JS engines (note: somewhat outdated)

  • In practice, while arguments[x] with integer indices is fast these days, using named arguments is still faster. In some cases it is even faster to declare more named arguments than you actually intend to accept, and then use arguments.length to determine how many of them were passed, e.g.:

    function variadic(arg0, arg1) {
      var argc = arguments.length | 0;
      if (argc === 1)
        alert("one argument " + arg0);
      else if (argc === 2)
        alert("two arguments " + arg0 + ", " + arg1);
    }

    This is likely to change in the future; when last tested in V8 and SpiderMonkey this was slightly faster because the explicit list of named arguments makes it easier for the VM to allocate sufficient space for them.

  • Various seemingly-identical ways of calling functions have differing performance characteristics. For example, the following two code blocks seem behaviorally identical:

    // Version 1: naive dynamic dispatch.
    function dispatch (x, methodName, arg) {
      return x[methodName](arg);
    }

    dispatch(window, "alert", "hello");

    // Version 2: dispatch through a manual inline cache.
    var lookupTable = {
      "alert": 1
    };

    function dispatch (x, methodName, arg) {
      var index = lookupTable[methodName] | 0;
      switch (index) {
        case 1:
          // Statically known property name; the JIT can optimize this call.
          return x["alert"](arg);
        default:
          // Unknown method: fall back to a dynamic property lookup.
          return x[methodName](arg);
      }
    }

    dispatch(window, "alert", "hello");

    Despite their behavioral similarity, the second function is faster (in runtimes that were modern at the time of this test). Why? The second function contains a rough form of an inline cache: a cheap test determines whether a static dispatch can occur. The call x["alert"](arg) behaves just like x[methodName](arg) when methodName is "alert", but semantically the former is a statically known invocation of a property named "alert", while the latter has to look the property name up at runtime. In practice this lets a JIT optimize the call to alert. So whenever the method you're calling is in the lookup table, the cheap table lookup plus the statically known invocation is faster than the fully dynamic call.

    In a future JS runtime the costs involved here may change: the manual inline cache might end up slower, or it might get even faster relative to the naive code.

  • Calling a single function but passing different types as arguments consistently deoptimizes that function in most runtimes. For example, if you have this function:

    function addValues (x, y) {
      return x + y;
    }

    Calling it on only integers or only floats will be fast, but if you mix integers and floats, or even throw a string in there, your performance is going to go into the toilet. Sometimes inlining will mitigate this (an inlined copy's type information doesn't get polluted quite as badly), but it's still a fundamental problem. The reason this sucks is that VMs have to guess what types your arguments and variables are, so intentionally passing different types to a function means the VM now has to check types on every call. If it always gets the same types, the VM can cheat and perform checks less often, or even skip them entirely where that is safe. A trivial workaround is to make a copy of the function for each known argument type - an adder for floats, an adder for integers, and so on - as sketched below.
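
    A minimal sketch of that workaround (addInts and addFloats are just illustrative names):

    // Each copy only ever sees one argument type, so its type information stays clean.
    function addInts (x, y) {
      return (x + y) | 0;
    }

    function addFloats (x, y) {
      return +x + +y;
    }

    addInts(2, 3);        // always called with integers
    addFloats(2.5, 3.5);  // always called with floats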

SpiderMonkey/IonMonkey (Mozilla Firefox)

  • Code loaded using eval suffers various performance penalties compared to code loaded via a script file or evaluated using new Function.
  • Functions fall into two categories, roughly described as 'singleton' and 'non-singleton'. A function in the 'singleton' category is eligible for a much larger set of optimizations; non-singleton functions typically perform much worse. Many rules govern whether a function can be a singleton, but one good rule of thumb is that it must be a function statement (not an expression passed as an argument), and that if it is defined inside another function, the outer function must be an IIFE (immediately invoked function expression) rather than a function being passed as an argument or stored in a variable (see the singleton sketch after this list). Setting INFERFLAGS=result in a recent build of the SpiderMonkey JS shell can give you insight into this: singleton functions are wrapped in < >, while non-singletons are wrapped in [ ].
  • Any array instance that isn't initialized sequentially, or that has non-indexed properties, gets deoptimized to a regular old Object (a hash table). This includes adding a named property to an array after initializing it normally (see the array sketch after this list).
  • Local variables should only be assigned values of the same type (all objects count as one type). This is especially important for numeric computation. In many cases the optimizer will be able to optimize code that assigns different types to the same variable - but in more complex functions it may not be able to do so.
  • Functions containing a 'throw' statement cannot be inlined; see http://dxr.mozilla.org/mozilla-central/source/js/src/jit/IonBuilder.cpp#3516. To mitigate this, move your throw statement into a utility function (see the throw sketch after this list).
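
A rough sketch of the singleton rule of thumb above; the names are illustrative, and only INFERFLAGS=result can confirm how a given build actually categorizes these:

    // Function statement inside an IIFE: a good candidate for the 'singleton' category.
    var renderer = (function () {
      function drawFrame () {
        // hot rendering code would live here
      }
      return { drawFrame: drawFrame };
    })();

    // Function expression passed as an argument: typically not a singleton.
    setInterval(function () {
      renderer.drawFrame();
    }, 16);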
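
A small illustration of the array deoptimization item above (the property name is arbitrary):

    var points = [10, 20, 30];   // dense, sequentially initialized: stays a fast array
    points.dirty = true;         // named (non-indexed) property: the array gets deoptimized
                                 // to a regular old Object (hash table)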
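
A minimal sketch of the throw-hoisting mitigation (throwRangeError and clamp are just illustrative names):

    // The throw lives in a small utility function...
    function throwRangeError (message) {
      throw new RangeError(message);
    }

    // ...so this hot function contains no throw statement and stays eligible for inlining.
    function clamp (value, min, max) {
      if (min > max)
        throwRangeError("min must not exceed max");
      return value < min ? min : (value > max ? max : value);
    }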

V8 (Google Chrome)

  • Compiled code produced by the V8 JIT lives on the JavaScript heap and as a result, if your functions are re-jitted or deoptimized, this can increase the amount of garbage on the heap (and possibly trigger collections?)
  • See http://www.youtube.com/watch?v=XAqIpGU8ZZk and http://www.youtube.com/watch?v=UJPdhx5zTaw for various tips
  • Any array instance that is not sequentially initialized may end up as a 'sparse array', which is basically a hash table. Whether or not this happens depends on heuristics based on the size of the array and its capacity (see the sparse-array sketch after this list). (Note that named properties can live on an array, unlike in SpiderMonkey.)
  • This page describes how to pass flags to the V8 runtime when starting Chrome. There are some flags you can pass to make the V8 runtime tell you when it fails to optimize a function. Unfortunately, these do not seem to be documented on the wiki, so see the blog posts linked in the next item.
  • This series of blog posts goes into depth on various V8 performance gotchas and describes how to diagnose some of the problems.
  • V8 cannot represent integers larger than 31 bits as an integer (they get promoted to floats).
  • Floating-point values are almost universally stored in the heap by V8 (which means each one is an allocation).
  • Any function containing a try { } block is never optimized, regardless of whether it has any catch or finally blocks (see the try/catch sketch after this list).
  • Functions that are too long (including comments and newlines/whitespace) are not inlined and may not be optimized.
  • Long-lived closures can hold onto variables from outer scopes that they never use, keeping otherwise-dead values alive as garbage. To address this, set any outer-scope variables you don't intend to use in the closure to null or undefined (see the closure sketch after this list).
  • Calling functions with different hidden classes from a single call site will deoptimize the call site. Setting/removing properties on a function instance (like debugName, displayName, or toString) will cause its hidden class to diverge from built-in functions. Some built-in functions like the result of .bind() also have different classes from normal functions. Strict mode functions have different hidden classes from non-strict functions. See this example.
  • V8 hidden classes maintain the name and order of the properties contained by an object, but not the actual types of those properties. One exception is that they also maintain the exact value of properties that are functions. This means that two objects with exactly the same property list can have separate hidden classes if one of their properties holds a different function (polymorphic inheritance, for example; see the hidden-class sketch after this list).
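
A small illustration of the sparse-array item above; the exact heuristics vary, so treat the comments as a sketch rather than a guarantee:

    var dense = [];
    for (var i = 0; i < 1000; i++)
      dense[i] = i;            // sequential writes: stays a fast, flat array

    var sparse = [];
    sparse[100000] = 1;        // writing far past the current length is likely to
                               // turn this into a 'sparse' (hash table) array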
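
The try-block item above only states the problem; a common workaround (an assumption here, not something the original spells out) is to keep the try in a thin wrapper so the hot code lives in a separate function that can still be optimized:

    // This wrapper will never be optimized, because it contains a try block...
    function guarded (fn, arg) {
      try {
        return fn(arg);
      } catch (e) {
        return null;
      }
    }

    // ...but the hot code lives here, in a function with no try block.
    function hotPath (n) {
      var total = 0;
      for (var i = 0; i < n; i++)
        total += i;
      return total;
    }

    guarded(hotPath, 100000);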
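
A minimal sketch of the closure mitigation described above (makeHandler and its variables are illustrative):

    function makeHandler () {
      var smallConfig = { retries: 3 };
      var scratch = new Array(1000000);       // only needed while setting things up

      smallConfig.checksum = scratch.length;  // pretend setup work that uses scratch
      scratch = null;                         // don't let the returned closure keep it alive

      return function handle () {
        return smallConfig.retries;           // the closure only needs smallConfig
      };
    }

    var handler = makeHandler();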
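
A small illustration of the function-valued-property case from the last item above (the objects are made up for the example):

    function fly ()  { return "flying"; }
    function swim () { return "swimming"; }

    // Same property names, in the same order...
    var bird = { name: "gull",   move: fly  };
    var fish = { name: "salmon", move: swim };
    // ...but because 'move' holds a different function in each object,
    // the two objects can end up with separate hidden classes.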

Chakra (Internet Explorer 9+)

See http://msdn.microsoft.com/en-us/library/windows/apps/hh781219.aspx.

Vaguely useful profiling tools

Mozilla Firefox

  • SPS: A sampling profiler that can record mixed native/JavaScript stacks so you can see why a particular JS function is slow. Lets you share profiles on the web with other people! Amazing! Default accuracy is somewhat low, but you can adjust the sampling rate and recording size in about:config.
  • JIT Inspector: Tells you various things about what the SpiderMonkey JIT believes about your code (and to an extent, how it is performing). Completely inscrutable unless you read this PDF, at which point it is only partially inscrutable. Also, doesn't display actual numbers anywhere or let you save profiles... Activating this deoptimizes your JS!
  • Firebug: If you can get the profiler to work instead of crashing the browser entirely, apparently it's pretty good. I've never gotten it to work.

Google Chrome

  • Web Inspector's Profiles tab: This is a sampling profiler with poor accuracy that often omits entire native call paths from your profiles, so the data is often a lie. Simply opening the Web Inspector deoptimizes your JS, and activating the profiler double-deoptimizes it.
  • chrome://tracing/: You can instrument your JS to show up here via console.time/console.timeEnd. Can only trace a few seconds at a time, but is fairly accurate. You can save/load traces. A minimal example of the console.time instrumentation appears at the end of this page.
  • WebGL Inspector: Let's pretend for a moment that WebGL performance is JS performance, since it sort of is. WebGL inspector gives you pretty accurate timings and recordings for your WebGL calls, so you can combine it with the built-in profiler to understand why your renderer is 50x slower than the one you wrote in Python.
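
A minimal example of the console.time/console.timeEnd instrumentation mentioned in the chrome://tracing/ item (the label and the busy loop are arbitrary):

    console.time("busy-loop");
    for (var i = 0, total = 0; i < 1e6; i++)
      total += i;
    console.timeEnd("busy-loop");   // the elapsed time shows up in the console / trace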