Cheerp PreExecuter

sderossi edited this page Feb 18, 2016 · 1 revision

Introduction

The standard ways to reduce the size of JavaScript code are compression (using a packer, or gzip/brotli at the HTTP server) and minification (e.g. removing comments and whitespace from the source and shortening variable names). Minification has been an integral part of Cheerp since the 1.1 release.

More aggressive, commonly used methods are constant propagation, dead code elimination and function inlining. These optimizations typically take a long time to run and make the optimized code much harder to debug, so they are usually applied only to release builds.
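As a rough illustration of how these three optimizations work together, consider the sketch below. All names in it (`scale`, `kDebug`, `area`, `area_optimized`) are invented for this example and are not part of Cheerp. An optimizer can inline `scale`, propagate the constant `kDebug`, and then drop the unreachable branch, leaving something equivalent to `area_optimized`.

```cpp
#include <cassert>

// Hypothetical helper; small enough that an optimizer will usually inline it.
static int scale(int x, int factor) { return x * factor; }

// Hypothetical flag whose value is known at compile time.
static const bool kDebug = false;

// What the programmer writes.
int area(int w, int h) {
    int result = scale(w, h);  // candidate for function inlining
    if (kDebug) {              // constant propagation proves this is false...
        result += 1000;        // ...so this statement is dead code
    }
    return result;
}

// Roughly what remains after inlining, constant propagation and
// dead code elimination have been applied to area().
int area_optimized(int w, int h) { return w * h; }

// Quick self-check that both versions agree on a sample input.
void check_area() { assert(area(3, 4) == area_optimized(3, 4)); }
```

The two functions behave identically, but the second one needs neither the helper nor the flag, which is where the size saving comes from.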

Cheerp PreExecuter

Starting from Cheerp 1.2, our compiler includes the PreExecuter, an approach to further reducing the JavaScript output size that relies on a compile-time execution of portions of the code, and their substitution with constant values. Applying the PreExecuter leads to a significant reduction in the output size, particularly for small programs. Moreover, it also reduces the time-to-main, since global memory is directly initialized with the saved values instead of computed at runtime.

The PreExecuter is currently an experimental feature of Cheerp and must be enabled with the --cheerp-preexecute command line flag. It is already applied by default to all the standard libraries shipped with Cheerp.

Dead code elimination is not sufficient

The Cheerp PreExecuter goes a step further compared to other aggressive minimization methods. It optimizes the output by selectively executing portions of code at compile-time, recording the resulting global memory state, then substituting the code with a global memory initialization using the saved values.

Currently, Cheerp combines all C and C++ symbols into a single JavaScript output file. This is the equivalent of statically linking a program in the native world. Statically linking allows Cheerp to do global dead code elimination and whole program optimization during link time optimization. These optimizations are applied when -O1 or higher is used.

Even when global dead code elimination is applied, Cheerp compiles a simple hello world C++ program to about 1 MB of JavaScript code. One of the reasons is that Cheerp links against the standard C++ library (libstdc++), which contains a complex constructor that initializes all its memory manually. In the native world, this constructor is executed at run time; it is registered in the symbol llvm.global_ctors, which marks the constructor as ‘used’ at compile time. As a result, it is not possible to remove the constructors at compile time using dead code elimination.

Although constructors cannot be eliminated (since that would change the semantics of the program), they can be evaluated at compile time. The constructors of libstdc++ do not depend on any run time input (e.g. environment variables or command line flags), which allows the compiler to compute the initialized values of the constructors at compile time. The initialized values are then stored in memory and loaded at run time.
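The idea can be sketched in plain C++. The `SquareTable` type below is hypothetical and much simpler than the real library constructors. `runtime_table` needs a dynamic initializer (the kind of code that ends up registered in `llvm.global_ctors`), while `precomputed_table` holds the values that initializer would produce, written out as constant data. Here the translation is done by hand, only to show that the two forms are equivalent; it is not meant to describe how the PreExecuter is implemented.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type: its constructor fills the table with a loop, so a
// global of this type needs code to run before the program starts.
struct SquareTable {
    int values[8];
    SquareTable() {
        for (int i = 0; i < 8; i++)
            values[i] = i * i;
    }
};

// Dynamic initialization: the constructor call is emitted as startup code.
// (An optimizer may fold a loop this simple on its own; real library
// constructors are far more complex.)
SquareTable runtime_table;

// The same memory contents expressed as constant data: no startup code,
// only an initialized block of memory.
const int precomputed_table[8] = {0, 1, 4, 9, 16, 25, 36, 49};

// Verify that running the constructor yields exactly the saved values.
void check_tables() {
    for (std::size_t i = 0; i < 8; i++)
        assert(runtime_table.values[i] == precomputed_table[i]);
}
```

Because the constructor depends on no run-time input, replacing it with the saved values does not change what the program observes.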

Evaluating 32-bit code in a 64-bit host OS

When Cheerp compiles C++ code to JavaScript, it uses a 32-bit platform data layout. Because of this, evaluating the constructors on a 64-bit operating system is not trivial. For example, when reading or writing memory, the bit width determines how much data has to be read or written. To simplify initial development, a 32-bit chroot was used to emulate a 32-bit environment.
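To make the bit-width issue concrete, the sketch below models target memory as a byte buffer and always reads and writes exactly 4 bytes for a 32-bit value, regardless of the host's pointer size. The buffer and the helper names (`target_memory`, `store32`, `load32`) are illustrative assumptions, not the interpreter's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative stand-in for the target's memory.
static unsigned char target_memory[64];

// Size of a pointer in the 32-bit data layout, in bytes.
constexpr std::uint32_t kTargetPointerSize = 4;

// Store exactly 4 bytes at the given target address.
void store32(std::uint32_t addr, std::uint32_t value) {
    assert(addr + sizeof(value) <= sizeof(target_memory));
    std::memcpy(target_memory + addr, &value, sizeof(value));
}

// Load exactly 4 bytes from the given target address.
std::uint32_t load32(std::uint32_t addr) {
    assert(addr + sizeof(std::uint32_t) <= sizeof(target_memory));
    std::uint32_t value;
    std::memcpy(&value, target_memory + addr, sizeof(value));
    return value;
}
```

With this layout, a target `struct { char *p; int n; }` occupies 8 bytes: the pointer is written with `store32(base, ...)` and the integer with `store32(base + kTargetPointerSize, ...)`. The same struct compiled natively for a typical 64-bit host would occupy 16 bytes, which is why host-sized accesses cannot be used directly.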

Once the LLVM interpreter was correctly executing the LLVM intermediate representation of the constructors in a 32-bit environment, it was time to port the code to a 64-bit architecture. Since pointers occupy 32 bits in a 32-bit data layout, malloc() cannot be used to allocate memory: it may return addresses above 4 GB, and only 32 bits of a 64-bit pointer can be stored. Luckily, mmap() has a flag called MAP_32BIT that causes mmap() to return memory located in the low part of the address space (the first 2 GB on Linux x86-64), so that the address fits in 32 bits.
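A minimal sketch of such an allocation is shown below. The function names (`alloc_low`, `to_target_pointer`, `free_low`) are hypothetical and chosen for this example; only `mmap()`, `munmap()` and the `MAP_32BIT` flag come from the system headers. `MAP_32BIT` is specific to Linux on x86-64, so on other platforms this sketch simply reports failure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Allocate `size` bytes at an address that fits in 32 bits.
// Returns nullptr on failure or when MAP_32BIT is unavailable.
void *alloc_low(std::size_t size) {
#ifdef MAP_32BIT
    void *p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
#else
    (void)size;
    return nullptr;  // MAP_32BIT is not available on this platform
#endif
}

// Convert a host pointer into the 32-bit value stored in target memory.
std::uint32_t to_target_pointer(void *p) {
    std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(p);
    assert(raw <= UINT32_MAX);  // holds for memory returned by alloc_low()
    return static_cast<std::uint32_t>(raw);
}

// Release memory obtained from alloc_low().
void free_low(void *p, std::size_t size) { munmap(p, size); }
```

A pointer obtained this way survives the round trip through a 32-bit slot unchanged, which is not guaranteed for memory returned by `malloc()` on a 64-bit host.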

Effects of the PreExecuter - Benchmarks

In order to evaluate the effect of the PreExecuter, we will refer to two simple examples:

  1. A Hello World program.

#include <cheerp/clientlib.h>

void webMain() {
    client::console.log("Hello world!");
}

  2. A simple program which uses the standard library's vector class.

#include <cheerp/clientlib.h>

#include <vector>

class A {
public:
    const char *a;
    A(int i) { a = nullptr; }
};

void webMain()
{
    std::vector<A> v;
    client::console.log("sizeof(A):", sizeof(A), "== 4");
    client::console.log("v.end()-v.begin():", v.end()-v.begin(), "== 0");
    for(int i = 0; i < 10; i++)
        v.push_back(A(i));
    client::console.log("v.end()-v.begin():", v.end()-v.begin(), "== 10");
    client::console.log("v.size():", v.size(), "== 10");
    v.erase(v.begin() + 3);
    client::console.log("v.end()-v.begin():", v.end()-v.begin(), "== 9");
}

The following plot illustrates the output size of these examples with different levels of optimization: our standard minification available in Cheerp 1.1, a more aggressive baseline minimization included in Cheerp 1.2, and the maximum optimization including the PreExecuter.

[Figure: PreExecuter Benchmarks]

Compiling the hello world program with only -O3 and without minification results in a 1.3 MB JavaScript file. With minification, which is now enabled by default, the file size is roughly halved (659 KB). After applying the pre-execute pass, the file size drops to a mere 548 bytes. That’s three orders of magnitude smaller than the minified output.

The simple vector program also compiles to a large JavaScript output (665 KB). Performing the pre-execute pass reduces the size to 6968 bytes, a difference of two orders of magnitude. The relatively large file size compared to the hello world program is caused by std::vector’s push_back() method. That method includes a slow path that reallocates the array when it is full and another element is about to be added. The for-loop is also unrolled 10 times because its body is small, which increases the file size a little further.
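The shape of that code can be seen in a deliberately simplified growable array. `TinyVec` below is a hypothetical stand-in and not the library's `std::vector`; its doubling growth policy is an assumption made for the example. It only shows why `push_back()` carries reallocation logic even though the common case is a single store.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical, simplified growable array of ints.
struct TinyVec {
    int *data = nullptr;
    std::size_t size = 0;
    std::size_t capacity = 0;

    TinyVec() = default;
    TinyVec(const TinyVec &) = delete;             // keep ownership simple
    TinyVec &operator=(const TinyVec &) = delete;
    ~TinyVec() { std::free(data); }

    void push_back(int value) {
        if (size == capacity)  // slow path: storage is full
            grow();
        data[size++] = value;  // fast path: store into spare capacity
    }

    // Slow path: allocate a larger buffer, copy the elements, free the old one.
    void grow() {
        std::size_t new_capacity = capacity == 0 ? 1 : capacity * 2;
        int *new_data =
            static_cast<int *>(std::malloc(new_capacity * sizeof(int)));
        assert(new_data != nullptr);
        if (size != 0)
            std::memcpy(new_data, data, size * sizeof(int));
        std::free(data);
        data = new_data;
        capacity = new_capacity;
    }
};
```

Every call site of `push_back()` has to keep the `grow()` path reachable, so it cannot be removed even when, as in the benchmark, the number of insertions is small and fixed.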

Limitations

At the moment, the PreExecuter only evaluates constructors from the standard C++ library. In the future, it could be extended to optimize other portions of code as well.

As of Cheerp 1.2, the PreExecuter is only available on Linux, because it relies on mmap() with the MAP_32BIT flag to allocate 32-bit addresses on a 64-bit architecture. However, the standard libraries shipped with Cheerp for Windows and Mac OS X are already pre-executed, so users on those platforms get the same benefits.
