[Breaking] Introduce Bundle Compiler #631

Merged
merged 42 commits into from Aug 28, 2017

Conversation

@chadhietala
Member

commented Aug 25, 2017

This introduces an optimizing compiler for Handlebars templates known as the Bundle Compiler.

Architecture

The key insight of Glimmer is that Handlebars is a declarative programming language for building and updating DOM. By structuring web UI around Handlebars templates as the central abstraction, we can use advanced techniques from programming languages and compilers to significantly boost the performance of web applications in practice.

Because of this, Glimmer's architecture has more in common with compiler toolchains like clang/LLVM or javac/JVM than traditional JavaScript libraries.

At a high level, Glimmer is made up of two parts:

  1. The compiler, which turns templates into optimized binary bytecode.
  2. The runtime, which evaluates that bytecode and translates its instructions into
    things like creating DOM elements or instantiating JavaScript component classes.

Compiler

The compiler is responsible for turning your program's Handlebars templates into Glimmer binary bytecode.

Because Glimmer is an optimizing compiler, it must know about all of the templates in a program in order to understand how they work together. This is in contrast to transpilers like Babel, which can transform each file in isolation.

As the compiler traverses your application and discovers templates, it parses each one and creates an intermediate representation (IR). The IR is similar to the final bytecode program but contains symbolic references to external objects (other templates, helpers, etc.) that may not have been discovered yet.

Once all of the templates have been parsed into IR, the compiler performs a final pass that resolves symbolic addresses and writes the final opcodes into a shared binary buffer. In native compiler terms, you can think of this as the "linking" step that produces the final executable.
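The two passes described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `IRInstruction` shape, the opcode numbers, and the `link` function are hypothetical and are not the Bundle Compiler's actual data structures. The first pass lays out each template and records its start address; the second pass replaces symbolic template names with those addresses while writing into one shared buffer.

```typescript
// Hypothetical IR: an instruction is either a plain opcode with numeric
// operands, or an invocation whose target is still a symbolic template name.
type IRInstruction =
  | { kind: 'op'; opcode: number; operands: number[] }
  | { kind: 'invoke'; target: string };

// Made-up opcode numbers, used only in this sketch.
const OP_TEXT = 1;
const OP_INVOKE = 2;
const OP_RETURN = 3;

// One word for the opcode plus one word per operand.
// An invoke always has exactly one operand: the resolved address.
function encodedSize(ins: IRInstruction): number {
  return ins.kind === 'op' ? 1 + ins.operands.length : 2;
}

function link(templates: Map<string, IRInstruction[]>) {
  // Pass 1: decide where each template starts in the shared buffer.
  const addresses = new Map<string, number>();
  let size = 0;
  templates.forEach((ir, name) => {
    addresses.set(name, size);
    for (const ins of ir) size += encodedSize(ins);
  });

  // Pass 2: emit opcodes, swapping symbolic names for numeric addresses.
  const program = new Uint16Array(size);
  let pc = 0;
  templates.forEach((ir) => {
    for (const ins of ir) {
      if (ins.kind === 'op') {
        program[pc++] = ins.opcode;
        for (const operand of ins.operands) program[pc++] = operand;
      } else {
        const address = addresses.get(ins.target);
        if (address === undefined) {
          throw new Error(`Unresolved template: ${ins.target}`);
        }
        program[pc++] = OP_INVOKE;
        program[pc++] = address;
      }
    }
  });

  return { program, addresses };
}

// 'app' refers to 'user-profile' before that template has been laid out.
const templates = new Map<string, IRInstruction[]>([
  ['app', [
    { kind: 'invoke', target: 'user-profile' },
    { kind: 'op', opcode: OP_RETURN, operands: [] },
  ]],
  ['user-profile', [
    { kind: 'op', opcode: OP_TEXT, operands: [7] },
    { kind: 'op', opcode: OP_RETURN, operands: [] },
  ]],
]);

const linked = link(templates);
```

Because addresses are only known after every template has been sized, the forward reference from `app` to `user-profile` resolves without any special handling.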

This binary executable is saved to disk as a .gbx file that can be served to a browser and evaluated with the runtime.

But we're not quite done yet. The bytecode program will be evaluated in the browser where it needs to interoperate with JavaScript. For example, users implement their template helpers as JavaScript functions. In our compiled program, how do we know what function to call if the user types {{formatDate user.createdAt}}?

During compilation, Glimmer will assign unique numeric identifiers to each referenced external object (like a helper or component). We call these identifiers handles, and they are how we refer to "live" JavaScript objects like functions in the binary bytecode.

When evaluating Glimmer bytecode in the browser, instead of asking for the "formatDate" helper, the runtime might ask for the object with handle 4.

In order to satisfy this request, the compiler also produces a data structure called the external module table that maps each handle to its associated JavaScript object.

For example, imagine we compile a template that invokes two helpers, formatDate and pluralize. These helpers get assigned handles 0 and 1 respectively. In order to allow the runtime to turn those handles into the correct function object, the compiler might produce a map like this:

// module-table.ts
import formatDate from 'app/helpers/format-date';
import pluralize from 'app/helpers/pluralize';

export default [formatDate, pluralize];

With this data structure, we can easily implement a function that translates handles into
the appropriate live object:

import moduleTable from './module-table';

function resolveHandle<T>(handle: number): T {
  // The table holds heterogeneous objects, so the caller chooses T and we cast.
  return moduleTable[handle] as unknown as T;
}

You can think of the external module table as the bridge between Glimmer's bytecode and the JavaScript VM. We can compactly represent references to external objects in the bytecode using handles, and then rehydrate them later with minimal overhead.
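On the compiler side, handle assignment could look something like the sketch below. The `HandleAllocator` class and its method names are hypothetical, invented for illustration, and are not part of Glimmer's API. The one property it relies on is that a handle is simply the index of the module in the generated table.

```typescript
// Hypothetical compile-time helper: hands out one numeric handle per module
// path and reuses the handle when the same path is referenced again.
class HandleAllocator {
  private handles = new Map<string, number>();
  private paths: string[] = [];

  handleFor(modulePath: string): number {
    let handle = this.handles.get(modulePath);
    if (handle === undefined) {
      handle = this.paths.length; // next free index
      this.handles.set(modulePath, handle);
      this.paths.push(modulePath);
    }
    return handle;
  }

  // Generate module-table source in which array index === handle.
  toModuleTableSource(): string {
    const imports = this.paths.map((path, i) => `import m${i} from '${path}';`);
    const names = this.paths.map((_path, i) => `m${i}`);
    return [...imports, '', `export default [${names.join(', ')}];`].join('\n');
  }
}

const allocator = new HandleAllocator();
const formatDateHandle = allocator.handleFor('app/helpers/format-date');
const pluralizeHandle = allocator.handleFor('app/helpers/pluralize');
// A second reference to the same helper reuses its handle.
const formatDateAgain = allocator.handleFor('app/helpers/format-date');
const moduleTableSource = allocator.toModuleTableSource();
```

Deduplicating on the module path keeps the table compact: a helper used in a hundred templates still occupies one slot.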

Now that we have our compiled bytecode and the module table, we're ready to run our app in the browser, or any other JavaScript environment, like Node.js.

Runtime

During AoT compilation we produced not only the binary program and the handle table, but also a serialized constants pool, which contains things like user-space literals. These pre-computed data structures are all the information we need to render the application.

Prior to the introduction of the optimizing compiler, these data structures were built up at runtime. This effectively meant that we had to compile from the JSON wire format into the VM's bytecode before we knew what DOM to create. With the introduction of the Bundle Compiler, we forgo all runtime compilation and instead hydrate the VM with the pre-computed data structures. This has the advantage that we get to the point of DOM creation much sooner, which should reduce TTFP and TTI. It also means that the runtime library can become much smaller, as we no longer need the runtime compiler.

As a result of compiling to a binary format the template code does not have to go through the JavaScript Parse and Compile pipeline at runtime. Instead the VM can work directly off the binary using a Uint16Array view into the ArrayBuffer.
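As a rough illustration of working directly off the binary, the sketch below wraps an `ArrayBuffer` in a `Uint16Array` view and walks it. The opcode numbers and the `trace` function are assumptions made for this example; the real instruction encoding is not described here.

```typescript
// Made-up opcode numbers, used only in this sketch.
const OP_TEXT = 1;   // followed by one operand: an index into the constants pool
const OP_RETURN = 3; // no operands

// Stand-in for the bytes of a compiled program. In a browser these would
// typically arrive as the ArrayBuffer body of a network response.
const buffer = new Uint16Array([OP_TEXT, 7, OP_TEXT, 9, OP_RETURN]).buffer;

// Creating the view copies nothing and parses nothing: it only reinterprets
// the existing bytes as 16-bit words.
const program = new Uint16Array(buffer);

// Walk the program from a starting offset and describe each instruction.
function trace(words: Uint16Array, start: number): string[] {
  const log: string[] = [];
  let pc = start;
  while (pc < words.length) {
    const opcode = words[pc++];
    if (opcode === OP_TEXT) {
      log.push(`text constant #${words[pc++]}`);
    } else if (opcode === OP_RETURN) {
      log.push('return');
      break;
    } else {
      throw new Error(`Unknown opcode ${opcode} at offset ${pc - 1}`);
    }
  }
  return log;
}

const steps = trace(program, 0);
```

Note that a `Uint16Array` reads words in the platform's byte order, so a real on-disk format has to fix an endianness convention; this sketch sidesteps that by creating the buffer on the same machine.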

Once we have loaded the program, we simply set the program counter to the handle and start executing the VM. The pseudocode below sketches what this looks like.

render(handle: Handle, resolver: RuntimeResolver, program: Uint16Array, pool: ConstantsPool) {
  let { env } = this;
  let runtimeProgram = new RuntimeProgram(new RuntimeConstants(resolver, pool), program);
  let vm = LowLevelVM.initial(runtimeProgram, handle);
  let iterator = new TemplateIterator(vm);
  
  env.begin();

  let iteratorResult: IteratorResult<RenderResult>;

  do {
    iteratorResult = iterator.next() as IteratorResult<RenderResult>;
  } while (!iteratorResult.done);

  let result = iteratorResult.value;

  env.commit();

  return result; // Done rendering
}
wycats and others added 30 commits Aug 2, 2017
Initial extraction of the opcode compiler into its own package.

The bulk of this work is trimming down incidental dependencies that the
opcode compiler has on other runtime types. For the most part, they are
either completely unnecessary or can be easily dependency injected.

The next step is to get `runtime` to use this package.

The step after that is to get `bundle-compiler` to use this package to
produce a fully precompiled output.

The most notable change in the interfaces in this commit is that the
`compilation options`, which are now expected to be used in the eager
bundler as well, no longer include the full resolver. Instead, they
include a `getCapabilities` method that requires only the capabilities
for a given component, and not access to arbitrary runtime objects.
This commit is further preparation for eager mode.

Previously, the opcode compiler was part of the runtime package. This
meant that it inadvertently had a number of dependencies (minor and
major) on concepts in the runtime package. In some cases, those
dependencies were inappropriate. In other cases, they simply didn't
belong in the runtime package.

This commit moves the entire opcode compiler into its own self-contained
package.

Some notable improvements:

- There is now a `ParsedLayout` type that represents the deserialized
  template from the wire format. For silly reasons, this type didn't
  exist before, and the constituent parts of the parsed layout were
  passed around in an ad-hoc fashion. Moving to this type simplified
  many of the surrounding objects considerably.
- There is now a `TemplateOptions` type that reflects what is needed to
  compile a template: the Program (array of int32) to compile into, any
  syntactic macros, and a static module lookup interface.
- There is now a `CompileTimeLookup` interface that is a subset of the
  `Resolver` interface and is limited to static information only.
  Previously, the compiler had access to instantiate runtime objects,
  which it only used to get (statically available, ideally) information
  about component capabilities. That access was refactored into two new
  methods: getCapabilities and getLayout, which are expected to be
  resolvable statically in the future.
- The debugging aspects of the opcode compiler (including the code to
  disassemble and log a program) were moved into the opcode-compiler
  package.
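To make the idea of a static-only lookup concrete, here is a small sketch. Only the method names `getCapabilities` and `getLayout` come from the commit message above; their signatures, the `CapabilitiesSketch` shape, the `StaticLookup` class, and `planInvocation` are all assumptions made for illustration.

```typescript
// Hypothetical capability record; the field is illustrative only.
interface CapabilitiesSketch {
  dynamicLayout: boolean;
}

// A compile-time lookup exposes data only. It cannot instantiate anything.
interface CompileTimeLookupSketch {
  getCapabilities(handle: number): CapabilitiesSketch;
  getLayout(handle: number): string | null;
}

// Backed entirely by plain tables, so it could run in a build tool
// with no runtime objects available.
class StaticLookup implements CompileTimeLookupSketch {
  private capabilities: CapabilitiesSketch[];
  private layouts: (string | null)[];

  constructor(capabilities: CapabilitiesSketch[], layouts: (string | null)[]) {
    this.capabilities = capabilities;
    this.layouts = layouts;
  }

  getCapabilities(handle: number): CapabilitiesSketch {
    const capabilities = this.capabilities[handle];
    if (!capabilities) throw new Error(`Unknown component handle ${handle}`);
    return capabilities;
  }

  getLayout(handle: number): string | null {
    const layout = this.layouts[handle];
    return layout === undefined ? null : layout;
  }
}

// The compiler's decision depends only on static information.
function planInvocation(lookup: CompileTimeLookupSketch, handle: number): string {
  if (lookup.getCapabilities(handle).dynamicLayout) {
    return 'resolve layout at runtime';
  }
  const layout = lookup.getLayout(handle);
  return layout === null ? 'no layout' : `compile layout ahead of time: ${layout}`;
}

const lookup = new StaticLookup(
  [{ dynamicLayout: false }, { dynamicLayout: true }],
  ['<h1>Hi</h1>', null]
);
const staticPlan = planInvocation(lookup, 0);
const dynamicPlan = planInvocation(lookup, 1);
```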
Prior to this commit, parts of the compiler hardcoded references to the LazyOpcodeBuilder. To be able to switch between eager and lazy compilation, we need to pass an OpcodeBuilderConstructor function into all the places where we were hardcoding the LazyOpcodeBuilder, specifically CompilableTemplate and WrappedTemplate. By treating the Builder's concrete type as unknown, we can swap between builders.
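The pattern of injecting a builder constructor can be sketched as follows. All names ending in `Sketch` are hypothetical stand-ins and do not reflect the real classes' APIs; the point is only that the template depends on a constructor type rather than on a concrete builder.

```typescript
// Hypothetical builder interface; the real builders have a much larger API.
interface BuilderSketch {
  invokeBlock(name: string): string;
}

// A constructor type: anything that can be `new`-ed into a BuilderSketch.
type BuilderConstructorSketch = new () => BuilderSketch;

class LazyBuilderSketch implements BuilderSketch {
  invokeBlock(name: string): string {
    return `defer compilation of ${name}`;
  }
}

class EagerBuilderSketch implements BuilderSketch {
  invokeBlock(name: string): string {
    return `compile ${name} now`;
  }
}

// The template never names a concrete builder class.
class TemplateSketch {
  private Builder: BuilderConstructorSketch;
  private blockName: string;

  constructor(Builder: BuilderConstructorSketch, blockName: string) {
    this.Builder = Builder;
    this.blockName = blockName;
  }

  compile(): string {
    const builder = new this.Builder();
    return builder.invokeBlock(this.blockName);
  }
}

// The same template code runs in either mode, depending on what is injected.
const lazyPlan = new TemplateSketch(LazyBuilderSketch, 'main').compile();
const eagerPlan = new TemplateSketch(EagerBuilderSketch, 'main').compile();
```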
More progress made on the bundle compiler.

We're now testing the whole flow of bundle compiler -> runtime.
There's still a lot of cruft, missing features, and type errors, but the
first blob of tests are passing in bundle compilation mode!
This commit restructures the tests (and some code) to allow the
rendering test suite to be written once and used with rehydration, using
the lazy builder, and using the bundle compiler.

This is accomplished by moving the environment specific questions into a
RenderDelegate, which is responsible for:

- constructing the initial element to render into
- registering components
- registering helpers
- rendering a template for a context into an element
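The four responsibilities listed above map naturally onto an interface. The sketch below is an assumption about what such a delegate could look like, not the actual `RenderDelegate` signature: the method names, the generic element type, and the toy string-based implementation are all invented so that the example runs without a DOM.

```typescript
type HelperSketch = (value: unknown) => unknown;

// One method per responsibility. E is whatever "element" means in that mode.
interface RenderDelegateSketch<E> {
  getInitialElement(): E;
  registerComponent(name: string, layout: string): void;
  registerHelper(name: string, helper: HelperSketch): void;
  renderTemplate(template: string, context: Record<string, unknown>, element: E): void;
}

// Toy delegate that "renders" into a string holder. It understands only
// {{key}} and {{helper key}}; registered components are stored but unused.
class ToyDelegate implements RenderDelegateSketch<{ html: string }> {
  private helpers = new Map<string, HelperSketch>();
  private components = new Map<string, string>();

  getInitialElement(): { html: string } {
    return { html: '' };
  }

  registerComponent(name: string, layout: string): void {
    this.components.set(name, layout);
  }

  registerHelper(name: string, helper: HelperSketch): void {
    this.helpers.set(name, helper);
  }

  renderTemplate(template: string, context: Record<string, unknown>, element: { html: string }): void {
    element.html = template.replace(
      /\{\{(?:(\w+) )?(\w+)\}\}/g,
      (_match: string, helperName: string | undefined, key: string) => {
        const value = context[key];
        if (helperName === undefined) return String(value);
        const helper = this.helpers.get(helperName);
        if (!helper) throw new Error(`Unknown helper: ${helperName}`);
        return String(helper(value));
      }
    );
  }
}

// A test written once against the interface, reusable with any delegate.
function sharedHelperTest<E>(delegate: RenderDelegateSketch<E>, read: (element: E) => string): string {
  delegate.registerHelper('upper', (value) => String(value).toUpperCase());
  const element = delegate.getInitialElement();
  delegate.renderTemplate('Hello, {{upper name}}!', { name: 'Glimmer' }, element);
  return read(element);
}

const rendered = sharedHelperTest(new ToyDelegate(), (element) => element.html);
```

A rehydration, lazy, or bundle-compiling delegate would each supply its own implementation, while `sharedHelperTest` stays unchanged.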

At the moment, three rehydration tests fail, but I'm not entirely sure
how they were ever passing. They involve quirks of browser parsing (for
example, <table><!-- comment --><tr></tr><!-- comment --></table> is
parsed as <table><!-- comment --><tbody><tr></tr><!-- comment
--></tbody></table>, which mis-nests rehydration markers).

We also have tests that confirm that <option selected={{value}}> always
produces <option> even when selected is truthy. These tests seem like
they're testing quirks of the current attribute implementation, when SSR
would of course need to retain those attributes.

I might have made a mistake, or these tests might not have been running
correctly before.
Ported the component test case from runtime to a component suite and
then set it up to use the `BundlingRenderDelegate`.

To make the tests pass, the bundling tests implement a reference implementation
of the compile-time and runtime environments designed to work together. These
tests do *not* use the TestEnvironment, which is highly coupled to lazy mode.

The next steps are to finish porting over the rest of the suites that already
subclass AbstractRenderTest, and once done, port the rest of the component
tests to work with that system as suites.
But not {{ember-components}} or {{component helper}} yet.
Only one of these is strictly required and imposes a fair bit of complexity on the consuming side.
This makes it easier to use the utility function without conflicting with local variables named `specifier`
The bundle compiler is in some sense like a map of specifier -> template, so it makes sense to make the argument whose identity is meaningful the first argument.
After calling `compile()`, it's not totally obvious what the consumer is supposed to serialize. This commit changes `compile()` to return a `BundleCompilationResult`, which contains the serializable heap and pool. We may also want to consider returning the various handle maps as well.
wycats and others added 12 commits Aug 24, 2017
These tests haven't been run in a long time and many of them are no
longer applicable.

If somebody wants to resuscitate them, feel free to update them.
This commit adds support for both {{curly-component}}s and {{component
helper}} invocation by implementing the rest of the test harness in the
bundle compiler mode.
Lazy mode does static invocation by pushing the uncompiled block into
the constants, then emitting a CompileBlock instruction, which puts
the handle on the stack, followed by an InvokeVirtual instruction, which
invokes the handle at the top of the stack.

AOT mode compiles the blocks at compile time, so they can include the
handle in the instruction itself.

This commit distinguishes between InvokeVirtual, which expects a Handle
at the top of the stack (which is also necessary to yield to blocks),
and InvokeStatic, which includes the Handle in the opcode, and is able
to avoid using the stack.
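The difference between the two instructions can be shown with a toy interpreter. The opcode numbers, the `PUSH` instruction (standing in for whatever places a handle on the stack, such as CompileBlock in lazy mode), and the `run` function are assumptions for this sketch and do not mirror the real VM.

```typescript
// Made-up opcode numbers, used only in this sketch.
const PUSH = 0;           // operand: a handle to push onto the stack
const INVOKE_VIRTUAL = 1; // no operand: pops the handle from the stack
const INVOKE_STATIC = 2;  // operand: the handle itself

function run(program: Uint16Array, blocks: string[]) {
  const stack: number[] = [];
  const invoked: string[] = [];
  let stackWrites = 0;
  let pc = 0;

  while (pc < program.length) {
    const opcode = program[pc++];
    switch (opcode) {
      case PUSH:
        stack.push(program[pc++]);
        stackWrites++;
        break;
      case INVOKE_VIRTUAL: {
        // The handle is only known at runtime, so it comes from the stack.
        const handle = stack.pop();
        if (handle === undefined) throw new Error('InvokeVirtual with empty stack');
        invoked.push(blocks[handle]);
        break;
      }
      case INVOKE_STATIC:
        // The handle was known at compile time, so it is read from the
        // instruction stream and the stack is never touched.
        invoked.push(blocks[program[pc++]]);
        break;
      default:
        throw new Error(`Unknown opcode ${opcode}`);
    }
  }

  return { invoked, stackWrites };
}

const blocks = ['header', 'body'];
const lazyStyle = run(new Uint16Array([PUSH, 1, INVOKE_VIRTUAL]), blocks);
const aotStyle = run(new Uint16Array([INVOKE_STATIC, 1]), blocks);
```

Both programs invoke the same block, but the static form is one word shorter and performs no stack traffic.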
@chadhietala chadhietala merged commit fde5360 into master Aug 28, 2017
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed
chadhietala added a commit that referenced this pull request Aug 28, 2017
[Relies on #631] Move Migrated Tests To Suites
@chadhietala chadhietala deleted the bundle-compiler-wip branch Oct 3, 2017
5 participants