This repository has been archived by the owner on Mar 30, 2020. It is now read-only.

HHVM experimental LLVM code generator #160

Closed
denji opened this issue Apr 25, 2015 · 0 comments

HHVM includes an experimental LLVM code generator that currently requires a custom version of LLVM.

We have introduced a number of extensions to LLVM that are highly unstable. Any of them could be modified or dropped in the future.
Once we finalize our requirements for LLVM, we plan to upstream the changes. For the moment we keep the LLVM patch in the tools/llvm directory so that everyone can build and use the LLVM backend with HHVM.

The included patch has been applied and verified to work with trunk@215415 (git 23761603fe609770cc6fd3e42edf96b273265b7d).

In case you need to build clang with this LLVM branch, the corresponding clang commit is trunk@215290.

To use the patch with git:

  $ git clone https://github.com/llvm-mirror/llvm
  $ cd llvm
  $ git checkout -b hhvm 2376160
  $ patch -p0 < ${path_to_hhvm}/tools/llvm/llvm.patch
  ...

Our current list of extensions to LLVM includes the smashable attribute, locrec metadata, HHVM-specific calling conventions, and HHVM-specific optimizations.

I. Location records.

The HHVM runtime needs to patch tail calls generated by the backend, so it must be able to identify the code locations generated for these instructions.
We considered using LLVM's patchpoint interface for this, but given its limitations (mainly the lack of tail call support and its negative effect on optimizations) we decided to use a special kind of metadata that is propagated to the MC level (the current implementation piggybacks on debug info).

As of LLVM 3.5 the syntax for the metadata is as follows:

musttail call void @foo(i64 %val), !locrec !{i32 42}

Note that there is no direct relationship between LLVM instructions and instructions at the machine level. After optimizations, instructions could be eliminated, combined, cloned, etc. Similarly, LLVM function code could be inlined, cloned, or outlined. For this reason we do not include function context in location records, and we expect locrec IDs to be unique within a module.
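Since locrec IDs are expected to be unique within a module, a frontend may want to check the IR it emits before handing it to LLVM. The following is a minimal sketch of such a check, assuming the LLVM 3.5 textual syntax shown above. The function name `duplicate_locrec_ids` and the callees `@bar` and `@baz` are hypothetical and not part of HHVM or LLVM.

```python
import re
from collections import Counter

# Matches the LLVM 3.5 textual form shown above: !locrec !{i32 <id>}
LOCREC_RE = re.compile(r"!locrec\s+!\{\s*i32\s+(\d+)\s*\}")


def duplicate_locrec_ids(ir_text):
    """Return the sorted locrec IDs that occur more than once in the IR text."""
    counts = Counter(int(m) for m in LOCREC_RE.findall(ir_text))
    return sorted(i for i, n in counts.items() if n > 1)


# Hypothetical module text: ID 42 is (incorrectly) reused by two calls.
ir = """
musttail call void @foo(i64 %val), !locrec !{i32 42}
musttail call void @bar(i64 %val), !locrec !{i32 43}
musttail call void @baz(i64 %val), !locrec !{i32 42}
"""
print(duplicate_locrec_ids(ir))
```

This only checks the IR as written by the frontend. Duplicates introduced later by cloning or inlining inside LLVM are expected and are not an error.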

Location records are written into .llvm_locrecs section with the following format:

Header {
  uint8   : Major Version (1)
  uint8   : Minor Version (0)
  uint16  : <reserved>
  uint32  : NumRecords
}

LocationRecord[NumRecords] {
  uint64  : Address           ; absolute address of the instruction
  uint32  : LocRec ID         ; ID of !locrec
  uint8   : Size              ; size of the instruction
  uint8   : <reserved>
  uint16  : <reserved>
}

For the reasons mentioned above, a single instruction marked with a unique locrec ID in LLVM IR may produce multiple records, or no records at all.
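To illustrate the layout, here is a minimal sketch of a reader for the raw bytes of a .llvm_locrecs section. It assumes little-endian byte order and a packed layout with no padding between fields. Neither is stated above, although both are natural on x86-64. The names `parse_locrecs`, `HEADER`, and `RECORD` are hypothetical.

```python
import struct
from collections import defaultdict

# Assumption: little-endian, packed layout with no padding between fields.
HEADER = struct.Struct("<BBHI")   # major, minor, <reserved>, NumRecords
RECORD = struct.Struct("<QIBBH")  # Address, LocRec ID, Size, <reserved>, <reserved>


def parse_locrecs(section):
    """Map each locrec ID to the list of (address, size) records found for it."""
    major, minor, _reserved, num_records = HEADER.unpack_from(section, 0)
    if (major, minor) != (1, 0):
        raise ValueError(f"unsupported locrecs version {major}.{minor}")
    records = defaultdict(list)
    offset = HEADER.size
    for _ in range(num_records):
        address, locrec_id, size, _r8, _r16 = RECORD.unpack_from(section, offset)
        records[locrec_id].append((address, size))
        offset += RECORD.size
    return dict(records)


# Synthetic section: locrec 42 shows up twice, as could happen after cloning.
blob = HEADER.pack(1, 0, 0, 2)
blob += RECORD.pack(0x1000, 42, 5, 0, 0)
blob += RECORD.pack(0x2040, 42, 5, 0, 0)
print(parse_locrecs(blob))
```

Grouping by ID rather than assuming one record per ID reflects the point above: an ID may map to several machine instructions, and an ID that was optimized away is simply absent from the result.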

II. Smashable attribute.

The 'smashable' attribute can be attached to LLVM 'call' and 'invoke' instructions. It guarantees that the resulting machine-level instruction(s) (if any) will not straddle a cache-line boundary and can thus be safely smashed in a multi-threaded environment. For example:

  musttail call void @foo(i64 %val) readonly smashable
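The cache-line condition can be made concrete with a little arithmetic. The sketch below assumes a 64-byte cache line, which is typical for x86-64 but not stated above. The padding helper shows one possible way to satisfy the constraint and is not necessarily what the patch does. Both function names are hypothetical.

```python
CACHE_LINE = 64  # assumption: typical x86-64 cache-line size in bytes


def straddles_cache_line(address, size, line=CACHE_LINE):
    """True if the bytes [address, address + size) span more than one cache line."""
    return address // line != (address + size - 1) // line


def padding_to_avoid_straddle(address, size, line=CACHE_LINE):
    """Bytes of padding to place before the instruction so it fits in one line."""
    if size > line:
        raise ValueError("instruction is larger than a cache line")
    if straddles_cache_line(address, size, line):
        # Push the instruction to the start of the next cache line.
        return line - address % line
    return 0


# A 5-byte instruction starting 2 bytes before a line boundary straddles it.
print(straddles_cache_line(0x103E, 5))
print(padding_to_avoid_straddle(0x103E, 5))
```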

III. Calling conventions.

Four new calling conventions have been added to cover different calling scenarios in HHVM.

IV. Optimizations.

HHVM-specific optimizations include conditional tail call optimizations and a prototype for hot-cold code splitting based on block frequency.

V. Misc.

We disable generation of MCJIT stubs to avoid gaps in generated code.

1-byte alignment (i.e. no alignment) can be specified for functions that don't have the OptimizeForSize attribute.

To keep functions from different modules placed as tightly as possible and yet to satisfy internal alignment requirements, we use module flags to tell LLVM that the code we are about to output will be skewed with respect to a requested section alignment.
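One possible reading of "skewed" is that the section will eventually be placed at an address that is a known number of bytes past an aligned boundary, so alignment padding has to be computed against the final address rather than the offset within the section. The sketch below illustrates that reading only. The function name and the `skew` parameter are hypothetical, and the actual module flags used by the patch are not shown here.

```python
def padding_for_alignment(offset, align, skew=0):
    """Padding so that (skew + offset + padding) is a multiple of align.

    offset: position of the next function within the section
    skew:   assumed distance of the section's final base address past an
            align-byte boundary (hypothetical parameter for illustration)
    """
    return -(skew + offset) % align


# With no skew, offset 0 is already 16-byte aligned.
print(padding_for_alignment(0, 16))
# With a skew of 8, offset 0 lands at 8 mod 16, so 8 bytes of padding are needed.
print(padding_for_alignment(0, 16, skew=8))
```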

When emitting code for HHVM tracelets, identified by the HHVM_TC calling convention, the generated stack prologue and epilogue are modified to take into account the fact that the stack pointer is aligned differently on entrance to the tracelet. The standard X86_64 ABI puts the return value at 16-byte alignment, while for HHVM tracelets this alignment is offset by 8 bytes.

https://github.com/facebook/hhvm/tree/master/hphp/tools/llvm
