PHP 8.x → LLVM IR → native ELF/EXE/Mach-O
First true AOT pipeline that skips C as an intermediate step.
Traditional “compile PHP” efforts transpile to C first. php2ir compiles directly to LLVM IR, producing fast native binaries for Linux, macOS, and Windows without a C intermediary. This unlocks LTO, PGO, and modern LLVM optimizations while keeping the pipeline simple and deterministic.
- Direct AOT: PHP 8.x source → LLVM IR → native binary (ELF/EXE/Mach-O)
- Zero C shim: no C transpile stage, no platform-specific codegen glue
- SSA at the source level: aggressive constant folding, DCE, inlining
- Mixed runtime modes: static (no VM) or minimal runtime shims
- Interop: FFI for native calls (dlopen/dlsym on *nix, Win32 on Windows)
- Portable toolchain: clang/llc/lld or llvm-monolithic toolchains
- Deterministic builds: reproducible, hermetic Docker images
- Unit + IR tests: golden IR snapshots to catch regressions
Early alpha. The compiler supports a pragmatic PHP subset + a growing standard library. See Supported PHP below.
# 1) Prereqs (LLVM 17+ recommended)
# macOS (brew):
brew install llvm@17
# Linux (apt):
sudo apt-get install -y llvm lld clang
# 2) Clone
git clone https://github.com/makalin/php2ir.git
cd php2ir
# 3) Build compiler
make build # or: cargo build --release (if using Rust toolchain)
# 4) Compile PHP → native
./target/release/php2ir examples/hello.php -o hello
./hello
Docker:
docker build -t php2ir .
docker run --rm -v "$PWD":/w -w /w php2ir ./php2ir examples/hello.php -o hello
<?php
function greet(string $name): string {
return "Hello, $name!";
}
echo greet("World"), PHP_EOL;
Compile:
php2ir hello.php -o hello
./hello
- Scalars: int, float, bool, string
- Arrays: packed arrays, associative arrays (lowered to hashmap vectors)
- Control flow: if/else, while/for/foreach, switch, match
- Functions: user functions, default params, return types, recursion
- Closures: by-value capture (by-ref WIP)
- OOP: classes, properties, methods, new, visibility (runtime-enforced), static
- Attributes: parsed, exposed to IR metadata (custom passes possible)
- Exceptions: try/catch/finally (zero-cost where available)
- I/O: echo, basic filesystem APIs via runtime shims
- FFI: call native functions (see Interop)
Not yet: fibers, generators, dynamic properties (deprecated), traits (partial), enums (parsing OK, codegen WIP), reference (&) semantics (partial), magic methods (partial), JIT (not applicable), full ext/* set.
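As a rough illustration of the subset listed above, the following sketch combines typed functions, match, a class with a private property, a by-value closure, and try/catch/finally. The names `Counter`, `classify`, and `run` are made up for this example, and it assumes the base `Exception` class is available in the php2ir standard library, which this README does not state explicitly.

```php
<?php
// Illustrative only: sticks to features from the "supported" list.
// Assumption: the base Exception class is provided by the runtime.

class Counter {
    private int $count = 0;

    public function add(int $n): int {
        if ($n < 0) {
            throw new Exception("negative step: $n");
        }
        $this->count += $n;
        return $this->count;
    }
}

function classify(int $n): string {
    return match (true) {
        $n === 0 => "zero",
        $n % 2 === 0 => "even",
        default => "odd",
    };
}

function run(array $steps): array {
    $counter = new Counter();
    $prefix = "total=";
    // By-value capture: $prefix is copied into the closure.
    $format = function (int $total) use ($prefix): string {
        return $prefix . $total . " (" . classify($total) . ")";
    };

    $lines = [];
    foreach ($steps as $step) {
        try {
            $lines[] = $format($counter->add($step));
        } catch (Exception $e) {
            $lines[] = "error: " . $e->getMessage();
        } finally {
            $lines[] = "step done";
        }
    }
    return $lines;
}

foreach (run([1, 2, -1]) as $line) {
    echo $line, PHP_EOL;
}
```

The same file runs unchanged under the stock PHP interpreter, which makes it a convenient way to compare interpreter output against the compiled binary.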
PHP source
└─▶ Parser (php-src AST or nikic/php-parser)
└─▶ Normalizer (types, attributes, constant folding)
└─▶ High-level SSA (phi insertion, dominance, CPC)
└─▶ Lowering to LLVM IR (typed ops, GC barriers)
└─▶ LLVM passes (O2/LTO/PGO)
└─▶ lld → native binary
Key components:
- Front-end: AST import + semantic analysis (type hints + flow inference)
- IR Builder: high-level SSA → LLVM IR (call graph, inliner, DCE)
- Runtime: small libphp2ir for arrays/strings/hashmaps/exceptions/IO
- GC: configurable (ARC-like refcount default; optional Boehm/MC WIP)
- Linker: lld by default; link.exe supported on Windows
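To make the constant-folding step of the Normalizer concrete, here is a minimal sketch over a toy expression tree. The node shapes (`['int', n]`, `['var', name]`, `['bin', op, lhs, rhs]`) and the function `fold` are hypothetical and chosen only for this illustration; they are not php2ir's actual AST or API.

```php
<?php
// Hypothetical mini-AST, not php2ir's real data structures:
//   ['int', 42]               integer literal
//   ['var', 'x']              variable read
//   ['bin', '+', $lhs, $rhs]  binary operation (+, -, *)

function fold(array $node): array {
    if ($node[0] !== 'bin') {
        return $node; // literals and variables are already in simplest form
    }
    [, $op, $lhs, $rhs] = $node;
    $lhs = fold($lhs);
    $rhs = fold($rhs);

    // Fold only when both operands are compile-time integers.
    if ($lhs[0] === 'int' && $rhs[0] === 'int') {
        $value = match ($op) {
            '+' => $lhs[1] + $rhs[1],
            '-' => $lhs[1] - $rhs[1],
            '*' => $lhs[1] * $rhs[1],
            default => null, // unknown operator: leave the node alone
        };
        // PHP promotes overflowing integer arithmetic to float;
        // in that case the expression is kept unfolded.
        if (is_int($value)) {
            return ['int', $value];
        }
    }
    return ['bin', $op, $lhs, $rhs];
}

// (2 * 3) + x: only the left subtree is constant.
$partial = fold(['bin', '+', ['bin', '*', ['int', 2], ['int', 3]], ['var', 'x']]);
// (2 * 3) + 4: the whole tree collapses to one literal.
$full = fold(['bin', '+', ['bin', '*', ['int', 2], ['int', 3]], ['int', 4]]);
```

Here `$partial` keeps the addition but replaces `2 * 3` with the literal 6, while `$full` becomes a single integer node. A real normalizer would also have to respect PHP's type juggling and side effects, which this sketch ignores.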
make build
# binaries in ./target/release
- LLVM 16+ (17+ recommended), clang, llc, lld
- CMake (for runtime lib), Ninja (optional)
- PHP 8.x headers if building with php-src AST mode (optional)
- Rust 1.78+ or C++20 (depending on selected backend in Makefile.config)
php2ir <input.php> [-o <out>] [--emit-llvm] [--emit-llvm-only]
[--lto <thin|full>] [--pgo-gen|--pgo-use=<profdata>]
[--opt <O0|O1|O2|O3|Oz>] [--target <triple>]
[--stdlib <path>] [--no-rt] [--sanitize <address|ubsan>]
Examples:
# Emit IR only:
php2ir foo.php --emit-llvm -o foo.ll
# Native with ThinLTO at O3:
php2ir app.php --lto thin --opt O3 -o app
# Cross-compile (static):
php2ir svc.php --target x86_64-unknown-linux-gnu --opt O2 -o svc
<?php
#[ffi("libm.so.6", "double cos(double)")]
function cos_native(float $x): float {}
echo cos_native(0.0), PHP_EOL;
Compiling and linking auto-discovers libm via -lm (configurable).
- Strings: UTF-8, small-string optimization
- Arrays/Hashmaps: packed + dict with copy-on-write fast paths
- Exceptions: zero-cost tables (Itanium on *nix, SEH on Windows)
- IO: fopen/fread/fwrite, argv/env, timers
- Platform: POSIX & Win32 shims, high-res time, random
Toggle features in runtime/config.h (GC, SSO threshold, hash policy).
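Copy-on-write is only an implementation strategy; what a program can observe is PHP's value semantics for arrays. The sketch below shows the behavior any such runtime has to preserve. The function name `append_copy` is invented for this example, and the comment about where storage is separated describes the usual copy-on-write approach, not a confirmed detail of libphp2ir.

```php
<?php
// By-value array parameter: the callee's write must stay invisible to the caller.
function append_copy(array $items, int $value): array {
    // First write to the shared array. A copy-on-write runtime would
    // typically separate the storage at this point (assumption).
    $items[] = $value;
    return $items;
}

$original = [1, 2, 3];
$extended = append_copy($original, 4);

foreach ($original as $n) { echo $n, " "; }
echo PHP_EOL;
foreach ($extended as $n) { echo $n, " "; }
echo PHP_EOL;
```

The first loop still sees three elements and the second sees four, regardless of whether the runtime copied eagerly or lazily.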
- Type specialization and devirtualization (monomorphic hot paths)
- Escape analysis + stack promotion
- Interprocedural constant propagation and inlining
- Bounds-check hoisting for arrays/strings
- Optional -fexperimental-new-pass-manager pipelines
- LTO (Thin/Full), PGO (.profdata)
# Configure
cp Makefile.config.example Makefile.config
# edit LLVM_PREFIX, BUILD_BACKEND=(rust|cpp), DEFAULT_OPTS, etc.
# Build compiler + runtime
make clean && make -j
# Test
make test
- Unit tests: cargo test or ctest (backend-dependent)
- IR golden tests: compare *.ll against snapshots
- Runtime tests: black-box run + stdout/exit-code assertions
- Bench: micro-bench harness (see benches/)
make test
make irtest
make bench
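The idea behind a golden IR comparison can be sketched in a few lines. This is not the project's test harness: `normalize_ir` and `ir_matches_snapshot` are hypothetical helpers, and the choice to ignore the `; ModuleID` and `source_filename` header lines (which LLVM writes into textual IR and which vary with the input path) is an assumption about what a snapshot test would want to tolerate.

```php
<?php
// Hypothetical helpers for comparing emitted IR against a stored snapshot.

function normalize_ir(string $ir): string {
    $kept = [];
    foreach (explode("\n", $ir) as $line) {
        $line = rtrim($line); // drop trailing whitespace and stray "\r"
        // Skip blank lines and path-dependent header lines.
        if ($line === ""
            || str_starts_with($line, "; ModuleID")
            || str_starts_with($line, "source_filename")) {
            continue;
        }
        $kept[] = $line;
    }
    return implode("\n", $kept);
}

function ir_matches_snapshot(string $actual, string $snapshot): bool {
    return normalize_ir($actual) === normalize_ir($snapshot);
}
```

A test would then read the freshly emitted `.ll` file and the stored snapshot, and fail when `ir_matches_snapshot` returns false.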
- Full references semantics (&$x) and alias analysis
- Generators / yield, fibers
- Traits/enums full codegen
- OPcache profile import → PGO seed
- Advanced GC (immix/RC hybrid), arena allocators
- Windows MSVC + ARM64 macOS native releases
- Composer package AOT (compile entire dependency graph)
- Not a drop-in replacement for the PHP VM yet
- Dynamic features (eval, variable variables, include hooks) are restricted or require AOT-friendly patterns
- Extensions: curated subset via shims; full ext/* parity is long-term
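The README does not define "AOT-friendly patterns", so the following is one plausible reading rather than an official guideline: replace variable variables (`$$key = $value`) with an explicit associative array over a closed set of keys, so every name is visible at compile time. The function `load_settings` and its keys are invented for this sketch.

```php
<?php
// Dynamic form that an AOT compiler may not be able to resolve:
//   foreach ($pairs as $key => $value) { $$key = $value; }
//
// Static alternative: a fixed set of known keys in an ordinary array.
function load_settings(array $pairs): array {
    $settings = ["host" => "localhost", "port" => 8080];
    foreach ($pairs as $key => $value) {
        // Only known keys are accepted; unknown ones are ignored.
        if ($key === "host" || $key === "port") {
            $settings[$key] = $value;
        }
    }
    return $settings;
}

$settings = load_settings(["port" => 9000, "debug" => true]);
echo $settings["host"], ":", $settings["port"], PHP_EOL;
```

The unknown `debug` key is dropped, and the result contains only `host` and `port`, with the port overridden.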
- examples/hello.php: basics
- examples/http_micro.php: tiny HTTP server
- examples/ffi_math.php: native interop
- examples/oop.php: classes & methods
Run all:
make examples
Why not transpile to C first? Skipping C removes an entire compilation layer, avoids C UB pitfalls, shortens build times, and lets LLVM see richer semantics earlier.
Will my existing PHP app work? CLI apps with minimal dynamic features are great candidates today. Web apps can work via AOT of entrypoints + a tiny SAPI shim (WIP).
Performance vs PHP-FPM? CPU-bound workloads typically see substantial gains with O2/LTO. IO-bound workloads benefit less; measure with our bench harness.
- Fork & branch: feat/<name>
- Run tests: make test
- Add IR snapshot tests when changing codegen
- Open PR with a brief design note and benchmarks
Issues and discussions welcome!
Copyright 2025 Mehmet T. AKALIN
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
If this project helps your research or product, consider citing:
@software{php2ir,
title = {php2ir: Direct PHP → LLVM IR AOT compiler},
author = {Mehmet T. AKALIN},
year = {2025},
url = {https://github.com/makalin/php2ir}
}
- LLVM community and docs
- PHP internals & nikic/php-parser ecosystem (when used as front-end)