Home

This repository contains my work on a new ARM backend for the OCaml native code compiler, currently based on the 3.12.1 release. Compared to the old ARM backend, this one does the following:

Supports both software and hardware floating-point, the latter using VFPv3-D16 or VFPv3(-D32).
Properly supports interworking with Thumb/Thumb-2 code for both OCaml and C code.
Supports dynamic linking and large memory models.
Optionally supports position-independent code via the -fPIC command line option. This is disabled by default and not required for natdynlink.
Can emit both ARM and Thumb-2 code, with average code size savings of 27% for Thumb-2 (quite close to the optimal 30% advertised by ARM Ltd.).
Supports both the AAPCS (armel) and the extended VFP calling conventions (armhf).
Supports several special ARM instructions to reduce code size and latency.
Uses standard ARM EABI runtime functions instead of relying on GCC internals.
Supports exception backtraces.
Supports profiling using gprof.

Reported upstream as PR#5433.

Architecture selection

You can specify the architecture when compiling with ocamlopt using the -farch command line option. Currently we support the following ARM architectures:

armv4 for ARMv4 / ARMv4T / StrongARM (not supported for armhf)
armv5 (not supported for armhf)
armv5te (not supported for armhf)
armv6 (not supported for armhf)
armv6t2 (not supported for armhf)
armv7 for Cortex

FPU selection

You can specify the floating-point unit when compiling with ocamlopt using the -ffpu command line option. Currently we support the following FPUs:

soft for software floating-point emulation (currently the only option for armel)
vfpv3-d16 for VFPv3-D16 (armhf only)
vfpv3 for VFPv3(-D32) (armhf only)
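
The two lists above imply that only some combinations of target ABI, architecture and FPU are usable. The following Python sketch encodes just the restrictions stated in those lists and builds a matching ocamlopt argument list; it does not run the compiler. The helper name ocamlopt_args, its abi parameter (the ABI is a property of the target, not a command line option) and the file name hello.ml are hypothetical, and the sketch assumes that -farch and -ffpu take their value as a separate argument.

```python
# Values accepted by -farch and -ffpu, copied from the lists above.
ARCHS = ("armv4", "armv5", "armv5te", "armv6", "armv6t2", "armv7")
FPUS = ("soft", "vfpv3-d16", "vfpv3")


def ocamlopt_args(abi, arch, fpu, source):
    """Hypothetical helper: return an ocamlopt argument list, or raise
    ValueError for a combination that the lists above rule out."""
    if abi not in ("armel", "armhf"):
        raise ValueError(f"unknown ABI: {abi}")
    if arch not in ARCHS:
        raise ValueError(f"unknown architecture: {arch}")
    if fpu not in FPUS:
        raise ValueError(f"unknown FPU: {fpu}")
    # Every architecture below armv7 is marked "not supported for armhf".
    if abi == "armhf" and arch != "armv7":
        raise ValueError(f"{arch} is not supported for armhf")
    # soft is currently the only option for armel; both VFPv3 variants
    # are armhf only. Nothing above rules out soft on armhf, so this
    # sketch does not reject that combination.
    if abi == "armel" and fpu != "soft":
        raise ValueError(f"{fpu} is armhf only")
    # Assumption: each option takes its value as a separate argument.
    return ["ocamlopt", "-farch", arch, "-ffpu", fpu, source]


# Example: a Cortex target with VFPv3-D16 on armhf.
print(" ".join(ocamlopt_args("armhf", "armv7", "vfpv3-d16", "hello.ml")))
# prints: ocamlopt -farch armv7 -ffpu vfpv3-d16 hello.ml
```

Invalid combinations, for example armv6 on armhf or vfpv3 on armel, raise ValueError before any command line is produced.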

Performance / Code size savings

The new OCaml ARM backend (on armhf) is up to 4x faster in floating-point benchmarks compared to the 3.12.1 ARM backend (on armel), and code size is decreased by up to 28%. See the ocaml-arm 3.12.1+20111218 Benchmark for details.
