Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.Sign up
GitHub is where the world builds software
Millions of developers and companies build, ship, and maintain their software on GitHub — the largest and most advanced development platform in the world.
Linear Scan Register Allocator for ocamlopt and ocamlnat #5324
Original bug ID: 5324
As announced on the mailinglist, here is the first version of the linear scan algorithm for ocamlopt and ocamlnat.
Comment author: @alainfrisch
This patch is very interesting.
Could you post here some timings (compilation speed and performance of compiled code)?
Minor question about the patch: wouldn't it be cleaner to have a single function Asmgen.regalloc with a conditional on use_linscan, rather than having two functions called in sequence that do their job or nothing according to the situation?
Also, is it really necessary to redefine sys_time? (As opposed to using Sys.time which is in the stdlib.)
Comment author: meurer
I've imported the most recent patch from Marcell with some cleanups, merged the latest changes from the 3.12 branch and uploaded the project to GitHub at https://github.com/bmeurer/ocaml-experimental/tree/linear-scan-register-allocator in the linear-scan-register-allocator branch (for reference).
Marcell did some benchmarking and will publish the results soon.
Comment author: marcell
I've added timings for the runtime performance for amd64 and i386. Both showing the runtime for code compiled with the graph-coloring and the linear-scan algorithm and the ratio of the two measurements.
Although meurer already merged the latest changed into the GitHub-Repository I also added the latest diff.
I will add timings for the compilation time soon.
Comment author: @xavierleroy
Here is my analysis of this proposal. The "big" users of OCaml, esp. the members of the Caml consortium, push strongly towards improving the performance of ocamlopt-generated code, but don't really care about compilation times. Linear scan register allocation goes in a different direction: fast compilation times at the cost of slightly lower quality of the generated code. I would rather invest time and effort on replacing ocamlopt's Briggs-style graph coloring by George and Appel's IRC graph coloring: this should result in slightly faster generated code and slightly lower compilation times at the same time. For the time being, let me just put this PR in the "suspended" state.