Performance

Improvements

Version 1.2.3: The data-flow framework has been substantially improved to remove the main bottleneck.
Version 1.2.4 (dev): Offset constant propagation has been turned into a fast forward data-flow problem.
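
For readers unfamiliar with the term, the following is a minimal sketch of constant propagation written as a forward data-flow problem, using a generic textbook worklist algorithm. It is not the compiler's actual framework and does not show what its "fast forward" variant does. The statement format (`dest = lhs + rhs`), the `NAC` marker and every function name are assumptions made for illustration.

```python
# Generic forward data-flow sketch (constant propagation).
# Illustrative only: not the compiler's framework; all names are hypothetical.
from collections import deque

NAC = "NAC"  # "not a constant": conflicting or unknown values


def meet(a, b):
    """Merge the facts of two predecessors; disagreeing values become NAC."""
    out = dict(a)
    for var, val in b.items():
        out[var] = val if out.get(var, val) == val else NAC
    return out


def value_of(operand, facts):
    """Integer literals are constants; unknown variables are treated as NAC."""
    if isinstance(operand, int):
        return operand
    return facts.get(operand, NAC)


def transfer(statements, facts):
    """Apply every statement `dest = lhs + rhs` of one block, in order."""
    facts = dict(facts)
    for dest, lhs, rhs in statements:
        a, b = value_of(lhs, facts), value_of(rhs, facts)
        facts[dest] = NAC if NAC in (a, b) else a + b
    return facts


def solve(blocks, successors, entry):
    """Propagate facts along control-flow edges until nothing changes."""
    preds = {name: [] for name in blocks}
    for src, dsts in successors.items():
        for dst in dsts:
            preds[dst].append(src)
    out = {name: None for name in blocks}  # None = block not processed yet
    worklist = deque([entry])
    while worklist:
        name = worklist.popleft()
        in_facts = {}
        for pred in preds[name]:
            if out[pred] is not None:
                in_facts = meet(in_facts, out[pred])
        new_out = transfer(blocks[name], in_facts)
        if new_out != out[name]:
            out[name] = new_out
            worklist.extend(successors.get(name, []))
    return out


# Diamond-shaped control flow: entry -> left/right -> join
blocks = {
    "entry": [("a", 1, 2)],
    "left": [("b", "a", 1)],
    "right": [("b", "a", 2)],
    "join": [("c", "a", 0)],
}
successors = {"entry": ["left", "right"], "left": ["join"], "right": ["join"]}
result = solve(blocks, successors, "entry")
```

In this example `a` stays a known constant at the join point, while `b` becomes `NAC` because the two branches assign different values to it.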

Timings

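Here are the timings for compiling the whole set of test cases and samples (date: 02/08/2013). Each line gives the time spent in a pass, the name of the pass, and its share of the total as a percentage: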

3457:parsing:40.3761
1911:assemble:22.3196
942:dead_code_elimination:11.0021
640:constant_propagation:7.47489
517:offset_constant_propagation:6.03831
453:register_allocation:5.29082
366:peephole_optimization:4.2747
126:common_subexpression_elimination:1.47162
70:ast_passes:0.817566
50:inline_functions:0.583976
16:mtac_compilation:0.186872
8:prologue_generation:0.0934361
1:pre_alloc_cleanup:0.0116795
Total:8562

All of them are compiled with --O3.
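
The third column can be reproduced from the first one by dividing each time by the reported total. The sketch below parses the lines as listed and recomputes the shares. The variable and function names are illustrative, and no unit is assumed for the times since the page does not state one.

```python
# Recompute the percentage column from the raw timing lines.
# Names are illustrative; the times carry whatever unit the page used.
RAW = """\
3457:parsing:40.3761
1911:assemble:22.3196
942:dead_code_elimination:11.0021
640:constant_propagation:7.47489
517:offset_constant_propagation:6.03831
453:register_allocation:5.29082
366:peephole_optimization:4.2747
126:common_subexpression_elimination:1.47162
70:ast_passes:0.817566
50:inline_functions:0.583976
16:mtac_compilation:0.186872
8:prologue_generation:0.0934361
1:pre_alloc_cleanup:0.0116795
"""
TOTAL = 8562  # the "Total" line of the listing


def parse(raw):
    """Split each `time:pass:percent` line into a typed tuple."""
    rows = []
    for line in raw.splitlines():
        time, name, percent = line.split(":")
        rows.append((int(time), name, float(percent)))
    return rows


def share(time, total):
    """Share of the total, as a percentage."""
    return 100.0 * time / total


rows = parse(RAW)
listed_sum = sum(time for time, _, _ in rows)

for time, name, percent in rows:
    print(f"{name:35s} {time:5d} {share(time, TOTAL):8.4f}")
print("sum of listed passes:", listed_sum)
```

The listed passes add up to 8557, slightly less than the reported total of 8562, which is why the percentages do not sum to exactly 100.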

Bottlenecks

Parsing is becoming quite problematic. One solution would be to improve the handling of strings, which in my opinion causes a lot of copies.
The assembler step cannot really be improved.
DCE remains the slowest optimization. It remains to be seen which part of it is the slowest.
Constant Propagation could probably be made faster. The transfer function is very complicated and could probably be tuned. Moreover, if it were possible to make it a Fast_Forward_Block problem, it could be much faster.
Offset Constant Propagation can be tuned to make it a fast forward problem.
Register allocation can definitely be improved by tuning the data-flow live analysis and by improving the handling of bound registers.
Peephole optimizations can be made faster by grouping the optimizations and simplifying their conditions, which are quite heavy right now.
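
One possible reading of "grouping the optimizations" in the peephole pass is to index the rewrite rules by opcode, so that each instruction is only tested against the rules that can apply to it. The sketch below illustrates that idea under stated assumptions: the instruction format `(opcode, dest, source)`, the three rules and all names are hypothetical and are not taken from the compiler.

```python
# Peephole rules grouped by opcode (hypothetical sketch, not the compiler's code).
# An instruction is a tuple (opcode, dest, source).


def drop_self_move(ins):
    """mov r, r has no effect and can be removed."""
    _, dest, source = ins
    return [] if dest == source else None


def drop_add_zero(ins):
    """add r, 0 has no effect and can be removed."""
    _, _, source = ins
    return [] if source == 0 else None


def mul_two_to_add(ins):
    """mul r, 2 can be rewritten as add r, r."""
    _, dest, source = ins
    return [("add", dest, dest)] if source == 2 else None


# Each rule returns a replacement list, or None when it does not apply.
RULES = {
    "mov": [drop_self_move],
    "add": [drop_add_zero],
    "mul": [mul_two_to_add],
}


def peephole(instructions):
    """Run only the rules registered for each instruction's opcode."""
    result = []
    for ins in instructions:
        for rule in RULES.get(ins[0], ()):
            replacement = rule(ins)
            if replacement is not None:
                result.extend(replacement)
                break
        else:
            # No rule matched: keep the instruction unchanged.
            result.append(ins)
    return result


program = [
    ("mov", "rax", "rax"),
    ("add", "rbx", 0),
    ("mul", "rcx", 2),
    ("mov", "rdx", "rax"),
]
optimized = peephole(program)
```

With this layout the cost per instruction depends on the number of rules for its opcode rather than on the total number of rules.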
