Baptiste Wicht edited this page Jun 5, 2013 · 6 revisions

Improvements

  • Version 1.2.3: The data-flow framework has been significantly improved to fix the main bottleneck.
  • Version 1.2.4 (dev): The offset constant propagation has been made a fast forward data-flow problem.

Timings

Here are the timings for compiling the whole set of test cases and samples (date: 02/08/2013). Each line reads elapsed time:pass:percentage of the total time:

3457:parsing:40.3761
1911:assemble:22.3196
942:dead_code_elimination:11.0021
640:constant_propagation:7.47489
517:offset_constant_propagation:6.03831
453:register_allocation:5.29082
366:peephole_optimization:4.2747
126:common_subexpression_elimination:1.47162
70:ast_passes:0.817566
50:inline_functions:0.583976
16:mtac_compilation:0.186872
8:prologue_generation:0.0934361
1:pre_alloc_cleanup:0.0116795
Total:8562

These are compiled with --O3.
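
The figures above can be cross-checked with a short script. This is only a sketch: the helper name `parse_timings` is hypothetical, and it assumes the third field of each line is the pass's share of the reported total, which is what the numbers suggest.

```python
# The report as listed above: elapsed time, pass name, share of the total in percent.
REPORT = """\
3457:parsing:40.3761
1911:assemble:22.3196
942:dead_code_elimination:11.0021
640:constant_propagation:7.47489
517:offset_constant_propagation:6.03831
453:register_allocation:5.29082
366:peephole_optimization:4.2747
126:common_subexpression_elimination:1.47162
70:ast_passes:0.817566
50:inline_functions:0.583976
16:mtac_compilation:0.186872
8:prologue_generation:0.0934361
1:pre_alloc_cleanup:0.0116795
"""
REPORTED_TOTAL = 8562  # the "Total" line of the report


def parse_timings(text):
    """Split each 'time:pass:percent' line into (pass name, time, percent)."""
    rows = []
    for line in text.strip().splitlines():
        elapsed, name, share = line.split(":")
        rows.append((name, int(elapsed), float(share)))
    return rows


rows = parse_timings(REPORT)

# Recompute each share against the reported total and compare with the report.
recomputed = {name: 100.0 * elapsed / REPORTED_TOTAL for name, elapsed, _ in rows}
max_error = max(abs(recomputed[name] - share) for name, _, share in rows)

# Sum of the listed passes, and the combined share of the three slowest passes.
listed_sum = sum(elapsed for _, elapsed, _ in rows)
top_three = sorted(rows, key=lambda row: row[1], reverse=True)[:3]
top_three_share = 100.0 * sum(elapsed for _, elapsed, _ in top_three) / REPORTED_TOTAL

print(f"largest difference from the reported percentages: {max_error:.6f}")
print(f"sum of listed passes: {listed_sum} (reported total: {REPORTED_TOTAL})")
print(f"share of the three slowest passes: {top_three_share:.1f}%")
```

The listed passes add up to 8557, slightly less than the reported total of 8562, so a small amount of time is presumably spent outside the listed passes. Parsing, assembling and dead-code elimination together account for a little under three quarters of the total.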

Bottlenecks

  1. Parsing is becoming quite problematic. One solution would be to improve the handling of strings, which in my opinion causes a lot of copies.
  2. The assembler step cannot really be improved.
  3. DCE remains the slowest optimization. It remains to be determined which part of it is the slowest.
  4. Constant Propagation could probably be made faster. The transfer function is very complicated and could probably be tuned. Moreover, if it were possible to make it a Fast_Forward_Block problem, it could be much faster.
  5. Offset Constant Propagation can be tuned to make it a fast forward problem.
  6. Register allocation can definitely be improved by tuning the data-flow liveness analysis and by improving the handling of bound registers.
  7. Peephole optimizations can be made faster by grouping the optimizations and improving the conditions, which are quite heavy right now.
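
To illustrate the kind of problem items 4 and 5 refer to, here is a generic sketch of constant propagation as a forward data-flow analysis that stores results only at basic-block boundaries. It is not the compiler's implementation: the tuple-based statement format, the function names and the example graph are all hypothetical, and whether this block-level formulation corresponds to Fast_Forward_Block is an assumption.

```python
from collections import deque

NAC = "NAC"  # "not a constant"; a variable missing from an environment means "no information yet"


def meet(a, b):
    """Combine two environments (var -> int or NAC) where control flow merges."""
    out = dict(a)
    for var, val in b.items():
        if var not in out:
            out[var] = val      # only one side knows the variable so far
        elif out[var] != val:
            out[var] = NAC      # the two sides disagree
    return out


def transfer(block, env_in):
    """Apply every statement of a block; statements are (dest, op, *args) tuples."""
    env = dict(env_in)
    for dest, op, *args in block:
        if op == "const":
            env[dest] = args[0]
            continue
        values = [env.get(arg) for arg in args]   # None: no information yet
        if any(v is None for v in values):
            env.pop(dest, None)                   # result stays unknown for now
        elif any(v is NAC for v in values):
            env[dest] = NAC
        elif op == "copy":
            env[dest] = values[0]
        elif op == "add":
            env[dest] = values[0] + values[1]
        else:
            env[dest] = NAC                       # unsupported operation
    return env


def propagate_constants(blocks, successors, entry, entry_env=None):
    """Worklist iteration; returns the environments at block entry and exit."""
    predecessors = {name: [] for name in blocks}
    for src, dsts in successors.items():
        for dst in dsts:
            predecessors[dst].append(src)
    env_in = {name: {} for name in blocks}
    env_out = {name: {} for name in blocks}
    worklist = deque(blocks)
    while worklist:
        name = worklist.popleft()
        env = dict(entry_env or {}) if name == entry else {}
        for pred in predecessors[name]:
            env = meet(env, env_out[pred])
        env_in[name] = env
        new_out = transfer(blocks[name], env)
        if new_out != env_out[name]:
            env_out[name] = new_out
            # Only successors of a block whose output changed need another visit.
            worklist.extend(s for s in successors.get(name, []) if s not in worklist)
    return env_in, env_out


# Hypothetical diamond-shaped control-flow graph: entry -> left/right -> join.
blocks = {
    "entry": [("x", "const", 1), ("y", "const", 2)],
    "left": [("z", "add", "x", "y"), ("w", "const", 10)],
    "right": [("z", "const", 3), ("w", "const", 20)],
    "join": [("r", "add", "z", "x")],
}
successors = {"entry": ["left", "right"], "left": ["join"], "right": ["join"]}

env_in, env_out = propagate_constants(blocks, successors, "entry")
print(env_in["join"])   # z agrees on both paths, w does not
print(env_out["join"])
```

In this sketch the transfer function runs once per block visit and environments are kept only per block, so the cost is driven by the number of block revisits rather than by per-statement bookkeeping.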