Performance
Baptiste Wicht edited this page Jun 5, 2013
- Version 1.2.3: The data-flow framework has been greatly improved to fix the main bottleneck.
- Version 1.2.4 (dev): The offset constant propagation has been made a fast forward data-flow problem.
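Here are the timings for compiling the whole set of test cases and samples (date: 02/08/2013). Each line has the form time:pass:percentage, where the percentage is the pass's share of the total time: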
3457:parsing:40.3761
1911:assemble:22.3196
942:dead_code_elimination:11.0021
640:constant_propagation:7.47489
517:offset_constant_propagation:6.03831
453:register_allocation:5.29082
366:peephole_optimization:4.2747
126:common_subexpression_elimination:1.47162
70:ast_passes:0.817566
50:inline_functions:0.583976
16:mtac_compilation:0.186872
8:prologue_generation:0.0934361
1:pre_alloc_cleanup:0.0116795
Total:8562
All of them are compiled with --O3.
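The timing lines above appear to follow a `time:pass:percentage` layout, with the last field being the pass's share of the total. As a rough sketch of how such output could be checked or post-processed, the snippet below parses a few of the lines and recomputes the share from the total. The helper names `parse_timings` and `share` are made up for this illustration and are not part of the compiler.

```python
def parse_timings(lines):
    """Parse 'time:pass:percentage' lines into (pass, time, percentage) tuples."""
    rows = []
    for line in lines:
        time, name, pct = line.strip().split(":")
        rows.append((name, int(time), float(pct)))
    return rows


def share(time, total):
    """Return the share of `total` taken by `time`, as a percentage."""
    return 100.0 * time / total


# A few lines copied from the timings above, and the reported total.
SAMPLE = [
    "3457:parsing:40.3761",
    "1911:assemble:22.3196",
    "942:dead_code_elimination:11.0021",
]
TOTAL = 8562

rows = parse_timings(SAMPLE)
for name, time, reported in rows:
    # Recompute the percentage and show it next to the reported value.
    print(f"{name:28s} {time:5d} reported={reported:.4f} recomputed={share(time, TOTAL):.4f}")
```

Sorting `rows` by time is then enough to see which passes dominate the compilation.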
- The parsing is starting to be quite problematic. One solution would be to improve the handling of strings, which in my opinion causes a lot of copies
- The assembler step cannot really be improved
- DCE remains the slowest optimization. It remains to be seen which part of it is the slowest
- Constant Propagation could probably be made faster. The transfer function is very complicated and could probably be tuned. Moreover, if it were possible to make it a Fast_Forward_Block problem, it could be much faster
- Offset Constant Propagation can be tuned to make it a fast forward problem
- Register allocation can definitely be improved by tuning data-flow live analysis and by improving handling of bound registers
- Peephole optimizations can be made faster by grouping the optimizations and improving the conditions that are quite heavy right now.
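One possible reading of "grouping the optimizations" for the peephole pass is to index the rewrite rules by opcode, so that each instruction is only tested against the rules that can apply to it rather than against every condition. The sketch below illustrates that idea only. The tuple-based instruction form, the three rules and the `RULES` table are all hypothetical and do not reflect the compiler's actual data structures, and the rules ignore details such as flag side effects.

```python
# Hypothetical instruction form: (opcode, destination, source) tuples.

def drop_self_move(ins):
    """'mov r, r' has no effect: return None to delete the instruction."""
    _, dst, src = ins
    return None if dst == src else ins


def drop_add_zero(ins):
    """'add r, 0' leaves r unchanged (flags ignored in this sketch)."""
    _, _, src = ins
    return None if src == 0 else ins


def drop_mul_one(ins):
    """'mul r, 1' leaves r unchanged (flags ignored in this sketch)."""
    _, _, src = ins
    return None if src == 1 else ins


# Rules grouped by opcode: an instruction only visits the rules for its opcode.
RULES = {
    "mov": [drop_self_move],
    "add": [drop_add_zero],
    "mul": [drop_mul_one],
}


def peephole(instructions):
    """Apply the grouped rules to each instruction and drop the deleted ones."""
    out = []
    for ins in instructions:
        for rule in RULES.get(ins[0], ()):
            ins = rule(ins)
            if ins is None:  # the rule deleted the instruction
                break
        if ins is not None:
            out.append(ins)
    return out


program = [
    ("mov", "rax", "rax"),
    ("add", "rbx", 0),
    ("mov", "rax", "rbx"),
    ("mul", "rcx", 1),
    ("add", "rbx", 4),
]
optimized = peephole(program)
```

With this layout the cost per instruction depends on the number of rules registered for its opcode, not on the total number of rules, which is where a saving could come from if the conditions are expensive.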