optimization flags #292
Comments
could we modify the build process so that certain large ops (for example op_kria) get compiled with -Os, whereas the majority of the code gets -O3? I was able to mix & match optimisation settings like this for a desktop linux application; there I was trading off compile speed against execution speed...
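one way to try this without touching the makefile is GCC's per-function `optimize` attribute. a rough sketch follows; the function names are hypothetical stand-ins (not from the bees source), and it assumes the avr32-gcc in use is recent enough to accept the attribute, which hasn't been checked here. the GCC manual also describes this attribute as meant mainly for debugging, so it is probably better suited to experiments than to a final build setup.

```c
#include <stdint.h>

/* Hypothetical stand-in for a large, rarely-hot op (think op_kria).
 * The attribute asks GCC to compile only this function as if -Os
 * had been passed, whatever the global -O level is. */
__attribute__((optimize("Os")))
int32_t big_op_handle(int32_t in)
{
    /* imagine a long body here; kept tiny for the sketch */
    return in * 3 + 1;
}

/* Hypothetical hot path: request -O3 for this function only. */
__attribute__((optimize("O3")))
int32_t hot_mix(const int32_t *buf, int n)
{
    int32_t acc = 0;
    int i;
    for (i = 0; i < n; i++) {
        acc += buf[i];
    }
    return acc;
}
```

comparing the size output for the object file with and without the attribute would show whether the override actually took effect.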
oof... i suppose it's certainly possible, but it sounds like an awful slog through the ASF makefile and so on.
some outputs for the kria op: with -O3: with -O2: with -Os:
so it doesn't look like Os is buying us much compared to O2 in that case (by skipping code reordering), but something in O3 is really putting the hurt on code size.
i'm hitting some roadblocks with trying to set all the flags manually. some inline functions in headers (in fat fs lib) are getting stripped out before link... will keep looking at it; i'd like to know whether the size exploder is inlining or unrolling or something else. but i wouldn't be opposed to just bumping everything down to O2 and seeing what happens.
more data points. been using -O2 with extra flags on top and trying to find which ones are the size-exploders.
with -O2: .text 0x20138 0x80008008, total (.text + .rodata + .data) = 163588 B
with -O3: .text = 0x2fe6c
adding the -O3 flags i've determined to be most size-costly: .text = 0x2f144
and the -O3 flags i've determined to be least costly: .text 0x20408 0x80008008
so.. seems like the biggest culprits are inlining and loop peeling (makes sense.) unfortunately i'd guess they are also the most effective for speed.
notes on some of the other flags:
so i'm going to go ahead and use that last block for now. profiling session will focus on just seeing if we get significant gains from the more space-costly -O3 flags.
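as a quick sanity check on the .text figures quoted in this comment, here is a small host-side sketch that turns the hex sizes into byte deltas against the -O2 baseline. the macro and function names are made up for illustration; only the four hex values come from the thread.

```c
#include <stdio.h>

/* .text sizes quoted in the thread, in bytes */
#define TEXT_O2        0x20138u  /* plain -O2 (baseline) */
#define TEXT_O3        0x2fe6cu  /* plain -O3 */
#define TEXT_O2_COSTLY 0x2f144u  /* -O2 + most size-costly -O3 flags */
#define TEXT_O2_CHEAP  0x20408u  /* -O2 + least size-costly -O3 flags */

/* Growth of a build's .text over the -O2 baseline, in bytes. */
unsigned text_growth(unsigned size)
{
    return size - TEXT_O2;
}

/* Print one line: label, growth in bytes, growth as a percentage. */
void print_growth(const char *label, unsigned size)
{
    unsigned d = text_growth(size);
    printf("%-28s +%u B (%.1f%%)\n", label, d, 100.0 * d / TEXT_O2);
}

/* Print all three comparisons. */
void report(void)
{
    print_growth("-O3", TEXT_O3);
    print_growth("-O2 + costly -O3 flags", TEXT_O2_COSTLY);
    print_growth("-O2 + cheap -O3 flags", TEXT_O2_CHEAP);
}
```

so -O3 adds 64820 bytes of .text over -O2 (about 49%), the costly subset alone accounts for 61452 of those, and the cheap subset adds only 720.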
following discussion from lines:
https://llllllll.co/t/modern-c-programming-tips-and-tricks/3039/124?u=zebra
we need to free up some more code space in bees, so it's probably time to move away from -O3.
from the full list of gcc optimization options, these are the ones i've found to be supported by avr32-gcc:
currently experimenting with which flags have the greatest impact on code size.
profiling for speed is quite a bit harder but will construct some suitable scenes and start flipping GPIO in main event loop.
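a minimal sketch of the GPIO-flipping idea: raise a pin just before an event is handled and drop it right after, so a scope or logic analyser shows one pulse per event whose width is the handling time. everything named here is hypothetical (the event struct, the handler, the pin helpers); on hardware the two pin helpers would wrap the real GPIO set/clear calls, while in this host-buildable version they only record state.

```c
#include <stdint.h>

/* Hypothetical event type; the real one in the firmware will differ. */
typedef struct {
    int     type;
    int32_t data;
} event_t;

/* Host-side stand-ins for the GPIO set/clear calls. */
static uint32_t pin_high;     /* 1 while the profiling pin is "high" */
static uint32_t pulse_count;  /* number of rising edges seen so far */

void profile_pin_set(void)
{
    if (!pin_high) {
        pulse_count++;
    }
    pin_high = 1;
}

void profile_pin_clr(void)
{
    pin_high = 0;
}

uint32_t profile_pulse_count(void) { return pulse_count; }
uint32_t profile_pin_is_high(void) { return pin_high; }

/* Hypothetical handler: stands in for whatever the event triggers. */
static int32_t handled_sum;
void handle_event(const event_t *e) { handled_sum += e->data; }
int32_t handled_total(void) { return handled_sum; }

/* One pass of the event loop: the pin is high only while the
 * handler runs, so pulse width tracks handler duration. */
void process_event(const event_t *e)
{
    profile_pin_set();
    handle_event(e);
    profile_pin_clr();
}
```

running the same scene under each flag set and comparing pulse widths would give a rough speed comparison without needing a timer peripheral.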
any advice would be greatly appreciated