My main goal with this issue is twofold:
1.-Can we profile and optimize with optimal results using the same binary? If so, what are the recommended compile and link flags for ARM?
2.-If we are to use different binaries for profiling and optimization, what are the compile and link flags for each?
Background: earlier this year I had issues optimizing postgres. With the default compile and link flags I could not get significant improvements, and I observed several crashes in BOLT. I was eventually advised to build PIE executables, and with those I was able to halve my instruction-TLB hits on regular benchmarks. I was able to use the same binary to train and optimize.
However, I found that the final binary had TEXTREL set, due to a bug in the glibc version I was using. After fixing this issue, I was no longer able to get significant performance improvements using the same binary.
After some experimenting, I found that I keep the significant gains only if I use the binary that has TEXTREL for profiling.
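For anyone trying to reproduce this, here is a minimal sketch of how one might check whether a given binary carries TEXTREL. It assumes binutils `readelf` is on the PATH and simply scans the output of `readelf -d` for a `TEXTREL` dynamic tag or a `TEXTREL` flag on the `FLAGS` line. The helper names `has_textrel` and `check_binary` are hypothetical, and this is not how I originally spotted the problem.

```python
import subprocess
import sys


def has_textrel(readelf_dynamic_output: str) -> bool:
    """Return True if `readelf -d` output mentions TEXTREL.

    Matches both the standalone (TEXTREL) dynamic tag and the
    TEXTREL flag listed on the (FLAGS) line.
    """
    for line in readelf_dynamic_output.splitlines():
        if "(TEXTREL)" in line:
            return True
        if "(FLAGS)" in line and "TEXTREL" in line.split():
            return True
    return False


def check_binary(path: str) -> bool:
    """Run `readelf -d` on path (assumes readelf is installed) and scan it."""
    result = subprocess.run(
        ["readelf", "-d", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return has_textrel(result.stdout)


if __name__ == "__main__":
    # Usage: python check_textrel.py /path/to/binary
    binary = sys.argv[1]
    print(f"{binary}: TEXTREL {'present' if check_binary(binary) else 'absent'}")
```

Running it against both the binary used for profiling and the one passed to BOLT should show which of the two still has text relocations.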
My questions are:
1.-Can we use a single binary for both profiling and bolting and still get the performance gains, without ending up with binaries that have TEXTREL, which is a security risk?
2.-Why do some binaries produce better profiles that yield a significantly higher boost?