Commit
[BOLT] Improve ICP activation policy and hot jt processing
Summary: Previously, ICP worked with a budget of N targets to convert to direct calls. As long as the combined frequency of up to N of the hottest targets surpassed a given fraction (threshold) of the total frequency, say 90%, the optimization would convert a number of targets (up to N) to direct calls. Otherwise, it would completely abort processing this call site. The intent was to convert a given fraction of the indirect call site frequency to use direct calls instead, but this ends up being an "all or nothing" strategy.

In this patch we change this to operate with the same strategy seen in LLVM's ICP, with two thresholds. The idea is that the hottest target of an indirect call site will be compared against these two thresholds: one checks its frequency relative to the total frequency of the original indirect call site, and the other checks its frequency relative to the remaining, unconverted targets (excluding the hottest targets that were already converted to direct calls). The remaining threshold is typically set higher than the total threshold. This allows us more control over ICP. I expose two pairs of knobs, one for jump tables and another for indirect calls.

To improve the promotion of hot jump table indices when we have a memory profile, I also fix a bug that could cause us to promote extra indices besides the hottest ones as seen in the memory profile. When we have the memory profile, I reapply the dual threshold checks to the memory profile, which specifies exactly which indices are hot. I then update N, the number of targets to be promoted, based on this new information, and update frequency information.

To allow us to work with smaller profiles, I also created an option in perf2bolt to filter out memory samples that fall outside the statically allocated area of the binary (that is, heap and stack accesses). This option is on by default.

(cherry picked from FBD15187832)
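The dual-threshold policy described above can be sketched as follows. This is an illustration only, not the code in this patch: the function name `select_targets`, its parameters, and the threshold values in the usage note are hypothetical, and the sketch assumes that the "remaining" frequency still includes the candidate being tested.

```python
def select_targets(freqs, max_targets, total_pct, remaining_pct):
    """Pick the hottest targets of one call site under two thresholds.

    freqs maps a target name to its profile count (hypothetical input
    format). total_pct and remaining_pct are percentages in 0..100.
    """
    total = sum(freqs.values())
    if total == 0:
        return []
    remaining = total  # frequency not yet covered by a promoted target
    chosen = []
    # Visit targets from hottest to coldest.
    ranked = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
    for target, count in ranked:
        if len(chosen) == max_targets:
            break
        # Threshold 1: share of the original call-site frequency.
        if count * 100 < total_pct * total:
            break
        # Threshold 2: share of the still-unconverted frequency
        # (assumption: the candidate itself is counted in `remaining`).
        if count * 100 < remaining_pct * remaining:
            break
        chosen.append(target)
        remaining -= count
    return chosen
```

With illustrative thresholds of 5% (total) and 30% (remaining), a site with counts `{"a": 96, "b": 3, "c": 1}` promotes only `a`, because `b` is below 5% of the total. Under the old all-or-nothing rule, a site whose hottest N targets missed the single threshold would have been skipped entirely; here each target is accepted or rejected on its own.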