Increase size of RegisterRuleMap to avoid TooManyRegisterRules #487

jan-auer · 2020-04-16T12:30:17Z

Increases the size of RegisterRuleMap to fit more rules in rare cases.

coveralls · 2020-04-16T13:02:36Z

Coverage increased (+0.02%) to 85.391% when pulling e471a94 on jan-auer:fix/too-many-rules into 53c802a on gimli-rs:master.

philipc · 2020-04-16T13:40:25Z

I think ArrayVec was intentional because this code is meant to be async signal safe and not allocate, as mentioned in the comment.

I wonder how other unwinders handle this.

jan-auer · 2020-04-16T15:24:00Z

Completely missed that comment.

What if we keep the error and swap out ArrayVec vs SmallVec based on a feature flag? There may be contexts where signal safety is not required in which case it would be desirable to collect all rules.

philipc · 2020-04-17T06:24:43Z

I'm not keen on using a feature flag to switch between ArrayVec and SmallVec. We need to be able to handle all rules, even when signal safety is required. libunwind uses an architecture specific value for its array (33 for x86-64, 97 for aarch64). How would you feel about increasing the fixed size for now, and later making it configurable with const generics?

jan-auer · 2020-04-17T07:46:38Z

Sounds good to me. I should have done the reading in libunwind right away. Would you mind to link me to it, for some reason I'm not able to find it.

Increased the limit to 100 now (since arrayvec::Array is not implemented for size 97). The first part of this comment is now a bit redundant, since we are actually allocating that many rules:

gimli/src/read/cfi.rs

Lines 2297 to 2299 in 53c802a

    
           // We tend to have very few register rules: usually only a couple. Even if we 
        
           // have a rule for every register, on x86-64 with SSE and everything we're 
        
           // talking about ~100 rules. So rather than keeping the rules in a hash map, or

Would you like me to change that?

philipc · 2020-04-17T09:25:21Z

Would you mind to link me to it, for some reason I'm not able to find it.

Sure: the register rule table, and the x86-64 and aarch64 numbers. ARM (128) and MIPS (188) are actually even larger.

Would you like me to change that?

Yeah, can you change the comment to mention that it is based on the numbers from libunwind?

jan-auer · 2020-04-20T15:58:18Z

Thanks for the links, @philipc! Was going back and forth with using a smaller array, but ended up increasing size to 192. This bumps sizes of the register map to 6kB and the unwind context to a whopping 7+ pages (30kB).

philipc · 2020-04-21T00:40:10Z

Just out of interest, the benchmark difference on my machine is:

 name                                                       before ns/iter  after ns/iter  diff ns/iter   diff %  speedup 
 cfi::eval_longest_fde_instructions_new_ctx_everytime       498             2,893                 2,395  480.92%   x 0.17 
 cfi::eval_longest_fde_instructions_same_ctx                314             651                     337  107.32%   x 0.48 
 cfi::iterate_entries_and_do_not_parse_any_fde              50,855          49,392               -1,463   -2.88%   x 1.03 
 cfi::iterate_entries_and_parse_every_fde                   387,405         391,387               3,982    1.03%   x 0.99 
 cfi::iterate_entries_and_parse_every_fde_and_instructions  889,495         896,817               7,322    0.82%   x 0.99 
 cfi::iterate_entries_evaluate_every_fde                    1,204,579       2,542,256         1,337,677  111.05%   x 0.47 
 cfi::parse_longest_fde_instructions                        203             209                       6    2.96%   x 0.97

I don't have any ideas for what we can do better, but maybe @fitzgen does? I don't think anyone has spent any serious effort on optimising this yet though, other than #148 (which may no longer be an optimisation).

fitzgen · 2020-04-21T16:35:57Z

@philipc

I wonder how other unwinders handle this.

Some versions of libunwind (there are so many forks floating around) have a specialized mmap-only, lock-free allocator for use inside signal handlers. Given the snippets you found, it looks like that isn't used for register rules, but it is something we could look into.

Other unwinders, that are focused just on fast, common case unwinding for profiling, only ever recover rsp and rbp.

Exposing this limit through generics somehow seems promising and not overly ocean boil-y.

fitzgen · 2020-04-21T16:39:25Z

I don't have any ideas for what we can do better, but maybe @fitzgen does? I don't think anyone has spent any serious effort on optimising this yet though, other than #148 (which may no longer be an optimisation).

The fastest implementations I've seen either compile CFI to machine code ahead of time and replace the CFI, or do it at runtime with a super limited JIT compiler and an associative cache mapping PCs to JIT stubs.

That is, they largely avoid the cost of repeatedly parsing and interpreting CFI rules.

philipc · 2020-04-21T23:40:47Z

Some versions of libunwind (there are so many forks floating around) have a specialized mmap-only, lock-free allocator for use inside signal handlers. Given the snippets you found, it looks like that isn't used for register rules, but it is something we could look into.

Yeah libunwind (at least the primary version of it) uses mmap for DW_CFA_remember_state. The libgcc unwinder does the same thing as libunwind for the register array, and uses alloca for DW_CFA_remember_state.

The fastest implementations I've seen either compile CFI to machine code ahead of time and replace the CFI, or do it at runtime with a super limited JIT compiler and an associative cache mapping PCs to JIT stubs.

I'm aware of https://fzn.fr/projects/frdwarf/, but not sure if anything in production uses that technique.

Anyway, it sounds like this PR is the way to go for now at least.

fitzgen · 2020-04-22T15:23:01Z

but not sure if anything in production uses that technique.

Firefox's profiler's unwinder uses the JIT approach.

jan-auer mentioned this pull request Apr 16, 2020

fix(minidump): Skip TooManyRegisterRules errors in CFI getsentry/symbolic#219

Merged

jan-auer force-pushed the fix/too-many-rules branch from 335fb17 to 52a0c7b Compare April 17, 2020 07:36

jan-auer changed the title ~~Use smallvec to avoid TooManyRegisterRules in CFI~~ Increase size of RegisterRuleMap to avoid TooManyRegisterRules Apr 17, 2020

jan-auer force-pushed the fix/too-many-rules branch 2 times, most recently from 5a2c6a7 to aa6c516 Compare April 20, 2020 15:53

Increase size of RegisterRuleMap to avoid TooManyRegisterRules

e471a94

jan-auer force-pushed the fix/too-many-rules branch from aa6c516 to e471a94 Compare April 20, 2020 15:58

philipc approved these changes Apr 21, 2020

View reviewed changes

philipc merged commit 1e0d07b into gimli-rs:master Apr 21, 2020

jan-auer deleted the fix/too-many-rules branch April 22, 2020 15:25

philipc mentioned this pull request May 11, 2020

Support all CIE versions in eh_frame #493

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Increase size of RegisterRuleMap to avoid TooManyRegisterRules #487

Increase size of RegisterRuleMap to avoid TooManyRegisterRules #487

jan-auer commented Apr 16, 2020 •

edited

Loading

coveralls commented Apr 16, 2020 •

edited

Loading

philipc commented Apr 16, 2020

jan-auer commented Apr 16, 2020

philipc commented Apr 17, 2020

jan-auer commented Apr 17, 2020

philipc commented Apr 17, 2020

jan-auer commented Apr 20, 2020

philipc commented Apr 21, 2020

fitzgen commented Apr 21, 2020

fitzgen commented Apr 21, 2020

philipc commented Apr 21, 2020

fitzgen commented Apr 22, 2020

Increase size of RegisterRuleMap to avoid TooManyRegisterRules #487

Increase size of RegisterRuleMap to avoid TooManyRegisterRules #487

Conversation

jan-auer commented Apr 16, 2020 • edited Loading

coveralls commented Apr 16, 2020 • edited Loading

philipc commented Apr 16, 2020

jan-auer commented Apr 16, 2020

philipc commented Apr 17, 2020

jan-auer commented Apr 17, 2020

philipc commented Apr 17, 2020

jan-auer commented Apr 20, 2020

philipc commented Apr 21, 2020

fitzgen commented Apr 21, 2020

fitzgen commented Apr 21, 2020

philipc commented Apr 21, 2020

fitzgen commented Apr 22, 2020

jan-auer commented Apr 16, 2020 •

edited

Loading

coveralls commented Apr 16, 2020 •

edited

Loading