Increase size of RegisterRuleMap to avoid TooManyRegisterRules #487
I think I wonder how other unwinders handle this.
Completely missed that comment. What if we keep the error and switch between ArrayVec and SmallVec based on a feature flag? There may be contexts where signal safety is not required, in which case it would be desirable to collect all rules.
I'm not keen on using a feature flag to switch between ArrayVec and SmallVec. We need to be able to handle all rules, even when signal safety is required. libunwind uses an architecture-specific value for its array (33 for x86-64, 97 for aarch64). How would you feel about increasing the fixed size for now, and later making it configurable with const generics?
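To make the const generics idea concrete, here is a minimal sketch of a fixed-capacity rule map whose size is a type parameter. It is not the crate's actual implementation: `RuleMap`, `Rule`, and the `set`/`get` methods are hypothetical names chosen for illustration, and only the `TooManyRegisterRules` error name comes from the PR title. The storage is a plain array, so inserting never allocates, which is the property needed for signal safety.

```rust
/// Hypothetical error type; the variant name mirrors the error in the PR title.
#[derive(Debug, PartialEq)]
pub enum Error {
    TooManyRegisterRules,
}

/// Illustrative stand-in for a register rule (assumption: the real type has more variants).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Rule {
    Undefined,
    Offset(i64),
}

/// Fixed-capacity map from register number to rule.
/// The capacity `N` is chosen by the caller, e.g. per target architecture.
pub struct RuleMap<const N: usize> {
    entries: [(u16, Rule); N],
    len: usize,
}

impl<const N: usize> RuleMap<N> {
    /// Creates an empty map; the backing array lives inline, not on the heap.
    pub fn new() -> Self {
        RuleMap {
            entries: [(0, Rule::Undefined); N],
            len: 0,
        }
    }

    /// Sets the rule for `reg`, overwriting an existing entry if present.
    /// Fails with `TooManyRegisterRules` when a new entry would exceed `N`.
    pub fn set(&mut self, reg: u16, rule: Rule) -> Result<(), Error> {
        for entry in &mut self.entries[..self.len] {
            if entry.0 == reg {
                entry.1 = rule;
                return Ok(());
            }
        }
        if self.len == N {
            return Err(Error::TooManyRegisterRules);
        }
        self.entries[self.len] = (reg, rule);
        self.len += 1;
        Ok(())
    }

    /// Returns the rule stored for `reg`, if any.
    pub fn get(&self, reg: u16) -> Option<Rule> {
        self.entries[..self.len]
            .iter()
            .find(|entry| entry.0 == reg)
            .map(|entry| entry.1)
    }
}
```

With this shape, a caller could pick `RuleMap::<33>` on x86-64 and `RuleMap::<97>` on aarch64, matching the libunwind numbers quoted above, while the overflow error stays available for anything larger.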
335fb17 to 52a0c7b
Sounds good to me. I should have done the reading in libunwind right away. Would you mind linking me to it? For some reason I'm not able to find it. Increased the limit to 100 now (since Lines 2297 to 2299 in 53c802a
Would you like me to change that?
Sure: the register rule table, and the x86-64 and aarch64 numbers. ARM (128) and MIPS (188) are actually even larger.
Yeah, can you change the comment to mention that it is based on the numbers from libunwind?
5a2c6a7 to aa6c516
Thanks for the links, @philipc! I was going back and forth on using a smaller array, but ended up increasing the size to
aa6c516 to e471a94
Just out of interest, the benchmark difference on my machine is:
I don't have any ideas for what we can do better, but maybe @fitzgen does? I don't think anyone has spent any serious effort on optimising this yet though, other than #148 (which may no longer be an optimisation).
Some versions of libunwind (there are so many forks floating around) have a specialized mmap-only, lock-free allocator for use inside signal handlers. Given the snippets you found, it looks like that isn't used for register rules, but it is something we could look into. Other unwinders that are focused just on fast, common-case unwinding for profiling only ever recover rsp and rbp. Exposing this limit through generics somehow seems promising and not overly ocean boil-y.
The fastest implementations I've seen either compile CFI to machine code ahead of time and replace the CFI, or do it at runtime with a super limited JIT compiler and an associative cache mapping PCs to JIT stubs. That is, they largely avoid the cost of repeatedly parsing and interpreting CFI rules.
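The caching half of that idea can be sketched without any JIT at all. The example below is a simplified stand-in, not what any of the mentioned unwinders do: it uses a direct-mapped cache keyed by PC that stores a precomputed `Recipe` instead of a JIT stub, so a repeated lookup skips the expensive CFI evaluation. `Recipe`, `RecipeCache`, and `get_or_insert_with` are hypothetical names, and the slow path is passed in as a closure.

```rust
/// Hypothetical precomputed unwind result for one PC: where the CFA and the
/// return address live relative to the stack pointer. This stands in for a JIT stub.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Recipe {
    pub cfa_offset: i64,
    pub ra_offset: i64,
}

/// Direct-mapped cache from PC to `Recipe` with `N` slots (assumes `N > 0`).
/// Storage is a fixed array, so lookups and inserts never allocate.
pub struct RecipeCache<const N: usize> {
    slots: [Option<(u64, Recipe)>; N],
}

impl<const N: usize> RecipeCache<N> {
    /// Creates a cache with every slot empty.
    pub fn new() -> Self {
        RecipeCache { slots: [None; N] }
    }

    /// Maps a PC to its slot index; distinct PCs may collide and evict each other.
    fn slot(pc: u64) -> usize {
        (pc % N as u64) as usize
    }

    /// Returns the cached recipe for `pc`, or runs `slow` (e.g. CFI parsing and
    /// evaluation) once and caches its result in the slot for `pc`.
    pub fn get_or_insert_with(
        &mut self,
        pc: u64,
        slow: impl FnOnce(u64) -> Recipe,
    ) -> Recipe {
        let index = Self::slot(pc);
        if let Some((cached_pc, recipe)) = self.slots[index] {
            if cached_pc == pc {
                return recipe;
            }
        }
        let recipe = slow(pc);
        self.slots[index] = Some((pc, recipe));
        recipe
    }
}
```

A direct-mapped layout is the simplest choice here; a set-associative cache would reduce evictions when hot PCs collide, at the cost of a slightly longer lookup.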
Yeah libunwind (at least the primary version of it) uses mmap for
I'm aware of https://fzn.fr/projects/frdwarf/, but not sure if anything in production uses that technique. Anyway, it sounds like this PR is the way to go for now at least.
Firefox's profiler's unwinder uses the JIT approach.
Increases the size of RegisterRuleMap to fit more rules in rare cases.