Symbols in optimized async programs are often not useful #65978

Open
cbiffle opened this issue Oct 30, 2019 · 6 comments
@cbiffle cbiffle commented Oct 30, 2019

(As observed on rustc 1.40.0-nightly (4a8c5b20c 2019-10-23) targeting thumbv7em-none-eabihf; CC @tmandry)

I'm making my first aggressive use of async fn in an application. It's a deeply-embedded performance-sensitive application, and I wind up inspecting the disassembly output a lot (using objdump).

This is complicated by the fact that basically all of my functions are named poll_with_tls_context. (Some of them aren't -- some of them are named after future combinators.)

For example, here is my function called poll_with_tls_context calling another one, also named poll_with_tls_context:

; This is an ARMv7-M Thumb-2 listing.
080003b8 <core::future::poll_with_tls_context>:
 80003b8:       b570            push    {r4, r5, r6, lr}
 80003ba:       4604            mov     r4, r0
   ; irrelevant setup omitted...
 80003f4:       f000 fa3c       bl      8000870 <core::future::poll_with_tls_context> ; note different addr
 80003f8:       2101            movs    r1, #1
 80003fa:       2800            cmp     r0, #0
  ; ...and so on

(The observant reader will note poll_with_tls_context does not appear in libcore. That's correct -- I've hacked async in a #[no_std] environment. I'm pretty sure the hack is not the problem.)

I understand why this is happening: poll_with_tls_context is an implementation detail of the current lowering of async fn, and it is being specialized to the future type it's given, hence many such functions. But I also don't think it's ideal.
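To illustrate the specialization described above, here is a minimal sketch on a hosted target. It is not the real libstd code: `poll_wrapper`, `read_sensor`, `blink_led`, and `noop_waker` are hypothetical names made up for this example, and `std::any::type_name` is used only to make the type parameter visible at runtime. The generic wrapper is instantiated once per future type, so the program ends up with several copies whose path-based name is the same and which differ only in `F`.

```rust
use std::any::type_name;
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hypothetical stand-in for poll_with_tls_context: generic over the future
// type and kept out of line, so each F gets its own copy of the function.
#[inline(never)]
fn poll_wrapper<F: Future>(
    fut: Pin<&mut F>,
    cx: &mut Context<'_>,
) -> (Poll<F::Output>, &'static str) {
    // type_name reports which F this particular instantiation was built for.
    (fut.poll(cx), type_name::<F>())
}

// Two unrelated async fns (hypothetical); each has its own anonymous future type.
async fn read_sensor() -> u32 {
    7
}

async fn blink_led() -> bool {
    true
}

// A waker that does nothing, built by hand so the sketch needs only std.
fn noop_waker() -> Waker {
    unsafe fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    unsafe fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    // Sound because every vtable entry ignores the (null) data pointer.
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// Polls both futures through the same generic wrapper and returns the
// type names seen by the two instantiations.
fn instantiation_names() -> (String, String) {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut a = Box::pin(read_sensor());
    let mut b = Box::pin(blink_led());
    let (_, name_a) = poll_wrapper(a.as_mut(), &mut cx);
    let (_, name_b) = poll_wrapper(b.as_mut(), &mut cx);
    (name_a.to_string(), name_b.to_string())
}
```

The two returned names differ, which is the information a disassembly listing that shows only the wrapper's path cannot convey.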

(For what it's worth, I can change the situation by forcing poll_with_tls_context to inline, though this produces unacceptable code bloat in my application, and the option isn't available to people who aren't open to using a patched libstd. By default, poll_with_tls_context doesn't inline, but get_task_context does, which seems like the right result for size/speed.)

I am compiling at opt-level = 3 with an override for debug = true in my release profile.

Member

@Mark-Simulacrum Mark-Simulacrum commented Oct 30, 2019

It seems like we might want the lowering to emit into debug info (or symbols? not sure if we can use symbols here, i.e., due to duplicates) in such a way that preserves the name of the original function. That seems like something we want, in general.

Contributor

@tmandry tmandry commented Oct 30, 2019

I think debuginfo should already include line-by-line information of code that was inlined. But when looking at objdump or a backtrace, it might not show up. We have to pick one function name which is going to be shown for every stack frame (or symbol).

In this case, there's one monomorphization per call site. Maybe we can add an attribute to poll_with_tls_context which makes the compiler emit a symbol name which includes the caller (sort of like it was inlined), but also append the name of the poll_with_tls_context function to prevent collisions.

Author

@cbiffle cbiffle commented Oct 31, 2019

If one were to change which symbol is recorded around poll_with_tls_context, it is the callee, not the caller, that's of interest -- the identity of the Future type it's been handed.

Contributor

@tmandry tmandry commented Oct 31, 2019

Ah, that's right. (I was confused by the fact that #[inline(always)] on poll_with_tls_context happens to do what you want. What you actually want is its secondary effect: the callee gets pulled out into its own function.)

In that case it's less clear to me how to solve this. poll_with_tls_context can (and does) call multiple functions, and more than one of them can be inlined into it. How would we decide which one has "naming rights" on the symbol?

From the outside, it would look like a caller got inlined into its callee, rather than the other way around.

We could make some annotation that says "when a bunch of functions are inlined into a symbol, don't use the symbol name of this function," and the compiler can pick the name of the first inlined function which doesn't contain this annotation, if any.

Another way is to gather all the functions inlined into one and pick the "most specific function name," i.e. the function that is a candidate for inlining the fewest times. This might work, but it might also have odd, unpredictable effects.

I'm not sure how easy these would be to implement.
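The two naming heuristics floated in this comment can be sketched as plain selection functions. This is only an illustration and not compiler code: the `InlinedFn` record, its fields, the `app::read_sensor` name, and the candidate counts are all assumptions invented for the example. The list is assumed to hold the symbol's own function first, followed by the functions inlined into it.

```rust
/// Hypothetical record for one function whose code ended up in a symbol.
#[derive(Debug, Clone)]
struct InlinedFn {
    name: &'static str,
    /// Stand-in for the proposed "don't name the symbol after me" annotation.
    skip_for_naming: bool,
    /// Number of places in the program where this function is a candidate
    /// for inlining (made-up figures in the example below).
    inline_candidates: usize,
}

/// Heuristic 1: take the first function that does not carry the annotation.
fn pick_first_unannotated(fns: &[InlinedFn]) -> Option<&'static str> {
    fns.iter().find(|f| !f.skip_for_naming).map(|f| f.name)
}

/// Heuristic 2: take the "most specific" function, i.e. the one that is a
/// candidate for inlining the fewest times. Ties go to the earliest entry.
fn pick_most_specific(fns: &[InlinedFn]) -> Option<&'static str> {
    fns.iter().min_by_key(|f| f.inline_candidates).map(|f| f.name)
}

/// Example contents of one symbol, with invented data.
fn example_symbol() -> Vec<InlinedFn> {
    vec![
        InlinedFn {
            name: "core::future::poll_with_tls_context",
            skip_for_naming: true,
            inline_candidates: 40,
        },
        InlinedFn {
            name: "core::future::get_task_context",
            skip_for_naming: true,
            inline_candidates: 40,
        },
        InlinedFn {
            name: "app::read_sensor::{{closure}}",
            skip_for_naming: false,
            inline_candidates: 1,
        },
    ]
}
```

On this invented data both heuristics choose the async fn's body. They can disagree when an unannotated helper that is inlined in many places appears earlier in the list, which is one form the "unpredictable effects" could take.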

Contributor

@nikomatsakis nikomatsakis commented Dec 3, 2019

Marking as triaged. @Centril notes that #66398 is relevant.

Contributor

@Aaron1011 Aaron1011 commented Dec 3, 2019

I think this is partially fixed by the new symbol mangling scheme (-Zsymbol-mangling-version=v0), which includes the instantiated parameters in the symbol name. In this case, each monomorphization of poll_with_tls_context should include the specific future type from the generic parameter F.
