Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fold identical method bodies in the compiler #101969

Merged
merged 9 commits into from
Jun 14, 2024

Conversation

MichalStrehovsky
Copy link
Member

@MichalStrehovsky MichalStrehovsky commented May 7, 2024

Saves up to 5.2% in file size per rt-sz measurements.

Adds a phase before object writing that looks for identical method bodies and deduplicates those that are same.

To keep delegate equivalence working, the compiler also distinguishes between references to symbols and references to symbols that have an observable address identity. When a method is folded that has an observable address identity, the references that require observable address identity go through a unique jump thunk. This means that delegates point to jump thunks and reflection mapping tables point to jump thunks (whenever a method body got folded into a different method body). We do not need the jump thunks for references that are not address exposed (so a call in a method body will no go through a jump thunk).

Since method body folding is still observable with stack trace APIs or debuggers, this is opt in. The user gets opted in by setting StackTraceSupport=false (or using an undocumented switch).

I took a shortcut in a couple places where references that may or may not be address exposed get treated as address exposed. There are TODO comments around those. We may want to fix tracking within the compiler to tighten this. It may not matter much. I also took a shortcut in deduplication - we currently only look at leaf identical method bodies. The method bodies that become identical after first level of folding currently don't get folded. This leaves a bit size on the table still. There's a TODO comment as well. We also don't consider function pointers address exposed since there's no API to compare these. That's also a TODO for whenever we add such API.

Rough expected size savings are at MichalStrehovsky/rt-sz#19 (comment), but this is a run where we were folding irrespective of StackTraceSupport. rt-sz cannot measure the impact of this PR in the current form since rt-sz cannot measure changes that change .targets.

Cc @dotnet/ilc-contrib

Copy link
Contributor

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

@MichalStrehovsky
Copy link
Member Author

/azp run runtime-nativeaot-outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Copy link
Member Author

/azp run runtime-nativeaot-outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Copy link
Member Author

/azp run runtime-nativeaot-outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Copy link
Member Author

@LakshanF, Andy volunteered you to be the reviewer for this. Could you have a look? We can go over things in a Teams call if you have questions.

@MichalStrehovsky
Copy link
Member Author

(It's not actually 900 lines, half of it are new tests.)

@filipnavara
Copy link
Member

Random data point: On my test case it reduces the size of the output by 9,6%, or 22 MB.

@LakshanF LakshanF self-requested a review June 11, 2024 15:11
Copy link
Member

@LakshanF LakshanF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! And it seems reasonable to revisit the non-leaf identical bodies after the first folding, and more granular checks at ExactMethodInstantiationsNode at later times with the TODO comments. Also, any perf hit via the jump thunk seems trivial and given the opt-in nature, any extra build time for this isn't a concern. Couple of very minor points (doesn't need to block merging for these)

  • Explore whether its worthwhile to exclude really small methods from folding, given the extra code for jump thunk needed for the delagate equivalence work
  • [nit, for consistency] The reflection tests in BodyFoldingTest doesn't have the equivalent sanity check

@MichalStrehovsky
Copy link
Member Author

Explore whether its worthwhile to exclude really small methods from folding, given the extra code for jump thunk needed for the delagate equivalence work

I now remember why I didn't do this. We want this folding to happen in general because it's more efficient. It only becomes less efficient if there is a jump thunk (should be the rare case). Design wise I couldn't see an easy way to change our mind about folding once we see a jump thunk. It's probably not worth complicating the code base. Easy fix would be to just not fold, but that leaves code size on the table to accommodate a rare case (when we need a jump thunk).

@MichalStrehovsky
Copy link
Member Author

/azp run runtime-nativeaot-outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Copy link
Member Author

/ba-g failures in x86 legs that don't test anything related to this PR. x86 legs were fixed in main.

@MichalStrehovsky MichalStrehovsky merged commit ae80f73 into dotnet:main Jun 14, 2024
84 of 90 checks passed
@MichalStrehovsky MichalStrehovsky deleted the folding branch June 14, 2024 07:11
@am11
Copy link
Member

am11 commented Jun 21, 2024

Available in 9.0.100-preview.6.24320.8. 5.6% reduction in webapiaot template with -p:IlcFoldIdenticalMethodBodies=true. 🚀

Since method body folding is still observable with stack trace APIs or debuggers. The user gets opted in by setting StackTraceSupport=false (or using an undocumented switch).

@MichalStrehovsky, is it going to be a permanent state or are there plans to enable it by default? e.g. when we find a folding candidate, save the mappings to give a close-enough experience (which we otherwise get from a g++ -g -O2 app-with-exceptions.cpp compiled apps).

@MichalStrehovsky
Copy link
Member Author

@MichalStrehovsky, is it going to be a permanent state or are there plans to enable it by default? e.g. when we find a folding candidate, save the mappings to give a close-enough experience

I was going back and forth on this. There are a couple options:

  • 1: Current state: enabled with StackTraceSupport=false, and there's an undocumented switch.
  • Add a new documented switch to control this separately:
    • 2a: Switch disabled by default (but still enabled by default if StackTraceSupport=false)
    • 2b: Switch enabled by default

1 is compatible and won't break anyone. 2a is also compatible, but adds a bit of mental overload because there's just more things to read about and think about. 2b would potentially be breaking. Not many people would be broken though, .NET Native for UWP did this by default and there wasn't even a formal opt out.

@dotnet/ilc-contrib anyone have thoughts?

@jkotas
Copy link
Member

jkotas commented Jun 25, 2024

I think that the current default is fine for how it is implemented currently.

  1. a) I do not see much value in documenting the fine-grained switch. Fine-grained switches are niche, people rarely use them.
  2. b) If we were to enable this optimization by default, I think we should hide the names of folded methods in the managed stacktraces to avoid confusion (ie make the method appear inlined).

An interesting middle-ground can be folding of different instantiation of the same generic method. We can use uninstantiated name for those without introducing confusion and reducing diagnosability. I am wondering how much of the size reduction we would get in this mode.

@MichalStrehovsky
Copy link
Member Author

I think that the current default is fine for how it is implemented currently.

Thank you! I've opened #103951 to track the approach with generics so that it's not lost.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants