Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trans: Only instantiate #[inline] functions in codegen units referencing them #36524

Merged
merged 4 commits into from Sep 21, 2016

Conversation

@michaelwoerister
Copy link
Contributor

michaelwoerister commented Sep 16, 2016

This PR changes how #[inline] functions are translated. Before, there was one "master instance" of the function with external linkage and a number of on-demand instances with available_externally linkage in each codegen unit that referenced the function. This had two downsides:

  • Public functions marked with #[inline] would be present in machine code of libraries unnecessarily (see #36280 for an example)
  • LLVM would crash on i686-pc-windows-msvc due to what I suspect to be a bug in LLVM's Win32 exception handling code, because it doesn't like available_externally there (#36309).

This PR changes the behavior, so that there is no master instance and only on-demand instances with internal linkage. The downside of this is potential code-bloat if LLVM does not completely inline away the internal instances because then there'd be N instances of the function instead of 1. However, this can only become a problem when using more than one codegen unit per crate.

cc @rust-lang/compiler

@rust-highfive
Copy link
Collaborator

rust-highfive commented Sep 16, 2016

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@retep998
Copy link
Member

retep998 commented Sep 16, 2016

I just tested this PR locally, and it appears to work. According to -Ztime-passes llvm passes has gone down by over an order of magnitude for winapi (328ms to 14ms). Translation went down also, but not as significantly (628ms to 433ms). The LLVM IR went down from 2,170KiB to 1KiB.

Here is the full LLVM IR for winapi:

; ModuleID = 'winapi.cgu-0.rs'
source_filename = "winapi.cgu-0.rs"
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc"

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4}

!0 = distinct !DICompileUnit(language: DW_LANG_Rust, file: !1, producer: "rustc version 1.13.0-dev (cf976fe2c 2016-09-15)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
!1 = !DIFile(filename: ".\5Csrc\5Clib.rs", directory: "C:\5Cmsys64\5Chome\5CPeter\5Cwinapi-rs")
!2 = !{}
!3 = !{i32 2, !"CodeView", i32 1}
!4 = !{i32 2, !"Debug Info Version", i32 3}

EDIT: Removed claim about inherent methods not working. I just forgot to recompile apparently after fixing a bug with a macro.

@pcwalton
Copy link
Contributor

pcwalton commented Sep 16, 2016

@retep998 Wow! 😲

@michaelwoerister
Copy link
Contributor Author

michaelwoerister commented Sep 16, 2016

@retep998 Phew! And I was already wondering why I couldn't reproduce your results :)

It's to be expected that translation still takes some time, since that also includes the time for exporting metadata.

@brson brson added the relnotes label Sep 16, 2016
// This is coincidental: We could also instantiate something only if it
// is referenced (e.g. a regular, private function) but place it in its
// own codegen unit with "external" linkage.
self.is_instantiated_only_on_demand(tcx)

This comment has been minimized.

Copy link
@nikomatsakis

nikomatsakis Sep 19, 2016

Contributor

Nit: it might be nice to put a few notes here as to why things need_local_copy

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Sep 19, 2016

@bors r+

@bors
Copy link
Contributor

bors commented Sep 19, 2016

📌 Commit cf976fe has been approved by nikomatsakis

@bors
Copy link
Contributor

bors commented Sep 19, 2016

Testing commit cf976fe with merge 4f363c9...

bors added a commit that referenced this pull request Sep 19, 2016
…r=nikomatsakis

trans: Only instantiate #[inline] functions in codegen units referencing them

This PR changes how `#[inline]` functions are translated. Before, there was one "master instance" of the function with `external` linkage and a number of on-demand instances with `available_externally` linkage in each codegen unit that referenced the function. This had two downsides:

* Public functions marked with `#[inline]` would be present in machine code of libraries unnecessarily (see #36280 for an example)
* LLVM would crash on `i686-pc-windows-msvc` due to what I suspect to be a bug in LLVM's Win32 exception handling code, because it doesn't like `available_externally` there (#36309).

This PR changes the behavior, so that there is no master instance and only on-demand instances with `internal` linkage. The downside of this is potential code-bloat if LLVM does not completely inline away the `internal` instances because then there'd be N instances of the function instead of 1. However, this can only become a problem when using more than one codegen unit per crate.

cc @rust-lang/compiler
@bors
Copy link
Contributor

bors commented Sep 20, 2016

💔 Test failed - auto-win-gnu-32-opt-rustbuild

@michaelwoerister
Copy link
Contributor Author

michaelwoerister commented Sep 20, 2016

@bors retry

Spurious? cc @alexcrichton

@bors
Copy link
Contributor

bors commented Sep 21, 2016

Testing commit cf976fe with merge 5cc6c6b...

bors added a commit that referenced this pull request Sep 21, 2016
…r=nikomatsakis

trans: Only instantiate #[inline] functions in codegen units referencing them

This PR changes how `#[inline]` functions are translated. Before, there was one "master instance" of the function with `external` linkage and a number of on-demand instances with `available_externally` linkage in each codegen unit that referenced the function. This had two downsides:

* Public functions marked with `#[inline]` would be present in machine code of libraries unnecessarily (see #36280 for an example)
* LLVM would crash on `i686-pc-windows-msvc` due to what I suspect to be a bug in LLVM's Win32 exception handling code, because it doesn't like `available_externally` there (#36309).

This PR changes the behavior, so that there is no master instance and only on-demand instances with `internal` linkage. The downside of this is potential code-bloat if LLVM does not completely inline away the `internal` instances because then there'd be N instances of the function instead of 1. However, this can only become a problem when using more than one codegen unit per crate.

cc @rust-lang/compiler
1 similar comment
@bors bors merged commit cf976fe into rust-lang:master Sep 21, 2016
2 checks passed
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

7 participants
You can’t perform that action at this time.