trans: Only instantiate #[inline] functions in codegen units referencing them #36524

Merged
merged 4 commits into from Sep 21, 2016

Conversation

Projects
None yet
7 participants
@michaelwoerister
Contributor

michaelwoerister commented Sep 16, 2016

This PR changes how #[inline] functions are translated. Before, there was one "master instance" of the function with external linkage and a number of on-demand instances with available_externally linkage in each codegen unit that referenced the function. This had two downsides:

  • Public functions marked with #[inline] would be present in machine code of libraries unnecessarily (see #36280 for an example)
  • LLVM would crash on i686-pc-windows-msvc due to what I suspect to be a bug in LLVM's Win32 exception handling code, because it doesn't like available_externally there (#36309).

This PR changes the behavior, so that there is no master instance and only on-demand instances with internal linkage. The downside of this is potential code-bloat if LLVM does not completely inline away the internal instances because then there'd be N instances of the function instead of 1. However, this can only become a problem when using more than one codegen unit per crate.

cc @rust-lang/compiler

@rust-highfive

This comment has been minimized.

Show comment
Hide comment
@rust-highfive

rust-highfive Sep 16, 2016

Collaborator

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

Collaborator

rust-highfive commented Sep 16, 2016

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

@retep998

This comment has been minimized.

Show comment
Hide comment
@retep998

retep998 Sep 16, 2016

Member

I just tested this PR locally, and it appears to work. According to -Ztime-passes llvm passes has gone down by over an order of magnitude for winapi (328ms to 14ms). Translation went down also, but not as significantly (628ms to 433ms). The LLVM IR went down from 2,170KiB to 1KiB.

Here is the full LLVM IR for winapi:

; ModuleID = 'winapi.cgu-0.rs'
source_filename = "winapi.cgu-0.rs"
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc"

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4}

!0 = distinct !DICompileUnit(language: DW_LANG_Rust, file: !1, producer: "rustc version 1.13.0-dev (cf976fe2c 2016-09-15)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
!1 = !DIFile(filename: ".\5Csrc\5Clib.rs", directory: "C:\5Cmsys64\5Chome\5CPeter\5Cwinapi-rs")
!2 = !{}
!3 = !{i32 2, !"CodeView", i32 1}
!4 = !{i32 2, !"Debug Info Version", i32 3}

EDIT: Removed claim about inherent methods not working. I just forgot to recompile apparently after fixing a bug with a macro.

Member

retep998 commented Sep 16, 2016

I just tested this PR locally, and it appears to work. According to -Ztime-passes llvm passes has gone down by over an order of magnitude for winapi (328ms to 14ms). Translation went down also, but not as significantly (628ms to 433ms). The LLVM IR went down from 2,170KiB to 1KiB.

Here is the full LLVM IR for winapi:

; ModuleID = 'winapi.cgu-0.rs'
source_filename = "winapi.cgu-0.rs"
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc"

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!3, !4}

!0 = distinct !DICompileUnit(language: DW_LANG_Rust, file: !1, producer: "rustc version 1.13.0-dev (cf976fe2c 2016-09-15)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
!1 = !DIFile(filename: ".\5Csrc\5Clib.rs", directory: "C:\5Cmsys64\5Chome\5CPeter\5Cwinapi-rs")
!2 = !{}
!3 = !{i32 2, !"CodeView", i32 1}
!4 = !{i32 2, !"Debug Info Version", i32 3}

EDIT: Removed claim about inherent methods not working. I just forgot to recompile apparently after fixing a bug with a macro.

@pcwalton

This comment has been minimized.

Show comment
Hide comment
@pcwalton

pcwalton Sep 16, 2016

Contributor

@retep998 Wow! 😲

Contributor

pcwalton commented Sep 16, 2016

@retep998 Wow! 😲

@michaelwoerister

This comment has been minimized.

Show comment
Hide comment
@michaelwoerister

michaelwoerister Sep 16, 2016

Contributor

@retep998 Phew! And I was already wondering why I couldn't reproduce your results :)

It's to be expected that translation still takes some time, since that also includes the time for exporting metadata.

Contributor

michaelwoerister commented Sep 16, 2016

@retep998 Phew! And I was already wondering why I couldn't reproduce your results :)

It's to be expected that translation still takes some time, since that also includes the time for exporting metadata.

@brson brson added the relnotes label Sep 16, 2016

+ // This is coincidental: We could also instantiate something only if it
+ // is referenced (e.g. a regular, private function) but place it in its
+ // own codegen unit with "external" linkage.
+ self.is_instantiated_only_on_demand(tcx)

This comment has been minimized.

@nikomatsakis

nikomatsakis Sep 19, 2016

Contributor

Nit: it might be nice to put a few notes here as to why things need_local_copy

@nikomatsakis

nikomatsakis Sep 19, 2016

Contributor

Nit: it might be nice to put a few notes here as to why things need_local_copy

@nikomatsakis

This comment has been minimized.

Show comment
Hide comment
Contributor

nikomatsakis commented Sep 19, 2016

@bors r+

@bors

This comment has been minimized.

Show comment
Hide comment
@bors

bors Sep 19, 2016

Contributor

📌 Commit cf976fe has been approved by nikomatsakis

Contributor

bors commented Sep 19, 2016

📌 Commit cf976fe has been approved by nikomatsakis

@bors

This comment has been minimized.

Show comment
Hide comment
@bors

bors Sep 19, 2016

Contributor

⌛️ Testing commit cf976fe with merge 4f363c9...

Contributor

bors commented Sep 19, 2016

⌛️ Testing commit cf976fe with merge 4f363c9...

bors added a commit that referenced this pull request Sep 19, 2016

Auto merge of #36524 - michaelwoerister:trans-inline-only-on-demand, …
…r=nikomatsakis

trans: Only instantiate #[inline] functions in codegen units referencing them

This PR changes how `#[inline]` functions are translated. Before, there was one "master instance" of the function with `external` linkage and a number of on-demand instances with `available_externally` linkage in each codegen unit that referenced the function. This had two downsides:

* Public functions marked with `#[inline]` would be present in machine code of libraries unnecessarily (see #36280 for an example)
* LLVM would crash on `i686-pc-windows-msvc` due to what I suspect to be a bug in LLVM's Win32 exception handling code, because it doesn't like `available_externally` there (#36309).

This PR changes the behavior, so that there is no master instance and only on-demand instances with `internal` linkage. The downside of this is potential code-bloat if LLVM does not completely inline away the `internal` instances because then there'd be N instances of the function instead of 1. However, this can only become a problem when using more than one codegen unit per crate.

cc @rust-lang/compiler
@bors

This comment has been minimized.

Show comment
Hide comment
@bors

bors Sep 20, 2016

Contributor

💔 Test failed - auto-win-gnu-32-opt-rustbuild

Contributor

bors commented Sep 20, 2016

💔 Test failed - auto-win-gnu-32-opt-rustbuild

@michaelwoerister

This comment has been minimized.

Show comment
Hide comment
@michaelwoerister

michaelwoerister Sep 20, 2016

Contributor

@bors retry

Spurious? cc @alexcrichton

Contributor

michaelwoerister commented Sep 20, 2016

@bors retry

Spurious? cc @alexcrichton

@bors

This comment has been minimized.

Show comment
Hide comment
@bors

bors Sep 21, 2016

Contributor

⌛️ Testing commit cf976fe with merge 5cc6c6b...

Contributor

bors commented Sep 21, 2016

⌛️ Testing commit cf976fe with merge 5cc6c6b...

bors added a commit that referenced this pull request Sep 21, 2016

Auto merge of #36524 - michaelwoerister:trans-inline-only-on-demand, …
…r=nikomatsakis

trans: Only instantiate #[inline] functions in codegen units referencing them

This PR changes how `#[inline]` functions are translated. Before, there was one "master instance" of the function with `external` linkage and a number of on-demand instances with `available_externally` linkage in each codegen unit that referenced the function. This had two downsides:

* Public functions marked with `#[inline]` would be present in machine code of libraries unnecessarily (see #36280 for an example)
* LLVM would crash on `i686-pc-windows-msvc` due to what I suspect to be a bug in LLVM's Win32 exception handling code, because it doesn't like `available_externally` there (#36309).

This PR changes the behavior, so that there is no master instance and only on-demand instances with `internal` linkage. The downside of this is potential code-bloat if LLVM does not completely inline away the `internal` instances because then there'd be N instances of the function instead of 1. However, this can only become a problem when using more than one codegen unit per crate.

cc @rust-lang/compiler

@bors bors merged commit cf976fe into rust-lang:master Sep 21, 2016

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details

@Boddlnagg Boddlnagg referenced this pull request in contextfree/winrt-rust Sep 22, 2016

Open

Improve compile time (or at least don't regress it) #26

@retep998 retep998 referenced this pull request in rust-lang/blog.rust-lang.org Sep 29, 2016

Merged

Rust 1.12 release announcement #126

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment