Store LLVM bitcode in object files, not compressed
This commit is an attempted resurrection of #70458 where LLVM bitcode
emitted by rustc into rlibs is stored into object file sections rather
than in a separate file. The main rationale is that rustc no longer
applies a custom compression scheme when it emits bitcode, which makes
it easier to interoperate with existing tools and also cuts down on
compile time, since the compression step no longer runs.

The blocker for this in #70458 turned out to be that native linkers
didn't handle the new sections well, causing the sections to either
trigger bugs in the linker or actually end up in the final linked
artifact. This commit attempts to address these issues by ensuring that
native linkers ignore the new sections by inserting custom flags with
module-level inline assembly.

Note that this does not currently change the API of the compiler at all.
The pre-existing `-C bitcode-in-rlib` flag is co-opted to indicate
whether the bitcode should be present in the object file or not.

Finally, note that an important consequence of this commit, which is also
one of its primary purposes, is to enable rustc's `-Clto` bitcode
loading to load rlibs produced with `-Clinker-plugin-lto`. The goal here
is that when you're building with LTO Cargo will tell rustc to skip
codegen of all intermediate crates and only generate LLVM IR. Today
rustc will generate both object code and LLVM IR, but the object code is
later simply thrown away, wastefully.
alexcrichton committed Apr 29, 2020
1 parent 413a129 commit ef89cc8
Showing 18 changed files with 189 additions and 266 deletions.
46 changes: 26 additions & 20 deletions src/doc/rustc/src/codegen-options/index.md
@@ -7,6 +7,32 @@ a version of this list for your exact compiler by running `rustc -C help`.

 This option is deprecated and does nothing.

+## bitcode-in-rlib
+
+This flag controls whether or not the compiler puts LLVM bitcode into generated
+rlibs. It takes one of the following values:
+
+* `y`, `yes`, `on`, or no value: put bitcode in rlibs (the default).
+* `n`, `no`, or `off`: omit bitcode from rlibs.
+
+LLVM bitcode is only needed when link-time optimization (LTO) is being
+performed, but it is enabled by default for backwards compatibility reasons.
+
+The use of `-C bitcode-in-rlib=no` can significantly improve compile times and
+reduce generated file sizes. For these reasons, Cargo uses `-C
+bitcode-in-rlib=no` whenever possible. Likewise, if you are building directly
+with `rustc` we recommend using `-C bitcode-in-rlib=no` whenever you are not
+using LTO.
+
+If combined with `-C lto`, `-C bitcode-in-rlib=no` will cause `rustc` to abort
+at start-up, because the combination is invalid.
+
+> **Note**: the implementation of this flag today is to enable the
+> `-Zembed-bitcode` option. When bitcode is embedded into an rlib then all
+> object files within the rlib will have a special section (typically named
+> `.llvmbc`, depends on the platform though) which contains LLVM bytecode. This
+> section of the object file will not appear in the final linked artifact.
+
 ## code-model

 This option lets you choose which code model to use.
@@ -387,26 +413,6 @@ This also supports the feature `+crt-static` and `-crt-static` to control
 Each target and [`target-cpu`](#target-cpu) has a default set of enabled
 features.

-## bitcode-in-rlib
-
-This flag controls whether or not the compiler puts compressed LLVM bitcode
-into generated rlibs. It takes one of the following values:
-
-* `y`, `yes`, `on`, or no value: put bitcode in rlibs (the default).
-* `n`, `no`, or `off`: omit bitcode from rlibs.
-
-LLVM bitcode is only needed when link-time optimization (LTO) is being
-performed, but it is enabled by default for backwards compatibility reasons.
-
-The use of `-C bitcode-in-rlib=no` can significantly improve compile times and
-reduce generated file sizes. For these reasons, Cargo uses `-C
-bitcode-in-rlib=no` whenever possible. Likewise, if you are building directly
-with `rustc` we recommend using `-C bitcode-in-rlib=no` whenever you are not
-using LTO.
-
-If combined with `-C lto`, `-C bitcode-in-rlib=no` will cause `rustc` to abort
-at start-up, because the combination is invalid.
-
 [option-emit]: ../command-line-arguments.md#option-emit
 [option-o-optimize]: ../command-line-arguments.md#option-o-optimize
 [profile-guided optimization]: ../profile-guided-optimization.md
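
The accepted values and the `-C lto` restriction described in the `bitcode-in-rlib` documentation above could be modelled roughly as follows. This is a minimal sketch and not rustc's actual option parser: `parse_bool_flag` and `check_lto_combination` are hypothetical names introduced only for illustration.

```rust
/// Hypothetical helper (not rustc's real parser): interprets the value of
/// `-C bitcode-in-rlib`. `None` means the flag was given without a value.
fn parse_bool_flag(value: Option<&str>) -> Option<bool> {
    match value {
        // `y`, `yes`, `on`, or no value: put bitcode in rlibs.
        None | Some("y") | Some("yes") | Some("on") => Some(true),
        // `n`, `no`, or `off`: omit bitcode from rlibs.
        Some("n") | Some("no") | Some("off") => Some(false),
        // Anything else is not a documented value.
        Some(_) => None,
    }
}

/// Hypothetical check for the documented rule that `-C lto` combined with
/// `-C bitcode-in-rlib=no` is an invalid combination.
fn check_lto_combination(lto: bool, bitcode_in_rlib: bool) -> Result<(), String> {
    if lto && !bitcode_in_rlib {
        Err("`-C lto` requires bitcode in rlibs".to_string())
    } else {
        Ok(())
    }
}

fn main() {
    let bitcode = parse_bool_flag(Some("no")).expect("documented value");
    match check_lto_combination(true, bitcode) {
        Ok(()) => println!("options accepted"),
        Err(msg) => println!("error: {}", msg),
    }
}
```

Returning `Option<bool>` keeps "unrecognised value" separate from "explicitly off", so a caller can report the two cases differently.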
6 changes: 3 additions & 3 deletions src/librustc_codegen_llvm/back/archive.rs
@@ -10,7 +10,7 @@ use std::str;
 use crate::llvm::archive_ro::{ArchiveRO, Child};
 use crate::llvm::{self, ArchiveKind};
 use rustc_codegen_ssa::back::archive::{find_library, ArchiveBuilder};
-use rustc_codegen_ssa::{looks_like_rust_object_file, METADATA_FILENAME, RLIB_BYTECODE_EXTENSION};
+use rustc_codegen_ssa::{looks_like_rust_object_file, METADATA_FILENAME};
 use rustc_session::Session;
 use rustc_span::symbol::Symbol;

@@ -129,8 +129,8 @@ impl<'a> ArchiveBuilder<'a> for LlvmArchiveBuilder<'a> {
         let obj_start = name.to_owned();

         self.add_archive(rlib, move |fname: &str| {
-            // Ignore bytecode/metadata files, no matter the name.
-            if fname.ends_with(RLIB_BYTECODE_EXTENSION) || fname == METADATA_FILENAME {
+            // Ignore metadata files, no matter the name.
+            if fname == METADATA_FILENAME {
                 return true;
             }

141 changes: 0 additions & 141 deletions src/librustc_codegen_llvm/back/bytecode.rs

This file was deleted.

48 changes: 32 additions & 16 deletions src/librustc_codegen_llvm/back/lto.rs
@@ -1,4 +1,3 @@
-use crate::back::bytecode::DecodedBytecode;
 use crate::back::write::{
     self, save_temp_bitcode, to_llvm_opt_settings, with_llvm_pmb, DiagnosticHandlers,
 };
@@ -10,7 +9,7 @@ use rustc_codegen_ssa::back::lto::{LtoModuleCodegen, SerializedModule, ThinModul
 use rustc_codegen_ssa::back::symbol_export;
 use rustc_codegen_ssa::back::write::{CodegenContext, FatLTOInput, ModuleConfig};
 use rustc_codegen_ssa::traits::*;
-use rustc_codegen_ssa::{ModuleCodegen, ModuleKind, RLIB_BYTECODE_EXTENSION};
+use rustc_codegen_ssa::{looks_like_rust_object_file, ModuleCodegen, ModuleKind};
 use rustc_data_structures::fx::{FxHashMap, FxHashSet};
 use rustc_errors::{FatalError, Handler};
 use rustc_hir::def_id::LOCAL_CRATE;
@@ -111,29 +110,46 @@ fn prepare_lto(
             }

             let archive = ArchiveRO::open(&path).expect("wanted an rlib");
-            let bytecodes = archive
+            let obj_files = archive
                 .iter()
                 .filter_map(|child| child.ok().and_then(|c| c.name().map(|name| (name, c))))
-                .filter(|&(name, _)| name.ends_with(RLIB_BYTECODE_EXTENSION));
-            for (name, data) in bytecodes {
-                let _timer =
-                    cgcx.prof.generic_activity_with_arg("LLVM_lto_load_upstream_bitcode", name);
-                info!("adding bytecode {}", name);
-                let bc_encoded = data.data();
-
-                let (bc, id) = match DecodedBytecode::new(bc_encoded) {
-                    Ok(b) => Ok((b.bytecode(), b.identifier().to_string())),
-                    Err(e) => Err(diag_handler.fatal(&e)),
-                }?;
-                let bc = SerializedModule::FromRlib(bc);
-                upstream_modules.push((bc, CString::new(id).unwrap()));
+                .filter(|&(name, _)| looks_like_rust_object_file(name));
+            for (name, child) in obj_files {
+                info!("adding bitcode from {}", name);
+                match get_bitcode_slice_from_object_data(child.data()) {
+                    Ok(data) => {
+                        let module = SerializedModule::FromRlib(data.to_vec());
+                        upstream_modules.push((module, CString::new(name).unwrap()));
+                    }
+                    Err(msg) => return Err(diag_handler.fatal(&msg)),
+                }
             }
         }
     }

     Ok((symbol_white_list, upstream_modules))
 }

+fn get_bitcode_slice_from_object_data(obj: &[u8]) -> Result<&[u8], String> {
+    let mut len = 0;
+    let data =
+        unsafe { llvm::LLVMRustGetBitcodeSliceFromObjectData(obj.as_ptr(), obj.len(), &mut len) };
+    if !data.is_null() {
+        assert!(len != 0);
+        let bc = unsafe { slice::from_raw_parts(data, len) };
+
+        // `bc` must be a sub-slice of `obj`.
+        assert!(obj.as_ptr() <= bc.as_ptr());
+        assert!(bc[bc.len()..bc.len()].as_ptr() <= obj[obj.len()..obj.len()].as_ptr());
+
+        Ok(bc)
+    } else {
+        assert!(len == 0);
+        let msg = llvm::last_error().unwrap_or_else(|| "unknown LLVM error".to_string());
+        Err(format!("failed to get bitcode from object file for LTO ({})", msg))
+    }
+}
+
 /// Performs fat LTO by merging all modules into a single one and returning it
 /// for further optimization.
 pub(crate) fn run_fat(
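
The two pointer assertions in `get_bitcode_slice_from_object_data` check that the slice handed back by LLVM lies inside the object file buffer. The same containment test can be sketched in safe Rust as below. This is an illustration only, not code from the commit: `is_subslice` is a hypothetical helper, and it uses the standard library's `as_ptr_range` rather than the empty-slice indexing the commit uses.

```rust
/// Hypothetical helper: returns true when `inner` lies entirely inside the
/// memory occupied by `outer`.
fn is_subslice(outer: &[u8], inner: &[u8]) -> bool {
    let outer = outer.as_ptr_range();
    let inner = inner.as_ptr_range();
    // The start must not precede the buffer and the end must not run past it.
    outer.start <= inner.start && inner.end <= outer.end
}

fn main() {
    // Stand-in for the bytes of an object file.
    let obj = [0u8, 1, 2, 3, 4, 5, 6, 7];
    // Stand-in for the embedded bitcode section within it.
    let bc = &obj[2..6];
    assert!(is_subslice(&obj, bc));

    // A separate allocation is not contained in `obj`.
    let other = [9u8; 4];
    assert!(!is_subslice(&obj, &other));
    println!("sub-slice checks passed");
}
```

Checking both ends matters because the returned pointer and length come from foreign code, so a start pointer inside the buffer does not by itself guarantee that the end is too.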
66 changes: 50 additions & 16 deletions src/librustc_codegen_llvm/back/write.rs
@@ -1,5 +1,4 @@
 use crate::attributes;
-use crate::back::bytecode;
 use crate::back::lto::ThinBuffer;
 use crate::back::profiling::{
     selfprofile_after_pass_callback, selfprofile_before_pass_callback, LlvmSelfProfiler,
@@ -16,7 +15,7 @@ use crate::ModuleLlvm;
 use log::debug;
 use rustc_codegen_ssa::back::write::{BitcodeSection, CodegenContext, EmitObj, ModuleConfig};
 use rustc_codegen_ssa::traits::*;
-use rustc_codegen_ssa::{CompiledModule, ModuleCodegen, RLIB_BYTECODE_EXTENSION};
+use rustc_codegen_ssa::{CompiledModule, ModuleCodegen};
 use rustc_data_structures::small_c_str::SmallCStr;
 use rustc_errors::{FatalError, Handler};
 use rustc_fs_util::{link_or_copy, path_to_c_string};
@@ -669,19 +668,6 @@ pub(crate) unsafe fn codegen(
                 );
                 embed_bitcode(cgcx, llcx, llmod, Some(data));
             }
-
-            if config.emit_bc_compressed {
-                let _timer = cgcx.prof.generic_activity_with_arg(
-                    "LLVM_module_codegen_emit_compressed_bitcode",
-                    &module.name[..],
-                );
-                let dst = bc_out.with_extension(RLIB_BYTECODE_EXTENSION);
-                let data = bytecode::encode(&module.name, data);
-                if let Err(e) = fs::write(&dst, data) {
-                    let msg = format!("failed to write bytecode to {}: {}", dst.display(), e);
-                    diag_handler.err(&msg);
-                }
-            }
         } else if config.emit_obj == EmitObj::ObjectCode(BitcodeSection::Marker) {
             embed_bitcode(cgcx, llcx, llmod, None);
         }
@@ -792,7 +778,6 @@ pub(crate) unsafe fn codegen(
     Ok(module.into_compiled_module(
         config.emit_obj != EmitObj::None,
         config.emit_bc,
-        config.emit_bc_compressed,
         &cgcx.output_filenames,
     ))
 }
@@ -847,6 +832,55 @@ unsafe fn embed_bitcode(
     let section = if is_apple { "__LLVM,__cmdline\0" } else { ".llvmcmd\0" };
     llvm::LLVMSetSection(llglobal, section.as_ptr().cast());
     llvm::LLVMRustSetLinkage(llglobal, llvm::Linkage::PrivateLinkage);
+
+    // We're adding custom sections to the output object file, but we definitely
+    // do not want these custom sections to make their way into the final linked
+    // executable. The purpose of these custom sections is for tooling
+    // surrounding object files to work with the LLVM IR, if necessary. For
+    // example rustc's own LTO will look for LLVM IR inside of the object file
+    // in these sections by default.
+    //
+    // To handle this is a bit different depending on the object file format
+    // used by the backend, broken down into a few different categories:
+    //
+    // * Mach-O - this is for macOS. Inspecting the source code for the native
+    //   linker here shows that the `.llvmbc` and `.llvmcmd` sections are
+    //   automatically skipped by the linker. In that case there's nothing extra
+    //   that we need to do here.
+    //
+    // * Wasm - the native LLD linker is hard-coded to skip `.llvmbc` and
+    //   `.llvmcmd` sections, so there's nothing extra we need to do.
+    //
+    // * COFF - if we don't do anything the linker will by default copy all
+    //   these sections to the output artifact, not what we want! To subvert
+    //   this we want to flag the sections we inserted here as
+    //   `IMAGE_SCN_LNK_REMOVE`. Unfortunately though LLVM has no native way to
+    //   do this. Thankfully though we can do this with some inline assembly,
+    //   which is easy enough to add via module-level global inline asm.
+    //
+    // * ELF - this is very similar to COFF above. One difference is that these
+    //   sections are removed from the output linked artifact when
+    //   `--gc-sections` is passed, which we pass by default. If that flag isn't
+    //   passed though then these sections will show up in the final output.
+    //   Additionally the flag that we need to set here is `SHF_EXCLUDE`.
+    if is_apple
+        || cgcx.opts.target_triple.triple().starts_with("wasm")
+        || cgcx.opts.target_triple.triple().starts_with("asmjs")
+    {
+        // nothing to do here
+    } else if cgcx.opts.target_triple.triple().contains("windows") {
+        let asm = "
+            .section .llvmbc,\"n\"
+            .section .llvmcmd,\"n\"
+        ";
+        llvm::LLVMRustAppendModuleInlineAsm(llmod, asm.as_ptr().cast(), asm.len());
+    } else {
+        let asm = "
+            .section .llvmbc,\"e\"
+            .section .llvmcmd,\"e\"
+        ";
+        llvm::LLVMRustAppendModuleInlineAsm(llmod, asm.as_ptr().cast(), asm.len());
+    }
 }

 pub unsafe fn with_llvm_pmb(
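
The per-format decision made at the end of `embed_bitcode` can be isolated into a small pure function, which makes the mapping from target triple to section flag easier to see and to test. The sketch below is not part of the commit. `section_flag` and `exclusion_asm` are hypothetical names, and `is_apple` is taken as a parameter because its computation is outside the hunk shown here.

```rust
/// Hypothetical helper: picks the assembler section flag that makes the
/// linker drop the bitcode sections, or `None` when the linker already
/// skips them (Mach-O, Wasm, asm.js in the commit's logic).
fn section_flag(triple: &str, is_apple: bool) -> Option<char> {
    if is_apple || triple.starts_with("wasm") || triple.starts_with("asmjs") {
        None
    } else if triple.contains("windows") {
        // COFF: the commit uses "n" to request IMAGE_SCN_LNK_REMOVE.
        Some('n')
    } else {
        // ELF: the commit uses "e" to request SHF_EXCLUDE.
        Some('e')
    }
}

/// Hypothetical helper: builds the module-level inline assembly that
/// re-declares both sections with the chosen flag.
fn exclusion_asm(triple: &str, is_apple: bool) -> Option<String> {
    section_flag(triple, is_apple).map(|flag| {
        format!(".section .llvmbc,\"{0}\"\n.section .llvmcmd,\"{0}\"\n", flag)
    })
}

fn main() {
    for triple in ["x86_64-unknown-linux-gnu", "x86_64-pc-windows-msvc", "wasm32-unknown-unknown"] {
        match exclusion_asm(triple, false) {
            Some(asm) => println!("{}:\n{}", triple, asm),
            None => println!("{}: nothing to emit", triple),
        }
    }
}
```

Keeping the choice separate from the LLVM call means the triple matching can be tested without constructing a module.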