Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustdoc crashing on linux-sparc64 due to unaligned access #56927

Closed
glaubitz opened this issue Dec 17, 2018 · 15 comments
Closed

rustdoc crashing on linux-sparc64 due to unaligned access #56927

glaubitz opened this issue Dec 17, 2018 · 15 comments
Assignees
Labels
P-high High priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@glaubitz
Copy link
Contributor

Since version 1.31.0, rustdoc crashes when running the test src/test/run-make-fulldeps/rustdoc-error-lines:

(sid_sparc64-dchroot)glaubitz@kyoto:~/rust$ cd src/test/run-make-fulldeps/rustdoc-error-lines
(sid_sparc64-dchroot)glaubitz@kyoto:~/rust/src/test/run-make-fulldeps/rustdoc-error-lines$ LD_LIBRARY_PATH="/home/glaubitz/rust/build/sparc64-unknown-linux-gnu/test/run-make-fulldeps/rustdoc-error-lines/rustdoc-error-lines:/home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/lib:/home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage0-bootstrap-tools/sparc64-unknown-linux-gnu/release/deps:/usr/lib:" '/home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/bin/rustdoc' -L /home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/lib/rustlib/sparc64-unknown-linux-gnu/lib --test input.rs
Bus error
(sid_sparc64-dchroot)glaubitz@kyoto:~/rust/src/test/run-make-fulldeps/rustdoc-error-lines$
(sid_sparc64-dchroot)glaubitz@kyoto:~/rust/src/test/run-make-fulldeps/rustdoc-error-lines$ LD_LIBRARY_PATH="/home/glaubitz/rust/build/sparc64-unknown-linux-gnu/test/run-make-fulldeps/rustdoc-error-lines/rustdoc-error-lines:/home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/lib:/home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage0-bootstrap-tools/sparc64-unknown-linux-gnu/release/deps:/usr/lib:" '/home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/bin/rustdoc' -L /home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/lib/rustlib/sparc64-unknown-linux-gnu/lib --test input.rs
Bus error
(sid_sparc64-dchroot)glaubitz@kyoto:~/rust/src/test/run-make-fulldeps/rustdoc-error-lines$ gdb /home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/bin/rustdoc       GNU gdb (Debian 8.2-1) 8.2                                                                                                                                                  Copyright (C) 2018 Free Software Foundation, Inc.                                                                                                                           License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "sparc64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/bin/rustdoc...(no debugging symbols found)...done.
(gdb) set environment LD_LIBRARY_PATH "/home/glaubitz/rust/build/sparc64-unknown-linux-gnu/test/run-make-fulldeps/rustdoc-error-lines/rustdoc-error-lines:/home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/lib:/home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage0-bootstrap-tools/sparc64-unknown-linux-gnu/release/deps:/usr/lib:"
(gdb) r -L /home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/lib/rustlib/sparc64-unknown-linux-gnu/lib --test input.rs
Starting program: /home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/bin/rustdoc -L /home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/lib/rustlib/sparc64-unknown-linux-gnu/lib --test input.rs
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc64-linux-gnu/libthread_db.so.1".
[New Thread 0xffff80010793e100 (LWP 16987)]

Thread 2 "rustdoc" received signal SIGBUS, Bus error.
[Switching to Thread 0xffff80010793e100 (LWP 16987)]
0x000001000038a48c in pulldown_cmark::parse::RawParser::new_with_links_and_callback ()
(gdb) bt
#0  0x000001000038a48c in pulldown_cmark::parse::RawParser::new_with_links_and_callback ()
#1  0x00000100003855c0 in pulldown_cmark::passes::Parser::new_with_broken_link_callback ()
#2  0x000001000038546c in pulldown_cmark::passes::Parser::new ()
#3  0x000001000014a8a4 in rustdoc::html::markdown::find_testable_code ()
#4  0x00000100003357a4 in <rustdoc::test::HirCollector<'a, 'hir> as rustc::hir::intravisit::Visitor<'hir>>::visit_item ()
#5  0x00000100002e2178 in <scoped_tls::ScopedKey<T>>::set ()
#6  0x00000100003304d4 in rustdoc::test::run ()
#7  0x000001000009b3fc in rustdoc::main_args ()
#8  0x00000100002e0b68 in <scoped_tls::ScopedKey<T>>::set ()
#9  0x0000010000082ae4 in syntax::with_globals ()
#10 0x000001000015b600 in std::sys_common::backtrace::__rust_begin_short_backtrace ()
#11 0x000001000015bf40 in std::panicking::try::do_call ()
#12 0xffff8001044fdf58 in __rust_maybe_catch_panic () from /home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/lib/libstd-3b39188da0a8cce7.so
#13 0x0000010000108be4 in <F as alloc::boxed::FnBox<A>>::call_box ()
#14 0xffff8001044eddc8 in std::sys_common::thread::start_thread () from /home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/lib/libstd-3b39188da0a8cce7.so
#15 0xffff8001044b6300 in std::sys::unix::thread::Thread::new::thread_start ()
   from /home/glaubitz/rust/build/sparc64-unknown-linux-gnu/stage2/lib/libstd-3b39188da0a8cce7.so
#16 0xffff80010478fe68 in start_thread () from /lib/sparc64-linux-gnu/libpthread.so.0
#17 0xffff800104aa8274 in ?? () from /lib/sparc64-linux-gnu/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

I have not worked out yet whether this is a regression in the Rust compiler itself such that it generates code with unaligned access (we had this before) or whether rustdoc itself contains broken code.

CC @jrtc27 @psumbera @karcherm

@glaubitz
Copy link
Contributor Author

glaubitz commented Dec 17, 2018

Looking at the build log (https://buildd.debian.org/status/fetch.php?pkg=rustc&arch=sparc64&ver=1.31.0%2Bdfsg1-2&stamp=1545047706&raw=0), it looks like it's affecting rustdoc only. And it looks like an issue in the crate https://github.com/raphlinus/pulldown-cmark.

@nagisa
Copy link
Member

nagisa commented Dec 18, 2018

Would it be possible to disassemble the function in question? Does the failure occur with the official builds? If so, bisecting with cargo-bisect-rustc might help to quickly narrow down the cause.

I would do it myself, but I’ve been unsuccessful in setting up sparc VMs (installed debian fails to boot, and I haven’t gotten around to dealing with it yet).


pulldown-cmark itself has no unsafe code so the bug is definitely in either:

  1. The compiler;
  2. Libraries;
  3. Dependencies (but there aren’t many of them…)

@nagisa nagisa added regression-from-stable-to-stable Performance or correctness regression from one stable version to another. I-nominated T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 18, 2018
@glaubitz
Copy link
Contributor Author

glaubitz commented Dec 18, 2018

I am currently bisecting this, I'm almost there :).

As for SPARC machines, we have set up a SPARC T5 VM in the gcc compile farm which every developer can use who applies for a free account at the gcc compile farm, see: https://gcc.gnu.org/wiki/CompileFarm There are both SPARC machines running Linux as well as Solaris available.

@glaubitz
Copy link
Contributor Author

Found the commit which introduced the bug:

06b8c3ee5b52dc849259655578afca70cb32dda3 is the first bad commit
commit 06b8c3ee5b52dc849259655578afca70cb32dda3
Author: Colin Pronovost <colin.pron@live.com>
Date:   Mon Sep 24 23:09:44 2018 -0400

    Rely only on base alignment and offset for computing field alignment
    
    Fix #54028

:040000 040000 b5af5c83972da8e9ff9f10e923608400e82f2964 439ea89453a897b6c984eb58b7cd43836b299e3e M      src

@nagisa
Copy link
Member

nagisa commented Dec 19, 2018

This smellls similar to #56267, although the fixes for that particular problem have been backported and are available in stable and the lint pass does not complain about pulldown_cmark in any way...

@pnkfelix
Copy link
Member

discussed at T-compiler meeting. P-high, since this may be symptom of broader UB beyond the case outlined here.

@pnkfelix pnkfelix added P-high High priority and removed I-nominated labels Dec 20, 2018
@nagisa
Copy link
Member

nagisa commented Dec 20, 2018

cc @eddyb do you have any idea in what way could the bisected PR have broken things related to alignment?

The relevant PR here is #54547

@infinity0
Copy link
Contributor

Since you guys don't run tests on sparc64, is there a way to write a test for this that crashes on x86-64? I originally told @glaubitz to ignore the sparc64 failures but it looks like it was good that he chased it down!

@nagisa
Copy link
Member

nagisa commented Dec 20, 2018

Since you guys don't run tests on sparc64, is there a way to write a test for this that crashes on x86-64?

We have codegen tests which do not depend on test being run (checks the generated LLVM-IR/Assembly/etc), and so can be run on any host.

@jrtc27
Copy link

jrtc27 commented Dec 21, 2018

Ok, here's a reduced test case that shows the incorrect output produced by rustc:

pub struct RawParser {
    // Unused but necessary; only reproduces when active_tab isn't the
    // first field.
    off: usize,
    active_tab: [u8; 256],
}

impl RawParser {
    pub fn init_active(&mut self) {
        for &c in b"\x00\t\n\r_\\&*[!`<" {
            self.active_tab[c as usize] = 1;
        }
    }
}

This then generates the following IR (you need -O to unroll the loop):

; lib::RawParser:: init_active
; Function Attrs: nounwind uwtable
define void @_ZN3lib9RawParser11init_active17hda1bc094c972134fE(%RawParser* nocapture dereferenceable(264) %self) unnamed_addr #0 {
start:
  %0 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 0
  store i8 1, i8* %0, align 8
  %1 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 9
  store i8 1, i8* %1, align 8
  %2 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 10
  store i8 1, i8* %2, align 8
  %3 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 13
  store i8 1, i8* %3, align 8
  %4 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 95
  store i8 1, i8* %4, align 8
  %5 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 92
  store i8 1, i8* %5, align 8
  %6 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 38
  store i8 1, i8* %6, align 8
  %7 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 42
  store i8 1, i8* %7, align 8
  %8 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 91
  store i8 1, i8* %8, align 8
  %9 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 33
  store i8 1, i8* %9, align 8
  %10 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 96
  store i8 1, i8* %10, align 8
  %11 = getelementptr inbounds %RawParser, %RawParser* %self, i64 0, i32 3, i64 60
  store i8 1, i8* %11, align 8
  ret void
}

These elements are obviously not all 8-byte aligned, but the result is that LLVM combines adjacent stores, specifically 9/10, 91/92 and 95/96, into store i16 257, i16* ... (well, actually it seems to happen at the first instruction selection step). We're unlucky and it turns out none of these are 2-byte aligned as each of them has an odd first half of the pair.

I believe the problem is in project_index, which does:

    pub fn project_index<Bx: BuilderMethods<'a, 'tcx, Value = V>>(
        &self,
        bx: &mut Bx,
        llindex: V
    ) -> Self {
        PlaceRef {
            llval: bx.inbounds_gep(self.llval, &[bx.cx().const_usize(0), llindex]),
            llextra: None,
            layout: self.layout.field(bx.cx(), 0),
            align: self.align
        }
    }

Specifically, the array itself is aligned to 8 bytes and so the mentioned PR is correct to not force the array to be assumed aligned only to 1 byte (as was happening before with the extra mins), but it has exposed this flaw in project_index. Specifically, for constant/immediate llindex, it can do self.align.restrict_for_offset(index * element_size), but in general if llindex is an arbitrary non-constant expression it needs to be self.align.restrict_for_offset(element_size), i.e. assuming the worst-case of index == 1.

Finally, here's a more minimal test case that doesn't try to look like the code from pulldown-cmark:

pub struct S {
    pad: usize,
    arr: [u8; 3],
}

impl S {
    pub fn f(&mut self) {
        self.arr[1] = 0;
        self.arr[2] = 0;
    }
}

This gives (still with -O, otherwise LLVM won't merge the loads):

; min::S::f
; Function Attrs: norecurse nounwind uwtable
define void @_ZN3min1S1f17h1ac21da989830415E(%S* nocapture dereferenceable(16) %self) unnamed_addr #0 {
start:
  %0 = getelementptr inbounds %S, %S* %self, i64 0, i32 3, i64 1
  store i8 0, i8* %0, align 8
  %1 = getelementptr inbounds %S, %S* %self, i64 0, i32 3, i64 2
  store i8 0, i8* %1, align 8
  ret void
}

@nikic
Copy link
Contributor

nikic commented Dec 21, 2018

Ugh, we already fixed two very similar issues, and this is the third. Reviewing other code that generates GEPs, we also have the same issue with Repeat handling:

#[repr(align(64))]
pub struct Foo([i32; 16]);

pub fn test(x: &mut Foo) {
    x.0 = [17; 16];
}

Generates:

define void @_ZN6test184test17hdf9fe1a59f5e8f01E(%Foo* dereferenceable(64) %x) unnamed_addr #0 {
start:
  %0 = bitcast %Foo* %x to [16 x i32]*
  %1 = getelementptr inbounds [16 x i32], [16 x i32]* %0, i64 0, i64 0
  %2 = getelementptr inbounds [16 x i32], [16 x i32]* %0, i64 0, i64 16
  br label %repeat_loop_header

repeat_loop_header:                               ; preds = %repeat_loop_body, %start
  %3 = phi i32* [ %1, %start ], [ %5, %repeat_loop_body ]
  %4 = icmp ne i32* %3, %2
  br i1 %4, label %repeat_loop_body, label %repeat_loop_next

repeat_loop_body:                                 ; preds = %repeat_loop_header
  store i32 17, i32* %3, align 64
  %5 = getelementptr inbounds i32, i32* %3, i64 1
  br label %repeat_loop_header

repeat_loop_next:                                 ; preds = %repeat_loop_header
  ret void
}

Notably, that store is not going to be 64 byte aligned for all array indexes...

@jrtc27
Copy link

jrtc27 commented Dec 21, 2018

I think something like this is what you want:

    pub fn project_index<Bx: BuilderMethods<'a, 'tcx, Value = V>>(
        &self,
        bx: &mut Bx,
        llindex: V
    ) -> Self {
        let index = if bx.cx().is_const_integral(llindex) { bx.cx().const_to_uint(llindex) } else { 1 };
        let layout = self.layout.field(bx.cx(), 0);
        PlaceRef {
            llval: bx.inbounds_gep(self.llval, &[bx.cx().const_usize(0), llindex]),
            llextra: None,
            layout: layout,
            align: self.align.restrict_for_offset(index * layout.size)
        }
    }

@nikic
Copy link
Contributor

nikic commented Dec 21, 2018

I've submitted #57053 to fix both issues.

Centril added a commit to Centril/rust that referenced this issue Dec 22, 2018
Fix alignment for array indexing

We need to reduce the alignment with the used offset. If the offset isn't known, use the element size, as this will yield the minimum possible alignment.

This handles both direct array indexing, and array repeat expressions.

Fixes rust-lang#56927.

r? @nagisa
Centril added a commit to Centril/rust that referenced this issue Dec 23, 2018
Fix alignment for array indexing

We need to reduce the alignment with the used offset. If the offset isn't known, use the element size, as this will yield the minimum possible alignment.

This handles both direct array indexing, and array repeat expressions.

Fixes rust-lang#56927.

r? @nagisa
Centril added a commit to Centril/rust that referenced this issue Dec 23, 2018
Fix alignment for array indexing

We need to reduce the alignment with the used offset. If the offset isn't known, use the element size, as this will yield the minimum possible alignment.

This handles both direct array indexing, and array repeat expressions.

Fixes rust-lang#56927.

r? @nagisa
Centril added a commit to Centril/rust that referenced this issue Dec 23, 2018
Fix alignment for array indexing

We need to reduce the alignment with the used offset. If the offset isn't known, use the element size, as this will yield the minimum possible alignment.

This handles both direct array indexing, and array repeat expressions.

Fixes rust-lang#56927.

r? @nagisa
@nikic
Copy link
Contributor

nikic commented Dec 24, 2018

Reopening to track beta backport.

@nikic nikic reopened this Dec 24, 2018
@nikic
Copy link
Contributor

nikic commented Jan 4, 2019

Beta backport landed as part of #57305.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P-high High priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants