Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upSegfault in safe Rust when compiling with optimisations #30081
Comments
This comment has been minimized.
This comment has been minimized.
|
Reproduction without box patterns: #[derive(Debug, Clone)]
pub enum Instruction {
Increment {
amount: ::std::num::Wrapping<i8>,
},
Loop(Vec<Box<Instruction>>),
}
/// Defines a method on iterators to map a function over all loop bodies.
trait MapLoopsExt: Iterator<Item=Box<Instruction>> {
fn map_loops<F>(&mut self, f: F) -> Vec<Box<Instruction>>
where F: Fn(Vec<Box<Instruction>>) -> Vec<Box<Instruction>>
{
self.map(|instr| {
match *instr {
Instruction::Loop(body) => Box::new( Instruction::Loop(f(body)) ),
other => Box::new(other)
}
}).collect()
}
}
impl<I> MapLoopsExt for I where I: Iterator<Item=Box<Instruction>> { }
/// Remove any loops where we know the current cell is zero.
pub fn remove_dead_loops(instrs: Vec<Box<Instruction>>) -> Vec<Box<Instruction>> {
instrs.clone()
.into_iter()
.enumerate()
.map(|(_, instr)| instr)
.map_loops(remove_dead_loops)
}
fn main() {
let mut instrs = vec![];
instrs = remove_dead_loops(instrs);
println!("{:?}", instrs);
}Running in valgrind (
|
This comment has been minimized.
This comment has been minimized.
|
Two changes that have eliminated the crash:
self.map(|mut instr| {
if let Instruction::Loop(ref mut body) = *instr {
*body = f( ::std::mem::replace(body, Vec::new()) );
}
instr
}).collect()
|
This comment has been minimized.
This comment has been minimized.
|
A little simpler: pub enum Instruction {
Increment(i8),
Loop(Vec<Box<Instruction>>),
}
/// Defines a method on iterators to map a function over all loop bodies.
fn map_loops<F, I>(it: I, f: F) -> Vec<Box<Instruction>>
where F: Fn(Vec<Box<Instruction>>) -> Vec<Box<Instruction>>,
I: Iterator<Item=Box<Instruction>>
{
it.map(|instr| {
match *instr {
Instruction::Loop(body) => Box::new( Instruction::Loop(f(body)) ),
other => Box::new(other)
}
}).collect()
}
/// Remove any loops where we know the current cell is zero.
pub fn remove_dead_loops(instrs: Vec<Box<Instruction>>) -> Vec<Box<Instruction>> {
map_loops(instrs.into_iter()
.enumerate()
.map(|(_, instr)| instr), remove_dead_loops)
}
fn main() {
let mut instrs = vec![];
instrs = remove_dead_loops(instrs);
println!("{:?}", instrs.len());
} |
This comment has been minimized.
This comment has been minimized.
pub enum Instruction {
Increment(i8),
Loop(Box<Box<()>>),
}
fn main() {
let instrs: Option<(u8, Box<Instruction>)> = None;
instrs.into_iter()
.map(|(_, instr)| instr)
.map(|instr| match *instr { _other => {} })
.last();
} |
huonw
added
the
I-crash
label
Nov 27, 2015
This comment has been minimized.
This comment has been minimized.
|
@mitaa - That code leads to main being just
|
Manishearth
added
I-nominated
A-iterators
T-compiler
I-unsound 💥
labels
Nov 28, 2015
This comment has been minimized.
This comment has been minimized.
|
cc @dotdash |
arielb1
added
I-wrong
and removed
I-unsound 💥
labels
Nov 29, 2015
This comment has been minimized.
This comment has been minimized.
|
@mitaa's example doesn't lead directly to |
This comment has been minimized.
This comment has been minimized.
|
It's our null check elimination pass. At some point, the optimizer replaces something like %cmp = icmp eq i8* %ptr, null
%thing = getelementptr i8, i8* %ptr, i64 0
%ptr = select i1 %cmp, i8* null, i8* %thing
%cond = icmp eq i8* %ptr, null
br i1 %cond, label %none, label %somewith just %ptr = getelementptr inbounds i8, i8* %ptr, i64 0
%cond = icmp eq i8* %ptr, null
br i1 %cond, label %none, label %someAnd then the null check elimination pass turns that into %ptr = getelementptr inbounds i8, i8* %ptr, i64 0
%cond = icmp eq i8* %ptr, null
br i1 false, label %none, label %some |
This comment has been minimized.
This comment has been minimized.
|
Interesting! I'd personally be ok just removing that pass from our LLVM build until we can get around to fixing this. |
This comment has been minimized.
This comment has been minimized.
|
Same for me. I wonder how much the pass even does for us now. |
This comment has been minimized.
This comment has been minimized.
|
As I'm too lazy to perform a detailed analysis of the changes, here's a plain size comparison: With the pass enabled:
With the pass disabled:
Which suggests that the effects of the pass are pretty minor. It was probably worth more back when it was introduced because we had fewer means of marking things as non-null in LLVM. Now that we're using thing like So I almost feel inclined to just drop the pass completely instead of just disabling it. Opinions? |
dotdash
added a commit
to dotdash/rust
that referenced
this issue
Dec 2, 2015
This comment has been minimized.
This comment has been minimized.
|
This happens on the beta channel too. |
brson
added a commit
to brson/rust
that referenced
this issue
Dec 2, 2015
bors
added a commit
that referenced
this issue
Dec 2, 2015
bors
closed this
in
#30165
Dec 3, 2015
This comment has been minimized.
This comment has been minimized.
|
@dotdash both transformations look legitimate to me: How could I try to reproduce your results? Is it sufficient to compile without optimisation and to run them manually with |
This comment has been minimized.
This comment has been minimized.
|
@ranma42 Yes, that should work to reproduce it, if you use I'm not sure where you're seeing a dereference there though. In the original IR, the deref only happens when the pointer is non-null. There's a slight mistake in the IR samples above, because I reused the name %cmp = icmp eq i8* %ptr, null
%thing = getelementptr inbounds i8, i8* %ptr, i64 0
%ptr2 = select i1 %cmp, i8* null, i8* %thing
%cond = icmp eq i8* %ptr2, null
br i1 %cond, label %none, label %someHere, Now, after the first transformation we have: %ptr2 = getelementptr inbounds i8, i8* %ptr, i64 0
%cond = icmp eq i8* %ptr2, null
br i1 %cond, label %none, label %someAnd that happened because it assumed that |
This comment has been minimized.
This comment has been minimized.
|
Sorry, you're right, there is no dereference; also, with the full names it is definitely easier to read :) I am not completely sure about this, but I am afraid it might be the expected behaviour of LLVM poison values :\ In the first code, if %cmp = false
%thing = %ptr
%ptr2 = %ptr
%cond = false
br label %someIf %cmp = true
%thing = poison
%ptr2 = poison ; "Values other than phi nodes depend on their operands." :\
%cond = poison
br i1 %cond, label %none, label %someIn particular, I am unsure whether " |
Wilfred commentedNov 27, 2015
Compiling and running it: