Skip to content
This repository has been archived by the owner on Sep 2, 2018. It is now read-only.

Assertion failed: SrcReg and DstReg cannot be the same #229

Closed
shepmaster opened this issue Oct 9, 2016 · 19 comments
Closed

Assertion failed: SrcReg and DstReg cannot be the same #229

shepmaster opened this issue Oct 9, 2016 · 19 comments

Comments

@shepmaster
Copy link

../debug/bin/llc -march=avr -mcpu=atmega328p -filetype=obj < bugpoint-reduced-simplified.ll
Assertion failed: (DstReg != SrcReg && "SrcReg and DstReg cannot be the same"), function expand, file /Users/shep/Projects/avr-llvm/lib/Target/AVR/AVRExpandPseudoInsts.cpp, line 817.
Stack dump:
0.  Program arguments: ../debug/bin/llc -march=avr -mcpu=atmega328p -filetype=obj
1.  Running pass 'Function Pass Manager' on module '<stdin>'.
2.  Running pass 'AVR pseudo instruction expansion pass' on function '@__udivmodsi4'
Abort trap: 6

testcase:

; ModuleID = 'bugpoint-reduced-simplified.bc'
source_filename = "bugpoint-output-595719d.bc"
target datalayout = "e-p:16:8:8-i8:8:8-i16:8:8-i32:8:8-i64:8:8-f32:8:8-f64:8:8-n8"
target triple = "avr-atmel-none"

; Function Attrs: norecurse nounwind readnone uwtable
declare void @rust_eh_personality() unnamed_addr #0

; Function Attrs: nounwind uwtable
define void @__udivmoddi4(i64, i64) unnamed_addr #1 personality i32 (...)* bitcast (void ()* @rust_eh_personality to i32 (...)*) {
entry-block:
  %2 = lshr i64 %0, 32
  %3 = trunc i64 %2 to i32
  %4 = trunc i64 %0 to i32
  %5 = add i64 %1, -1
  br label %bb135

bb133.loopexit:                                   ; preds = %bb135
  ret void

bb135:                                            ; preds = %bb135, %entry-block
  %carry.0120 = phi i64 [ 0, %entry-block ], [ %phitmp, %bb135 ]
  %q.sroa.12.1119 = phi i32 [ %4, %entry-block ], [ %q.sroa.12.0.extract.trunc, %bb135 ]
  %q.sroa.0.1118 = phi i32 [ 0, %entry-block ], [ %q.sroa.0.0.extract.trunc, %bb135 ]
  %r.sroa.0.1116 = phi i32 [ %3, %entry-block ], [ undef, %bb135 ]
  %r.sroa.0.0.insert.ext62 = zext i32 %r.sroa.0.1116 to i64
  %r.sroa.0.0.insert.insert64 = or i64 0, %r.sroa.0.0.insert.ext62
  %6 = shl nuw nsw i64 %r.sroa.0.0.insert.ext62, 1
  %q.sroa.12.0.insert.ext101 = zext i32 %q.sroa.12.1119 to i64
  %q.sroa.12.0.insert.shift102 = shl nuw i64 %q.sroa.12.0.insert.ext101, 32
  %q.sroa.0.0.insert.ext87 = zext i32 %q.sroa.0.1118 to i64
  %q.sroa.0.0.insert.insert89 = or i64 %q.sroa.12.0.insert.shift102, %q.sroa.0.0.insert.ext87
  %7 = lshr i64 %q.sroa.12.0.insert.ext101, 31
  %8 = lshr i64 %r.sroa.0.0.insert.insert64, 31
  %9 = shl nuw nsw i64 %q.sroa.0.0.insert.ext87, 1
  %10 = or i64 %9, %carry.0120
  %q.sroa.0.0.extract.trunc = trunc i64 %10 to i32
  %11 = lshr i64 %q.sroa.0.0.insert.insert89, 31
  %q.sroa.12.0.extract.trunc = trunc i64 %11 to i32
  %r.sroa.13.0.insert.shift72 = shl i64 %8, 32
  %.masked114 = and i64 %6, 4294967294
  %r.sroa.0.0.insert.ext57 = or i64 %7, %.masked114
  %r.sroa.0.0.insert.insert59 = or i64 %r.sroa.0.0.insert.ext57, %r.sroa.13.0.insert.shift72
  %12 = sub i64 %5, %r.sroa.0.0.insert.insert59
  %13 = ashr i64 %12, 63
  %phitmp = and i64 %13, 1
  %14 = icmp ult i32 undef, 32
  br i1 %14, label %bb135, label %bb133.loopexit
}

attributes #0 = { norecurse nounwind readnone uwtable }
attributes #1 = { nounwind uwtable }

!llvm.module.flags = !{!0}

!0 = !{i32 1, !"PIE Level", i32 2}
@dylanmckay
Copy link
Member

Cleaned up and added a test case in tree in 948b820.

@dylanmckay
Copy link
Member

dylanmckay commented Oct 11, 2016

It looks like we are selecting

%R29R28<def> = LDDWRdYQ %R29R28, 1

@dylanmckay
Copy link
Member

As shown in the datasheet, this instruction is undefined

The result of these combinations is undefined:
LD r28, Y+
LD r29, Y+
LD r28, -Y
LD r29, -Y

@dylanmckay
Copy link
Member

This relates to #28

@dylanmckay
Copy link
Member

I don't think LLVM has any way currently to handle this.

@dylanmckay
Copy link
Member

Easiest and shittiest fix is to create a register class with only the X and Z instructions, and restrict the LDD instruction to only using those two. This would mean that if we had a frame pointer, we would only have a single register to load and store from, which probably wouldn't work.

@dylanmckay
Copy link
Member

I think this will require a lot of effort to fix correctly. I imagine that the best way to implement this would be to add a new hook that returns the set of available registers to select on given a machine instruction.

This is similar to the AVRRegisterInfo::getReservedRegs, although that operates on a function rather than a single instruction.

We could add a new virtual function the the TargetRegisterInfo class also called getReservedRegs but takes an instruction as input. We could then extend the register allocator to use the new method for each instruction.

@dylanmckay
Copy link
Member

llc -march=avr -mcpu=atmega328p ~/projects/llvm/git-svn/test/CodeGen/AVR/error-srcreg-destreg-same.ll -print-after-all 2>&1 | grep -E "IR Dump|LDD"
# *** IR Dump After Slot index numbering ***:
# *** IR Dump After Live Interval Analysis ***:
# *** IR Dump After Simple Register Coalescing ***:
# *** IR Dump After Rename Disconnected Subregister Components ***:
# *** IR Dump After Machine Instruction Scheduler ***:
# *** IR Dump After Debug Variable Analysis ***:
# *** IR Dump After Live Stack Slot Analysis ***:
# *** IR Dump After Virtual Register Map ***:
# *** IR Dump After Live Register Matrix ***:
# *** IR Dump After Virtual Register Rewriter ***:
2096B           %R29R28<def> = LDDWRdYQ <fi#1>, 0; mem:LD2[FixedStack1](align=1)
2128B           %R27R26<def> = LDDWRdYQ <fi#0>, 0; mem:LD2[FixedStack0](align=1)
# *** IR Dump After Stack Slot Coloring ***:
        %R29R28<def> = LDDWRdYQ <fi#1>, 0; mem:LD2[FixedStack1](align=1)
        %R27R26<def> = LDDWRdYQ <fi#0>, 0; mem:LD2[FixedStack0](align=1)
# *** IR Dump After Machine Loop Invariant Code Motion ***:
        %R29R28<def> = LDDWRdYQ <fi#1>, 0; mem:LD2[FixedStack1](align=1)
        %R27R26<def> = LDDWRdYQ <fi#0>, 0; mem:LD2[FixedStack0](align=1

It looks like the 'virtual register rewriter' pass is inserting the registers.

@dylanmckay
Copy link
Member

It looks like the root of this is in our implementation of AVRInstrInfo::loadRegFromStackSlot.

@dr-m
Copy link

dr-m commented Oct 23, 2016

I think I hit this same issue when compiling a C program of mine that uses 64-bit arithmetics and recursion: http://www.iki.fi/~msmakela/software/pentomino/pentomino.c (I thought it would make a nice test case). The poisonous combination is -DNDEBUG -O. With -O3 alone it will compile, as long as I do not remove any assert() statements from the functions fill() and prune().

@shepmaster
Copy link
Author

@dylanmckay any ideas on how we can work on this?

@dylanmckay
Copy link
Member

I did a small spike to check how long it would take to properly fix this in the register allocator a while back, it looks quite difficult because at the register allocator layer, there isn't really any visibility of the machine instructions that are being selected. There will be a way, but it's not an idiom I could see being used.

@dylanmckay
Copy link
Member

I'm looking into this again.

I've raised a bug on the LLVM bug tracker here, feel free to watchlist!

@dylanmckay
Copy link
Member

I've just fixed this in LLVM master!

r288897

@dr-m
Copy link

dr-m commented Dec 25, 2016

It appears to me that the fix has not yet been merged to avr-llvm.
As of this writing, the latest update to the avr-support branch is commit 01683bc, dated December 5, 2016, 2 days before the commit to upstream. I can repeat the failure with that commit. I think that it would be better to not close the affected before the fix has been merged from upstream.

@dylanmckay
Copy link
Member

Unless there's a compelling reason to merge upstream back into this branch, I don't think it's worth it.

Now that all of the code from here is merged into LLVM master, and development occurs there, you should be using that instead of this fork.

@dr-m
Copy link

dr-m commented Dec 25, 2016

Thanks for the clarification! That is even better. Not being aware of the merge to the LLVM master, I was wondering why there are AVR-specific changes in the upstream repository.
Perhaps https://github.com/avr-llvm/llvm/wiki/Getting%20Started could be revised, replacing "git clone" commands with "svn checkout" commands.

@dr-m
Copy link

dr-m commented Dec 25, 2016

It looks like README.md should point to http://clang.llvm.org/get_started.html instead of https://github.com/avr-llvm/llvm/wiki/Getting%20Started (which should also point to the upstream page).

@dylanmckay
Copy link
Member

I've added a big note to the wiki page you linked saying to use upstream LLVM instead.

I wouldn't link to the clang getting started page as the patch to add AVR support to clang is still in code review.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants