LLVM can't target the Playstation CPU (MIPS-I) #1

simias · 2015-05-18T20:41:32Z

I've been struggling for the past few days with extremely erratic behaviour from my playstation code. Simple code would work correctly but when things get more complicated (in particular function calls and BIOS calls) the program would appear to behave semi-randomly (bus errors, freezes, functions get executed several times in a row when they shouldn't etc...)

After some intense hair pulling I figured out the cause of the problem: the MIPS-I architecture has load delay slots. It means that when the CPU loads a value from memory (through a lw instruction for instance) the next instruction will run before the value is put in the target register. That means that you have to wait one cycle (using a nop or some independant instruction) before you can use the value.

The problem is that LLVM doesn't support MIPS-I at the moment and MIPS-II doesn't have this constraint. So for instance when the compiler generates the function epilogue that pops the value of ra (the return address) from the stack and jumps to it we end up with something like:

lw      ra,20(sp)
jr      ra

Unfortunately this code doesn't work on the Playstation's R2000 CPU: the lw doesn't have the time to finish before the jr is called (the jr is in the load delay slot), so the jump takes whatever value was in ra before the lw instruction. And that explains the very erratic behaviour.

Unfortunately as far as I can tell LLVM doesn't support MIPS-I at all, if I attempt to put "cpu": "mips1" in target.json I get this error:

LLVM ERROR: Code generation for MIPS-I is not implemented

Unfortunately there's no workaround. The solution would be to add MIPS-I support to LLVM but that's of course a pretty ambitious project and my plate is full at the moment.

As long as this issue is not fixed there's no point working on this SDK, nothing but the most trivial code will run as intended on the console.

The text was updated successfully, but these errors were encountered:

lf94 · 2018-04-18T17:13:42Z

If it's mostly the same, why not copy whatever MIPS-II does and change that behavior? Would be much easier than writing it all from scratch.

simias · 2018-05-08T16:39:00Z

I'm not at all familiar with LLVM so I didn't pursue this further. I assume it can't be too complicated but I really have no idea.

asiekierka · 2020-01-12T23:54:59Z

There is a patch available at impiaaa/llvm-project@f2ecffe but according to the developer it could use additional testing.

simias · 2020-01-17T15:05:25Z

That's great news although I must say that I wouldn't even know where to start to generate a rustc using a custom LLVM so I'm probably going to wait until it (hopefully) ends up in a stable release.

A few weeks ago I saw that there was a GCC backend for rustc in the works: https://github.com/sapir/gcc-rust

That might be an other way to get MIPSI support in Rust.

XaviDCR92 · 2020-06-08T09:12:24Z

I wanted to play around with the Zig programming language (see https://github.com/XaviDCR92/psx-zig), whose compiler is also based on LLVM, so merged the patch into the release/10.x branch of https://github.com/XaviDCR92/llvm-project . It patch seems to work correctly, so I encourage you to resume your work on psx-sdk-rs. Well-known issues with Zig's packed structs are preventing further development, but I think should be now fine with Rust.

ayrtonm · 2020-10-27T21:07:37Z

I recently got rustc working with LLVM 11 patched to support load delay slots if there's still interest in this. My fork has instructions in the README along with @impiaaa's LLVM patch (with a tiny fix) and a patch to build target.json and psx.ld into rustc. I checked that both rustc and llc by itself handle load delay slots correctly, though the tests weren't very extensive and I've only run programs on ePSXe.

I think it would've been convenient to use the JSON target, but it seems building the target into rustc is required since libcore won't compile unless the Atomic* API is disabled (by setting max_atomic_width = Some(0) in the target spec). This is because MIPS I doesn't have a sync instruction, unlike MIPS II. Either way rustc has to be compiled using the patched LLVM so it wouldn't have saved much time in building the compiler anyway.

Also I threw in a stripped-down copy of @simias's Makefile for the linker and merged libbios into a submodule of libpsx to make the SDK more user-friendly.

XaviDCR92 · 2020-10-31T01:54:04Z

@ayrtonm , the delay slot patch seems to work well for me too (I also had to apply the tiny fix to avoid a compile-time error).

However, I am having a curious issue that's making my Zig/C application crash by jumping into an invalid address. The root cause of this issue is $ra being assigned back to a value in $sp, 52, but $sp holds an invalid value and thus undefined behavior occurs.

After some investigation, it looks like bad generated code from the compiler side (I can't tell so far whether it's caused by either the LLVM backend or the Zig frontend). Consider the following lines of C code from PSXSDK:

for(a = 0; a < l; a+=2)
    GPU_DATA_PORT = image[a]|(image[a+1]<<16);

Among other low-level instructions, the following branch instruction is generated:

80012eb8:       14200017        bnez    at,80012f18 <LoadImage+0x1ac>

And this is what address 80012f18 contains (added some lines for context):

80012f00:       03c0e825        move    sp,s8
80012f04:       8fbe0000        lw      s8,0(sp)
80012f08:       8fbf0004        lw      ra,4(sp)
80012f0c:       00000000        nop
80012f10:       03e00008        jr      ra
80012f14:       27bd0008        addiu   sp,sp,8
        while(!(GPU_CONTROL_PORT & (1<<0x1c)));
80012f18:       0000000d        break

80012f1c <GsSetDrawEnv>:
{
80012f1c:       27bdfff8        addiu   sp,sp,-8

Therefore, 80012f18 is, much likely, not the address the program should be jumping to. Instead of leaving $sp to its original value and jumping to $ra, break is executed and GsSetDrawEnv is accidentally executed, causing undefined behaviour.

The compiler patch has added 18 nop instructions in LoadImage. I am not sure whether it's related or not, but 80012f18h - (18d * 4) = 80012ed0h, which casually belongs to the start of the loop.

        for(a = 0; a < l; a+=2)
80012ed0:       24e80002        addiu   t0,a3,2

Would it be possible that the LLVM backend is miscalculating destination addresses in branch instructions such as bnez since the patch is (apparently silently) adding several nop between instruction? I have not investigated other functions yet, but I must admit the application has already had erratic behavior several times now and this might explain it.

simias · 2020-10-31T21:35:30Z

I haven't touched this code in forever but I'm glad to see that there's progress!

@XaviDCR92 Have you tried to see the difference the various optimization levels make? Regardless of the miscalculated branch addresses it would be odd to end up with a bunch of useless NOPs in optimized code. What I'm getting at here is that a bunch of leftover NOPs with optimizations enabled seems like it would be a good sign that something messes up in the compiler.

XaviDCR92 · 2020-11-01T21:31:24Z

@simias Zig provides a --release-small flag roughly equivalent to gcc's -Os, but my application would not work at all with these optimizations enabled. I still have to find a reason for this.

OTOH, these apparently useless nop instructions must be there because of the load delay slot problem on MIPS I that you reported on your first message. There is no workaround other than replacing them with useful instructions that do not depend on their previous instruction, but that's not a serious issue.

Surprisingly though, I have found a possible reason why the code above was inadvertently jumping to a break instruction - there is a buffer overflow caused by this loop when called from this function, as pal is only 2 elements long and LoadImage is attempting to access pal[0 .. 15].

gcc would not issue any diagnostic message on this, but zig instead adds runtime bound checks when building in debug mode, thus executing the break instruction once an out-of-bound access occurs. As a quick workaround, I expanded pal from 2 to 16 elements and now GsLoadFont executes successfully.

More worryingly, I have also found other places where zig generates wrong code with break instructions here and there e.g.: GsSetDrawEnv:

80012f1c <GsSetDrawEnv>:
{
80012f1c:       27bdfff8        addiu   sp,sp,-8
80012f20:       afbf0004        sw      ra,4(sp)
80012f24:       afbe0000        sw      s8,0(sp)
80012f28:       03a0f025        move    s8,sp
        draw_mode_packet = (0xe1<<24)|(drawenv->draw_on_display>=1)<<10|
80012f2c:       0000000d        break

80012f30 <gpu_data_ctrl>:
{
80012f30:       27bdfff8        addiu   sp,sp,-8
80012f34:       afbf0004        sw      ra,4(sp)

Compared to the implementation by mipsel-unknown-elf-gcc:

80034bc0 <GsSetDrawEnv>:
{
80034bc0:       27bdffe8        addiu   sp,sp,-24
80034bc4:       afbf0014        sw      ra,20(sp)
        draw_mode_packet = (0xe1<<24)|(drawenv->draw_on_display>=1)<<10|
80034bc8:       90820001        lbu     v0,1(a0)
80034bcc:       00000000        nop
80034bd0:       1440003a        bnez    v0,80034cbc <GsSetDrawEnv+0xfc>
80034bd4:       00802825        move    a1,a0
80034bd8:       3c02e100        lui     v0,0xe100
                (drawenv->dither>=1)<<9;
80034bdc:       90a30000        lbu     v1,0(a1)
        gpu_data_ctrl(0xe3, (drawenv->y<<10)|drawenv->x);
80034be0:       84a40004        lh      a0,4(a1)
                (drawenv->dither>=1)<<9;
80034be4:       0003182b        sltu    v1,zero,v1
80034be8:       00031a40        sll     v1,v1,0x9
        draw_mode_packet = (0xe1<<24)|(drawenv->draw_on_display>=1)<<10|
80034bec:       00431025        or      v0,v0,v1
80034bf0:       3c038008        lui     v1,0x8008
80034bf4:       ac625d78        sw      v0,23928(v1)
        GPU_CONTROL_PORT = 0x01000000;
80034bf8:       3c060100        lui     a2,0x100
80034bfc:       3c031f80        lui     v1,0x1f80
80034c00:       ac661814        sw      a2,6164(v1)
        gpu_data_ctrl(0xe3, (drawenv->y<<10)|drawenv->x);
80034c04:       84a80002        lh      t0,2(a1)
...

In the meantime, I'm improving my fork of PCSX-r so pcsxr notifies whenever it executes a break instruction when debugging with mipsel-unknown-elf-gdb.

ayrtonm · 2020-11-23T00:29:41Z

After playing around with rust on the playstation for a while, I haven't been able to reproduce @XaviDCR92's issue with out of place break and my demos are behaving as expected on emulators.

I haven't used zig extensively, but I think the issue may be runtime checks or UB on shift overflow. For example, drawenv->draw_on_display is a u8 and I'm not sure if it gets implicitly cast to bool after the >= or if it just becomes a 0 or 1 as a u8. But either way, it then gets shifted more than 8 bits which might be a problem depending on how strict zig's type system is.

I also tried compiling some C with zig (0.6.0) myself and found that unsigned int z = 0xe1 << 24 caused me to hit an illegal instruction while unsigned int z = (unsigned int)0xe1 << 24 didn't.

int f(unsigned char x, unsigned char y) {
  2014e0:       55                      push   %rbp
  2014e1:       48 89 e5                mov    %rsp,%rbp
    unsigned int z = (0xe1 << 24);
  2014e4:       0f 0b                   ud2

int f(unsigned char x, unsigned char y) {
  201560:       55                      push   %rbp
  201561:       48 89 e5                mov    %rsp,%rbp
    unsigned int z = ((unsigned int)0xe1 << 24);
    return z;
  201564:       b8 00 00 00 e1          mov    $0xe1000000,%eax
  201569:       5d                      pop    %rbp
  20156a:       c3                      retq
  20156b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

It seems to me that zig uses the smallest type possible to represent integer literals, but I couldn't find any specific documentation on that. Granted that was all on x86 not MIPS, but I think the issue might be in that general area.

sajattack · 2021-12-03T04:38:46Z

Not sure how relevant it is to you (does PSX have an FPU?), but floating point comparisons also have a delay slot rust-lang/rust#91442 (comment)

ayrtonm · 2021-12-03T04:55:43Z

nah the PSX doesn't have a floating point unit. It supports fixed point but that's all custom coprocessor opcodes from inline assembly so llvm's not an issue there.

@impiaaa

LLVM so far has only supported the MIPS-II and above architectures. MIPS-II is pretty close to MIPS-I, the major difference being that "load" instructions always take one extra instruction slot to propogate to registers. This patch adds support for MIPS-I by adding hazard handling for load delay slots, alongside MIPSR6 forbidden slots and FPU slots, inserting a NOP instruction between a load and any instruction immediately following that reads the load's destination register. I also included a simple regression test. Since no existing tests target MIPS-I, those all still pass. Issue ref: simias/psx-sdk-rs#1 I also tested by building a simple demo app with Clang and running it in an emulator. Patch by: @impiaaa Differential Revision: https://reviews.llvm.org/D122427

netthier · 2022-04-13T16:55:08Z

Seems initial support for load delay slots is now part of the latest LLVM release if I understand that commit correctly.
Is this issue still relevant?

ayrtonm · 2022-04-13T17:45:14Z

It's not in upstream rust yet, but I was going to cherry-pick that commit onto the rustc branch and open a PR to add the psx target.

impiaaa · 2022-04-13T17:50:29Z

Also, filling load delay slots in code generation is only one part of full MIPS-I support. Remaining tasks include:

load delay slots in .set reorder
sync instruction
64-bit coprocessor transfers
Trap-on-condition instructions

ayrtonm · 2022-04-13T17:59:50Z

Yeah those would be definitely be good to upstream for MIPS-I, but the psx target's still fairly usable without them. I think the only really critical issue is the sync instruction since DMA is borderline unusable without it.

@impiaaa

LLVM so far has only supported the MIPS-II and above architectures. MIPS-II is pretty close to MIPS-I, the major difference being that "load" instructions always take one extra instruction slot to propogate to registers. This patch adds support for MIPS-I by adding hazard handling for load delay slots, alongside MIPSR6 forbidden slots and FPU slots, inserting a NOP instruction between a load and any instruction immediately following that reads the load's destination register. I also included a simple regression test. Since no existing tests target MIPS-I, those all still pass. Issue ref: simias/psx-sdk-rs#1 I also tested by building a simple demo app with Clang and running it in an emulator. Patch by: @impiaaa Differential Revision: https://reviews.llvm.org/D122427

ayrtonm · 2022-06-06T03:56:40Z

This target's now usable with just a target JSON on nightly, so building the rust compiler isn't necessary anymore. I updated the crate's instructions in case anyone wants to try it out.

@impiaaa

LLVM so far has only supported the MIPS-II and above architectures. MIPS-II is pretty close to MIPS-I, the major difference being that "load" instructions always take one extra instruction slot to propogate to registers. This patch adds support for MIPS-I by adding hazard handling for load delay slots, alongside MIPSR6 forbidden slots and FPU slots, inserting a NOP instruction between a load and any instruction immediately following that reads the load's destination register. I also included a simple regression test. Since no existing tests target MIPS-I, those all still pass. Issue ref: simias/psx-sdk-rs#1 I also tested by building a simple demo app with Clang and running it in an emulator. Patch by: @impiaaa Differential Revision: https://reviews.llvm.org/D122427

ayrtonm · 2022-10-05T03:38:19Z

I opened a PR adding a built-in target to rustc so hopefully the target JSON will soon be unnecessary.

@impiaaa

LLVM so far has only supported the MIPS-II and above architectures. MIPS-II is pretty close to MIPS-I, the major difference being that "load" instructions always take one extra instruction slot to propogate to registers. This patch adds support for MIPS-I by adding hazard handling for load delay slots, alongside MIPSR6 forbidden slots and FPU slots, inserting a NOP instruction between a load and any instruction immediately following that reads the load's destination register. I also included a simple regression test. Since no existing tests target MIPS-I, those all still pass. Issue ref: simias/psx-sdk-rs#1 I also tested by building a simple demo app with Clang and running it in an emulator. Patch by: @impiaaa Differential Revision: https://reviews.llvm.org/D122427

simias · 2022-10-10T21:10:39Z

That's amazing work!

Now I have to find a way to get my test setup working again...

simias added bug help wanted labels May 18, 2015

simias mentioned this issue Dec 12, 2016

I was thinking... simias/rustation#34

Open

geofft mentioned this issue Jul 24, 2019

Architecture support (tracking issue) fishinabarrel/linux-kernel-module-rust#112

Open

cr1901 mentioned this issue Dec 18, 2019

MIPS1 Load Delay Slot support in LLVM Rust-Retro/RWG#6

Open

ayrtonm mentioned this issue Apr 15, 2022

[cherry-pick] [MIPS] Initial support for MIPS-I load delay slots rust-lang/llvm-project#135

Merged

ayrtonm mentioned this issue May 1, 2022

Linking fails on nightly rustc due to mismatched ABI flags ayrtonm/psx-sdk-rs#9

Closed

ayrtonm mentioned this issue Jun 5, 2022

Allow overriding linker scripts in built-in targets rust-lang/rust#97749

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LLVM can't target the Playstation CPU (MIPS-I) #1

LLVM can't target the Playstation CPU (MIPS-I) #1

simias commented May 18, 2015

lf94 commented Apr 18, 2018

simias commented May 8, 2018

asiekierka commented Jan 12, 2020

simias commented Jan 17, 2020

XaviDCR92 commented Jun 8, 2020

ayrtonm commented Oct 27, 2020

XaviDCR92 commented Oct 31, 2020

simias commented Oct 31, 2020

XaviDCR92 commented Nov 1, 2020

ayrtonm commented Nov 23, 2020 •

edited

sajattack commented Dec 3, 2021 •

edited

ayrtonm commented Dec 3, 2021

netthier commented Apr 13, 2022

ayrtonm commented Apr 13, 2022

impiaaa commented Apr 13, 2022

ayrtonm commented Apr 13, 2022

ayrtonm commented Jun 6, 2022 •

edited

ayrtonm commented Oct 5, 2022

simias commented Oct 10, 2022

LLVM can't target the Playstation CPU (MIPS-I) #1

LLVM can't target the Playstation CPU (MIPS-I) #1

Comments

simias commented May 18, 2015

lf94 commented Apr 18, 2018

simias commented May 8, 2018

asiekierka commented Jan 12, 2020

simias commented Jan 17, 2020

XaviDCR92 commented Jun 8, 2020

ayrtonm commented Oct 27, 2020

XaviDCR92 commented Oct 31, 2020

simias commented Oct 31, 2020

XaviDCR92 commented Nov 1, 2020

ayrtonm commented Nov 23, 2020 • edited

sajattack commented Dec 3, 2021 • edited

ayrtonm commented Dec 3, 2021

netthier commented Apr 13, 2022

ayrtonm commented Apr 13, 2022

impiaaa commented Apr 13, 2022

ayrtonm commented Apr 13, 2022

ayrtonm commented Jun 6, 2022 • edited

ayrtonm commented Oct 5, 2022

simias commented Oct 10, 2022

ayrtonm commented Nov 23, 2020 •

edited

sajattack commented Dec 3, 2021 •

edited

ayrtonm commented Jun 6, 2022 •

edited