M1 Max jit_write_protect issue #1470

geohot · 2021-10-27T18:44:26Z

Moved to Unicorn 2 for M1 support. Getting this ~50% of the time

MIPS CPU, with Go bindings if it matters. Perhaps someone has an idea.

Process 34232 stopped
* thread #10, stop reason = EXC_BAD_ACCESS (code=2, address=0x280000098)
    frame #0: 0x0000000101040a14 libunicorn.2.dylib`tb_gen_code_mips + 304
libunicorn.2.dylib`tb_gen_code_mips:
->  0x101040a14 <+304>: str    x8, [x9, #0x18]
    0x101040a18 <+308>: ldur   w8, [x29, #-0x14]
    0x101040a1c <+312>: ldur   x9, [x29, #-0x38]
    0x101040a20 <+316>: str    w8, [x9]
Target 0: (mipsevm.test) stopped.
(lldb) bt
* thread #10, stop reason = EXC_BAD_ACCESS (code=2, address=0x280000098)
  * frame #0: 0x0000000101040a14 libunicorn.2.dylib`tb_gen_code_mips + 304
    frame #1: 0x000000010102b74c libunicorn.2.dylib`tb_find + 92
    frame #2: 0x000000010102b1b0 libunicorn.2.dylib`cpu_exec_mips + 244
    frame #3: 0x0000000100fd8ce0 libunicorn.2.dylib`tcg_cpu_exec + 76
    frame #4: 0x0000000100fd8c0c libunicorn.2.dylib`resume_all_vcpus_mips + 96
    frame #5: 0x0000000100fd8dfc libunicorn.2.dylib`vm_start_mips + 24
    frame #6: 0x0000000100fc676c libunicorn.2.dylib`uc_emu_start + 352
    frame #7: 0x00000001002f7f04 mipsevm.test`_cgo_81152a5834e5_Cfunc_uc_emu_start + 44

Speed isn't that important to me. Is there any way to disable threading?

wtdcode · 2021-10-27T18:53:38Z

Hello, Unicorn is designed to be single-threaded internally. It's probably a null pointer dereference. Unfortunately, I don't currently have an M1 machine to test on.

geohot · 2021-10-27T19:06:11Z

It has to be something with threads, it only happens 50% of the time

wtdcode · 2021-10-27T19:10:13Z

> It has to be something with threads, it only happens 50% of the time

I'm 100% sure Unicorn 2 is single-threaded internally unless you use the timeout option in uc_emu_start. When I added M1 support, the same thing happened, and it turned out to be undefined behavior. If you build a debug version and paste the source line, I may be able to help.

geohot · 2021-10-27T19:17:28Z

Built with debug

Process 51353 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x280000098)
    frame #0: 0x00000001012109b4 libunicorn.2.dylib`tb_gen_code_mips(cpu=0x00000001300c8000, pc=0, cs_base=0, flags=268435632, cflags=-16777216) at translate-all.c:1512:16
   1509     }
   1510
   1511     gen_code_buf = tcg_ctx->code_gen_ptr;
-> 1512     tb->tc.ptr = gen_code_buf;
   1513     tb->pc = pc;
   1514     tb->cs_base = cs_base;
   1515     tb->flags = flags;
Target 0: (mipsevm.test) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x280000098)
  * frame #0: 0x00000001012109b4 libunicorn.2.dylib`tb_gen_code_mips(cpu=0x00000001300c8000, pc=0, cs_base=0, flags=268435632, cflags=-16777216) at translate-all.c:1512:16
    frame #1: 0x00000001011fb6ec libunicorn.2.dylib`tb_find(cpu=0x00000001300c8000, last_tb=0x0000000000000000, tb_exit=0, cf_mask=0) at cpu-exec.c:252:14
    frame #2: 0x00000001011fb150 libunicorn.2.dylib`cpu_exec_mips(uc=0x0000000129008200, cpu=0x00000001300c8000) at cpu-exec.c:566:18
    frame #3: 0x00000001011a8c80 libunicorn.2.dylib`tcg_cpu_exec(uc=0x0000000129008200) at cpus.c:95:17
    frame #4: 0x00000001011a8bac libunicorn.2.dylib`resume_all_vcpus_mips(uc=0x0000000129008200) at cpus.c:183:13
    frame #5: 0x00000001011a8d9c libunicorn.2.dylib`vm_start_mips(uc=0x0000000129008200) at cpus.c:203:5
    frame #6: 0x000000010119670c libunicorn.2.dylib`uc_emu_start(uc=0x0000000129008200, begin=0, until=1588396036, timeout=0, count=0) at uc.c:734:5
    frame #7: 0x00000001002f7f04 mipsevm.test`_cgo_81152a5834e5_Cfunc_uc_emu_start + 44

tb does have a value, it's not null

(lldb) print tb;
(TranslationBlock *) $0 = 0x0000000280000080

wtdcode · 2021-10-27T19:24:26Z

~~I see. You are getting an unaligned access.~~
~~See this line:~~
~~> stop reason = EXC_BAD_ACCESS (code=2, address=0x280000098)~~

Sorry, I got that wrong.

wtdcode · 2021-10-27T19:32:09Z

It looks like an out-of-bounds (OOB) access, which is pretty strange.

geohot · 2021-10-27T19:33:51Z

It's super weird, the issue seems to be in alloc_code_gen_buffer, I can't memset the buffer to 0

wtdcode · 2021-10-27T19:36:31Z

> It's super weird, the issue seems to be in alloc_code_gen_buffer, I can't memset the buffer to 0

It's W^X protection on Apple Silicon. A JIT buffer can't be granted write and execute permission at the same time. I guess it's some allocation problem but again unfortunately I'm unable to test. Maybe you could trace how that buffer is mmap-ed.
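For readers unfamiliar with the mechanism, here is a minimal sketch of how a JIT buffer is typically handled on Apple Silicon, following the Apple porting guide. It is not Unicorn's actual code: the helper names jit_alloc and jit_clear are hypothetical, and the non-Apple branch is an assumption added only so the sketch builds on other platforms. The region is mapped with MAP_JIT, and pthread_jit_write_protect_np flips the calling thread between write access and execute access.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc; harmless on macOS */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#if defined(__APPLE__) && defined(__aarch64__)
/* Apple Silicon: a MAP_JIT region is mapped RWX, but each thread sees it as
 * either writable or executable at any given moment, never both. */
#define JIT_PROT  (PROT_READ | PROT_WRITE | PROT_EXEC)
#define JIT_FLAGS (MAP_PRIVATE | MAP_ANON | MAP_JIT)

static void jit_write_protect(int enabled)
{
    /* enabled != 0: executable, not writable; enabled == 0: the reverse.
     * The setting applies to the calling thread only. */
    pthread_jit_write_protect_np(enabled);
}
#else
/* Assumption: plain RW fallback so the sketch compiles on other platforms. */
#define JIT_PROT  (PROT_READ | PROT_WRITE)
#define JIT_FLAGS (MAP_PRIVATE | MAP_ANONYMOUS)

static void jit_write_protect(int enabled)
{
    (void)enabled;
}
#endif

/* Hypothetical helper: reserve a code buffer, or return NULL on failure. */
static void *jit_alloc(size_t size)
{
    void *buf = mmap(NULL, size, JIT_PROT, JIT_FLAGS, -1, 0);
    return buf == MAP_FAILED ? NULL : buf;
}

/* Hypothetical helper: zero the buffer, the memset step that faulted above. */
static void jit_clear(void *buf, size_t size)
{
    assert(buf != NULL);
    jit_write_protect(0);  /* make the region writable for this thread */
    memset(buf, 0, size);
    jit_write_protect(1);  /* back to executable before running any code */
}
```

Because the write-protect setting is per thread, a write only succeeds if the thread doing the write has disabled protection itself. A plain mprotect call is not the intended way to switch a MAP_JIT region.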

geohot · 2021-10-27T19:39:55Z

Yes, it seems like this! The buffer isn't writable...but the weirdest thing is that it works sometimes...

I added an mprotect call afterwards, and it fails.

See https://developer.apple.com/documentation/apple-silicon/porting-just-in-time-compilers-to-apple-silicon

wtdcode · 2021-10-27T19:40:49Z

> Yes, it seems like this! The buffer isn't writable...but the weirdest thing is that it works sometimes...
>
> I added an mprotect that fails after.

You can't use mprotect for this. Apple has its own API.

geohot · 2021-10-27T19:42:48Z

Ahh, pthread_jit_write_protect_np I see. Okay, I think I can trace this down. It's somewhat supported, I think it's just being called at the wrong time and that's the race.

wtdcode · 2021-10-27T19:45:29Z

> Ahh, pthread_jit_write_protect_np I see. Okay, I think I can trace this down. It's somewhat supported, I think it's just being called at the wrong time and that's the race.

That patch is ported from upstream QEMU (and, in fact, from UTM). Maybe in Unicorn we have to add those calls somewhere else. Anyway, you have this function:

static inline void jit_write_protect(int enabled)
{
    return pthread_jit_write_protect_np(enabled);
}

geohot · 2021-10-27T19:57:22Z

This fixes it, though I don't really understand why. It seemed like it was called with false earlier.

geohot@bd90068

wtdcode · 2021-10-27T19:58:50Z

> This fixes it, though I don't really understand why. It seemed like it was called with false earlier.
>
> geohot@bd90068

Okay, I know the root cause and will post a fix once I have an M1 environment to test it.

wtdcode · 2021-10-27T21:27:28Z

> This fixes it, though I don't really understand why. It seemed like it was called with false earlier.
>
> geohot@bd90068

btw, could you post a piece of simple reproduction code?

geohot · 2021-10-27T23:35:42Z

The built-in examples reproduce it on my machine.

marysaka · 2021-10-29T21:38:15Z

> This fixes it, though I don't really understand why. It seemed like it was called with false earlier.
>
> geohot@bd90068

Same issue here, though this doesn't fix it for me on an M1 MacBook Air running macOS 12.0.1.

Also testing with an aarch64 context

wtdcode · 2021-10-29T21:39:45Z

> Same issue here, though this doesn't fix it for me on an M1 MacBook Air running macOS 12.0.1.
>
> Also testing with an aarch64 context

Any reproduction code?

marysaka · 2021-10-30T11:36:09Z

> Any reproduction code?

So after a bit more research, it seems to fix the issue for Unicorn itself.
However, it should be noted that this breaks C# usage: from what I can see, it breaks MAP_JIT regions allocated by other JITs.

My wild guess is that the patch sent here leaves all MAP_JIT pages RW- for the current thread, resulting in a crash when returning to JITed code.

wtdcode · 2021-10-30T11:37:34Z

> My wild guess is that the patch sent here leaves all MAP_JIT pages RW- for the current thread, resulting in a crash when returning to JITed code.

This of course is a quick and dirty fix. I need some production code to publish a real fix.

marysaka · 2021-10-30T11:45:34Z

> This of course is a quick and dirty fix. I need some production code to publish a real fix.

I don't have much production code that I can share publicly at the moment.
So here is a quick, simple reproducer running on .NET 6 RC 2:

using System;
using System.Runtime.InteropServices;

namespace testing
{
    public class Test
    {
        [DllImport("unicorn")]
        public static extern uint uc_version(out uint major, out uint minor);

        private const uint UC_ARCH_ARM64 = 2;
        private const uint UC_MODE_LITTLE_ENDIAN = 0;

        [DllImport("unicorn")]
        public static extern int uc_open(uint arch, uint mode, out IntPtr uc);

        [DllImport("unicorn")]
        public static extern int uc_close(IntPtr uc);

        public static void Main(string[] args)
        {
            Console.WriteLine("uc_version");

            uc_version(out uint major, out uint minor);

            Console.WriteLine($"Unicorn v{major}.{minor}");

            Console.WriteLine("uc_open");

            int err = uc_open(UC_ARCH_ARM64, UC_MODE_LITTLE_ENDIAN, out IntPtr uc);

            Console.WriteLine("Crashed?");

            if (err == 0)
            {
                uc_close(uc);
            }

            Console.WriteLine("Done.");
        }
    }
}

Console output:

uc_version
Unicorn v2.0
uc_open
[1]    11258 bus error  ./test_debug_unicorn

wtdcode · 2021-10-30T11:47:52Z

> I don't have much production code to share in public atm.

Sorry, that was a typo. I meant 'reproduction' code. Could you double-check whether the equivalent C code produces the same crash?

marysaka · 2021-10-30T11:55:42Z

> Could you double-check whether the equivalent C code produces the same crash?

No crash with the following C variant:

#include <unicorn/unicorn.h>
#include <stdio.h>


int main (int ac, char **av) {
    unsigned int major;
    unsigned int minor;

    uc_version(&major, &minor);

    printf("Unicorn v%u.%u\n", major, minor);

    printf("uc_open\n");

    uc_engine *engine = NULL;

    int res = uc_open(UC_ARCH_ARM64, UC_MODE_LITTLE_ENDIAN, &engine);

    printf("Crashed?\n");

    if (res == 0)
    {
        uc_close(engine);
    }

    printf("Done.\n");

    return 0;
}

Console output:

Unicorn v2.0
uc_open
Crashed?
Done.

wtdcode · 2021-10-30T11:57:37Z

> No crash with the following C variant:

Okay, then it does seem so. I'll have a look once I have an M1 machine to debug on.

marysaka · 2021-10-30T12:24:12Z

So I removed the original patch by @geohot and did a full clean build, and in the end it seems to have no effect on my side.

I may have skipped testing the dev branch without the patch at first, sorry 😅

Maybe it would be worth opening another issue for the C# binding problems, as this is getting a bit off topic?

wtdcode · 2021-10-30T12:29:46Z

> Maybe it would be worth opening another issue for the C# binding problems, as this is getting a bit off topic?

Never mind, just go ahead. I'll check whether it fails on UC2.

wtdcode · 2021-11-05T21:42:05Z

> The built in examples reproduce it on my machine.

I tested sample.go with the Go bindings and couldn't reproduce the crash.

wtdcode · 2021-11-05T22:07:25Z

> The built in examples reproduce it on my machine.

I can't reproduce your crash on an M1 machine. Could this be a bug specific to the M1 Max?

> So I removed the original patch by @geohot, did a full clean build and it seems to not have effect on my side in the end.

I also tried switching write protection before function calls but didn't get a crash.

geohot · 2021-11-06T01:11:40Z

The code I've been working on is open source. https://github.com/geohot/cannon

Try "go test" in mipsevm using upstream unicorn (change https://github.com/geohot/cannon/blob/master/build_unicorn.sh)

I can try to post a minimum repro tomorrow. Though I'm 90% sure the mips sample crashed too.

The crash was not 100% of the time, in my app maybe 80%.

wtdcode · 2021-11-06T21:29:42Z

> Try "go test" in mipsevm using upstream unicorn (change https://github.com/geohot/cannon/blob/master/build_unicorn.sh)

I tried exactly that with cannon yesterday, both on your latest commit and on the one from when you filed this issue. Both worked fine.

A minimum reproduction script would be of great help.

geohot · 2021-11-07T17:33:14Z

So I tried a simple repro and couldn't get it. I can repro in cannon though

# clone and build cannon
./build_unicorn.sh
cd unicorn2
git revert bd90068fa014bb082c0b7ef6d20f7bccd3f581e0 # my fix
make -j8
cd ../mipsevm
go test -run TestCompareUnicornEvm

Still working on a min repro with upstream

My bad, you can't reproduce it with the examples. I was getting confused with Unicorn 1 (not 2), where the examples do trigger it. This seems way more subtle.

wtdcode · 2021-11-07T17:38:23Z

> make -j8

AFAIK, unicorn2 doesn't have a Makefile so make -j8 won't work.

geohot · 2021-11-07T17:39:31Z

The build unicorn script does cmake in the directory first. That's not a full repro, that's a change. Updated and still working on minimal repro

wtdcode · 2021-11-07T17:45:40Z

> The build unicorn script does cmake in the directory first. That's not a full repro, that's a change. Updated and still working on minimal repro

I get:

2021/11/08 01:45:23 open ../artifacts/contracts/MIPS.sol/MIPS.json: no such file or directory
exit status 1

geohot · 2021-11-07T17:49:04Z

Yea you have to build it. Working on a simpler repro. It's subtle, seems to depend on the heap state. If I remove a completely unrelated map write it doesn't crash.

geohot · 2021-11-07T17:50:41Z

Okay pushed a repro that doesn't depend on that:

cd mipsevm
go test -run TestUnicornCrash

wtdcode · 2021-11-07T17:51:56Z

> Okay pushed a repro that doesn't depend on that:
> cd mipsevm
> go test -run TestUnicornCrash

--- FAIL: TestUnicornCrash (0.00s)
panic: runtime error: slice bounds out of range [:192] with capacity 0 [recovered]
	panic: runtime error: slice bounds out of range [:192] with capacity 0

Looks like it crashes inside go code?

geohot · 2021-11-07T17:53:19Z

You have to run it from the mipsevm dir. Sorry, cannon really isn't built for this.

kafka@tubby:~/fun/cannon/mipsevm$ go test -run TestUnicornCrash
fatal error: unexpected signal during runtime execution
[signal SIGBUS: bus error code=0x1 addr=0x280000098 pc=0x105322a60]

runtime stack:
runtime.throw({0x10453096f, 0x2a})
        /opt/homebrew/Cellar/go/1.17.2/libexec/src/runtime/panic.go:1198 +0x54
runtime.sigpanic()
        /opt/homebrew/Cellar/go/1.17.2/libexec/src/runtime/signal_unix.go:719 +0x230

goroutine 4 [syscall]:
runtime.cgocall(0x10451efe8, 0x1400005ad78)
        /opt/homebrew/Cellar/go/1.17.2/libexec/src/runtime/cgocall.go:156 +0x50 fp=0x1400005ad30 sp=0x1400005acf0 pc=0x104227ac0
github.com/unicorn-engine/unicorn/bindings/go/unicorn._Cfunc_uc_emu_start(0x115008200, 0x0, 0x5ead0004, 0x0, 0x0)
        _cgo_gotypes.go:249 +0x44 fp=0x1400005ad70 sp=0x1400005ad30 pc=0x10433ab34
github.com/unicorn-engine/unicorn/bindings/go/unicorn.(*uc).StartWithOptions.func1(0x1400006db00, 0x0, 0x5ead0004, 0x1400005ae48)
        /Users/kafka/fun/cannon/unicorn2/bindings/go/unicorn/unicorn.go:112 +0x8c fp=0x1400005add0 sp=0x1400005ad70 pc=0x10433c89c
github.com/unicorn-engine/unicorn/bindings/go/unicorn.(*uc).StartWithOptions(0x1400006db00, 0x0, 0x5ead0004, 0x1400005ae48)
        /Users/kafka/fun/cannon/unicorn2/bindings/go/unicorn/unicorn.go:112 +0x40 fp=0x1400005ae10 sp=0x1400005add0 pc=0x10433c7c0
github.com/unicorn-engine/unicorn/bindings/go/unicorn.(*uc).Start(0x1400006db00, 0x0, 0x5ead0004)
        /Users/kafka/fun/cannon/unicorn2/bindings/go/unicorn/unicorn.go:117 +0x48 fp=0x1400005ae60 sp=0x1400005ae10 pc=0x10433c908
mipsevm.TestUnicornCrash(0x14000125380)
        /Users/kafka/fun/cannon/mipsevm/unicorn_crash_test.go:36 +0x260 fp=0x1400005af70 sp=0x1400005ae60 pc=0x10451e160
testing.tRunner(0x14000125380, 0x104637bf8)
        /opt/homebrew/Cellar/go/1.17.2/libexec/src/testing/testing.go:1259 +0x104 fp=0x1400005afc0 sp=0x1400005af70 pc=0x1042ff9f4
runtime.goexit()
        /opt/homebrew/Cellar/go/1.17.2/libexec/src/runtime/asm_arm64.s:1133 +0x4 fp=0x1400005afc0 sp=0x1400005afc0 pc=0x104290794
created by testing.(*T).Run
        /opt/homebrew/Cellar/go/1.17.2/libexec/src/testing/testing.go:1306 +0x328

goroutine 1 [chan receive]:
testing.(*T).Run(0x140001251e0, {0x10452440e, 0x10}, 0x104637bf8)
        /opt/homebrew/Cellar/go/1.17.2/libexec/src/testing/testing.go:1307 +0x344
testing.runTests.func1(0x140001251e0)
        /opt/homebrew/Cellar/go/1.17.2/libexec/src/testing/testing.go:1598 +0x80
testing.tRunner(0x140001251e0, 0x14000169d18)
        /opt/homebrew/Cellar/go/1.17.2/libexec/src/testing/testing.go:1259 +0x104
testing.runTests(0x1400000e768, {0x10495b4c0, 0x9, 0x9}, {0xc05a231226e4c980, 0x8bb2e27012, 0x104962100})
        /opt/homebrew/Cellar/go/1.17.2/libexec/src/testing/testing.go:1596 +0x3ec
testing.(*M).Run(0x14000192000)
        /opt/homebrew/Cellar/go/1.17.2/libexec/src/testing/testing.go:1504 +0x4fc
main.main()
        _testmain.go:59 +0x17c
exit status 2
FAIL    mipsevm 0.116s

And 1 in 10 times it'll pass

wtdcode · 2021-11-07T17:54:13Z

> You have to run it from the mipsevm dir. Sorry, cannon really isn't built for this.

Yes, I did a full clone and ran in mipsevm dir.

geohot · 2021-11-07T17:56:51Z

pushed a change, try it now. idk that seems like it can't find the test file. just pushed another change to make it simpler

wtdcode · 2021-11-07T17:58:55Z

> pushed a change, try it now. idk that seems like it can't find the test file. just pushed another change to make it simpler

Great, I reproduced your crash locally. I'll look into it.

geohot · 2021-11-07T18:00:30Z

Nice!

Yea sorry I got confused with the examples, it was the 1.0 examples that failed (for the same reason, but there was no fix in 1.0 at all, not the almost right fix in 2.0)

It's weird, if you remove that stuff with the map in the test, it passes. But that's completely unrelated, it's just heap grooming.

Either way, glad you reproed it.

Confirming it always passes if you leave "bd90068fa014bb082c0b7ef6d20f7bccd3f581e0" in as well

geohot · 2021-11-07T18:07:32Z

This should be a standalone repro, no file required. Now even simpler

package main

import (
	"log"
	"testing"

	uc "github.com/unicorn-engine/unicorn/bindings/go/unicorn"
)

func TestUnicornCrash(t *testing.T) {
	mu, err := uc.NewUnicorn(uc.ARCH_MIPS, uc.MODE_32|uc.MODE_BIG_ENDIAN)
	if err != nil {
		log.Fatal(err)
	}

	// weird heap grooming (doesn't crash without this)
	junk := make(map[uint32](uint32))
	for i := 0; i < 1000000; i += 4 {
		junk[uint32(i)] = 0xaaaaaaaa
	}

	mu.Start(0, 4)
}

wtdcode · 2021-11-07T18:23:32Z

This looks related to dotnet/runtime#41991.

My wild guess is that your map allocation triggers Go's internal scheduler, which moves the FFI calls to a new thread.

geohot · 2021-11-07T18:24:45Z

Ahh, I'm not that familiar with golang internals, but this sounds very possible. If you do the heap grooming before the NewUnicorn, it doesn't crash.

Can confirm

runtime.LockOSThread()

before the map fixes it!

What a crazy runtime. I guess Go assumes they are saving and restoring all the relevant context for the OS thread, so they can schedule the goroutines anywhere. But they are not setting and restoring the pthread JIT state. I don't know what Go promises the programmer in this case and if this is supposed to be handled by the runtime or not.

Either way, a fun bug. I tried looking for exactly this at the beginning to see if my tid changed, but I must have missed it (gettid is a weird syscall). The solution is perhaps something like my fix but less hacky, to explicitly set the JIT state right before you expect to write/exec it and not assume it stays between calls.
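The idea in the last paragraph, setting the JIT state explicitly at each point of use instead of trusting whatever an earlier call left behind, could look roughly like the sketch below. This is an illustration under stated assumptions and not Unicorn's actual patch: struct code_buf, code_buf_append and code_buf_prepare_exec are hypothetical names, base is assumed to point into a MAP_JIT region, and the non-Apple branch is a no-op so that the sketch builds elsewhere.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#if defined(__APPLE__) && defined(__aarch64__)
static void jit_write_protect(int enabled)
{
    pthread_jit_write_protect_np(enabled);  /* affects the calling thread only */
}
#else
/* Assumption: no-op fallback so the sketch compiles on other platforms. */
static void jit_write_protect(int enabled)
{
    (void)enabled;
}
#endif

/* Hypothetical code buffer; `base` is assumed to point into a MAP_JIT region. */
struct code_buf {
    unsigned char *base;
    size_t size;
    size_t used;
};

/* Hypothetical entry point: append generated code to the buffer.
 * It does not rely on state left behind by an earlier call; it requests
 * write access itself, on whichever thread it happens to run. */
static int code_buf_append(struct code_buf *cb, const void *code, size_t len)
{
    assert(cb != NULL && code != NULL);
    if (len > cb->size - cb->used) {
        return -1;  /* not enough room left */
    }
    jit_write_protect(0);                    /* writable for this thread */
    memcpy(cb->base + cb->used, code, len);
    cb->used += len;
    jit_write_protect(1);                    /* executable again */
    return 0;
}

/* Hypothetical entry point: call right before jumping into generated code. */
static void code_buf_prepare_exec(void)
{
    jit_write_protect(1);
}
```

The cost is one extra toggle per write or execute transition. In exchange, it no longer matters which OS thread a goroutine, or any other FFI caller, lands on.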

wtdcode · 2021-11-07T18:46:57Z

> The solution is perhaps something like my fix, to explicitly set the JIT state right before you expect to write/exec it and not assume it stays between calls.

I did some debugging. The thread id indeed doesn't change, but the JIT protection state does, as dotnet/runtime#41991 suggests. I guess that is due to some thread reuse strategy. Anyway, even with LockOSThread we can't guarantee that every UC API is called on the same thread and that the JIT protection state stays the same across FFI calls, so the only solution is to explicitly set the state, like your hack does. I'll push a fix for it.

Thank you for the reproduction script.

geohot · 2021-11-07T19:03:22Z

Cool, I'm not crazy! Because I did think I checked the tid correctly. I guess something must change the pthread_jit_write_protect_np state.

I'm so over security like this that makes programming harder and likely doesn't affect exploiters much at all. When I did browser exploits I loved "mitigations" because they only bothered noobs.

wtdcode · 2021-11-07T19:28:09Z

Fixed in 94a82ed

geohot changed the title ~~Unicorn 2 - M1 Max Race Condition~~ Unicorn 2 - M1 Max jit_write_protect issue Oct 27, 2021

geohot changed the title ~~Unicorn 2 - M1 Max jit_write_protect issue~~ M1 Max jit_write_protect issue Oct 27, 2021

wtdcode added the bug label Oct 27, 2021

wtdcode closed this as completed Nov 7, 2021

norswap mentioned this issue Mar 2, 2022

Use upstream Unicorn ethereum-optimism/cannon#55

Closed

cmacdonald mentioned this issue Nov 16, 2023

Crashes on Apple M1 kivy/pyjnius#667

Closed

xhyumiracle mentioned this issue Jul 4, 2024

Unicorn Crash on Mac M1 / Should use upstream Unicorn ora-io/opml#11

Open
