frida_libafl links multiple versions of capstone #1018

Closed
WorksButNotTested opened this issue Jan 26, 2023 · 4 comments

Comments

@WorksButNotTested
Collaborator

WorksButNotTested commented Jan 26, 2023

Summary

When updating the frida-rust dependency, it was observed that moving from frida-gum-sys 0.5 to 0.6 broke things on ARM64 macOS: the fuzzer starts but never executes a single test case, as the output below shows. This issue serves both to explain the defect and to document how it was analysed, so that future problems can be diagnosed more easily.

I am broker!!.                                                                                                                                                                                       
Recieved connection from UnixSocketAddr("Unnamed")                                                                                                                                                   
New connection: 127.0.0.1:57914/127.0.0.1:57914                                                                                                                                                      
Recieved connection from UnixSocketAddr("Unnamed")                                                
[Timeout     #0]  (GLOBAL) run time: 0h-0m-1s, clients: 0, corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000                                                                                  
                  (CLIENT) corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000               
[Timeout     #0]  (GLOBAL) run time: 0h-0m-2s, clients: 1, corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000                                                                                  
                  (CLIENT) corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000               
[Timeout     #0]  (GLOBAL) run time: 0h-0m-3s, clients: 1, corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000                                                                                  
                  (CLIENT) corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000

Modifying frida-rust

To use a local copy of frida-rust rather than one from crates.io, you can apply the following patch and use the following commands:

% git diff HEAD
diff --git a/fuzzers/frida_libpng/Cargo.toml b/fuzzers/frida_libpng/Cargo.toml
index d3cb3de5..f3043926 100644
--- a/fuzzers/frida_libpng/Cargo.toml
+++ b/fuzzers/frida_libpng/Cargo.toml
@@ -28,7 +28,7 @@ reqwest = { version = "0.11.4", features = ["blocking"] }
 [dependencies]
 libafl = { path = "../../libafl/", features = [ "std", "llmp_compression", "llmp_bind_public", "frida_cli" ] } #,  "llmp_small_maps", "llmp_debug"]}
 capstone = "0.11.0"
-frida-gum = { version = "0.8.1", features = [ "auto-download", "event-sink", "invocation-listener"] }
+frida-gum = { path = "../../../frida-rust/frida-gum/", features = [ "auto-download", "event-sink", "invocation-listener", "stalker-observer", "stalker-params"] }
 libafl_frida = { path = "../../libafl_frida", features = ["cmplog"] }
 libafl_targets = { path = "../../libafl_targets", features = ["sancov_cmplog"] }
 libc = "0.2"
diff --git a/libafl_frida/Cargo.toml b/libafl_frida/Cargo.toml
index 0ff023cd..b2cce1b5 100644
--- a/libafl_frida/Cargo.toml
+++ b/libafl_frida/Cargo.toml
@@ -27,8 +27,8 @@ libc = "0.2"
 hashbrown = "0.12"
 libloading = "0.7"
 rangemap = "1.0"
-frida-gum-sys = { version = "0.4.1", features = [ "auto-download", "event-sink", "invocation-listener"] }
-frida-gum = { version = "0.8.1", features = [ "auto-download", "event-sink", "invocation-listener"] }
+frida-gum-sys = { path = "../../frida-rust/frida-gum-sys/", features = [ "auto-download", "event-sink", "invocation-listener", "stalker-observer", "stalker-params"] }
+frida-gum = { path = "../../frida-rust/frida-gum/", features = [ "auto-download", "event-sink", "invocation-listener", "stalker-observer", "stalker-params"] }
 regex = "1"
 dynasmrt = "1.2"
 capstone = "0.11.0"
cargo update -p frida-gum --precise=0.10.0
cargo update -p frida-gum-sys --precise=0.6.0

Modifying FRIDA Gum

frida-rust unsurprisingly depends on FRIDA itself, more specifically on the frida-gum-devkit. In bumping frida-gum-sys from 0.5 to 0.6, we in turn move from FRIDA 16.0.2 to 16.0.7. More careful analysis showed that keeping frida-gum-sys at 0.5 but manually bumping the version of FRIDA it depends on from 16.0.2 to 16.0.3 was enough to introduce the regression.

Now in order to investigate further we need to be able to use a custom version of FRIDA. We can do this by removing the "auto-download" feature and instead copying the built frida-gum-devkit into the frida-gum-sys folder of frida-rust. But to do that we first need to build the devkit.

Building the frida-gum-devkit

We can build the devkit using the following commands...

git clone https://github.com/frida/frida.git
cd frida
rm -rf dk/ build/ deps/
git submodule update --init
make -f Makefile.sdk.mk FRIDA_V8=disabled
make gum-macos FRIDA_V8=disabled
./releng/devkit.py frida-gum macos-arm64 ./dk

Then just copy the library and header from the ./dk directory into the frida-gum-sys folder of frida-rust.

The step to build the SDK is optional, since a pre-built binary will be downloaded if it is omitted, but as we'll see later, we want to rule out issues with that. We disable the V8 JavaScript engine because we don't need it (it's part of the frida-gumjs-devkit instead) and it takes ages to build.

To the printf mobile

We know the issue is with the Stalker engine: by adding some println! calls to LibAFL (and running with LIBAFL_DEBUG_OUTPUT=1), we can see that our call to Stalker's follow_me function never returns. Now, debuggers and code generated on the fly by dynamic binary instrumentation engines are not a recipe for success when used together. So we need some good old-fashioned printfs.

Now the other issue is that the instrumentation engine from which we want to print our debug information runs in the same thread as the one executing the code. If that thread is running printf, which we are duly instrumenting and executing, and we call printf ourselves in the middle of all this, we end up re-entering the function in libc on the same thread. Whilst printf is designed to work when called by multiple threads concurrently, it isn't intended to be called by the same thread halfway through executing itself. This is a very painful lesson which you only need to learn once. We therefore add the following snippet to the FRIDA engine to handle our printfs.

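#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Format into a stack buffer and emit it with write(2), so that we stay
 * away from the buffered stdio streams (and their locks). */
static void gum_printf(const char *format, ...) {

  va_list ap;
  char    buffer[4096] = {0};
  int     ret;
  size_t  len;

  va_start(ap, format);
  ret = vsnprintf(buffer, sizeof(buffer) - 1, format, ap);
  va_end(ap);

  if (ret < 0) { return; }

  /* vsnprintf returns the untruncated length, so measure what was stored. */
  len = strnlen(buffer, sizeof(buffer));

  (void)write(STDERR_FILENO, buffer, len);

}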

We also need a little helper routine to let us disassemble the code being instrumented and the code which is generated...

static void
gum_disasm (guint8 * code,
            guint size,
            const gchar * prefix)
{
  csh capstone;
  cs_err err;
  cs_insn * insn;
  size_t count, i;

  err = cs_open (CS_ARCH_ARM64, GUM_DEFAULT_CS_ENDIAN, &capstone);
  g_assert (err == CS_ERR_OK);

  cs_option (capstone, CS_OPT_DETAIL, CS_OPT_ON);

  count = cs_disasm (capstone, code, size, GPOINTER_TO_SIZE (code), 0, &insn);
  g_assert (insn != NULL);

  for (i = 0; i != count; i++)
  {
    gum_printf ("%s0x%" G_GINT64_MODIFIER "x\t%s %s\n",
        prefix, insn[i].address, insn[i].mnemonic, insn[i].op_str);
  }

  cs_free (insn, count);

  cs_close (&capstone);
}

We can now add the following few lines to the bottom of the function which commits new blocks within the Stalker engine to peek at what's going on. We first print the code of the basic block which was about to be executed, and then print the code which Stalker generated to be used in its place.

static void
gum_exec_block_commit (GumExecBlock * block)
{
  GumStalker * stalker = block->ctx->stalker;
  gsize snapshot_size;

  snapshot_size =
      gum_stalker_snapshot_space_needed_for (stalker, block->real_size);
  memcpy (gum_exec_block_get_snapshot_start (block), block->real_start,
      snapshot_size);

  block->capacity = block->code_size + snapshot_size;

  gum_slab_reserve (&block->code_slab->slab, block->capacity);
  gum_stalker_freeze (stalker, block->code_start, block->code_size);

  gum_slab_reserve (&block->slow_slab->slab, block->slow_size);
  gum_stalker_freeze (stalker, block->slow_start, block->slow_size);

  /* PRINT SOME STUFF HERE */
  gum_printf ("\n*** New block: %p\n", block);
  gum_disasm (block->real_start, block->real_size, "[REAL]\t");
  gum_disasm (block->code_start, block->code_size, "[CODE]\t");
}

Contrasting the logs...

We can see with FRIDA 16.0.2 we get this output (cleaned up for simplicity)...

*** New block: 0x2a5514020
[REAL]	0x10082c680	adrp x8, #0x101155000
[REAL]	0x10082c684	add x8, x8, #0x1a0
[REAL]	0x10082c688	mov w9, #1
[REAL]	0x10082c68c	stp x8, x9, [sp, #0x60]
[REAL]	0x10082c690	str xzr, [sp, #0x70]
[REAL]	0x10082c694	stp x27, xzr, [sp, #0x80]
[REAL]	0x10082c698	add x0, sp, #0x60
[REAL]	0x10082c69c	bl #0x100a80184

[CODE]	0x2a54d8268	ldr x8, #0x2a54d82b4
[CODE]	0x2a54d826c	add x8, x8, #0x1a0
[CODE]	0x2a54d8270	mov w9, #1
[CODE]	0x2a54d8274	stp x8, x9, [sp, #0x60]
[CODE]	0x2a54d8278	str xzr, [sp, #0x70]
[CODE]	0x2a54d827c	stp x27, xzr, [sp, #0x80]
[CODE]	0x2a54d8280	add x0, sp, #0x60
[CODE]	0x2a54d8284	b #0x2a54f8000

But with FRIDA 16.0.3 we get the following...

*** New block: 0x2a7414020
[REAL]	0x1028547a0	adrp x8, #0x10317d000
[REAL]	0x1028547a4	add x8, x8, #0x1e0
[REAL]	0x1028547a8	mov w9, #1
[REAL]	0x1028547ac	stp x8, x9, [sp, #0x60]
[REAL]	0x1028547b0	str xzr, [sp, #0x70]
[REAL]	0x1028547b4	stp x27, xzr, [sp, #0x80]
[REAL]	0x1028547b8	add x0, sp, #0x60
[REAL]	0x1028547bc	bl #0x102aa82a4 <--- WTF?
[REAL]	0x1028547c0	adrp x8, #0x1032d6000
[REAL]	0x1028547c4	add x8, x8, #0x18
[REAL]	0x1028547c8	mov x9, sp
[REAL]	0x1028547cc	str x9, [x8, #0x20]
[REAL]	0x1028547d0	str x24, [x8, #0x18]
[REAL]	0x1028547d4	ldp x9, x10, [x24, #0x90]
[REAL]	0x1028547d8	stp x9, x10, [x8, #0x28]
[REAL]	0x1028547dc	str x21, [x8]
[REAL]	0x1028547e0	str x20, [x8, #8]
[REAL]	0x1028547e4	str x22, [x8, #0x10]
[REAL]	0x1028547e8	ldr x8, [x24]
[REAL]	0x1028547ec	ldr x8, [x8]
[REAL]	0x1028547f0	ldr x0, [sp]
[REAL]	0x1028547f4	ldr x1, [sp, #0x10]
[REAL]	0x1028547f8	mov w9, #1
[REAL]	0x1028547fc	stp x9, x0, [sp, #0x60]
[REAL]	0x102854800	str x1, [sp, #0x70]
[REAL]	0x102854804	ldr x8, [x8]
[REAL]	0x102854808	blr x8

[CODE]	0x2a73d8268	adrp x8, #0x2a7d01000
[CODE]	0x2a73d826c	add x8, x8, #0x1e0
[CODE]	0x2a73d8270	mov w9, #1
[CODE]	0x2a73d8274	stp x8, x9, [sp, #0x60]
[CODE]	0x2a73d8278	str xzr, [sp, #0x70]
[CODE]	0x2a73d827c	stp x27, xzr, [sp, #0x80]
[CODE]	0x2a73d8280	add x0, sp, #0x60
[CODE]	0x2a73d8284	bl #0x2a762bd6c <--- WTF?
[CODE]	0x2a73d8288	adrp x8, #0x2a7e5a000
[CODE]	0x2a73d828c	add x8, x8, #0x18
[CODE]	0x2a73d8290	mov x9, sp
[CODE]	0x2a73d8294	str x9, [x8, #0x20]
[CODE]	0x2a73d8298	str x24, [x8, #0x18]
[CODE]	0x2a73d829c	ldp x9, x10, [x24, #0x90]
[CODE]	0x2a73d82a0	stp x9, x10, [x8, #0x28]
[CODE]	0x2a73d82a4	str x21, [x8]
[CODE]	0x2a73d82a8	str x20, [x8, #8]
[CODE]	0x2a73d82ac	str x22, [x8, #0x10]
[CODE]	0x2a73d82b0	ldr x8, [x24]
[CODE]	0x2a73d82b4	ldr x8, [x8]
[CODE]	0x2a73d82b8	ldr x0, [sp]
[CODE]	0x2a73d82bc	ldr x1, [sp, #0x10]
[CODE]	0x2a73d82c0	mov w9, #1
[CODE]	0x2a73d82c4	stp x9, x0, [sp, #0x60]
[CODE]	0x2a73d82c8	str x1, [sp, #0x70]
[CODE]	0x2a73d82cc	ldr x8, [x8]
[CODE]	0x2a73d82d0	b #0x2a73f8000

We can see that with 16.0.2, FRIDA generates a new version of the original code, up until the end of the basic block (instructions which are PC-relative, such as the adrp, are replaced with equivalents, whilst the others can simply be copied verbatim). Note that at the end of the block, we don't jump to the original location, but instead back into the Stalker engine to have it instrument the next block and start executing the instrumented copy.

However, with 16.0.3, FRIDA is instead just copying the branch verbatim and continuing on past the end of the basic block!?!
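To make concrete why a PC-relative instruction cannot simply be copied, here is a minimal sketch (not part of FRIDA; the function names are made up for illustration) that computes the page an AArch64 adrp resolves to from its encoding and the address it executes at. The encoding 0xb0004968 and its original address are taken from a later trace in this write-up; the second address is a code-cache address borrowed from the 16.0.3 log purely as an example.

```c
#include <assert.h>
#include <stdint.h>

/* Page address produced by an AArch64 ADRP executing at address pc.
 * The 21-bit signed immediate is split into immhi (bits 23..5) and
 * immlo (bits 30..29) and counts 4 KiB pages relative to pc's page. */
static uint64_t adrp_target(uint32_t insn, uint64_t pc) {
  int64_t imm = (int64_t)((((insn >> 5) & 0x7ffff) << 2) | ((insn >> 29) & 0x3));
  if (imm & (1 << 20)) { imm -= (1 << 21); } /* sign-extend 21 bits */
  return (pc & ~UINT64_C(0xfff)) + (uint64_t)imm * 4096;
}

/* Hypothetical demo: the same bytes give different results at different PCs. */
static void adrp_demo(void) {
  /* At its original address: adrp x8, #0x1011e9000 */
  assert(adrp_target(0xb0004968u, UINT64_C(0x1008bc7a0)) == UINT64_C(0x1011e9000));
  /* Copied verbatim into a code cache, it points somewhere else entirely. */
  assert(adrp_target(0xb0004968u, UINT64_C(0x2a73d8268)) == UINT64_C(0x2a7d05000));
}
```

This matches what the 16.0.3 log shows: the copied adrp in the [CODE] listing resolves to a different page from the one in the [REAL] listing.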

Some more printfs

To work out why, we need some more detail, so we add some more printfs, this time where the basic block is processed, and include the bytes of the original code and the instruction ID used by capstone to represent it...

void
gum_stalker_iterator_keep (GumStalkerIterator * self)
{
  GumExecBlock * block = self->exec_block;
  GumGeneratorContext * gc = self->generator_context;
  GumArm64Relocator * rl = gc->relocator;
  const cs_insn * insn = gc->instruction->ci;
  GumVirtualizationRequirements requirements;

  requirements = GUM_REQUIRE_NOTHING;

  gum_printf ("K (0x%08x) 0x%016lx - 0x%08x - \t%s %s\n", insn->id, insn->address, *((guint32 *)insn->address), insn->mnemonic, insn->op_str);
  ..
K (0x0000000d) 0x00000001008bc7a0 - 0xb0004968 - 	adrp x8, #0x1011e9000
K (0x00000004) 0x00000001008bc7a4 - 0x91078108 - 	add x8, x8, #0x1e0
K (0x000001e8) 0x00000001008bc7a8 - 0x52800029 - 	mov w9, #1
K (0x000002f9) 0x00000001008bc7ac - 0xa90627e8 - 	stp x8, x9, [sp, #0x60]
K (0x000002fa) 0x00000001008bc7b0 - 0xf9003bff - 	str xzr, [sp, #0x70]
K (0x000002f9) 0x00000001008bc7b4 - 0xa9087ffb - 	stp x27, xzr, [sp, #0x80]
K (0x00000004) 0x00000001008bc7b8 - 0x910183e0 - 	add x0, sp, #0x60
K (0x0000002e) 0x00000001008bc7bc - 0x94094eba - 	bl #0x100b102a4

We also add the following...

  gum_printf("ARM64_INS_BL: (0x%08X)\n", ARM64_INS_BL);

And see

ARM64_INS_BL: (0x00000044)

We can see that capstone has generated the instruction ID 0x2E, whereas Gum, going by the capstone headers it was built against, thinks the ID should be 0x44. Looking at capstone on GitHub, the enumeration hasn't changed in years. It doesn't make any sense!
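As a sanity check that does not rely on capstone at all, the raw bytes from the trace can be decoded by hand. The sketch below (helper names are hypothetical, not from FRIDA or capstone) applies the AArch64 BL encoding, where the top six bits are 0b100101 and the low 26 bits are a signed word offset, to the word 0x94094eba at the address shown in the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the 32-bit word is an AArch64 BL (bits 31..26 == 0b100101). */
static bool is_bl(uint32_t insn) { return (insn >> 26) == 0x25; }

/* Target of a BL executing at pc: pc + sign_extend(imm26) * 4. */
static uint64_t bl_target(uint32_t insn, uint64_t pc) {
  int64_t imm = (int64_t)(insn & 0x03ffffff);
  if (imm & (1 << 25)) { imm -= (1 << 26); } /* sign-extend 26 bits */
  return pc + (uint64_t)imm * 4;
}

/* Hypothetical demo using the last line of the trace above. */
static void bl_demo(void) {
  assert(is_bl(0x94094ebau));
  assert(bl_target(0x94094ebau, UINT64_C(0x1008bc7bc)) == UINT64_C(0x100b102a4));
}
```

The bytes really are a BL to the address capstone printed, so the mnemonic and operand are right and only the numeric ID disagrees with the header.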

Capstone Versioning

After lots of tedious git bisection, and some other revelations, we can see that rather than using the original capstone project from GitHub, FRIDA uses its own fork at https://github.com/frida/capstone. And although the master branch of capstone shows that the enumeration of ARM64 instruction IDs hasn't changed in forever, it is actually the next branch which is under active development and the one which is tracked by FRIDA's fork.

Back to our bisection and we can see that it is this commit which is causing us problems...
frida/frida@bb8c82d

We have taken a long and tedious path to discovering where frida-gum-devkit gets its capstone from. And we can now compare these two versions of capstone. We see that in this next branch capstone is introducing support for ARM v9.2-A and adding new instructions to the enumeration. The enum lists them in alphabetical order and the values are not explicitly defined, so they are simply assigned sequentially. This means that the numeric value of a given enumerator can change between versions.
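A minimal sketch of why the IDs drift, using two made-up enumerations rather than the real capstone headers (every name and instruction list here is hypothetical). Inserting one alphabetically sorted entry shifts every later value, so a caller built against the new header misreads IDs produced by a library built from the old one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical "old" release: implicit values, alphabetical order. */
enum old_insn {
  OLD_INS_INVALID, /* 0 */
  OLD_INS_ADD,     /* 1 */
  OLD_INS_B,       /* 2 */
  OLD_INS_BL,      /* 3 */
  OLD_INS_MOV      /* 4 */
};

/* Hypothetical "new" release: ADDG is inserted in alphabetical order. */
enum new_insn {
  NEW_INS_INVALID, /* 0 */
  NEW_INS_ADD,     /* 1 */
  NEW_INS_ADDG,    /* 2 (new) */
  NEW_INS_B,       /* 3 */
  NEW_INS_BL,      /* 4 */
  NEW_INS_MOV      /* 5 */
};

/* Stands in for a library compiled against the old header decoding a BL. */
static unsigned int old_library_decode_bl(void) { return OLD_INS_BL; }

/* Stands in for a caller compiled against the new header. */
static bool caller_sees_bl(unsigned int id) {
  return id == (unsigned int)NEW_INS_BL;
}

static void enum_drift_demo(void) {
  unsigned int id = old_library_decode_bl();
  assert(id == 3);             /* value the old build emits for BL */
  assert(NEW_INS_BL == 4);     /* value the new header expects */
  assert(!caller_sees_bl(id)); /* so the BL goes unrecognised */
}
```

Giving each enumerator an explicit value, or only ever appending new ones at the end, would keep existing IDs stable across releases at the cost of losing the alphabetical ordering.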

After much playing around with frida-gum-devkit to determine whether it uses a different version of capstone when building its SDK from the one it uses when building FRIDA Gum itself, we determine that it is correctly using just one version, the later one. We add a unit test (in frida-gum/tests/core/arch-arm64/arm64relocator.c) and, sure enough, capstone correctly disassembles our BL instruction and assigns it the ID 0x44, just as we expect.

TESTLIST_BEGIN (arm64relocator)
  ...
  TESTENTRY (bad_branch)
TESTLIST_END ()

TESTCASE (bad_branch)
{
  const guint32 input[] = {
    GUINT32_TO_LE (0x94094eba)  /* bl ???? */
  };
  const uint8_t* code = (const uint8_t*)input;
  size_t size = sizeof(input);
  uint64_t address = 0x00000001048007BC;
  csh capstone;
  cs_insn instruction;

  cs_open (CS_ARCH_ARM64, GUM_DEFAULT_CS_ENDIAN, &capstone);
  cs_option (capstone, CS_OPT_DETAIL, CS_OPT_ON);

  int ret = cs_disasm_iter (capstone, &code, &size, &address, &instruction);
  g_assert_cmpuint (ret, ==, 1);
  g_print ("ARM64_INS_BL: %08x\n", ARM64_INS_BL);
  g_print ("instruction->id: %08x\n", instruction.id);
  g_assert_cmpuint (instruction.id, ==, ARM64_INS_BL);

  cs_close (&capstone);
}
build/tmp-macos-arm64/frida-gum/tests/gum-tests -p /Core/Arm64Relocator/bad_branch

So why when we run in the fuzzer does capstone give us an ID of 0x2E (the value which was used in the old version of capstone)?

Crushing Realisation

So if capstone is giving us the old value for the instruction ID, then we must be using an old version of capstone, not the one which is incorporated in the frida-gum-devkit. So we check out LibAFL and sure enough...

% grep capstone . -r --include "Cargo.toml"
./libafl_qemu/Cargo.toml:capstone = "0.11.0"
./libafl_frida/Cargo.toml:capstone = "0.11.0"
./fuzzers/frida_libpng/Cargo.toml:capstone = "0.11.0"
./fuzzers/frida_gdiplus/Cargo.toml:capstone = "0.11.0"

Capstone Capstone Everywhere

But it's not that simple. Using cargo tree, we can see capstone is a dependency of:

  1. frida_libpng (the fuzzer itself)
  2. frida-gum (one of the components of frida-rust)
  3. libafl_frida

Conclusions

Now we have worked out the root cause of our issues, we have the following unanswered questions:

  • Q. Should capstone have maintained the values of the constants used in its enumerations between versions as new instructions get introduced (fair enough, it's our fault for mixing up the versions in the first place, but this could have saved us)?
  • Q. Should more use be made of symbol versioning as used by glibc on Linux?
  • Q. How do we avoid having so many copies of capstone in our dependency tree?
  • Q. Why doesn't the linker warn us of the colliding symbols when we build our fuzzer?
@domenukk
Member

If capstone-rust/capstone-rs#140 gets merged we should have a fix

@WorksButNotTested
Collaborator Author

WorksButNotTested commented Feb 3, 2023

I have raised this ticket against frida-rust: frida/frida-rust#81. If we patch capstone to be aware of FRIDA, it feels like we create a bi-directional dependency between the two, e.g. capstone is dependent on FRIDA's capstone fork, then FRIDA's Rust bindings are dependent on capstone. This feels a little messy to me?

@domenukk
Member

Did we get a new frida release yet? cc @s1341

@domenukk
Member

I think this got resolved in the meantime?
