Travis Goodspeed edited this page Apr 13, 2019 · 8 revisions

Reversing MD380 Firmware with GHIDRA

Recently, the NSA released binaries and source code for the GHIDRA reverse engineering framework. While many professionals use this as an opportunity to air their many grievances against IDA Pro's purchasing department, and many students use this as an opportunity to complain about IDA Pro's pricing, we should kindly focus instead on what nifty things might be done with this tool.

The Tytera MD380, also sold under different brands and related model numbers, is a handheld radio popular in the amateur radio community for the digital DMR protocol. A few years back, our merry little gang dumped this radio's firmware, reverse engineered it, and built the md380tools project to patch the firmware with additional features, such as a global user directory, promiscuous mode, and additional USB functions.

These notes will show you how to load the MD380 radio's firmware into GHIDRA, then import symbols from the md380tools reverse engineering project.

Before You Begin

To begin, you should have the MD380Tools source code, having compiled it to produce intermediate files, such as symbols and decrypted firmware images. See the README if you don't already have an embedded ARM development toolchain.

% git clone https://github.com/travisgoodspeed/md380tools
% cd md380tools
% make release

Loading Firmware

GHIDRA seems to have been designed with firmware in mind, and it happily supports loading raw binaries, so long as you know a trick or two.

Let's begin with File/New Project to create a new Non-Shared project. (If you begin this as a Shared project, you will need to have an exclusive checkout while changing the memory layout in later steps.)

Once your project has been opened, load md380tools/firmware/unwrapped/D013.020.img or any other unwrapped firmware image. (These are produced by the Makefile scripts from poorly encrypted firmware updates on the Internet.)

We must set the Language to little-endian ARM Cortex (ARM:LE:32:Cortex), and use the Options pane to set the proper loading address. These images contain Flash memory beginning after the 48k bootloader, so we will need a loading address of 0x0800C000.
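The loading address is just arithmetic on the Flash layout. As a minimal sketch, assuming the usual STM32 internal Flash base of 0x08000000 (the constant names here are illustrative, not taken from md380tools):

```python
# Sketch: derive the loading address for an unwrapped MD380 image.
FLASH_BASE = 0x08000000       # assumption: standard STM32 internal Flash base
BOOTLOADER_SIZE = 48 * 1024   # the 48k bootloader that precedes the application


def load_address(flash_base=FLASH_BASE, bootloader_size=BOOTLOADER_SIZE):
    """Return the address at which the application image begins."""
    return flash_base + bootloader_size


print("0x%08X" % load_address())
```

If a different radio has a bootloader of another size, only BOOTLOADER_SIZE should need to change.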

After the image has been loaded, we now have Flash loaded to the proper address, but we have no RAM, and neither IO nor function symbols. Let's begin by finding the function entry points with the auto-analyzer, then importing their proper names.

Initial Analysis

When you first double-click the D013.020.img image after loading it, GHIDRA will prompt you to run the auto-analyzer with any number of default options. By choosing the default options and waiting a minute, you will find hundreds of functions accurately identified for you.

Unfortunately, not everything can be instantly identified. For example, the table located at the beginning (0x0800C000) is the Interrupt Vector Table, but it is not located at the default address. GHIDRA is smart enough to recognize many of these entries, and the exceptions are things like the initial stack address (at 0x0800C000) and the RESET vector (at 0x0800C004) that are never called by the application's own code. It's worth chasing down a few of these to see how the auto-analyzer fails, but we won't concern ourselves with them in this tutorial.
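If you do want to chase those first two entries by hand, they are simply two little-endian 32-bit words at the start of the image. Here is a rough sketch; the function name is hypothetical and the sample values are made up for illustration, not read from a real MD380 image.

```python
import struct


def parse_vector_head(image):
    """Decode the first two little-endian words of a Cortex-M vector table."""
    initial_sp, reset_vector = struct.unpack_from("<II", image, 0)
    return {"initial_sp": initial_sp, "reset_vector": reset_vector}


# Made-up sample bytes standing in for the start of a firmware image.
sample = struct.pack("<II", 0x20001000, 0x0800C1A1)
head = parse_vector_head(sample)
print("initial SP   = 0x%08X" % head["initial_sp"])
print("RESET vector = 0x%08X" % head["reset_vector"])
```

To try it on a real image, pass the bytes of the unwrapped file, for example the result of reading D013.020.img in binary mode.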

You should also take a moment to consider what you haven't had to do. Unlike IDA Pro and Binary Ninja, GHIDRA's concept of a Language setting allows it to know beforehand that this binary is entirely Thumb2, with no classical ARM instructions and no need to set a virtual register. You never needed to

Exploring a Few Functions

Before we continue with importing other artifacts from the MD380Tools project, let's take a look at some handy functions and use them to find others.

Let's start with the lowest levels of the SPI Flash driver. The SPI Flash is an external Flash chip that contains the radio's codeplug, defining frequencies and configuration settings. In reverse engineering new functions, it is incredibly helpful to watch them read their settings out of the SPI Flash chip.

Begin by navigating to the md380_spi_sendrecv function at 0x080314bd. Do this by hitting G in the disassembly view, then giving GHIDRA the address, which it will round down to 0x080314bc. (Internally, ARM has odd addresses for all Thumb functions, but GHIDRA does not follow this convention.) Once there, select the function's name and hit the F key to edit the function definition, changing the name to md380_spi_sendrecv. Think of this function as getchar() and putchar(char) wrapped into one; it sends a single byte out of the SPI bus while returning the byte that crossed back at the same time.
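The rounding is nothing more than clearing the low bit, which Thumb code uses as a mode flag rather than as part of the address. A small sketch, with hypothetical helper names:

```python
def is_thumb_pointer(symbol_address):
    """True when the low bit marks the target as Thumb code."""
    return bool(symbol_address & 1)


def listing_address(symbol_address):
    """Clear the Thumb bit to get the address shown in the listing."""
    return symbol_address & ~1


adr = 0x080314BD
print("0x%08x -> 0x%08x" % (adr, listing_address(adr)))
```

This only applies to code pointers; a data address that happens to be odd should be left alone.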

Next, navigate to 0x080314bd and look at its decompiled view. You can see that this sends the byte 0x03, followed by the three bytes of the second parameter beginning with the most significant, and then all the bytes pointed to by the first parameter for a count of the third parameter.

void FUN_080314bc(undefined *puParm1,uint uParm2,short sParm3)

{
  undefined uVar1;
  
  FUN_0803152a();
  md380_spi_sendrecv(3);
  md380_spi_sendrecv(uParm2 >> 0x10 & 0xff);
  md380_spi_sendrecv(uParm2 >> 8 & 0xff);
  md380_spi_sendrecv(uParm2 & 0xff);
  while( true ) {
    if (sParm3 == 0) break;
    uVar1 = md380_spi_sendrecv(0xa5);
    *puParm1 = uVar1;
    puParm1 = puParm1 + 1;
    sParm3 = sParm3 + -1;
  }
  FUN_08031546();
  return;
}

From the SPI Flash's datasheet, we know that 0x03 is the command to read data bytes from the chip. The real name of this function is md380_spiflash_read, and chasing down the addresses that it reads allows us to match functions in the firmware with their meaning in the codeplug. Since we know that the chip must be selected before the read, and deselected after, we can guess that FUN_0803152a is really md380_spiflash_enable and FUN_08031546 is really md380_spiflash_disable without even having to read their code.
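To make the byte sequence concrete, here is a rough Python model of the decompiled read routine, run against a pretend chip. FakeFlash and spiflash_read are hypothetical names invented for this sketch, and chip select is left out; only the 0x03 command, the 24-bit address order, and the 0xa5 filler byte come from the decompilation.

```python
class FakeFlash:
    """Hypothetical stand-in for the SPI Flash chip, backed by a bytes object."""

    def __init__(self, contents):
        self.contents = contents
        self.sent = []  # every byte the host has clocked out so far

    def sendrecv(self, out_byte):
        self.sent.append(out_byte)
        if len(self.sent) <= 4:
            return 0xFF  # nothing useful comes back during command and address
        adr = (self.sent[1] << 16) | (self.sent[2] << 8) | self.sent[3]
        return self.contents[adr + len(self.sent) - 5]


def spiflash_read(sendrecv, adr, length):
    """Mirror the decompiled loop: command, address MSB first, then data."""
    sendrecv(0x03)                 # READ DATA BYTES
    sendrecv((adr >> 16) & 0xFF)
    sendrecv((adr >> 8) & 0xFF)
    sendrecv(adr & 0xFF)
    return bytes(sendrecv(0xA5) for _ in range(length))


flash = FakeFlash(bytes(range(256)))
data = spiflash_read(flash.sendrecv, 0x10, 4)
print(data.hex(), [hex(b) for b in flash.sent[:4]])
```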

Using the L key in the decompiled view to define parameter, function, and variable names, we can come up with a much cleaner decompilation. The ; key allows you to mark comments.

void md380_spiflash_read(undefined *buffer,uint adr,short length)

{
  undefined currbyte;
  
  md380_spiflash_enable();
                    /* 0x03 = READ DATA BYTES command */
  md380_spi_sendrecv(3);
  md380_spi_sendrecv(adr >> 0x10 & 0xff);
  md380_spi_sendrecv(adr >> 8 & 0xff);
  md380_spi_sendrecv(adr & 0xff);
  while( true ) {
    if (length == 0) break;
    currbyte = md380_spi_sendrecv(0xa5);
    *buffer = currbyte;
    buffer = buffer + 1;
    length = length + -1;
  }
  md380_spiflash_disable();
  return;
}

And having these fine functions, we can chase down related ones. Right-click on the function name in the disassembly view, and choose Show References to, which lists every caller of the function. (Or Ctrl+Shift+F.) Doing this for md380_spiflash_enable() gives us a half dozen functions that interact with the SPI Flash, such as md380_spiflash_write(), md380_spiflash_sektor_erase4k(), and md380_spiflash_block_erase64k(). Doing this for md380_spiflash_read() gives us every function that reads from SPI Flash, and from those addresses we can know what they are reading.

For example, consider this unknown function which reads twenty bytes from 0x2040.

undefined4 FUN_080226c0(void)

{
  undefined4 unaff_r7;
  
  md380_spiflash_read(DAT_08023398,0x2040,0x14);
  return unaff_r7;
}

In the incomplete CHIRP driver at md380tools/chirp/md380.py, we see that 0x2000 is a configuration structure containing two twenty-byte strings for the startup text. Sure enough, our mystery function is Get_Welcome_Line1_from_spi_flash and we can hook or patch it to change what text is displayed at startup!

#seekto 0x2000;
struct {
    u8 unknownff;
    bbcd prog_yr[2];
    bbcd prog_mon;
    bbcd prog_day;
    bbcd prog_hour;
    bbcd prog_min;
    bbcd prog_sec;
    u8 unknownver[4];       //Probably version numbers.
    u8 unknownff2[52];      //Maybe unused?  All FF.
    char line1[20];         //Top line of text at startup.
    char line2[20];         //Bottom line of text at startup.
...
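You can check that 0x2040 really is line1 by adding up the field sizes. This sketch assumes each bbcd field occupies one byte, as in CHIRP's bitwise notation; the FIELDS table is transcribed by hand from the structure above and covers only the fields shown.

```python
# (name, size in bytes) for the fields shown above, in order.
FIELDS = [
    ("unknownff", 1),
    ("prog_yr", 2),      # bbcd[2], assumed one byte per element
    ("prog_mon", 1),
    ("prog_day", 1),
    ("prog_hour", 1),
    ("prog_min", 1),
    ("prog_sec", 1),
    ("unknownver", 4),
    ("unknownff2", 52),
    ("line1", 20),
    ("line2", 20),
]


def field_offsets(fields, base=0x2000):
    """Map each field name to its absolute address in SPI Flash."""
    offsets = {}
    adr = base
    for name, size in fields:
        offsets[name] = adr
        adr += size
    return offsets


offsets = field_offsets(FIELDS)
print(hex(offsets["line1"]), hex(offsets["line2"]))
```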

Loading a Core Dump of RAM

Now, it's cool that we can trace these settings from SPI Flash images, but it would be nicer if we could just follow the pointer stored at 0x08023398 into RAM to see what bytes had been loaded there. To do that, we'll need to create a 128k region at 0x20000000. (The STM32F405 also has 64k at 0x10000000 but this is rarely used by the linker for static buffers.)

Choose Add To Program from the File menu to add a second image. Choose md380tools/cores/d13020-core.img, which is a live RAM dump made over USB from a booted copy of D13.020. In the options pane, you must set the loading address to 0x20000000.

Now we can navigate to 0x2001E3FC where the string is stored. Define the first type to be wchar16 then press [ to create an array of 10 elements. Doing the same at 0x2001e410 for the second line, you can see the two startup lines of my radio configuration in SRAM!

        2001e3fa 00              ??         00h
        2001e3fb 00              ??         00h
        2001e3fc 4b 00 4b        wchar16[   u"KK4VCZ    "
                 00 34 00 
                 56 00 43 
        2001e410 33 00 31        wchar16[   u"3147092   "
                 00 34 00 
                 37 00 30 
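The same strings can be pulled out of a dump without GHIDRA, since each line is ten 16-bit little-endian characters. A sketch follows; the listing above shows only the first nine bytes of each array, so the remaining bytes here are reconstructed from the displayed string, and decode_welcome_line is a hypothetical helper.

```python
LINE_BYTES = 20  # ten wchar16 characters


def decode_welcome_line(raw):
    """Decode one 20-byte startup line stored as UTF-16LE."""
    return raw[:LINE_BYTES].decode("utf-16-le")


# "KK4VCZ" followed by four spaces, as two-byte little-endian characters.
raw_line1 = bytes.fromhex("4b004b003400560043005a00" + "2000" * 4)
line1 = decode_welcome_line(raw_line1)
print(repr(line1))
```

Against the real core dump, the line would sit at offset 0x2001E3FC - 0x20000000 in the file, assuming the dump begins at the start of SRAM.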

Fixing Literal Pools

As one last little hassle, the decompiler is confused about ARM literal pools. See, ARM doesn't really have 32-bit immediates; instead, it fakes them by having pools of data between functions, which are referenced relative to the program counter.

GHIDRA becomes confused because, in theory, these literal pools might be overwritten with new 32-bit values. Because of this confusion, the decompiler tells you that the variable DAT_080233b0 is passed along, when in fact the literal 0x2001E410 is the only value that will ever be at 0x080233b0.

void Get_Welcome_Line2_from_spi_flash(void){
  md380_spiflash_read(DAT_080233b0,0x2054,0x14);
  return;
}
DAT_080233b0             XREF[1]:     Get_Welcome_Line2_from_spi_flash
080233b0 10 e4 01 20     undefined4 2001E410h

In practice, this Flash memory is rather complicated to change, and code is not directly writable. So we need to tell the decompiler that the Flash page is not writable in Window/Memory Map. Simply unchecking the box and saving the new memory map will correct the decompilation, showing us that the string being passed is "3147092". We can of course give it a clearer name and check for cross-references, in order to know which other functions use the second welcome line.

void Get_Welcome_Line2_from_spi_flash(void){
  md380_spiflash_read((char *)u_3147092_2001e410,0x2054,0x14);
  return;
}
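The pointer the decompiler now resolves can be checked by hand, because a literal pool entry is just a little-endian 32-bit word. This sketch decodes the four bytes from the listing above and confirms they land inside the 128k SRAM region; the helper names are made up for illustration.

```python
import struct

SRAM_BASE = 0x20000000
SRAM_SIZE = 128 * 1024


def literal_pool_word(raw):
    """Decode one little-endian 32-bit literal pool entry."""
    return struct.unpack("<I", raw)[0]


def in_sram(address):
    """True when the address falls inside the 128k region at 0x20000000."""
    return SRAM_BASE <= address < SRAM_BASE + SRAM_SIZE


value = literal_pool_word(bytes.fromhex("10e40120"))
print("0x%08X" % value, in_sram(value))
```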

Loading Symbols from GNU LD

Now that we have learned to find our own functions, it might be a good time to import those that others have found.

Among the many fine example scripts included with GHIDRA is ImportSymbolsScript.py, which takes a flat text file of symbol names and addresses to create labels in the open project.

#Imports a file with lines in the form "symbolName 0xADDRESS"
#@category Data
#@author 
 
f = askFile("Give me a file to open", "Go baby go!")

for line in file(f.absolutePath):  # note, cannot use open(), since that is in GhidraScript
    pieces = line.split()
    address = toAddr(long(pieces[1], 16))
    print "creating symbol", pieces[0], "at address", address
    createLabel(address, pieces[0], False)

In the md380tools project, we use Radare2 symbols to mark what we think about an image and GNU LD scripts to mark those pieces that we actually link against. Converting the GNU LD scripts with symbols2ghidra.py <symbols_d03_020 >ghidrasyms.txt is nice and easy.
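For a feel of what such a conversion involves, here is a rough sketch. It is not the project's symbols2ghidra.py; it assumes the symbol file holds GNU LD assignments of the form "name = 0xADDRESS;" and emits the "symbolName 0xADDRESS" lines that ImportSymbolsScript.py expects. The ld_to_ghidra name and the clear_thumb_bit option are inventions of this example.

```python
import re

# Assumed input format: one GNU LD assignment per line, "name = 0x...;"
ASSIGNMENT = re.compile(r"^\s*(\w+)\s*=\s*(0x[0-9a-fA-F]+)\s*;")


def ld_to_ghidra(lines, clear_thumb_bit=True):
    """Convert LD symbol assignments to 'name 0xADDRESS' lines."""
    out = []
    for line in lines:
        match = ASSIGNMENT.match(line)
        if not match:
            continue  # skip comments, blank lines, anything unrecognized
        name = match.group(1)
        address = int(match.group(2), 16)
        if clear_thumb_bit:
            address &= ~1  # labels go on the even address shown in the listing
        out.append("%s 0x%08x" % (name, address))
    return out


sample = ["/* SPI Flash */", "md380_spiflash_read = 0x080314bd;", ""]
converted = ld_to_ghidra(sample)
print("\n".join(converted))
```

Clearing the low bit suits function symbols, but it would misplace a data symbol that genuinely lives at an odd address, so pass clear_thumb_bit=False if the file mixes code and data.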

Have fun!

From this quick tutorial you should see that it's easy to load MD380 firmware and symbols into GHIDRA, and that the decompiler does an admirable job on the binary.

You should also be careful to mark memory pages correctly, as accurate permissions are necessary for the decompiler to produce clean, compact code. Users working on shared projects must be sure to exclusively check out a file before changing the memory map or loading new regions.

Cheers from Knoxville,

--Travis KK4VCZ