# Go Stack Strings
> Researching a generic solution to decrypt these stack strings

- toc: true 
- badges: true
- categories: [string decryption,emulation,golang]

## Overview

Based on our previous work on [Garble Go string decryption](https://research.openanalysis.net/garble/go/obfuscation/strings/2023/08/03/garble.html), we have identified another obfuscator used with Go that builds obfuscated strings in-line on the stack (similar to ADV) instead of decrypting them through dedicated functions. The source and name of the obfuscator are currently unknown, but the obfuscation pattern in the compiled binary is easy to identify.


## Sample
- [dd124a7b396150e4d8275c473594e47ac24606ef0955e2c13310aac9045554ac](https://www.unpac.me/results/c2d49f20-7bd7-48af-b5f6-4ead73d3790f?hash=dd124a7b396150e4d8275c473594e47ac24606ef0955e2c13310aac9045554ac#/)



## Analysis

### Identify The Encrypted Strings

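#### Pattern 1 - DWORD MOV Encrypted
The encrypted stack strings all share the same basic format: a run of DWORD `mov` instructions writes the data onto the stack, and the decryption loop that follows ends in a `jb`/`jmp` pair.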
```
.text:00610168 C7 44 24 38 65 B5 87 BA                 mov     [esp+0B8h+var_80], 0BA87B565h ; Decoded: Binance/
.text:00610170 C7 44 24 3C C2 2F B3 4B                 mov     [esp+0B8h+var_7C], 4BB32FC2h
.text:00610178 C7 44 24 30 27 DC E9 DB                 mov     [esp+0B8h+var_88], 0DBE9DC27h
.text:00610180 C7 44 24 34 AC 4C D6 64                 mov     [esp+0B8h+var_84], 64D64CACh
```
```
.text:006101A4 72 E6                                   jb      short loc_61018C
.text:006101A6 E9 5E 01 00 00                          jmp     loc_610309
```
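The loop bytes for this sample (included in the emulation test case later in this post) appear to XOR the eight bytes at `var_88` with the eight bytes at `var_80`. As a quick static sanity check, here is a minimal sketch that assumes this reading of the loop is correct, with the first two DWORDs as the key and the last two as the ciphertext:

```python
import struct

# Immediates copied from the four mov instructions above.
key_dwords = [0xBA87B565, 0x4BB32FC2]   # written to var_80 and var_7C
data_dwords = [0xDBE9DC27, 0x64D64CAC]  # written to var_88 and var_84

# x86 stores each DWORD little-endian, so pack with '<I'.
key = b"".join(struct.pack("<I", d) for d in key_dwords)
data = b"".join(struct.pack("<I", d) for d in data_dwords)

# Assumption: the loop is a plain byte-wise XOR of data with key.
plain = bytes(d ^ k for d, k in zip(data, key))
print(plain)  # b'Binance/'
```

The result matches the `Decoded: Binance/` comment in the listing. Other strings may use a different operation, which is why emulation is the more generic approach.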

We can use this loose signature to identify potential strings: `C7 44 24 ?? ?? ?? ?? ?? C7 44 24`

For the longer encoding with a 32-bit stack offset, we can also try this signature: `C7 84 24 ?? ?? ?? ?? ?? ?? ?? ?? C7`
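To try these signatures outside of a disassembler, they can be turned into byte regexes. The following is a rough sketch only: `sig_to_regex`, `SIG_DISP8`, and `SIG_DISP32` are hypothetical names introduced for illustration.

```python
import re

def sig_to_regex(sig):
    """Convert a 'C7 44 24 ??' style signature into a compiled bytes regex."""
    parts = []
    for token in sig.split():
        if token == "??":
            parts.append(b".")  # wildcard: any single byte
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    # DOTALL so that '.' also matches the byte 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)

SIG_DISP8 = sig_to_regex("C7 44 24 ?? ?? ?? ?? ?? C7 44 24")
SIG_DISP32 = sig_to_regex("C7 84 24 ?? ?? ?? ?? ?? ?? ?? ?? C7")

# First two instructions of the Pattern 1 listing.
sample = bytes.fromhex("C744243865B587BA" "C744243CC22FB34B")
match = SIG_DISP8.search(sample)
print(match.start() if match else None)  # 0
```

Compiling with `re.DOTALL` matters here: without it a wildcard does not match `0x0A`, so any instruction with that byte in its offset or immediate would be missed.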

#### Pattern 2 - DWORD MOV Encrypted Overlap

Two stack strings can be set up one after the other before the terminator of the first decryption loop is reached (due to how the compiler placed the basic blocks), so the two patterns overlap.
```
.text:00569D71 C7 44 24 55 99 CB F4 B7                 mov     [esp+0DCh+var_87], 0B7F4CB99h
.text:00569D79 C7 44 24 58 B7 0B 46 4D                 mov     [esp+0DCh+var_87+3], 4D460BB7h
.text:00569D81 C7 44 24 4E DD 30 5A 18                 mov     [esp+0DCh+var_8E], 185A30DDh
.text:00569D89 C7 44 24 51 18 80 B2 C1                 mov     [esp+0DCh+var_8E+3], 0C1B28018h


.text:00569E18 C7 44 24 64 0E D2 58 A3                 mov     [esp+0DCh+var_78], 0A358D20Eh
.text:00569E20 C7 44 24 68 73 81 14 91                 mov     [esp+0DCh+var_74], 91148173h
.text:00569E28 C7 44 24 5C 5E A0 37 C5                 mov     [esp+0DCh+var_80], 0C537A05Eh
.text:00569E30 C7 44 24 60 1A ED 71 B1                 mov     [esp+0DCh+var_7C], 0B171ED1Ah


.text:0056A04A 0F 8D 48 FD FF FF                       jge     loc_569D98
.text:0056A050 0F B6 5C 14 55                          movzx   ebx, byte ptr [esp+edx+0DCh+var_87]
.text:0056A055 72 E2                                   jb      short loc_56A039
.text:0056A057 EB 2A                                   jmp     short loc_56A083
```

The same pattern also appears with the longer `C7 84 24` encoding, which uses a 32-bit stack offset.
```
.text:0056AE44 C7 84 24 9D 00 00 00 94 0A 7B 36                             mov     [esp+2C0h+var_223], 367B0A94h
.text:0056AE4F C7 84 24 99 00 00 00 F1 6E 1C 53                             mov     [esp+2C0h+var_227], 531C6EF1h
.text:0056AE5A 31 C0                                                        xor     eax, eax
.text:0056AE5C EB 12                                                        jmp     short loc_56AE70


.text:0056AE7D 72 DF                                                        jb      short loc_56AE5E
.text:0056AE7F E9 FC 16 00 00                                               jmp     loc_56C580
```
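The two encodings differ only in the ModRM byte: `0x44` selects an 8-bit displacement and `0x84` a 32-bit displacement, both with the SIB byte `0x24` (base `esp`). As an illustration, a small static parser for both forms might look like the sketch below. `parse_esp_mov` is a hypothetical helper, not part of the tooling in this post.

```python
import struct

def parse_esp_mov(buf, pos=0):
    """Parse `mov dword ptr [esp+disp], imm32` at buf[pos].

    Returns (disp, imm, length), or None if the bytes are not one of
    the two encodings shown in the listings.
    """
    head = buf[pos:pos + 3]
    if head == b"\xc7\x44\x24":      # ModRM 0x44: signed 8-bit displacement
        fmt, length = "<bI", 8
    elif head == b"\xc7\x84\x24":    # ModRM 0x84: signed 32-bit displacement
        fmt, length = "<iI", 11
    else:
        return None
    if len(buf) - pos < length:
        return None                  # truncated instruction
    disp, imm = struct.unpack_from(fmt, buf, pos + 3)
    return disp, imm, length

short_form = bytes.fromhex("C744245599CBF4B7")
long_form = bytes.fromhex("C784249D000000940A7B36")
print([hex(v) for v in parse_esp_mov(short_form)])  # ['0x55', '0xb7f4cb99', '0x8']
print([hex(v) for v in parse_esp_mov(long_form)])   # ['0x9d', '0x367b0a94', '0xb']
```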

#### Pattern 3 - DWORD MOV Plaintext
Some strings are not encrypted at all.
```
.text:005404BC C7 44 24 16 53 65 6C 65                 mov     [esp+38h+var_22], 'eleS'
.text:005404C4 C7 44 24 1A 63 74 20 2A                 mov     [esp+38h+var_1E], '* tc'
.text:005404CC C7 44 24 1E 20 66 72 6F                 mov     [esp+38h+var_1A], 'orf '
.text:005404D4 C7 44 24 22 6D 20 57 69                 mov     [esp+38h+var_16], 'iW m'
.text:005404DC C7 44 24 26 6E 33 32 5F                 mov     [esp+38h+var_12], '_23n'
.text:005404E4 C7 44 24 2A 50 68 79 73                 mov     [esp+38h+var_E], 'syhP'
.text:005404EC C7 44 24 2E 69 63 61 6C                 mov     [esp+38h+var_A], 'laci'
.text:005404F4 C7 44 24 32 4D 65 6D 6F                 mov     [esp+38h+var_6], 'omeM'
.text:005404FC 66 C7 44 24 36 72 79                    mov     [esp+38h+var_2], 'yr'
.text:00540503 C7 04 24 00 00 00 00                    mov     [esp+38h+var_38], 0
```
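For plaintext strings no emulation is strictly needed, since replaying the writes in stack order is enough. As a small worked example, the sketch below rebuilds the string from the listing above using the offsets and immediates copied by hand:

```python
# (stack offset, immediate, size in bytes), copied from the listing above.
writes = [
    (0x16, 0x656C6553, 4),
    (0x1A, 0x2A207463, 4),
    (0x1E, 0x6F726620, 4),
    (0x22, 0x6957206D, 4),
    (0x26, 0x5F32336E, 4),
    (0x2A, 0x73796850, 4),
    (0x2E, 0x6C616369, 4),
    (0x32, 0x6F6D654D, 4),
    (0x36, 0x7972, 2),  # the final 16-bit mov (66 C7 44 24 ...)
]

base = min(off for off, _, _ in writes)
end = max(off + size for off, _, size in writes)
buf = bytearray(end - base)

for off, imm, size in writes:
    # Little-endian store, as the CPU would write it.
    buf[off - base:off - base + size] = imm.to_bytes(size, "little")

print(buf.decode("ascii"))  # Select * from Win32_PhysicalMemory
```

Note that IDA displays the immediates as reversed character constants (`'eleS'`) because the DWORDs are stored little-endian.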

#### Anti-Pattern 1 - MOV Low Value IMM
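Here every immediate moved onto the stack, including the first, is a low value (zero in this example). These are ordinary variable initializations rather than strings, so we can ignore them.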
```
.text:005416C9 C7 44 24 40 00 00 00 00                 mov     [esp+48h+var_8], 0
.text:005416D1 C7 44 24 44 00 00 00 00                 mov     [esp+48h+var_4], 0
```
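The scanner later in this post applies this filter by checking only the first immediate against `0xff`. Isolated as a hedged sketch, where `first_imm_is_candidate` is a hypothetical name and only the short `C7 44 24` encoding is handled:

```python
import struct

def first_imm_is_candidate(insn, threshold=0xFF):
    """Return True if the imm32 of a `C7 44 24 disp8 imm32` mov is not a low value."""
    # The imm32 starts after opcode, ModRM, SIB and disp8 (4 bytes).
    imm = struct.unpack_from("<I", insn, 4)[0]
    return imm >= threshold

print(first_imm_is_candidate(bytes.fromhex("C744244000000000")))  # False
print(first_imm_is_candidate(bytes.fromhex("C744243865B587BA")))  # True
```

This is only a heuristic: a string whose first DWORD happens to encrypt to a small value would be skipped as well.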


### Deobfuscate Strings

```python
from unicorn import *
from unicorn.x86_const import *
from capstone import *
from capstone.x86 import *

# Pattern 1 sample: four DWORD movs followed by the decryption loop
code = bytes.fromhex('c744243865b587bac744243cc22fb34bc744243027dce9dbc7442434ac4cd66431d2eb0e0fb66c143031dd9588441430954283fa087d0c0fb65c143872e6e95e0100008984249c000000')

STACK_BASE = 0x00100000
STACK_SIZE = 0x00010000
TARGET_BASE = 0x00400000
TARGET_SIZE = 0x00010000


def setup_emulator(code):
    """Map a zeroed stack and the code to emulate, then return the Uc instance."""
    uc = Uc(UC_ARCH_X86, UC_MODE_32)

    # Setup the stack with ESP in the middle of the mapping
    uc.mem_map(STACK_BASE, STACK_SIZE)
    uc.mem_write(STACK_BASE, b"\x00" * STACK_SIZE)
    uc.reg_write(UC_X86_REG_ESP, STACK_BASE + (STACK_SIZE // 2))

    # Setup code
    uc.mem_map(TARGET_BASE, TARGET_SIZE, UC_PROT_ALL)
    uc.mem_write(TARGET_BASE, b"\x00" * TARGET_SIZE)
    uc.mem_write(TARGET_BASE, code)
    return uc


def read_stack(uc):
    """Return every non-null byte left on the emulated stack."""
    return uc.mem_read(STACK_BASE, STACK_SIZE).replace(b'\x00', b'')


def decode(code):
    """Emulate the whole snippet, including the decryption loop."""
    uc = setup_emulator(code)
    cs = Cs(CS_ARCH_X86, CS_MODE_32)
    cs.detail = True

    def trace(uc, address, size, user_data):
        # Debug hook: uncomment the print to trace each instruction
        insn = next(cs.disasm(bytes(uc.mem_read(address, size)), address))
        #print(f"{address:#010x}:\t{insn.mnemonic}\t{insn.op_str}")

    uc.hook_add(UC_HOOK_CODE, trace, None)
    uc.emu_start(TARGET_BASE, TARGET_BASE + len(code), 0, 0)
    return read_stack(uc)


def get_stack(code):
    """Emulate only the leading run of mov instructions.

    Returns the offset where emulation stopped and the non-null stack bytes.
    """
    uc = setup_emulator(code)
    cs = Cs(CS_ARCH_X86, CS_MODE_32)
    cs.detail = True

    def trace(uc, address, size, user_data):
        insn = next(cs.disasm(bytes(uc.mem_read(address, size)), address))
        if insn.mnemonic != 'mov':
            # First non-mov instruction marks the end of the stack string
            uc.emu_stop()

    uc.hook_add(UC_HOOK_CODE, trace, None)
    uc.emu_start(TARGET_BASE, TARGET_BASE + len(code), 0, 0)

    offset = uc.reg_read(UC_X86_REG_EIP) - TARGET_BASE
    return offset, read_stack(uc)
```


```python
import re
import pefile
import struct

def is_ascii(s):
    return all(32 <= c < 127 for c in s)


# Emulation Start
# C7 44 24 ?? ?? ?? ?? ?? C7 44 24
# C7 84 24 ?? ?? ?? ?? ?? ?? ?? ?? C7
#
# Up to 9 bytes are allowed after the ModRM byte so the longer C7 84 24 form
# (SIB + disp32 + imm32) can match, and DOTALL lets '.' match the byte 0x0A.
start_egg = re.compile(rb'\xc7(\x44|\x84)(.{5,9})\xc7', re.DOTALL)

# Emulation End
# 72 DF                                                        jb      short loc_56AE5E
# E9 FC 16 00 00                                               jmp     loc_56C580
# 
# or just the end of the stack string (non encrypted)

with open('/tmp/test.bin', 'rb') as f:
    file_data = f.read()

section_data = None
pe = pefile.PE(data=file_data)

for s in pe.sections:
    if s.Name[:5] == b'.text':
        section_data = s.get_data()

assert section_data is not None


ptr = 0
size = len(section_data)
while ptr < size:
    chunk = section_data[ptr:]
    start_match = start_egg.search(chunk)
    if start_match is None:
        print('No more strings')
        break
    emu_start = start_match.start()
    # Test first IMM for a large value
    imm = struct.unpack('<I', start_match.group(2)[-4:])[0]
    if imm < 0xff:
        #print('Skipping non-stack string')
        ptr += 7 + emu_start
        continue

    data = chunk[emu_start:emu_start+0x200]
    #print(hex(ptr + emu_start))
    try:
        emu_offset, tmp_stack = get_stack(data)
    except Exception:
        # Emulation failed, so treat this candidate as not a string
        emu_offset = 0
        tmp_stack = b'\xff'
    #print(tmp_stack)
    if len(tmp_stack) > 3 and is_ascii(tmp_stack):
        # Offsets are relative to the start of the .text section data
        string_offset = ptr + emu_start
        string = tmp_stack.decode('utf-8')
        print(f"{hex(string_offset)}: {string}")
        if emu_offset == 0:
            ptr += 7 + emu_start
        else:
            ptr += emu_offset + emu_start
    else:
        ptr += 7 + emu_start
```


0x2b1c0: ntdll.dll
0x2b335: winmm.dll
0x2b445: ws2_32.dll
0x13f205: d1r&g.,^*%$EF
0x13f28c: Select * from Win32_Processor
0x13f33c: Select * from Win32_OperatingSystem
0x13f3fc: Select * from Win32_VideoController
0x13f4bc: Select * from Win32_PhysicalMemory
0x13f56c: SELECT * FROM Win32_Process
0x1eae79: tNXei:P@bcr
0x1eb8d5: Web Data
0x1eb945: History
0x1eb9bd: Bookmarks
0x1eba45: cookies.sqlite
0x1ebac5: key4.db
0x1ebb3d: logins.json
0x1ebbc5: places.sqlite
0x1ebc44: SELECT target_path, tab_url, total_bytes, start_time, end_time, mime_type FROM downloads
0x1ebd63: SELECT place_id, GROUP_CONCAT(content), url, dateAdded FROM (SELECT * FROM moz_annos INNER JOIN moz_places ON moz_annos.place_id=moz_places.id) t GROUP BY place_id
0x1ebf53: SELECT id, url, type, dateAdded, title FROM (SELECT * FROM moz_bookmarks INNER JOIN moz_places ON moz_bookmarks.fk=moz_places.id)
0x1ec0dc: SELECT name, value, host, path, creationTime, expiry, isSecure, isHttpOnly FROM moz_cookies
0x1ec20c: SELECT ite