# XOR key discovery from weakly implemented XOR encryption

Author: void

## Introduction

This is a guide to decrypting samples found in the wild that use weak XOR encryption. Throughout the guide, we use an old and famous Windows executable. It will be flagged as a malicious HackTool; however, the sample itself isn't malicious unless used maliciously (it is your responsibility not to be malicious with it).

We'll look at some of the telltale signs of XOR encryption with a repeating key, and then crack the sample ourselves.

## Sample `cannon.bin`

In [None]:
! echo "Checking the file type"
! file samples/cannon.bin
! echo ""
! echo "Checking the first few lines of the file"
! xxd samples/cannon.bin 2>/dev/null | head -n 5

### Analysis (Part 1)

Using the `file` command, we get `cannon.bin: data`. This only tells us that `file` could not match the sample against any known file type, so it falls back to the generic "data".

Looking at the file contents with `xxd`, the data appears to be garbage. On closer inspection, though, some characters repeat in the same positions from one pair of lines to the next.

```js
00000000: .... ..67 ..31 4426 ..4b 3372 .... 6943
00000010: ..6c 3049 4562 554e ..31 7549 ..74 596b

00000020: .... ..67 ..31 4426 ..4b 3372 .... 6943
00000030: ..6c 3049 4562 554e ..31 7549 ..74 596b
```

From the very start of this sample, some of the bytes repeat every 32 bytes (each `xxd` line shows 16 bytes, so the pattern recurs every two lines). This is a good sign that we may be working with some weakly implemented encryption.
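The same observation can be made without reading a hexdump by eye. The sketch below is one possible approach, not part of the original tooling: it counts how often a byte equals the byte `shift` positions later and reports the shifts that score highest. The function names and the synthetic data (a made-up 32-byte key and a fake header followed by null padding) are assumptions for illustration only.

```python
def match_ratio(data: bytes, shift: int) -> float:
    """Fraction of positions i where data[i] == data[i + shift]."""
    if shift <= 0 or shift >= len(data):
        return 0.0
    matches = sum(a == b for a, b in zip(data, data[shift:]))
    return matches / (len(data) - shift)


def likely_periods(data: bytes, max_shift: int = 64, top: int = 3) -> list:
    """Return the `top` shifts with the highest match ratio, best first."""
    scores = {s: match_ratio(data, s) for s in range(1, max_shift + 1)}
    return sorted(scores, key=scores.get, reverse=True)[:top]


# Synthetic stand-in for a sample: a short header followed by null padding,
# XORed with a made-up 32-byte key (this is NOT the key from cannon.bin).
demo_key = bytes(range(0x30, 0x50))
demo_plain = b"MZ" + bytes(range(1, 60)) + b"\x00" * 200
demo_cipher = bytes(
    c ^ demo_key[i % len(demo_key)] for i, c in enumerate(demo_plain)
)

# The best-scoring shift is a strong candidate for the key length.
print(likely_periods(demo_cipher))
```

On a real sample, the top shift and its multiples would be the key lengths worth investigating first.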

In [None]:
! echo "Checking the last few lines of the file"
! xxd samples/cannon.bin 2>/dev/null | tail -n 5

### Analysis (Part 2)

The last two lines of the file are an exact repeat of the two lines before them: a 32-byte string of seemingly random characters.

```js
000213b0: 686c 3049 4562 554e 3731 7549 4374 596b
000213c0: 3138 7267 4931 4426 4c4b 3372 7726 6943

000213d0: 686c 3049 4562 554e 3731 7549 4374 596b
000213e0: 3138 7267 4931 4426 4c4b 3372 7726 6943
```

Given this file is encrypted, having 2 long strings of characters which exactly match is a very useful sign.

Since our key spans two `xxd` lines, it is likely to be one of two options, depending on which line it starts on:
- "hl0IEbUN71uICtYk18rgI1D&LK3rw&iC"
- "18rgI1D&LK3rw&iChl0IEbUN71uICtYk"

For this example, we will use `18rgI1D&LK3rw&iChl0IEbUN71uICtYk`.
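When several candidates are in play, a known crib can help choose between them. The sketch below assumes the plaintext is a Windows executable and therefore starts with the magic bytes `MZ`; `pick_key` is a hypothetical helper, and the keys and data are synthetic rather than taken from `cannon.bin`.

```python
def xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key."""
    return bytes(c ^ key[i % len(key)] for i, c in enumerate(data))


def pick_key(ciphertext: bytes, candidates: list, magic: bytes = b"MZ"):
    """Return the first candidate that decrypts the start of the file to `magic`."""
    for candidate in candidates:
        if xor(ciphertext[:len(magic)], candidate) == magic:
            return candidate
    return None


# Synthetic demo: two rotations of the same made-up 8-byte key.
true_key = b"ABCDEFGH"
rotated_key = b"EFGHABCD"
demo_cipher = xor(b"MZ\x90\x00" + b"\x00" * 28, true_key)

chosen = pick_key(demo_cipher, [rotated_key, true_key])
print(chosen)
```

Only the first few bytes are decrypted per candidate, so the check stays cheap even for large files.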

In [None]:
import os
from pathlib import Path

# XOR
def xor(data: bytes, key: bytes):
    return bytes([
        c ^ key[i % len(key)]
        for i, c in enumerate(data)
    ])

# Main
key = "18rgI1D&LK3rw&iChl0IEbUN71uICtYk"
in_file_path = Path("samples/cannon.bin")
with open(in_file_path, "rb") as f:
    print(f"Reading file [{in_file_path}]")
    data = f.read()
# Decrypt
result = xor(data, key.encode())
# Save
out_file_path = Path("out/cannon.exe")
os.makedirs(out_file_path.parent, exist_ok=True)
with open(out_file_path, "wb") as f:
    print(f"Writing file [{out_file_path}]")
    f.write(result)
print("Done!")

In [None]:
# Check file
! echo "Checking the file type"
! file out/cannon.exe
! echo ""
! echo "Checking the first few lines of the file"
! xxd out/cannon.exe 2>/dev/null | head -n 8
! echo ""
! echo "Checking the SHA256 sum"
! shasum -a 256 out/cannon.exe | awk '{print $1}' 

### Analysis (Part 3)

A Windows executable! Just what we were looking for!
Opening it up in Windows gives us a small window for a program known as LOIC, or Low Orbit Ion Cannon.

![cannon.exe](screenshots/cannon-exe.png)

We can check whether the program is known by checking the SHA256 sum and putting that into VirusTotal.

[VirusTotal for f60a52512773b52def9ba9ce8aad61144d2cf351f6bc04d1c5a13abef8f3b89b](https://www.virustotal.com/gui/file/f60a52512773b52def9ba9ce8aad61144d2cf351f6bc04d1c5a13abef8f3b89b/detection)

Sure enough, there it is, Low Orbit Ion Cannon, or LOIC!

![cannon.exe VirusTotal](screenshots/cannon-exe-vt.png)

### Why is the key being dumped out?

Windows executables are null padded at the end, meaning all of the data at the end is just 0s. XORing a value with 0 gives the value itself, and likewise XORing a value with itself gives 0. Therefore, when the null padding is XORed with the key, the result is the key repeated over and over again (provided the key is significantly shorter than the null padding).
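A small, self-contained demonstration of this effect is sketched below. The key and plaintext are made up for illustration; only the XOR behaviour itself comes from the explanation above.

```python
def xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key."""
    return bytes(c ^ key[i % len(key)] for i, c in enumerate(data))


# Hypothetical 10-byte key and a fake "executable" ending in null padding.
demo_key = b"secret-key"
demo_plain = b"MZ some program bytes" + b"\x00" * 40
demo_cipher = xor(demo_plain, demo_key)

# The encrypted padding is the key repeated, although it does not
# necessarily begin at the first byte of the key.
tail = demo_cipher[-20:]
print(tail)

# Confirm the leaked chunk is some rotation of the real key.
rotations = [demo_key[i:] + demo_key[:i] for i in range(len(demo_key))]
print(tail[:len(demo_key)] in rotations)
```

Note that the leaked text starts partway through the key, because the padding does not begin on a multiple of the key length.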

## Sample `cannon-100.bin`

### Upping the difficulty

Ok, so, we did it, we've cracked a XOR encrypted version of LOIC, we got the key, we got the loot, hooray!

Let's now look at a slightly more difficult example.

In [None]:
! echo "Checking the file type"
! file samples/cannon-100.bin
! echo ""
! echo "Checking the first few lines of the file"
! xxd samples/cannon-100.bin 2>/dev/null | head -n 10
! echo ""
! echo "Checking the last few lines of the file"
! xxd samples/cannon-100.bin 2>/dev/null | tail -n 10

### Analysis (Part 1)

Doing the same analysis on this sample as the previous sample, our `xxd` hexdump doesn't reveal an awful lot or give us any immediate clues. Instead of hexdump, let's just try using `cat`.


In [None]:
! cat samples/cannon-100.bin | tail -n 1 | fold -w 50

### Analysis (Part 2)

Looking very closely at the lines, we can now see there are some repeating sections, which is exciting.

For example, the short string `q5GbPPnFeh` pops up twice in the text, once on the first line, and once on the third line.

This is a great discovery, but manually figuring out what this key is could be quite tedious, so let's automate it!

Our key is a repeating string of characters, so we can compare adjacent chunks of varying lengths: wherever two neighbouring chunks match, that chunk might be the key.

In [None]:
# Key rotation
def rotate_key(b: bytes):
    return [b[i:] + b[:i] for i in range(len(b))]

print("Key Rotation Demo")
print("\n".join(rotate_key("test abcde")))
print("")

# Read sample
input_path = Path("samples/cannon-100.bin")
with open(input_path, "rb") as f:
    print(f"Reading file [{input_path}]")
    data = f.read()
# Start with low key length
print("Searching for keys")
keys = {}
for key_length in range(2, 0xFF):
    for i in range(len(data) - key_length):
        # Adjacent chunks
        chunk_a = data[i:i + key_length]
        chunk_b = data[i + key_length:i + 2 * key_length]
        if chunk_a == chunk_b:
            # Check if key is new
            key = chunk_a
            # Fetch (or create) the list of keys already seen at this length
            current_keys = keys.setdefault(key_length, [])
            if any(_key in rotate_key(key) for _key in current_keys):
                continue
            # Save key
            print(f"{key_length} | {key}")
            current_keys.append(key)
# Done
print("Finished")


### Analysis (Part 3)

So we have some keys!

Three short keys were found, although these are unlikely to be of value, since two-byte repeats can easily occur by chance:

```js
2 | b'h7'
2 | b'\x06\x0f'
2 | b'*&'
```

Two larger keys were also found, one of length 99 and one of length 198:

```js
99 | b'gH6bg...Vh7b7'
198 | b'dsWOd...gH6bgdsWOd...'
```

As you can see, the 198-byte key is twice the length of the 99-byte key and simply repeats it (starting from a different offset), so we're not interested in that key.
For work in the field, one might test each larger key against the smaller ones to see whether it is just a smaller key repeated.
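That test could be sketched as follows. The helper names are hypothetical and not part of the original tool: one reduces a key to its shortest repeating unit, the other checks whether two keys are rotations of each other, which matters here because the larger key starts at a different offset.

```python
def smallest_period(key: bytes) -> bytes:
    """Return the shortest prefix that, repeated, reproduces the whole key."""
    n = len(key)
    for size in range(1, n + 1):
        if n % size == 0 and key[:size] * (n // size) == key:
            return key[:size]
    return key


def is_rotation(a: bytes, b: bytes) -> bool:
    """True if a can be obtained by rotating b."""
    return len(a) == len(b) and a in b + b


def is_repeat_of(large: bytes, small: bytes) -> bool:
    """True if `large` is just `small` (possibly rotated) repeated."""
    return is_rotation(smallest_period(large), small)


# Synthetic demo: the large key is the small key doubled, offset by two bytes.
small_key = b"abcde"
large_key = b"cdeabcdeab"
print(smallest_period(large_key))
print(is_repeat_of(large_key, small_key))
```

Any larger key for which `is_repeat_of` returns `True` can be dropped from the candidate list.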

Next, with our new found key, we'll need to decrypt the file. Let's pull up our code from when we were decrypting `cannon.bin`.
**Note the key may not (read: will not) work, so keep reading after you've done this**

For this example, we'll try using the key `gH6bgdsWOd^OtUB7@&B44nL#1FOf5K!q5GbPPnFehMbFuTXu4H!vswP2ETGjA74z379yrUJzS@SUz*XgPUn8TJYDHmlzZ8Vh7b7`



In [None]:
# Main
key = "gH6bgdsWOd^OtUB7@&B44nL#1FOf5K!q5GbPPnFehMbFuTXu4H!vswP2ETGjA74z379yrUJzS@SUz*XgPUn8TJYDHmlzZ8Vh7b7"
in_file_path = Path("samples/cannon-100.bin")
with open(in_file_path, "rb") as f:
    print(f"Reading file [{in_file_path}]")
    data = f.read()
# Decrypt
result = xor(data, key.encode())
# Save
out_file_path = Path("out/cannon-100.exe")
with open(out_file_path, "wb") as f:
    print(f"Writing file [{out_file_path}]")
    f.write(result)
print("Done!")

In [None]:
# Check file
! echo "Checking the file type"
! file out/cannon-100.exe
! echo ""
! echo "Checking the first few lines of the file"
! xxd out/cannon-100.exe 2>/dev/null | head -n 8
! echo ""
! echo "Checking the SHA256 sum"
! shasum -a 256 out/cannon-100.exe | awk '{print $1}' 

### Analysis (Part 4)

As we can see, this has not worked 😱
This is because our key is offset: we've found the key's contents, but we don't know where the key starts.
That's not a problem though, because we can spin the key and try each rotation to find where it starts!
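As an aside, if the file offset at which the repeated chunk was found is also known, the rotation can be computed directly rather than searched for: the byte at file offset `j` is encrypted with key byte `j % len(key)`. The sketch below illustrates this under the assumption that the offset was recorded, which the search code above does not do; `align_key` is a hypothetical helper and the data is synthetic.

```python
def xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key."""
    return bytes(c ^ key[i % len(key)] for i, c in enumerate(data))


def align_key(chunk: bytes, offset: int) -> bytes:
    """Rotate a key chunk found at file offset `offset` so it starts at key byte 0."""
    shift = (-offset) % len(chunk)
    return chunk[shift:] + chunk[:shift]


# Synthetic demo with a made-up key; the padding starts at offset 21.
demo_key = b"secret-key"
demo_plain = b"MZ some program bytes" + b"\x00" * 40
demo_cipher = xor(demo_plain, demo_key)

# Pretend the search reported a repeated chunk at offset 41.
found_offset = 41
found_chunk = demo_cipher[found_offset:found_offset + len(demo_key)]

recovered = align_key(found_chunk, found_offset)
print(found_chunk, recovered)
```

Trying every rotation against a crib remains the more forgiving approach, since it needs no offset bookkeeping.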

In [None]:
# Main
key = "gH6bgdsWOd^OtUB7@&B44nL#1FOf5K!q5GbPPnFehMbFuTXu4H!vswP2ETGjA74z379yrUJzS@SUz*XgPUn8TJYDHmlzZ8Vh7b7"
in_file_path = Path("samples/cannon-100.bin")
with open(in_file_path, "rb") as f:
    print(f"Reading file [{in_file_path}]")
    data = f.read()
# Decrypt
result = None
for _key in rotate_key(key.encode()):
    decrypted = xor(data, _key)
    # Check for crib
    found = []
    if b'\x00'*8 in decrypted:
        found.append("Found null padding")
    if decrypted.startswith(b'MZ'):
        found.append("Found magic bytes")
    if 0 < len(found):
        print("\n".join(found))
        print(f"Decrypted using key [{_key[:5]}...{_key[-5:]}]")
        result = decrypted
        break
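# Stop early if no rotation matched a crib, rather than writing None
if result is None:
    raise ValueError("No key rotation produced a known crib")
# Save
out_file_path = Path("out/cannon-100.exe")
with open(out_file_path, "wb") as f:
    print(f"Writing file [{out_file_path}]")
    f.write(result)
print("Done!")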

In [None]:
# Check file
! echo "Checking the file type"
! file out/cannon-100.exe
! echo ""
! echo "Checking the first few lines of the file"
! xxd out/cannon-100.exe 2>/dev/null | head -n 8
! echo ""
! echo "Checking the SHA256 sum"
! shasum -a 256 out/cannon-100.exe | awk '{print $1}' 

### Summary of `cannon-100.exe`

Ok super! Same hash as before, same file, so I'll skip the screenshots and let you play with your newly decrypted executable! Try putting it into VirusTotal, and you can learn about LOIC online.

[VirusTotal](https://www.virustotal.com/gui/home/upload)

As for the tutorial, that's mostly it.

**If you're here because you got bored and skipped to the end to see what's there, hi, welcome.**

```md
Summary:
Lots of files use null padding (\x00), and when XORed, the null padding dumps out the key.
By searching through the file for repeating sections, you identify encrypted null padding, which dumps the raw key.
By taking this repeating data, cycling it, using each rotation as an XOR key, and checking the result for null padding or a known crib, the file can be decoded.
```

This tool is based on a script I wrote during a real-world incident in 2022, which has come in handy several times since then.
Alongside this tutorial is a recreated version of the tool for public use.
I hope you enjoyed it, and thank you for reading!

> Void