# API Hash Resolution (Rainbow Table)

This notebook **does not execute** the generator script. It only:
- loads a prebuilt nested rainbow JSON (`api_hash_rainbow_nested.json`)
- loads hash constants exported from IDA (`ida_hashes.txt`)
- resolves hashes to `DLL!Export` names and renders a Markdown table

## Expected Files (in this `notebooks/` folder)
- `api_hash_rainbow_nested.json`
- `ida_hashes.txt`
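
The parsing cell in this notebook accepts one `0xHASH @ 0xVA` pair per line of `ida_hashes.txt` and skips blank lines, `#` comments, and anything else that does not match. A minimal sketch of that format, using one pair taken from this notebook's results table (the helper name `parse_ida_line` and the sample text are hypothetical, for illustration only):

```python
import re

# Same shape as the pattern used by the parsing cell: "0x<hash> @ 0x<va>"
LINE_RE = re.compile(r"^0x([0-9a-fA-F]{1,8})\s*@\s*0x([0-9a-fA-F]{1,8})\s*$")

def parse_ida_line(line):
    """Return (hash, va) as ints, or None for blanks, comments, and junk."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    m = LINE_RE.match(line)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)

# Hypothetical file content; only the second line is a valid entry.
sample = """
# hash @ address of the immediate in IDA
0xE2F5E21B @ 0x10001A6F
not a hash line
"""
pairs = [p for p in map(parse_ida_line, sample.splitlines()) if p is not None]
print([(hex(h), hex(va)) for h, va in pairs])
```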

## Scope
This rainbow table resolves the **loader (`log.dll`) hashing scheme** described by Rapid7: FNV-1a followed by a Murmur-like finalizer, with salt knobs.
The **main-module** hash routine described later in the Rapid7 post is a different algorithm and is not covered here.
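
For orientation, the two building blocks named above can be sketched with their standard published constants (32-bit FNV-1a and the MurmurHash3 `fmix32` finalizer). This is a generic illustration only: the sample's exact finalizer, its constants, and how the salt is mixed in are assumptions here, and `sketch_hash` is a hypothetical name. It will not reproduce the constants found in `log.dll`.

```python
MASK32 = 0xFFFFFFFF

def fnv1a32(data: bytes, basis: int = 0x811C9DC5) -> int:
    """Standard 32-bit FNV-1a: xor each byte in, then multiply by the prime."""
    h = basis & MASK32
    for b in data:
        h ^= b
        h = (h * 0x01000193) & MASK32
    return h

def fmix32(h: int) -> int:
    """MurmurHash3 32-bit finalizer (avalanche step)."""
    h &= MASK32
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h

def sketch_hash(name: str, salt: int = 0) -> int:
    # ASSUMPTION: xoring the salt into the FNV basis is illustrative only;
    # the real routine may apply its salt differently.
    return fmix32(fnv1a32(name.encode("ascii"), basis=0x811C9DC5 ^ salt))

print(f"0x{sketch_hash('VirtualProtect'):08X}")
```

Because both steps are invertible on 32-bit values, changing the salt always changes the output for a given name, which is why a table built with the wrong knobs matches nothing.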


In [1]:
from __future__ import annotations
import pandas as pd
from IPython.display import display

import re
import sys
from pathlib import Path

# Ensure repo root is on sys.path so we can import chrysalis_notebook_lib.py
_cwd = Path.cwd().resolve()
_repo = None
for _d in [_cwd, *_cwd.parents]:
    if (_d / "scripts" / "chrysalis_notebook_lib.py").exists():
        sys.path.insert(0, str(_d / "scripts"))
        _repo = _d
        break

from chrysalis_notebook_lib import ApiHashRainbow, find_repo_root, loader_seed_from_host_image

NB_DIR = Path.cwd().resolve()
REPO_ROOT = find_repo_root(NB_DIR)

RAINBOW_JSON = NB_DIR / "api_hash_rainbow_nested.json"
IDA_HASHES_TXT = NB_DIR / "ida_hashes.txt"

assert RAINBOW_JSON.exists(), f"Missing {RAINBOW_JSON}"
assert IDA_HASHES_TXT.exists(), f"Missing {IDA_HASHES_TXT}"

# If your IDA addresses are from main_module_patched.exe, ImageBase is typically 0x400000.
# If they are from log.dll, ImageBase is typically 0x10000000.
IMAGE_BASE = 0x10000000

HOST_EXE = REPO_ROOT / "input" / "BluetoothService.exe"
if HOST_EXE.exists():
    seed = loader_seed_from_host_image(HOST_EXE.read_bytes(), seed_len=0x100)
    print(f"Derived loader seed from {HOST_EXE}[:0x100] = 0x{seed:08X}")
    print("This seed MUST match what log.dll computed in the real chain (sideloading host EXE).")
else:
    seed = None

print("REPO_ROOT:", REPO_ROOT)
print("RAINBOW_JSON:", RAINBOW_JSON)
print("IDA_HASHES_TXT:", IDA_HASHES_TXT)
print("IMAGE_BASE:", hex(IMAGE_BASE))


Derived loader seed from /Users/taogoldi/Downloads/Chrysalis/input/BluetoothService.exe[:0x100] = 0x114DDB33
This seed MUST match what log.dll computed in the real chain (sideloading host EXE).
REPO_ROOT: /Users/taogoldi/Downloads/Chrysalis
RAINBOW_JSON: /Users/taogoldi/Downloads/Chrysalis/notebooks/api_hash_rainbow_nested.json
IDA_HASHES_TXT: /Users/taogoldi/Downloads/Chrysalis/notebooks/ida_hashes.txt
IMAGE_BASE: 0x10000000


In [2]:
rainbow = ApiHashRainbow.from_nested_json(RAINBOW_JSON)
print("Loaded rainbow table.")

# Sanity check: in the Rapid7 sample, log.dll contains 0x47C204CA (VirtualProtect).
# If your rainbow doesn't contain it, it was generated with the wrong algorithm and/or seed.
probe = 0x47C204CA
matches = rainbow.lookup(probe)
print(f"Probe 0x{probe:08X} matches:", matches[:5], "..." if len(matches) > 5 else "")
if not matches:
    print("[!] 0 matches for probe hash 0x47C204CA.")
    print("    Your api_hash_rainbow_nested.json was likely generated with the wrong knobs.")
    print("    Rebuild it with the updated generator using the correct host seed.")
    if seed is not None:
        print(f"    Recommended seed: 0x{seed:08X}")
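        # Print the command on one line: a trailing "\" is not a line
        # continuation in cmd.exe or PowerShell.
        print("    Example (Windows, run as a single line):")
        print(
            "    python api_hash_rainbow.py --dll-dir C:\\Windows\\System32 --recursive "
            f"--seed 0x{seed:08X} --output-mode constant --format nested "
            "--out api_hash_rainbow_nested.json"
        )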

Loaded rainbow table.
Probe 0x47C204CA matches: [('IumSdk.dll', 'VirtualProtect'), ('KernelBase.dll', 'VirtualProtect'), ('api-ms-win-core-memory-l1-1-0.dll', 'VirtualProtect'), ('kernel32.dll', 'VirtualProtect'), ('tprtdll.dll', 'VirtualProtect')] ...


In [3]:
pat = re.compile(r"^0x([0-9a-fA-F]{1,8})\s*@\s*0x([0-9a-fA-F]{1,8})\s*$")

items = []  # (hash, va)
for line in IDA_HASHES_TXT.read_text().splitlines():
    line = line.strip()
    if not line or line.startswith("#"):
        continue
    m = pat.match(line)
    if not m:
        continue
    hv = int(m.group(1), 16)
    va = int(m.group(2), 16)
    items.append((hv, va))

print("Parsed", len(items), "IDA immediates.")
print("First 10:", items[:10])


Parsed 285 IDA immediates.
First 10: [(65537, 268446126), (67264, 268446417), (71335, 268442481), (131040, 268489062), (132704, 268446424), (132720, 268446431), (143360, 268442537), (147456, 268442545), (181712, 268442497), (184320, 268442561)]


In [4]:
rows = []
for hv, va in items:
    matches = rainbow.lookup(hv)
    if not matches:
        continue
    rva = va - IMAGE_BASE
    for dll, exp in matches:
        rows.append({"VA": va, "RVA": rva, "Hash": hv, "DLL": dll, "Export": exp})

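# Declare the columns explicitly so an empty result still has them;
# otherwise the hex-formatting step below raises KeyError on df_show["VA"].
df = pd.DataFrame(rows, columns=["VA", "RVA", "Hash", "DLL", "Export"])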
if not df.empty:
    df = df.sort_values(["RVA", "VA", "Hash"], kind="mergesort").reset_index(drop=True)

print(f"Resolved {len(df)} hash matches.")

# Pretty display + keep hex formatting
df_show = df.head(300).copy()
df_show["VA"] = df_show["VA"].map(lambda x: f"0x{x:08X}")
df_show["RVA"] = df_show["RVA"].map(lambda x: f"0x{x:08X}")
df_show["Hash"] = df_show["Hash"].map(lambda x: f"0x{x:08X}")

display(df_show)


Resolved 21 hash matches.


| | VA | RVA | Hash | DLL | Export |
|---|---|---|---|---|---|
| 0 | 0x10001A6F | 0x00001A6F | 0xE2F5E21B | KernelBase.dll | GetModuleFileNameA |
| 1 | 0x10001A6F | 0x00001A6F | 0xE2F5E21B | api-ms-win-core-libraryloader-l1-1-0.dll | GetModuleFileNameA |
| 2 | 0x10001A6F | 0x00001A6F | 0xE2F5E21B | kernel32.dll | GetModuleFileNameA |
| 3 | 0x10001A9D | 0x00001A9D | 0xFE1A4618 | KernelBase.dll | CreateFileA |
| 4 | 0x10001A9D | 0x00001A9D | 0xFE1A4618 | api-ms-win-core-file-l1-1-0.dll | CreateFileA |
| 5 | 0x10001A9D | 0x00001A9D | 0xFE1A4618 | kernel32.dll | CreateFileA |
| 6 | 0x10001ACA | 0x00001ACA | 0x053FAAA4 | KernelBase.dll | ReadFile |
| 7 | 0x10001ACA | 0x00001ACA | 0x053FAAA4 | api-ms-win-core-file-l1-1-0.dll | ReadFile |
| 8 | 0x10001ACA | 0x00001ACA | 0x053FAAA4 | kernel32.dll | ReadFile |
| 9 | 0x10001AF1 | 0x00001AF1 | 0xD6410922 | IumSdk.dll | CloseHandle |


## Notes

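- Most entries in `ida_hashes.txt` are ordinary immediates, not API hashes, so a low hit rate is normal (the run above parsed 285 immediates and produced 21 match rows).
- If you see **0 matches**, it's usually one of:
  - your `ida_hashes.txt` contains no API hash constants at all (for example, it was exported from the wrong module or function range)
  - your rainbow JSON was generated for a different hashing scheme (e.g., main-module hashing)
  - **most commonly for this sample:** your rainbow JSON was generated without using the correct loader seed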

### Loader Seed Gotcha (Why This Matters)

For this Chrysalis `log.dll`, the loader seed is derived from the **first 0x100 bytes of the sideloading host EXE**, that is, the image whose base address `GetModuleHandleA(NULL)` returns.
In Rapid7's chain this host EXE is `BluetoothService.exe`. The API resolver compares:

`api_hash(export_name) == seed + target_constant`

So the constants embedded in `log.dll` (and exported from IDA) are `api_hash(export) - seed` (mod 2^32).
If your rainbow table was generated with `seed=0` (or the wrong seed), **you will resolve 0 hashes**.
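
The relation above is plain 32-bit modular arithmetic. A small sketch, using the seed and probe constant printed earlier in this notebook (the function names are hypothetical, and the derived target is only what the relation implies, not an independently verified hash):

```python
MASK32 = 0xFFFFFFFF

def embedded_constant(full_hash: int, seed: int) -> int:
    """Constant stored in the binary: api_hash(export) - seed (mod 2^32)."""
    return (full_hash - seed) & MASK32

def resolver_target(constant: int, seed: int) -> int:
    """Value the resolver compares against: seed + constant (mod 2^32)."""
    return (seed + constant) & MASK32

seed = 0x114DDB33      # derived from BluetoothService.exe[:0x100] in cell 1
constant = 0x47C204CA  # probe constant resolved to VirtualProtect in cell 2

target = resolver_target(constant, seed)
print(f"seed + constant = 0x{target:08X}")

# Round trip: subtracting the same seed recovers the embedded constant,
# while a table built with seed=0 would look for a different value.
assert embedded_constant(target, seed) == constant
assert embedded_constant(target, 0) != constant
```

The masking matters: when `api_hash(export)` is smaller than the seed, the subtraction wraps around instead of going negative.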

### ImageBase

Set `IMAGE_BASE` to match the module you exported addresses from in IDA:
- `log.dll`: `0x10000000`
- `main_module_patched.exe`: `0x400000`
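
A quick way to catch a mismatched `IMAGE_BASE` is to check that every VA sits at or above the base before subtracting. A minimal sketch (the helper `va_to_rva` is hypothetical; the example address comes from this notebook's results table):

```python
def va_to_rva(va: int, image_base: int) -> int:
    """Convert a virtual address to an RVA, rejecting addresses below the base."""
    if va < image_base:
        raise ValueError(
            f"VA 0x{va:08X} is below ImageBase 0x{image_base:08X}; "
            "IMAGE_BASE probably belongs to a different module"
        )
    return va - image_base

# log.dll address from the results table, with the log.dll base
print(hex(va_to_rva(0x10001A6F, 0x10000000)))

# An address in the 0x400000 range paired with the log.dll base is rejected
try:
    va_to_rva(0x00401000, 0x10000000)
except ValueError as exc:
    print("rejected:", exc)
```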
