Streaming string extraction and pattern hunting for memory dumps and binary files. Think strings | grep, but stream-safe for files larger than RAM, with built-in regex categories, an AES key-schedule scanner, and optional YARA integration.
- A fast triage tool: dump → "anything interesting here?"
- A streaming engine that works on files larger than RAM (real Windows memory dumps are 4–32 GB).
- An AES key recovery tool: finds AES-128 and AES-256 master keys by verifying that the bytes following a candidate key match its computed schedule expansion. Same technique as aeskeyfind/findaes.
- A YARA-rule runner with bundled example rules for malware indicators.
- Not Volatility. For process listing, kernel introspection, DLL unpacking, registry recovery, malware unpacking, network connection enumeration → use Volatility3. Don't put "memory forensics" in your portfolio description; put "string and pattern hunter."
- Not a "find any password" tool. There's no pattern that says "this is a password" in memory. We catch credentials only when they sit next to giveaway tokens (password=, Authorization: Bearer, aws_secret_access_key=).
- Not a generic file carver. It finds patterns in printable strings, not embedded file structures.
python memstrings.py memory.raw --regex emails,ips --yara rules/

# Single category
python memstrings.py memory.raw --regex urls
# Multiple categories
python memstrings.py memory.raw --regex emails,ips,aws_key,jwt
# Everything in one shot
python memstrings.py memory.raw --regex all --aes --yara rules/
# List what categories exist
python memstrings.py --list-patterns
# Export full results to JSON
python memstrings.py memory.raw --regex all --json report.json
# Tune for huge files
python memstrings.py huge_dump.raw --regex emails --min-length 8 --chunk-size 33554432
# AES key scan only (no string extraction)
python memstrings.py memory.raw --aes

pip install -r requirements.txt
streamlit run app.py

The GUI uses a local path picker rather than upload, because you can't realistically upload a 16 GB memory dump through a browser. Type the path; it reads from disk directly.
yara-python requires native YARA libraries that don't always install cleanly via pip on Windows:
# Try the easy path first
pip install yara-python
# If that fails, try the binary wheel
pip install yara-python --only-binary :all:
# If THAT fails, the regex-only path still works: just omit --yara

YARA is optional. The tool works fine with just --regex and --aes.
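One way a script can treat YARA as optional is to guard the import and fall back to "no matches" when the package is missing. The sketch below is illustrative only: `compile_rules` and `yara_hits` are hypothetical helper names, and the `.yar`/`.yara` file extensions are an assumption about how the rules directory is laid out, not a description of how memstrings.py does it.

```python
import os

try:
    import yara  # provided by the yara-python package
except ImportError:
    yara = None  # regex and AES scanning can still run without it


def compile_rules(rule_dir):
    """Compile every .yar/.yara file in rule_dir; return None if unavailable."""
    if yara is None:
        return None
    # Assumption: one namespace per rule file, named after the file.
    paths = {
        os.path.splitext(name)[0]: os.path.join(rule_dir, name)
        for name in sorted(os.listdir(rule_dir))
        if name.endswith((".yar", ".yara"))
    }
    return yara.compile(filepaths=paths) if paths else None


def yara_hits(rules, data):
    """Return names of matching rules, or an empty list when YARA is disabled."""
    if rules is None:
        return []
    return [match.rule for match in rules.match(data=data)]
```

With this shape, the rest of the pipeline can call `yara_hits` unconditionally and never needs to know whether the native library installed.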
| Category | What it catches |
|---|---|
| emails | RFC-ish email addresses |
| ips | IPv4 addresses |
| ipv6 | Full IPv6 addresses |
| urls | http:// and https:// URLs |
| domains | Common TLDs (incl. .pk, .uk, .in) |
| btc | Bitcoin addresses (legacy + bech32) |
| eth | Ethereum addresses (0x...) |
| aws_key | AWS access keys (AKIA/ASIA/AIDA/AROA prefixes) |
| aws_secret | AWS secret keys in aws_secret_access_key=... context |
| jwt | JWT tokens (eyJ...eyJ...) |
| pem | PEM-format key/cert headers |
| windows_path | C:\foo\bar.exe-style paths |
| unix_path | /home/..., /etc/..., etc. |
| passwords | password=, pwd=, token= with values |
| bearer | Authorization: Bearer ... tokens |
| discord | Discord bot tokens |
| github_pat | GitHub personal access tokens (ghp_...) |
| slack | Slack tokens (xoxb-..., etc.) |
| credit_card | 13-19 digit number sequences |
| ssn | US Social Security number format |
| user_agent | Browser User-Agent strings |

Use --regex all to enable everything.
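
To make the category idea concrete, here is a minimal sketch of how a comma-separated category list could map to compiled byte patterns. The `PATTERNS` table, the `select` and `scan` helpers, and both regexes are assumptions written for illustration; the tool's real patterns are not shown in this README and are probably stricter.

```python
import re

# Hypothetical category table with two simplified patterns.
PATTERNS = {
    "emails": re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ips": re.compile(
        rb"(?<![\d.])(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}"
        rb"(?:25[0-5]|2[0-4]\d|1?\d?\d)(?![\d.])"
    ),
}


def select(spec):
    """Turn a value such as "emails,ips" or "all" into category names."""
    if spec == "all":
        return sorted(PATTERNS)
    names = [n.strip() for n in spec.split(",") if n.strip()]
    unknown = [n for n in names if n not in PATTERNS]
    if unknown:
        raise ValueError(f"unknown categories: {unknown}")
    return names


def scan(data, spec):
    """Return {category: [matches]} for one extracted string or buffer."""
    return {
        name: [
            m.group().decode("ascii", "replace")
            for m in PATTERNS[name].finditer(data)
        ]
        for name in select(spec)
    }
```

Compiling byte patterns once and reusing them across every extracted string keeps the per-string cost low, which matters when a dump yields millions of strings.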
AES key expansion deterministically produces a "round key schedule" — 176 bytes for AES-128, 240 bytes for AES-256 — derived from the 16- or 32-byte master key. In memory, this schedule is often stored contiguously after the key itself (because the encryption library precomputes it once and reuses it).
For each candidate offset in the file, we:
- Take the first 16 bytes (or 32) as a candidate key.
- Run the AES key expansion algorithm on it.
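- Check whether the result matches the 176 (or 240) bytes on disk starting at that offset. The schedule begins with the key itself, so the real test is the 160 (or 208) derived bytes that follow it.
- If yes, this is almost certainly a real AES key. For AES-128 those 160 derived bytes are 1,280 bits, so the probability of a random false match is roughly 2⁻¹²⁸⁰. In practice, zero false positives.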
This same algorithm is used by aeskeyfind, findaes, and the commercial tools for BitLocker / TrueCrypt key recovery. It will find AES master keys held by encrypted-volume drivers, SSH agents, browsers, and many crypto libraries.
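
The steps above can be sketched for AES-128 as follows. This is an illustration under stated assumptions, not the scanner's actual code: the names `expand_key_128` and `find_aes128_keys` are hypothetical, the S-box is generated rather than hard-coded to keep the listing short, and the schedule is assumed to sit in memory byte-for-byte in the standard FIPS 197 order (some libraries may store round-key words in a different byte order).

```python
def _rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF


def _build_sbox():
    """Generate the AES S-box (GF(2^8) inverse followed by the affine map)."""
    sbox = [0] * 256
    p = q = 1
    while True:
        # Multiply p by 3 and divide q by 3, so q stays the inverse of p.
        p = (p ^ (p << 1) ^ (0x1B if p & 0x80 else 0)) & 0xFF
        q = (q ^ (q << 1)) & 0xFF
        q = (q ^ (q << 2)) & 0xFF
        q = (q ^ (q << 4)) & 0xFF
        if q & 0x80:
            q ^= 0x09
        sbox[p] = (q ^ _rotl8(q, 1) ^ _rotl8(q, 2)
                   ^ _rotl8(q, 3) ^ _rotl8(q, 4) ^ 0x63)
        if p == 1:
            break
    sbox[0] = 0x63  # zero has no inverse and is handled separately
    return sbox


SBOX = _build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def expand_key_128(key):
    """Return the 176-byte AES-128 schedule; its first 16 bytes are the key."""
    if len(key) != 16:
        raise ValueError("AES-128 keys are 16 bytes")
    w = bytearray(key)
    for i in range(4, 44):                    # 44 four-byte words in total
        t = list(w[4 * (i - 1):4 * i])        # previous word
        if i % 4 == 0:
            t = t[1:] + t[:1]                 # RotWord
            t = [SBOX[b] for b in t]          # SubWord
            t[0] ^= RCON[i // 4 - 1]          # round constant
        prev = w[4 * (i - 4):4 * (i - 3)]     # word four positions back
        w.extend(a ^ b for a, b in zip(prev, t))
    return bytes(w)


def find_aes128_keys(buf):
    """Return (offset, key_hex) wherever buf holds a key plus its schedule."""
    hits = []
    for off in range(len(buf) - 176 + 1):
        key = buf[off:off + 16]
        if expand_key_128(key) == buf[off:off + 176]:
            hits.append((off, key.hex()))
    return hits
```

Running the full expansion at every offset is the simplest form of the idea and is slow in pure Python. A practical scanner would likely reject a candidate as soon as the first derived word fails to match, since almost every offset fails there.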
| File size | String + regex | + AES scan |
|---|---|---|
| 100 MB | ~5 sec | ~30 sec |
| 1 GB | ~30 sec | ~5 min |
| 16 GB | ~10 min | ~80 min |
These are rough figures. Actual times depend on disk speed and string density (memory dumps are denser than executables, which are denser than encrypted files).
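
For context on why file size rather than RAM is the limiting factor, here is a minimal sketch of chunked ASCII string extraction that carries an unfinished run across chunk boundaries. The function name `iter_ascii_strings` and its defaults are assumptions for illustration; the tool's own reader may differ, and UTF-16LE handling is left out.

```python
import re


def iter_ascii_strings(stream, min_length=4, chunk_size=1 << 20):
    """Yield (offset, bytes) for printable-ASCII runs, one chunk at a time."""
    run = re.compile(rb"[\x20-\x7e]+")
    carry = b""   # printable run that reached the end of the previous chunk
    pos = 0       # file offset of the next chunk to be read
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = carry + chunk
        base = pos - len(carry)   # file offset of buf[0]
        carry = b""
        for m in run.finditer(buf):
            if m.end() == len(buf):
                # The run touches the chunk end and may continue; hold it back.
                carry = m.group()
                break
            if m.end() - m.start() >= min_length:
                yield base + m.start(), m.group()
        pos += len(chunk)
    # Flush a run that ends exactly at end of file.
    if len(carry) >= min_length:
        yield pos - len(carry), carry
```

Only one chunk plus the carried tail is held in memory at a time, so the result is the same whatever chunk size is chosen. A production version would also cap the length of the carried run so that a file full of printable bytes cannot grow it without bound.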
- String extraction is byte-aligned for ASCII, 2-byte-aligned for UTF-16LE. Strings stored in other encodings (UTF-8 multibyte, UTF-16BE, EBCDIC) aren't extracted. UTF-8 ASCII subset works since it's identical to ASCII.
- The "passwords" regex catches naive logging patterns (password=hunter2), not credential-store structures. To find real credentials in lsass.exe → use Mimikatz (offensive) or pypykatz against a Volatility dump (defensive).
- AES scanner is single-threaded and byte-by-byte. For huge files, parallelize across chunks if you need to. The structure naturally supports it.
- YARA timing is YARA's responsibility. Complex rules with many regex clauses can be slow.
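
If you do parallelize the AES scan, the main detail is overlap: a key schedule that straddles a chunk boundary must still be fully visible to one worker. The sketch below only computes the byte ranges; `chunk_ranges` is a hypothetical helper, and the default window of 240 bytes is chosen to cover the longer AES-256 schedule.

```python
def chunk_ranges(file_size, chunk_size, window=240):
    """Split [0, file_size) into overlapping (start, end) byte ranges.

    Each range runs window - 1 bytes past its nominal end, so a schedule
    that begins on the last byte of a chunk is still fully contained.
    """
    if chunk_size <= 0 or window <= 0:
        raise ValueError("chunk_size and window must be positive")
    ranges = []
    for start in range(0, file_size, chunk_size):
        end = min(start + chunk_size + window - 1, file_size)
        ranges.append((start, end))
    return ranges
```

Each worker should treat only the first `chunk_size` bytes of its range as candidate offsets and use the overlap purely for comparison, so no key is reported twice. The ranges can then be handed to something like `concurrent.futures.ProcessPoolExecutor`.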
MIT.