faizanali2k05/memstrings

🧠 memstrings

Streaming string extraction and pattern hunting for memory dumps and binary files. Like strings | grep, but stream-safe for files larger than RAM, with built-in regex categories, an AES key-schedule scanner, and optional YARA integration.

What this IS

  • A fast triage tool: dump → "anything interesting here?"
  • A streaming engine that works on files larger than RAM (real Windows memory dumps are 4–32 GB).
  • An AES key recovery tool — finds AES-128 and AES-256 master keys by verifying that the bytes following a candidate key match its computed schedule expansion. Same technique as aeskeyfind / findaes.
  • A YARA-rule runner with bundled example rules for malware indicators.
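
To make the "larger than RAM" point concrete, here is a minimal sketch of chunked ASCII string extraction. It is not the tool's actual code: the function name `iter_ascii_strings` and its parameters are hypothetical, chosen to mirror the --min-length and --chunk-size flags. The key idea is to carry an unfinished printable run over to the next chunk so that strings straddling a chunk boundary are not split.

```python
import io
import re

# One or more printable ASCII bytes (space through tilde).
PRINTABLE = re.compile(rb"[\x20-\x7e]+")


def iter_ascii_strings(stream, min_length=4, chunk_size=1 << 20):
    """Yield (file_offset, text) for printable runs, reading one chunk at a time.

    Hypothetical helper for illustration only. A run that touches the end of
    the buffer is carried into the next read instead of being emitted early.
    Note: a very long printable run grows the carry buffer; a real tool would
    probably cap it.
    """
    carry = b""
    base = 0  # file offset of buf[0]
    while True:
        chunk = stream.read(chunk_size)
        eof = not chunk
        buf = carry + chunk
        carry = b""
        for m in PRINTABLE.finditer(buf):
            if m.end() == len(buf) and not eof:
                carry = buf[m.start():]  # may continue in the next chunk
                break
            if m.end() - m.start() >= min_length:
                yield base + m.start(), m.group().decode("ascii")
        base += len(buf) - len(carry)
        if eof:
            break


if __name__ == "__main__":
    data = b"\x00\x01hello world\x00ab\x00boundary-string\xff"
    # A tiny chunk size forces every string to cross a chunk boundary.
    for offset, text in iter_ascii_strings(io.BytesIO(data), chunk_size=5):
        print(offset, text)
```

Memory use stays near one chunk plus the carried tail, regardless of file size.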

What this is NOT

  • Not Volatility. For process listing, kernel introspection, DLL unpacking, registry recovery, malware unpacking, network connection enumeration → use Volatility3. Don't put "memory forensics" in your portfolio description; put "string and pattern hunter."
  • Not a "find any password" tool. There's no pattern that says "this is a password" in memory. We catch credentials only when they sit next to giveaway tokens (password=, Authorization: Bearer, aws_secret_access_key=).
  • Not a generic file carver. It finds patterns in printable strings, not embedded file structures.

Spec command — works as written

python memstrings.py memory.raw --regex emails,ips --yara rules/

All commands

# Single category
python memstrings.py memory.raw --regex urls

# Multiple categories
python memstrings.py memory.raw --regex emails,ips,aws_key,jwt

# Everything in one shot
python memstrings.py memory.raw --regex all --aes --yara rules/

# List what categories exist
python memstrings.py --list-patterns

# Export full results to JSON
python memstrings.py memory.raw --regex all --json report.json

# Tune for huge files
python memstrings.py huge_dump.raw --regex emails --min-length 8 --chunk-size 33554432

# AES key scan only (no string extraction)
python memstrings.py memory.raw --aes

GUI

pip install -r requirements.txt
streamlit run app.py

The GUI uses a local path picker rather than upload — you can't realistically upload a 16 GB memory dump through a browser. Type the path; it reads from disk directly.

Installing YARA on Windows

yara-python requires native YARA libraries that don't always install cleanly via pip on Windows:

# Try the easy path first
pip install yara-python

# If that fails, try the binary wheel
pip install yara-python --only-binary :all:

# If THAT fails, the regex-only path still works — just omit --yara

YARA is optional. The tool works fine with just --regex and --aes.
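
One way a script can treat YARA as optional is to guard the import and skip rule matching when it is missing. The sketch below assumes the rules directory holds .yar or .yara files; the helper names `compile_rules` and `yara_hits` are hypothetical and not taken from memstrings.py. It uses only `yara.compile(filepaths=...)` and `Rules.match(data=...)` from yara-python.

```python
import os

try:
    import yara  # optional dependency: pip install yara-python
except ImportError:
    yara = None


def compile_rules(rule_dir):
    """Return compiled rules, or None if YARA is missing or no rule files exist."""
    if yara is None:
        return None
    paths = {
        name: os.path.join(rule_dir, name)  # namespace -> rule file path
        for name in sorted(os.listdir(rule_dir))
        if name.endswith((".yar", ".yara"))
    }
    if not paths:
        return None
    return yara.compile(filepaths=paths)


def yara_hits(rules, chunk):
    """Return the names of rules matching a bytes chunk; empty if rules is None."""
    if rules is None:
        return []
    return [match.rule for match in rules.match(data=chunk)]
```

With this shape, the regex and AES paths never touch the `yara` module, so a failed install only disables the --yara feature.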

Built-in pattern categories

| Category | What it catches |
|---|---|
| emails | RFC-ish email addresses |
| ips | IPv4 addresses |
| ipv6 | Full IPv6 addresses |
| urls | http:// and https:// URLs |
| domains | Common TLDs (incl. .pk, .uk, .in) |
| btc | Bitcoin addresses (legacy + bech32) |
| eth | Ethereum addresses (0x...) |
| aws_key | AWS access keys (AKIA/ASIA/AIDA/AROA prefixes) |
| aws_secret | AWS secret keys in aws_secret_access_key=... context |
| jwt | JWT tokens (eyJ...eyJ...) |
| pem | PEM-format key/cert headers |
| windows_path | C:\foo\bar.exe-style paths |
| unix_path | /home/..., /etc/..., etc. |
| passwords | password=, pwd=, token= with values |
| bearer | Authorization: Bearer ... tokens |
| discord | Discord bot tokens |
| github_pat | GitHub personal access tokens (ghp_...) |
| slack | Slack tokens (xoxb-..., etc.) |
| credit_card | 13-19 digit number sequences |
| ssn | US Social Security number format |
| user_agent | Browser User-Agent strings |

Use --regex all to enable everything.
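
As an illustration of how named categories can map to regexes, here is a small sketch with three of them. The patterns are simplified stand-ins written for this example, not the ones shipped in memstrings.py, and the names `PATTERNS` and `scan` are hypothetical.

```python
import re

# Simplified, illustrative patterns; the real tool's regexes may differ.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
PATTERNS = {
    "emails": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ips": re.compile(r"\b(?:" + _OCTET + r"\.){3}" + _OCTET + r"\b"),
    "aws_key": re.compile(r"\b(?:AKIA|ASIA|AIDA|AROA)[A-Z0-9]{16}\b"),
}


def scan(text, categories):
    """Return (category, match) pairs for a comma-separated category list or 'all'."""
    names = list(PATTERNS) if categories == "all" else categories.split(",")
    hits = []
    for name in names:
        for m in PATTERNS[name].finditer(text):
            hits.append((name, m.group()))
    return hits


if __name__ == "__main__":
    line = "contact admin@example.com from 10.0.0.5 key AKIAIOSFODNN7EXAMPLE"
    for category, value in scan(line, "emails,ips"):
        print(category, value)
```

Keeping the patterns in one dictionary is what makes both a comma-separated selection and an "all" shortcut cheap to support.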

How the AES scanner actually works

AES key expansion deterministically produces a "round key schedule" (176 bytes for AES-128, 240 bytes for AES-256) from the 16- or 32-byte master key. The schedule begins with the master key itself, and the derived round keys follow. In memory, the whole schedule is often stored contiguously, because the encryption library precomputes it once and reuses it.

For each candidate offset in the file, we:

  1. Take the first 16 bytes (or 32) as a candidate key.
  2. Run the AES key expansion algorithm on it.
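  3. Check whether the result matches the bytes in the dump. The derived part of the schedule is the 160 bytes (208 for AES-256) that follow the candidate key, so key plus derived round keys span 176 (or 240) bytes.
  4. If yes, this is almost certainly a real AES key. The 160 derived bytes are 1280 bits, so the probability of a random false match is roughly 2⁻¹²⁸⁰ for AES-128. In practice, zero false positives.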

This same algorithm is used by aeskeyfind, findaes, and the commercial tools for BitLocker / TrueCrypt key recovery. It will find AES master keys held by encrypted-volume drivers, SSH agents, browsers, and many crypto libraries.
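
The steps above can be sketched for AES-128 as follows. This is a minimal illustration of the technique, not the scanner's actual code: `expand_key_128` and `find_aes128_keys` are hypothetical names, the S-box is derived programmatically rather than hard-coded, and the pure-Python loop is far slower than a tuned implementation.

```python
def _gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def _build_sbox():
    """Build the AES S-box: field inverse followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):  # x^254 is the multiplicative inverse of x
                inv = _gf_mul(inv, x)
        s = inv
        for shift in (1, 2, 3, 4):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = _build_sbox()


def expand_key_128(key):
    """Return the 176-byte AES-128 key schedule; its first 16 bytes are the key."""
    words = [key[i:i + 4] for i in range(0, 16, 4)]
    rcon = 1
    for i in range(4, 44):
        temp = words[i - 1]
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]                  # RotWord
            temp = bytes(SBOX[b] for b in temp)         # SubWord
            temp = bytes([temp[0] ^ rcon]) + temp[1:]   # round constant
            rcon = _gf_mul(rcon, 2)
        words.append(bytes(a ^ b for a, b in zip(words[i - 4], temp)))
    return b"".join(words)


def find_aes128_keys(buf):
    """Return offsets where 16 candidate bytes are followed by their own schedule."""
    hits = []
    for offset in range(len(buf) - 176 + 1):
        if expand_key_128(buf[offset:offset + 16]) == buf[offset:offset + 176]:
            hits.append(offset)
    return hits


if __name__ == "__main__":
    key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")  # FIPS-197 example key
    dump = bytes(range(37)) + expand_key_128(key) + b"\xff" * 20
    print(find_aes128_keys(dump))
```

A faster variant would compare only the first derived word before expanding the full schedule, since almost every candidate offset fails on that first check.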

Performance notes

| File size | String + regex | + AES scan |
|---|---|---|
| 100 MB | ~5 sec | ~30 sec |
| 1 GB | ~30 sec | ~5 min |
| 16 GB | ~10 min | ~80 min |

These figures are rough. They depend on disk speed and string density (memory dumps are denser than executables, which are denser than encrypted files).

Honest limitations

  • String extraction is byte-aligned for ASCII, 2-byte-aligned for UTF-16LE. Strings stored in other encodings (UTF-8 multibyte, UTF-16BE, EBCDIC) aren't extracted. UTF-8 ASCII subset works since it's identical to ASCII.
  • The "passwords" regex catches naive logging patterns (password=hunter2), not credential-store structures. To find real credentials in lsass.exe → use Mimikatz (offensive) or pypykatz against a Volatility dump (defensive).
  • AES scanner is single-threaded and byte-by-byte. For huge files, parallelize across chunks if you need to. The structure naturally supports it.
  • YARA timing is YARA's responsibility. Complex rules with many regex clauses can be slow.
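
If you do parallelize the AES scan, the chunks need to overlap so that a key schedule straddling a boundary is still seen whole by one worker. A minimal sketch of the range arithmetic, using a hypothetical helper `chunk_ranges` that is not part of the tool: with an overlap of schedule length minus one (175 bytes for AES-128), every 176-byte window falls entirely inside exactly one chunk, so no key is missed or reported twice.

```python
def chunk_ranges(file_size, chunk_size, overlap):
    """Split [0, file_size) into (start, end) ranges that overlap by `overlap` bytes.

    Hypothetical helper for illustration. Each range could be handed to a
    separate worker that scans only its own byte span.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    ranges = []
    start = 0
    while start < file_size:
        end = min(start + chunk_size, file_size)
        ranges.append((start, end))
        if end == file_size:
            break
        start = end - overlap  # step back so boundary windows stay intact
    return ranges


if __name__ == "__main__":
    # 32 MiB chunks with a 175-byte overlap for AES-128 schedules.
    for start, end in chunk_ranges(100 * 1024 * 1024, 32 * 1024 * 1024, 175):
        print(start, end)
```

Each worker reports offsets relative to its own range start, so add `start` back before merging results.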

License

MIT.
