DontFeedTheAI

A transparent proxy that strips IPs, credentials, hostnames, and PII from every request before it reaches the AI, then restores them on the way back.

flowchart TD
    shell["🖥️ Your Shell\nnmap -sV dc01.acmecorp.local"]
    proxy["🛡️ DontFeedTheAI\ndc01.acmecorp.local → srv-0042.pentest.local\n10.20.0.10 → 203.0.113.47\nAdmin@Acme2024! → [CRED_XK9A2B3C]"]
    api["☁️ LLM API\nsees only\nsrv-0042.pentest.local\n203.0.113.47"]

    shell -- "① real data" --> proxy
    proxy -- "② surrogates only" --> api
    api -- "③ response + surrogates" --> proxy
    proxy -- "④ real data restored" --> shell

| Layer | Detects |
| --- | --- |
| 🧠 Ollama (local LLM) | hostnames, org names, credentials in prose |
| 🔍 Regex | IPs, hashes, tokens, API keys |

Both run on your machine. Nothing sensitive crosses the boundary.
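
To make the round trip concrete, here is a minimal sketch of what a regex layer with a reversible mapping could look like. It is not the project's implementation: the `Vault` class, the two patterns, and the bracketed surrogate format are assumptions for illustration. The diagram above shows format-preserving surrogates for hostnames and IPs, which this sketch does not attempt.

```python
import re
import secrets

# Hypothetical patterns for illustration; the project's real regex set is larger.
PATTERNS = {
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "HASH": re.compile(r"\b[a-fA-F0-9]{32}\b"),
}


class Vault:
    """Keeps a two-way mapping between real values and surrogates."""

    def __init__(self):
        self.to_surrogate = {}
        self.to_original = {}

    def surrogate_for(self, kind, value):
        # Reuse the surrogate for a repeated value so the LLM sees consistent names.
        if value not in self.to_surrogate:
            token = f"[{kind}_{secrets.token_hex(4).upper()}]"
            self.to_surrogate[value] = token
            self.to_original[token] = value
        return self.to_surrogate[value]

    def anonymize(self, text):
        # Outbound direction: replace every match with its surrogate.
        for kind, pattern in PATTERNS.items():
            text = pattern.sub(
                lambda m, k=kind: self.surrogate_for(k, m.group(0)), text
            )
        return text

    def restore(self, text):
        # Inbound direction: put the real values back into the response.
        for token, original in self.to_original.items():
            text = text.replace(token, original)
        return text
```

Calling `anonymize()` on a request body and `restore()` on the response gives the same shape as steps ① to ④ in the diagram: the upstream API only ever sees the bracketed tokens.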


| Who | How it helps |
| --- | --- |
| Pentesters | Run nmap, mimikatz, bloodhound output through Claude without exposing client infrastructure |
| Developers & SREs | Debug with production data or internal configs in regulated environments |
| Legal & consulting | Anonymize client contracts, case files, or proprietary IP in AI-assisted reviews |
| Finance & compliance | Analyze reports or audit scripts without exposing account details |
| Researchers | Query LLMs on confidential datasets |

Why not just send data directly?

❌ Cloud anonymization API + LLM: two bills, two third parties. Your sensitive data still leaves the machine, just through more hands.

flowchart LR
    s0["🖥️ Your Shell\nreal data"] --> a0["☁️ Anonymization API\nsees everything\nbill #1"]
    a0 --> c0["☁️ LLM API\nbill #2"]

❌ Ollama alone: your data never leaves the machine, but Ollama has no awareness of what's sensitive. It reasons on whatever you paste: real IPs, real credentials, real hostnames.

flowchart LR
    s1["🖥️ Your Shell\nreal data"] --> o1["🧠 Ollama\nno interception\nreasons on real data"]

❌ Claude / OpenAI directly: best reasoning quality, but everything lands in their infrastructure. Real client IPs, credentials, and org names sit in API logs, one policy change or breach away from a problem.

flowchart LR
    s2["🖥️ Your Shell\nreal data"] --> c1["☁️ LLM API\nsees everything\nlogs your real data"]

✅ DontFeedTheAI: cloud reasoning quality, local detection, nothing sensitive crosses the boundary. Works with Claude Code, OpenAI SDK, OpenRouter, or any OpenAI-compatible client.

flowchart LR
    s3["🖥️ Your Shell\nreal data"] --> p["🛡️ DontFeedTheAI"]
    o2["🧠 Ollama\nlocal detector\nnever leaves machine"] --> p
    p --> c2["☁️ LLM API\nsees only surrogates"]

→ See docs/architecture.md for the full technical breakdown. For supported LLM clients and upstream configuration, see docs/providers.md.


Quick Start

With a VPS (recommended for team use or persistent engagements):

git clone https://github.com/zeroc00I/DontFeedTheAI
cd DontFeedTheAI
python3 wizard.py

The wizard asks for everything it needs (engagement name, VPS address, model), then deploys, opens the SSH tunnel, and launches Claude with the proxy active.

Locally without a VPS:

python3 wizard.py setup       # create venv + install dependencies
python3 wizard.py docker up   # start proxy + Ollama in Docker
export ANTHROPIC_BASE_URL=http://localhost:8080
export ENGAGEMENT_ID=my-engagement
claude                        # or any OpenAI-compatible client
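
If you start your client from a script rather than an interactive shell, the same two variables can be set per process instead of exported globally. The helper below is a hypothetical sketch (`proxy_env` is not part of the project); only the variable names and the default URL come from the commands above.

```python
import os


def proxy_env(engagement_id, base_url="http://localhost:8080"):
    """Return a copy of the current environment with the proxy variables set."""
    env = dict(os.environ)
    env["ANTHROPIC_BASE_URL"] = base_url   # same value as the export above
    env["ENGAGEMENT_ID"] = engagement_id   # same variable name as the export above
    return env
```

A launcher could then call `subprocess.run(["claude"], env=proxy_env("my-engagement"))`, which leaves the parent shell's environment untouched.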

Works on Windows, macOS, and Linux.

python3 wizard.py --help   # all available commands

Docs

| Doc | About |
| --- | --- |
| Architecture | Two-layer pipeline, what gets anonymized and what doesn't, config reference |
| Providers | Supported LLM clients: Claude Code, OpenAI SDK, OpenRouter |
| Contributing | How to add fixtures, run the improvement loop, open areas |
| Threat Model | What this protects against, what it doesn't, limitations, roadmap |

Verifying coverage & contributing improvements

Two tools ship with DontFeedTheAI to help you validate coverage and extend it.

Visual audit (open in a browser while the proxy is running):

python3 wizard.py tunnel --audit

Shows every ORIGINAL → SURROGATE mapping logged during the session, filterable by entity type (DOMAIN, CREDENTIAL, TOKEN, HASH…) with a per-request timing breakdown. Use it to spot leaks at a glance instead of grepping logs.
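
As a rough mental model of what the audit view works with, the sketch below treats the session log as a list of mapping records. The `MappingEntry` shape and both helper functions are assumptions for illustration, not the project's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingEntry:
    entity_type: str   # e.g. "DOMAIN", "CREDENTIAL", "TOKEN", "HASH"
    original: str
    surrogate: str


def by_type(entries, entity_type):
    """Filter the session log down to one entity type."""
    return [e for e in entries if e.entity_type == entity_type]


def still_exposed(entries, outbound_text):
    """Originals that still appear verbatim in text headed upstream."""
    return [e.original for e in entries if e.original in outbound_text]
```

The second helper is the programmatic version of "spotting leaks at a glance": any original value that survives in outbound text was mapped but not replaced.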


The audit page is a debug tool. It exposes the full surrogate → original lookup table, which is why it only runs behind the SSH tunnel. Making this write-only (no reverse lookup over HTTP) is on the roadmap; see Threat Model.

Testing the full pipeline (requires Ollama running):

python3 wizard.py test --integration

Runs all 53 fixtures through the complete pipeline (LLM + regex) and asserts zero leaks. Without --integration, the LLM is mocked and only the regex layer is validated, which is useful for fast iteration but not a substitute for the full run.
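
The toy example below shows why the mocked run is not a substitute. All names are hypothetical, and `fake_llm_detector` is a hard-coded stand-in rather than a real Ollama call; the point is only that a hostname passes the regex layer unnoticed when the LLM layer detects nothing.

```python
import re

IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def regex_detector(text):
    """Regex layer: finds IPv4-looking strings."""
    return set(IP_RE.findall(text))


def mocked_llm_detector(text):
    """Stand-in for the LLM layer when it is mocked: detects nothing."""
    return set()


def fake_llm_detector(text):
    """Hypothetical stand-in for a real LLM call; hard-coded for illustration."""
    return {"dc01.acmecorp.local"} & set(text.split())


def leaks(text, secrets_in_fixture, detectors):
    """Return the fixture secrets that no detector flagged."""
    found = set()
    for detect in detectors:
        found |= detect(text)
    return sorted(s for s in secrets_in_fixture if s not in found)


fixture = "nmap -sV dc01.acmecorp.local 10.20.0.10"
expected = ["dc01.acmecorp.local", "10.20.0.10"]

# Mocked LLM: the IP is caught, the hostname is reported as a leak.
print(leaks(fixture, expected, [regex_detector, mocked_llm_detector]))
# Both layers active: nothing is left over.
print(leaks(fixture, expected, [regex_detector, fake_llm_detector]))
```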

Auto-improvement loop (regex layer only, no Ollama required):

python3 wizard.py improve --cycles 3

Runs all fixtures through the regex layer, reports leaks and false positives, and tells you exactly which strings slipped through. The contribution cycle is: add a fixture for a real tool you use → run the loop → add a regex pattern for each leak → repeat. See Contributing.
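
One cycle of that loop can be sketched as follows. The `Fixture` fields (`must_hide`, `must_keep`) and the `evaluate` function are assumptions made for this example; the project's real fixture format is described in Contributing.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Fixture:
    name: str
    text: str
    must_hide: list = field(default_factory=list)  # strings the regex layer must flag
    must_keep: list = field(default_factory=list)  # harmless strings it must not flag


def evaluate(fixture, patterns):
    """Return (leaks, false_positives) for one fixture against a pattern list."""
    matched = set()
    for pattern in patterns:
        matched.update(m.group(0) for m in pattern.finditer(fixture.text))
    leaks = [s for s in fixture.must_hide if s not in matched]
    false_positives = [s for s in fixture.must_keep if s in matched]
    return leaks, false_positives


fx = Fixture(
    name="smb-scan",
    text="Listening on 0.0.0.0:445; target 10.20.0.10 "
         "hash 8846f7eaee8fb117ad06bdd830b7586c",
    must_hide=["10.20.0.10", "8846f7eaee8fb117ad06bdd830b7586c"],
    must_keep=["0.0.0.0"],
)

patterns = [re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")]
first_leaks, first_fps = evaluate(fx, patterns)   # the hash leaks, 0.0.0.0 is over-matched

# Next cycle: add a pattern for the reported leak and re-run.
patterns.append(re.compile(r"\b[a-fA-F0-9]{32}\b"))
second_leaks, second_fps = evaluate(fx, patterns)
```

After the second run the leak list is empty, while the false positive on `0.0.0.0` remains and would need a tighter IP pattern or an allowlist.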

The two commands complement each other: improve tightens the regex floor fast; test --integration confirms the full pipeline holds.


A note from the author

I'm a pentester, not a software architect.

This wasn't built to be innovative. There are already cloud APIs that do LLM-based anonymization, but that means sending your data to yet another third party, and I refuse. If you work in security, you already know why.

I built this so the architecture would be available to everyone, and so the community could help expand its effectiveness for free. You're paying for context processing, and the AI doesn't need your real data for that.

-- zeroc00I



License

MIT

About

Transparent anonymization proxy for AI-assisted pentesting. Strips IPs, credentials, hostnames and PII before they reach any LLM (Claude, OpenAI, OpenRouter). Local Ollama + regex detection. Per-engagement vault.
