ast-driven structural code-bloat scanner powered by nvidia nim (deepseek v4 pro).
detects duplicate logic, lazy placeholders, ai-generated stubs, and todo debt — then auto-fixes them with production-grade code via nvidia's hosted inference api. no local models, no gpu needed.
this is a fork of zenapta/bloathunter by shashank bhardwaj. the original used ollama + llama 3 locally; this version uses nvidia nim (deepseek v4 pro) — no local llm required.
- ast structural analysis — uses the typescript compiler api to normalize function bodies (strips variable names, formatting, string/number literals) so it finds true copy-pasted duplicates even when variables are renamed.
- nvidia nim auto-fixes — connects to deepseek v4 pro via nvidia's hosted api (integrate.api.nvidia.com). free tier available. no local llm, no ollama, no gpu required.
- smart placeholder detection — 30+ patterns for ai-generated stubs, lazy todos, unimplemented code, and hidden technical debt. catches generated-by-ai comments, "insert logic here", "not implemented" errors, and more.
- duplicate refactoring strategies — for each duplicate cluster, gets deepseek v4 pro to generate a concrete extraction plan with import paths and shared function signatures.
- health scoring — grades your codebase a–f with severity levels (low → critical) so you know where to focus.
- dry-run mode — `--dry-run` shows colored diffs of what would change without touching files.
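the health scoring feature above can be pictured with a minimal sketch. everything specific here is an assumption for illustration: the penalty weights, the intermediate severity names (`medium`, `high`), the grade cut-offs, and the function names `healthScore` and `gradeFor` are hypothetical and are not taken from the project's `issueScorer.ts`. only the 0-100 score range, the a–f grades, and the low-to-critical severity span come from this readme.

```typescript
// hypothetical severity scale; the readme only names the endpoints (low, critical)
type Severity = "low" | "medium" | "high" | "critical";
type Grade = "A" | "B" | "C" | "D" | "F";

// assumed penalty weights, chosen only for illustration
const SEVERITY_PENALTY: Record<Severity, number> = {
  low: 1,
  medium: 3,
  high: 7,
  critical: 15,
};

// start from a perfect 100 and subtract a penalty per issue, never dropping below 0
function healthScore(severities: Severity[]): number {
  const penalty = severities.reduce((sum, s) => sum + SEVERITY_PENALTY[s], 0);
  return Math.max(0, 100 - penalty);
}

// assumed grade boundaries (90/80/70/60), in the style of a school grading scale
function gradeFor(score: number): Grade {
  if (score >= 90) return "A";
  if (score >= 80) return "B";
  if (score >= 70) return "C";
  if (score >= 60) return "D";
  return "F";
}
```

under these assumed weights, one low and one critical issue would cost 16 points, which a `--threshold 85` gate would reject.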
# set your nvidia api key
export NVIDIA_API_KEY=nvapi-...
# run it (scans current directory)
npx code-debloater
# or install globally
npm install -g code-debloater
code-debloater ./src
- go to integrate.nvidia.com
- sign up / log in
- navigate to the api section and generate a free api key
- deepseek v4 pro is available on the free tier with rate limits
code-debloater [options] [directory]
options:
  --dry-run, --dry           preview fixes without writing
  --scan-only, --no-fix      audit only; skip ai fixes
  --yes, -y                  non-interactive auto-fix
  --verbose, -v              detailed per-file progress
  --json                     structured json output (ci)
  --output, -o <file>        write results to file
  --exclude, -x <patterns>   glob exclude patterns (comma-sep)
  --model, -m <name>         nim model (default: deepseek-ai/deepseek-v4-pro)
  --max-concurrent <n>       parallel nim requests (default: 3)
  --threshold <n>            minimum health score (0-100)
  --max-function-lines <n>   warn on functions over n lines (default: 60)
  --init                     scaffold .code-debloaterrc
  --version                  print version
  --help, -h                 show this help

environment:
  NVIDIA_API_KEY             required. get yours at https://integrate.nvidia.com
  CODE_DEBLOATER_MODEL       model override (same as --model)

config file (auto-loaded):
  .code-debloaterrc          project-specific settings (json)
# scan current directory interactively
code-debloater
# audit only — no fixes
code-debloater --scan-only ./src
# preview what would change
code-debloater --dry-run --verbose
# ci-friendly json report
code-debloater --json --output report.json
# skip test & vendor dirs
code-debloater --exclude "test/**,vendor/**"
# fast unattended fixes
code-debloater --yes --max-concurrent 5
# scaffold a config file
code-debloater --init
| category | examples |
|---|---|
| lazy todos | // TODO: implement later, // FIXME: needs work |
| ai-generated stubs | // generated by claude, // auto-generated stub |
| incomplete code | // insert logic here, // add your own code |
| unimplemented | throw new Error('Not implemented'), // needs implementation |
| placeholders | // your code goes here, // ... implement this |
| structural duplicates | functions with identical ast bodies (variable names ignored) |
| oversized functions | functions exceeding --max-function-lines (default: 60) |
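the comment-based categories in the table can be approximated with a handful of regular expressions. this is a minimal sketch, not the project's `commentScanner.ts`: the patterns below cover only the examples shown in the table (the real scanner is described as having 30+), and the names `PlaceholderHit`, `PLACEHOLDER_PATTERNS`, and `scanPlaceholders` are hypothetical.

```typescript
interface PlaceholderHit {
  line: number; // 1-based line number
  category: string;
  text: string; // trimmed source line
}

// illustrative subset built from the table's examples; not the full pattern list
const PLACEHOLDER_PATTERNS: Array<{ category: string; pattern: RegExp }> = [
  { category: "lazy todo", pattern: /\/\/\s*(TODO|FIXME)\b/i },
  { category: "ai-generated stub", pattern: /\/\/\s*(generated by \w+|auto-generated stub)/i },
  { category: "incomplete code", pattern: /\/\/\s*(insert logic here|add your own code)/i },
  {
    category: "unimplemented",
    pattern: /throw new Error\(\s*['"]not implemented['"]\s*\)|\/\/\s*needs implementation/i,
  },
  { category: "placeholder", pattern: /\/\/\s*(your code goes here|\.\.\.\s*implement this)/i },
];

// scan line by line and report each line once, under the first category that matches
function scanPlaceholders(source: string): PlaceholderHit[] {
  const hits: PlaceholderHit[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { category, pattern } of PLACEHOLDER_PATTERNS) {
      if (pattern.test(text)) {
        hits.push({ line: index + 1, category, text: text.trim() });
        break;
      }
    }
  });
  return hits;
}
```

none of the patterns use the `g` flag, so `test` carries no state between lines.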
src/
├── index.ts                  # entry point + cli flag parsing
├── config.ts                 # config loader (.code-debloaterrc + env + cli)
├── ai/
│   └── nimConnector.ts       # nvidia nim api client with retry logic
├── cli/
│   ├── interface.ts          # terminal ui (spinners, tables, diffs)
│   └── output.ts             # json/csv formatters for ci
├── core/
│   ├── crawler.ts            # file discovery (gitignore-aware)
│   ├── fixer.ts              # parallel nim fix executor
│   ├── issueScorer.ts        # health scoring, grading, recommendations
│   └── scanners/
│       ├── astScanner.ts     # typescript ast function extraction + normalization
│       ├── astUtils.ts       # shared ast helpers (no source-code duplication)
│       ├── bloatScanner.ts   # oversized function detection
│       └── commentScanner.ts # regex-based placeholder/todo detection
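the tree mentions retry logic in the nim client and a parallel fix executor bounded by `--max-concurrent`. the sketch below shows one common way to combine the two, exponential backoff plus a small worker pool. it is an illustration only: `withRetry`, `runLimited`, the attempt count, and the delay values are hypothetical and are not taken from `nimConnector.ts` or `fixer.ts`. the actual http call is left out; any async function can be passed in as the task.

```typescript
// retry a task with exponential backoff: baseDelayMs, then 2x, 4x, ... between attempts
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        const delay = baseDelayMs * Math.pow(2, attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}

// run tasks with at most maxConcurrent in flight; results keep the input order
async function runLimited<T>(
  tasks: Array<() => Promise<T>>,
  maxConcurrent = 3,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;
  // each worker pulls the next unclaimed index until the queue is empty
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }
  const workerCount = Math.min(maxConcurrent, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

a fix run could then look like `runLimited(issues.map((i) => () => withRetry(() => requestFix(i))), 3)`, where `issues` and `requestFix` stand in for whatever the real executor uses.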
- extract — every function/method/arrow function is parsed from js/ts files using the typescript compiler api.
- normalize — variable names → __id1, __id2, string literals → __str, numbers → 0. this strips cosmetic differences.
- cluster — functions with identical normalized bodies are grouped.
- report — clusters with 2+ members are reported as duplicates.
- fix — deepseek v4 pro generates a refactoring strategy for each cluster.
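the normalize and cluster steps above can be sketched as follows. note the swap: the project uses the typescript compiler api, while this example uses a small regex tokenizer as a simplified, dependency-free stand-in, so it renames every non-keyword identifier (including property and global names) and does not understand comments or regex literals. `normalizeBody`, `clusterDuplicates`, `FunctionInfo`, and the keyword list are hypothetical names introduced for illustration.

```typescript
// simplified stand-in for ast normalization: a regex tokenizer, not the typescript compiler api
const KEYWORDS = new Set([
  "return", "const", "let", "var", "if", "else", "for", "while", "function",
  "new", "throw", "true", "false", "null", "undefined", "of", "in",
]);

// matches, in order: string literals, numbers, identifiers, or any single non-space character
const TOKEN =
  /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'|`(?:[^`\\]|\\.)*`|\d+(?:\.\d+)?|[A-Za-z_$][\w$]*|\S/g;

// identifiers -> __id1, __id2, ... (by first appearance), strings -> __str, numbers -> 0
function normalizeBody(body: string): string {
  const ids = new Map<string, string>();
  const tokens = body.match(TOKEN) ?? [];
  return tokens
    .map((tok) => {
      if (/^["'`]/.test(tok)) return "__str";
      if (/^\d/.test(tok)) return "0";
      if (/^[A-Za-z_$]/.test(tok)) {
        if (KEYWORDS.has(tok)) return tok;
        if (!ids.has(tok)) ids.set(tok, `__id${ids.size + 1}`);
        return ids.get(tok)!;
      }
      return tok; // punctuation and operators pass through unchanged
    })
    .join(" ");
}

interface FunctionInfo {
  label: string; // function name
  file: string;
  body: string;
}

// group functions by normalized body and keep only groups with 2+ members
function clusterDuplicates(fns: FunctionInfo[]): FunctionInfo[][] {
  const groups = new Map<string, FunctionInfo[]>();
  for (const fn of fns) {
    const key = normalizeBody(fn.body);
    const group = groups.get(key) ?? [];
    group.push(fn);
    groups.set(key, group);
  }
  return Array.from(groups.values()).filter((group) => group.length >= 2);
}
```

numbering identifiers by first appearance is what lets two bodies that differ only in variable names produce the same key.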
| | ollama (local llama) | code-debloater (nvidia nim) |
|---|---|---|
| gpu needed | yes (or very slow on cpu) | no |
| setup | install ollama, pull model | just set NVIDIA_API_KEY |
| speed | depends on hardware | ~1-3s per fix on nim |
| model | llama 3 (8b/70b) | deepseek v4 pro (moe, 200b+) |
| quality | decent for small fixes | production-grade code generation |
| cost | free (your electricity) | free tier available |
| context window | 8k–128k | 1m tokens |
# clone and build
git clone https://github.com/houseofmates/code-debloater
cd code-debloater
npm install
npm run build
# run locally
NVIDIA_API_KEY=nvapi-... npm run start -- ./test-sandbox
# test with dry run
NVIDIA_API_KEY=nvapi-... npm run start -- --dry-run ./test-sandbox
forked from zenapta/bloathunter by shashank bhardwaj (mit license). the original project deserves credit for the ast scanning architecture and the concept of structural duplicate detection. this fork replaces the local ollama/llama 3 backend with nvidia nim (deepseek v4 pro) and adds extensive new features.