A lightweight AI operation proxy designed to reduce token waste, eliminate unnecessary repository scans, and enforce disciplined code generation workflows.
Built for developers who are tired of:
- bloated AI-generated refactors
- full repository rescans for tiny edits
- context drift
- runaway token costs
- "vibe coding" chaos in mature codebases
Most AI coding workflows are backwards.
Instead of:
- understanding architecture
- targeting exact symbols
- minimizing context
- preserving ownership boundaries
modern coding assistants often:
- scan massive repos
- repeatedly reread files
- rewrite unrelated code
- generate unnecessary abstractions
- slowly inflate complexity
This project acts as a governor layer between:
- the developer
- the repository
- the model
The goal is simple:
Surgical operations only.
No full-repo madness unless explicitly required.
Compare:
- baseline AI workflows
- optimized proxy workflows
Measure:
- estimated token usage
- file reads
- context size
- operation counts
- repo traversal overhead
Example:
curl.exe -X POST "http://localhost:5180/api/ab/run" `
-H "Content-Type: application/json" `
-d '{ "repoId": "proxy", "task": "refactor RedisClient error handling" }'

Fetch report:
curl.exe "http://localhost:5180/api/ab/report?repoId=proxy"

Fetch historical data:
curl.exe "http://localhost:5174/api/ab/history?repoId=proxy&limit=50"

Latest local batch benchmarks captured on 2026-05-08 using:
/usr/bin/python3 /home/user/proxy/tools/ab_batch.py

Environment used:
PROXY_BASE_URL=http://localhost:5180
PROXY_REPO_ID=proxy
PROXY_REPO_PATH=/home/user/proxy
- totalTests: 20
- totalBaselineTokens: 602460
- totalOptimizedTokens: 75293
- totalTokensAvoided: 527167
- weightedReductionPercent: 87.5%
- contextRetainedPercent: 12.5%
- regressionCount: 0
- averageExpectedHitRate: 90.0%
- output artifacts:
  - .benchmark-results/ab_batch_20260508T084450Z.json
  - .benchmark-results/ab_batch_20260508T084450Z.csv
Running task suite...
ID ACTION BASE OPT SAVED RED% KEEP% OVER HIT% STATUS
T01 refactor base= 30123 opt= 1795 saved= 28328 red= 94.04% keep= 5.96% over=16.78x hit= 100.0% ok
T02 bugfix base= 30123 opt= 3761 saved= 26362 red= 87.51% keep= 12.49% over= 8.01x hit= 100.0% ok
T03 feature base= 30123 opt= 1795 saved= 28328 red= 94.04% keep= 5.96% over=16.78x hit= 100.0% ok
T04 refactor base= 30123 opt= 1795 saved= 28328 red= 94.04% keep= 5.96% over=16.78x hit= 100.0% ok
T05 bugfix base= 30123 opt= 1795 saved= 28328 red= 94.04% keep= 5.96% over=16.78x hit= 100.0% ok
T06 test base= 30123 opt= 2012 saved= 28111 red= 93.32% keep= 6.68% over=14.97x hit= 100.0% ok
T07 feature base= 30123 opt= 6313 saved= 23810 red= 79.04% keep= 20.96% over= 4.77x hit= 100.0% ok
T08 edit base= 30123 opt= 6401 saved= 23722 red= 78.75% keep= 21.25% over= 4.71x hit= 100.0% ok
T09 edit base= 30123 opt= 3488 saved= 26635 red= 88.42% keep= 11.58% over= 8.64x hit= 100.0% ok
T10 bugfix base= 30123 opt= 2265 saved= 27858 red= 92.48% keep= 7.52% over= 13.3x hit= 100.0% ok
T11 feature base= 30123 opt= 5854 saved= 24269 red= 80.57% keep= 19.43% over= 5.15x hit= 50.0% ok
T12 feature base= 30123 opt= 1795 saved= 28328 red= 94.04% keep= 5.96% over=16.78x hit= 0.0% ok
T13 refactor base= 30123 opt= 2987 saved= 27136 red= 90.08% keep= 9.92% over=10.08x hit= 100.0% ok
T14 feature base= 30123 opt= 5586 saved= 24537 red= 81.46% keep= 18.54% over= 5.39x hit= 100.0% ok
T15 feature base= 30123 opt= 4240 saved= 25883 red= 85.92% keep= 14.08% over= 7.1x hit= 100.0% ok
T16 delete base= 30123 opt= 6377 saved= 23746 red= 78.83% keep= 21.17% over= 4.72x hit= 100.0% ok
T17 delete base= 30123 opt= 4018 saved= 26105 red= 86.66% keep= 13.34% over= 7.5x hit= 100.0% ok
T18 build base= 30123 opt= 5592 saved= 24531 red= 81.44% keep= 18.56% over= 5.39x hit= 100.0% ok
T19 docs base= 30123 opt= 4026 saved= 26097 red= 86.63% keep= 13.37% over= 7.48x hit= 100.0% ok
T20 cleanup base= 30123 opt= 3398 saved= 26725 red= 88.72% keep= 11.28% over= 8.86x hit= 50.0% ok
Summary
{
"totalTests": 20,
"totalBaselineTokens": 602460,
"totalOptimizedTokens": 75293,
"totalTokensAvoided": 527167,
"weightedReductionPercent": 87.5,
"contextRetainedPercent": 12.5,
"regressionCount": 0,
"regressionRate": 0.0,
"averageExpectedHitRate": 90.0
}
Wrote JSON: .benchmark-results/ab_batch_20260508T084450Z.json
Wrote CSV : .benchmark-results/ab_batch_20260508T084450Z.csv
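The columns in the run above follow from two numbers per task: the baseline and the optimized token estimate. Here is a minimal sketch of that arithmetic, assuming `saved = base - opt`, `red% = saved / base`, `keep% = opt / base`, and `over = base / opt`. These formulas are inferred from the printed values, not taken from `ab_batch.py`, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TaskResult:
    """One benchmark row: estimated tokens for the baseline and optimized runs."""
    task_id: str
    base: int
    opt: int


def row_metrics(r: TaskResult) -> Dict[str, float]:
    """Derive the per-row columns (SAVED, RED%, KEEP%, OVER) from base and opt."""
    saved = r.base - r.opt
    return {
        "saved": saved,
        "red": round(100 * saved / r.base, 2),
        "keep": round(100 * r.opt / r.base, 2),
        "over": round(r.base / r.opt, 2),
    }


def weighted_reduction(results: List[TaskResult]) -> float:
    """Token-weighted reduction: total tokens avoided over total baseline tokens."""
    total_base = sum(r.base for r in results)
    total_opt = sum(r.opt for r in results)
    return round(100 * (total_base - total_opt) / total_base, 2)


# Row T01 from the run above, plus the run totals treated as a single row.
t01 = TaskResult("T01", base=30123, opt=1795)
totals = TaskResult("ALL", base=602460, opt=75293)
print(row_metrics(t01))
print(weighted_reduction([totals]))
```

Because the reduction is weighted by tokens rather than averaged per row, a larger baseline raises the percentage even when the optimized token count stays the same.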
This follow-up run used the repaired scan flow on http://localhost:5180, where the scan resolved the host path /home/user/proxy to the container-visible repo root and indexed 44 files with 83712 estimated tokens.
- totalTests: 20
- totalBaselineTokens: 1105360
- totalOptimizedTokens: 75293
- totalTokensAvoided: 1030067
- weightedReductionPercent: 93.19%
- contextRetainedPercent: 6.81%
- regressionCount: 0
- averageExpectedHitRate: 90.0%
user@DESKTOP-JV1KHOV:~/proxy$ /usr/bin/python3 /home/user/proxy/tools/ab_batch.py
A/B batch benchmark
Base URL : http://localhost:5180
Repo ID : proxy
Repo path: /home/user/proxy
Scanning repo: /home/user/proxy
Scan ok: 44 files, 83712 estimated tokens
Running task suite...
ID ACTION BASE OPT SAVED RED% KEEP% OVER HIT% STATUS
T01 refactor base= 55268 opt= 1795 saved= 53473 red= 96.75% keep= 3.25% over=30.79x hit= 100.0% ok
T02 bugfix base= 55268 opt= 3761 saved= 51507 red= 93.19% keep= 6.81% over= 14.7x hit= 100.0% ok
T03 feature base= 55268 opt= 1795 saved= 53473 red= 96.75% keep= 3.25% over=30.79x hit= 100.0% ok
T04 refactor base= 55268 opt= 1795 saved= 53473 red= 96.75% keep= 3.25% over=30.79x hit= 100.0% ok
T05 bugfix base= 55268 opt= 1795 saved= 53473 red= 96.75% keep= 3.25% over=30.79x hit= 100.0% ok
T06 test base= 55268 opt= 2012 saved= 53256 red= 96.36% keep= 3.64% over=27.47x hit= 100.0% ok
T07 feature base= 55268 opt= 6313 saved= 48955 red= 88.58% keep= 11.42% over= 8.75x hit= 100.0% ok
T08 edit base= 55268 opt= 6401 saved= 48867 red= 88.42% keep= 11.58% over= 8.63x hit= 100.0% ok
T09 edit base= 55268 opt= 3488 saved= 51780 red= 93.69% keep= 6.31% over=15.85x hit= 100.0% ok
T10 bugfix base= 55268 opt= 2265 saved= 53003 red= 95.9% keep= 4.1% over= 24.4x hit= 100.0% ok
T11 feature base= 55268 opt= 5854 saved= 49414 red= 89.41% keep= 10.59% over= 9.44x hit= 50.0% ok
T12 feature base= 55268 opt= 1795 saved= 53473 red= 96.75% keep= 3.25% over=30.79x hit= 0.0% ok
T13 refactor base= 55268 opt= 2987 saved= 52281 red= 94.6% keep= 5.4% over= 18.5x hit= 100.0% ok
T14 feature base= 55268 opt= 5586 saved= 49682 red= 89.89% keep= 10.11% over= 9.89x hit= 100.0% ok
T15 feature base= 55268 opt= 4240 saved= 51028 red= 92.33% keep= 7.67% over=13.03x hit= 100.0% ok
T16 delete base= 55268 opt= 6377 saved= 48891 red= 88.46% keep= 11.54% over= 8.67x hit= 100.0% ok
T17 delete base= 55268 opt= 4018 saved= 51250 red= 92.73% keep= 7.27% over=13.76x hit= 100.0% ok
T18 build base= 55268 opt= 5592 saved= 49676 red= 89.88% keep= 10.12% over= 9.88x hit= 100.0% ok
T19 docs base= 55268 opt= 4026 saved= 51242 red= 92.72% keep= 7.28% over=13.73x hit= 100.0% ok
T20 cleanup base= 55268 opt= 3398 saved= 51870 red= 93.85% keep= 6.15% over=16.26x hit= 50.0% ok
Summary
{
"totalTests": 20,
"totalBaselineTokens": 1105360,
"totalOptimizedTokens": 75293,
"totalTokensAvoided": 1030067,
"weightedReductionPercent": 93.19,
"contextRetainedPercent": 6.81,
"regressionCount": 0,
"regressionRate": 0.0,
"averageExpectedHitRate": 90.0
}
The interactive A/B testing dashboard provides:
- Overview Statistics: Total tests, average savings, reduction percentages
- Current Report: Side-by-side baseline vs optimized comparison
- Token Savings: Real-time metrics on token reduction per test
- Historical Trends: Charts showing token savings and reduction % over time
- Test History Table: Recent test results with sortable metrics
- Multi-repo Support: Switch between different repositories
Handles:
- orchestration
- benchmarking
- routing
- strategy selection
Stores:
- repository metadata
- symbol graphs
- file summaries
- hashes
- line counts
- operation history
- embeddings (future)
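Hashes and line counts are what make it possible to skip unchanged files. A minimal sketch of such a per-file record, assuming SHA-256 content hashes and a JSON-serializable shape. The field names, the `FileRecord` type, and the example path are hypothetical, not the project's actual schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical per-file metadata document (field names are assumptions)."""
    path: str
    sha256: str
    line_count: int


def build_record(path: str, content: str) -> FileRecord:
    """Hash the file content and count its lines."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return FileRecord(path=path, sha256=digest, line_count=len(content.splitlines()))


def needs_reindex(previous: Optional[FileRecord], current: FileRecord) -> bool:
    """A file is re-indexed only when it is new or its content hash changed."""
    return previous is None or previous.sha256 != current.sha256


old = build_record("src/redis_client.py", "a = 1\n")
new = build_record("src/redis_client.py", "a = 2\n")
print(needs_reindex(old, new))
print(json.dumps(asdict(new)))
```

A record like this serializes to a plain JSON object, so it could be kept in any JSON document store.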
Indexes:
- files
- methods
- classes
- dependencies
- namespaces
- imports
- call relationships
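To make the indexed items concrete, here is a small sketch that extracts classes, functions, imports, and direct call edges from one Python module with the standard `ast` module. This is only an illustration: the project's indexer may target other languages and use different tooling, and `index_source` and the sample module are hypothetical.

```python
import ast
from collections import defaultdict


def index_source(source: str) -> dict:
    """Collect classes, functions, imports, and caller -> callee edges from one module."""
    tree = ast.parse(source)
    index = {"classes": [], "functions": [], "imports": [], "calls": defaultdict(set)}

    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            index["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Methods and top-level functions are both recorded here.
            index["functions"].append(node.name)
            # Record calls made by simple name inside this function body.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    index["calls"][node.name].add(inner.func.id)
        elif isinstance(node, ast.Import):
            index["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            index["imports"].append(node.module or "")
    return index


SAMPLE = '''
import json
from os import path

class RedisClient:
    def connect(self):
        return retry(open_socket)

def retry(fn):
    return fn()

def open_socket():
    return path.join("a", "b")
'''

idx = index_source(SAMPLE)
print(idx["classes"], sorted(idx["functions"]), idx["imports"])
print({name: sorted(callees) for name, callees in idx["calls"].items()})
```

Attribute calls such as `path.join(...)` are deliberately skipped here; resolving them needs type or scope information that a real indexer would have to supply.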
Creates:
- constrained prompts
- surgical edit requests
- minimal context operations
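One way to picture a constrained prompt is a fixed set of scope rules followed by a capped amount of context. The sketch below assumes a character budget and a unified-diff response format. The `EditRequest` shape, the rule wording, and the file name are hypothetical and not the proxy's actual prompt template.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EditRequest:
    """Hypothetical request shape; the real proxy's schema may differ."""
    task: str
    target_symbol: str
    allowed_files: List[str]
    snippets: List[str] = field(default_factory=list)


def build_constrained_prompt(req: EditRequest, max_context_chars: int = 4000) -> str:
    """Assemble a prompt that names the target, lists editable files, and caps context."""
    context = ""
    for snippet in req.snippets:
        # Stop adding snippets once the context budget would be exceeded.
        if len(context) + len(snippet) > max_context_chars:
            break
        context += snippet + "\n"
    rules = [
        f"Task: {req.task}",
        f"Target symbol: {req.target_symbol}",
        "Editable files: " + ", ".join(req.allowed_files),
        "Do not modify any other file or rename public symbols.",
        "Return a unified diff only.",
    ]
    return "\n".join(rules) + "\n\nContext:\n" + context


prompt = build_constrained_prompt(
    EditRequest(
        task="refactor RedisClient error handling",
        target_symbol="RedisClient",
        allowed_files=["src/RedisClient.ts"],
        snippets=["class RedisClient { /* ... */ }"],
    )
)
print(prompt)
```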
Chooses:
- baseline mode
- optimized mode
- targeted symbol mode
- future autonomous repair mode
Using Redis Stack allows:
- fast metadata lookup
- JSON document storage
- semantic search
- vector search
- graph-like relationship traversal
- low-latency repository operations
The proxy can query this stored state instead of repeatedly rescanning thousands of files.
- token reduction
- smaller prompts
- faster operations
- better architectural discipline
- symbol-aware operations
- reduced hallucinations
Controlled, deterministic transformations.
Persistent repository intelligence across sessions.
Prevent models from:
- touching unrelated files
- overengineering
- introducing architectural drift
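Keeping a model away from unrelated files can be checked mechanically on the patch it returns. A minimal sketch, assuming the model replies with a git-style unified diff and that the proxy knows which files were in scope. The function names and file paths are hypothetical.

```python
import re
from typing import Iterable, Set

# Matches the "+++ b/<path>" header that git-style unified diffs emit per file.
_NEW_FILE_HEADER = re.compile(r"^\+\+\+ b/(.+)$", re.MULTILINE)


def touched_files(diff_text: str) -> Set[str]:
    """Return the set of file paths a unified diff modifies or creates."""
    return set(_NEW_FILE_HEADER.findall(diff_text))


def out_of_scope(diff_text: str, allowed: Iterable[str]) -> Set[str]:
    """Files the patch touches that were not in the allowed set."""
    return touched_files(diff_text) - set(allowed)


DIFF = """\
--- a/src/RedisClient.ts
+++ b/src/RedisClient.ts
@@ -1,2 +1,2 @@
-throw err;
+throw new RedisError(err);
--- a/src/Logger.ts
+++ b/src/Logger.ts
@@ -1 +1 @@
-old
+new
"""

print(out_of_scope(DIFF, allowed=["src/RedisClient.ts"]))
```

A non-empty result is a reason to reject the patch. This sketch does not detect deleted files, whose new-file header is `+++ /dev/null`; a fuller check would also read the `--- a/` headers.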
Use:
- small local models for planning
- larger models only for execution
Allow local models to:
- generate tests
- fix build errors
- propose improvements
- benchmark themselves
User request:
Refactor Redis error handling
Typical assistant:
- scans entire repo
- rereads unrelated files
- modifies abstractions
- changes naming conventions
- rewrites architecture
- burns tokens
Proxy:
- detects target symbol
- resolves call graph
- extracts minimal dependencies
- builds constrained prompt
- limits operation scope
- validates patch impact
Only necessary context is sent.
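The "resolve call graph, extract minimal dependencies" steps can be sketched as a depth-limited breadth-first walk from the target symbol. The graph shape (caller mapped to a set of callees), the depth limit, and the symbol names are assumptions for illustration, not the proxy's actual data model.

```python
from collections import deque
from typing import Dict, List, Set


def minimal_context(call_graph: Dict[str, Set[str]], target: str, max_depth: int = 1) -> List[str]:
    """Breadth-first walk from the target symbol, stopping at max_depth hops."""
    seen = {target}
    order = [target]
    queue = deque([(target, 0)])
    while queue:
        symbol, depth = queue.popleft()
        if depth == max_depth:
            # Do not expand symbols that already sit at the depth limit.
            continue
        for callee in sorted(call_graph.get(symbol, ())):
            if callee not in seen:
                seen.add(callee)
                order.append(callee)
                queue.append((callee, depth + 1))
    return order


GRAPH = {
    "RedisClient.connect": {"RedisClient.handleError", "Socket.open"},
    "RedisClient.handleError": {"Logger.warn"},
    "Logger.warn": {"Formatter.format"},
}
print(minimal_context(GRAPH, "RedisClient.connect", max_depth=1))
```

Only the symbols returned by the walk would be included in the prompt; raising `max_depth` trades more context for more tokens.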
- OpenAI-compatible API proxy
- Ollama support
- llama.cpp integration
- Redis semantic cache
- operation replay
- telemetry dashboard
- VS Code integration
- GitHub Copilot interception
- operation scoring
- diff validation
- build verification
- policy engine
- Roslyn-powered C# transforms
- incremental indexing
- AI safety rails
- operation heatmaps
- context compression
- SvelteKit
- TailwindCSS
- Flowbite
- TypeScript
- Node.js + SvelteKit server routes
- Redis Stack
- Docker / Podman
- llama.cpp
- Ollama
- local GGUF models
- optional remote APIs
Current stage:
- heavy R&D
- benchmarking infrastructure
- repository indexing
- operation analysis
- prompt discipline experiments
This is not another "generate app in one click" toy.
The entire point is:
- controlled generation
- measurable operations
- architectural stability
- long-term maintainability
podman run -d \
--name redis-stack \
-p 6379:6379 \
-p 8001:8001 \
docker.io/redis/redis-stack:latest

Or with Docker:
docker run -d \
--name redis-stack \
-p 6379:6379 \
-p 8001:8001 \
redis/redis-stack:latest

npm install
npm run dev

Visit the dashboard at http://localhost:5174/dashboard to visualize A/B tests and historical trends.
curl.exe -X POST "http://localhost:5173/api/ab/run" `
-H "Content-Type: application/json" `
-d '{ "repoId": "proxy", "task": "refactor RedisClient error handling" }'

AI coding tools are currently optimized for:
- engagement
- convenience
- generation speed
This project optimizes for:
- correctness
- control
- architectural preservation
- operational efficiency
- reduced waste
The future is not:
let the AI do everything
The future is:
disciplined orchestration with measurable constraints