Proxy

A lightweight AI operation proxy designed to reduce token waste, eliminate unnecessary repository scans, and enforce disciplined code generation workflows.

Built for developers who are tired of:

  • bloated AI-generated refactors
  • full repository rescans for tiny edits
  • context drift
  • runaway token costs
  • "vibe coding" chaos in mature codebases

Philosophy

Most AI coding workflows are backwards.

Instead of:

  • understanding architecture
  • targeting exact symbols
  • minimizing context
  • preserving ownership boundaries

modern coding assistants often:

  • scan massive repos
  • repeatedly reread files
  • rewrite unrelated code
  • generate unnecessary abstractions
  • slowly inflate complexity

This project acts as a governor layer between:

  • the developer
  • the repository
  • the model

The goal is simple:

Surgical operations only.

No full-repo madness unless explicitly required.

Core Features

A/B Benchmarking

Compare:

  • baseline AI workflows
  • optimized proxy workflows

Measure:

  • estimated token usage
  • file reads
  • context size
  • operation counts
  • repo traversal overhead

Example:

curl.exe -X POST "http://localhost:5180/api/ab/run" `
	-H "Content-Type: application/json" `
	-d '{ "repoId": "proxy", "task": "refactor RedisClient error handling" }'

Fetch report:

curl.exe "http://localhost:5180/api/ab/report?repoId=proxy"

Fetch historical data:

curl.exe "http://localhost:5180/api/ab/history?repoId=proxy&limit=50"

Recorded Benchmark Runs

Latest local batch benchmarks captured on 2026-05-08 using:

/usr/bin/python3 /home/user/proxy/tools/ab_batch.py

Environment used:

  • PROXY_BASE_URL=http://localhost:5180
  • PROXY_REPO_ID=proxy
  • PROXY_REPO_PATH=/home/user/proxy

Run snapshot A

  • totalTests: 20
  • totalBaselineTokens: 602460
  • totalOptimizedTokens: 75293
  • totalTokensAvoided: 527167
  • weightedReductionPercent: 87.5%
  • contextRetainedPercent: 12.5%
  • regressionCount: 0
  • averageExpectedHitRate: 90.0%
  • output artifacts:
    • .benchmark-results/ab_batch_20260508T084450Z.json
    • .benchmark-results/ab_batch_20260508T084450Z.csv

Console log:

Running task suite...
ID  ACTION   BASE    OPT  SAVED    RED%   KEEP%  OVER   HIT% STATUS
T01 refactor base= 30123 opt=  1795 saved= 28328 red=  94.04% keep=   5.96% over=16.78x hit= 100.0% ok
T02 bugfix   base= 30123 opt=  3761 saved= 26362 red=  87.51% keep=  12.49% over= 8.01x hit= 100.0% ok
T03 feature  base= 30123 opt=  1795 saved= 28328 red=  94.04% keep=   5.96% over=16.78x hit= 100.0% ok
T04 refactor base= 30123 opt=  1795 saved= 28328 red=  94.04% keep=   5.96% over=16.78x hit= 100.0% ok
T05 bugfix   base= 30123 opt=  1795 saved= 28328 red=  94.04% keep=   5.96% over=16.78x hit= 100.0% ok
T06 test     base= 30123 opt=  2012 saved= 28111 red=  93.32% keep=   6.68% over=14.97x hit= 100.0% ok
T07 feature  base= 30123 opt=  6313 saved= 23810 red=  79.04% keep=  20.96% over= 4.77x hit= 100.0% ok
T08 edit     base= 30123 opt=  6401 saved= 23722 red=  78.75% keep=  21.25% over= 4.71x hit= 100.0% ok
T09 edit     base= 30123 opt=  3488 saved= 26635 red=  88.42% keep=  11.58% over= 8.64x hit= 100.0% ok
T10 bugfix   base= 30123 opt=  2265 saved= 27858 red=  92.48% keep=   7.52% over= 13.3x hit= 100.0% ok
T11 feature  base= 30123 opt=  5854 saved= 24269 red=  80.57% keep=  19.43% over= 5.15x hit=  50.0% ok
T12 feature  base= 30123 opt=  1795 saved= 28328 red=  94.04% keep=   5.96% over=16.78x hit=   0.0% ok
T13 refactor base= 30123 opt=  2987 saved= 27136 red=  90.08% keep=   9.92% over=10.08x hit= 100.0% ok
T14 feature  base= 30123 opt=  5586 saved= 24537 red=  81.46% keep=  18.54% over= 5.39x hit= 100.0% ok
T15 feature  base= 30123 opt=  4240 saved= 25883 red=  85.92% keep=  14.08% over=  7.1x hit= 100.0% ok
T16 delete   base= 30123 opt=  6377 saved= 23746 red=  78.83% keep=  21.17% over= 4.72x hit= 100.0% ok
T17 delete   base= 30123 opt=  4018 saved= 26105 red=  86.66% keep=  13.34% over=  7.5x hit= 100.0% ok
T18 build    base= 30123 opt=  5592 saved= 24531 red=  81.44% keep=  18.56% over= 5.39x hit= 100.0% ok
T19 docs     base= 30123 opt=  4026 saved= 26097 red=  86.63% keep=  13.37% over= 7.48x hit= 100.0% ok
T20 cleanup  base= 30123 opt=  3398 saved= 26725 red=  88.72% keep=  11.28% over= 8.86x hit=  50.0% ok

Summary
{
	"totalTests": 20,
	"totalBaselineTokens": 602460,
	"totalOptimizedTokens": 75293,
	"totalTokensAvoided": 527167,
	"weightedReductionPercent": 87.5,
	"contextRetainedPercent": 12.5,
	"regressionCount": 0,
	"regressionRate": 0.0,
	"averageExpectedHitRate": 90.0
}

Wrote JSON: .benchmark-results/ab_batch_20260508T084450Z.json
Wrote CSV : .benchmark-results/ab_batch_20260508T084450Z.csv
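The SAVED, RED%, KEEP%, and OVER columns in the log above follow directly from the base and opt token counts. A minimal sketch of the arithmetic (field names mirror the log output, not a public API):

```python
def task_metrics(base: int, opt: int) -> dict:
    """Derive the per-task columns shown in the batch log."""
    saved = base - opt                  # SAVED: tokens avoided
    red = round(saved / base * 100, 2)  # RED%: reduction percentage
    keep = round(opt / base * 100, 2)   # KEEP%: context retained
    over = round(base / opt, 2)         # OVER: baseline overhead multiple
    return {"saved": saved, "red": red, "keep": keep, "over": over}

# T01 from run snapshot A: base=30123, opt=1795
# -> saved=28328, red=94.04, keep=5.96, over=16.78
print(task_metrics(30123, 1795))
```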

Run snapshot B

This follow-up run used the repaired scan flow on http://localhost:5180. The scan resolved the host path /home/user/proxy to the container-visible repo root and indexed 44 files (83712 estimated tokens).

  • totalTests: 20
  • totalBaselineTokens: 1105360
  • totalOptimizedTokens: 75293
  • totalTokensAvoided: 1030067
  • weightedReductionPercent: 93.19%
  • contextRetainedPercent: 6.81%
  • regressionCount: 0
  • averageExpectedHitRate: 90.0%
Console log:

$ /usr/bin/python3 /home/user/proxy/tools/ab_batch.py
A/B batch benchmark
Base URL : http://localhost:5180
Repo ID  : proxy
Repo path: /home/user/proxy

Scanning repo: /home/user/proxy
Scan ok: 44 files, 83712 estimated tokens

Running task suite...
ID  ACTION   BASE    OPT  SAVED    RED%   KEEP%  OVER   HIT% STATUS
T01 refactor base= 55268 opt=  1795 saved= 53473 red=  96.75% keep=   3.25% over=30.79x hit= 100.0% ok
T02 bugfix   base= 55268 opt=  3761 saved= 51507 red=  93.19% keep=   6.81% over= 14.7x hit= 100.0% ok
T03 feature  base= 55268 opt=  1795 saved= 53473 red=  96.75% keep=   3.25% over=30.79x hit= 100.0% ok
T04 refactor base= 55268 opt=  1795 saved= 53473 red=  96.75% keep=   3.25% over=30.79x hit= 100.0% ok
T05 bugfix   base= 55268 opt=  1795 saved= 53473 red=  96.75% keep=   3.25% over=30.79x hit= 100.0% ok
T06 test     base= 55268 opt=  2012 saved= 53256 red=  96.36% keep=   3.64% over=27.47x hit= 100.0% ok
T07 feature  base= 55268 opt=  6313 saved= 48955 red=  88.58% keep=  11.42% over= 8.75x hit= 100.0% ok
T08 edit     base= 55268 opt=  6401 saved= 48867 red=  88.42% keep=  11.58% over= 8.63x hit= 100.0% ok
T09 edit     base= 55268 opt=  3488 saved= 51780 red=  93.69% keep=   6.31% over=15.85x hit= 100.0% ok
T10 bugfix   base= 55268 opt=  2265 saved= 53003 red=   95.9% keep=    4.1% over= 24.4x hit= 100.0% ok
T11 feature  base= 55268 opt=  5854 saved= 49414 red=  89.41% keep=  10.59% over= 9.44x hit=  50.0% ok
T12 feature  base= 55268 opt=  1795 saved= 53473 red=  96.75% keep=   3.25% over=30.79x hit=   0.0% ok
T13 refactor base= 55268 opt=  2987 saved= 52281 red=   94.6% keep=    5.4% over= 18.5x hit= 100.0% ok
T14 feature  base= 55268 opt=  5586 saved= 49682 red=  89.89% keep=  10.11% over= 9.89x hit= 100.0% ok
T15 feature  base= 55268 opt=  4240 saved= 51028 red=  92.33% keep=   7.67% over=13.03x hit= 100.0% ok
T16 delete   base= 55268 opt=  6377 saved= 48891 red=  88.46% keep=  11.54% over= 8.67x hit= 100.0% ok
T17 delete   base= 55268 opt=  4018 saved= 51250 red=  92.73% keep=   7.27% over=13.76x hit= 100.0% ok
T18 build    base= 55268 opt=  5592 saved= 49676 red=  89.88% keep=  10.12% over= 9.88x hit= 100.0% ok
T19 docs     base= 55268 opt=  4026 saved= 51242 red=  92.72% keep=   7.28% over=13.73x hit= 100.0% ok
T20 cleanup  base= 55268 opt=  3398 saved= 51870 red=  93.85% keep=   6.15% over=16.26x hit=  50.0% ok

Summary
{
	"totalTests": 20,
	"totalBaselineTokens": 1105360,
	"totalOptimizedTokens": 75293,
	"totalTokensAvoided": 1030067,
	"weightedReductionPercent": 93.19,
	"contextRetainedPercent": 6.81,
	"regressionCount": 0,
	"regressionRate": 0.0,
	"averageExpectedHitRate": 90.0
}
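The summary fields above aggregate the per-task rows. A sketch of how the weighted figures fall out of the totals (field names mirror the JSON summary; the tuple layout is an assumption for illustration):

```python
def summarize(tasks: list[tuple[int, int, float]]) -> dict:
    """tasks: (baselineTokens, optimizedTokens, expectedHitRate) per test."""
    total_base = sum(b for b, _, _ in tasks)
    total_opt = sum(o for _, o, _ in tasks)
    avoided = total_base - total_opt
    return {
        "totalBaselineTokens": total_base,
        "totalOptimizedTokens": total_opt,
        "totalTokensAvoided": avoided,
        # Weighted by tokens, so token-heavy tasks dominate the percentage.
        "weightedReductionPercent": round(avoided / total_base * 100, 2),
        "contextRetainedPercent": round(total_opt / total_base * 100, 2),
        "averageExpectedHitRate": round(sum(h for _, _, h in tasks) / len(tasks), 1),
    }

# Run snapshot B uses a constant baseline of 55268 tokens per task;
# feeding its 20 rows into summarize() reproduces the summary above.
```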

Dashboard Features

The interactive A/B testing dashboard provides:

  • Overview Statistics: Total tests, average savings, reduction percentages
  • Current Report: Side-by-side baseline vs optimized comparison
  • Token Savings: Real-time metrics on token reduction per test
  • Historical Trends: Charts showing token savings and reduction % over time
  • Test History Table: Recent test results with sortable metrics
  • Multi-repo Support: Switch between different repositories

Architecture

Components

API Server

Handles:

  • orchestration
  • benchmarking
  • routing
  • strategy selection

Redis Stack

Stores:

  • repository metadata
  • symbol graphs
  • file summaries
  • hashes
  • line counts
  • operation history
  • embeddings (future)

Repo Scanner

Indexes:

  • files
  • methods
  • classes
  • dependencies
  • namespaces
  • imports
  • call relationships
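The scan log reports "estimated tokens" per repo. The project's actual estimator isn't shown here; a common heuristic (roughly four characters per token, an assumption rather than the real formula) can be sketched as:

```python
from pathlib import Path

def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token (heuristic, not exact)."""
    return max(1, len(text) // 4)

def scan_repo(root: str, suffixes: set[str] = {".ts", ".py", ".md"}) -> tuple[int, int]:
    """Return (file_count, estimated_tokens) for matching files under root."""
    files = [p for p in Path(root).rglob("*") if p.is_file() and p.suffix in suffixes]
    total = sum(estimate_tokens(p.read_text(errors="ignore")) for p in files)
    return len(files), total
```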

Prompt Builder

Creates:

  • constrained prompts
  • surgical edit requests
  • minimal context operations
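A constrained prompt in this style can be sketched as a template that pins the target symbol and forbids edits outside it (the wording and fields are illustrative, not the proxy's actual template):

```python
def build_constrained_prompt(symbol: str, snippet: str, deps: list[str], task: str) -> str:
    """Build a surgical edit request limited to one symbol and its direct deps."""
    dep_list = "\n".join(f"- {d}" for d in deps) or "- (none)"
    return (
        f"Task: {task}\n"
        f"Target symbol: {symbol}\n"
        f"Allowed dependencies:\n{dep_list}\n"
        "Rules:\n"
        "- Edit ONLY the target symbol.\n"
        "- Do not rename, reformat, or touch unrelated code.\n"
        "- Return a unified diff, nothing else.\n"
        f"Current code:\n{snippet}\n"
    )
```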

Strategy Engine

Chooses:

  • baseline mode
  • optimized mode
  • targeted symbol mode
  • future autonomous repair mode
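Mode selection can be as simple as a rule table keyed on task shape. The thresholds, keywords, and mode names below are assumptions for illustration, not the engine's actual rules:

```python
def choose_mode(task: str, target_symbols: list[str], repo_files: int) -> str:
    """Pick an operation mode; baseline is the explicit full-context fallback."""
    text = task.lower()
    if target_symbols:
        return "targeted-symbol"   # exact symbols resolved: smallest context
    if any(w in text for w in ("refactor", "bugfix", "edit", "cleanup")):
        return "optimized"         # scoped task: use indexed summaries
    if repo_files > 1000:
        return "optimized"         # never full-scan a large repo by default
    return "baseline"
```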

Why Redis Stack?

Using Redis Stack allows:

  • fast metadata lookup
  • JSON document storage
  • semantic search
  • vector search
  • graph-like relationship traversal
  • low-latency repository operations

All of this avoids rescanning thousands of files on every operation.
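The core trick is hash-based change detection over cached metadata. The key layout and fields below are hypothetical (an in-memory dict standing in for Redis JSON documents), but they show why a lookup beats a rescan:

```python
# In-memory stand-in for Redis JSON documents keyed like `repo:<id>:file:<path>`.
# The key layout and fields are illustrative, not the project's actual schema.
store: dict[str, dict] = {}

def index_file(repo_id: str, path: str, sha: str, lines: int, symbols: list[str]) -> None:
    store[f"repo:{repo_id}:file:{path}"] = {"sha": sha, "lines": lines, "symbols": symbols}

def needs_rescan(repo_id: str, path: str, current_sha: str) -> bool:
    """Skip rereading a file whose content hash hasn't changed since the last scan."""
    doc = store.get(f"repo:{repo_id}:file:{path}")
    return doc is None or doc["sha"] != current_sha
```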

Goals

Immediate Goals

  • token reduction
  • smaller prompts
  • faster operations
  • better architectural discipline
  • symbol-aware operations
  • reduced hallucinations

Long-Term Goals

Autonomous Refactoring

Controlled, deterministic transformations.

Repo Memory

Persistent repository intelligence across sessions.

AI Discipline Layer

Prevent models from:

  • touching unrelated files
  • overengineering
  • introducing architectural drift

Multi-Model Routing

Use:

  • small local models for planning
  • larger models only for execution

Overnight Autonomous Runs

Allow local models to:

  • generate tests
  • fix build errors
  • propose improvements
  • benchmark themselves

Example Workflow

Normal AI Workflow

User request:

Refactor Redis error handling

Typical assistant:

  • scans entire repo
  • rereads unrelated files
  • modifies abstractions
  • changes naming conventions
  • rewrites architecture
  • burns tokens

TokenScope Workflow

Proxy:

  • detects target symbol
  • resolves call graph
  • extracts minimal dependencies
  • builds constrained prompt
  • limits operation scope
  • validates patch impact

Only necessary context is sent.
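The "validates patch impact" step can be sketched as a check that a returned diff touches only the files the operation scoped in (diff parsing simplified to unified-diff headers; illustrative only):

```python
def patched_files(diff: str) -> set[str]:
    """Extract target paths from unified-diff `+++ b/<path>` headers."""
    return {
        line[len("+++ b/"):].strip()
        for line in diff.splitlines()
        if line.startswith("+++ b/")
    }

def validate_patch_scope(diff: str, allowed: set[str]) -> list[str]:
    """Return the out-of-scope files a patch touches (empty list means OK)."""
    return sorted(patched_files(diff) - allowed)
```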

Planned Features

  • OpenAI-compatible API proxy
  • Ollama support
  • llama.cpp integration
  • Redis semantic cache
  • operation replay
  • telemetry dashboard
  • VS Code integration
  • GitHub Copilot interception
  • operation scoring
  • diff validation
  • build verification
  • policy engine
  • Roslyn-powered C# transforms
  • incremental indexing
  • AI safety rails
  • operation heatmaps
  • context compression

Example Stack

Frontend

  • SvelteKit
  • TailwindCSS
  • Flowbite
  • TypeScript

Backend

  • Node.js + SvelteKit server routes
  • Redis Stack
  • Docker / Podman

AI

  • llama.cpp
  • Ollama
  • local GGUF models
  • optional remote APIs

Development Status

Current stage:

  • heavy R&D
  • benchmarking infrastructure
  • repository indexing
  • operation analysis
  • prompt discipline experiments

This is not another "generate app in one click" toy.

The entire point is:

  • controlled generation
  • measurable operations
  • architectural stability
  • long-term maintainability

Running Locally

Start Redis Stack

podman run -d \
	--name redis-stack \
	-p 6379:6379 \
	-p 8001:8001 \
	docker.io/redis/redis-stack:latest

Or with Docker:

docker run -d \
	--name redis-stack \
	-p 6379:6379 \
	-p 8001:8001 \
	redis/redis-stack:latest

Install Dependencies

npm install

Run Dev Server

npm run dev

Visit the dashboard at http://localhost:5174/dashboard to visualize A/B tests and historical trends.

Run A/B Test

curl.exe -X POST "http://localhost:5173/api/ab/run" `
	-H "Content-Type: application/json" `
	-d '{ "repoId": "proxy", "task": "refactor RedisClient error handling" }'

Vision

AI coding tools are currently optimized for:

  • engagement
  • convenience
  • generation speed

This project optimizes for:

  • correctness
  • control
  • architectural preservation
  • operational efficiency
  • reduced waste

The future is not:

let the AI do everything

The future is:

disciplined orchestration with measurable constraints

About

Building this tool made me skeptical of the AI coding business model, because it exposed how much of the workflow is waste disguised as intelligence. A simple edit can trigger broad repo scans, repeated file reads, oversized prompts, and unrelated context, followed by a tiny junior-dev-style change at the end.
