Automatic context injection for Claude Code to dramatically reduce token usage
Anthropic recently cut token limits for Claude Code users. This tool helps you get your quota back by automatically injecting stored context instead of sending it every time.
Every Claude Code conversation sends the same context repeatedly:
Without AI Context Stack:
├── Project structure:  800 tokens
├── Dependencies:       600 tokens
├── Coding standards:   400 tokens
├── Architecture:       700 tokens
├── API docs:           500 tokens
└── Previous decisions: 400 tokens
TOTAL: 3,400 tokens EVERY conversation
Daily usage (10 conversations):
34,000 tokens wasted on repeated context
Monthly: 1,020,000 tokens (exceeds most plans!)
Store context once, reference it automatically:
With AI Context Stack:
├── First conversation: Store context (3,400 tokens, one-time)
└── Every other conversation: Reference stored context (100 tokens)
SAVINGS: 3,300 tokens per conversation (97%)
Daily savings: 33,000 tokens
Monthly savings: 990,000 tokens
It's like getting 2-3x your plan back!
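The arithmetic behind the savings claims above can be checked directly (figures are the example numbers from this README: 3,400 tokens of repeated context, a 100-token reference, 10 conversations per day, 30 days per month):

```python
# Token savings arithmetic using the example figures above.
CONTEXT_TOKENS = 3_400        # repeated context per conversation
REFERENCE_TOKENS = 100        # cost of referencing stored context instead
CONVERSATIONS_PER_DAY = 10
DAYS_PER_MONTH = 30

saved_per_conversation = CONTEXT_TOKENS - REFERENCE_TOKENS      # 3,300
reduction_pct = round(saved_per_conversation / CONTEXT_TOKENS * 100)  # 97
daily_savings = saved_per_conversation * CONVERSATIONS_PER_DAY  # 33,000
monthly_savings = daily_savings * DAYS_PER_MONTH                # 990,000

print(saved_per_conversation, reduction_pct, daily_savings, monthly_savings)
# 3300 97 33000 990000
```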
What You Need:
- ✅ Claude Code subscription ($20/month)
- ✅ Anthropic API key (pay-per-use)
- ✅ Docker & Docker Compose
Why Two Payments? The current version uses a proxy architecture that intercepts Claude Code's API calls and forwards them to Anthropic. This means you're using the public Anthropic API, which requires an API key.
Cost Breakdown:
Claude Code Subscription: $20/month (fixed)
+ Anthropic API Usage: ~$0.15-1.00/month (with this tool)
─────────────────────────────────────────────────
TOTAL: ~$20.15-21.00/month
The Good News:
- Even with dual payment, token savings make it worthwhile
- API costs are minimal compared to hitting Claude Code limits
- Coming Soon: Native MCP version (no API key needed!)
What You'll Need:
- ✅ Claude Code subscription only ($20/month)
- ✅ No Anthropic API key needed
- ✅ No additional costs
We're actively working on a native MCP server implementation that integrates directly with Claude Code, eliminating the need for the proxy and API key entirely.
Timeline: Aiming for v2.0 release within 2-3 weeks.
- Docker & Docker Compose
- Anthropic API key (for current version)
- 5 minutes
# 1. Clone the repository
git clone https://github.com/aandersen2323/ai-context-stack.git
cd ai-context-stack
# 2. Run setup (installs everything)
./setup-claude-code.sh
# 3. Follow the prompts to add your API key

That's it! The system is now running.
┌──────────────────────────────────────┐
│         Your Claude Code             │
│     (configured to use proxy)        │
└───────────────┬──────────────────────┘
                │ API Request
                ▼
┌──────────────────────────────────────┐
│       Token Proxy (port 8088)        │
│  1. Receives your request            │
│  2. Searches Memory MCP for context  │
│  3. Injects relevant context         │
│  4. Forwards to Anthropic            │
└───────────────┬──────────────────────┘
                │
                ├──► Memory MCP (port 8015)
                │      Stores your project context
                │
                └──► Anthropic API
                       Gets request WITH context
Result: You save 80-90% tokens!
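The flow in the diagram above can be sketched in a few lines (function names here are hypothetical; the real implementation lives in token-proxy/src/index.ts):

```python
# Minimal sketch of the proxy's inject-then-forward flow.
# search_memory and forward_to_anthropic stand in for the Memory MCP
# lookup and the upstream Anthropic call; both names are illustrative.

def handle_request(messages, search_memory, forward_to_anthropic):
    """Inject stored context into a Claude Code request before forwarding."""
    # 1-2. Receive the request and search Memory MCP using the latest message
    query = messages[-1]["content"]
    context_items = search_memory(query)

    # 3. Inject any matching context as a prefix message
    if context_items:
        context_text = "\n".join(item["text"] for item in context_items)
        messages = [{"role": "user", "content": f"Context:\n{context_text}"}] + messages

    # 4. Forward the augmented request to Anthropic
    return forward_to_anthropic(messages)
```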
The easiest way to get started - let the tool automatically scan your project:
# Run auto-discovery from your project directory
./scripts/auto-discover.sh /path/to/your/project
# Or from within your project:
cd /path/to/your/project
/path/to/ai-context-stack/scripts/auto-discover.sh .

What it discovers:
- ✅ package.json → Project name, framework, dependencies, scripts
- ✅ README.md → Project documentation
- ✅ tsconfig.json → TypeScript configuration
- ✅ .eslintrc.* → Code quality standards
- ✅ .env.example → Required environment variables
- ✅ docker-compose.yml → Architecture and services
- ✅ vite.config.* / webpack.config.js → Build tools
- ✅ Project structure → Folder organization patterns
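The idea behind auto-discovery can be sketched as a scan for well-known files (the file list comes from this README; the tag mapping and function name are illustrative, not the script's actual internals):

```python
# Sketch: scan a project directory for recognized config files
# and derive tags for each one found.
from pathlib import Path

KNOWN_FILES = {
    "package.json": ["project", "dependencies"],
    "README.md": ["project", "documentation", "readme"],
    "tsconfig.json": ["typescript", "config"],
    ".env.example": ["config", "environment"],
    "docker-compose.yml": ["architecture", "services"],
}

def discover(project_dir):
    """Return (filename, tags) for each recognized file present."""
    root = Path(project_dir)
    return [(name, tags) for name, tags in KNOWN_FILES.items()
            if (root / name).exists()]
```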
Example output:
┌────────────────────────────────────────────────────────────┐
│              AI Context Stack - Auto-Discovery             │
└────────────────────────────────────────────────────────────┘
🔍 Scanning project: /my-react-app
📄 Analyzing package.json...
💾 Storing: package.json
   Tags: project,react,javascript
✅ Stored
📄 Analyzing README.md...
💾 Storing: README.md
   Tags: project,documentation,readme
✅ Stored
✅ Auto-Discovery Complete!
📊 Discovered and stored: 8 context items
For specific context not auto-discovered, store it manually:
# Project information
./scripts/store-context.sh \
"Project: Building a React e-commerce app with TypeScript" \
"project,react,typescript"
# Coding standards
./scripts/store-context.sh \
"Use snake_case for Python, camelCase for JavaScript" \
"standards,python,javascript"
# Architecture decisions
./scripts/store-context.sh \
"API base URL: https://api.example.com, use JWT auth" \
"api,config,architecture"
# Dependencies
./scripts/store-context.sh \
"Using React 18, TypeScript 5, Vite, TailwindCSS" \
"dependencies,react"

./scripts/list-context.sh

Output:
📋 Stored Context:
==================
Total items: 4

📌 [project, react, typescript] Project: Building a React e-commerce app...
   Handle: mem_1234567890_abc
   Time:   2025-10-09T22:00:00Z

📌 [standards, python, javascript] Use snake_case for Python...
   Handle: mem_1234567891_def
   Time:   2025-10-09T22:01:00Z
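Each listed item follows the same shape: a unique handle, a tag list, the stored text, and a timestamp. A sketch of how such a record might be built (field names match the output above; the generation logic is an assumption, not the actual Memory MCP code):

```python
# Hypothetical sketch of the stored-context record shape shown above.
import random
import string
import time

def make_record(text, tags):
    """Build a context record with a mem_<millis>_<suffix> handle."""
    suffix = "".join(random.choices(string.ascii_lowercase, k=3))
    return {
        "handle": f"mem_{int(time.time() * 1000)}_{suffix}",
        "tags": tags.split(","),          # "project,react" -> ["project", "react"]
        "text": text,
        "time": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
```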
Option A: Using Claude Code directly
Edit your Claude Code configuration to point to the proxy:
{
"apiEndpoint": "http://localhost:8088"
}

Option B: Environment variable
export ANTHROPIC_BASE_URL=http://localhost:8088

Just use Claude Code as you normally would! The proxy automatically:
- Detects relevant context based on your conversation
- Injects it into your request
- Sends to Anthropic with full context
- Saves you 80-90% tokens!
# Test the proxy is working
./scripts/test-proxy.sh

Expected output:
🧪 Testing Token Proxy...
========================
📤 Sending test request to proxy...
✅ Success!
Response: Hello! I'm Claude, happy to help you today.
📊 Token Usage:
   Input: 25 tokens
   Output: 12 tokens
Without AI Context Stack:

Session 1: "Build a login page"
- Send project context: 500 tokens
- Send dependencies: 300 tokens
- Send standards: 200 tokens
- Your question: 50 tokens
TOTAL: 1,050 tokens
Session 2: "Add password reset"
- Send project context: 500 tokens
- Send dependencies: 300 tokens
- Send standards: 200 tokens
- Your question: 50 tokens
TOTAL: 1,050 tokens
10 sessions: 10,500 tokens
With AI Context Stack:

Session 1: "Build a login page"
- Store project context: 500 tokens (one-time)
- Store dependencies: 300 tokens (one-time)
- Store standards: 200 tokens (one-time)
- Your question: 50 tokens
TOTAL: 1,050 tokens
Session 2: "Add password reset"
- Reference stored context: 50 tokens (automatic!)
- Your question: 50 tokens
TOTAL: 100 tokens
10 sessions: 1,950 tokens
SAVINGS: 8,550 tokens (81%)
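The ten-session comparison above reduces to simple arithmetic (all figures are the README's example numbers):

```python
# Ten-session token comparison from the example above.
without = 1_050 * 10            # every session resends the full context
with_tool = 1_050 + 9 * 100     # session 1 stores context; sessions 2-10 reference it
savings = without - with_tool
pct = round(savings / without * 100)

print(without, with_tool, savings, pct)
# 10500 1950 8550 81
```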
What to store:
- Project information: Name, purpose, tech stack
- Dependencies: Frameworks, libraries, versions
- Coding standards: Style guides, naming conventions
- Architecture: Patterns, folder structure
- API endpoints: Base URLs, authentication
- Previous decisions: Why you chose X over Y
- Common tasks: How to run tests, deploy, etc.
What NOT to store:
- Secrets (API keys, passwords) - use environment variables
- Temporary information that changes often
- User-specific data (privacy concerns)
- Very short context (overhead > benefit)
# Start services
docker compose up -d
# Stop services
docker compose down
# View logs
docker compose logs -f
# View memory logs only
docker compose logs -f memory-mcp
# View proxy logs only
docker compose logs -f token-proxy
# Auto-discover project context
./scripts/auto-discover.sh /path/to/project
# Store context manually
./scripts/store-context.sh "text" "tag1,tag2"
# List all context
./scripts/list-context.sh
# Test proxy
./scripts/test-proxy.sh
# Restart services
docker compose restart

The easiest way to visualize your savings!
Access the real-time dashboard at: http://localhost:8088/dashboard
What you'll see:
- 💰 Total tokens saved across all requests
- 📉 Average reduction percentage (typically 80-90%)
- 🎯 Context hit rate (how often context is found)
- 📈 Hourly savings chart (last 24 hours)
- 📋 Recent requests table with detailed metrics
- 🧠 Patterns learned by the auto-learning system
Features:
- ✅ Auto-refreshes every 5 seconds
- ✅ Shows last 1,000 requests
- ✅ Tracks hourly trends
- ✅ No configuration needed
Quick Access:
# Open dashboard in your browser
open http://localhost:8088/dashboard
# Or get raw stats via API
curl http://localhost:8088/api/dashboard | jq

📖 Full Documentation: See ANALYTICS_DASHBOARD.md
The proxy also logs every request showing token savings:
docker compose logs -f token-proxy

Look for lines like:
[Proxy] Searching memory with tags: react, project
[Proxy] Found 3 context items (~800 tokens)
[Proxy] 💰 Estimated tokens saved: 800 (context auto-injected)
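If you want a running total across a log dump, the savings lines can be tallied with a few lines of Python (the log format is taken from the lines above; the regex and function name are illustrative):

```python
# Sketch: sum "Estimated tokens saved: N" figures from proxy log lines.
import re

PATTERN = re.compile(r"Estimated tokens saved: (\d+)")

def total_saved(log_lines):
    """Sum the token-savings figures reported in proxy log lines."""
    return sum(int(m.group(1))
               for line in log_lines
               if (m := PATTERN.search(line)))
```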
Edit .env file:
# Required
ANTHROPIC_API_KEY=sk-ant-your-key-here
# Optional
PORT=8088 # Proxy port
MEMORY_MCP_URL=http://localhost:8015 # Memory MCP URL
MEMORY_FILE_PATH=./memory.json         # Where to store memory

The proxy automatically detects context based on keywords:
- "python" → searches for python tag
- "react" → searches for react tag
- "api" → searches for api tag
- "project" → searches for project tag
- "architecture" → searches for architecture tag
You can customize this in token-proxy/src/index.ts.
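The keyword-to-tag mapping above amounts to a small lookup over the message text. A sketch of that logic (in Python for illustration; the real version is the TypeScript in token-proxy/src/index.ts):

```python
# Sketch of keyword-based tag detection, using the mapping listed above.
KEYWORD_TAGS = {
    "python": "python",
    "react": "react",
    "api": "api",
    "project": "project",
    "architecture": "architecture",
}

def detect_tags(message):
    """Return the memory tags to search, based on keywords in the message."""
    lower = message.lower()
    return sorted({tag for keyword, tag in KEYWORD_TAGS.items()
                   if keyword in lower})
```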
# Check if ports are in use
lsof -i :8088
lsof -i :8015
# Check logs
docker compose logs

# Verify memory has items
./scripts/list-context.sh
# Check proxy logs for "Found N context items"
docker compose logs token-proxy | grep "Found"
# Test proxy directly
./scripts/test-proxy.sh

Make sure you:
- Stored relevant context with good tags
- Configured Claude Code to use proxy
- Tags in context match your conversation topics
# Store with multiple specific tags
./scripts/store-context.sh \
"Database: PostgreSQL on localhost:5432, user: dev" \
"database,postgresql,config,dev"
# Store with TTL (auto-delete after 30 days)
curl -X POST http://localhost:8015/tools/memory.pin \
-H "Content-Type: application/json" \
-d '{
"text": "Temporary: API endpoint moved to /v2/",
"tags": ["api", "temp"],
"ttl": 30
}'

# Search by tag
curl -X POST http://localhost:8015/tools/memory.search \
-H "Content-Type: application/json" \
-d '{"tags": ["react"]}' | jq

# Get handle from list-context.sh, then:
curl -X POST http://localhost:8015/tools/memory.unpin \
-H "Content-Type: application/json" \
-d '{"handle": "mem_1234567890_abc"}'

ai-context-stack/
├── token-proxy/              # Intercepts Claude API calls
│   ├── src/
│   │   └── index.ts          # Main proxy server
│   ├── Dockerfile
│   └── package.json
│
├── memory-mcp/               # Stores context with tagging
│   ├── src/
│   │   └── index.ts          # HTTP API wrapper for memory
│   ├── Dockerfile
│   └── package.json
│
├── scripts/                  # Helper scripts
│   ├── store-context.sh      # Store new context
│   ├── list-context.sh       # List all context
│   └── test-proxy.sh         # Test the system
│
├── docker-compose.yml        # Orchestrates everything
├── setup-claude-code.sh      # One-command setup
└── README.md                 # This file
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Test with ./scripts/test-proxy.sh
- Submit a pull request
MIT License - see LICENSE
Built with:
- Model Context Protocol by Anthropic
- Express.js for HTTP servers
- Docker for easy deployment
- 🐛 Issues: https://github.com/aandersen2323/ai-context-stack/issues
- 💡 Discussions: https://github.com/aandersen2323/ai-context-stack/discussions
# Setup (first time)
./setup-claude-code.sh
# Auto-discover your project
./scripts/auto-discover.sh .
# Store additional context manually
./scripts/store-context.sh "Your context" "tag1,tag2"
# View all stored context
./scripts/list-context.sh
# Test it works
./scripts/test-proxy.sh
# View logs
docker compose logs -f token-proxy
# Stop
docker compose down

🎉 Start saving tokens today!
Configure Claude Code to use http://localhost:8088 and watch your token usage drop by 80-90%.
Questions? Open an issue on GitHub!
