🧠 AI Context Stack - Reduce Claude Code Tokens by 80-90%

Automatic context injection for Claude Code to dramatically reduce token usage

Anthropic recently cut token limits for Claude Code users. This tool helps you get your quota back by automatically injecting stored context instead of sending it every time.


🎯 The Problem

Every Claude Code conversation sends the same context repeatedly:

Without AI Context Stack:
├── Project structure: 800 tokens
├── Dependencies: 600 tokens
├── Coding standards: 400 tokens
├── Architecture: 700 tokens
├── API docs: 500 tokens
└── Previous decisions: 400 tokens
    TOTAL: 3,400 tokens EVERY conversation

Daily usage (10 conversations):
34,000 tokens wasted on repeated context

Monthly: 1,020,000 tokens (exceeds most plans!)

✨ The Solution

Store context once, reference it automatically:

With AI Context Stack:
├── First conversation: Store context (3,400 tokens one-time)
└── Every other conversation: Reference stored context (100 tokens)
    SAVINGS: 3,300 tokens per conversation (97%)

Daily savings: 33,000 tokens
Monthly savings: 990,000 tokens

It's like getting 2-3x your plan back!

💰 Cost & Requirements

Current Version (v1.0 - Proxy Mode)

What You Need:

  • ✅ Claude Code subscription ($20/month)
  • ✅ Anthropic API key (pay-per-use)
  • ✅ Docker & Docker Compose

Why Two Payments? The current version uses a proxy architecture that intercepts Claude Code's API calls and forwards them to Anthropic. This means you're using the public Anthropic API, which requires an API key.

Cost Breakdown:

Claude Code Subscription: $20/month (fixed)
+ Anthropic API Usage:    ~$0.15-1.00/month (with this tool)
─────────────────────────────────────────────────
TOTAL:                    ~$20.15-21.00/month

The Good News:

  • Even with dual payment, token savings make it worthwhile
  • API costs are minimal compared to hitting Claude Code limits
  • Coming Soon: Native MCP version (no API key needed!)

Future Version (v2.0 - Native MCP) 🚧

What You'll Need:

  • ✅ Claude Code subscription only ($20/month)
  • ❌ No Anthropic API key needed
  • ❌ No additional costs

We're actively working on a native MCP server implementation that integrates directly with Claude Code, eliminating the need for the proxy and API key entirely.

Timeline: Aiming for v2.0 release within 2-3 weeks.


🚀 Quick Start

Prerequisites

  • Docker & Docker Compose
  • Anthropic API key (for current version)
  • 5 minutes

Installation

# 1. Clone the repository
git clone https://github.com/aandersen2323/ai-context-stack.git
cd ai-context-stack

# 2. Run setup (installs everything)
./setup-claude-code.sh

# 3. Follow the prompts to add your API key

That's it! The system is now running.


💡 How It Works

┌─────────────────────────────────────┐
│  Your Claude Code                   │
│  (configured to use proxy)          │
└──────────────┬──────────────────────┘
               │ API Request
               ▼
┌─────────────────────────────────────┐
│  Token Proxy (port 8088)            │
│  1. Receives your request           │
│  2. Searches Memory MCP for context │
│  3. Injects relevant context        │
│  4. Forwards to Anthropic           │
└──────────────┬──────────────────────┘
               │
               ├──► Memory MCP (port 8015)
               │    Stores your project context
               │
               └──► Anthropic API
                    Gets request WITH context

Result: You save 80-90% of your tokens!
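The injection step above can be sketched in a few lines of shell. This is an illustrative toy only, not the proxy's actual implementation (the real logic is the TypeScript server in token-proxy/src/index.ts), and the bracketed section labels are invented for the example:

```shell
# Toy sketch of the proxy's injection step: stored context snippets
# are prepended to the user's prompt before forwarding upstream.

# Pretend these snippets came back from a Memory MCP search:
stored_context="Project: React e-commerce app with TypeScript
Standards: camelCase for JavaScript, snake_case for Python"

user_prompt="Build a login page"

# Prepend the stored context so the upstream request carries full
# context without the client resending it every conversation.
injected_prompt="[Stored context]
${stored_context}

[User]
${user_prompt}"

printf '%s\n' "$injected_prompt"
```

The client only ever pays for the short prompt; the proxy pays the context cost out of storage, once.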

📋 Usage Guide

1. Auto-Discover Your Project Context (Recommended!)

The easiest way to get started is to let the tool automatically scan your project:

# Run auto-discovery from your project directory
./scripts/auto-discover.sh /path/to/your/project

# Or from within your project:
cd /path/to/your/project
/path/to/ai-context-stack/scripts/auto-discover.sh .

What it discovers:

  • ✅ package.json → Project name, framework, dependencies, scripts
  • ✅ README.md → Project documentation
  • ✅ tsconfig.json → TypeScript configuration
  • ✅ .eslintrc.* → Code quality standards
  • ✅ .env.example → Required environment variables
  • ✅ docker-compose.yml → Architecture and services
  • ✅ vite.config.* / webpack.config.js → Build tools
  • ✅ Project structure → Folder organization patterns

Example output:

╔══════════════════════════════════════════════════════════════╗
║  AI Context Stack - Auto-Discovery                           ║
╚══════════════════════════════════════════════════════════════╝

📂 Scanning project: /my-react-app

🔍 Analyzing package.json...
  💾 Storing: package.json
     Tags: project,react,javascript
     ✅ Stored

🔍 Analyzing README.md...
  💾 Storing: README.md
     Tags: project,documentation,readme
     ✅ Stored

✅ Auto-Discovery Complete!
📊 Discovered and stored: 8 context items
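To see roughly what the package.json step does, here is a toy version in plain shell. It is illustrative only (the real logic lives in scripts/auto-discover.sh), and the file path and tag names are invented for the example:

```shell
# Toy sketch of the package.json analysis step in auto-discovery.
# Create a sample package.json to scan:
cat > /tmp/demo-package.json <<'EOF'
{
  "name": "my-react-app",
  "dependencies": { "react": "^18.2.0" }
}
EOF

# Extract the project name with plain text tools (no jq required):
name=$(sed -n 's/.*"name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' /tmp/demo-package.json)

# Tag the context based on what the dependencies mention:
tags="project"
grep -q '"react"' /tmp/demo-package.json && tags="$tags,react,javascript"

echo "Storing: $name (tags: $tags)"
# → Storing: my-react-app (tags: project,react,javascript)
```

The stored summary plus tags is what later conversations reference instead of the whole file.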

2. Manual Context Storage (Advanced)

For specific context not auto-discovered, store it manually:

# Project information
./scripts/store-context.sh \
  "Project: Building a React e-commerce app with TypeScript" \
  "project,react,typescript"

# Coding standards
./scripts/store-context.sh \
  "Use snake_case for Python, camelCase for JavaScript" \
  "standards,python,javascript"

# Architecture decisions
./scripts/store-context.sh \
  "API base URL: https://api.example.com, use JWT auth" \
  "api,config,architecture"

# Dependencies
./scripts/store-context.sh \
  "Using React 18, TypeScript 5, Vite, TailwindCSS" \
  "dependencies,react"

3. View Stored Context

./scripts/list-context.sh

Output:

📋 Stored Context:
==================
Total items: 4

📌 [project, react, typescript] Project: Building a React e-commerce app...
   Handle: mem_1234567890_abc
   Time: 2025-10-09T22:00:00Z

📌 [standards, python, javascript] Use snake_case for Python...
   Handle: mem_1234567891_def
   Time: 2025-10-09T22:01:00Z

4. Configure Claude Code

Option A: Using Claude Code directly

Edit your Claude Code configuration to point to the proxy:

{
  "apiEndpoint": "http://localhost:8088"
}

Option B: Environment variable

export ANTHROPIC_BASE_URL=http://localhost:8088

5. Use Claude Code Normally

Just use Claude Code as you normally would! The proxy automatically:

  1. Detects relevant context based on your conversation
  2. Injects it into your request
  3. Sends to Anthropic with full context
  4. Saves you 80-90% tokens!

🧪 Test the System

# Test the proxy is working
./scripts/test-proxy.sh

Expected output:

🧪 Testing Token Proxy...
========================

📤 Sending test request to proxy...
✅ Success!

Response: Hello! I'm Claude, happy to help you today.

📊 Token Usage:
   Input:  25 tokens
   Output: 12 tokens

📊 Real-World Example

Without AI Context Stack

Session 1: "Build a login page"
- Send project context: 500 tokens
- Send dependencies: 300 tokens
- Send standards: 200 tokens
- Your question: 50 tokens
TOTAL: 1,050 tokens

Session 2: "Add password reset"
- Send project context: 500 tokens
- Send dependencies: 300 tokens
- Send standards: 200 tokens
- Your question: 50 tokens
TOTAL: 1,050 tokens

10 sessions: 10,500 tokens

With AI Context Stack

Session 1: "Build a login page"
- Store project context: 500 tokens (one-time)
- Store dependencies: 300 tokens (one-time)
- Store standards: 200 tokens (one-time)
- Your question: 50 tokens
TOTAL: 1,050 tokens

Session 2: "Add password reset"
- Reference stored context: 50 tokens (automatic!)
- Your question: 50 tokens
TOTAL: 100 tokens

10 sessions: 1,950 tokens
SAVINGS: 8,550 tokens (81%)
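You can sanity-check those numbers with a few lines of shell arithmetic:

```shell
# Verify the 10-session comparison above with shell integer math.
sessions=10
without=$((1050 * sessions))      # every session resends full context

first_with=1050                   # one-time storage + question
later_with=100                    # stored-context reference + question
with=$((first_with + (sessions - 1) * later_with))

saved=$((without - with))
pct=$((saved * 100 / without))
echo "Saved: $saved tokens (${pct}%)"
# → Saved: 8550 tokens (81%)
```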

🎯 What to Store

✅ Good Things to Store

  • Project information: Name, purpose, tech stack
  • Dependencies: Frameworks, libraries, versions
  • Coding standards: Style guides, naming conventions
  • Architecture: Patterns, folder structure
  • API endpoints: Base URLs, authentication
  • Previous decisions: Why you chose X over Y
  • Common tasks: How to run tests, deploy, etc.

❌ Don't Store

  • Secrets (API keys, passwords) - use environment variables
  • Temporary information that changes often
  • User-specific data (privacy concerns)
  • Very short context (overhead > benefit)

πŸ› οΈ Commands Reference

# Start services
docker compose up -d

# Stop services
docker compose down

# View logs
docker compose logs -f

# View memory logs only
docker compose logs -f memory-mcp

# View proxy logs only
docker compose logs -f token-proxy

# Auto-discover project context
./scripts/auto-discover.sh /path/to/project

# Store context manually
./scripts/store-context.sh "text" "tag1,tag2"

# List all context
./scripts/list-context.sh

# Test proxy
./scripts/test-proxy.sh

# Restart services
docker compose restart

📈 Monitoring Token Savings

Method 1: Analytics Dashboard (NEW! ✨)

The easiest way to visualize your savings!

Access the real-time dashboard at: http://localhost:8088/dashboard

Dashboard Features

What you'll see:

  • 📊 Total tokens saved across all requests
  • 📉 Average reduction percentage (typically 80-90%)
  • 🎯 Context hit rate (how often context is found)
  • 📈 Hourly savings chart (last 24 hours)
  • 🕐 Recent requests table with detailed metrics
  • 🧠 Patterns learned by auto-learning system

Features:

  • ✅ Auto-refreshes every 5 seconds
  • ✅ Shows last 1,000 requests
  • ✅ Tracks hourly trends
  • ✅ No configuration needed

Quick Access:

# Open dashboard in your browser
open http://localhost:8088/dashboard

# Or get raw stats via API
curl http://localhost:8088/api/dashboard | jq

📖 Full Documentation: See ANALYTICS_DASHBOARD.md

Method 2: Log Monitoring

The proxy also logs every request showing token savings:

docker compose logs -f token-proxy

Look for lines like:

[Proxy] Searching memory with tags: react, project
[Proxy] Found 3 context items (~800 tokens)
[Proxy] 🎉 Estimated tokens saved: 800 (context auto-injected)

🔧 Configuration

Environment Variables

Edit .env file:

# Required
ANTHROPIC_API_KEY=sk-ant-your-key-here

# Optional
PORT=8088                                  # Proxy port
MEMORY_MCP_URL=http://localhost:8015       # Memory MCP URL
MEMORY_FILE_PATH=./memory.json             # Where to store memory

Auto-Tagging

The proxy automatically detects context based on keywords:

  • "python" β†’ searches for python tag
  • "react" β†’ searches for react tag
  • "api" β†’ searches for api tag
  • "project" β†’ searches for project tag
  • "architecture" β†’ searches for architecture tag

You can customize this in token-proxy/src/index.ts.
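A toy version of that keyword-to-tag detection looks like this in shell. It is illustrative only (the actual mapping is TypeScript code in token-proxy/src/index.ts), and the sample prompt is invented:

```shell
# Toy sketch of keyword-based auto-tagging: scan the prompt for known
# keywords and collect the matching memory tags.
prompt="How do I add a React component that calls our API?"

tags=""
for keyword in python react api project architecture; do
  # Case-insensitive substring match against the conversation text.
  # Note: plain substring matching also fires on words that merely
  # contain a keyword (e.g. "api" inside "rapid").
  echo "$prompt" | grep -qi "$keyword" && tags="$tags $keyword"
done

echo "Searching memory with tags:$tags"
# → Searching memory with tags: react api
```

The resulting tag list is what the proxy sends to the Memory MCP search before injecting whatever it finds.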


πŸ› Troubleshooting

Services won't start

# Check if ports are in use
lsof -i :8088
lsof -i :8015

# Check logs
docker compose logs

Context not being injected

# Verify memory has items
./scripts/list-context.sh

# Check proxy logs for "Found N context items"
docker compose logs token-proxy | grep "Found"

# Test proxy directly
./scripts/test-proxy.sh

Token usage still high

Make sure you:

  1. Stored relevant context with good tags
  2. Configured Claude Code to use the proxy
  3. Used tags that match your conversation topics

🎓 Advanced Usage

Custom Tags

# Store with multiple specific tags
./scripts/store-context.sh \
  "Database: PostgreSQL on localhost:5432, user: dev" \
  "database,postgresql,config,dev"

# Store with TTL (auto-delete after 30 days)
curl -X POST http://localhost:8015/tools/memory.pin \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Temporary: API endpoint moved to /v2/",
    "tags": ["api", "temp"],
    "ttl": 30
  }'

Search Context

# Search by tag
curl -X POST http://localhost:8015/tools/memory.search \
  -H "Content-Type: application/json" \
  -d '{"tags": ["react"]}' | jq

Delete Context

# Get handle from list-context.sh, then:
curl -X POST http://localhost:8015/tools/memory.unpin \
  -H "Content-Type: application/json" \
  -d '{"handle": "mem_1234567890_abc"}'

📊 Architecture

ai-context-stack/
├── token-proxy/           # Intercepts Claude API calls
│   ├── src/
│   │   └── index.ts      # Main proxy server
│   ├── Dockerfile
│   └── package.json
│
├── memory-mcp/           # Stores context with tagging
│   ├── src/
│   │   └── index.ts      # HTTP API wrapper for memory
│   ├── Dockerfile
│   └── package.json
│
├── scripts/              # Helper scripts
│   ├── store-context.sh  # Store new context
│   ├── list-context.sh   # List all context
│   └── test-proxy.sh     # Test the system
│
├── docker-compose.yml    # Orchestrates everything
├── setup-claude-code.sh  # One-command setup
└── README.md             # This file

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test with ./scripts/test-proxy.sh
  5. Submit a pull request

📄 License

MIT License - see LICENSE


πŸ™ Credits

Built with:


πŸ’¬ Support


⚡ Quick Reference

# Setup (first time)
./setup-claude-code.sh

# Auto-discover your project
./scripts/auto-discover.sh .

# Store additional context manually
./scripts/store-context.sh "Your context" "tag1,tag2"

# View all stored context
./scripts/list-context.sh

# Test it works
./scripts/test-proxy.sh

# View logs
docker compose logs -f token-proxy

# Stop
docker compose down

🎉 Start saving tokens today!

Configure Claude Code to use http://localhost:8088 and watch your token usage drop by 80-90%.

Questions? Open an issue on GitHub!
