The LocalStack for AI Agents: an enterprise-grade mock API platform for OpenAI, Anthropic, and Google Gemini. Develop, test, and scale AI agents locally without burning API credits.
Save $500-$50,000/month on API costs while making tests 80× faster and 100% reliable. "If you are building Agents, you need this."
The easiest way to run AI LocalStack is using Docker Compose. This spins up the API, Database, Vector Store, and Redis cache instantly.
On Windows, simply run the start_windows.bat file in the root directory.
It will:
- Check Docker status
- Build the project
- Open the Dashboard in your browser automatically.
Navigate to the ai-localstack backend directory and start the services:

```bash
cd ai-localstack
docker-compose up --build
```

This will start the following services:
- API: http://localhost:8000
- Qdrant (Vector DB): localhost:6333
- PostgreSQL: localhost:5432
- Redis: localhost:6379
```bash
# Health check
curl http://localhost:8000/health
# {"status":"ok"}
```

To use the "Magic Record" feature (proxy to the real OpenAI API once -> save -> replay forever):
- Edit Environment: open `ai-localstack/.env` and add:

  ```env
  AI_LOCALSTACK_PROXY_MODE=True
  AI_LOCALSTACK_OPENAI_REAL_KEY=sk-your-real-openai-key
  ```
- Restart: run `docker-compose restart`.
- Usage:
  - Run your agent/tests.
  - The first time you ask a question, it hits OpenAI (costs money).
  - The answer is automatically saved to Qdrant.
  - The second time (and forever after), it is served from LocalStack (free); see the sketch below.
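For example, running the same prompt twice demonstrates the record/replay cycle. A minimal sketch using the standard openai client (the prompt is illustrative):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-test-key")

question = [{"role": "user", "content": "Explain the CAP theorem in one sentence."}]

# First call: proxied to the real OpenAI API and recorded in Qdrant (costs money once)
first = client.chat.completions.create(model="gpt-4", messages=question)

# Same question again: replayed from the local recording (free, forever)
second = client.chat.completions.create(model="gpt-4", messages=question)
print(second.choices[0].message.content)
```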
Access all models via the OpenAI-compatible endpoint: http://localhost:8000/v1/chat/completions
| Provider | Models | Features |
|---|---|---|
| OpenAI | gpt-4, gpt-4-turbo, gpt-3.5-turbo, gpt-5.2 | Streaming, Func Calls, Reasoning |
| Google | gemini-3-pro, gemini-3-deep-think | Context Window, Deep Reasoning |
| Anthropic | claude-3-opus, claude-opus-4.5 | System Prompts, Large Context |
| DeepSeek | deepseek-v3.2 | Coding Specialization |
| xAI | grok-4.20 | Fun/Humorous Responses |
AI LocalStack unifies all providers under the OpenAI-compatible interface.
You do not need to install google-generativeai, anthropic, or other SDKs.
Simply use the standard openai library and change the model parameter.
The system automatically:
- Detects the provider (Google, Anthropic, DeepSeek, etc.)
- Simulates that provider's specific behavior and pricing.
- Returns the response in standard OpenAI format.
This allows you to verify your agents against ANY model using a single, unified codebase.
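For example, the same client can exercise every provider in the table above by swapping only the model string. A sketch (model names taken from the table):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-test-key")

# One unified codebase: only the model parameter changes per provider
for model in ["gpt-4", "gemini-3-pro", "claude-3-opus", "deepseek-v3.2", "grok-4.20"]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
    )
    print(f"{model}: {resp.choices[0].message.content}")
```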
We provide complete, runnable examples in the ai-localstack/examples/ directory.
Using the standard openai Python library:

```python
from openai import OpenAI

# 1. Connect to LocalStack
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="sk-test-key"  # Any key works in Free Tier
)

# 2. Chat Completion
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

# 3. Streaming Code Generation
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a python function"}],
    stream=True
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```

Real-world simulation of creating an account, logging in, and using a secure token. See examples/utils.py for the reusable helper implementation.
```python
import httpx

API_BASE = "http://localhost:8000/v1"

# 1. Register Organization
try:
    auth_resp = httpx.post(f"{API_BASE}/auth/register", json={
        "email": "user@corp.com",
        "password": "strong_password",
        "organization_name": "My Corp"
    })
    auth_resp.raise_for_status()  # falls through to login if the account already exists
    token = auth_resp.json()["access_token"]
except (httpx.HTTPStatusError, KeyError):
    # 2. Login (account already exists)
    auth_resp = httpx.post(f"{API_BASE}/auth/login", json={
        "email": "user@corp.com",
        "password": "strong_password"
    })
    token = auth_resp.json()["access_token"]
print(f"Authenticated with JWT: {token[:10]}...")AI LocalStack supports tools definitions in the request. Currently, it mocks the agentic behavior by analyzing the tool definitions and returning descriptive responses.
AI LocalStack supports tool definitions in the request. Currently, it mocks the agentic behavior by analyzing the tool definitions and returning descriptive responses.

```python
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Check weather in SF"}],
    tools=[...]
)
# Returns a mock response acknowledging the tool context
```
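The tools list above is elided. A fuller sketch with a hypothetical get_weather tool, using the standard OpenAI tools schema (the tool name and parameters are illustrative):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-test-key")

# Hypothetical tool definition, for illustration only
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Check weather in SF"}],
    tools=tools,
)
print(response.choices[0].message.content)
```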
To use CrewAI with LocalStack, simply point it to localhost.

```python
import os
from crewai import Agent, Task, Crew

# 1. Point to LocalStack
os.environ["OPENAI_API_BASE"] = "http://localhost:8000/v1"
os.environ["OPENAI_API_KEY"] = "sk-test"
os.environ["OPENAI_MODEL_NAME"] = "gpt-4"

# 2. Define Agent (Standard Code)
researcher = Agent(
    role='Researcher',
    goal='Find news',
    backstory='You are an AI assistant',
    verbose=True
)

# 3. Execution hits LocalStack automatically
task = Task(
    description='Find the latest AI news',
    expected_output='A short summary',
    agent=researcher
)
crew = Crew(agents=[researcher], tasks=[task])
result = crew.kickoff()
print(result)
```
Configure the server using environment variables in `ai-localstack/.env`:

| Variable | Default | Description |
|---|---|---|
| `AI_LOCALSTACK_API_PORT` | 8000 | Mock API port |
| `AI_LOCALSTACK_ENVIRONMENT` | development | Set to production for strict mode |
| `AI_LOCALSTACK_JWT_SECRET` | change-me | Secret for generating JWTs |
| `AI_LOCALSTACK_MOCK_TOKEN_DELAY_MS` | 15 | IMPORTANT: Controls streaming speed (lower = faster) |
| `AI_LOCALSTACK_QDRANT_URL` | http://localhost:6333 | Connect semantic search here |
| `AI_LOCALSTACK_FREE_TIER_DAILY_REQUESTS` | 1000 | Quota limit |
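Putting the defaults together, a minimal `ai-localstack/.env` might look like this (all values shown are the documented defaults):

```env
AI_LOCALSTACK_API_PORT=8000
AI_LOCALSTACK_ENVIRONMENT=development
AI_LOCALSTACK_JWT_SECRET=change-me
AI_LOCALSTACK_MOCK_TOKEN_DELAY_MS=15
AI_LOCALSTACK_QDRANT_URL=http://localhost:6333
AI_LOCALSTACK_FREE_TIER_DAILY_REQUESTS=1000
```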
You can define your own mock patterns in ai-localstack/config/mock_config.yaml.
Edit the patterns section to add specific keywords that trigger instant responses.
```yaml
patterns:
  my_custom_trigger:
    triggers: ["deploy", "ship it", "release"]
    response_template: "deployment_response"
```
Define what the bot says when triggered:

```yaml
responses:
  deployment_response: |
    Deployment initiated! Status: SUCCESS.
    (This is a custom mock response)
```

For "Smart Mocks" that don't rely on exact keywords:
- Ensure Qdrant is running (`docker-compose up`).
- The system automatically embeds your responses from mock_config.yaml on startup.
- If a user query matches the meaning of a template (e.g., "start the build" matching "ship it"), it will return that template, as shown below.
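To sanity-check a semantic match, send a paraphrase of a trigger rather than the trigger itself. A sketch assuming the deployment_response template from the config above:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-test-key")

# "start the build" is not a literal trigger, but is semantically close to "ship it"
resp = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "start the build"}],
)
print(resp.choices[0].message.content)  # Expect the deployment_response template
```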
AI LocalStack allows you to Share Mocks with your team. One person records the API traffic (Magic Mode) -> Everyone else uses it offline.
- Export:
  - Go to http://localhost:8000/dashboard
  - Click [DOWNLOAD BACKUP]
  - You get a `localstack_mocks.json` file.
- Commit:
  - Save this file in your git repo (e.g., `tests/mocks/data.json`).
- Import:
  - Teammates go to their Dashboard.
  - Click [RESTORE] and select the file.
  - Now their LocalStack is populated with your specific test cases!
- Pattern Matcher (<1ms): Instant regex responses.
- Semantic Search (~50ms): Vector similarity matching.
- Conversation FSM (~5ms): Context-aware state machine.
- Template Engine (~10ms): Dynamic Jinja2 responses.
- Magic Record Mode (Level 5): Proxies to real OpenAI once, saves the result, and replays it forever for free.
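Conceptually, Levels 1-4 form a fallback chain: each level is tried in order of cost, and the first match wins. A minimal illustrative sketch, not the actual implementation:

```python
from typing import Callable, Optional

def pattern_match(query: str) -> Optional[str]:
    # Level 1: instant keyword/regex lookup
    return "Deployment initiated! Status: SUCCESS." if "deploy" in query.lower() else None

def semantic_match(query: str) -> Optional[str]:
    return None  # Level 2 placeholder: vector-similarity lookup in Qdrant

def fsm_match(query: str) -> Optional[str]:
    return None  # Level 3 placeholder: conversation state machine

def template_response(query: str) -> str:
    return f"Mock response for: {query}"  # Level 4: always answers

LEVELS: list[Callable[[str], Optional[str]]] = [pattern_match, semantic_match, fsm_match]

def respond(query: str) -> str:
    for level in LEVELS:
        answer = level(query)
        if answer is not None:
            return answer
    return template_response(query)

print(respond("deploy the new build"))  # resolved at Level 1 (<1ms)
```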
Q: I get "Connection refused" on port 8000.
A: Ensure Docker is running. On Windows, check whether another service is using port 8000, and edit `.env` to change `AI_LOCALSTACK_API_PORT` if needed.
Q: pip install openai fails?
A: You likely have a permission issue or a Python version mismatch. Use a virtual environment (python -m venv venv) as recommended.
Q: How do I reset the database?
A: docker-compose down -v will remove volumes (DB data) and reset everything.
```mermaid
graph TD
    User[User/Agent] -->|HTTP Request| API[FastAPI Gateway]
    API --> Middleware[Auth & Rate Limit]
    Middleware --> Routing[Provider Router]
    Routing -->|OpenAI/Anthropic| MockEngine[Mock Engine]
    subgraph Mock Engine
        P[Pattern Matcher] -->|No Match| S[Semantic Search]
        S -->|No Match| F[Conversation FSM]
        F -->|No Match| T[Template Logic]
    end
    MockEngine --> Response[JSON Response]
    Response --> User
```
AGPL-3.0