Run Claude Code with Gemma 4, Gemini, or any Google GenAI model through a local proxy — no Anthropic API key needed.
Drop-in local proxy that lets you use Gemma 4, Gemini, or any Google GenAI model with Claude Code. No changes to your Claude Code workflow — just point it at the proxy.
- Anthropic Messages API compatible — works transparently with Claude Code and any Anthropic SDK client
- Dynamic model passthrough — use any model you specify in the request (Gemma 4, Gemini, etc.)
- Streaming support — full SSE streaming with proper block indexing
- Tool / function calling — Claude Code's built-in tools (read, write, edit, bash) work via function calling translation
- Google Search grounding — optional web search via Google Search (enabled by default)
- Thinking config — control reasoning depth via `thinking.budget_tokens`
- Retry logic — automatic retries with exponential backoff on transient failures
- Connection handling — detects client disconnects mid-stream, avoids wasted API calls
- Python 3.10+
- Google AI Studio API key — get one free
```bash
# 1. Clone the repo
git clone https://github.com/yourusername/genai-code.git
cd genai-code

# 2. Install dependencies
pip install -r requirements.txt

# 3. Set your API key
cp .env.example .env
# Edit .env and replace with your key: GEMINI_API_KEY=your_key_here

# 4. Start the proxy
uvicorn main:app --host 0.0.0.0 --port 8000
```

Verify it's running:

```bash
curl http://localhost:8000/health
# {"status":"ok","gemini_api_key_set":true,"model":"gemma-4-26b-a4b-it",...}
```

Create a `.claude/settings.json` file in your project root (or global `~/.claude/settings.json`):
```json
{
  "$schema": "https://json.schemastore.org/claude-code-settings.json",
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "sk-my-local-proxy-key",
    "ANTHROPIC_BASE_URL": "http://localhost:8000",
    "ANTHROPIC_MODEL": "gemma-4-26b-a4b-it",
    "ANTHROPIC_SMALL_FAST_MODEL": "gemma-4-26b-a4b-it"
  }
}
```

Then run `claude` normally — all requests route through Gemma 4.
Tip: Change `ANTHROPIC_MODEL` to any GenAI model (e.g., `gemini-2.5-flash`, `gemma-4-29b-a4b-it`). The proxy forwards the model name directly to the GenAI API.
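If you prefer not to edit a settings file, Claude Code also reads these values from the environment. A sketch of the equivalent shell setup (the token value is arbitrary, matching the settings example; the proxy does not appear to validate it):

```shell
# Same configuration as .claude/settings.json, as shell exports
export ANTHROPIC_BASE_URL=http://localhost:8000
export ANTHROPIC_AUTH_TOKEN=sk-my-local-proxy-key
export ANTHROPIC_MODEL=gemma-4-26b-a4b-it
export ANTHROPIC_SMALL_FAST_MODEL=gemma-4-26b-a4b-it
```

Run `claude` from the same shell and requests route through the proxy.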
All endpoints are Anthropic Messages API compatible.
| Method | Path | Description |
|---|---|---|
| GET | `/` | Status check |
| GET | `/health` | Health + API key status |
| GET | `/v1/models` | List available models |
| POST | `/v1/messages` | Chat endpoint (supports `stream: true/false`) |
| POST | `/v1/messages/stream` | Always-streaming alias |
| GET | `/v1/tools` | List proxy built-in tools |
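For clients not using an Anthropic SDK, the streaming endpoints emit server-sent events. A minimal parser sketch for the wire format (event names follow the Anthropic Messages streaming convention; the proxy's exact payloads may differ):

```python
import json


def parse_sse(lines):
    """Yield (event_name, payload_dict) pairs from 'event:'/'data:' lines."""
    event = None
    for line in lines:
        line = line.strip()
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            payload = json.loads(line[len("data:"):].strip())
            yield event, payload


# Example: one text delta as Claude Code would receive it
raw = [
    "event: content_block_delta",
    'data: {"type":"content_block_delta","index":0,'
    '"delta":{"type":"text_delta","text":"Hi"}}',
    "",
]
events = list(parse_sse(raw))
# events[0] is ("content_block_delta", {...}) with delta text "Hi"
```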
```json
{
  "model": "gemma-4-26b-a4b-it",
  "messages": [
    { "role": "user", "content": "Write a Python function to sort a list." }
  ],
  "max_tokens": 4096,
  "stream": false,
  "system": "You are an expert Python developer. Write clean, typed code.",
  "temperature": 0.2,
  "enable_google_search": true,
  "thinking": {
    "type": "enabled",
    "budget_tokens": 10000
  },
  "tools": [
    {
      "name": "get_weather",
      "description": "Get weather for a location",
      "input_schema": {
        "type": "object",
        "properties": {
          "location": { "type": "string", "description": "City name" }
        },
        "required": ["location"]
      }
    }
  ]
}
```

| Field | Type | Default | Description |
|---|---|---|---|
| `model` | string | `gemma-4-26b-a4b-it` | Passed through to GenAI — use any supported model |
| `enable_google_search` | bool | `true` | Enables Google Search grounding for factual queries |
| `thinking` | object | — | Maps `budget_tokens` to GenAI thinking level: <2k → LOW, 2k–8k → MEDIUM, ≥8k → HIGH |
| `tools` | array | — | Custom function-calling tools in Anthropic tool format |
Note: When `tools` are provided, Google Search is automatically disabled to avoid SDK conflicts.
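The budget-to-level mapping documented for `thinking` can be expressed as a small function. A sketch of the stated thresholds (not the proxy's actual code):

```python
def thinking_level(budget_tokens: int) -> str:
    """Map an Anthropic thinking budget to a GenAI thinking level.

    Thresholds per the field table: <2k -> LOW, 2k-8k -> MEDIUM, >=8k -> HIGH.
    """
    if budget_tokens < 2000:
        return "LOW"
    if budget_tokens < 8000:
        return "MEDIUM"
    return "HIGH"
```

So the example request above, with `budget_tokens: 10000`, would run at HIGH.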
Non-streaming:

```bash
curl http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: dummy" \
  -d '{
    "model": "gemma-4-26b-a4b-it",
    "messages": [{"role":"user","content":"What is 2+2?"}],
    "max_tokens": 256
  }'
```

Streaming:

```bash
curl http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-4-26b-a4b-it",
    "messages": [{"role":"user","content":"Tell me a joke"}],
    "stream": true
  }'
```

| Variable | Required | Description |
|---|---|---|
| `GEMINI_API_KEY` | Yes | Your Google AI Studio API key |

Set via the `.env` file or export directly:

```bash
export GEMINI_API_KEY=your_key_here
```

Any model available through the Google GenAI SDK should work, including:

- `gemma-4-26b-a4b-it` (default)
- `gemma-4-29b-a4b-it`
- `gemini-2.5-flash`
- `gemini-2.5-pro`

Pass the model name in your request or via `ANTHROPIC_MODEL` in Claude Code settings.
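Building a request body for any of these models is the same shape as the curl examples. A sketch (`build_payload` is a hypothetical helper for illustration, not part of the proxy):

```python
def build_payload(model: str, prompt: str, *, stream: bool = False,
                  max_tokens: int = 1024) -> dict:
    """Build a minimal Anthropic Messages-format request body for the proxy."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "stream": stream,
    }


payload = build_payload("gemini-2.5-flash", "What is 2+2?")
# POST this dict as JSON to http://localhost:8000/v1/messages
```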
```
Claude Code ──POST /v1/messages──▶ FastAPI Proxy ──generateContent──▶ Google GenAI API
     ▲                                │      ▲                              │
     │        SSE stream / JSON       │      │       Gemma 4 / Gemini       │
     └────────────────────────────────┘      └──────────────────────────────┘
```
- Claude Code sends Anthropic-format requests to the proxy
- Proxy converts messages/tools to GenAI `Content` and `Tool` formats
- GenAI API response is converted back to Anthropic format (`tool_use`/`text` blocks)
- Responses include proper `anthropic-version` headers and error shapes
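The message-conversion step amounts to mapping Anthropic roles and content onto GenAI `Content` parts. A simplified, text-only sketch (the actual proxy additionally translates `tool_use`/`tool_result` blocks):

```python
def to_genai_contents(messages: list[dict]) -> list[dict]:
    """Convert Anthropic-format messages into GenAI Content-style dicts.

    Anthropic's "assistant" role becomes GenAI's "model"; content may be a
    plain string or a list of typed blocks.
    """
    contents = []
    for msg in messages:
        role = "model" if msg["role"] == "assistant" else "user"
        content = msg["content"]
        if isinstance(content, str):
            parts = [{"text": content}]
        else:
            # keep only text blocks in this sketch
            parts = [{"text": b["text"]} for b in content
                     if b.get("type") == "text"]
        contents.append({"role": role, "parts": parts})
    return contents
```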
| Symptom | Likely Cause | Fix |
|---|---|---|
| `GEMINI_API_KEY env var not set` | Missing API key | Set `GEMINI_API_KEY` in `.env` or export it |
| 502 error in Claude Code | GenAI API failure | Check logs for the exception type; retry logic handles most transient errors |
| Tools not being called | Model doesn't support function calling well | Try `gemini-2.5-flash`, which has stronger tool-use capabilities |
| Empty responses | `max_tokens` too low or model error | Increase `max_tokens` (capped at 8192) |
| Slow responses | Thinking config or large context | Reduce `budget_tokens` or use a faster model like `gemini-2.5-flash` |
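The retry behavior referenced above is the usual exponential-backoff pattern. A generic sketch (attempt count and delays are illustrative, not the proxy's actual values):

```python
import time


def with_retries(fn, *, attempts: int = 3, base_delay: float = 0.5,
                 retryable=(TimeoutError, ConnectionError)):
    """Call fn(), retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error (a 502 upstream)
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```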
The proxy logs to stdout. Set the log level via uvicorn:

```bash
uvicorn main:app --host 0.0.0.0 --port 8000 --log-level debug
```

Issues and pull requests welcome. Areas that could use help:
- Better tool schema translation edge cases (`anyOf`, `oneOf`)
- Token usage extraction from GenAI responses
- Multi-model `/v1/models` listing
- Docker image for easier deployment
MIT