genai-code

Run Claude Code with Gemma 4, Gemini, or any Google GenAI model through a locally hosted proxy, with no Anthropic API key needed.

Drop-in local proxy that lets you use Gemma 4, Gemini, or any Google GenAI model with Claude Code. No changes to your Claude Code workflow — just point it at the proxy.

Features

  • Anthropic Messages API compatible — works transparently with Claude Code and any Anthropic SDK client (see the SDK sketch after this list)
  • Dynamic model passthrough — use any model you specify in the request (Gemma 4, Gemini, etc.)
  • Streaming support — full SSE streaming with proper block indexing
  • Tool / function calling — Claude Code's built-in tools (read, write, edit, bash) work via function calling translation
  • Google Search grounding — optional web search via Google Search (enabled by default)
  • Thinking config — control reasoning depth via thinking.budget_tokens
  • Retry logic — automatic retries with exponential backoff on transient failures
  • Connection handling — detects client disconnects mid-stream, avoids wasted API calls
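
Because the proxy speaks the Anthropic Messages API, the official Anthropic Python SDK can point at it directly. A minimal sketch, assuming the proxy is running on localhost:8000 (the API key is a placeholder, mirroring the cURL examples below):

# Minimal sketch using the official Anthropic Python SDK
# (pip install anthropic). Assumes the proxy is running on
# localhost:8000; the key is a placeholder, as in the cURL
# examples further down.
import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:8000",
    api_key="dummy",
)

response = client.messages.create(
    model="gemma-4-26b-a4b-it",
    max_tokens=256,
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
print(response.content[0].text)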

Prerequisites

  • Python 3 with pip
  • A Google AI Studio API key (used as GEMINI_API_KEY)
  • Claude Code installed, if you plan to pair the proxy with it

Quick Start

# 1. Clone the repo
git clone https://github.com/yourusername/genai-code.git
cd genai-code

# 2. Install dependencies
pip install -r requirements.txt

# 3. Set your API key
cp .env.example .env
# Edit .env and replace with your key: GEMINI_API_KEY=your_key_here

# 4. Start the proxy
uvicorn main:app --host 0.0.0.0 --port 8000

Verify it's running:

curl http://localhost:8000/health
# {"status":"ok","gemini_api_key_set":true,"model":"gemma-4-26b-a4b-it",...}

Use with Claude Code

Create a .claude/settings.json file in your project root (or a global ~/.claude/settings.json):

{
    "$schema": "https://json.schemastore.org/claude-code-settings.json",
    "env": {
        "ANTHROPIC_AUTH_TOKEN": "sk-my-local-proxy-key",
        "ANTHROPIC_BASE_URL": "http://localhost:8000",
        "ANTHROPIC_MODEL": "gemma-4-26b-a4b-it",
        "ANTHROPIC_SMALL_FAST_MODEL": "gemma-4-26b-a4b-it"
    }
}

Then run claude normally — all requests route through Gemma 4.

Tip: Change ANTHROPIC_MODEL to any GenAI model (e.g., gemini-2.5-flash, gemma-4-29b-a4b-it). The proxy forwards the model name directly to the GenAI API.

API Reference

All endpoints are Anthropic Messages API compatible.

| Method | Path                | Description                                 |
|--------|---------------------|---------------------------------------------|
| GET    | /                   | Status check                                |
| GET    | /health             | Health + API key status                     |
| GET    | /v1/models          | List available models                       |
| POST   | /v1/messages        | Chat endpoint (supports stream: true/false) |
| POST   | /v1/messages/stream | Always-streaming alias                      |
| GET    | /v1/tools           | List proxy built-in tools                   |

Request Format

{
    "model": "gemma-4-26b-a4b-it",
    "messages": [
        { "role": "user", "content": "Write a Python function to sort a list." }
    ],
    "max_tokens": 4096,
    "stream": false,
    "system": "You are an expert Python developer. Write clean, typed code.",
    "temperature": 0.2,

    "enable_google_search": true,
    "thinking": {
        "type": "enabled",
        "budget_tokens": 10000
    },
    "tools": [
        {
            "name": "get_weather",
            "description": "Get weather for a location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": { "type": "string", "description": "City name" }
                },
                "required": ["location"]
            }
        }
    ]
}
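
The proxy-specific fields ride along in the same JSON body, so any plain HTTP client works. A sketch posting a trimmed version of the request above (uses httpx as an example client; any HTTP library would do):

# Sketch: POST the request body above with httpx (pip install httpx).
# Assumes the proxy is running on localhost:8000.
import httpx

body = {
    "model": "gemma-4-26b-a4b-it",
    "messages": [{"role": "user", "content": "Write a Python function to sort a list."}],
    "max_tokens": 4096,
    "enable_google_search": True,  # proxy-specific field, see table below
}
resp = httpx.post("http://localhost:8000/v1/messages", json=body, timeout=60)
resp.raise_for_status()
print(resp.json()["content"][0]["text"])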

Proxy-Specific Fields

| Field                | Type   | Default            | Description                                                                          |
|----------------------|--------|--------------------|--------------------------------------------------------------------------------------|
| model                | string | gemma-4-26b-a4b-it | Passed through to GenAI; use any supported model                                     |
| enable_google_search | bool   | true               | Enables Google Search grounding for factual queries                                  |
| thinking             | object | (none)             | Maps budget_tokens to a GenAI thinking level: <2k → LOW, 2k-8k → MEDIUM, ≥8k → HIGH  |
| tools                | array  | (none)             | Custom function-calling tools in Anthropic tool format                               |

Note: When tools are provided, Google Search is automatically disabled to avoid SDK conflicts.
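
The budget_tokens mapping in the table above amounts to two thresholds. An illustrative sketch (the function and level names are hypothetical, not the proxy's actual code):

# Hypothetical sketch of the budget_tokens -> thinking level mapping
# described in the table above; names are illustrative only.
def thinking_level(budget_tokens: int) -> str:
    if budget_tokens < 2_000:
        return "LOW"
    if budget_tokens < 8_000:
        return "MEDIUM"
    return "HIGH"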

cURL Examples

Non-streaming:

curl http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: dummy" \
  -d '{
    "model": "gemma-4-26b-a4b-it",
    "messages": [{"role":"user","content":"What is 2+2?"}],
    "max_tokens": 256
  }'

Streaming:

curl http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-4-26b-a4b-it",
    "messages": [{"role":"user","content":"Tell me a joke"}],
    "stream": true
  }'
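
The same stream can be consumed from Python through the Anthropic SDK's streaming helper (a sketch, again assuming the proxy on localhost:8000):

# Sketch: consume the proxy's SSE stream via the Anthropic SDK.
import anthropic

client = anthropic.Anthropic(base_url="http://localhost:8000", api_key="dummy")

with client.messages.stream(
    model="gemma-4-26b-a4b-it",
    max_tokens=256,
    messages=[{"role": "user", "content": "Tell me a joke"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)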

Configuration

Environment Variables

| Variable       | Required | Description                   |
|----------------|----------|-------------------------------|
| GEMINI_API_KEY | Yes      | Your Google AI Studio API key |

Set via .env file or export directly:

export GEMINI_API_KEY=your_key_here
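
If you use the .env route, a python-dotenv load at startup has the same effect as the export above (a sketch; whether the proxy loads the file exactly this way is an assumption):

# Sketch: load GEMINI_API_KEY from .env via python-dotenv
# (pip install python-dotenv). The repo's actual startup code may differ.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
if not os.environ.get("GEMINI_API_KEY"):
    raise RuntimeError("GEMINI_API_KEY env var not set")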

Supported Models

Any model available through the Google GenAI SDK should work, including:

  • gemma-4-26b-a4b-it (default)
  • gemma-4-29b-a4b-it
  • gemini-2.5-flash
  • gemini-2.5-pro

Pass the model name in your request or via ANTHROPIC_MODEL in Claude Code settings.

Architecture

Claude Code ──POST /v1/messages──▶ FastAPI Proxy ──generateContent──▶ Google GenAI API
     ▲                                    │                                │
     │         SSE stream / JSON          │         Gemma 4 / Gemini       │
     └────────────────────────────────────┴────────────────────────────────┘
  1. Claude Code sends Anthropic-format requests to the proxy
  2. Proxy converts messages/tools to GenAI Content and Tool formats
  3. GenAI API response is converted back to Anthropic format (tool_use/text blocks)
  4. Responses include proper anthropic-version headers and error shapes
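
Step 2 boils down to a role and content translation. A simplified sketch using plain GenAI-style content dicts (illustrative only; the proxy's real converter also handles tool_use/tool_result blocks and system prompts):

# Simplified sketch of step 2: Anthropic messages -> GenAI contents.
# Illustrative only; not the proxy's actual converter.
def to_genai_contents(messages: list[dict]) -> list[dict]:
    contents = []
    for msg in messages:
        # Anthropic's "assistant" role is called "model" in GenAI
        role = "model" if msg["role"] == "assistant" else "user"
        if isinstance(msg["content"], str):
            text = msg["content"]
        else:  # list of content blocks
            text = "".join(b.get("text", "") for b in msg["content"])
        contents.append({"role": role, "parts": [{"text": text}]})
    return contents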

Troubleshooting

| Symptom                        | Likely Cause                                | Fix                                                                           |
|--------------------------------|---------------------------------------------|-------------------------------------------------------------------------------|
| GEMINI_API_KEY env var not set | Missing API key                             | Set GEMINI_API_KEY in .env or export it                                       |
| 502 error in Claude Code       | GenAI API failure                           | Check logs for the exception type; retry logic handles most transient errors |
| Tools not being called         | Model doesn't support function calling well | Try gemini-2.5-flash, which has stronger tool-use capabilities                |
| Empty responses                | max_tokens too low or model error           | Increase max_tokens (capped at 8192)                                          |
| Slow responses                 | Thinking config or large context            | Reduce budget_tokens or use a faster model like gemini-2.5-flash              |

Viewing Logs

The proxy logs to stdout. Set log level via uvicorn:

uvicorn main:app --host 0.0.0.0 --port 8000 --log-level debug

Contributing

Issues and pull requests welcome. Areas that could use help:

  • Better tool schema translation edge cases (anyOf, oneOf)
  • Token usage extraction from GenAI responses
  • Multi-model /v1/models listing
  • Docker image for easier deployment

License

MIT
