Claudia Router

Claudia Router is a lightweight, local, Anthropic-compatible proxy that routes Claude-style /v1/messages requests to OpenAI-compatible model providers.

It is useful for experimenting with Claude-style coding workflows while using backends such as NVIDIA NIM, OpenRouter, LM Studio, Groq, DeepSeek, or local models.

What it does

  • Accepts Anthropic-style /v1/messages requests
  • Converts them to OpenAI-compatible /chat/completions
  • Routes requests based on model aliases
  • Returns Anthropic-style responses
  • Converts Anthropic tool definitions/results to OpenAI-compatible function tools
  • Logs backend, model, latency, and errors
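The core request translation above can be sketched as follows. This is an illustrative example, not the router's actual source; the field names follow the public Anthropic and OpenAI request shapes, and `toOpenAI` is a hypothetical helper.

```typescript
// Sketch: convert an Anthropic-style /v1/messages body into an
// OpenAI-compatible /chat/completions body.
interface AnthropicRequest {
  model: string;
  max_tokens: number;
  temperature?: number;
  system?: string;
  messages: { role: "user" | "assistant"; content: string }[];
}

interface OpenAIRequest {
  model: string;
  max_tokens: number;
  temperature?: number;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
}

function toOpenAI(req: AnthropicRequest, providerModel: string): OpenAIRequest {
  const messages: OpenAIRequest["messages"] = [];
  // Anthropic carries the system prompt as a top-level field;
  // OpenAI expects it as the first message.
  if (req.system) messages.push({ role: "system", content: req.system });
  messages.push(...req.messages);
  return {
    model: providerModel, // the Claude-style alias already resolved to a provider model
    max_tokens: req.max_tokens,
    temperature: req.temperature,
    messages,
  };
}
```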

What it does not do yet

  • Streaming
  • Full Claude Code compatibility guarantee
  • Prompt caching
  • Vision inputs
  • Multimodal content

Quick Start

Prerequisites:

  • Node.js 18 or newer
  • Claude Code CLI installed as claude
  • NVIDIA API key with access to NVIDIA hosted endpoints

Install:

git clone https://github.com/ChunkyMonkey11/Claudia-Router.git
cd Claudia-Router
npm install
npm run setup

Edit .env and set:

NVIDIA_API_KEY=your_nvidia_key_here

Start the router:

npm run dev

Optional: install the Claude wrapper command locally:

npm link

Test the server:

curl http://localhost:8082/health

Send a Claude-style request:

curl http://localhost:8082/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: dummy" \
  -d '{
    "model": "claude-3-5-sonnet-latest",
    "max_tokens": 512,
    "messages": [
      {
        "role": "user",
        "content": "Write a TypeScript function that reverses a string."
      }
    ]
  }'

For Claude Code, keep the router running and start Claude in another terminal:

npm run claude:fast

After npm link, you can use the wrapper from any coding project:

cd /path/to/your/project
claudia-claude

The wrapper runs the normal claude CLI with ANTHROPIC_BASE_URL, dummy auth, and the default router model already set. Pass normal Claude Code arguments as usual:

claudia-claude --model claude-3-5-sonnet-glm

The fast script and default wrapper route claude-3-5-sonnet-latest to NVIDIA stepfun-ai/step-3.5-flash. Use npm run claude:glm for the slower GLM quality profile, or npm run claude:smoke to test routing with the smallest configured model.

NVIDIA Setup

V1 is configured to use NVIDIA NIM by default.

  1. Put your NVIDIA API key in .env:

NVIDIA_API_KEY=your_nvidia_key_here
CLAUDIA_CONFIG=./config.json
LOG_LEVEL=info

  2. Keep defaultBackend set to nvidia in config.json.

  3. Use a mapped Claude-style model alias such as claude-3-5-sonnet-latest, or send any model name and Claudia Router will use the NVIDIA backend default model.

Config Example

{
  "port": 8082,
  "defaultBackend": "nvidia",
  "backends": {
    "nvidia": {
      "baseUrl": "https://integrate.api.nvidia.com/v1",
      "apiKeyEnv": "NVIDIA_API_KEY",
      "defaultModel": "stepfun-ai/step-3.5-flash"
    },
    "openrouter": {
      "baseUrl": "https://openrouter.ai/api/v1",
      "apiKeyEnv": "OPENROUTER_API_KEY",
      "defaultModel": "qwen/qwen-2.5-coder-32b-instruct"
    },
    "local": {
      "baseUrl": "http://localhost:1234/v1",
      "apiKeyEnv": "LOCAL_API_KEY",
      "defaultModel": "local-model"
    }
  },
  "modelProfiles": {
    "claude-3-5-sonnet-latest": {
      "backend": "nvidia",
      "providerModel": "stepfun-ai/step-3.5-flash",
      "retryAttempts": 3,
      "retryBaseDelayMs": 500,
      "notes": "Fast NVIDIA coding profile",
      "capabilities": {
        "toolCalls": true,
        "coding": true
      }
    },
    "claude-opus-4-1": {
      "backend": "nvidia",
      "providerModel": "z-ai/glm4.7",
      "retryAttempts": 3,
      "retryBaseDelayMs": 500,
      "extraBody": {
        "chat_template_kwargs": {
          "enable_thinking": true,
          "clear_thinking": false
        }
      },
      "notes": "Higher-quality GLM coding profile; slower because thinking is enabled",
      "capabilities": {
        "toolCalls": true,
        "coding": true
      }
    },
    "claude-3-5-sonnet-glm": {
      "backend": "nvidia",
      "providerModel": "z-ai/glm4.7",
      "retryAttempts": 3,
      "retryBaseDelayMs": 500,
      "extraBody": {
        "chat_template_kwargs": {
          "enable_thinking": true,
          "clear_thinking": false
        }
      },
      "notes": "Explicit GLM 4.7 profile for harder coding tasks",
      "capabilities": {
        "toolCalls": true,
        "coding": true
      }
    },
    "claude-3-5-sonnet-qwen": {
      "backend": "nvidia",
      "providerModel": "qwen/qwen3.5-122b-a10b",
      "retryAttempts": 3,
      "retryBaseDelayMs": 500,
      "notes": "Qwen fallback NVIDIA coding profile",
      "capabilities": {
        "toolCalls": true,
        "coding": true
      }
    },
    "claude-3-haiku-latest": {
      "backend": "nvidia",
      "providerModel": "nvidia/nemotron-mini-4b-instruct",
      "retryAttempts": 1,
      "retryBaseDelayMs": 250,
      "notes": "Smoke-test/free-small NVIDIA profile",
      "capabilities": {
        "toolCalls": false,
        "coding": false
      }
    }
  },
  "modelMap": {
    "legacy-claude-3-5-sonnet-latest": {
      "backend": "nvidia",
      "model": "stepfun-ai/step-3.5-flash"
    }
  }
}

modelProfiles is the preferred model routing config. Each key is the incoming Claude-style model alias; backend selects a configured backend, and providerModel is the model sent to that provider. If both modelProfiles and legacy modelMap contain the same alias, modelProfiles wins. modelMap remains supported for simple { "backend", "model" } mappings.
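The lookup order described above (modelProfiles first, then legacy modelMap, then the default backend's default model) can be sketched like this. Names such as `resolveRoute` and `RouterConfig` are illustrative, not the repo's real API.

```typescript
// Sketch of alias resolution precedence: modelProfiles > modelMap > defaults.
interface Route { backend: string; model: string }

interface RouterConfig {
  defaultBackend: string;
  backends: Record<string, { defaultModel: string }>;
  modelProfiles?: Record<string, { backend: string; providerModel: string }>;
  modelMap?: Record<string, { backend: string; model: string }>;
}

function resolveRoute(alias: string, cfg: RouterConfig): Route {
  // Preferred: a modelProfiles entry for the incoming alias.
  const profile = cfg.modelProfiles?.[alias];
  if (profile) return { backend: profile.backend, model: profile.providerModel };
  // Legacy: a simple modelMap entry, checked second.
  const mapped = cfg.modelMap?.[alias];
  if (mapped) return { backend: mapped.backend, model: mapped.model };
  // Unknown aliases fall through to the default backend's default model.
  const backend = cfg.defaultBackend;
  return { backend, model: cfg.backends[backend].defaultModel };
}
```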

retryAttempts is the total number of provider attempts for that profile, including the first request. retryBaseDelayMs is the exponential backoff base delay between retryable provider responses. extraBody is merged into the provider chat-completions JSON body for model-specific options such as NVIDIA chat_template_kwargs. capabilities is advisory metadata for humans and future routing policy; it does not force provider behavior.
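The retry timing above can be sketched as follows. The formula is an assumption based on the description of retryBaseDelayMs as an exponential backoff base; the router's actual schedule may differ (for example, it may add jitter).

```typescript
// Sketch: exponential backoff delay derived from retryBaseDelayMs.
// retryIndex 1 is the wait before the second overall attempt.
function backoffDelayMs(retryIndex: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (retryIndex - 1);
}
```

With retryAttempts: 3 and retryBaseDelayMs: 500, this schedule would wait 500 ms before the second attempt and 1000 ms before the third.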

NVIDIA Model Set

The default config includes the NVIDIA free/shared endpoint profiles that have worked well with Claude Code-style loops:

Router alias              | NVIDIA model                     | Intended use
--------------------------|----------------------------------|------------------------------------------------------------
claude-3-5-sonnet-latest  | stepfun-ai/step-3.5-flash        | Fast default for day-to-day coding/tool loops
claude-opus-4-1           | z-ai/glm4.7                      | Slower quality profile with GLM thinking enabled
claude-3-5-sonnet-glm     | z-ai/glm4.7                      | Explicit GLM 4.7 alias for harder coding tasks
claude-3-5-sonnet-qwen    | qwen/qwen3.5-122b-a10b           | Alternate/fallback coding model
claude-3-haiku-latest     | nvidia/nemotron-mini-4b-instruct | Smoke test for auth, routing, retries, and simple requests

Use the fast default:

npm run claude:fast

Or from any project after npm link:

claudia-claude

Switch to GLM for a harder task:

npm run claude:glm

Or:

claudia-claude --model claude-3-5-sonnet-glm

Smoke-test the router:

npm run claude:smoke

Claude-Style Request Shape

{
  "model": "claude-3-5-sonnet-latest",
  "max_tokens": 1024,
  "temperature": 0.2,
  "system": "You are a helpful coding assistant.",
  "messages": [
    {
      "role": "user",
      "content": "Write a Python function for binary search."
    }
  ]
}

Text content blocks are also supported:

{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "Fix this code."
    }
  ]
}

Non-text content blocks are safely ignored in V1.
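The content handling above can be sketched like this: string content passes through, text blocks are concatenated, and non-text blocks are dropped. `flattenContent` is a hypothetical helper name, not the router's actual function.

```typescript
// Sketch: flatten Anthropic message content into plain text for V1.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: string; [key: string]: unknown };

function flattenContent(content: string | ContentBlock[]): string {
  // Plain-string content is already in the final shape.
  if (typeof content === "string") return content;
  return content
    .filter((block) => block.type === "text") // non-text blocks are dropped
    .map((block) => (block as { type: "text"; text: string }).text)
    .join("\n");
}
```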

Tool Routing

Claudia Router converts Anthropic tool-use shapes to OpenAI-compatible function calling:

  • Request tools become OpenAI tools with type: "function"
  • Request tool_choice becomes OpenAI tool_choice
  • Assistant tool_use blocks become OpenAI assistant tool_calls
  • User tool_result blocks become OpenAI tool messages
  • Provider tool_calls become Anthropic tool_use response blocks

This is the compatibility layer Claude Code needs before it can execute local tools such as file reads and writes. Actual tool-call quality depends on the selected backend model and whether that provider supports OpenAI-style function calling.
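One direction of the translation above can be sketched as follows: an Anthropic assistant tool_use block becomes an OpenAI tool_calls entry. The shapes follow the public Anthropic and OpenAI APIs; `toToolCall` is an illustrative helper, not the router's internal code.

```typescript
// Sketch: convert an Anthropic tool_use block to an OpenAI tool call.
interface AnthropicToolUse {
  type: "tool_use";
  id: string;
  name: string;
  input: Record<string, unknown>;
}

interface OpenAIToolCall {
  id: string;
  type: "function";
  // OpenAI carries tool arguments as a JSON-encoded string,
  // while Anthropic uses a structured input object.
  function: { name: string; arguments: string };
}

function toToolCall(block: AnthropicToolUse): OpenAIToolCall {
  return {
    id: block.id,
    type: "function",
    function: { name: block.name, arguments: JSON.stringify(block.input) },
  };
}
```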

Claude Code Gateway Test

Start the router:

npm run dev

Launch Claude Code against the local router:

ANTHROPIC_BASE_URL=http://localhost:8082 \
ANTHROPIC_AUTH_TOKEN=dummy \
ANTHROPIC_MODEL=claude-3-5-sonnet-latest \
ANTHROPIC_DEFAULT_SONNET_MODEL=claude-3-5-sonnet-latest \
ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-sonnet-latest \
claude --model claude-3-5-sonnet-latest

Then ask Claude Code to do a safe file operation:

Create hello.md with a one-line greeting.

If the file is created, the gateway can route a complete Claude Code tool loop for that operation. If Claude Code only prints a fake Write(...) call, inspect the router and provider logs to confirm whether the backend returned structured OpenAI tool_calls.

Supported Providers

Any provider with an OpenAI-compatible /chat/completions endpoint should work. The example config includes:

  • NVIDIA NIM
  • OpenRouter
  • LM Studio or another local OpenAI-compatible server

Groq, DeepSeek, Ollama's OpenAI-compatible mode, and similar providers can be added by creating another backend entry.

Logging

Claudia Router logs one line per request with request ID, backend, source model, target model, latency, and status.

Prompts are not logged by default.

Development Checks

npm run typecheck
npm test
npm run build

Release Checklist

Before tagging a release:

  1. Run npm run typecheck, npm test, and npm run build.
  2. Verify /health returns 200.
  3. Verify a plain /v1/messages request returns an Anthropic-style message.
  4. Verify Claude Code can create and edit a small file through the router.
  5. Rotate any provider keys that were ever committed, pasted into logs, or shared during testing.

Limitations

  • Text-only
  • Non-streaming only
  • No vision
  • No guarantee of full Claude Code compatibility
  • Provider quality depends on selected model

Roadmap

  • Streaming support
  • Broader Claude Code compatibility tests
  • Request replay logs
  • Cost estimation
  • Provider fallback
  • Simple web dashboard
  • Policy layer for blocking risky actions
  • Prompt redaction
  • Team config
