A universal API gateway for LLM services via ARGO. Translates between OpenAI, Anthropic, and Google GenAI API formats, routing requests to optimal upstream ARGO endpoints. Works with AI coding tools like Claude Code, Codex CLI, Aider, Gemini CLI, and more.
For detailed documentation, visit the argo-proxy ReadTheDocs page.
pip install argo-proxy # install
argo-proxy serve # start the proxyA single proxy instance serves all 4 major LLM API formats:
| API Format | Endpoint | Example Client |
|---|---|---|
| OpenAI Chat Completions | /v1/chat/completions |
OpenAI SDK, Aider, OpenCode |
| OpenAI Responses | /v1/responses |
Codex CLI |
| Anthropic Messages | /v1/messages |
Claude Code, Kilo Code |
| Google GenAI | /v1beta/models/{model}:generateContent |
Gemini CLI |
The machine or server making API calls to Argo must be connected to the Argonne internal network or through a VPN on an Argonne-managed computer if you are working off-site. Your instance of the argo proxy should always be on-premise at an Argonne machine. The software is provided "as is," without any warranties. By using this software, you accept that the authors, contributors, and affiliated organizations will not be liable for any damages or issues arising from its use. You are solely responsible for ensuring the software meets your requirements.
-
Python 3.10+ is required.
Recommended: use conda, mamba, or pipx to manage an exclusive environment.
Conda/Mamba Download and install from: https://conda-forge.org/download/
pipx Download and install from: https://pipx.pypa.io/stable/installation/ -
Install:
pip install argo-proxy
To upgrade:
argo-proxy update check # check for updates (includes dependency status) argo-proxy update install # install latest stable argo-proxy update install --pre # install latest pre-release
Or, from source (at the repo root):
pip install .
The application uses a YAML config file (v3 format). If you don't have one, First-Time Setup will create it interactively.
config_version: "3"
user: "your_username"
host: 0.0.0.0
port: 44497
verbose: true
argo_base_url: "https://apps.inside.anl.gov/argoapi"Config file search order (first found is used):
./config.yaml(current directory)~/.config/argoproxy/config.yaml~/.argoproxy/config.yaml
Migrate from v1/v2 config:
argo-proxy config migrate /path/to/old/config.yamlargo-proxy serve # default config search
argo-proxy serve /path/to/config.yaml # explicit config
argo-proxy serve --verbose --show # verbose mode, show config at startupCreate a new config interactively:
argo-proxy config initThis will:
- Prompt for your ANL username
- Select a random available port (can be overridden)
- Choose upstream environment (prod/dev/test)
- Validate connectivity to upstream URLs
- Write the config file to
~/.config/argoproxy/config.yaml
| Option | Description | Default |
|---|---|---|
config_version |
Config format version | "3" |
user |
Your ANL username | (required) |
host |
Host address to bind to | 0.0.0.0 |
port |
Port number | 44497 |
verbose |
Enable verbose logging | true |
argo_base_url |
Base URL for ARGO API | Dev URL |
native_openai_base_url |
Custom OpenAI endpoint (auto-derived if unset) | — |
native_anthropic_base_url |
Custom Anthropic endpoint (auto-derived if unset) | — |
anthropic_stream_mode |
Non-streaming Anthropic handling: force/retry/passthrough |
force |
force_conversion |
Always run full format conversion | false |
use_legacy_argo |
Use legacy ARGO gateway pipeline | false |
skip_url_validation |
Skip upstream URL checks on startup | false |
connection_test_timeout |
Seconds for URL validation | 5 |
resolve_overrides |
DNS overrides for SSH tunnels (host:port -> IP) | {} |
max_log_history |
Keep last N messages in verbose logs | 3 |
enable_payload_control |
Enable image payload size control | false |
max_payload_size |
Max image payload size in MB | 20 |
image_timeout |
Image download timeout in seconds | 30 |
concurrent_downloads |
Parallel image downloads | 10 |
argo-proxy [-h] [--version] {serve,config,logs,update,models}
| Command | Description |
|---|---|
serve [config] |
Start the proxy server |
config edit |
Open config in default editor |
config validate |
Validate config and check connectivity |
config show |
Display resolved config |
config migrate |
Migrate v1/v2 config to v3 |
config init |
Interactive config setup |
config list |
List all found config files |
config env [prod|dev|test] |
Show or switch upstream environment |
logs collect [--type TYPE] |
Collect diagnostic logs |
update check |
Check for updates (argo-proxy + llm-rosetta) |
update install [--pre] |
Install latest version |
models [--json] |
List available models and aliases |
Key serve flags:
argo-proxy serve --verbose # verbose logging
argo-proxy serve --force-conversion # always convert via llm-rosetta
argo-proxy serve --username-passthrough # use API key as username
argo-proxy serve --anthropic-stream-mode retry # try non-streaming first
argo-proxy serve --legacy-argo # use legacy ARGO gateway pipeline
argo-proxy serve --dump-requests # dump request/response for debuggingAll four formats are served simultaneously from a single proxy instance:
| Endpoint | Format | Typical Client |
|---|---|---|
/v1/chat/completions |
OpenAI Chat Completions | OpenAI SDK, Aider, OpenCode |
/v1/responses |
OpenAI Responses | Codex CLI |
/v1/messages |
Anthropic Messages | Claude Code, Anthropic SDK |
/v1beta/models/{model}:generateContent |
Google GenAI | Gemini CLI |
/v1beta/models/{model}:streamGenerateContent |
Google GenAI (streaming) | Gemini CLI |
/v1/embeddings |
Embeddings | OpenAI SDK |
| Endpoint | Description |
|---|---|
/v1/models |
List available models (OpenAI-compatible format) |
/refresh |
Reload model list from upstream (POST) |
/health |
Health check |
/version |
Version info with update status |
You can override the default timeout with a timeout parameter in your request body. See Timeout Override Examples for details.
Models are fetched dynamically from upstream at startup. Use argo-proxy models or GET /v1/models to list all available models and aliases. Refresh without restart via POST /refresh.
Model names are flexible and case-insensitive:
- OpenAI:
argo:gpt-4o,gpt-4o,argo:gpt-4.1-mini,argo:o3-mini - Claude:
argo:claude-4-opusorargo:claude-opus-4,argo:claude-4.6-sonnet - Gemini:
argo:gemini-2.5-pro,argo:gemini-2.5-flash - Embedding:
argo:text-embedding-ada-002,argo:text-embedding-3-small
The argo: prefix is optional -- bare model names like gpt-4o or claude-4-sonnet work too.
Native function calling is supported for all three providers:
- OpenAI models: Full native function calling
- Anthropic models: Full native function calling
- Gemini models: Full native function calling
Available on /v1/chat/completions in both streaming and non-streaming modes. Cross-format tool call translation is handled automatically via llm-rosetta.
For usage details, refer to the OpenAI function calling guide and the tool calls documentation.
A lightweight tool management library is also available: ToolRegistry.
Argo-proxy works out of the box with popular AI coding tools:
| Tool | API Format | Base URL Env Var | Value |
|---|---|---|---|
| Claude Code | Anthropic | ANTHROPIC_BASE_URL |
http://localhost:<port> |
| Codex CLI | OpenAI Responses | OPENAI_BASE_URL |
http://localhost:<port>/v1 |
| Aider | OpenAI or Anthropic | OPENAI_API_BASE / ANTHROPIC_BASE_URL |
http://localhost:<port>/v1 |
| Gemini CLI | Google GenAI | GOOGLE_GEMINI_BASE_URL |
http://localhost:<port> |
| OpenCode | OpenAI | OPENAI_BASE_URL |
http://localhost:<port>/v1 |
| Kilo Code | Anthropic | (VS Code settings) | http://localhost:<port> |
All tools use your ANL username as the API key. For detailed setup instructions, see the CLI Tools Integration Guide.
SDK-based (openai.OpenAI):
- Chat Completions | Streaming
- Responses | Streaming
- Function Calling (Chat) | Function Calling (Responses)
- Image Chat | Image Base64
- Embedding
- Legacy Completions | Streaming
REST-based (httpx / requests):
SDK-based (anthropic.Anthropic):
REST-based:
This project is developed in my spare time. Bugs and issues may exist. If you encounter any or have suggestions for improvements, please open an issue or submit a pull request. Your contributions are highly appreciated!