
argo-proxy


A universal API gateway for LLM services via ARGO. Translates between OpenAI, Anthropic, and Google GenAI API formats, routing requests to optimal upstream ARGO endpoints. Works with AI coding tools like Claude Code, Codex CLI, Aider, Gemini CLI, and more.

For detailed documentation, visit the argo-proxy ReadTheDocs page.

TL;DR

pip install argo-proxy   # install
argo-proxy serve         # start the proxy

A single proxy instance serves all 4 major LLM API formats:

| API Format | Endpoint | Example Client |
|---|---|---|
| OpenAI Chat Completions | /v1/chat/completions | OpenAI SDK, Aider, OpenCode |
| OpenAI Responses | /v1/responses | Codex CLI |
| Anthropic Messages | /v1/messages | Claude Code, Kilo Code |
| Google GenAI | /v1beta/models/{model}:generateContent | Gemini CLI |

NOTICE OF USAGE

The machine or server making API calls to Argo must be connected to the Argonne internal network, or connected through a VPN on an Argonne-managed computer if you are working off-site. Your instance of argo-proxy should always run on-premises on an Argonne machine. The software is provided "as is," without any warranties. By using this software, you accept that the authors, contributors, and affiliated organizations will not be liable for any damages or issues arising from its use. You are solely responsible for ensuring the software meets your requirements.

Deployment

Prerequisites

  • Python 3.10+ is required.
    Recommended: use conda, mamba, or pipx to manage a dedicated environment.
      ◦ Conda/Mamba: https://conda-forge.org/download/
      ◦ pipx: https://pipx.pypa.io/stable/installation/

  • Install:


    pip install argo-proxy

    To upgrade:

    argo-proxy update check    # check for updates (includes dependency status)
    argo-proxy update install  # install latest stable
    argo-proxy update install --pre  # install latest pre-release

    Or, from source (at the repo root):

    pip install .

Configuration

The application uses a YAML config file (v3 format). If you don't have one, the interactive first-time setup (see below) will create it for you.

config_version: "3"
user: "your_username"
host: 0.0.0.0
port: 44497
verbose: true

argo_base_url: "https://apps.inside.anl.gov/argoapi"

Config file search order (first found is used):

  1. ./config.yaml (current directory)
  2. ~/.config/argoproxy/config.yaml
  3. ~/.argoproxy/config.yaml

Migrate from v1/v2 config:

argo-proxy config migrate /path/to/old/config.yaml

Running the Proxy

argo-proxy serve                     # default config search
argo-proxy serve /path/to/config.yaml  # explicit config
argo-proxy serve --verbose --show    # verbose mode, show config at startup

First-Time Setup

Create a new config interactively:

argo-proxy config init

This will:

  1. Prompt for your ANL username
  2. Select a random available port (can be overridden)
  3. Choose upstream environment (prod/dev/test)
  4. Validate connectivity to upstream URLs
  5. Write the config file to ~/.config/argoproxy/config.yaml

Configuration Options Reference

| Option | Description | Default |
|---|---|---|
| config_version | Config format version | "3" |
| user | Your ANL username | (required) |
| host | Host address to bind to | 0.0.0.0 |
| port | Port number | 44497 |
| verbose | Enable verbose logging | true |
| argo_base_url | Base URL for ARGO API | Dev URL |
| native_openai_base_url | Custom OpenAI endpoint | (auto-derived if unset) |
| native_anthropic_base_url | Custom Anthropic endpoint | (auto-derived if unset) |
| anthropic_stream_mode | Non-streaming Anthropic handling: force/retry/passthrough | force |
| force_conversion | Always run full format conversion | false |
| use_legacy_argo | Use legacy ARGO gateway pipeline | false |
| skip_url_validation | Skip upstream URL checks on startup | false |
| connection_test_timeout | Seconds for URL validation | 5 |
| resolve_overrides | DNS overrides for SSH tunnels (host:port -> IP) | {} |
| max_log_history | Keep last N messages in verbose logs | 3 |
| enable_payload_control | Enable image payload size control | false |
| max_payload_size | Max image payload size in MB | 20 |
| image_timeout | Image download timeout in seconds | 30 |
| concurrent_downloads | Parallel image downloads | 10 |

CLI Reference

argo-proxy [-h] [--version] {serve,config,logs,update,models}
| Command | Description |
|---|---|
| serve [config] | Start the proxy server |
| config edit | Open config in default editor |
| config validate | Validate config and check connectivity |
| config show | Display resolved config |
| config migrate | Migrate v1/v2 config to v3 |
| config init | Interactive config setup |
| config list | List all found config files |
| config env [prod\|dev\|test] | Show or switch upstream environment |
| logs collect [--type TYPE] | Collect diagnostic logs |
| update check | Check for updates (argo-proxy + llm-rosetta) |
| update install [--pre] | Install latest version |
| models [--json] | List available models and aliases |

Key serve flags:

argo-proxy serve --verbose               # verbose logging
argo-proxy serve --force-conversion      # always convert via llm-rosetta
argo-proxy serve --username-passthrough  # use API key as username
argo-proxy serve --anthropic-stream-mode retry  # try non-streaming first
argo-proxy serve --legacy-argo           # use legacy ARGO gateway pipeline
argo-proxy serve --dump-requests         # dump request/response for debugging

Usage

Endpoints

API Format Endpoints

All four formats are served simultaneously from a single proxy instance:

| Endpoint | Format | Typical Client |
|---|---|---|
| /v1/chat/completions | OpenAI Chat Completions | OpenAI SDK, Aider, OpenCode |
| /v1/responses | OpenAI Responses | Codex CLI |
| /v1/messages | Anthropic Messages | Claude Code, Anthropic SDK |
| /v1beta/models/{model}:generateContent | Google GenAI | Gemini CLI |
| /v1beta/models/{model}:streamGenerateContent | Google GenAI (streaming) | Gemini CLI |
| /v1/embeddings | Embeddings | OpenAI SDK |

Utility Endpoints

| Endpoint | Description |
|---|---|
| /v1/models | List available models (OpenAI-compatible format) |
| /refresh | Reload model list from upstream (POST) |
| /health | Health check |
| /version | Version info with update status |

Timeout Override

You can override the default timeout with a timeout parameter in your request body. See Timeout Override Examples for details.
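For illustration, a request body carrying a timeout override might look like the following (a minimal sketch; the exact semantics are covered in the Timeout Override Examples, and the port and username below are the sample-config defaults):

```python
# OpenAI-format request body with a per-request "timeout" override
# (seconds). Sketch only; see the Timeout Override Examples for details.
payload = {
    "model": "argo:gpt-4o",
    "messages": [{"role": "user", "content": "Summarize this repository."}],
    "timeout": 600,  # override the proxy's default upstream timeout
}

# With the proxy running locally (port 44497 from the sample config):
# import httpx
# r = httpx.post("http://localhost:44497/v1/chat/completions", json=payload,
#                headers={"Authorization": "Bearer your_username"})
```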

Models

Models are fetched dynamically from upstream at startup. Use argo-proxy models or GET /v1/models to list all available models and aliases. Refresh without restart via POST /refresh.

Model Naming

Model names are flexible and case-insensitive:

  • OpenAI: argo:gpt-4o, gpt-4o, argo:gpt-4.1-mini, argo:o3-mini
  • Claude: argo:claude-4-opus or argo:claude-opus-4, argo:claude-4.6-sonnet
  • Gemini: argo:gemini-2.5-pro, argo:gemini-2.5-flash
  • Embedding: argo:text-embedding-ada-002, argo:text-embedding-3-small

The argo: prefix is optional -- bare model names like gpt-4o or claude-4-sonnet work too.
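The naming rule can be sketched as a small normalization step (an illustration only, not the proxy's actual implementation):

```python
def normalize(name: str) -> str:
    """Lowercase and strip the optional argo: prefix (sketch of the rule)."""
    return name.lower().removeprefix("argo:")

# All of these resolve to the same model:
aliases = ["argo:gpt-4o", "gpt-4o", "GPT-4O"]
assert {normalize(a) for a in aliases} == {"gpt-4o"}
```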

Tool Calls

Native function calling is fully supported for all three providers: OpenAI, Anthropic, and Gemini models.

Available on /v1/chat/completions in both streaming and non-streaming modes. Cross-format tool call translation is handled automatically via llm-rosetta.
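As an illustration, a tool-call request in OpenAI Chat Completions format could look like the following (the get_weather function is made up, and the port is the sample-config default):

```python
# OpenAI function-calling request sent to the proxy; an Anthropic model is
# targeted, with cross-format translation handled by llm-rosetta.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

request_body = {
    "model": "argo:claude-4-sonnet",
    "messages": [{"role": "user", "content": "What's the weather in Chicago?"}],
    "tools": tools,
}

# With the proxy running:
# import httpx
# r = httpx.post("http://localhost:44497/v1/chat/completions", json=request_body,
#                headers={"Authorization": "Bearer your_username"})
```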

For usage details, refer to the OpenAI function calling guide and the tool calls documentation.

A lightweight tool management library is also available: ToolRegistry.

AI Coding Tools Integration

Argo-proxy works out of the box with popular AI coding tools:

| Tool | API Format | Base URL Env Var | Value |
|---|---|---|---|
| Claude Code | Anthropic | ANTHROPIC_BASE_URL | http://localhost:&lt;port&gt; |
| Codex CLI | OpenAI Responses | OPENAI_BASE_URL | http://localhost:&lt;port&gt;/v1 |
| Aider | OpenAI or Anthropic | OPENAI_API_BASE / ANTHROPIC_BASE_URL | http://localhost:&lt;port&gt;/v1 |
| Gemini CLI | Google GenAI | GOOGLE_GEMINI_BASE_URL | http://localhost:&lt;port&gt; |
| OpenCode | OpenAI | OPENAI_BASE_URL | http://localhost:&lt;port&gt;/v1 |
| Kilo Code | Anthropic | (VS Code settings) | http://localhost:&lt;port&gt; |

All tools use your ANL username as the API key. For detailed setup instructions, see the CLI Tools Integration Guide.
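A typical shell setup might look like this (assuming the default port 44497 and an ANL username of your_username; the *_API_KEY variable names are the SDKs' conventional ones, not something the proxy itself mandates):

```shell
# Point each tool at the local proxy; the "API key" is just your ANL username.
export ANTHROPIC_BASE_URL="http://localhost:44497"        # Claude Code
export ANTHROPIC_API_KEY="your_username"
export OPENAI_BASE_URL="http://localhost:44497/v1"        # Codex CLI, OpenCode
export OPENAI_API_KEY="your_username"
export GOOGLE_GEMINI_BASE_URL="http://localhost:44497"    # Gemini CLI
```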

Examples

OpenAI Format

SDK-based (openai.OpenAI):

REST-based (httpx / requests):
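The same call over plain HTTP (a sketch using requests; httpx is interchangeable):

```python
import requests  # pip install requests; httpx works the same way

url = "http://localhost:44497/v1/chat/completions"
headers = {"Authorization": "Bearer your_username"}  # ANL username as API key
body = {
    "model": "argo:gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
}

# With the proxy running:
# r = requests.post(url, json=body, headers=headers, timeout=30)
# print(r.json()["choices"][0]["message"]["content"])
```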

Anthropic Format

SDK-based (anthropic.Anthropic):

REST-based:
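And over plain HTTP using only the standard library (the headers below follow Anthropic's Messages API conventions; whether the proxy requires all of them is worth checking against the docs):

```python
import json
from urllib import request  # stdlib; requests/httpx work equally well

url = "http://localhost:44497/v1/messages"
headers = {
    "x-api-key": "your_username",       # Anthropic-style auth: ANL username
    "anthropic-version": "2023-06-01",  # standard Messages API version header
    "content-type": "application/json",
}
body = {
    "model": "argo:claude-4-sonnet",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello"}],
}
req = request.Request(url, data=json.dumps(body).encode(),
                      headers=headers, method="POST")

# With the proxy running:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```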

Direct ARGO Access

Bug Reports and Contributions

This project is developed in my spare time. Bugs and issues may exist. If you encounter any or have suggestions for improvements, please open an issue or submit a pull request. Your contributions are highly appreciated!
