otari (Python)

Python client for otari-gateway. Communicate with any LLM provider through the gateway using a single, typed interface.

TypeScript SDK | Documentation | Platform (Beta)

Quickstart

import asyncio

from otari import OtariClient


async def main() -> None:
    client = OtariClient(
        api_base="http://localhost:8000",
        platform_token="your-token-here",
    )

    response = await client.completion(
        model="openai:gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )

    print(response.choices[0].message.content)


# `await` is only valid inside a coroutine, so run the example with asyncio.
asyncio.run(main())

That's it! Change the model string to switch between LLM providers through the gateway.
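The model strings in this README follow a `provider:model` pattern. As a rough illustration, the sketch below splits such a string into its two parts. `split_model_string` is a hypothetical helper, not part of otari, and treating the first colon as the separator is an assumption based on the examples here.

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split "provider:model" into (provider, model).

    Hypothetical helper for illustration. Assumption: the first colon
    separates the provider from the model name, so any later colons
    stay part of the model name.
    """
    provider, separator, name = model.partition(":")
    if not separator or not provider or not name:
        raise ValueError(f"expected 'provider:model', got {model!r}")
    return provider, name


if __name__ == "__main__":
    print(split_model_string("openai:gpt-4o-mini"))
```

Switching providers then amounts to changing the part before the colon while the rest of the call stays the same.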

Installation

Requirements

Python 3.11 or newer
A running otari-gateway instance

Install

pip install otari

Setting Up Credentials

Set environment variables for your gateway:

export GATEWAY_API_BASE="http://localhost:8000"
export GATEWAY_PLATFORM_TOKEN="your-token-here"
# or for non-platform mode:
export GATEWAY_API_KEY="your-key-here"

Alternatively, pass credentials directly when creating the client (see Usage examples).

otari-gateway

This Python SDK is a client for otari-gateway, an optional FastAPI-based proxy server that adds enterprise-grade features on top of the core library:

Budget Management - Enforce spending limits with automatic daily, weekly, or monthly resets
API Key Management - Issue, revoke, and monitor virtual API keys without exposing provider credentials
Usage Analytics - Track every request with full token counts, costs, and metadata
Multi-tenant Support - Manage access and budgets across users and teams

The gateway sits between your applications and LLM providers, exposing an OpenAI-compatible API that works with any supported provider.

Quick Start

docker run \
  -e GATEWAY_MASTER_KEY="your-secure-master-key" \
  -e OPENAI_API_KEY="your-api-key" \
  -p 8000:8000 \
  ghcr.io/mozilla-ai/otari/gateway:latest

Note: You can use a specific release version instead of latest (e.g., 1.2.0). See available versions.

Managed Platform (Beta)

Prefer a hosted experience? The otari platform provides a managed control plane for keys, usage tracking, and cost visibility across providers, while still building on the same otari interfaces.

Usage

Authentication Modes

The client supports two authentication modes, matching the TypeScript SDK:

Platform Mode (Recommended)

Uses a Bearer token in the standard Authorization header:

client = OtariClient(
    api_base="http://localhost:8000",
    platform_token="tk_your_platform_token",
)

Non-Platform Mode

Sends the API key via a custom Otari-Key header:

client = OtariClient(
    api_base="http://localhost:8000",
    api_key="your-api-key",
)

Auto-Detection from Environment Variables

When no explicit credentials are provided, the client reads from environment variables:

# Uses GATEWAY_API_BASE, GATEWAY_PLATFORM_TOKEN, or GATEWAY_API_KEY
client = OtariClient()

Chat Completions

response = await client.completion(
    model="openai:gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)
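Because `completion` is awaited, several requests can be in flight at once. The sketch below fans out one request per prompt with `asyncio.gather`. `complete_many` is a hypothetical helper, not part of otari, and it assumes only the `completion` call and response shape shown above.

```python
import asyncio
from typing import Any


async def complete_many(client: Any, model: str, prompts: list[str]) -> list[str]:
    """Send one completion request per prompt and return the replies in order."""

    async def ask(prompt: str) -> str:
        response = await client.completion(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    # gather() runs the requests concurrently and preserves input order.
    return await asyncio.gather(*(ask(prompt) for prompt in prompts))
```

With a real client this might be called as `await complete_many(client, "openai:gpt-4o-mini", ["Hello!", "Goodbye!"])`. For large batches, consider limiting concurrency so the gateway's rate limits are not exceeded.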

Streaming

stream = await client.completion(
    model="openai:gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)

async for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
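When the full text is needed after streaming finishes, the deltas can be accumulated as they arrive. This is a minimal sketch that relies only on the chunk shape used in the loop above. `collect_stream` is a hypothetical helper, and the guard for chunks without choices is a defensive assumption rather than documented gateway behavior.

```python
from collections.abc import AsyncIterator
from typing import Any


async def collect_stream(stream: AsyncIterator[Any]) -> str:
    """Concatenate the text deltas of a streamed completion."""
    parts: list[str] = []
    async for chunk in stream:
        # Assumption: some chunks may carry no choices, so skip them.
        if not chunk.choices:
            continue
        content = chunk.choices[0].delta.content
        if content:
            parts.append(content)
    return "".join(parts)
```

With the `stream` object from the example above, this would be used as `text = await collect_stream(stream)`.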

Responses API

response = await client.response(
    model="openai:gpt-4o-mini",
    input="Summarize this in one sentence.",
)

print(response.output_text)

Embeddings

result = await client.embedding(
    model="openai:text-embedding-3-small",
    input="Hello world",
)

print(result.data[0].embedding)
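Embedding vectors are usually compared rather than printed. As an illustration, the sketch below computes the cosine similarity of two vectors with the standard library only. `cosine_similarity` is a hypothetical helper, not part of otari.

```python
import math
from collections.abc import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

Given two results from `client.embedding`, pass `first.data[0].embedding` and `second.data[0].embedding` as the arguments.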

Listing Models

models = await client.list_models()
for model in models:
    print(model.id)

Error Handling

In platform mode, HTTP errors are mapped to typed exceptions:

from otari import OtariClient, AuthenticationError, RateLimitError

try:
    response = await client.completion(
        model="openai:gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except AuthenticationError as e:
    print(f"Invalid credentials: {e.message}")
except RateLimitError as e:
    print(f"Rate limited, retry after: {e.retry_after}")

| HTTP Status | Error Class | Description |
| --- | --- | --- |
| 400 (capability) | `UnsupportedCapabilityError` | Selected provider does not support the requested capability |
| 401, 403 | `AuthenticationError` | Invalid or missing credentials |
| 402 | `InsufficientFundsError` | Budget or credits exhausted |
| 404 | `ModelNotFoundError` | Model not found or unavailable |
| 429 | `RateLimitError` | Rate limit exceeded (includes `retry_after`) |
| 502 | `UpstreamProviderError` | Upstream provider unreachable |
| 504 | `GatewayTimeoutError` | Gateway timed out waiting for provider |

UnsupportedCapabilityError surfaces in both platform and non-platform modes; the other mappings are platform-mode only.
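A common way to act on `RateLimitError` is to wait and retry. The sketch below shows one possible approach. `retry_on_rate_limit` is a hypothetical helper, not part of otari. The exception class is passed in as a parameter so the example runs without the SDK, and treating `retry_after` as an optional number of seconds is an assumption.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


async def retry_on_rate_limit(
    call: Callable[[], Awaitable[T]],
    rate_limit_error: type[Exception],
    max_attempts: int = 3,
    default_delay: float = 1.0,
) -> T:
    """Await `call()` and retry when `rate_limit_error` is raised."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except rate_limit_error as exc:
            if attempt == max_attempts:
                raise  # out of attempts, surface the original error
            # Assumption: retry_after is a number of seconds and may be absent.
            delay = getattr(exc, "retry_after", None)
            await asyncio.sleep(default_delay if delay is None else float(delay))
    raise ValueError("max_attempts must be at least 1")
```

With the client from the example above, a call might look like `await retry_on_rate_limit(lambda: client.completion(model="openai:gpt-4o-mini", messages=messages), RateLimitError)`.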

Context Manager

The client can be used as an async context manager, which cleans up automatically on exit:

async with OtariClient(api_base="http://localhost:8000") as client:
    response = await client.completion(
        model="openai:gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )

Why choose `otari`?

Simple, unified interface - One client for all providers through the gateway; switch models by changing a single string
Developer friendly - Full type hints for better IDE support and clear, actionable error messages
Leverages the OpenAI SDK - Built on the official OpenAI Python SDK for maximum compatibility
Async-first - Built on AsyncOpenAI for modern async Python applications
Framework-agnostic - Can be used across different projects and use cases
Battle-tested - Powers our own production tools (any-agent)

Development

# Create a virtual environment
python -m venv .venv
source .venv/bin/activate

# Install with dev dependencies
pip install -e ".[dev]"

# Run unit tests
pytest tests/

# Lint
ruff check src/ tests/

# Type-check
mypy src/

Documentation

Full Documentation - Complete guides and API reference
Supported Providers - List of all supported LLM providers
Gateway Documentation - Gateway setup and deployment
TypeScript SDK - The TypeScript SDK for Node.js applications
otari Platform (Beta) - Hosted control plane for key management, usage tracking, and cost visibility

Contributing

We welcome contributions from developers of all skill levels! Please see the Contributing Guide or open an issue to discuss changes.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
