A unified, provider-agnostic Python client for multiple LLM APIs. Query any LLM (OpenAI, Anthropic Claude, Google Gemini, Mistral, DeepSeek, Qwen, OpenRouter, and more) through a single, consistent interface.
Perfect for: Research workflows, benchmarking studies, automated testing, and applications that need to work with multiple LLM providers without dealing with their individual APIs.
This package is a convenience wrapper for working with multiple LLM providers through a unified interface. It is not intended as a replacement for the official provider libraries (openai, anthropic, google-genai, etc.).
Use this package when:

- You need to query multiple LLM providers in the same project
- You're building benchmarking or comparison tools
- You want a consistent interface across providers
- You need provider-agnostic code for research workflows
Prefer the official provider libraries when:

- You need cutting-edge features on day one of release
- You require provider-specific advanced features
- You only work with a single provider
Update pace: This package is maintained by a small team and may not immediately support every new feature from upstream providers. We prioritize stability and cross-provider compatibility over bleeding-edge feature coverage.
- Provider-Agnostic: Single interface for OpenAI, Anthropic, Google, Mistral, DeepSeek, Qwen, and OpenRouter
- Multimodal Support: Text + images on every provider that supports them
- Structured Output: Unified Pydantic model support across providers
- Rich Response Objects: Detailed token usage, costs, timing, and metadata
- Async Support: Parallel processing for faster benchmarks
- Built-in Retry Logic: Automatic exponential backoff for rate limits
- Custom Base URLs: Easy integration with OpenRouter, sciCORE, and other OpenAI-compatible APIs
Install from PyPI:

```bash
pip install generic-llm-api-client
```

```python
from ai_client import create_ai_client
# Create a client for any provider
client = create_ai_client('openai', api_key='sk-...')
# Send a prompt
response, duration = client.prompt('gpt-4', 'What is 2+2?')
print(f"Response: {response.text}")
print(f"Tokens used: {response.usage.total_tokens}")
print(f"Time: {duration:.2f}s")| Provider | ID | Multimodal | Structured Output |
|---|---|---|---|
| OpenAI | openai |
Yes | Yes |
| Anthropic Claude | anthropic |
Yes | Yes (via tools) |
| Google Gemini | genai |
Yes | Yes |
| Mistral | mistral |
Yes | Yes |
| DeepSeek | deepseek |
Yes | Yes |
| Qwen | qwen |
Yes | Yes |
| OpenRouter | openrouter |
Yes | Yes |
| sciCORE | scicore |
Yes | Yes |
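Because every provider shares the same interface, switching providers is just a matter of changing the provider ID from the table above. A minimal sketch (the environment variable names and model names are illustrative, not part of the package):

```python
import os

from ai_client import create_ai_client

# The provider ID selects the backend; the calling code stays identical.
for provider_id, model, key_var in [
    ('openai', 'gpt-4o', 'OPENAI_API_KEY'),
    ('anthropic', 'claude-3-5-sonnet-20241022', 'ANTHROPIC_API_KEY'),
]:
    client = create_ai_client(provider_id, api_key=os.environ[key_var])
    response, duration = client.prompt(model, 'Say hello in one word.')
    print(f"{provider_id}: {response.text} ({duration:.2f}s)")
```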
Basic text generation:

```python
from ai_client import create_ai_client
client = create_ai_client('anthropic', api_key='sk-ant-...')
response, duration = client.prompt(
    'claude-3-5-sonnet-20241022',
    'Explain quantum computing in simple terms'
)
print(response.text)
```

Multimodal prompts (text + image):

```python
from ai_client import create_ai_client
client = create_ai_client('openai', api_key='sk-...')
response, duration = client.prompt(
    'gpt-4o',
    'Describe this image in detail',
    images=['path/to/image.jpg']
)
print(response.text)
```

Multiple images in one request:

```python
response, duration = client.prompt(
    'gpt-4o',
    'Compare these two images',
    images=['image1.jpg', 'image2.jpg']
)
```

Structured output with a Pydantic model:

```python
from pydantic import BaseModel
from ai_client import create_ai_client
class Person(BaseModel):
    name: str
    age: int
    occupation: str
client = create_ai_client('openai', api_key='sk-...')
response, duration = client.prompt(
    'gpt-4',
    'Extract: John Smith is a 35-year-old software engineer',
    response_format=Person
)
# Parse the response
import json
person_data = json.loads(response.text)
person = Person(**person_data)
print(f"{person.name}, {person.age}, {person.occupation}")import asyncio
Async batch processing:

```python
import asyncio
from ai_client import create_ai_client
async def process_batch():
    client = create_ai_client('openai', api_key='sk-...')

    # Process multiple prompts in parallel
    tasks = [
        client.prompt_async('gpt-4', f'Tell me about {topic}')
        for topic in ['Python', 'JavaScript', 'Rust']
    ]
    results = await asyncio.gather(*tasks)

    for response, duration in results:
        print(f"({duration:.2f}s) {response.text[:100]}...")
asyncio.run(process_batch())
```
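`asyncio.gather` fires all requests at once, which can trip provider rate limits on large batches. One way to cap concurrency is a standard `asyncio.Semaphore`; a sketch (the limit of 5 is arbitrary):

```python
import asyncio

async def bounded_batch(client, topics, limit=5):
    # Allow at most `limit` requests in flight at any time.
    semaphore = asyncio.Semaphore(limit)

    async def one(topic):
        async with semaphore:
            return await client.prompt_async('gpt-4', f'Tell me about {topic}')

    return await asyncio.gather(*(one(topic) for topic in topics))
```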
Custom base URLs (OpenRouter, sciCORE, or any OpenAI-compatible API):

```python
from ai_client import create_ai_client

# OpenRouter - access to 100+ models
client = create_ai_client(
    'openrouter',
    api_key='sk-or-...',
    base_url='https://openrouter.ai/api/v1',
    default_headers={
        "HTTP-Referer": "https://your-site.com",
        "X-Title": "Your App"
    }
)
response, _ = client.prompt('anthropic/claude-3-opus', 'Hello!')
# sciCORE (University HPC)
client = create_ai_client(
    'scicore',
    api_key='your-key',
    base_url='https://llm-api-h200.ceda.unibas.ch/litellm/v1'
)
response, _ = client.prompt('deepseek/deepseek-chat', 'Hello!')
```

Every call returns a rich response object:

```python
response, duration = client.prompt('gpt-4', 'Hello')
# Response text
print(response.text)
# Token usage
print(f"Input tokens: {response.usage.input_tokens}")
print(f"Output tokens: {response.usage.output_tokens}")
print(f"Total tokens: {response.usage.total_tokens}")
# Metadata
print(f"Model: {response.model}")
print(f"Provider: {response.provider}")
print(f"Finish reason: {response.finish_reason}")
print(f"Duration: {response.duration}s")
# Raw provider response (for detailed analysis)
raw = response.raw_response
# Convert to dict (for JSON serialization)
response_dict = response.to_dict()
```
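Because `to_dict()` returns plain Python types, responses can be appended straight to a JSON Lines log, which is convenient for benchmark runs (the file name is arbitrary):

```python
import json

# One JSON object per line, easy to reload with pandas or jq later.
with open('responses.jsonl', 'a') as f:
    f.write(json.dumps(response.to_dict()) + '\n')
```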
Provider settings can be set when creating a client:

```python
from ai_client import create_ai_client

# OpenAI
client = create_ai_client(
    'openai',
    api_key='sk-...',
    temperature=0.7,
    max_tokens=500,
    frequency_penalty=0.5
)
# Claude
client = create_ai_client(
    'anthropic',
    api_key='sk-ant-...',
    temperature=1.0,
    max_tokens=4096,
    top_k=40
)
# Settings can also be passed per-request
response, _ = client.prompt(
    'gpt-4',
    'Hello',
    temperature=0.9,
    max_tokens=100
)
```

System prompts can be set as a client default and overridden per request:

```python
from ai_client import create_ai_client
client = create_ai_client(
    'openai',
    api_key='sk-...',
    system_prompt="You are a helpful coding assistant specialized in Python."
)
# Override for specific request
response, _ = client.prompt(
    'gpt-4',
    'Write a haiku',
    system_prompt="You are a poetic assistant."
)
```

Perfect for research workflows that need to evaluate multiple models:

```python
from ai_client import create_ai_client
import asyncio
async def benchmark_models():
    providers = [
        ('openai', 'gpt-4'),
        ('anthropic', 'claude-3-5-sonnet-20241022'),
        ('genai', 'gemini-2.0-flash-exp'),
    ]
    prompt = 'Explain quantum entanglement'

    for provider_id, model in providers:
        client = create_ai_client(provider_id, api_key=f'{provider_id}_key')
        response, duration = await client.prompt_async(model, prompt)
        print(f"\n=== {provider_id}/{model} ===")
        print(f"Duration: {duration:.2f}s")
        print(f"Tokens: {response.usage.total_tokens}")
        print(f"Response: {response.text[:200]}...")
asyncio.run(benchmark_models())
```

The package includes built-in retry logic with exponential backoff:

```python
from ai_client import create_ai_client, RateLimitError, APIError
client = create_ai_client('openai', api_key='sk-...')
try:
    response, duration = client.prompt('gpt-4', 'Hello')
    # Automatically retries up to 3 times on rate limit errors
except RateLimitError as e:
    print(f"Rate limited after retries: {e}")
except APIError as e:
    print(f"API error: {e}")
except Exception as e:
print(f"Unknown error: {e}")from ai_client import create_ai_client
Listing available models:

```python
from ai_client import create_ai_client

client = create_ai_client('openai', api_key='sk-...')
models = client.get_model_list()
for model_id, created_date in models:
print(f"{model_id} (created: {created_date})")client = create_ai_client('openai', api_key='sk-...')
Checking multimodal support:

```python
client = create_ai_client('openai', api_key='sk-...')

if client.has_multimodal_support():
print("This provider supports images!")ai_client/
Package structure:

```
ai_client/
    __init__.py          # Package exports
    base_client.py       # BaseAIClient + factory
    response.py          # LLMResponse, Usage dataclasses
    utils.py             # Retry logic, exceptions, utilities
    openai_client.py     # OpenAI implementation
    claude_client.py     # Anthropic Claude
    gemini_client.py     # Google Gemini
    mistral_client.py    # Mistral AI
    deepseek_client.py   # DeepSeek
    qwen_client.py       # Qwen
```
Requirements:

- Python >=3.9
- anthropic ~=0.71.0
- openai ~=2.6.1
- mistralai ~=1.9.11
- google-genai ~=1.46.0
- requests ~=2.32.5
Development setup:

```bash
# Clone the repository
git clone https://github.com/RISE-UNIBAS/generic-llm-api-client.git
cd generic-llm-api-client
# Install in development mode
pip install -e ".[dev]"
# Run tests
pytest
# Run integration tests (requires API keys)
pytest -m integration
# Format code
black ai_client tests
# Type checking
mypy ai_client/
```

Further documentation:

- EXAMPLES.md - Comprehensive usage examples
- PUBLISHING.md - Guide for maintainers on publishing releases
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this package in your research, please cite:
```bibtex
@software{generic_llm_api_client,
  author = {Sorin Marti},
  title = {Generic LLM API Client: A Unified Interface for Multiple LLM Providers},
  year = {2025},
  url = {https://github.com/RISE-UNIBAS/generic-llm-api-client}
}
```

- GitHub Issues: Report bugs or request features
- Documentation: Full documentation
Planned features:

- Tool use / function calling support
- Streaming support
- Conversation history management
- More providers (Cohere, AI21, etc.)
- Cost estimation utilities
- Prompt caching support