AIAPI is a small web service that sends one generation request to different AI providers and returns a standard JSON response. It supports OpenAI, Claude, Gemini, and Microsoft Copilot (Azure OpenAI).
# Go to the project folder
cd aiapi
# Install the dependencies defined in pyproject.toml
uv sync
# Install the development tools too, if you want to work on the project
uv sync --group dev
# Start the app with the main entrypoint
uv run python main.py
# Or run FastAPI directly with Uvicorn
uv run uvicorn infrastructure.api.app:app --reload --host 127.0.0.1 --port 8080
- `uv sync` creates the virtual environment and installs all runtime dependencies.
- `uv sync --group dev` adds tools like `ruff`, `mypy`, and `pre-commit`.
- The application runs on http://127.0.0.1:8080.
This project reads its secrets from a .env file in the project root. Create it like this:
# OpenAI
OPENAI_API_KEY="your_openai_api_key"
# Claude
CLAUDE_API_KEY="your_claude_api_key"
# Gemini
GEMINI_API_KEY="your_gemini_api_key"
# Microsoft Copilot (Azure OpenAI)
COPILOT_API_KEY="your_azure_openai_api_key"
COPILOT_API_ENDPOINT="https://your-resource.openai.azure.com/"
COPILOT_API_VERSION="2024-10-21"
COPILOT_DEPLOYMENT="your_default_deployment_name"
- Go to the OpenAI API keys page: https://platform.openai.com/api-keys.
- Sign in to your OpenAI account.
- Create a new API key in the dashboard.
- Copy the key and paste it into `.env` as `OPENAI_API_KEY`.
- Open the Anthropic Console: https://platform.claude.com/.
- Sign in and go to the keys page: https://platform.claude.com/settings/keys.
- Create a new API key.
- Copy the key and save it in `.env` as `CLAUDE_API_KEY`.
- Open Google AI Studio: https://aistudio.google.com/app/apikey.
- If needed, sign in with your Google account.
- Create or import a Google Cloud project in AI Studio.
- Generate a Gemini API key and save it in `.env` as `GEMINI_API_KEY`.
- Go to the Azure Portal and create an Azure OpenAI resource.
- Once created, open the resource and go to Keys and Endpoint to copy your key and endpoint.
- In Azure OpenAI Studio, deploy a model (e.g. `gpt-4o`) and note the deployment name.
- Set the four values in `.env`:
  - `COPILOT_API_KEY`: the API key from Keys and Endpoint.
  - `COPILOT_API_ENDPOINT`: the endpoint URL (e.g. `https://your-resource.openai.azure.com/`).
  - `COPILOT_API_VERSION`: the REST API version (e.g. `2024-10-21`).
  - `COPILOT_DEPLOYMENT`: the default deployment name used when the request omits `deployment`.
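Before starting the server, it can help to confirm that the variables listed above are actually set. The following is a minimal sketch and not part of the project: `EXPECTED_VARS` and `missing_settings` are hypothetical names, and the sketch assumes the `.env` values have already been loaded into the process environment by whatever loader the project uses.

```python
import os
from typing import Mapping

# Variable names copied from the .env example above, grouped by provider value.
# The grouping itself is an assumption made for this illustration.
EXPECTED_VARS: dict[str, tuple[str, ...]] = {
    "openai_api": ("OPENAI_API_KEY",),
    "claude_api": ("CLAUDE_API_KEY",),
    "gemini_api": ("GEMINI_API_KEY",),
    "copilot_api": (
        "COPILOT_API_KEY",
        "COPILOT_API_ENDPOINT",
        "COPILOT_API_VERSION",
        "COPILOT_DEPLOYMENT",
    ),
}


def missing_settings(provider: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the variables for `provider` that are unset or empty in `env`."""
    return [name for name in EXPECTED_VARS[provider] if not env.get(name)]


if __name__ == "__main__":
    # Print one status line per provider.
    for provider in EXPECTED_VARS:
        missing = missing_settings(provider)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{provider}: {status}")
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary instead of real secrets.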
POST http://127.0.0.1:8080/api/v1/generate
{
"provider": "openai_api",
"model": "gpt-4o-mini",
"prompt": "Write a short summary about hexagonal architecture.",
"temperature": 0.2,
"top_p": 0.9,
"top_k": 40,
"max_tokens": 200
}
For Microsoft Copilot (Azure OpenAI), two additional optional fields are available:
| Field | Type | Description |
|---|---|---|
| `deployment` | string | Azure OpenAI deployment name. Overrides `model` and the `COPILOT_DEPLOYMENT` env var. |
| `api_version` | string | Azure OpenAI REST API version (accepted but deferred to v2; the env-level version is used). |
Example Copilot request:
{
"provider": "copilot_api",
"model": "gpt-4o",
"prompt": "Write a short summary about hexagonal architecture.",
"temperature": 0.2,
"top_p": 0.9,
"max_tokens": 200,
"deployment": "gpt-4o-prod"
}
Example response:
{
"success": true,
"provider": "openai_api",
"model": "gpt-4o-mini",
"result": "Hexagonal architecture keeps business logic isolated from external systems.",
"error": null
}
All provider values are lowercase. Sending an uppercase value (e.g. "OPENAI_API") returns a 422 Unprocessable Entity.
| Provider | provider value |
|---|---|
| OpenAI | openai_api |
| Claude (Anthropic) | claude_api |
| Gemini (Google) | gemini_api |
| Microsoft Copilot (Azure OpenAI) | copilot_api |
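The case sensitivity described above is what you would expect if the `provider` field is validated against an enum of lowercase values. The sketch below illustrates that behaviour with the standard library only; the member names and the `parse_provider` helper are assumptions, since the project's actual `AIProvider` definition is not shown here.

```python
from enum import Enum


class AIProvider(str, Enum):
    """Lowercase provider identifiers, mirroring the table above."""

    OPENAI = "openai_api"
    CLAUDE = "claude_api"
    GEMINI = "gemini_api"
    COPILOT = "copilot_api"


def parse_provider(value: str) -> AIProvider:
    """Look up a provider by exact value; the match is case-sensitive."""
    # Enum lookup raises ValueError for unknown or differently cased values.
    return AIProvider(value)
```

In a FastAPI application, a request model typed with such an enum would typically reject `"OPENAI_API"` during validation, which is consistent with the 422 response mentioned above.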
The API returns structured error bodies for 400 and 500 responses:
{ "code": "UNSUPPORTED_PROVIDER", "message": "Provider 'x' is not supported.", "supported": ["openai_api", "claude_api", "gemini_api", "copilot_api"] }
{ "code": "PROVIDER_MISCONFIGURED", "message": "..." }
{ "code": "INTERNAL_ERROR", "message": "An unexpected error occurred." }
You can call the API directly with curl and format the response with jq:
curl -s -X POST http://127.0.0.1:8080/api/v1/generate \
-H "Content-Type: application/json" \
-d '{
"provider": "gemini_api",
"model": "gemini-2.5-flash",
"prompt": "Escribe un resumen breve sobre Python.",
"temperature": 0.2,
"top_p": 0.9,
"top_k": 20,
"max_tokens": 256
}' | jq .
Microsoft Copilot example:
curl -s -X POST http://127.0.0.1:8080/api/v1/generate \
-H "Content-Type: application/json" \
-d '{
"provider": "copilot_api",
"model": "gpt-4o",
"prompt": "Escribe un resumen breve sobre Python.",
"temperature": 0.2,
"top_p": 0.9,
"max_tokens": 256,
"deployment": "gpt-4o-prod"
}' | jq .
- Make sure the server is running before you send the request.
- Use `jq` to read the JSON response in a nicer format.
- You can change the provider, model, and sampling parameters as needed.
- You can use Postman or other tools to call the API.
- For Copilot, `deployment` takes priority over `model` and `COPILOT_DEPLOYMENT` for selecting the Azure deployment.
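The same call can also be made from Python. This is a minimal sketch using only the standard library; `build_generate_request` and `generate` are hypothetical helper names, and the URL is the one used in the curl examples above.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:8080/api/v1/generate"


def build_generate_request(payload: dict, url: str = API_URL) -> urllib.request.Request:
    """Build the same JSON POST request that the curl examples send."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(payload: dict, url: str = API_URL) -> dict:
    """Send the request and decode the JSON body.

    Requires a running server. urlopen raises urllib.error.HTTPError
    for 4xx and 5xx responses.
    """
    with urllib.request.urlopen(build_generate_request(payload, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `generate({"provider": "gemini_api", "model": "gemini-2.5-flash", "prompt": "Write a short summary about Python."})` should return the decoded response body as a dictionary.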
application/                      # Core use-case logic and data contracts
├── domain/
│   └── ai_provider.py            # AIProvider enum (openai_api, claude_api, gemini_api, copilot_api)
├── dto/
│   ├── ai_request.py             # Request data model (provider, model, prompt, sampling params, deployment, api_version)
│   └── ai_response.py            # Response data model
├── ports/
│   └── ai_provider_port.py       # Provider interface definition
└── services/
    └── ai_service.py             # Service that calls the provider and closes the client
bootstrap/                        # Factory and dependency wiring for adapters
├── ai_factory.py                 # Creates the selected AI provider adapter
└── dependencies.py               # Helper to build service instances
infrastructure/                   # External adapters and API layer
├── adapters/
│   ├── openai_adapter.py         # Adapter for OpenAI API
│   ├── claude_adapter.py         # Adapter for Claude (Anthropic)
│   ├── gemini_adapter.py         # Adapter for Gemini (Google)
│   └── copilot_adapter.py        # Adapter for Microsoft Copilot (Azure OpenAI)
└── api/
    └── app.py                    # FastAPI application and route definitions
config.py                         # Global configuration settings (includes COPILOT_* env-var names)
main.py                           # Application entrypoint (starts the server)
pyproject.toml                    # Project metadata and dependencies
This project includes custom Claude Code agents to accelerate development. Each agent has a specific role and should be used at the right stage of the workflow.
`software-architect` is a senior software architect agent that designs features and systems from scratch. It produces a complete RFC document: use case diagrams, package diagrams, class diagrams, sequence diagrams, entity-relationship diagrams, JSON data models, and OpenAPI/Swagger contracts.
When to use it:
- You need to add a new feature and want a full architectural design before writing code.
- You need to define packages, classes, data models, API contracts, or design patterns.
- You want a structured RFC that the entire team (Backend, Frontend, QA) can follow.
⚠️ This agent does not write production code. It delivers the design document and defers implementation to the team or the `backend-developer` agent.
How to use it:
@software-architect Design a new provider adapter for Microsoft Copilot LLM API following the hexagonal architecture already in place.
The agent will explore the codebase, ask clarifying questions, evaluate alternatives, and write an RFC file under docs/rfc/.
`backend-developer` is a senior Python backend developer agent that turns an already-decided architecture design (RFC, ADR) into functional, tested code. It advances step by step, pausing for human validation between changes.
When to use it:
- An RFC or ADR has been approved and you want to implement it.
- You need to add REST endpoints, business logic, data models, or unit/integration tests.
- You want code that follows the project's conventions, SOLID principles, and quality gates.
⚠️ This agent does not make architecture decisions. If a change requires rethinking the design, it stops and defers to the architect.
How to use it:
@backend-developer Implement the Microsoft Copilot adapter described in docs/rfc/20250601-copilot-adapter.md.
The agent will read the RFC, explore the codebase, present an implementation plan for your approval, and then implement each phase, waiting for your validation before continuing.
1. 🏗️ software-architect → RFC document in docs/rfc/
2. ✅ Human review → Approve or adjust the RFC
3. 💻 backend-developer → Implementation + tests, phase by phase
4. ✅ Human validation → Review each phase before the next one
This project uses hexagonal architecture to keep the core logic independent from external services.
- `application` contains the use case logic and the data contracts.
- `application/ports` defines the interface that every AI provider must follow.
- `bootstrap` wires the selected provider with the service layer.
- `infrastructure/adapters` contains the concrete clients for OpenAI, Claude, Gemini, and Microsoft Copilot (Azure OpenAI).
- `infrastructure/api` exposes the FastAPI web layer.
- `main.py` starts the server.
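The layering described above can be illustrated with a small port-and-adapter sketch. All names and signatures here are hypothetical (the project's real port in `application/ports/ai_provider_port.py` is not shown), and `EchoAdapter` is a stand-in that calls no external service.

```python
from typing import Protocol


class AIProviderPort(Protocol):
    """Interface every provider adapter implements (assumed signature)."""

    def generate(self, prompt: str) -> str: ...

    def close(self) -> None: ...


class EchoAdapter:
    """Stand-in adapter for illustration; a real one would call a provider SDK."""

    def __init__(self) -> None:
        self.closed = False

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"

    def close(self) -> None:
        self.closed = True


class AIService:
    """Core service: depends only on the port, never on a concrete adapter."""

    def __init__(self, provider: AIProviderPort) -> None:
        self._provider = provider

    def run(self, prompt: str) -> str:
        try:
            return self._provider.generate(prompt)
        finally:
            # Release the client even if generate() raises.
            self._provider.close()
```

Because `AIService` only knows the port, swapping OpenAI for Gemini would mean passing a different adapter instance; the service code stays unchanged.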
- OpenAI: forwards `model`, `prompt`, `temperature`, `top_p`. OpenAI Chat/Completions supports `temperature` and `top_p`, but does not support `top_k` in the current chat API.
- Claude (Anthropic): forwards `model`, `prompt`, `temperature`, `max_tokens`, `top_k`. Claude supports `temperature` or `top_p`, plus `top_k` (the `top_p` and `temperature` parameters cannot be combined).
- Gemini (Google): forwards `model`, `prompt`, `temperature`, `top_p`, `top_k`. Gemini supports `temperature`, `top_p`, and `top_k` via `GenerateContentConfig`.
- Microsoft Copilot (Azure OpenAI): forwards `deployment` (or falls back to `model` / `COPILOT_DEPLOYMENT`), `prompt`, `temperature`, `top_p`, `max_tokens`. Does not support `top_k`: Azure OpenAI rejects it with a 400. Also accepts `api_version` in the request body (deferred to v2; the env-level version is used in v1).
| Parameter | OpenAI | Claude | Gemini | Copilot (Azure) |
|---|---|---|---|---|
| `model` / `deployment` | ✓ | ✓ | ✓ | ✓ (see note) |
| `prompt` | ✓ | ✓ | ✓ | ✓ |
| `temperature` | ✓ | ✓ (XOR `top_p`) | ✓ | ✓ |
| `top_p` | ✓ | ✓ (XOR `temperature`) | ✓ | ✓ |
| `top_k` | ✗ | ✓ | ✓ | ✗ |
| `max_tokens` | ✗ | ✓ (required) | ✗ | ✓ |
| `deployment` | ✗ | ✗ | ✗ | ✓ (overrides `model`) |
| `api_version` | ✗ | ✗ | ✗ | accepted, v2 |
Copilot deployment note: Azure OpenAI requires a deployment name (not a raw model ID). The adapter resolves it in this order: `deployment` field → `model` field → `COPILOT_DEPLOYMENT` env var.
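The resolution order in the note above amounts to "first non-empty value wins". A minimal sketch, assuming a hypothetical helper named `resolve_deployment` (the adapter's real implementation is not shown here):

```python
from typing import Optional


def resolve_deployment(
    deployment: Optional[str],
    model: Optional[str],
    env_default: Optional[str],
) -> str:
    """Pick the first non-empty value: deployment, then model, then env default."""
    for candidate in (deployment, model, env_default):
        if candidate:
            return candidate
    # Raising here is an assumption about how a missing name would be handled.
    raise ValueError("No Azure OpenAI deployment name could be resolved.")
```

For the Copilot request shown earlier, `resolve_deployment("gpt-4o-prod", "gpt-4o", None)` selects `"gpt-4o-prod"`, because the `deployment` field is checked first.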
The parameters control sampling randomness and diversity in a similar way across providers, but availability differs:
- `temperature` (all providers): float that scales the model probability distribution. Lower (e.g. 0.2) → more deterministic; higher (e.g. 1.0) → more random.
- `top_p` (OpenAI, Gemini, Copilot): nucleus sampling threshold. The model samples from the smallest token set whose cumulative probability ≥ `top_p`.
- `top_k` (Claude, Gemini): hard cutoff to the top `k` most probable tokens; sampling is limited to those tokens.
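To make these definitions concrete, here is a small pure-Python sketch of the three mechanisms. It only illustrates the textbook definitions; the providers apply sampling server-side and their exact implementations may differ. All function names are hypothetical.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Divide logits by the temperature, then normalize to probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_indices(probs: list[float], k: int) -> list[int]:
    """Indices of the k most probable tokens, most probable first."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:k]


def top_p_indices(probs: list[float], p: float) -> list[int]:
    """Smallest prefix of the ranked tokens whose cumulative probability >= p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: list[int] = []
    cumulative = 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return kept
```

With probabilities `[0.5, 0.3, 0.15, 0.05]`, `top_k_indices(probs, 2)` keeps the first two tokens, while `top_p_indices(probs, 0.9)` keeps the first three, since the first two only reach a cumulative probability of about 0.8.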
Summary: OpenAI (temperature, top_p), Claude (temperature, top_k), Gemini (temperature, top_p, top_k), Copilot/Azure (temperature, top_p, max_tokens; no top_k).
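One way to picture the per-provider differences is as a lookup table that filters a request down to the fields each adapter forwards. This is a sketch under assumptions: `FORWARDED_FIELDS` and `forwarded_params` are hypothetical names, the field lists are copied from the per-provider notes above, and the real adapters may be structured differently.

```python
from typing import Any

# Fields each adapter forwards, as listed in the per-provider notes above.
# For copilot_api the deployment name is resolved separately, so it is omitted.
FORWARDED_FIELDS: dict[str, tuple[str, ...]] = {
    "openai_api": ("model", "prompt", "temperature", "top_p"),
    "claude_api": ("model", "prompt", "temperature", "max_tokens", "top_k"),
    "gemini_api": ("model", "prompt", "temperature", "top_p", "top_k"),
    "copilot_api": ("prompt", "temperature", "top_p", "max_tokens"),
}


def forwarded_params(request: dict[str, Any]) -> dict[str, Any]:
    """Keep only the fields that the selected provider's adapter forwards."""
    allowed = FORWARDED_FIELDS[request["provider"]]
    # Skip fields that are absent or explicitly set to None.
    return {key: request[key] for key in allowed if request.get(key) is not None}
```

Applied to the OpenAI request example shown earlier, this drops `top_k` and `max_tokens` and keeps `model`, `prompt`, `temperature`, and `top_p`.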