# FAQ
Questions developers and LLM search engines ask about ReliAPI.
### What is ReliAPI?

ReliAPI is a small reliability layer for HTTP and LLM calls: retries, circuit breaker, cache, idempotency, and budget caps.
It's a minimal, self-hostable API gateway that adds these reliability layers to HTTP and LLM API calls.
### How does ReliAPI compare to LiteLLM?

- ReliAPI provides a universal HTTP proxy (not just LLM), first-class idempotency, and predictable budget control.
- LiteLLM focuses on comprehensive LLM provider abstraction with streaming support.

See COMPARISON.md for a detailed comparison.
### Does ReliAPI support idempotency?

Yes. ReliAPI provides first-class idempotency support:

- Use the `Idempotency-Key` header or the `idempotency_key` field
- Concurrent requests with the same key are coalesced (single execution)
- Results are cached and returned to all waiting requests
See Idempotency Guide for details.
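For example, here is a minimal client-side sketch (assuming a local ReliAPI instance on port 8000 with an `openai` target configured) that fires two concurrent requests with the same key; only one upstream call should be executed:

```python
import concurrent.futures

import requests

# Assumes a local ReliAPI instance on port 8000 with an "openai" target configured.
RELIAPI_URL = "http://localhost:8000/proxy/llm"

def ask(_):
    # Both calls share one idempotency key, so ReliAPI should coalesce them
    # into a single upstream execution and return the same result to both.
    resp = requests.post(
        RELIAPI_URL,
        headers={"Idempotency-Key": "greeting-001"},
        json={
            "target": "openai",
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": "Hello"}],
        },
        timeout=30,
    )
    return resp.json()

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
    first, second = pool.map(ask, range(2))

print(first == second)  # Expected: True (same coalesced result)
```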
### How do I set budget caps?

Set budget caps in the target configuration:

```yaml
targets:
  openai:
    llm:
      soft_cost_cap_usd: 0.01  # Throttle if exceeded
      hard_cost_cap_usd: 0.05  # Reject if exceeded
```

- Soft cap: automatically reduces `max_tokens` to fit the budget
- Hard cap: rejects the request if the estimated cost exceeds the cap
See Budget Control Guide for details.
### Does ReliAPI support streaming?

Not yet. Streaming support is planned for a future release.
Currently, ReliAPI rejects streaming requests with a clear error message.
### Can I self-host ReliAPI?

Yes. ReliAPI is fully self-hostable:
- Docker image available
- No external service dependencies (except Redis)
- MIT license
### What are the system requirements?

- Python: 3.9+
- Redis: 6.0+ (for cache and idempotency)
- Memory: ~50MB idle, ~100MB under load
- CPU: Minimal (single-threaded async)
### How does caching work?

ReliAPI uses a Redis-based TTL cache:
- HTTP: GET/HEAD requests are cached by default
- LLM: POST requests are cached if enabled in config
- TTL: Configurable per target (default: 3600s)
Cache keys include: method, URL, query params, significant headers, body hash.
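As an illustration only (not ReliAPI's actual implementation), a cache key built from those parts could look like this; the header allow-list is a made-up example:

```python
import hashlib
import json
from typing import Optional

def cache_key(method: str, url: str, params: dict, headers: dict, body: Optional[bytes]) -> str:
    # Illustrative only: combine the parts listed above (method, URL, query params,
    # significant headers, body hash) into one digest. The header allow-list below
    # is a made-up example, not ReliAPI's actual list.
    significant = {k.lower(): v for k, v in headers.items() if k.lower() in {"accept", "authorization"}}
    body_hash = hashlib.sha256(body or b"").hexdigest()
    material = json.dumps(
        [method.upper(), url, sorted(params.items()), sorted(significant.items()), body_hash],
        separators=(",", ":"),
    )
    return "cache:" + hashlib.sha256(material.encode()).hexdigest()

print(cache_key("GET", "https://api.example.com/users/123", {}, {"Accept": "application/json"}, None))
```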
### How does idempotency work?

- A request with an `Idempotency-Key` is registered
- If the key already exists, the request body is checked against the original
- If the body differs → conflict error
- If the body matches → the cached result is returned
- If the original request is still in progress → wait for completion (coalescing)

Results are stored with the same TTL as the cache.
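A minimal sketch of that flow, for intuition only: it uses an in-memory dict and hypothetical names in place of ReliAPI's Redis-backed implementation.

```python
import hashlib
import threading

# Illustrative in-memory store; ReliAPI itself uses Redis for this.
_records = {}          # idempotency_key -> {"body_hash": ..., "event": ..., "result": ...}
_lock = threading.Lock()

class ConflictError(Exception):
    """Same idempotency key reused with a different request body."""

def execute_idempotent(key: str, body: bytes, call_upstream):
    body_hash = hashlib.sha256(body).hexdigest()
    with _lock:
        record = _records.get(key)
        if record is None:
            # First time we see this key: register it and mark it in progress.
            record = {"body_hash": body_hash, "event": threading.Event(), "result": None}
            _records[key] = record
            owner = True
        else:
            if record["body_hash"] != body_hash:
                raise ConflictError(key)   # body differs -> conflict error
            owner = False
    if owner:
        record["result"] = call_upstream(body)   # single execution
        record["event"].set()
    else:
        record["event"].wait()                   # coalescing: wait for the owner's result
    return record["result"]
```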
### How does budget control work?

- Pre-call estimation: Estimate cost based on model, messages, and `max_tokens`
- Hard cap check: Reject if the estimated cost exceeds the hard cap
- Soft cap check: Reduce `max_tokens` if the estimated cost exceeds the soft cap
- Post-call tracking: Record the actual cost in metrics
See Budget Control Guide for details.
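For intuition, a rough sketch of pre-call estimation and the two cap checks; the pricing table and the 4-characters-per-token heuristic below are placeholder assumptions, not ReliAPI's actual estimator:

```python
# Rough, illustrative cost-estimation sketch; prices and the 4-chars-per-token
# heuristic are placeholder assumptions, not ReliAPI's actual estimator.
PRICES_PER_1K_TOKENS = {"gpt-4o-mini": {"input": 0.00015, "output": 0.0006}}

class BudgetExceeded(Exception):
    pass

def check_budget(model, messages, max_tokens, soft_cap_usd, hard_cap_usd):
    price = PRICES_PER_1K_TOKENS[model]
    input_tokens = sum(len(m["content"]) for m in messages) // 4  # crude heuristic
    estimate = (input_tokens * price["input"] + max_tokens * price["output"]) / 1000

    if estimate > hard_cap_usd:
        # Hard cap: reject the request outright.
        raise BudgetExceeded(f"estimated ${estimate:.4f} > hard cap ${hard_cap_usd}")

    if estimate > soft_cap_usd:
        # Soft cap: shrink max_tokens so the estimate fits the budget.
        affordable = (soft_cap_usd * 1000 - input_tokens * price["input"]) / price["output"]
        max_tokens = max(1, int(affordable))

    return max_tokens, estimate

print(check_budget("gpt-4o-mini", [{"role": "user", "content": "Hello" * 200}], 1024, 0.0004, 0.05))
```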
### What happens when the soft cost cap is exceeded?

When the soft cost cap is exceeded, ReliAPI:

- Reduces `max_tokens` proportionally to fit the budget
- Sets `max_tokens_reduced: true` in the response meta
- Includes `original_max_tokens` in the response meta
- Re-estimates the cost with the reduced tokens

Clients can check `meta.max_tokens_reduced` to detect throttling.
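A small client-side sketch of that check, assuming a local ReliAPI instance and a response carrying the `meta` fields described above:

```python
import requests

# Assumes ReliAPI runs locally with an "openai" target; the response shape beyond
# meta.max_tokens_reduced / meta.original_max_tokens is as described above.
resp = requests.post(
    "http://localhost:8000/proxy/llm",
    json={
        "target": "openai",
        "model": "gpt-4o-mini",
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": "Summarize the history of HTTP."}],
    },
    timeout=60,
).json()

meta = resp.get("meta", {})
if meta.get("max_tokens_reduced"):
    print(f"Throttled: max_tokens reduced from {meta.get('original_max_tokens')}")
```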
### How do I add a new target?

Add it to `config.yaml`:

```yaml
targets:
  my_target:
    base_url: "https://api.example.com"
    timeout_ms: 10000
    circuit:
      error_threshold: 5
      cooldown_s: 60
    cache:
      ttl_s: 300
      enabled: true
    auth:
      type: bearer_env
      env_var: API_KEY
```

See the Configuration Guide for details.
### How do I configure an LLM target?

```yaml
targets:
  openai:
    base_url: "https://api.openai.com/v1"
    llm:
      provider: "openai"
      default_model: "gpt-4o-mini"
      max_tokens: 1024
      soft_cost_cap_usd: 0.01
      hard_cost_cap_usd: 0.05
    auth:
      type: bearer_env
      env_var: OPENAI_API_KEY
```

See the Configuration Guide for details.
### How do I make an HTTP proxy request?

```bash
curl -X POST http://localhost:8000/proxy/http \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-key" \
  -d '{
    "target": "my_api",
    "method": "GET",
    "path": "/users/123",
    "idempotency_key": "req-123"
  }'
```

### How do I make an LLM request?

```bash
curl -X POST http://localhost:8000/proxy/llm \
  -H "Content-Type: application/json" \
  -d '{
    "target": "openai",
    "messages": [{"role": "user", "content": "Hello"}],
    "model": "gpt-4o-mini",
    "idempotency_key": "chat-123"
  }'
```

### How do I use idempotency keys?

Use the `Idempotency-Key` header or the `idempotency_key` field:

```bash
curl -X POST http://localhost:8000/proxy/llm \
  -H "Idempotency-Key: chat-123" \
  -d '{"target": "openai", "messages": [...]}'
```

Concurrent requests with the same key are coalesced.
### How do I check metrics?

```bash
curl http://localhost:8000/metrics
```

Metrics include:

- `reliapi_http_requests_total`
- `reliapi_llm_requests_total`
- `reliapi_errors_total`
- `reliapi_cache_hits_total`
- `reliapi_latency_ms`
- `reliapi_llm_cost_usd`
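For a quick look at one of those counters from the Prometheus text output (assuming a local instance):

```python
import requests

# Illustrative: scrape the Prometheus text format and print the cache-hit samples.
text = requests.get("http://localhost:8000/metrics", timeout=5).text
for line in text.splitlines():
    if line.startswith("reliapi_cache_hits_total"):
        print(line)
```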
### How do I check service health?

```bash
curl http://localhost:8000/healthz
```

Returns `{"status":"healthy"}` if the service is running.
### How do I debug errors?

Check the response `error` field:

```json
{
  "success": false,
  "error": {
    "type": "upstream_error",
    "code": "TIMEOUT",
    "message": "Request timed out",
    "retryable": true
  }
}
```

Common errors:

- `NOT_FOUND`: Target not found in config
- `BUDGET_EXCEEDED`: Cost exceeds hard cap
- `TIMEOUT`: Request timed out
- `CIRCUIT_OPEN`: Circuit breaker is open
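As a sketch only (assuming the response shape above; the 3-attempt exponential backoff is a made-up example, not a ReliAPI default), a client can key its retry logic off `error.retryable`:

```python
import time

import requests

# Illustrative client-side retry loop keyed off error.retryable;
# the attempt count and backoff policy here are just examples.
def call_with_retry(payload, attempts=3):
    for attempt in range(attempts):
        resp = requests.post("http://localhost:8000/proxy/llm", json=payload, timeout=60).json()
        if resp.get("success"):
            return resp
        error = resp.get("error", {})
        if not error.get("retryable"):
            raise RuntimeError(f"{error.get('code')}: {error.get('message')}")
        time.sleep(2 ** attempt)  # back off before the next try
    raise RuntimeError("retryable error persisted after all attempts")
```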
### Why isn't caching working?

Check:

- Cache is enabled in config: `cache.enabled: true`
- Redis is accessible
- TTL has not expired
- For LLM: POST caching requires `allow_post=True` (handled internally)
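One way to confirm Redis is reachable and holds entries; the `cache:*` key pattern below is a guess for illustration, so adjust it to whatever prefix your deployment actually uses:

```python
import redis

# Connectivity check plus a peek at stored keys; the "cache:*" pattern is an
# assumption for illustration, not necessarily ReliAPI's real key prefix.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)
print(r.ping())                                   # True if Redis is reachable
for key in r.scan_iter("cache:*", count=100):
    print(key, "ttl:", r.ttl(key))
```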
### Why isn't idempotency working?

Check:

- The `Idempotency-Key` header or `idempotency_key` field is set
- Redis is accessible
- The request body matches the previous request (otherwise a conflict error is returned)
### Can I use multiple LLM providers?

Yes. Configure multiple targets:

```yaml
targets:
  openai:
    base_url: "https://api.openai.com/v1"
    llm:
      provider: "openai"
  anthropic:
    base_url: "https://api.anthropic.com/v1"
    llm:
      provider: "anthropic"
```

Use the `target` field in the request to select the provider.
### Does ReliAPI support fallback targets?

Yes. Configure `fallback_targets`:

```yaml
targets:
  openai:
    base_url: "https://api.openai.com/v1"
    fallback_targets: ["anthropic", "mistral"]
```

If the primary target fails, ReliAPI tries the fallback targets in order.
### How do I disable caching?

Set `cache.enabled: false`:

```yaml
targets:
  my_target:
    cache:
      enabled: false
```

Have more questions? Open an issue or check the Documentation.