LLM APIs with permanent free tiers for text inference.
All endpoints are OpenAI SDK-compatible unless noted. Each link points to the provider's API key page.
APIs run by the companies that train or fine-tune the models themselves.
AI21 Labs 🇮🇱
$10 trial credits at signup, no credit card. Credits expire in 3 months. Covers Jamba Large and Jamba Mini.
Base URL: https://api.ai21.com/studio/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Jamba Large 1.7 | 256K | 4K | Text | 200 RPM, 10 RPS |
| Jamba Mini 2 | 256K | 4K | Text | 200 RPM, 10 RPS |
Aion Labs 🇮🇱
Free daily token allowance, no credit card required. Specialized for roleplay and storytelling.
Base URL: https://api.aionlabs.ai/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| aion-2.0 | 131K | ~32K | Text (roleplay) | Daily token allowance |
| aion-1.0 | 131K | ~32K | Text | Daily token allowance |
| aion-1.0-mini | 131K | ~32K | Text | Daily token allowance |
1M free tokens per Qwen model on signup, expires in 90 days (International / Singapore region). No credit card required. 1
Base URL: https://dashscope-intl.aliyuncs.com/compatible-mode/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Qwen3-Max | 128K | 32K | Text | Tiered by region |
| Qwen3-Plus | 1M | 32K | Text | Tiered by region |
| Qwen3-VL-Plus | 128K | 8K | Text + Vision | Tiered by region |
| Qwen3-Coder-Plus | 256K | 8K | Text (code) | Tiered by region |
| QwQ-Plus | 131K | 32K | Text (reasoning) | Tiered by region |
Cohere 🇨🇦
Free "Trial" API key, no credit card. 1,000 API calls/month. Non-commercial use only.
Base URL: https://api.cohere.com/v2
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Command A (111B) | 256K | 4K | Text | 20 RPM |
| Command R+ | 128K | 4K | Text | 20 RPM |
| Command R | 128K | 4K | Text | 20 RPM |
| Command R7B | 128K | 4K | Text | 20 RPM |
| Embed 4 | — | — | Embeddings (Text + Image) | 2,000 inputs/min |
| Rerank 3.5 | — | — | Reranking | 10 RPM |
DeepSeek 🇨🇳
5M free tokens on signup, no credit card. Credits expire 30 days after signup; pay-as-you-go after. Prompts may be used for training unless opted out. 2
Base URL: https://api.deepseek.com/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| deepseek-chat (V3.2) | 128K | 8K | Text | Dynamic |
| deepseek-reasoner (R1) | 128K | 8K | Text (reasoning) | Dynamic |
Free tier unavailable in EU/UK/Switzerland. Free-tier prompts may be used by Google to improve products. 3
Base URL: https://generativelanguage.googleapis.com/v1beta
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Gemini 2.5 Pro | 2M | 65K | Text + Image + Audio + Video | 5 RPM, 100 RPD |
| Gemini 2.5 Flash | 1M | 65K | Text + Image + Audio + Video | 10 RPM, 250 RPD |
| Gemini 2.5 Flash-Lite | 1M | 65K | Text + Image + Audio + Video | 15 RPM, 1,000 RPD |
| Gemini 3 Flash (Preview) | 1M | 65K | Text + Image + Audio + Video | Preview limits |
Mistral AI 🇫🇷
Free "Experiment" plan, no credit card. ~1B tokens/month. Prompts may be used to improve models.
Base URL: https://api.mistral.ai/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Mistral Small 4 | 256K | 256K | Text + Image + Code | ~1 RPS, 500K TPM |
| Mistral Medium 3 | 128K | 128K | Text | ~1 RPS, 500K TPM |
| Mistral Large 3 | 256K | 256K | Text | ~1 RPS, 500K TPM |
| Mistral Nemo (12B) | 128K | 128K | Text | ~1 RPS, 500K TPM |
| Codestral | 256K | 256K | Code | ~1 RPS, 500K TPM |
| Pixtral Large | 128K | 128K | Text + Image | ~1 RPS, 500K TPM |
xAI 🇺🇸
$25 sign-up credit, no credit card required. One-time only; additional $150/month available via opt-in data-sharing program (requires prior spend). 4
Base URL: https://api.x.ai/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| grok-4.3 | 1M | ~32K | Text | Credit-based |
| grok-4.1-fast | 2M | ~32K | Text | Credit-based |
| grok-3-mini | 131K | 8K | Text | Credit-based |
Permanent free models, no credit card required.
Base URL: https://open.bigmodel.cn/api/paas/v4
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| GLM-4.7-Flash | 200K | 128K | Text | 1 concurrent request |
| GLM-4.5-Flash | 128K | ~8K | Text | 1 concurrent request |
| GLM-4.6V-Flash | 128K | ~4K | Text + Image | 1 concurrent request |
Third-party platforms that host open-weight models from various sources.
Cerebras 🇺🇸
Free tier, no credit card. Ultra-fast inference (~2,600 tok/s). 1M tokens/day cap. 8K context cap on free tier. llama3.1-8b scheduled for deprecation May 27, 2026.
Base URL: https://api.cerebras.ai/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| llama-3.3-70b | 128K (8K on free) | 8K | Text | 30 RPM, 14,400 RPD, 1M TPD |
| gpt-oss-120b | 128K (8K on free) | 8K | Text | 30 RPM, 14,400 RPD, 1M TPD |
| qwen-3-235b-a22b-instruct-2507 | 131K (8K on free) | 8K | Text | 30 RPM, 14,400 RPD, 1M TPD |
| qwen-3-32b | 131K (8K on free) | 8K | Text | 30 RPM, 14,400 RPD, 1M TPD |
| llama-4-scout-17b-16e-instruct | 128K (8K on free) | 8K | Text + Vision | 30 RPM, 14,400 RPD, 1M TPD |
| zai-glm-4.7 | 128K (8K on free) | 8K | Text | 10 RPM, 100 RPD, 1M TPD |
10,000 Neurons/day free. 50+ models available on free tier.
Base URL: https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
@cf/meta/llama-3.3-70b-instruct-fp8-fast |
131K | Shared w/ context | Text | 10K neurons/day (shared) |
@cf/meta/llama-3.1-8b-instruct-fp8-fast |
131K | Shared w/ context | Text | 10K neurons/day (shared) |
@cf/meta/llama-3.2-11b-vision-instruct |
131K | Shared w/ context | Text + Vision | 10K neurons/day (shared) |
@cf/meta/llama-4-scout-17b-16e-instruct |
Up to 10M | Shared w/ context | Multimodal | 10K neurons/day (shared) |
@cf/mistralai/mistral-small-3.1-24b-instruct |
128K | Shared w/ context | Text | 10K neurons/day (shared) |
@cf/google/gemma-4-26b-a4b-it |
256K | Shared w/ context | Text | 10K neurons/day (shared) |
@cf/moonshotai/kimi-k2.5 |
256K | Shared w/ context | Text + Vision | 10K neurons/day (shared) |
@cf/deepseek-ai/deepseek-r1-distill-qwen-32b |
32K | Shared w/ context | Text (reasoning) | 10K neurons/day (shared) |
| + 42 more models | Varies | Varies | Text, Image, Audio, Embeddings | 10K neurons/day (shared) |
Free prototyping for all GitHub users. 45+ models. Per-request limits (8K in / 4K out).
Base URL: https://models.github.ai/inference
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| gpt-5 | 200K | 32K | Text | 10 RPM, 50 RPD |
| gpt-4.1 | 1M | 32K | Text | 10 RPM, 50 RPD |
| gpt-4.1-mini | 1M | 32K | Text | 15 RPM, 150 RPD |
| gpt-4o | 128K | 16K | Text + Vision | 10 RPM, 50 RPD |
| o4-mini | 200K | 100K | Text (reasoning) | 10 RPM, 50 RPD |
| Llama-4-Scout-17B-16E | 512K | ~4K | Text + Vision | 15 RPM, 150 RPD |
| Llama-4-Maverick-17B-128E | 256K | ~4K | Text + Vision | 10 RPM, 50 RPD |
| Meta-Llama-3.3-70B | 131K | ~4K | Text | 15 RPM, 150 RPD |
| DeepSeek-R1 | 64K | 8K | Text (reasoning) | 15 RPM, 150 RPD |
| Mistral-Small-3.1 | 128K | ~4K | Text + Vision | 15 RPM, 150 RPD |
| + 35 more models | Varies | Varies | Text / Image | Varies by tier |
Groq 🇺🇸
Free tier, no credit card. Ultra-fast LPU inference. 5
Base URL: https://api.groq.com/openai/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| llama-3.3-70b-versatile | 131K | 32K | Text | 30 RPM, 14,400 RPD |
| llama-3.1-8b-instant | 131K | 131K | Text | 30 RPM, 14,400 RPD |
| llama-4-scout-17b-16e-instruct | 131K | 8K | Text + Vision | 30 RPM, 14,400 RPD |
| llama-4-maverick-17b-128e-instruct | 131K | 8K | Text + Vision | 15 RPM, 500 RPD |
| qwen3-32b | 131K | 131K | Text | 30 RPM, 14,400 RPD |
| gpt-oss-120b | 131K | 32K | Text | 30 RPM, 14,400 RPD |
| kimi-k2-instruct | 262K | 262K | Text | 30 RPM, 14,400 RPD |
| deepseek-r1-distill-70b | 131K | 8K | Text | 30 RPM, 14,400 RPD |
| whisper-large-v3 | — | — | Audio → Text | 20 RPM, 2,000 RPD |
| whisper-large-v3-turbo | — | — | Audio → Text | 20 RPM, 2,000 RPD |
Hugging Face 🇺🇸
100K monthly Inference Provider credits for free users. Routes to Fireworks, Together, Hyperbolic, Nebius, Novita, DeepInfra and others. Thousands of models.
Base URL: https://router.huggingface.co/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Meta-Llama-3.1-8B-Instruct | 128K | ~4K | Text | Credit-metered |
| Mistral-7B-Instruct-v0.3 | 32K | ~4K | Text | Credit-metered |
| Mixtral-8x7B-Instruct-v0.1 | 32K | ~4K | Text | Credit-metered |
| Phi-3.5-mini-instruct | 128K | ~4K | Text | Credit-metered |
| Qwen2.5-7B-Instruct | 131K | ~4K | Text | Credit-metered |
| + thousands of community models | Varies | Varies | Text, Image, Audio, Embeddings | 100K credits/month free |
Kilo Code 🇺🇸
Free models with no credit card required. kilo-auto/free auto-router routes to minimax/minimax-m2.5:free (80%) and stepfun/step-3.5-flash:free (20%). 6
Base URL: https://api.kilo.ai/api/gateway
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
x-ai/grok-code-fast-1:free |
256K | — | Text (code) | ~200 req/hr |
minimax/minimax-m2.5:free |
196K | 8K | Text | ~200 req/hr |
bytedance-seed/dola-seed-2.0-pro:free |
— | — | Text | ~200 req/hr |
nvidia/nemotron-3-super-120b-a12b:free |
262K | 32K | Text | ~200 req/hr |
arcee-ai/trinity-large-thinking:free |
— | — | Text (reasoning) | ~200 req/hr |
openrouter/free |
Varies | Varies | Text | ~200 req/hr |
LLM7.io 🇬🇧
Zero-friction API gateway. No registration needed for basic access. 30+ models. GDPR-compliant.
Base URL: https://api.llm7.io/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| deepseek-r1-0528 | — | — | Text (reasoning) | 30 RPM (120 with token) |
| deepseek-v3-0324 | — | — | Text | 30 RPM (120 with token) |
| gemini-2.5-flash-lite | — | — | Text + Vision | 30 RPM (120 with token) |
| gpt-4o-mini | — | — | Text + Vision | 30 RPM (120 with token) |
| mistral-small-3.1-24b | 32K | — | Text | 30 RPM (120 with token) |
| qwen2.5-coder-32b | — | — | Text (code) | 30 RPM (120 with token) |
| + ~24 more models | Varies | Varies | Text | 30 RPM (120 with token) |
ModelScope 🇨🇳
Free API-Inference for registered users. Requires Alibaba Cloud account binding + real-name verification. 7
Base URL: https://api-inference.modelscope.cn/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
Qwen/Qwen3.5-35B-A3B |
— | — | Text + Vision | 2,000 RPD total; <=500 RPD/model (dynamic) |
Qwen/Qwen3.5-27B |
— | — | Text | 2,000 RPD total; <=500 RPD/model (dynamic) |
Qwen/Qwen-Image |
— | — | Image Generation | 2,000 RPD total; model/AIGC-specific caps |
| + API-Inference-enabled models | Varies | Varies | LLM, MLLM, AIGC | Dynamic quotas + dynamic concurrency |
Nebius 🇳🇱
$1 free signup credits, no credit card required. 60+ open-source models via OpenAI-compatible API. EU-based. 8
Base URL: https://api.studio.nebius.com/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Meta-Llama-3.3-70B-Instruct | 128K | ~8K | Text | Tier-based |
| DeepSeek-V3-0324 | 128K | ~8K | Text | Tier-based |
| DeepSeek-R1 | 128K | ~32K | Text (reasoning) | Tier-based |
| Qwen3-235B-A22B | 128K | ~32K | Text | Tier-based |
| gpt-oss-120b | 128K | ~32K | Text | Tier-based |
| + 55 more open-source models | Varies | Varies | Text, Vision, Code, Embeddings | Tier-based |
Nscale 🇬🇧
$5 free signup credits, no credit card required. EU-sovereign provider; data centers in Norway. "No rate limits, no cold starts." 9
Base URL: https://inference.api.nscale.com/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Llama-3.3-70B-Instruct | 128K | ~8K | Text | Fair-use |
| Qwen3-Coder-30B-A3B-Instruct | 256K | ~32K | Text (code) | Fair-use |
| DeepSeek-R1-Distill-Llama-70B | 128K | ~32K | Text (reasoning) | Fair-use |
| gpt-oss-120b | 128K | ~32K | Text | Fair-use |
| Qwen3-32B | 128K | ~32K | Text | Fair-use |
NVIDIA NIM 🇺🇸
Free with NVIDIA Developer Program membership. 100+ models. Rate-limited (no daily token cap).
Base URL: https://integrate.api.nvidia.com/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
deepseek-ai/deepseek-r1 |
128K | ~163K | Text (reasoning) | ~40 RPM |
nvidia/llama-3.1-nemotron-ultra-253b-v1 |
128K | 4K | Text | ~40 RPM |
nvidia/nemotron-3-super-120b-a12b |
262K | 262K | Text | ~40 RPM |
nvidia/nemotron-3-nano-30b-a3b |
128K | 32K | Text | ~40 RPM |
meta/llama-3.1-405b-instruct |
128K | 4K | Text | ~40 RPM |
qwen/qwen2.5-72b-instruct |
128K | 8K | Text | ~40 RPM |
google/gemma-4-31b |
128K | 8K | Text | ~40 RPM |
mistralai/mistral-large-2-instruct |
128K | 4K | Text | ~40 RPM |
nvidia/nemotron-nano-2-vl |
128K | 8K | Vision + Text + Video | ~40 RPM |
minimax/minimax-m2.7 |
128K | 8K | Text | ~40 RPM |
| + 90 more models | Varies | Varies | Text, Image, Video, Speech, Embeddings | ~40 RPM |
Ollama Cloud 🇺🇸
Free tier with qualitative usage limits. 400+ models from Ollama library. Not OpenAI SDK-compatible; uses Ollama API. 10
Base URL: https://api.ollama.com
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
gpt-oss:120b-cloud |
128K | Model-dependent | Text | Session/weekly limits (unpublished) |
deepseek-v3.1:671b-cloud |
128K | Model-dependent | Text | Session/weekly limits (unpublished) |
qwen3-coder:480b-cloud |
128K | Model-dependent | Text (code) | Session/weekly limits (unpublished) |
kimi-k2:1t-cloud |
262K | Model-dependent | Text | Session/weekly limits (unpublished) |
glm-4.6:cloud |
128K | Model-dependent | Text | Session/weekly limits (unpublished) |
deepseek-r1:cloud |
128K | Model-dependent | Text (reasoning) | Session/weekly limits (unpublished) |
| + 30 more cloud models | Varies | Varies | Text | Session/weekly limits (unpublished) |
OpenRouter 🇺🇸
~28 free models (marked with :free suffix). OpenAI SDK-compatible. 11
Base URL: https://openrouter.ai/api/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
deepseek/deepseek-r1-0528:free |
163K | ~163K | Text (reasoning) | 20 RPM, 50 RPD |
deepseek/deepseek-chat-v3.1:free |
163K | 163K | Text | 20 RPM, 50 RPD |
qwen/qwen3-235b-a22b:free |
128K | ~32K | Text | 20 RPM, 50 RPD |
qwen/qwen3-coder-480b-a35b:free |
262K | ~32K | Text (code) | 20 RPM, 50 RPD |
meta-llama/llama-4-scout:free |
10M | 16K | Multimodal | 20 RPM, 50 RPD |
meta-llama/llama-4-maverick:free |
1M | 16K | Multimodal | 20 RPM, 50 RPD |
meta-llama/llama-3.3-70b-instruct:free |
65K | ~16K | Text | 20 RPM, 50 RPD |
google/gemma-4-31b-it:free |
256K | ~8K | Multimodal | 20 RPM, 50 RPD |
nvidia/nemotron-3-super-120b-a12b:free |
1M | ~32K | Text | 20 RPM, 50 RPD |
openai/gpt-oss-120b:free |
131K | 131K | Text | 20 RPM, 50 RPD |
minimax/minimax-m2.5:free |
196K | 8K | Text | 20 RPM, 50 RPD |
mistralai/devstral-2512:free |
256K | ~32K | Text | 20 RPM, 50 RPD |
| + ~16 more free models | Varies | Varies | Text / Image | 20 RPM, 50 RPD |
Free anonymous tier (no API key, no signup): 2 RPM per IP per model. 40+ open-weight models hosted in EU. OpenAI SDK-compatible. 12
Base URL: https://oai.endpoints.kepler.ai.cloud.ovh.net/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
| Meta-Llama-3_3-70B-Instruct | 131K | ~4K | Text | 2 RPM (anonymous) |
| Meta-Llama-3_1-8B-Instruct | 131K | ~4K | Text | 2 RPM (anonymous) |
| DeepSeek-R1-Distill-Llama-70B | 131K | ~32K | Text (reasoning) | 2 RPM (anonymous) |
| Qwen3-32B | 131K | ~32K | Text | 2 RPM (anonymous) |
| Qwen3-Coder-30B-A3B-Instruct | 262K | ~32K | Text (code) | 2 RPM (anonymous) |
| Qwen2.5-VL-72B-Instruct | 128K | ~8K | Text + Vision | 2 RPM (anonymous) |
| Mixtral-8x7B-Instruct-v0.1 | 32K | ~4K | Text | 2 RPM (anonymous) |
| Mistral-Nemo-Instruct-2407 | 128K | ~4K | Text | 2 RPM (anonymous) |
| Qwen3Guard-Gen-8B | 32K | ~4K | Text (safety guard) | 2 RPM (anonymous) |
| Qwen3Guard-Gen-0.6B | 32K | ~4K | Text (safety guard) | 2 RPM (anonymous) |
| + 30 more models | Varies | Varies | Text, Vision, Code, Image, Speech | 2 RPM (anonymous) |
SiliconFlow 🇨🇳
3 permanently free models. Free tier capped at 50 req/day; ≥10 CNY lifetime purchase raises cap to 1,000/day. 200+ paid models also available.
Base URL: https://api.siliconflow.cn/v1
| Model Name | Context | Max Output | Modality | Rate Limit |
|---|---|---|---|---|
Qwen/Qwen3-8B |
131K | 131K | Text | 30 RPM, 60K TPM |
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B |
131K | Configurable | Text (reasoning) | 30 RPM, 60K TPM |
deepseek-ai/DeepSeek-OCR |
— | 8K | Vision (OCR) | 30 RPM, 60K TPM |
| Abbreviation | Meaning |
|---|---|
| RPM | Requests per minute |
| RPD | Requests per day |
| TPM | Tokens per minute |
| TPD | Tokens per day |
| RPS | Requests per second |
Know a free tier that's missing? Open a PR. Include the provider, endpoint, rate limits (link to their docs), and a few notable models. Trial credits and time-limited promos don't count.
Footnotes
-
Free quota is signup-only with 90-day expiration and only granted in the Singapore / International region. Alibaba Cloud account requires phone/email verification but no credit card. After exhaustion, pay-as-you-go applies. Use the international endpoint
dashscope-intl.aliyuncs.com; the China region (dashscope.aliyuncs.com) requires real-name verification. ↩ -
DeepSeek grants 5M free tokens at signup with a 30-day expiration. After expiry, pay-as-you-go applies. No credit card required at signup; prompts may be used to improve models unless explicitly opted out in account settings. ↩
-
Free tier not available in the EU, UK, or Switzerland (available regions). ↩
-
xAI's $25 sign-up credit is one-time. Users who opt into the data-sharing program (prompts logged) receive an additional $150/month in credits, but the program requires $5 of prior spend before activation, so it is not a pure free tier. Several older Grok models (grok-4, grok-4-fast, grok-4-1-fast) were retired on May 15, 2026 and now redirect to grok-4.3 (models). ↩
-
Groq rate limits vary by model. Llama 4 Maverick is limited to 500 RPD. Most other models get 14,400 RPD (rate limits). ↩
-
Kilo Code free model list may change over time. nvidia/nemotron-3-super-120b-a12b:free is for trial use only — prompts are logged by NVIDIA. Auto-router
kilo-auto/freeroutes to minimax/minimax-m2.5:free (80%) and stepfun/step-3.5-flash:free (20%). ↩ -
API-Inference is free for registered users. Current published limits are 2,000 requests/day per user (total across models), with per-model daily quotas dynamically adjusted and capped at 500; concurrency is also dynamically rate-limited. Requires Alibaba Cloud account binding and real-name verification (limits, intro). ↩
-
Nebius grants $1 in free credits at signup, usable without a payment method. Credit card required to top up after exhaustion. Promo codes have expiration dates; the base $1 credit typically does not expire. ↩
-
Nscale grants $5 in free signup credits with no credit card required. Credits typically expire within 30–90 days (check console). Credit card required to top up. Pay-per-token after free credits exhausted. EU-sovereign, with data centers in Norway. ↩
-
Ollama Cloud measures usage by GPU time, not tokens or requests. Free tier described as "light usage" with session limits resetting every 5 hours and weekly limits every 7 days. Pro (50x more) and Max (250x more) plans available. Not OpenAI SDK-compatible; uses the Ollama API. ↩
-
Free models default to 50 RPD per model. A one-time purchase of $10+ in credits unlocks 1,000 RPD for free models. OpenRouter also offers a Free Models Router (
openrouter/free) and model fallbacks for chaining models in priority order. Free providers may log prompts for training. ↩ -
OVHcloud AI Endpoints offers a permanent free anonymous tier (2 requests per minute per IP, per model) with no signup or API key required — click "Get your free token" on the OVHcloud AI Endpoints site. Higher rate limits (400 RPM per Public Cloud project per model) require an API key and are billed pay-as-you-go per token; new Public Cloud accounts get up to $200 in free trial credits. Models are hosted in EU data centers. ↩