cheahjs/free-llm-api-resources

Free LLM API resources

This lists various services that provide free access or credits towards API-based LLM usage.

Note

Please don't abuse these services, else we might lose them.

Warning

This list explicitly excludes any services that are not legitimate (e.g., services that reverse-engineer an existing chatbot).

Free Providers

Limits:

20 requests/minute
50 requests/day
1000 requests/day with $10 lifetime topup

Models share a common quota.
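
As a rough illustration of staying inside limits like the ones above, here is a minimal client-side sketch. It assumes sliding windows, which may not match how the provider actually counts or resets its quota, and the class name `SlidingWindowLimiter` is hypothetical. Because the models share a common quota, a single limiter instance is used for every model.

```python
from collections import deque
import time


class SlidingWindowLimiter:
    """Checks request timestamps against several (max_requests, window_seconds) limits.

    Hypothetical helper: the provider's real accounting may differ.
    """

    def __init__(self, limits, clock=time.monotonic):
        # Sort by window length so the longest window is last.
        self.limits = sorted(limits, key=lambda pair: pair[1])
        self.clock = clock
        self.timestamps = deque()  # times of accepted requests, oldest first

    def try_acquire(self):
        """Return True and record the request if every limit allows it, else False."""
        now = self.clock()
        longest_window = self.limits[-1][1]
        # Forget requests that are older than the longest window.
        while self.timestamps and self.timestamps[0] <= now - longest_window:
            self.timestamps.popleft()
        for max_requests, window in self.limits:
            recent = sum(1 for t in self.timestamps if t > now - window)
            if recent >= max_requests:
                return False
        self.timestamps.append(now)
        return True


# 20 requests/minute and 50 requests/day, shared by all models.
shared_limiter = SlidingWindowLimiter([(20, 60), (50, 86_400)])

if shared_limiter.try_acquire():
    print("OK to send a request")
else:
    print("Quota exhausted for now, wait before retrying")
```

Checking the limiter before each call avoids spending requests on responses that would only be rejected.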

Data is used for training when the service is used outside the UK/CH/EEA/EU.

Model Name | Model Limits
Gemini 2.5 Pro | 6,000,000 tokens/day, 250,000 tokens/minute, 100 requests/day, 5 requests/minute
Gemini 2.5 Flash | 250,000 tokens/minute, 250 requests/day, 10 requests/minute
Gemini 2.0 Flash | 1,000,000 tokens/minute, 200 requests/day, 15 requests/minute
Gemini 2.0 Flash-Lite | 1,000,000 tokens/minute, 200 requests/day, 30 requests/minute
Gemini 2.0 Flash (Experimental) | 250,000 tokens/minute, 50 requests/day, 10 requests/minute
Gemini 1.5 Flash | 250,000 tokens/minute, 50 requests/day, 15 requests/minute
Gemini 1.5 Flash-8B | 250,000 tokens/minute, 50 requests/day, 15 requests/minute
LearnLM 2.0 Flash (Experimental) | 1,500 requests/day, 15 requests/minute
Gemma 3 27B Instruct | 15,000 tokens/minute, 14,400 requests/day, 30 requests/minute
Gemma 3 12B Instruct | 15,000 tokens/minute, 14,400 requests/day, 30 requests/minute
Gemma 3 4B Instruct | 15,000 tokens/minute, 14,400 requests/day, 30 requests/minute
Gemma 3 1B Instruct | 15,000 tokens/minute, 14,400 requests/day, 30 requests/minute
text-embedding-004 | 150 batch requests/minute, 1,500 requests/minute, 100 content/batch (shared quota)
embedding-001 | shares the quota of text-embedding-004
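
To show how per-model limits like these might be used, the sketch below encodes a few rows of the table and checks whether a planned daily workload fits. The names `ModelLimits`, `LIMITS`, and `fits_daily_plan` are hypothetical, and the dictionary keys are the display names from the table, not API model identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelLimits:
    requests_per_minute: int
    requests_per_day: int
    tokens_per_minute: int
    tokens_per_day: Optional[int] = None  # None where the table lists no daily token cap


# A few rows copied from the table above (display names, not API identifiers).
LIMITS = {
    "Gemini 2.5 Pro": ModelLimits(5, 100, 250_000, 6_000_000),
    "Gemini 2.5 Flash": ModelLimits(10, 250, 250_000),
    "Gemini 2.0 Flash": ModelLimits(15, 200, 1_000_000),
    "Gemma 3 27B Instruct": ModelLimits(30, 14_400, 15_000),
}


def fits_daily_plan(model: str, requests: int, tokens_per_request: int) -> bool:
    """Rough check that a day's planned usage stays inside the listed limits."""
    lim = LIMITS[model]
    if requests > lim.requests_per_day:
        return False
    # A single request larger than the per-minute token budget can never fit.
    if tokens_per_request > lim.tokens_per_minute:
        return False
    if lim.tokens_per_day is not None and requests * tokens_per_request > lim.tokens_per_day:
        return False
    return True


print(fits_daily_plan("Gemini 2.5 Pro", 100, 60_000))        # True
print(fits_daily_plan("Gemma 3 27B Instruct", 10, 20_000))   # False
```

The check ignores how requests are spread over the day, so a plan that passes can still hit the per-minute limits if the requests are sent in bursts.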

Phone number verification required. Models tend to have limited context windows.

Limits: 40 requests/minute

  • Free tier (Experiment plan) requires opting into data training
  • Requires phone number verification

Limits (per-model): 1 request/second, 500,000 tokens/minute, 1,000,000,000 tokens/month
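
A quick back-of-the-envelope calculation shows how these three limits interact. The sketch below uses only the figures quoted above, plus one assumption made for illustration: a 30-day month.

```python
# Figures quoted above (per model).
REQUESTS_PER_SECOND = 1
TOKENS_PER_MINUTE = 500_000
TOKENS_PER_MONTH = 1_000_000_000

# Assumption for illustration: a 30-day month.
MINUTES_PER_MONTH = 30 * 24 * 60


def hours_until_monthly_cap(tokens_per_minute=TOKENS_PER_MINUTE):
    """Hours of continuous use at the given token rate before the monthly cap is hit."""
    return TOKENS_PER_MONTH / tokens_per_minute / 60


def sustainable_tokens_per_minute():
    """Average token rate that spreads the monthly cap evenly over the month."""
    return TOKENS_PER_MONTH / MINUTES_PER_MONTH


def tokens_per_request_at_full_rate():
    """Average tokens per request when both request and token rates are maxed out."""
    return TOKENS_PER_MINUTE / (REQUESTS_PER_SECOND * 60)


print(f"{hours_until_monthly_cap():.1f} hours")                     # 33.3 hours
print(f"{sustainable_tokens_per_minute():,.0f} tokens/min")         # 23,148 tokens/min
print(f"{tokens_per_request_at_full_rate():,.0f} tokens/request")   # 8,333 tokens/request
```

In other words, running at the full per-minute token rate would use up the monthly allowance in under a day and a half, so for sustained workloads the monthly cap is the limit that matters.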

  • Currently free to use
  • Monthly subscription based
  • Requires phone number verification

Limits: 30 requests/minute, 2,000 requests/day

  • Codestral

HuggingFace Serverless Inference is limited to models smaller than 10GB. Some popular models are supported even if they exceed 10GB.

Limits: $0.10/month in credits

  • Various open models across supported providers

Routes to various supported providers.

Limits: $5/month

Free tier restricted to 8K context.

Model Name | Model Limits
Qwen 3 32B | 30 requests/minute, 60,000 tokens/minute, 900 requests/hour, 1,000,000 tokens/hour, 14,400 requests/day, 1,000,000 tokens/day
Llama 4 Scout | 30 requests/minute, 60,000 tokens/minute, 900 requests/hour, 1,000,000 tokens/hour, 14,400 requests/day, 1,000,000 tokens/day
Llama 3.1 8B | 30 requests/minute, 60,000 tokens/minute, 900 requests/hour, 1,000,000 tokens/hour, 14,400 requests/day, 1,000,000 tokens/day
Llama 3.3 70B | 30 requests/minute, 60,000 tokens/minute, 900 requests/hour, 1,000,000 tokens/hour, 14,400 requests/day, 1,000,000 tokens/day

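
When limits are given per minute, per hour, and per day at once, it helps to work out which one actually constrains sustained use. The sketch below scales each limit in the table above to a full day and picks the tightest; the helper names `daily_equivalent` and `binding_limit` are hypothetical.

```python
SECONDS = {"minute": 60, "hour": 3_600, "day": 86_400}


def daily_equivalent(amount, per):
    """Scale a limit such as 30 per minute to the most it would allow in a day."""
    return amount * SECONDS["day"] // SECONDS[per]


def binding_limit(limits):
    """Return the (amount, per) pair that allows the least usage over a full day."""
    return min(limits, key=lambda lim: daily_equivalent(*lim))


# Limits from the table above (identical for every listed model).
request_limits = [(30, "minute"), (900, "hour"), (14_400, "day")]
token_limits = [(60_000, "minute"), (1_000_000, "hour"), (1_000_000, "day")]

print(binding_limit(request_limits))  # (14400, 'day')
print(binding_limit(token_limits))    # (1000000, 'day')
```

For both requests and tokens the daily figure is the binding one. Since the hourly and daily token limits are the same number, the whole daily token allowance can be spent within a single hour.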
Model Name | Model Limits
Allam 2 7B | 7,000 requests/day, 6,000 tokens/minute
DeepSeek R1 Distill Llama 70B | 1,000 requests/day, 6,000 tokens/minute
Distil Whisper Large v3 | 7,200 audio-seconds/minute, 2,000 requests/day
Gemma 2 9B Instruct | 14,400 requests/day, 15,000 tokens/minute
Groq compound-beta | 200 requests/day, 70,000 tokens/minute
Groq compound-beta-mini | 200 requests/day, 70,000 tokens/minute
Llama 3 70B | 14,400 requests/day, 6,000 tokens/minute
Llama 3 8B | 14,400 requests/day, 6,000 tokens/minute
Llama 3.1 8B | 14,400 requests/day, 6,000 tokens/minute
Llama 3.3 70B | 1,000 requests/day, 12,000 tokens/minute
Llama 4 Maverick 17B 128E Instruct | 1,000 requests/day, 6,000 tokens/minute
Llama 4 Scout Instruct | 1,000 requests/day, 30,000 tokens/minute
Mistral Saba 24B | 1,000 requests/day, 6,000 tokens/minute
Qwen QwQ 32B | 1,000 requests/day, 6,000 tokens/minute
Whisper Large v3 | 7,200 audio-seconds/minute, 2,000 requests/day
Whisper Large v3 Turbo | 7,200 audio-seconds/minute, 2,000 requests/day
meta-llama/llama-guard-4-12b | 14,400 requests/day, 15,000 tokens/minute
meta-llama/llama-prompt-guard-2-22m | (none listed)
meta-llama/llama-prompt-guard-2-86m | (none listed)
qwen/qwen3-32b | 1,000 requests/day, 6,000 tokens/minute

Limits: Up to 60 requests/minute

Limits:

20 requests/minute
1,000 requests/month

Models share a common quota.

  • Command-A
  • Command-R7B
  • Command-R+
  • Command-R
  • Aya Expanse 8B
  • Aya Expanse 32B
  • Aya Vision 8B
  • Aya Vision 32B

Extremely restrictive input/output token limits.

Limits: Dependent on Copilot subscription tier (Free/Pro/Pro+/Business/Enterprise)

  • AI21 Jamba 1.5 Large
  • AI21 Jamba 1.5 Mini
  • Codestral 25.01
  • Cohere Command A
  • Cohere Command R
  • Cohere Command R 08-2024
  • Cohere Command R+
  • Cohere Command R+ 08-2024
  • Cohere Embed v3 English
  • Cohere Embed v3 Multilingual
  • DeepSeek-R1
  • DeepSeek-R1-0528
  • DeepSeek-V3-0324
  • Grok 3
  • Grok 3 Mini
  • JAIS 30b Chat
  • Llama 4 Maverick 17B 128E Instruct FP8
  • Llama 4 Scout 17B 16E Instruct
  • Llama-3.2-11B-Vision-Instruct
  • Llama-3.2-90B-Vision-Instruct
  • Llama-3.3-70B-Instruct
  • MAI-DS-R1
  • Meta-Llama-3-70B-Instruct
  • Meta-Llama-3-8B-Instruct
  • Meta-Llama-3.1-405B-Instruct
  • Meta-Llama-3.1-70B-Instruct
  • Meta-Llama-3.1-8B-Instruct
  • Ministral 3B
  • Mistral Large 24.11
  • Mistral Medium 3 (25.05)
  • Mistral Nemo
  • Mistral Small 3.1
  • OpenAI GPT-4.1
  • OpenAI GPT-4.1-mini
  • OpenAI GPT-4.1-nano
  • OpenAI GPT-4o
  • OpenAI GPT-4o mini
  • OpenAI Text Embedding 3 (large)
  • OpenAI Text Embedding 3 (small)
  • OpenAI o1
  • OpenAI o1-mini
  • OpenAI o1-preview
  • OpenAI o3
  • OpenAI o3-mini
  • OpenAI o4-mini
  • Phi-3-medium instruct (128k)
  • Phi-3-medium instruct (4k)
  • Phi-3-mini instruct (128k)
  • Phi-3-mini instruct (4k)
  • Phi-3-small instruct (128k)
  • Phi-3-small instruct (8k)
  • Phi-3.5-MoE instruct (128k)
  • Phi-3.5-mini instruct (128k)
  • Phi-3.5-vision instruct (128k)
  • Phi-4
  • Phi-4-Reasoning
  • Phi-4-mini-instruct
  • Phi-4-mini-reasoning
  • Phi-4-multimodal-instruct

Distributed, decentralized crypto-based compute. Data is sent to individual hosts.

Limits: 200 requests/day

  • Various open models

Limits: 10,000 neurons/day

  • DeepSeek R1 Distill Qwen 32B
  • Deepseek Coder 6.7B Base (AWQ)
  • Deepseek Coder 6.7B Instruct (AWQ)
  • Deepseek Math 7B Instruct
  • Discolm German 7B v1 (AWQ)
  • Falcon 7B Instruct
  • Gemma 2B Instruct (LoRA)
  • Gemma 3 12B Instruct
  • Gemma 7B Instruct
  • Gemma 7B Instruct (LoRA)
  • Hermes 2 Pro Mistral 7B
  • Llama 2 13B Chat (AWQ)
  • Llama 2 7B Chat (FP16)
  • Llama 2 7B Chat (INT8)
  • Llama 2 7B Chat (LoRA)
  • Llama 3 8B Instruct
  • Llama 3 8B Instruct
  • Llama 3 8B Instruct (AWQ)
  • Llama 3.1 8B Instruct (AWQ)
  • Llama 3.1 8B Instruct (FP8)
  • Llama 3.2 11B Vision Instruct
  • Llama 3.2 1B Instruct
  • Llama 3.2 3B Instruct
  • Llama 3.3 70B Instruct (FP8)
  • Llama 4 Scout Instruct
  • Llama Guard 3 8B
  • LlamaGuard 7B (AWQ)
  • Mistral 7B Instruct v0.1
  • Mistral 7B Instruct v0.1 (AWQ)
  • Mistral 7B Instruct v0.2
  • Mistral 7B Instruct v0.2 (LoRA)
  • Mistral Small 3.1 24B Instruct
  • Neural Chat 7B v3.1 (AWQ)
  • OpenChat 3.5 0106
  • OpenHermes 2.5 Mistral 7B (AWQ)
  • Phi-2
  • Qwen 1.5 0.5B Chat
  • Qwen 1.5 1.8B Chat
  • Qwen 1.5 14B Chat (AWQ)
  • Qwen 1.5 7B Chat (AWQ)
  • Qwen 2.5 Coder 32B Instruct
  • Qwen QwQ 32B
  • SQLCoder 7B 2
  • Starling LM 7B Beta
  • TinyLlama 1.1B Chat v1.0
  • Una Cybertron 7B v2 (BF16)
  • Zephyr 7B Beta (AWQ)

Very stringent payment verification for Google Cloud.

Model Name | Model Limits
Llama 3.2 90B Vision Instruct | 30 requests/minute, free during preview
Llama 3.1 70B Instruct | 60 requests/minute, free during preview
Llama 3.1 8B Instruct | 60 requests/minute, free during preview
DeepSeek R1-0528 | 60 requests/minute, free during preview

Providers with trial credits

Credits: $1 when you add a payment method

Models: Various open models

Credits: $1

Models: Various open models

Credits: $30

Models: Any supported model - pay by compute time

Credits: $1

Models: Various open models

Credits: $0.50 for 1 year, $10 for 3 months for LLMs with a referral code and a connected GitHub account

Models: Various open models

Credits: $10 for 3 months

Models: Jamba family of models

Credits: $10 for 3 months

Models: Solar Pro/Mini

Credits: $15

Requirements: Phone number verification

Models: Various open models

Credits: 1 million tokens/model

Models: Various open and proprietary Qwen models

Credits: $5/month upon sign up, $30/month with payment method added

Models: Any supported model - pay by compute time

Credits: $1, $25 on responding to email survey

Models: Various open models

Credits: $1

Models: Various open models

Credits: $5

Models:

  • BGE-M3
  • DeepSeek-R1-0528
  • DeepSeek-V3-0324
  • Gemma 3 27B
  • Magistral Small
  • Meta Llama 3.1 8B
  • Meta Llama 3.3 70B
  • Meta Llama 4 Maverick
  • Meta Llama 4 Scout
  • Mistral NeMo
  • Mistral Small
  • Qwen2.5-VL 7B
  • Qwen3-235B-A22B
  • kluster reliability check

Credits: $1

Models:

  • DeepSeek V3
  • DeepSeek V3 0324
  • Hermes 3 Llama 3.1 70B
  • Llama 3 70B Instruct
  • Llama 3.1 405B Base
  • Llama 3.1 405B Base (FP8)
  • Llama 3.1 405B Instruct
  • Llama 3.1 70B Instruct
  • Llama 3.1 8B Instruct
  • Llama 3.2 3B Instruct
  • Llama 3.3 70B Instruct
  • Pixtral 12B (2409)
  • Qwen QwQ 32B
  • Qwen QwQ 32B Preview
  • Qwen2.5 72B Instruct
  • Qwen2.5 Coder 32B Instruct
  • Qwen2.5 VL 72B Instruct
  • Qwen2.5 VL 7B Instruct

Credits: $5 for 3 months

Models:

  • E5-Mistral-7B-Instruct
  • Llama 3.1 8B
  • Llama 3.3 70B
  • Llama-4-Maverick-17B-128E-Instruct
  • Qwen/Qwen3-32B
  • Whisper-Large-v3
  • deepseek-ai/DeepSeek-R1-0528
  • deepseek-ai/DeepSeek-R1-Distill-Llama-70B
  • deepseek-ai/DeepSeek-V3-0324

Credits: 1,000,000 free tokens

Models:

  • BGE-Multilingual-Gemma2
  • DeepSeek R1 Distill Llama 70B
  • Gemma 3 27B Instruct
  • Llama 3.1 70B Instruct
  • Llama 3.1 8B Instruct
  • Llama 3.3 70B Instruct
  • Mistral Nemo 2407
  • Mistral Small 3.1 24B Instruct 2503
  • Pixtral 12B (2409)
  • Qwen2.5 Coder 32B Instruct

About

A list of free LLM inference resources accessible via API.
