Skip to content

ReadyPixels/AI_Models_Matrix

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

6 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

Awesome AI Models Matrix Awesome

Resources

๐Ÿš€ A curated awesome list of top AI models and Large Language Models (LLMs) with comprehensive specifications, benchmarks, pricing, and official resources.

A comprehensive, community-driven resource meticulously curated to help developers, researchers, and organizations navigate the rapidly evolving landscape of artificial intelligence. This matrix provides transparent, up-to-date comparisons of 48+ leading AI models across performance benchmarks, pricing structures, deployment options, and licensing terms.

Whether you're building the next breakthrough application, conducting cutting-edge research, or making strategic technology decisions for your enterprise, this guide empowers you with the critical information needed to choose the perfect AI model for your specific needs.

Last Updated: ๐Ÿ—“๏ธ October 7, 2025 Total Models: 48+ models from 15+ companies Data Sources: OpenRouter Rankings, LLM-Stats.com, Official Documentation


๐Ÿ“‹ Table of Contents


๐ŸŽฏ About This Matrix

This matrix provides a comprehensive overview of the leading AI models and LLMs available in 2025. Whether you're a ๐Ÿ‘จโ€๐Ÿ’ป developer, ๐Ÿ”ฌ researcher, or ๐Ÿข enterprise decision-maker, this guide helps you understand the capabilities, costs, and trade-offs of each model.

โœจ What's Included

  • ๐Ÿ†• Latest Versions - Up-to-date information on model releases
  • ๐Ÿ“Š Benchmarks - HumanEval, MMLU, SWE-bench, and other metrics
  • ๐Ÿ’ฐ Pricing - Transparent cost information per 1M tokens
  • ๐Ÿ–ฅ๏ธ Self-Hosting - Open-source availability and licensing
  • ๐Ÿ”— Official Links - Direct access to documentation and APIs
  • ๐ŸŽฏ Sortable Tables - Filter by update date, company, pricing, and more

๐Ÿง  Understanding LLMs

What are Large Language Models?

Large Language Models (LLMs) are AI systems trained on vast amounts of text data to understand and generate human-like language. They power applications ranging from ๐Ÿ’ฌ chatbots and ๐Ÿ’ป coding assistants to โœ๏ธ content creation and ๐Ÿ“Š data analysis.

๐Ÿ”‘ Key Concepts

  • Parameters ๐Ÿ“ - The size of the model's neural network (billions or trillions of values)
  • Context Window ๐Ÿ“– - How much text the model can process at once (measured in tokens)
  • Reasoning Models ๐Ÿง  - Models that "think step-by-step" before answering
  • Multimodal ๐ŸŽจ - Models that can process text, images, audio, and video
  • MoE (Mixture of Experts) ๐ŸŽญ - Architecture that activates only relevant parts for efficiency
  • RAG (Retrieval Augmented Generation) ๐Ÿ” - Enhancing models with external knowledge

๐ŸŒŸ Model Generations (2023-2025)

  1. GPT-3.5 Era (2022-2023) ๐Ÿ“ฑ - Foundation of conversational AI
  2. GPT-4 Era (2023-2024) ๐ŸŽจ - Multimodal capabilities and improved reasoning
  3. Reasoning Era (2024-2025) ๐Ÿง  - Step-by-step thinking models (o1, R1)
  4. Unified Era (2025) โšก - Hybrid models combining speed and reasoning
  5. Open-Source Surge (2025) ๐Ÿ†“ - GPT-OSS, DeepSeek, Qwen competing with proprietary

๐Ÿ†• Latest Updates

๐Ÿ”ฅ Recent Additions (Last 30 Days)

Date Model Company Highlights Market Share*
2025-10-03 ๐Ÿข IBM Granite 4.0 IBM ISO 42001 certified, Mamba/Transformer hybrid, open weights -
2025-09-30 ๐Ÿ‡จ๐Ÿ‡ณ GLM-4.6 Zhipu AI 355B MoE, real-world coding, open weights, $0.13/$0.39 6.0% tool calls
2025-09-30 ๐Ÿ”ฌ DeepSeek-V3.2-Exp DeepSeek Sparse Attention (DSA), efficient long-context, MIT license 11.4% overall
2025-09-29 ๐Ÿ‘‘ Claude 4.5 Sonnet Anthropic #2 Most Used - 35.5B tokens on OpenRouter 13.6% overall
2025-09-23 ๐Ÿค– GPT-5 Codex OpenAI 7+ hour autonomous coding, API and open weights 13.8% overall
2025-09-11 ๐Ÿชถ Phi-4 Microsoft 14B compact reasoning, MIT license, self-hostable -
2025-09-10 ๐Ÿ‡จ๐Ÿ‡ณ Qwen3-Next Alibaba Apache 2.0, strong coding, open weights -
2025-09-09 ๐ŸŒ™ Kimi K2-0905 Moonshot AI 256K context, agentic coding, Modified MIT -
2025-09-05 ๐Ÿ‡จ๐Ÿ‡ณ Qwen3-Max Alibaba 1T+ params, ranks 3rd globally, open weights -
2025-09-03 ๐Ÿ‡ช๐Ÿ‡บ TildeOpen LLM Tilde AI 30B params, 34 European languages, open weights -
2025-08-21 ๐Ÿ”ฌ DeepSeek-V3.1 DeepSeek Hybrid architecture, 40% improvement, MIT license 11.4% overall
2025-08-07 ๐Ÿค– GPT-5 OpenAI Unified reasoning, multimodal, open weights -
2025-08-05 ๐Ÿ†“ GPT-OSS-120B OpenAI First open-weight since GPT-2, Apache 2.0 -
2025-08-05 ๐Ÿ†“ GPT-OSS-20B OpenAI #6 Most Used - 20.7B tokens on OpenRouter, Apache 2.0 -
2025-08-01 ๐Ÿค— StarCoder2 BigCode/Hugging Face New open-source coding models, community-driven -
2025-07-15 ๐Ÿ’ป Magistral Mistral AI European reasoning, open weights, self-hostable -

*Market Share Data from OpenRouter.ai (Live Usage Stats) | 2025-09-10 | ๐Ÿ‡จ๐Ÿ‡ณ Qwen3-Next | Alibaba | Apache 2.0, strong coding | | 2025-09-09 | ๐ŸŒ™ Kimi K2-0905 | Moonshot AI | 256K context, agentic coding | | 2025-09-05 | ๐Ÿ‡จ๐Ÿ‡ณ Qwen3-Max | Alibaba | 1T+ params, ranks 3rd globally | | 2025-09-03 | ๐Ÿ‡ช๐Ÿ‡บ TildeOpen LLM | Tilde AI | 30B params, 34 European languages | | 2025-08-21 | ๐Ÿ”ฌ DeepSeek-V3.1 | DeepSeek | Hybrid architecture, 40% improvement | | 2025-08-07 | ๐Ÿค– GPT-5 | OpenAI | Unified reasoning, multimodal | | 2025-08-05 | ๐Ÿ†“ GPT-OSS-120B | OpenAI | First open-weight since GPT-2, Apache 2.0 | - | | 2025-08-05 | ๐Ÿ†“ GPT-OSS-20B | OpenAI | #6 Most Used - 20.7B tokens on OpenRouter | - |

๐Ÿ”ฅ Top Performers (Real-World Usage - OpenRouter Rankings)

Rank Model Company Usage Market Position
๐Ÿฅ‡ #1 Grok Code Fast xAI 174B tokens 50% of coding category
๐Ÿฅˆ #2 Claude 4.5 Sonnet Anthropic 35.5B tokens 10.2% overall usage
๐Ÿฅ‰ #3 Qwen3-Coder 30B Alibaba 21.6B tokens 6.2% coding market
#4 Claude 4 Sonnet Anthropic 20.8B tokens 6.0% overall
#5 GPT-OSS-20B OpenAI 20.7B tokens 6.0% overall

Market Share by Company (OpenRouter Live Stats):

  • ๐Ÿฅ‡ xAI: 26.8% (426B tokens)
  • ๐Ÿฅˆ Google: 18.9% (301B tokens)
  • ๐Ÿฅ‰ OpenAI: 13.8% (220B tokens)
  • ๐Ÿ… Anthropic: 13.6% (217B tokens)
  • ๐Ÿ… DeepSeek: 11.4% (181B tokens)

๐Ÿ“Š Model Comparison Tables

๐Ÿ”„ Sort by Latest Update

๐Ÿ’ก Models sorted by most recently updated first - โญ indicates updates within last 30 days

๐Ÿข Company ๐Ÿค– Model ๐Ÿ“ฆ Version ๐Ÿ“… Release ๐Ÿ”„ Last Updated ๐Ÿ’ป Coding ๐Ÿ“Š Benchmarks ๐Ÿ’ฐ Price ($/1M) ๐Ÿ–ฅ๏ธ Self-Host ๐ŸŒŸ Usage Rank ๐Ÿ”— Link
๐Ÿข IBM Granite 4.0 4.0 Small 2025-10-03 2025-10-03 โญ โœ… Good ~70% / ~75% ๐Ÿ†“ Free โœ… Apache 2.0 - ๐Ÿ”—
๐Ÿ‡จ๐Ÿ‡ณ Zhipu AI GLM-4.6 GLM-4.6 2025-09-30 2025-09-30 โญ โœ… Excellent ~85% / ~84% $0.13 / $0.39 โœ… Open-weight #4 Tool Calls ๐Ÿ”—
๐Ÿ”ฌ DeepSeek DeepSeek-V3.2-Exp V3.2-Exp 2025-09-29 2025-09-30 โญ โœ… Excellent Experimental DSA $0.27 / $0.41 โœ… MIT #3 Coding ๐Ÿ”—
๐Ÿค– Anthropic Claude 4.5 Sonnet Sonnet 4.5 2025-09-29 2025-09-29 โญ โœ… Best-in-class SWE-bench leader $3.00 / $15.00 โŒ ๐Ÿฅˆ #2 Overall ๐Ÿ”—
๐Ÿค– OpenAI GPT-5 Codex Codex 2025-09-23 2025-09-23 โญ โœ… Best-in-class Coding-optimized API pricing โŒ ๐Ÿ”—
๐Ÿชถ Microsoft Phi-4 Phi-4 2024-12 2025-09-11 โญ โœ… Good ~70% / ~75% ๐Ÿ†“ Free โœ… MIT ๐Ÿ”—
๐Ÿ‡จ๐Ÿ‡ณ Alibaba Qwen3-Next Qwen3-Next 2025-09-10 2025-09-10 โญ โœ… Good ~80% / ~84% Varies โœ… Apache 2.0 ๐Ÿ”—
๐ŸŒ™ Moonshot AI Kimi K2-0905 K2-0905 2025-09-09 2025-09-09 โญ โœ… Excellent 256K context Varies โœ… Modified MIT ๐Ÿ”—
๐Ÿ‡จ๐Ÿ‡ณ Alibaba Qwen3-Max Qwen3-Max 2025-09-05 2025-09-05 โญ โœ… Excellent ~82% / ~85% $0.30 / $3.00 โŒ API-only ๐Ÿ”—
๐Ÿ‡ช๐Ÿ‡บ Tilde AI TildeOpen LLM 30B 2025-09-03 2025-09-03 โญ โœ… Good EU Languages ๐Ÿ†“ Free โœ… Open-source ๐Ÿ”—
๐Ÿ”ฌ DeepSeek DeepSeek-V3.1 V3.1 2025-08-21 2025-08-21 โœ… Excellent 82%+ / 85%+ $0.27 / $0.41 โœ… MIT ๐Ÿ”—
๐Ÿ’ป Mistral AI Codestral 2508 2025-08 2025-08 โœ… Excellent Coding-specialized $0.30 / $0.90 โŒ ๐Ÿ”—
๐Ÿค– OpenAI GPT-5 GPT-5 2025-08-07 2025-08-07 โœ… Excellent ~90%+ / ~92% $1.25 / $10.00 โŒ ๐Ÿ”—
๐Ÿค– Anthropic Claude Opus 4.1 Opus 4.1 2025-08-05 2025-08-05 โœ… Excellent ~85%+ / ~85% $15.00 / $75.00 โŒ ๐Ÿ”—
๐Ÿ†“ OpenAI GPT-OSS-120B OSS-120B 2025-08-05 2025-08-05 โœ… Excellent 91.4% / ~89% ๐Ÿ†“ Free โœ… Apache 2.0 ๐Ÿ”—
๐Ÿ†“ OpenAI GPT-OSS-20B OSS-20B 2025-08-05 2025-08-05 โœ… Good ~85% / 85.3% ๐Ÿ†“ Free โœ… Apache 2.0 ๐Ÿ”—
๐Ÿ‡จ๐Ÿ‡ณ Alibaba Qwen3-Coder 480B 2025-07-23 2025-07-23 โœ… Excellent Coding-optimized ๐Ÿ†“ Free โœ… Apache 2.0 ๐Ÿ”—
๐Ÿš€ xAI Grok 4 Grok 4 2025-07-09 2025-07-09 โœ… Excellent ~85% / ~87% $3.00 / $15.00 โŒ ๐Ÿ”—
๐Ÿ‡จ๐Ÿ‡ณ Zhipu AI GLM-4.5 GLM-4.5 2025-07 2025-07 โœ… Good ~82% / ~82% $0.15 / $0.45 โœ… Open-weight ๐Ÿ”—
๐ŸŒ™ Moonshot AI Kimi K2 K2 2025-07 2025-07 โœ… Excellent ~85% / ~83% Varies โœ… Modified MIT ๐Ÿ”—
๐Ÿ’ป Mistral AI Magistral Magistral 2025-06 2025-06 โœ… Good Reasoning Varies โŒ ๐Ÿ”—
๐Ÿ”ฌ DeepSeek DeepSeek-R1 R1-0528 2025-05-28 2025-05-28 โœ… Excellent 81% / 85% $0.50 / $2.15 โœ… MIT ๐Ÿ”—

๐Ÿข Sort by Company

๐Ÿ’ก Models grouped by company/organization

๐Ÿค– OpenAI Models (7 models)
Model Version Release Last Updated Coding Benchmarks Price Self-Host
GPT-5 Codex Codex 2025-09-23 2025-09-23 โœ… Best Coding-optimized API โŒ
GPT-5 GPT-5 2025-08-07 2025-08-07 โœ… Excellent ~90%+ / ~92% $1.25 / $10 โŒ
๐Ÿ†“ GPT-OSS-120B OSS-120B 2025-08-05 2025-08-05 โœ… Excellent 91.4% / ~89% Free โœ… Apache 2.0
๐Ÿ†“ GPT-OSS-20B OSS-20B 2025-08-05 2025-08-05 โœ… Good ~85% / 85.3% Free โœ… Apache 2.0
o3 o3 2025-04 2025-04 โœ… Excellent 85%+ / ~88% $2.00 / $8 โŒ
o1-Pro o1-Pro API 2025-03 2025-03 โœ… Advanced Pro reasoning $150 / $600 โŒ
o3-Mini o3-Mini 2024-12 2024-12 โœ… Good ~77% / ~87% $1.10 / $4.40 โŒ
๐Ÿค– Anthropic Models (3 models)
Model Version Release Last Updated Coding Benchmarks Price Self-Host
Claude Sonnet 4.5 Sonnet 4.5 2025-09-29 2025-09-29 โœ… Best SWE-bench leader $3 / $15 โŒ
Claude Opus 4.1 Opus 4.1 2025-08-05 2025-08-05 โœ… Excellent ~85%+ / ~85% $15 / $75 โŒ
Claude 3.7 Sonnet 3.7 Sonnet 2025-02-24 2025-02-24 โœ… Excellent ~86% / ~84.8% $3 / $15 โŒ
๐Ÿ”ฌ DeepSeek Models (4 models)
Model Version Release Last Updated Coding Benchmarks Price Self-Host
DeepSeek-V3.2-Exp V3.2-Exp 2025-09-29 2025-09-30 โœ… Excellent Experimental DSA $0.27 / $0.41 โœ… MIT
DeepSeek-V3.1 V3.1 2025-08-21 2025-08-21 โœ… Excellent 82%+ / 85%+ $0.27 / $0.41 โœ… MIT
DeepSeek-R1 R1-0528 2025-05-28 2025-05-28 โœ… Excellent 81% / 85% $0.50 / $2.15 โœ… MIT
๐Ÿ†“ DeepSeek-Coder-V2 Coder-V2 2024-06 2024-06 โœ… Excellent Coding specialist Free โœ… MIT
๐Ÿ‡จ๐Ÿ‡ณ Alibaba/Qwen Models (5 models)
Model Version Release Last Updated Coding Benchmarks Price Self-Host
Qwen3-Next Qwen3-Next 2025-09-10 2025-09-10 โœ… Good ~80% / ~84% Varies โœ… Apache 2.0
Qwen3-Max Qwen3-Max 2025-09-05 2025-09-05 โœ… Excellent ~82% / ~85% $0.30 / $3 โŒ API
๐Ÿ†“ Qwen3-Coder 480B 2025-07-23 2025-07-23 โœ… Excellent Coding-optimized Free โœ… Apache 2.0
๐Ÿ†“ Qwen2.5-Coder 32B 2024-11 2024-11 โœ… Excellent Coding-focused Free โœ… Apache 2.0
Qwen2.5-Max 2.5-Max 2025-01-29 2025-01-29 โœ… Good ~80% / ~84% Varies โŒ API
๐Ÿ‡จ๐Ÿ‡ณ Zhipu AI (Z.ai) Models (2 models)
Model Version Release Last Updated Coding Benchmarks Price Self-Host
GLM-4.6 GLM-4.6 2025-09-30 2025-09-30 โœ… Excellent ~85% / ~84% $0.13 / $0.39 โœ… Open-weight
GLM-4.5 GLM-4.5 2025-07 2025-07 โœ… Good ~82% / ~82% $0.15 / $0.45 โœ… Open-weight
๐ŸŒ™ Moonshot AI Models (2 models)
Model Version Release Last Updated Coding Benchmarks Price Self-Host
Kimi K2-0905 K2-0905 2025-09-09 2025-09-09 โœ… Excellent 256K context Varies โœ… Modified MIT
Kimi K2 K2 2025-07 2025-07 โœ… Excellent ~85% / ~83% Varies โœ… Modified MIT
Other Companies (10+ models)

๐ŸŒ Google DeepMind - Gemini 2.5 Pro, Gemini 2.5 Flash ๐Ÿš€ xAI - Grok 4, Grok 4 Fast, Grok 3 ๐Ÿฆ™ Meta - Llama 4 Maverick, Llama 4 Scout ๐Ÿ’ป Mistral AI - Codestral, Magistral, Medium 3, Pixtral Large, Large 2 ๐Ÿข IBM - Granite 4.0 ๐Ÿชถ Microsoft - Phi-4 ๐Ÿ‡ช๐Ÿ‡บ Tilde AI - TildeOpen LLM And more...


๐Ÿ–ฅ๏ธ Sort by Self-Hosting

๐Ÿ’ก Models filtered by self-hosting capability

โœ… Self-Hostable Models (20+ models)

Model Company Parameters License API Price Last Updated
๐Ÿ†“ GPT-OSS-120B OpenAI 117B (5.1B active) Apache 2.0 Free 2025-08-05
๐Ÿ†“ GPT-OSS-20B OpenAI 21B (3.6B active) Apache 2.0 Free 2025-08-05
๐Ÿ†“ DeepSeek-V3.2-Exp DeepSeek 671B (37B active) MIT $0.27/$0.41 2025-09-30
๐Ÿ†“ DeepSeek-V3.1 DeepSeek 671B (37B active) MIT $0.27/$0.41 2025-08-21
๐Ÿ†“ DeepSeek-R1 DeepSeek 671B MIT $0.50/$2.15 2025-05-28
๐Ÿ†“ DeepSeek-Coder-V2 DeepSeek 236B MIT Free 2024-06
๐Ÿ†“ Qwen3-Next Alibaba Various Apache 2.0 Varies 2025-09-10
๐Ÿ†“ Qwen3-Coder Alibaba 480B Apache 2.0 Free 2025-07-23
๐Ÿ†“ Qwen2.5-Coder Alibaba 32B Apache 2.0 Free 2024-11
๐Ÿ†“ Kimi K2-0905 Moonshot AI 1T (32B active) Modified MIT Varies 2025-09-09
๐Ÿ†“ Kimi K2 Moonshot AI 1T (32B active) Modified MIT Varies 2025-07
๐Ÿ†“ GLM-4.6 Zhipu AI 355B MoE Open-weight $0.13/$0.39 2025-09-30
๐Ÿ†“ GLM-4.5 Zhipu AI Various Open-weight $0.15/$0.45 2025-07
๐Ÿ†“ Llama 4 Maverick Meta 400B Meta License Free 2025-04-05
๐Ÿ†“ Llama 4 Scout Meta 109B Meta License Free 2025-04-05
๐Ÿ†“ Granite 4.0 IBM 8B-3B active Apache 2.0 Free 2025-10-03
๐Ÿ†“ Phi-4 Microsoft 14B MIT Free 2025-09-11
๐Ÿ†“ TildeOpen LLM Tilde AI 30B Open-source Free 2025-09-03
๐Ÿ†“ Yi-Coder 01.AI 9B / 1.5B Apache 2.0 Free 2024-09
๐Ÿ†“ StarCoder2 BigCode/HF 3B-15B BigCode Free 2024

โŒ API-Only Models (Proprietary)

Model Company Pricing Performance Last Updated
GPT-5 OpenAI $1.25 / $10 Excellent 2025-08-07
GPT-5 Codex OpenAI API pricing Best coding 2025-09-23
Claude Sonnet 4.5 Anthropic $3 / $15 Best coding 2025-09-29
Claude Opus 4.1 Anthropic $15 / $75 Excellent 2025-08-05
Gemini 2.5 Pro Google $1.25 / $10 99% HumanEval 2025-01-31
Grok 4 xAI $3 / $15 Excellent 2025-07-09
Grok 4 Fast xAI $0.20 / $1.50 Cost-efficient 2025-09

๐Ÿ’ฐ Sort by Price

๐Ÿ’ก Models sorted by cost (cheapest first)

๐Ÿ†“ Free Models (Self-Hostable)

All models in the "Self-Hostable" section above are free to self-host!

๐Ÿ’ต Budget-Friendly (< $0.50 per 1M tokens)

Model Company Input Output Total (avg)
๐Ÿฅ‡ GLM-4.6 Zhipu AI $0.13 $0.39 $0.26
๐Ÿฅˆ Yi-Lightning 01.AI $0.14 $0.42 $0.28
๐Ÿฅ‰ Grok 4 Fast xAI $0.20 $1.50 $0.85
DeepSeek-V3.1/V3.2 DeepSeek $0.27 $0.41 $0.34
Gemini 2.5 Flash Google $0.30 $2.50 $1.40
Qwen3-Max Alibaba $0.30 $3.00 $1.65
Codestral Mistral AI $0.30 $0.90 $0.60

๐Ÿ’ฐ Mid-Tier ($1 - $5 per 1M tokens)

Model Company Input Output Total (avg)
GPT-5 OpenAI $1.25 $10.00 $5.63
Gemini 2.5 Pro Google $1.25 $10.00 $5.63
Mistral Medium 3 Mistral AI $1.00 $3.00 $2.00
Mistral Large 2 Mistral AI $2.00 $6.00 $4.00
o3 OpenAI $2.00 $8.00 $5.00
Claude Sonnet 4.5 Anthropic $3.00 $15.00 $9.00
Grok 4 xAI $3.00 $15.00 $9.00

๐Ÿ’Ž Premium (> $5 per 1M tokens)

Model Company Input Output Total (avg)
Claude Opus 4.1 Anthropic $15.00 $75.00 $45.00
o1-Pro OpenAI $150.00 $600.00 $375.00

๐ŸŽฏ Models by Category

๐Ÿ† Frontier Models

The most advanced, cutting-edge models with state-of-the-art capabilities:

  • ๐Ÿฅ‡ Grok Code Fast (xAI) - #1 Most Used - 50% of coding market, 174B tokens
  • ๐Ÿฅˆ Claude 4.5 Sonnet (Anthropic) - #2 Overall - Best coding, 35.5B tokens
  • GPT-5 (OpenAI) - Unified reasoning and multimodal, $1.25/$10
  • Gemini 2.5 Pro (Google) - Leading multimodal reasoning, 99% HumanEval
  • Grok 4 (xAI) - First-principles reasoning, 26.8% company market share
  • Qwen3-Max (Alibaba) - 1T+ parameters, ranks 3rd globally
  • GPT-OSS-120B (OpenAI) - First open-weight since GPT-2, Apache 2.0

Commercial Coding Models

  • Claude Sonnet 4.5 (Anthropic) - SWE-bench Verified leader
  • GPT-5 Codex (OpenAI) - 7+ hour autonomous coding
  • Codestral (Mistral AI) - Low-latency, fill-in-middle
  • Grok 4 Fast (xAI) - Cost-efficient at $0.20/$1.50

Open-Source Coding Models

  • GPT-OSS-120B (OpenAI) - 91.4% AIME, Apache 2.0
  • Qwen3-Coder (Alibaba) - 480B params, autonomous coding
  • DeepSeek-Coder-V2 (DeepSeek) - 236B params, MIT
  • Kimi K2-0905 (Moonshot AI) - 256K context, agentic tasks
  • GLM-4.6 (Zhipu AI) - Real-world coding, $0.13/$0.39
  • IBM Granite 4.0 - Enterprise-ready, ISO 42001

๐Ÿง  Reasoning Models

Models that employ chain-of-thought and step-by-step problem solving:

  • o3 / o1-Pro (OpenAI) - Advanced reasoning with extended thinking
  • DeepSeek-R1 (DeepSeek) - Open-source reasoning champion, MIT
  • Claude 3.7 Sonnet (Anthropic) - Hybrid reasoning model
  • Magistral (Mistral AI) - European reasoning model
  • Qwen3-Max-Thinking (Alibaba) - 100% AIME25 accuracy

๐Ÿ†“ Open-Source Models (2025)

Freely available models with permissive licenses (Apache 2.0, MIT, etc.) and public source code:

  • GPT-OSS-120B / GPT-OSS-20B (OpenAI) โ€” Apache 2.0; first open-weight since GPT-2; strong coding and reasoning.
  • DeepSeek-R1 / V3.1 / V3.2 / Coder-V2 (DeepSeek) โ€” MIT License; top-tier open-source reasoning and coding.
    GitHub
  • Llama 4 (Scout/Maverick) (Meta) โ€” Multimodal, open weights; supports text, image, and code tasks.
  • Qwen3-Next / Qwen3-Coder / Qwen3-Max / Qwen2.5-Coder (Alibaba) โ€” Apache 2.0; competitive coding and reasoning, multilingual.
    GitHub
  • Yi-Coder (01.AI) โ€” MIT License; 128K context, 52 programming languages, efficient for local deployment.
    GitHub
  • TildeOpen LLM (Tilde AI) โ€” European language specialist, open weights. Hugging Face
  • Phi-4 (Microsoft) โ€” MIT License; compact, efficient, strong reasoning.
  • GLM-4.6 / GLM-4.5 (Zhipu AI) โ€” Open-weight, real-world coding, multilingual.
  • Kimi K2-0905 / Kimi K2 (Moonshot AI) โ€” Modified MIT; agentic coding, 256K context.
  • Codestral (Mistral AI) โ€” Coding-specialized, low-latency, open weights.
    GitHub
  • StarCoder2 (BigCode/Hugging Face) โ€” Community-driven, open-source coding.
    GitHub
  • Magistral (Mistral AI) โ€” European reasoning, open weights.
  • IBM Granite 4.0 โ€” Apache 2.0; enterprise-ready, ISO 42001 certified.

Recent 2025 additions:

  • Qwen3-Max (Alibaba) โ€” 1T+ parameters, top-tier performance.
  • GLM-4.6 (Zhipu AI) โ€” 355B MoE, real-world coding, multilingual.
  • StarCoder2 โ€” New open-source coding models from BigCode/Hugging Face.
  • Magistral โ€” Reasoning-focused, open weights, European origin.

Most open-source models now support context windows of 128Kโ€“256K tokens, multimodal capabilities, and community benchmarks (HumanEval, SWE-bench, MMLU).

๐Ÿ’ป Coding-Specialized Models

Optimized for software development tasks:

Commercial Coding Models

  • Claude Sonnet 4.5 (Anthropic) - SWE-bench Verified leader
  • GPT-5 Codex (OpenAI) - 7+ hour autonomous coding
  • Codestral (Mistral AI) - Low-latency, fill-in-middle
  • Grok 4 Fast (xAI) - Cost-efficient at $0.20/$1.50

Open-Source Coding Models

  • GPT-OSS-120B (OpenAI) - 91.4% AIME, Apache 2.0
  • Qwen3-Coder (Alibaba) - 480B params, autonomous coding
  • DeepSeek-Coder-V2 (DeepSeek) - 236B params, MIT
  • Kimi K2-0905 (Moonshot AI) - 256K context, agentic tasks
  • GLM-4.6 (Zhipu AI) - Real-world coding, $0.13/$0.39
  • IBM Granite 4.0 - Enterprise-ready, ISO 42001

๐ŸŽจ Multimodal Models

Process text, images, audio, and video:

  • GPT-5 (OpenAI) - Unified multimodal interface
  • Gemini 2.5 Pro/Flash (Google) - Native multimodal architecture
  • Claude Sonnet 4.5 (Anthropic) - Vision and document understanding
  • Pixtral Large (Mistral AI) - 124B params, image understanding
  • Llama 4 Maverick (Meta) - Native multimodality

๐Ÿข Enterprise Models

Designed for business and production deployments:

  • Claude Opus 4.1 (Anthropic) - ASL-3 safety, highest capability
  • Gemini 2.5 Pro (Google) - Google Cloud integration
  • Command A (Cohere) - Enterprise RAG, 10+ languages
  • Jamba 1.6 (AI21 Labs) - Private deployment, hybrid architecture
  • IBM Granite 4.0 - ISO 42001 certified, auditable

๐Ÿ’ป Coding Models Deep Dive

๐Ÿ† Best Coding Models by Task

Code Generation & Autocomplete:

  1. ๐Ÿฅ‡ Grok Code Fast (xAI) - #1 Most Used - 174B tokens
  2. ๐Ÿฅˆ Claude 4.5 Sonnet - Most accurate, #2 overall
  3. ๐Ÿฅ‰ GPT-5 Codex - Complex algorithms

Code Review & Refactoring:

  1. ๐Ÿฅ‡ Claude 4.5 Sonnet - Industry-leading, 35.5B tokens
  2. ๐Ÿฅˆ Grok Code Fast - Real-world proven, 50% market
  3. ๐Ÿฅ‰ GPT-5 Codex - Large-scale refactoring

Debugging & Error Fixing:

  1. ๐Ÿฅ‡ Claude 4.5 Sonnet - Clear explanations
  2. ๐Ÿฅˆ GPT-5 Codex - Deep analysis
  3. ๐Ÿฅ‰ DeepSeek-V3.1 - Reasoning-based

Test Generation:

  1. ๐Ÿฅ‡ Grok Code Fast - Real-world adoption leader
  2. ๐Ÿฅˆ Claude 4.5 Sonnet - Comprehensive coverage
  3. ๐Ÿฅ‰ Codestral - Purpose-built for testing

Real-World Usage (OpenRouter Live Data):

  • Grok Code Fast dominates with 50% of coding category
  • Claude 4.5 Sonnet is #2 most deployed model
  • Qwen3-Coder 30B is #3 in coding workflows

๐Ÿ† Coding Benchmarks (SWE-bench Verified)

  1. ๐Ÿฅ‡ Claude Sonnet 4.5 - State-of-the-art
  2. ๐Ÿฅˆ GPT-5 Codex - 7+ hour autonomous coding
  3. ๐Ÿฅ‰ GPT-OSS-120B - 91.4% AIME, open-source leader
  4. Kimi K2-0905 - Agentic coding excellence
  5. Qwen3-Coder - 480B autonomous generation
  6. DeepSeek-V3.2-Exp - Sparse attention efficiency
  7. GLM-4.6 - Real-world tasks
  8. Grok 4 Fast - Best cost/performance
  9. IBM Granite 4.0 - Enterprise-grade
  10. Phi-4 - Compact reasoning

๐Ÿ“ Context Window Comparison

For working with large codebases:

  1. ๐ŸŒ™ Kimi K2-0905 - 256K tokens (โ‰ˆ192K lines)
  2. ๐Ÿ‡จ๐Ÿ‡ณ GLM-4.6 - 200K tokens (โ‰ˆ150K lines)
  3. ๐Ÿค– Claude Sonnet 4.5 - 200K tokens (โ‰ˆ150K lines)
  4. ๐Ÿ‡จ๐Ÿ‡ณ Qwen3-Coder - 128K tokens (โ‰ˆ96K lines)
  5. ๐Ÿ”ฌ DeepSeek-Coder-V2 - 128K tokens (โ‰ˆ96K lines)

๐Ÿ”— Official Resources

๐Ÿค– OpenAI

๐Ÿค– Anthropic

๐ŸŒ Google DeepMind

๐Ÿš€ xAI

  • Website: x.ai
  • Chat: grok.x.ai
  • Models: Grok 4, Grok 4 Fast, Grok 3

๐Ÿ”ฌ DeepSeek

๐Ÿ‡จ๐Ÿ‡ณ Alibaba/Qwen

๐Ÿ‡จ๐Ÿ‡ณ Zhipu AI (Z.ai)

๐ŸŒ™ Moonshot AI

๐Ÿฆ™ Meta AI

๐Ÿ’ป Mistral AI

๐Ÿข IBM Research

๐Ÿชถ Microsoft Research

๐Ÿ‡ช๐Ÿ‡บ Tilde AI


๐Ÿ“ˆ Performance Benchmarks

๐Ÿ† Coding Benchmarks (SWE-bench Verified)

  1. ๐Ÿฅ‡ Claude Sonnet 4.5 - State-of-the-art
  2. ๐Ÿฅˆ GPT-5 Codex - 7+ hour autonomous coding
  3. ๐Ÿฅ‰ GPT-OSS-120B - 91.4% AIME, open-source leader
  4. Kimi K2-0905 - Agentic coding excellence
  5. Qwen3-Coder - 480B autonomous generation
  6. DeepSeek-V3.2-Exp - Sparse attention efficiency
  7. GLM-4.6 - Real-world tasks
  8. Grok 4 Fast - Best cost/performance
  9. IBM Granite 4.0 - Enterprise-grade
  10. Phi-4 - Compact reasoning

๐Ÿงฎ Reasoning Benchmarks (AIME 2025)

  1. ๐Ÿฅ‡ Qwen3-Max-Thinking - 100% accuracy
  2. ๐Ÿฅˆ o1-Pro - Advanced reasoning
  3. ๐Ÿฅ‰ GPT-OSS-120B - 91.4%
  4. GPT-5 - ~90%
  5. DeepSeek-R1 - 81%
  6. Grok 4 - ~85%

๐Ÿ“š General Knowledge (MMLU)

  1. ๐Ÿฅ‡ GPT-5 - ~92%
  2. ๐Ÿฅˆ o3 - ~88%
  3. ๐Ÿฅ‰ Gemini 2.5 Pro - 86.4%
  4. DeepSeek-V3.1 - 85%+
  5. Claude Opus 4.1 - ~85%

๐Ÿ’ฐ Cost Analysis

๐Ÿ†“ Most Affordable Options

Free Self-Hostable (20+ models):

  • All models in "Self-Hostable" section can be run for FREE!
  • No API costs when self-hosting
  • Just need appropriate hardware

Cheapest Commercial APIs:

  1. ๐Ÿฅ‡ GLM-4.6 - $0.13/$0.39 per 1M tokens
  2. ๐Ÿฅˆ Yi-Lightning - $0.14/$0.42 per 1M tokens
  3. ๐Ÿฅ‰ Grok 4 Fast - $0.20/$1.50 per 1M tokens

๐Ÿ’Ž Best Value (Performance/Cost)

  1. DeepSeek-V3.1 - Excellent performance, ultra-low cost
  2. Grok 4 Fast - Frontier performance, $0.20/$1.50
  3. Qwen3-Max - Top-3 global, $0.30/$3.00
  4. Gemini 2.5 Flash - Fast & capable, $0.30/$2.50
  5. Claude Sonnet 4.5 - Best coding, $3/$15 (reasonable)

๐Ÿ–ฅ๏ธ Self-Hosting Guide

๐Ÿ”ง Hardware Requirements

Small Models (7-20B parameters):

  • GPU: 1x RTX 3090/4090 (24GB VRAM)
  • RAM: 32GB+
  • Storage: 50GB+ SSD
  • Examples: Phi-4, GPT-OSS-20B, Yi-Coder

Medium Models (30-70B parameters):

  • GPU: 2x RTX 4090 or 1x A100 (40-80GB)
  • RAM: 64GB+
  • Storage: 100GB+ SSD
  • Examples: Qwen3-Coder, TildeOpen LLM, Llama 4 Scout

Large Models (120B+ parameters):

  • GPU: 4x A100 (80GB) or 8x RTX 4090
  • RAM: 128GB+
  • Storage: 200GB+ SSD
  • Examples: GPT-OSS-120B, Pixtral Large

๐Ÿ“ฆ Deployment Options

Local Inference:

  • Ollama - Easy local deployment (ollama.ai)
  • LM Studio - User-friendly GUI (lmstudio.ai)
  • llama.cpp - Efficient C++ implementation
  • vLLM - High-throughput serving (vllm.ai)

Cloud Self-Hosting:

  • Hugging Face Inference - Managed deployment
  • AWS/GCP/Azure - Full control, scalable
  • RunPod/Vast.ai - GPU rental platforms
  • Modal/Replicate - Serverless options

๐Ÿ–ฅ๏ธ Self-Hosted Models (2025)

Models that can be run locally or on your own infrastructure (open weights, permissive license) with public source code:

Coding:

  • DeepSeek-Coder-V2 (MIT) โ€” GitHub
  • Qwen3-Coder (Apache 2.0) โ€” GitHub +- GPT-OSS-120B (Apache 2.0)
  • Codestral (Mistral AI, open weights) โ€” GitHub
  • StarCoder2 (BigCode/Hugging Face) โ€” GitHub

General Use:

  • GPT-OSS-20B (Apache 2.0)
  • Phi-4 (MIT) โ€” GitHub
  • DeepSeek-V3.1 (MIT) โ€” GitHub
  • Llama 4 Scout (Meta license) โ€” GitHub
  • GLM-4.6 (Zhipu AI, open weights) โ€” GitHub

Enterprise:

  • IBM Granite 4.0 (Apache 2.0, ISO 42001)
  • Llama 4 Maverick (Meta license) โ€” GitHub
  • Qwen3-Max (Apache 2.0) โ€” GitHub
  • Magistral (Mistral AI, open weights)

Recent 2025 additions:

  • Qwen3-Max (Alibaba) โ€” 1T+ parameters, self-hostable.
  • GLM-4.6 (Zhipu AI) โ€” 355B MoE, open weights, self-hostable.
  • StarCoder2 โ€” New open-source coding models, easy to self-host.
  • Magistral โ€” Reasoning, open weights, European origin.

Most self-hosted models now support large context windows, multimodal tasks, and efficient deployment on consumer or enterprise hardware.


๐Ÿ” Model Selection Guide

๐Ÿ’ผ By Use Case

๐Ÿข Enterprise Production:

  • Best: Claude Opus 4.1, IBM Granite 4.0
  • Budget: DeepSeek-V3.1, Qwen3-Max
  • Self-Host: IBM Granite 4.0, Llama 4

๐Ÿ’ป Software Development:

  • Best: Claude Sonnet 4.5, GPT-5 Codex
  • Popular: Grok Code Fast (50% market share)
  • Self-Host: DeepSeek-Coder-V2, Qwen3-Coder

๐Ÿ”ฌ Research & Analysis:

  • Reasoning: o1-Pro, Qwen3-Max-Thinking
  • General: GPT-5, Gemini 2.5 Pro
  • Open: GPT-OSS-120B, DeepSeek-R1

๐ŸŽจ Creative & Multimodal:

  • Best: GPT-5, Gemini 2.5 Pro
  • Vision: Pixtral Large, Claude Sonnet 4.5
  • Affordable: Gemini 2.5 Flash

๐ŸŒ Multilingual & Regional:

  • Chinese: Qwen3-Max, GLM-4.6, Kimi K2
  • European: TildeOpen LLM (34 languages)
  • Global: GPT-5, Gemini 2.5 Pro

๐Ÿ’ฐ By Budget

๐Ÿ†“ Free (Self-Host Only):

  • GPT-OSS-120B/20B, DeepSeek models, Llama 4, Phi-4

๐Ÿ’ต Budget-Friendly ($0.10-$0.50 per 1M):

  • GLM-4.6, Yi-Lightning, Grok 4 Fast, DeepSeek-V3.1

๐Ÿ’Ž Mid-Range ($1-$5 per 1M):

  • GPT-5, Gemini 2.5 Flash, Claude Sonnet 4.5, Grok 4

๐Ÿ‘‘ Premium ($10+ per 1M):

  • Claude Opus 4.1, o1-Pro, Gemini 2.5 Pro

โšก By Performance Needs

Speed Priority:

  • Gemini 2.5 Flash, Grok 4 Fast, Claude 3.7 Sonnet

Quality Priority:

  • Claude Opus 4.1, GPT-5, o1-Pro

Balanced:

  • Claude Sonnet 4.5, GPT-5, DeepSeek-V3.1

๐ŸŽฏ Awesome AI IDEs

Awesome AI IDEs - A comprehensive list of AI-powered IDEs with categories and details.


๐Ÿค Contributing

We welcome contributions from the community! Here's how you can help:

๐Ÿ“ How to Contribute

  1. Fork the Repository
  2. Add/Update Model Information
    • Verify from official sources
    • Include benchmark data
    • Add pricing information
    • Provide official links
  3. Follow the Style Guide
    • Use emojis consistently
    • Match existing formatting
    • Keep tables aligned
  4. Submit a Pull Request
    • Describe your changes
    • Include sources/verification
    • Follow UPDATE_RULES.md

๐ŸŽฏ What We Need

  • โœ… New model releases and updates
  • โœ… Benchmark results and performance data
  • โœ… Pricing updates and corrections
  • โœ… Bug fixes and typo corrections
  • โœ… Additional resources and links
  • โœ… Real-world usage insights
  • โœ… Community feedback and reviews

๐Ÿ“‹ Guidelines

Please read our detailed contribution guidelines:

๐Ÿ™ Contributors

Special thanks to all contributors who help keep this matrix up-to-date!


๐Ÿ“œ License

Project License

This project is licensed under the GNU General Public License v3.0 (GPL-3.0).

Copyright (c) 2025 ReadyPixels

For full license terms, see: GNU GPL v3.0

License: GPL v3

๐Ÿ“Š Data Sources Attribution

This project aggregates information from:

  • OpenRouter Rankings - openrouter.ai - Real-world usage statistics
  • LLM-Stats.com - llm-stats.com - Benchmark aggregation
  • Official Documentation - Direct from model providers
  • Community Contributions - GitHub contributors

โš–๏ธ Model Licenses

Each AI model listed has its own license. Please refer to official documentation:

  • Open Source: Apache 2.0, MIT, Modified MIT, etc.
  • Proprietary: API usage terms from providers
  • Enterprise: Contact providers for licensing

๐Ÿ”— Quick Links


โญ Star this repository if you find it helpful! โญ

Last Updated: October 7, 2025


๐Ÿ“š Additional Resources


๐Ÿš€ The most comprehensive AI models matrix on the internet! Made with โค๏ธ by ReadyPixels for the global AI community Last Updated: October 7, 2025

About

No description, website, or topics provided.

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Sponsor this project

Packages

No packages published