Awesome Free Models

Running AI shouldn't require a credit card. This list curates genuinely free models — open-weight models you can self-host, free API tiers from major providers, and tools to run everything locally.

🧠 Open-Weight Models
🔌 Free API Providers
💻 Local Inference Tools
💬 AI Chatbot UIs
🤖 AI Coding Assistants
📝 Code Models
🔍 RAG & Vector Databases
🧩 Agentic Frameworks
🎛 Fine-tuning Tools
✨ Prompt Engineering Tools
📊 Datasets
☁ Model Hosting Platforms
📚 Learning Resources
🏆 Resources & Leaderboards
👥 Communities

🧠 Open-Weight Models

Notable open-weight models you can download and run on your own hardware.

Name	Description
Llama 4 Scout / Maverick	Meta's latest MoE generation. Scout: 109B, 10M context. Maverick: 402B, 1M context. Native multimodal. [License]
DeepSeek V4	Latest generation with extreme cost-efficiency. MIT license.
DeepSeek-V4-Flash	Apr 2026. Efficiency-focused variant of DeepSeek V4. 1M token context, optimized for fast inference. MIT license.
Gemma 4 31B / 26B MoE / E4B / E2B	Fully permissive Apache 2.0. 256K context, native multimodal. New standard for open-weight.
GLM-5.1 (Zhipu AI)	744B MoE model, competitive with top proprietary models. MIT license.
MiniMax M3	Frontier-tier 1M context, native multimodal + computer use. MSA architecture.
Trinity (Arcee AI)	400B parameter enterprise model. Apache 2.0.
Step 3.7 Flash (StepFun)	May 2026. Apache 2.0. Native multimodal (image+video), strong agentic performance. Efficient enough for high-end local hardware.
Kimi K2.6 (Moonshot AI)	Apr 2026. 1T-parameter MoE model. Modified MIT license. Exceptional coding (SWE-Bench ~54%) and multi-agent swarm orchestration.
Qwen 3.6-35B-A3B	Apr 2026. MoE variant with only 3B active parameters. Extremely efficient for consumer hardware. Apache 2.0.
InternLM 3 (Shanghai AI Lab)	Early 2026. Strong long-context reasoning and agentic performance. Competitive in open-weight benchmarks.
MiMo-V2.5-Pro (Xiaomi)	Apr 2026. 1.02T-parameter MoE (42B active). Optimized for complex agentic tasks, coding, and long-context.
Bonsai 8B (PrismML)	Apr 2026. Groundbreaking 1-bit quantized model. Extremely efficient for edge and consumer hardware (Apple Silicon).
Mistral Small 3.1 (Mistral)	Mar 2025. Versatile 24B multimodal model. Strong text performance with native image understanding and 128K context. Apache 2.0.
Mistral Small 4 (Mistral)	Mar 2026. Hybrid MoE (6.5B active params) unifying instruction, reasoning, and multimodal capabilities. Efficient frontier-class model. Apache 2.0.
Command A+ (Cohere)	May 2026. Enterprise multimodal MoE optimized for sovereignty and multilingual RAG across 48 languages. Apache 2.0.
Hermes 4 (NousResearch)	Feb 2026. Self-improving agentic model with closed-loop learning. Curates own memory and builds skills from experience. Apache 2.0.
Snowflake Arctic	Apr 2024. Enterprise MoE model balancing high-quality performance with efficient training costs. Optimized for complex data operations. Apache 2.0.
Falcon 3 (TII)	Dec 2024. Compact high-performance model with strong reasoning. Designed for efficient deployment on resource-constrained hardware. TII Falcon-LLM License 2.0.
Apple OpenELM	Apr 2024. Family of efficient on-device SLMs using layer-wise attention scaling. Runs locally on Apple Silicon with full privacy. Apple Sample Code License.

🔌 Free API Providers

Providers offering free tiers to access models via API — no local hardware required.

Name	Description
Google AI Studio	Most generous free tier. Access Gemini 2.5 Flash, Gemini 2.0 Flash, and other models. Generous rate limits for prototyping.
OpenRouter	Aggregates 500+ models. Filter by "Free" to see models available at no cost. Includes experimental and subsidized open-weight models.
Groq	Ultra-fast inference. Free tier includes Llama, Gemma, Mixtral, Whisper models with generous daily rate limits.
Hugging Face Inference API	Free tier for thousands of community models. Rate-limited but excellent for testing.
NVIDIA NIM	Free API access to accelerated versions of Llama, Mistral, Gemma, and more on NVIDIA infrastructure.
DeepInfra	Serverless inference. Free tier with daily rate limits for popular open-source models.
Together AI	Free trial credits for new users. Fast inference on open-source models.
Fireworks AI	Free tier for community models. Optimized for low latency.
SiliconFlow	Rising platform with free access to many open-source models.
Cloudflare Workers AI	Free tier for running select open-source models at the edge.
Replicate	Free tier with limited credits for running open-source models.
Poe (Quora)	Free tier with daily credits for GPT-4 mini, Claude instant, and community bots.
CatGPT	Completely free chat with multiple models, no login required. ⚠️ Currently unreachable (SSL cert expired as of June 2026).
Qwen Chat (Alibaba)	Free access to Qwen 3.6-Plus, Qwen 3.6-Max, and other Qwen models via web chat and API. 1M token context for agentic coding.
Ollama Cloud	Free tier for running open-source models on Ollama's cloud infrastructure. Light usage included, 1 concurrent model. Same `ollama run` command as local. Zero data retention.
Mistral AI (La Plateforme)	Free API tier with access to Mistral Large, Mistral Nemo, Codestral and more. 1 req/s, 500k tokens/min. Requires phone verification and data usage opt-in.
Cohere	Free evaluation API key for Command R, Command R+, Embed, and Rerank models. 20 req/min, 1,000 req/month.
DeepSeek Platform	Free API credits for new users (5M tokens). Access to DeepSeek V4, DeepSeek-R1, and other models. Generous free allocation.
GitHub Models	Free tier for GitHub users. Access GPT-4o, Llama 3.3, Mistral, and more with rate-limited playground and API.
Hyperbolic	Open-access AI cloud with affordable inference. Free compute credits via referral program. Supports Llama, Qwen, DeepSeek, and other open models.
Novita AI	Free credits for testing 100+ models including Llama, Qwen, DeepSeek, and Mistral. OpenAI-compatible API with competitive pricing beyond the free tier.
Anakin.ai	30 daily free credits for accessing multiple AI models. Web chat interface and API access. Supports GPT-4, Claude, and open-weight models.
Nebius AI	$100 free credits for new users. AI Studio with access to Llama, Qwen, DeepSeek, and other open-weight models. Fast inference on NVIDIA H100 infrastructure.
Fal.ai	Free starter credits for AI inference. Fast, serverless platform supporting Llama, Flux, and Stable Diffusion models. Pay-as-you-go beyond free tier.
Vercel AI Gateway	$5/month free credits for the AI Gateway. Proxy and cache requests across multiple LLM providers. SDK is open-source and free.
AI21 Labs	$10 trial credits for accessing Jamba 1.5, Jamba 1.6, and other AI21 models. Valid for 3 months. Requires account sign-up.
Amazon Bedrock	$200 AWS credits for new customers. Access to Llama, Mistral, Claude, Titan, and other foundation models via API.
Azure AI Foundry	$200 free trial credits (30 days). Access GPT-4o, Llama, Mistral, Phi, and other models via Azure's unified AI platform.
RunPod	Free credits for serverless GPU inference. Deploy open-weight models as serverless endpoints. Supports Llama, Qwen, DeepSeek, and more.
OpenCode	Go-based terminal AI coding assistant. Model-neutral, supports multiple LLM providers, LSP integration, and MCP tools. Free and open-source. GitHub

💻 Local Inference Tools

Run models on your own machine — no API keys needed, full privacy.

Name	Description
Ollama	The easiest way to run local LLMs. One command to download and run any model. macOS, Linux, Windows. GitHub
LM Studio	Polished desktop GUI. Browse, download, and chat with models. Built-in model browser and local API server.
llama.cpp	High-performance C++ inference engine. Runs on CPU and GPU. Supports GGUF quantization. Powers most other local tools.
Jan	Open-source ChatGPT alternative for desktop. Built-in model downloader, local API server. GitHub
GPT4All	Privacy-focused local chatbot. Runs on consumer hardware. Built-in model browser. GitHub
text-generation-webui (Oobabooga)	Feature-rich web UI. Supports multiple backends (Transformers, llama.cpp, ExLlama, AutoGPTQ).
LocalAI	Drop-in OpenAI API replacement. Run models locally with an OpenAI-compatible API. GitHub
KoboldCPP	Single-file executable for running GGUF models. Focused on story generation but general-purpose.
llamafile (Mozilla)	Distributable single-file executables that run LLMs. No installation needed.
vLLM	High-throughput production inference engine. Uses PagedAttention for efficient serving.
SGLang	Fast inference framework with structured generation and RadixAttention.
TensorRT-LLM (NVIDIA)	NVIDIA's optimized inference engine. Best performance on NVIDIA GPUs.
ExLlamaV2	Optimized inference for Llama-family models. Fastest option for single-GPU inference.
Aphrodite Engine	High-performance LLM serving engine with advanced quantization support.
TabbyAPI	Lightweight, fast OpenAI-compatible API server for ExLlamaV2.
LlamaEdge	Lightweight inference framework for edge devices. OpenAI-compatible API for open-source models. Runs on WasmEdge for portability. GitHub
MLC LLM	Universal deployment engine by UW/SJTU. Runs LLMs on any hardware — laptops, phones, browsers. OpenAI-compatible API.
WebLLM	In-browser LLM inference via WebGPU. Runs models directly in your browser with zero setup. No server needed.
FastChat (LMSYS)	Open platform for training, serving, and evaluating LLMs. Provides OpenAI-compatible API and web UI for local models.
Hugging Face TGI	Production-grade serving toolkit for large language models. Optimized for high throughput on local hardware.
DeepSpeed (Microsoft)	Deep learning optimization library with inference acceleration. Enables running larger models on limited hardware through ZeRO optimization.
AirLLM	Run large models (70B+) on consumer hardware with limited memory. Loads models layer-by-layer for extreme memory efficiency.
AI Toolkit for VS Code (Microsoft)	VS Code extension to browse, test, fine-tune, and deploy models locally. Integrates ONNX and llama.cpp.
Ollama Grid Search	Desktop utility for systematic model evaluation. Test multiple models, prompts, and inference parameters side-by-side via a Rust/React GUI.

💬 AI Chatbot UIs

Free, open-source web interfaces for chatting with AI models — self-host or use hosted versions.

Name	Description
Open WebUI	Feature-rich ChatGPT-like interface for Ollama and OpenAI-compatible backends. RAG, image generation, multi-user. GitHub
LibreChat	Open-source ChatGPT clone supporting 40+ providers, multi-user, plugins, and RAG. GitHub
AnythingLLM	All-in-one desktop app for chatting with documents and models. Built-in RAG pipeline. GitHub
Big-AGI	Feature-rich AI chat with personas, multi-model support, voice, and code execution. GitHub
Lobe Chat	Modern, extensible chat framework with plugin system and multi-provider support. GitHub
Chatbot UI	Simple, clean ChatGPT interface. Easy to self-host with any OpenAI-compatible API. GitHub
NextChat (ChatGPT-Next-Web)	Lightweight cross-platform chat app. Self-host on Vercel or download official desktop/mobile clients.

🤖 AI Coding Assistants

Free tools that integrate AI into your development workflow.

Name	Description
Continue.dev	Open-source AI code assistant for VS Code and JetBrains. Chat, autocomplete, and edit with any model. GitHub
Aider	AI pair programming in the terminal. Edits code in your local git repo. Supports GPT, Claude, and local models. GitHub
Codeium (Windsurf)	Free AI code assistant with autocomplete, chat, and search. Individual plan is free forever.
Tabby	Self-hosted AI coding assistant with no dependency on external services. GitHub
Cody (Sourcegraph)	Free tier for individuals. Chat, autocomplete, and commands with codebase context.
Llama Coder (Nutlope)	Free AI code generation tool. Generate entire apps from prompts.
Bolt.new (StackBlitz)	Free tier for AI-powered full-stack web app development in browser.
Claude Code (Anthropic)	Free tier with limited usage for terminal-based AI coding assistant.
Cursor 3	Apr 2026. AI-native code editor with deep model integration and agentic features. Free tier available.
OpenCode	Go-based terminal AI coding assistant. Model-neutral, supports multiple LLM providers, LSP integration, and MCP tools. GitHub
CodeBuff	CLI-based AI coding assistant that understands entire codebases. Multi-agent architecture, works with any model provider through natural language instructions.
Pi	Open-source terminal AI coding agent with a unified multi-provider API. Model-agnostic, supports OpenAI, Anthropic, Google, and any OpenAI-compatible endpoint. Extensible plugin architecture. GitHub
Cline	Popular autonomous VS Code agent. Creates/edits files, runs terminal commands, browses web. Open-source, BYOK (bring your own API key). GitHub
Roo Code	Community fork of Cline with faster feature releases. Open-source VS Code agent with deep model integration.
OpenHands	Autonomous AI software engineer. Navigates file systems, runs shell commands, tests code in browser. Self-hostable. GitHub
Twinny	Local-first AI coding extension for VS Code. Works entirely offline with local LLMs (Ollama, llama.cpp). Zero external dependencies.
Kodu (Claude Coder)	VS Code autonomous coding agent. Builds projects from scratch, handles complex tasks with natural language.
Goose	Open-source CLI agent for complex software engineering tasks. Extensible plugin system. Built by Block/Square. GitHub

📝 Code Models

Specialized for code generation, completion, and analysis.

Name	Description
MAI-Code-1-Flash (Microsoft)	Jun 2026. Microsoft's open-weight coding model for lowering infrastructure costs.
DeepSeek Coder	State-of-the-art open-weight code generation. DeepSeek's coder series leads SWE-bench. MIT license.
Qwen2.5-Coder (Alibaba)	Highly capable code model series (1.5B–32B). Excellent balance of speed and quality. Apache 2.0.
Codestral (Mistral)	Mistral's dedicated code generation model — fill-in-the-middle, completion, and instruction. GitHub
CodeGemma (Google)	Google's Gemma architecture fine-tuned for code completion and instruction. Apache 2.0.
StarCoder2 (BigCode)	Transparently trained code model covering 619 languages. OpenRAIL-M license.
Yi-Coder (01.AI)	Efficient coding model with strong long-context understanding. Yi License (Apache 2.0 compatible).
Granite Code (IBM)	IBM's enterprise-grade code model, available in multiple sizes. Apache 2.0.
Phi-4-mini (Microsoft)	Lightweight model optimized for reasoning and code. Punches above its weight class. MIT license.
Qwen3-Coder-Next (Alibaba)	Early 2026. Latest generation of Qwen's code series. Strong reasoning and long-context coding capabilities. Apache 2.0.
CodeLlama (Meta)	Aug 2023. Llama 2-based code generation pioneer. Supports infilling, completion, and instruction. Llama 2 Community License.
WizardCoder (WizardLM)	2023. Evol-Instruct fine-tuned for complex coding tasks. Strong general code generation performance. Apache 2.0.
OpenCodeInterpreter	2024. Integrates execution feedback to iteratively improve generated code. Bridges generation and execution. Apache 2.0.
Stable Code 3B (Stability AI)	Aug 2023. Lightweight 3B code model optimized for fill-in-the-middle. Efficient for local autocompletion. StabilityAI license.
CodeGeeX2 (THUDM)	2023. Multilingual code model supporting 20+ languages. Strong in both Chinese and English code tasks. Apache 2.0.
CodeT5+ (Salesforce)	2023. Encoder-decoder architecture unifying code generation, completion, and understanding. BSD-3 license.
SantaCoder (BigCode)	2023. Light 1.1B model specialized for Python, Java, and JavaScript. Fast and efficient for IDE integration.

🔍 RAG & Vector Databases

Free tools for building retrieval-augmented generation pipelines — vector storage, embedding search, and document retrieval.

Name	Description
Chroma	AI-native open-source embedding database. Runs in-process, no GPU needed. GitHub
Qdrant	High-performance vector search engine. Free tier on Qdrant Cloud or self-host via Docker. GitHub
pgvector	Vector similarity search inside PostgreSQL. Free if you already run Postgres.
LanceDB	Developer-friendly vector database built on Lance columnar format. Runs locally, no server needed. GitHub
Weaviate	Open-source vector database. Free sandbox tier on Weaviate Cloud. GitHub
Milvus (Zilliz)	Cloud-native vector database. Free tier on Zilliz Cloud or self-host. GitHub
txtai	AI-powered semantic search and RAG in a single Python package. GitHub
R2R (SciPhi)	Production-ready RAG engine with API, user management, and observability.
Docling (IBM)	Document understanding and conversion for RAG pipelines. Extracts PDFs, images, and more. GitHub
Unstructured.io	Preprocessing toolkit for documents (PDF, HTML, Word) for RAG pipelines. Free tier available.
RAGFlow	Open-source RAG engine with deep document parsing, OCR, and knowledge base management. Supports多种 document formats.
RAGatouille	Python package bringing ColBERT-style late interaction retrieval to RAG pipelines. Works as retriever and reranker. Free and open-source.
Canopy (Pinecone)	Open-source RAG framework built on Pinecone. End-to-end retrieval and generation with built-in chat interface.
Ragas	Open-source evaluation framework for RAG pipelines. Measures retrieval accuracy, answer relevance, and faithfulness.

🧩 Agentic Frameworks

Free, open-source frameworks for building AI agents and multi-agent systems.

Name	Description
LangGraph (LangChain)	Low-level framework for building stateful, multi-agent applications. GitHub
CrewAI	Multi-agent framework for orchestrating specialized AI agents to work together. GitHub
AutoGen (Microsoft)	Extensible framework for building multi-agent conversations. GitHub
Agno (formerly Phidata)	Full-stack AI framework for building multimodal agents with memory, knowledge, and tools. GitHub
PydanticAI	Agent framework by Pydantic with type-safe outputs and dependency injection. GitHub
Mastra	TypeScript framework for building AI applications and agent workflows. GitHub
OpenAI Agents SDK	Lightweight SDK for building single and multi-agent systems. GitHub
Semantic Kernel (Microsoft)	SDK for orchestrating AI agents with planners, memory, and connectors. GitHub
Dify	LLM app development platform with visual workflow builder and agent capabilities. GitHub
Flowise	Low-code visual LLM flow builder with drag-and-drop interface. GitHub
TaskWeaver (Microsoft)	Code-first agent framework for planning and executing complex tasks. GitHub
Fazm	Apr 2026. Open-source local computer-use agent for macOS. Drives apps via accessibility APIs, model-agnostic, faster than screenshot-based agents.
Smolagents (Hugging Face)	Minimalist agent library where agents "think in code." Lightweight, zero boilerplate. Supports code agents and tool-calling agents.
Swarms	Enterprise-grade multi-agent orchestration framework. Scalable infrastructure for autonomous agent swarms. Highly modular.
Letta (MemGPT)	Framework for long-term agent memory. Virtual memory management that pages data in/out of context like an OS. Persistent agents.
Griptape	Enterprise agent framework with strictly typed Pipelines, Workflows, and Agents. Structure-first, production-ready.
OpenAI Swarm	Experimental lightweight multi-agent orchestration. Uses Agents and Handoffs abstractions. Educational and minimalist.
Atomic Agents	Framework inspired by Atomic Design. Compose agents from small, reusable, modular components. Testable and scalable.
PraisonAI	Low-code multi-agent framework. Define agent roles, tasks, and flows via YAML configuration. Wraps underlying agent frameworks.
Cognee	GraphRAG framework for agent knowledge management. Builds interconnected knowledge graphs from unstructured data.
AgentZero	Self-healing autonomous agent with web UI. Manages own workflows, tool use, and environment. Self-evolving capabilities.
MetaGPT	Multi-agent framework simulating a full software team. Assigns Agent, Product Manager, Engineer roles. Implements SOPs for end-to-end code generation.
ChatDev (OpenBMB)	Virtual software company driven by multi-agent collaboration. Follows waterfall model through design, coding, testing, and documentation.
AutoGPT	The original autonomous agent experiment. Sets its own goals, iterates on tasks, and executes without continuous human input. Web browsing and file management.
Bee Agent Framework (IBM)	Production-ready framework for building reliable AI agents in Python and TypeScript. Modular, with built-in observability and IBM research optimizations.
Eliza (ai16z)	Multi-platform agent framework for creating character-driven AI agents. Handles social media interaction, complex decision-making, and autonomous behavior across platforms.
SuperAGI	Developer-focused autonomous agent platform with GUI. Built-in resource management, file handling, and multi-tasking for running agents at scale.
AgentVerse (OpenBMB)	Framework for building and evaluating multi-agent environments. Easily configure agent teams and measure collaborative performance.
Qwen-Agent (Alibaba)	Agent framework tightly integrated with the Qwen model family. Optimized for function calling, code execution, RAG, and tool use with Qwen models.
AGiXT	Extensible modular AI agent automation platform. Plugin system for swapping LLMs, memory backends, and tools. Highly customizable agent workflows.

🎛 Fine-tuning Tools

Tools to fine-tune free models on your own data — all free and open-source.

Name	Description
Unsloth	Fast memory-efficient fine-tuning. 2x faster, 50% less memory. Supports QLoRA, LoRA, full fine-tune.
Axolotl	Streamlined fine-tuning framework supporting multiple model architectures and quantization methods.
LLaMA-Factory	Easy-to-use fine-tuning with web UI. Supports 100+ models, multiple training methods.
Hugging Face TRL	Transformer Reinforcement Learning library. SFT, PPO, DPOTrainer, GRPOTrainer for aligning models.
TorchTune (Meta)	Native PyTorch library for fine-tuning LLMs. Simple, extensible, efficient.
AutoTrain (Hugging Face)	No-code fine-tuning platform. Train models with a web UI or API.
XTuner (InternLM)	Efficient fine-tuning toolkit supporting QLoRA, LoRA, and full fine-tune with multiple model architectures.
Ludwig (Predibase)	Declarative ML framework. Fine-tune models with a simple config file. GitHub
PyTorch Lightning	Free deep learning framework for training and fine-tuning. Simplifies distributed training, checkpointing, and logging. GitHub
Hugging Face Accelerate	Zero-config distributed training for PyTorch. Enables easy multi-GPU and TPU training with minimal code changes.
ColossalAI	Open-source distributed training system with parallelism strategies. Supports large model training on limited hardware.
JAX (Google)	High-performance ML framework with automatic differentiation and JIT compilation. Powers many modern training pipelines.
Ray Train	Distributed training framework built on Ray. Supports PyTorch, TensorFlow, and JAX with automatic scaling.
Determined AI	Open-source ML training platform with hyperparameter search, GPU scheduling, and experiment tracking.

✨ Prompt Engineering Tools

Free tools for testing, managing, and optimizing prompts.

Name	Description
Promptfoo	Open-source tool for prompt testing and evaluation. Systematic A/B testing of prompts. GitHub
Fabric (Daniel Miessler)	Open-source framework for augmenting humans with AI. Library of curated prompts (patterns) for common tasks.
LangFuse	Open-source LLM engineering platform with prompt management, versioning, and evaluation. GitHub
OpenPrompt (THUNLP)	Framework for prompt-learning research. Supports template and verbalizer design. GitHub
DSPy (Stanford)	Framework for algorithmically optimizing LM prompts and weights. GitHub
Agenta	Open-source LLM platform for prompt management, evaluation, and deployment. GitHub
ChainForge	Open-source visual programming environment for prompt engineering. Test prompts across multiple LLMs, compare responses, and evaluate robustness. GitHub
Latitude	Open-source prompt engineering platform with versioning, playground, evaluation, and deployment as API endpoints. GitHub
DeepEval	Open-source evaluation framework for LLM outputs. 50+ metrics, pytest integration, and CI/CD support for prompt regression testing.
PromptLayer	Prompt versioning and monitoring platform. Tracks prompt versions, cost, latency, and model behavior. Free tier with 10K calls/month.
OpenPromptHub	Community-driven prompt engineering platform. Discover, share, and contribute prompt patterns. Free and open-source.

📊 Datasets

Free, open datasets for training, fine-tuning, and evaluating models.

Name	Description
Hugging Face Datasets	The standard hub for open datasets. 150,000+ datasets across all tasks.
Common Corpus	Massive open-source dataset for training large language models.
The Stack v2 (BigCode)	Large-scale code dataset covering 619 programming languages. Permissive license.
FineWeb (Hugging Face)	High-quality web dataset for LLM pre-training. 15T tokens.
Dolly (Databricks)	15k instruction-response pairs for fine-tuning. CC-BY-SA.
OpenAssistant Conversations	160k human-generated assistant conversations. Apache 2.0.
ShareGPT (RyokoAI)	Real user-ChatGPT conversations for fine-tuning.
UltraChat (Sean C.)	200k multi-turn conversations synthesized by ChatGPT.
No Robots (Hugging Face)	10k high-quality human-written instructions. Apache 2.0.
RLAIF-V (OpenBMB)	AI-generated preference data for RLHF. Apache 2.0.
MMLU / GSM8K	Standard benchmarks for evaluation.

☁ Model Hosting Platforms

Free platforms that host models — run inference without downloading anything.

Name	Description
Hugging Face Spaces	Free hosting for ML apps (Gradio, Streamlit). Thousands of community demos.
Hugging Face Inference Endpoints (Free Tier)	Deploy models with free trial credits.
Google Colab (Free Tier)	Free GPU (T4, sometimes A100). Perfect for running models and fine-tuning.
Kaggle Notebooks	Free GPU (T4 x2). 30 hours/week. Good for heavier workloads.
Lightning AI Studio	Free tier with GPU access for development and prototyping.
Modal	Free monthly credits for serverless GPU compute.
Replicate (Free Tier)	Free credits for running community models.
Deepnote	Free tier with GPU for data science and ML notebooks.
Beam	$30/mo free credits for serverless GPU compute. Fast cold starts (<1s), auto-scaling, Python SDK. Open-source runtime.
Cerebrium	$30 free trial credits for serverless GPU infrastructure. Sub-second cold starts, pay-per-second billing, auto-scaling. SOC 2 compliant.
Baseten	Free trial credits for serverless GPU inference. Truss open-source framework, auto-scaling, multiple GPU options (T4 to H100).

📚 Learning Resources

Free courses, books, and tutorials for learning AI and LLMs.

Name	Description
Fast.ai	Code-first deep learning education. Practical, free courses from fundamentals to advanced.
Hugging Face NLP Course	Comprehensive free course on transformers, tokenizers, datasets, and deployment.
DeepLearning.AI Short Courses	Free short courses on LLMs, RAG, LangChain, and AI agents.
Full Stack Deep Learning	Free course on ML engineering: training, deploying, and maintaining models.
Andrej Karpathy's Course	From-scratch neural network implementation videos.
Neural Networks: Zero to Hero	YouTube series building neural networks from scratch.
LLM University (Cohere)	Free course on LLMs, embeddings, and RAG.
Prompt Engineering Guide (DAIR.AI)	Comprehensive free guide on prompt engineering techniques.
Anthropic Cookbook	Free recipes and patterns for working with Claude.
OpenAI Cookbook	Free examples and guides for the OpenAI API.

🏆 Resources & Leaderboards

Name	Description
Perplexity	Free AI search and research assistant with real-time answers and source citations.
Hugging Face Open LLM Leaderboard	The primary benchmark for open-weight models. Updated regularly.
LMSYS Chatbot Arena	Human preference rankings of models. Best source for real-world quality comparisons.
Artificial Analysis	Independent benchmarks for speed, pricing, and quality across providers.
Hugging Face Models	Search 1M+ models. Filter by license, task, framework.
OpenRouter Models	Browse models available via API with pricing and free tiers.
Ollama Library	Browse models available for one-command local setup.
cheahjs/free-llm-api-resources	Community-maintained list of free LLM API resources.
SweetTea	Community voting on model quality and preference.

👥 Communities

Name	Description
Hugging Face Discord	Model releases, discussions, and community support.
r/LocalLLaMA	The largest Reddit community for running local LLMs.
Ollama Discord	Ollama community for local model enthusiasts.
LM Studio Discord	LM Studio community.
Hugging Face Forums	Discussions on models, datasets, and Spaces.
r/MachineLearning	General ML/AI research and news.
Discord: AI Agents	Community for AI agent development and agentic frameworks.
r/OpenAI	Official Reddit community for OpenAI models, API discussions, and releases.
r/artificial	General AI discussion covering research, news, and ethics.
OpenAI Developer Forum	Official forum for OpenAI API developers. Share prompts, troubleshoot, and discuss best practices.
Nous Research Discord	Community for open-source AI development, Hermes models, and decentralized training (DisTrO).
Learn AI Together Discord	Active learning community with 10K+ members. Ask questions, find teammates, and share projects.

License

To the extent possible under law, the author has waived all copyright and related or neighboring rights to this work.

Name		Name	Last commit message	Last commit date
Latest commit History 53 Commits
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Awesome Free Models

Contents

🧠 Open-Weight Models

🔌 Free API Providers

💻 Local Inference Tools

💬 AI Chatbot UIs

🤖 AI Coding Assistants

📝 Code Models

🔍 RAG & Vector Databases

🧩 Agentic Frameworks

🎛 Fine-tuning Tools

✨ Prompt Engineering Tools

📊 Datasets

☁ Model Hosting Platforms

📚 Learning Resources

🏆 Resources & Leaderboards

👥 Communities

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

Awesome Free Models

Contents

🧠 Open-Weight Models

🔌 Free API Providers

💻 Local Inference Tools

💬 AI Chatbot UIs

🤖 AI Coding Assistants

📝 Code Models

🔍 RAG & Vector Databases

🧩 Agentic Frameworks

🎛 Fine-tuning Tools

✨ Prompt Engineering Tools

📊 Datasets

☁ Model Hosting Platforms

📚 Learning Resources

🏆 Resources & Leaderboards

👥 Communities

License

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages