Free LLM inference for coding agents and AI-powered IDEs.
FreeInference provides free access to state-of-the-art language models, designed for coding agents and AI-powered development tools such as Cursor, Codex, and Roo Code.
Visit our documentation at: https://harvardsys.github.io/free_inference/
- Cursor - AI-powered code editor
- Codex - Terminal-based coding assistant
- Roo Code - VS Code & JetBrains extension
- Kilo Code - AI coding assistant
- And any tool that supports OpenAI-compatible APIs
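All of these tools speak the same wire protocol, so the endpoint can also be exercised directly. The sketch below assumes the standard OpenAI `/chat/completions` path and uses `glm-4.7` from the model list; the request is only sent when a `FREEINFERENCE_API_KEY` variable (which, per the Codex setup, already includes the `Bearer ` prefix) is set.

```shell
# Sketch of a raw OpenAI-compatible request. The payload shape follows the
# standard OpenAI chat API; "glm-4.7" is one of the models listed below.
cat > /tmp/fi_request.json <<'EOF'
{
  "model": "glm-4.7",
  "messages": [
    {"role": "user", "content": "Reverse a string in Python."}
  ]
}
EOF

# Only send the request when a key is actually configured.
if [ -n "${FREEINFERENCE_API_KEY:-}" ]; then
  curl -s https://freeinference.org/v1/chat/completions \
    -H "Authorization: $FREEINFERENCE_API_KEY" \
    -H "Content-Type: application/json" \
    -d @/tmp/fi_request.json
fi
```

The response follows the usual OpenAI chat-completion JSON shape, so existing OpenAI client code should work unchanged once the base URL is overridden.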
- Open Settings (`Cmd + ,` or `Ctrl + ,`)
- Go to the API Keys section
- Enter your FreeInference API key
- Click Override OpenAI Base URL
- Enter: `https://freeinference.org/v1`
- Enable the toggle and start coding!
- Create `~/.codex/config.toml`:

  ```toml
  model = "glm-4.7"
  model_provider = "free_inference"

  [model_providers.free_inference]
  name = "FreeInference"
  base_url = "https://freeinference.org/v1"
  wire_api = "chat"
  env_http_headers = { "X-Session-ID" = "CODEX_SESSION_ID", "Authorization" = "FREEINFERENCE_API_KEY" }
  ```

- Add to `~/.zshrc` or `~/.bashrc`:

  ```shell
  export CODEX_SESSION_ID="$(date +%Y%m%d-%H%M%S)-$(uuidgen)"
  export FREEINFERENCE_API_KEY="Bearer your-api-key-here"
  ```

- Reload: `source ~/.zshrc`
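The `CODEX_SESSION_ID` export above builds a timestamp-plus-UUID identifier once per shell session. A standalone sketch of the same idea, with an added fallback for Linux systems that lack `uuidgen`:

```shell
# Build a session ID like "20250101-120000-<uuid>". Fall back to the kernel's
# UUID source if uuidgen is not installed (assumption: Linux /proc is present
# whenever uuidgen is missing).
SESSION_ID="$(date +%Y%m%d-%H%M%S)-$(uuidgen 2>/dev/null || cat /proc/sys/kernel/random/uuid)"
echo "$SESSION_ID"
```

Because the ID is generated when the shell starts, each terminal session gets its own value, which is what the `X-Session-ID` header in the config expects.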
- Install the extension in your IDE
- Open settings
- Select OpenAI Compatible as provider
- Configure:
  - Base URL: `https://freeinference.org/v1`
  - API Key: `your-api-key-here`
- Select your preferred model
- GLM-4.7 - 200K context, best for long context and bilingual support
- GLM-4.7-Flash - 200K context, fast and cost-effective
- MiniMax M2 - 196K context, best for very large codebases
- Qwen3 Coder 30B - 32K context, specialized for code generation
- Llama 3.3 70B - 131K context, general coding (limited capacity)
- Llama 4 Scout - 128K context, optimized for speed (limited capacity)
- Llama 4 Maverick - 128K context, multimodal support (limited capacity)
See the Models documentation for the complete list.
- Visit https://freeinference.org
- Register for a free account
- Log in and create your API key
- Start using FreeInference with your favorite IDE!
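Once the key exists, a quick sanity check is to list the available models. This sketch assumes the endpoint mirrors the standard OpenAI `GET /v1/models` route and that `FREEINFERENCE_API_KEY` is exported as in the Codex section (value includes the `Bearer ` prefix):

```shell
BASE_URL="https://freeinference.org/v1"

# List available models. The request is skipped when no key is set, so the
# snippet is safe to run before registration.
if [ -n "${FREEINFERENCE_API_KEY:-}" ]; then
  curl -s "$BASE_URL/models" -H "Authorization: $FREEINFERENCE_API_KEY"
else
  echo "Set FREEINFERENCE_API_KEY first (see the Codex setup above)."
fi
```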
- Documentation: https://harvardsys.github.io/free_inference/
- Issues: GitHub Issues
- Questions: Contact the team