diff --git a/docs/providers/deepinfra.md b/docs/providers/deepinfra.md
new file mode 100644
index 00000000..ed8963df
--- /dev/null
+++ b/docs/providers/deepinfra.md
@@ -0,0 +1,90 @@
+---
+sidebar_label: DeepInfra
+description: Configure DeepInfra's high-performance AI models in Roo Code. Access Qwen Coder, Llama, and other open-source models with prompt caching and vision capabilities.
+keywords:
+  - deepinfra
+  - deep infra
+  - roo code
+  - api provider
+  - qwen coder
+  - llama models
+  - prompt caching
+  - vision models
+  - open source ai
+image: /img/social-share.jpg
+---
+
+# Using DeepInfra With Roo Code
+
+DeepInfra provides cost-effective access to high-performance open-source models with features like prompt caching, vision support, and specialized coding models. Their infrastructure offers low latency and automatic load balancing across global edge locations.
+
+**Website:** [https://deepinfra.com/](https://deepinfra.com/)
+
+---
+
+## Getting an API Key
+
+1. **Sign Up/Sign In:** Go to [DeepInfra](https://deepinfra.com/). Create an account or sign in.
+2. **Navigate to API Keys:** Access the API keys section in your dashboard.
+3. **Create a Key:** Generate a new API key. Give it a descriptive name (e.g., "Roo Code").
+4. **Copy the Key:** **Important:** Copy the API key immediately. Store it securely.
+
+---
+
+## Supported Models
+
+Roo Code dynamically fetches available models from DeepInfra's API. The default model is:
+
+* `Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo` (256K context, optimized for coding)
+
+Common models available include:
+
+* **Coding Models:** Qwen Coder series, specialized for programming tasks
+* **General Models:** Llama 3.1, Mixtral, and other open-source models
+* **Vision Models:** Models with image understanding capabilities
+* **Reasoning Models:** Models with advanced reasoning support
+
+Browse the full catalog at [deepinfra.com/models](https://deepinfra.com/models).
+
+---
+
+## Configuration in Roo Code
+
+1. **Open Roo Code Settings:** Click the gear icon in the Roo Code panel.
+2. **Select Provider:** Choose "DeepInfra" from the "API Provider" dropdown.
+3. **Enter API Key:** Paste your DeepInfra API key into the "DeepInfra API Key" field.
+4. **Select Model:** Choose your desired model from the "Model" dropdown.
+   - Models will auto-populate after entering a valid API key
+   - Click "Refresh Models" to update the list
+
+---
+
+## Advanced Features
+
+### Prompt Caching
+
+DeepInfra supports prompt caching for eligible models, which:
+- Reduces costs for repeated contexts
+- Improves response times for similar queries
+- Automatically manages cache based on task IDs
+
+### Vision Support
+
+Models with vision capabilities can:
+- Process images alongside text
+- Understand visual content for coding tasks
+- Analyze screenshots and diagrams
+
+### Custom Base URL
+
+For enterprise deployments, you can configure a custom base URL in the advanced settings.
+
+---
+
+## Tips and Notes
+
+* **Performance:** DeepInfra offers low latency with automatic load balancing across global locations.
+* **Cost Efficiency:** Competitive pricing with prompt caching to reduce costs for repeated contexts.
+* **Model Variety:** Access to the latest open-source models including specialized coding models.
+* **Context Windows:** Models support context windows up to 256K tokens for large codebases.
+* **Pricing:** Pay-per-use model with no minimums. Check [deepinfra.com](https://deepinfra.com/) for current pricing.
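+
+---
+
+## Verifying Your Key (Optional)
+
+DeepInfra exposes an OpenAI-compatible API, so you can sanity-check a new key with a one-off chat completion before configuring Roo Code. The snippet below is a minimal sketch, not an official example: the base URL is an assumption based on DeepInfra's published OpenAI-compatible endpoint, so confirm it against their current docs.
+
+```typescript
+// Minimal sketch: one chat completion against DeepInfra's
+// OpenAI-compatible endpoint (Node 18+, global fetch).
+// The base URL is an assumption; verify it in DeepInfra's docs.
+const DEEPINFRA_BASE_URL = "https://api.deepinfra.com/v1/openai";
+
+async function checkDeepInfraKey(apiKey: string): Promise<void> {
+  const response = await fetch(`${DEEPINFRA_BASE_URL}/chat/completions`, {
+    method: "POST",
+    headers: {
+      "Content-Type": "application/json",
+      Authorization: `Bearer ${apiKey}`,
+    },
+    body: JSON.stringify({
+      // Roo Code's default DeepInfra model (see Supported Models above)
+      model: "Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo",
+      messages: [{ role: "user", content: "Reply with one word: ready" }],
+      max_tokens: 8,
+    }),
+  });
+
+  if (!response.ok) {
+    throw new Error(`DeepInfra request failed: HTTP ${response.status}`);
+  }
+
+  const data = await response.json();
+  console.log(data.choices[0].message.content);
+}
+
+checkDeepInfraKey(process.env.DEEPINFRA_API_KEY ?? "").catch(console.error);
+```
+
+If this returns a completion, the key works; paste the same key into Roo Code's "DeepInfra API Key" field.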
\ No newline at end of file
diff --git a/docs/update-notes/index.md b/docs/update-notes/index.md
index fbbe1427..1ceb66d4 100644
--- a/docs/update-notes/index.md
+++ b/docs/update-notes/index.md
@@ -19,6 +19,7 @@ image: /img/social-share.jpg
 
 ### Version 3.26
 
+* [3.26.7](/update-notes/v3.26.7) (2025-09-05)
 * [3.26.6](/update-notes/v3.26.6) (2025-09-03)
 * [3.26.5](/update-notes/v3.26.5) (2025-09-03)
 * [3.26.4](/update-notes/v3.26.4) (2025-09-01)
diff --git a/docs/update-notes/v3.26.7.mdx b/docs/update-notes/v3.26.7.mdx
new file mode 100644
index 00000000..16d1dbb4
--- /dev/null
+++ b/docs/update-notes/v3.26.7.mdx
@@ -0,0 +1,63 @@
+---
+description: Enhanced Kimi K2 models with 256K+ context windows, OpenAI service tiers for flexible pricing, and DeepInfra as a new provider with 100+ models.
+keywords:
+  - roo code 3.26.7
+  - kimi k2 models
+  - openai service tiers
+  - deepinfra provider
+  - bug fixes
+image: /img/social-share.jpg
+---
+
+# Roo Code 3.26.7 Release Notes (2025-09-05)
+
+This release brings enhanced Kimi K2 models with massive context windows, OpenAI service tier selection, and DeepInfra as a new provider offering 100+ models.
+
+## Kimi K2-0905: Moonshot's Latest Open Source Model is Live in Roo Code
+
+We've upgraded to the latest Kimi K2-0905 models across multiple providers (thanks CellenLee!) ([#7663](https://github.com/RooCodeInc/Roo-Code/pull/7663), [#7693](https://github.com/RooCodeInc/Roo-Code/pull/7693)).
+
+K2-0905 comes with three major upgrades:
+- **256K Context Window**: A context window of 256K-262K tokens (depending on provider), double the previous limit, for processing much larger documents and conversations
+- **Improved Tool Calling**: Enhanced function calling and tool use capabilities for better agentic workflows
+- **Enhanced Front-end Development**: Superior HTML, CSS, and JavaScript generation with modern framework support
+
+Available through the Groq, Moonshot, and Fireworks providers. These models excel at handling large codebases, long conversations, and complex multi-file operations.
+
+## OpenAI Service Tiers
+
+We've added support for OpenAI's new Responses API service tiers ([#7646](https://github.com/RooCodeInc/Roo-Code/pull/7646)):
+
+- **Standard Tier**: Default tier with regular pricing
+- **Flex Tier**: 50% discount with slightly longer response times for non-urgent tasks
+- **Priority Tier**: Faster response times for time-critical operations
+
+Select your preferred tier directly in the UI based on your needs and budget. This gives you more control over costs while maintaining access to OpenAI's powerful models.
+
+> **📚 Documentation**: See [OpenAI Provider Guide](/providers/openai) for detailed tier comparison and pricing.
+
+## DeepInfra Provider
+
+DeepInfra is now available as a model provider (thanks Thachnh!) ([#7677](https://github.com/RooCodeInc/Roo-Code/pull/7677)):
+
+- **100+ Models**: Access to a vast selection of open-source and frontier models
+- **Competitive Pricing**: Cost-effective rates compared to other providers
+- **Automatic Prompt Caching**: Built-in prompt caching for supported models like Qwen3 Coder
+- **Fast Inference**: Optimized infrastructure for quick response times
+
+DeepInfra is an excellent choice for developers looking for variety and value in their AI model selection.
+
+> **📚 Documentation**: See [DeepInfra Provider Setup](/providers/deepinfra) to get started.
+
+## QOL Improvements
+
+* **Shell Security**: Added shell executable allowlist validation with platform-specific fallbacks for improved command execution safety ([#7681](https://github.com/RooCodeInc/Roo-Code/pull/7681))
+
+## Bug Fixes
+
+* **MCP Tool Validation**: Roo now validates that an MCP tool exists before executing it and shows a helpful error message listing the available tools (thanks R-omk!) ([#7632](https://github.com/RooCodeInc/Roo-Code/pull/7632))
+* **OpenAI API Key Errors**: Clear error messages now appear when API keys contain invalid characters, instead of cryptic ByteString errors (thanks A0nameless0man!) ([#7586](https://github.com/RooCodeInc/Roo-Code/pull/7586))
+* **Follow-up Questions**: Fixed the countdown timer incorrectly reappearing in task history for already answered follow-up questions (thanks XuyiK!) ([#7686](https://github.com/RooCodeInc/Roo-Code/pull/7686))
+* **Moonshot Token Limit**: Resolved an issue where Moonshot models were incorrectly limited to 1024 tokens; configured limits are now respected (thanks wangxiaolong100, greyishsong!) ([#7673](https://github.com/RooCodeInc/Roo-Code/pull/7673))
+* **Zsh Command Safety**: Improved handling of zsh process substitution and glob qualifiers to prevent auto-execution of potentially dangerous commands ([#7658](https://github.com/RooCodeInc/Roo-Code/pull/7658), [#7667](https://github.com/RooCodeInc/Roo-Code/pull/7667))
+* **Traditional Chinese Localization**: Fixed a typo in the zh-TW locale text (thanks PeterDaveHello!) ([#7672](https://github.com/RooCodeInc/Roo-Code/pull/7672))
\ No newline at end of file
diff --git a/docs/update-notes/v3.26.mdx b/docs/update-notes/v3.26.mdx
index cb0d4f0a..e190aee3 100644
--- a/docs/update-notes/v3.26.mdx
+++ b/docs/update-notes/v3.26.mdx
@@ -94,8 +94,32 @@ PRs: [#7474](https://github.com/RooCodeInc/Roo-Code/pull/7474), [#7492](https://
 
 > **📚 Documentation**: See [Image Generation - Editing Existing Images](/features/image-generation#editing-existing-images) for transformation examples.
 
+### Kimi K2-0905: Moonshot's Latest Open Source Model is Live in Roo Code
+
+We've upgraded to the latest Kimi K2-0905 models across multiple providers (thanks CellenLee!) ([#7663](https://github.com/RooCodeInc/Roo-Code/pull/7663), [#7693](https://github.com/RooCodeInc/Roo-Code/pull/7693)).
+
+K2-0905 comes with three major upgrades:
+- **256K Context Window**: A context window of 256K-262K tokens (depending on provider), double the previous limit, for processing much larger documents and conversations
+- **Improved Tool Calling**: Enhanced function calling and tool use capabilities for better agentic workflows
+- **Enhanced Front-end Development**: Superior HTML, CSS, and JavaScript generation with modern framework support
+
+Available through the Groq, Moonshot, and Fireworks providers. These models excel at handling large codebases, long conversations, and complex multi-file operations.
+
+### OpenAI Service Tiers
+
+We've added support for OpenAI's new Responses API service tiers ([#7646](https://github.com/RooCodeInc/Roo-Code/pull/7646)):
+
+- **Standard Tier**: Default tier with regular pricing
+- **Flex Tier**: 50% discount with slightly longer response times for non-urgent tasks
+- **Priority Tier**: Faster response times for time-critical operations
+
+Select your preferred tier directly in the UI based on your needs and budget. This gives you more control over costs while maintaining access to OpenAI's powerful models.
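+
+Under the hood, the tier selection corresponds to a request-level `service_tier` parameter in OpenAI's API. Below is a minimal, hedged sketch of a direct Responses API call: the `service_tier` values mirror the tiers above, but the model name is a placeholder and tier availability varies by model, so check OpenAI's documentation before relying on it.
+
+```typescript
+import OpenAI from "openai";
+
+const client = new OpenAI(); // reads OPENAI_API_KEY from the environment
+
+async function main(): Promise<void> {
+  // Sketch: request flex processing for a non-urgent task.
+  // "gpt-4o" is a placeholder; verify which models support each tier.
+  const response = await client.responses.create({
+    model: "gpt-4o",
+    input: "Summarize this changelog entry in one sentence.",
+    service_tier: "flex", // "default" (standard) | "flex" | "priority"
+  });
+  console.log(response.output_text);
+}
+
+main().catch(console.error);
+```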
+
+> **📚 Documentation**: See [OpenAI Provider Guide](/providers/openai) for detailed tier comparison and pricing.
+
 ### Provider Updates
 
+* **DeepInfra Provider**: DeepInfra is now available as a model provider with 100+ open-source and frontier models, competitive pricing, and automatic prompt caching for supported models like Qwen3 Coder (thanks Thachnh!) ([#7677](https://github.com/RooCodeInc/Roo-Code/pull/7677))
 * **Kimi K2 Turbo Model**: Added support for the high-speed Kimi K2 Turbo model with 60-100 tokens/sec processing and a 131K token context window (thanks wangxiaolong100!) ([#7593](https://github.com/RooCodeInc/Roo-Code/pull/7593))
 * **Qwen3 235B Thinking Model**: Added support for Qwen3-235B-A22B-Thinking-2507 model with an impressive 262K context window, enabling processing of extremely long documents and large codebases in a single request through the Chutes provider (thanks mohammad154, apple-techie!) ([#7578](https://github.com/RooCodeInc/Roo-Code/pull/7578))
 * **Ollama Turbo Mode**: Added API key support for Turbo mode, enabling faster model execution with datacenter-grade hardware (thanks LivioGama!) ([#7425](https://github.com/RooCodeInc/Roo-Code/pull/7425))
@@ -104,6 +128,7 @@ ### QOL Improvements
 
+* **Shell Security**: Added shell executable allowlist validation with platform-specific fallbacks for improved command execution safety ([#7681](https://github.com/RooCodeInc/Roo-Code/pull/7681))
 * **Settings Scroll Position**: Settings tabs now remember their individual scroll positions when switching between them (thanks DC-Dancao!) ([#7587](https://github.com/RooCodeInc/Roo-Code/pull/7587))
 * **MCP Resource Auto-Approval**: MCP resource access requests are now automatically approved when auto-approve is enabled, eliminating manual approval steps and enabling smoother automation workflows (thanks m-ibm!) ([#7606](https://github.com/RooCodeInc/Roo-Code/pull/7606))
 * **Message Queue Performance**: Improved message queueing reliability and performance by moving the queue management to the extension host, making the interface more stable ([#7604](https://github.com/RooCodeInc/Roo-Code/pull/7604))
@@ -122,6 +147,12 @@ ### Bug Fixes
 
+* **MCP Tool Validation**: Roo now validates that an MCP tool exists before executing it and shows a helpful error message listing the available tools (thanks R-omk!) ([#7632](https://github.com/RooCodeInc/Roo-Code/pull/7632))
+* **OpenAI API Key Errors**: Clear error messages now appear when API keys contain invalid characters, instead of cryptic ByteString errors (thanks A0nameless0man!) ([#7586](https://github.com/RooCodeInc/Roo-Code/pull/7586))
+* **Follow-up Questions**: Fixed the countdown timer incorrectly reappearing in task history for already answered follow-up questions (thanks XuyiK!) ([#7686](https://github.com/RooCodeInc/Roo-Code/pull/7686))
+* **Moonshot Token Limit**: Resolved an issue where Moonshot models were incorrectly limited to 1024 tokens; configured limits are now respected (thanks wangxiaolong100, greyishsong!) ([#7673](https://github.com/RooCodeInc/Roo-Code/pull/7673))
+* **Zsh Command Safety**: Improved handling of zsh process substitution and glob qualifiers to prevent auto-execution of potentially dangerous commands ([#7658](https://github.com/RooCodeInc/Roo-Code/pull/7658), [#7667](https://github.com/RooCodeInc/Roo-Code/pull/7667))
+* **Traditional Chinese Localization**: Fixed a typo in the zh-TW locale text (thanks PeterDaveHello!) ([#7672](https://github.com/RooCodeInc/Roo-Code/pull/7672))
 * **Tool Approval Fix**: Fixed an error that occurred when using insert_content and search_and_replace tools on write-protected files - these tools now handle file protection correctly ([#7649](https://github.com/RooCodeInc/Roo-Code/pull/7649))
 * **Configurable Embedding Batch Size**: Fixed an issue where users with API providers having stricter batch limits couldn't use code indexing. You can now configure the embedding batch size (1-2048, default: 400) to match your provider's limits (thanks BenLampson!) ([#7464](https://github.com/RooCodeInc/Roo-Code/pull/7464))
 * **OpenAI-Native Cache Reporting**: Fixed cache usage statistics and cost calculations when using the OpenAI-Native provider with cached content ([#7602](https://github.com/RooCodeInc/Roo-Code/pull/7602))
diff --git a/sidebars.ts b/sidebars.ts
index 51a6853f..e1a045da 100644
--- a/sidebars.ts
+++ b/sidebars.ts
@@ -164,6 +164,7 @@ const sidebars: SidebarsConfig = {
       'providers/claude-code',
       'providers/bedrock',
       'providers/cerebras',
+      'providers/deepinfra',
       'providers/deepseek',
       'providers/doubao',
       'providers/featherless',
@@ -221,6 +222,7 @@ label: '3.26',
          items: [
            { type: 'doc', id: 'update-notes/v3.26', label: '3.26 Combined' },
+           { type: 'doc', id: 'update-notes/v3.26.7', label: '3.26.7' },
            { type: 'doc', id: 'update-notes/v3.26.6', label: '3.26.6' },
            { type: 'doc', id: 'update-notes/v3.26.5', label: '3.26.5' },
            { type: 'doc', id: 'update-notes/v3.26.4', label: '3.26.4' },