feat: Add IBM Granite model support via watsonx.ai provider (OSS-35)#5441
Conversation
Add native support for IBM Granite models through the watsonx.ai Model Gateway OpenAI-compatible API.

Implementation:
- New watsonx provider at `llms/providers/watsonx/` extending `OpenAICompletion`
- IBM Cloud IAM token exchange with thread-safe caching and auto-refresh
- Support for `WATSONX_API_KEY`, `WATSONX_PROJECT_ID`, `WATSONX_REGION` env vars
- 12 Granite models in constants (3.x, 4.x, code, guardian families)
- Full LLM routing: `watsonx/ibm/granite-4-h-small` or `provider='watsonx'`
- No new dependencies required (uses existing `openai` + `httpx`)

Usage:
- `llm = LLM(model='watsonx/ibm/granite-4-h-small')`
- `llm = LLM(model='ibm/granite-4-h-small', provider='watsonx')`

Closes OSS-35
        "crewai.llms.providers.openai.completion.OpenAICompletion.__init__",
        return_value=None,
    ) as mock_init:
        completion = WatsonxCompletion.__new__(WatsonxCompletion)
Cursor Bugbot has reviewed your changes and found 4 potential issues.
Reviewed by Cursor Bugbot for commit d3f422a. Configure here.
        api_key=current_token,
        base_url=base_url,
        default_headers=watsonx_headers,
    )
Project ID header never injected into API client
High Severity
The _build_client method injects the critical X-Watsonx-Project-Id header, but it overrides a method that doesn't exist in the parent OpenAICompletion class. The parent creates self.client and self.async_client directly in its __init__ without calling any _build_client hook. As a result, the project ID header is never applied to the actual HTTP clients, and all watsonx.ai API requests will lack the required project context. The WatsonxCompletion.__init__ needs to pass default_headers containing the project ID to super().__init__().
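One possible fix, sketched below: route the header through `super().__init__()` so both clients are constructed with it. The parent class here is a minimal stand-in (its real constructor signature is an assumption based on the review text):

```python
class OpenAICompletion:
    """Minimal stand-in for the real parent class, which (per the review)
    creates self.client/self.async_client in __init__ from default_headers."""
    def __init__(self, model, default_headers=None, **kwargs):
        self.model = model
        self.default_headers = dict(default_headers or {})

class WatsonxCompletion(OpenAICompletion):
    def __init__(self, model, project_id, **kwargs):
        headers = dict(kwargs.pop("default_headers", None) or {})
        headers["X-Watsonx-Project-Id"] = project_id
        # Pass the header through the parent constructor so both the sync
        # and async clients pick it up, instead of relying on a
        # _build_client hook the parent never calls.
        super().__init__(model=model, default_headers=headers, **kwargs)
```

This keeps the header injection on the one code path the parent actually executes.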
Additional Locations (1)
        """
        current_token = self._iam_manager.get_token()
        if hasattr(self, "client") and self.client is not None:
            self.client.api_key = current_token
Async client token never refreshed before calls
High Severity
_ensure_fresh_token updates self.client.api_key but never updates self.async_client.api_key. When acall() is invoked, it calls _ensure_fresh_token() then delegates to the parent's acall(), which uses self.async_client for the actual API request. After the initial IAM token expires, all async calls will fail with authentication errors because the async client still holds the stale token.
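A sketch of the fix: refresh the token on both clients in `_ensure_fresh_token`. The class below is an illustrative fragment (the client and IAM-manager objects are stubbed; attribute names follow the review):

```python
from types import SimpleNamespace

class WatsonxCompletion:
    """Illustrative fragment only; real clients are OpenAI SDK instances."""
    def __init__(self, iam_manager):
        self._iam_manager = iam_manager
        self.client = SimpleNamespace(api_key=None)
        self.async_client = SimpleNamespace(api_key=None)

    def _ensure_fresh_token(self) -> None:
        current_token = self._iam_manager.get_token()
        # Update BOTH clients: the parent's acall() sends requests through
        # self.async_client, which would otherwise keep the stale token.
        for client in (getattr(self, "client", None),
                       getattr(self, "async_client", None)):
            if client is not None:
                client.api_key = current_token
```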
Additional Locations (1)
            Configuration dict with watsonx-specific fields.
        """
        config = super().to_config_dict()
        config["model"] = f"watsonx/{self.model}" if "/" not in self.model else f"watsonx/{self.model}"
Ternary condition with identical branches is redundant
Low Severity
In to_config_dict, the expression f"watsonx/{self.model}" if "/" not in self.model else f"watsonx/{self.model}" has identical true and false branches, making the condition meaningless. This suggests one branch was intended to handle a different case (e.g., not prepending watsonx/ when the model already contains a provider prefix).
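Assuming the intent was to avoid double-prefixing, the ternary could be reduced to a prefix check. A minimal sketch of that assumed intent (the helper name is hypothetical):

```python
def qualified_model_name(model: str) -> str:
    """Prepend the 'watsonx/' provider prefix only when the model name
    is not already provider-qualified (assumed intent of the original
    ternary, whose two branches were identical)."""
    return model if model.startswith("watsonx/") else f"watsonx/{model}"
```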
        "bedrock": "bedrock",
        "aws": "bedrock",
        "watsonx": "watsonx",
        "ibm": "watsonx",
Model name mangled when using ibm/ prefix routing
Medium Severity
Adding "ibm": "watsonx" to the provider_mapping causes LLM(model="ibm/granite-4-h-small") to treat "ibm" as a provider prefix and strip it, passing model="granite-4-h-small" to WatsonxCompletion. However, the watsonx.ai API expects the full model name "ibm/granite-4-h-small" — the "ibm/" is part of the model identifier, not a provider prefix. The _matches_provider_pattern fallback accepts the "granite" prefix, so validation passes silently, but the API call will fail with a model-not-found error.
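One way to address this, sketched as a hypothetical routing helper (the real routing code is not shown in this review, so names and structure here are assumptions): treat `ibm` as a routing hint only and keep the full model identifier intact.

```python
def resolve_provider_and_model(model: str) -> tuple[str, str]:
    """Map a prefixed model name to (provider, model_sent_to_api).

    'watsonx/' is a pure routing prefix and is stripped; 'ibm/' routes to
    watsonx but stays in the model name, because watsonx.ai expects the
    full identifier 'ibm/granite-...'.
    """
    prefix, _, rest = model.partition("/")
    if prefix == "watsonx":
        return "watsonx", rest
    if prefix == "ibm":
        return "watsonx", model  # keep 'ibm/' in the model identifier
    return prefix, model
```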
Additional Locations (1)


Summary
Adds native support for IBM Granite models through the watsonx.ai Model Gateway's OpenAI-compatible API. No new dependencies required.
What's included
New watsonx provider (`llms/providers/watsonx/`)
- `WatsonxCompletion` class extending `OpenAICompletion`: leverages the OpenAI SDK to talk to watsonx's OpenAI-compatible Model Gateway
- `project_id` injection via the `X-Watsonx-Project-Id` header

Model constants
- 12 IBM Granite models added to `WATSONX_MODELS`

LLM routing integration
- `watsonx` and `ibm` added to `SUPPORTED_NATIVE_PROVIDERS`

Tests
Unit tests covering IAM token exchange, caching, refresh, URL resolution, model capabilities, and routing.
Usage
Environment variables:
- `WATSONX_API_KEY`: IBM Cloud API key (required)
- `WATSONX_PROJECT_ID`: watsonx.ai project ID (required)
- `WATSONX_REGION`: IBM Cloud region (optional, default: us-south)
- `WATSONX_URL`: full base URL override (optional)

Design decisions
- Reuses the `openai` SDK + `httpx` (already core deps)

Closes OSS-35
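As a sketch, configuring the provider from the environment might look like the following (all values are placeholders, not real credentials):

```shell
# Required: IBM Cloud API key and watsonx.ai project ID (placeholder values)
export WATSONX_API_KEY="your-ibm-cloud-api-key"
export WATSONX_PROJECT_ID="your-project-id"

# Optional: region (defaults to us-south)...
export WATSONX_REGION="eu-de"
# ...or a full base-URL override instead
# export WATSONX_URL="https://custom-gateway.example.com/v1"
```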
Note
Medium Risk
Adds a new native LLM provider with IAM token exchange and request header injection, which affects authentication and outbound request behavior. While largely additive and covered by tests, misconfiguration or token-refresh bugs could break watsonx calls at runtime.
Overview
Adds a new native `watsonx`/`ibm` provider so `LLM` can route IBM Granite model names to a dedicated `WatsonxCompletion` implementation.

Defines `WATSONX_MODELS` constants and extends provider validation/inference/pattern-matching so Granite models (and `granite` prefixes) are recognized both with `watsonx/...` prefixes and via explicit `provider="watsonx"`.

Implements IAM API-key-to-bearer-token exchange with thread-safe caching/refresh, injects `X-Watsonx-Project-Id` on requests, supports region/URL configuration, and adds unit tests for token handling, URL/region resolution, model capabilities, and factory routing.
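The thread-safe token exchange described above could be sketched roughly as follows. The class and its structure are illustrative assumptions (the PR's actual implementation is not shown here); the HTTP exchange against IBM Cloud IAM is stubbed behind an injectable callable:

```python
import threading
import time

class IAMTokenManager:
    """Sketch of a thread-safe cached IAM token exchange.

    `exchange` stands in for the real IBM Cloud IAM HTTP call and must
    return (token, ttl_seconds); wiring it to httpx is left out here.
    """

    REFRESH_MARGIN = 60  # refresh this many seconds before expiry

    def __init__(self, api_key, exchange):
        self._api_key = api_key
        self._exchange = exchange
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0

    def get_token(self) -> str:
        # Serialize refreshes so concurrent callers never race the exchange.
        with self._lock:
            now = time.time()
            if self._token is None or now >= self._expires_at - self.REFRESH_MARGIN:
                token, ttl = self._exchange(self._api_key)
                self._token = token
                self._expires_at = now + ttl
            return self._token
```

Callers only ever see a valid bearer token; the refresh-ahead margin avoids sending a token that expires mid-request.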