
feat: Add IBM Granite model support via watsonx.ai provider (OSS-35)#5441

Open
iris-clawd wants to merge 1 commit into main from feature/ibm-granite-support

Conversation

iris-clawd (Contributor) commented Apr 13, 2026

Summary

Adds native support for IBM Granite models through the watsonx.ai Model Gateway's OpenAI-compatible API. No new dependencies required.

What's included

New watsonx provider (llms/providers/watsonx/)

  • WatsonxCompletion class extending OpenAICompletion — leverages the OpenAI SDK to talk to watsonx's OpenAI-compatible Model Gateway
  • IBM Cloud IAM token exchange — transparently exchanges API keys for Bearer tokens with thread-safe caching and auto-refresh before expiry
  • project_id injection via X-Watsonx-Project-Id header
  • Regional URL construction (us-south, eu-de, eu-gb, jp-tok, au-syd)
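The IAM exchange described above can be sketched as a small thread-safe cache. This is an illustrative sketch, not the PR's actual implementation: the class and parameter names are assumptions, and the real code would POST the API key to IBM Cloud's IAM token endpoint, whereas here the exchange is injected as a callable so the caching and refresh logic can be shown in isolation.

```python
import threading
import time


class IAMTokenManager:
    """Thread-safe cache for an IAM bearer token (illustrative sketch)."""

    def __init__(self, api_key, fetch_token, refresh_margin=60.0):
        self._api_key = api_key
        self._fetch_token = fetch_token  # callable: api_key -> (token, ttl_seconds)
        self._refresh_margin = refresh_margin  # refresh this many seconds before expiry
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0

    def get_token(self):
        with self._lock:
            # Lazy refresh: only exchange when missing or close to expiry.
            if self._token is None or time.time() >= self._expires_at - self._refresh_margin:
                token, ttl = self._fetch_token(self._api_key)
                self._token = token
                self._expires_at = time.time() + ttl
            return self._token


# Demonstrate the caching behavior with a fake exchange.
calls = []

def fake_exchange(api_key):
    calls.append(api_key)
    return f"bearer-{len(calls)}", 3600.0

mgr = IAMTokenManager("my-api-key", fake_exchange)
first = mgr.get_token()
second = mgr.get_token()  # served from cache; no second exchange
```

The lock makes concurrent `get_token()` callers safe, and the margin ensures a token is swapped out before it actually expires rather than after a 401.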

Model constants

12 IBM Granite models added to WATSONX_MODELS:

  • Granite 4.x hybrid family (micro/tiny/small)
  • Granite 3.x instruct + base models
  • Granite Code (8B)
  • Granite Guardian (safety)

LLM routing integration

  • watsonx and ibm added to SUPPORTED_NATIVE_PROVIDERS
  • Provider mapping, validation, pattern matching, and inference all updated
  • Works with both prefix syntax and explicit provider param

Tests

Unit tests covering IAM token exchange, caching, refresh, URL resolution, model capabilities, and routing.

Usage

from crewai import LLM

# Provider prefix
llm = LLM(model="watsonx/ibm/granite-4-h-small")

# Explicit provider
llm = LLM(model="ibm/granite-4-h-small", provider="watsonx")

Environment variables:

  • WATSONX_API_KEY — IBM Cloud API key (required)
  • WATSONX_PROJECT_ID — watsonx.ai project ID (required)
  • WATSONX_REGION — IBM Cloud region (optional, default: us-south)
  • WATSONX_URL — Full base URL override (optional)
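The precedence among these variables might look like the following sketch. The URL pattern is an assumption based on IBM Cloud's regional endpoint convention, not copied from the PR; the function name and dict-based environment are for illustration.

```python
# Regions listed in the PR description.
SUPPORTED_REGIONS = {"us-south", "eu-de", "eu-gb", "jp-tok", "au-syd"}


def resolve_base_url(env: dict) -> str:
    """Resolve the gateway base URL from the documented env vars (sketch)."""
    explicit = env.get("WATSONX_URL")
    if explicit:
        return explicit  # a full URL override wins over any region setting
    region = env.get("WATSONX_REGION", "us-south")  # default per the docs above
    if region not in SUPPORTED_REGIONS:
        raise ValueError(f"unsupported watsonx region: {region}")
    return f"https://{region}.ml.cloud.ibm.com"


print(resolve_base_url({"WATSONX_REGION": "eu-de"}))
```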

Design decisions

  • No new pip dependencies — uses existing openai SDK + httpx (already core deps)
  • Extends OpenAICompletion rather than creating from scratch — minimal code, inherits streaming, tool calling, structured output
  • Lazy IAM token exchange — only fetches when first call is made
  • ibm-watsonx-ai optional dep untouched — that's for embeddings, this is for chat completions via the gateway

Closes OSS-35


Note

Medium Risk
Adds a new native LLM provider with IAM token exchange and request header injection, which affects authentication and outbound request behavior. While largely additive and covered by tests, misconfiguration or token-refresh bugs could break watsonx calls at runtime.

Overview
Adds a new native watsonx/ibm provider so LLM can route IBM Granite model names to a dedicated WatsonxCompletion implementation.

Defines WATSONX_MODELS constants and extends provider validation/inference/pattern-matching so Granite models (and granite prefixes) are recognized both with watsonx/... prefixes and via explicit provider="watsonx".

Implements IAM API-key-to-bearer-token exchange with thread-safe caching/refresh, injects X-Watsonx-Project-Id on requests, supports region/URL configuration, and adds unit tests for token handling, URL/region resolution, model capabilities, and factory routing.

Reviewed by Cursor Bugbot for commit d3f422a. Bugbot is set up for automated code reviews on this repo.

Add native support for IBM Granite models through the watsonx.ai
Model Gateway OpenAI-compatible API.

Implementation:
- New watsonx provider at llms/providers/watsonx/ extending OpenAICompletion
- IBM Cloud IAM token exchange with thread-safe caching and auto-refresh
- Support for WATSONX_API_KEY, WATSONX_PROJECT_ID, WATSONX_REGION env vars
- 12 Granite models in constants (3.x, 4.x, code, guardian families)
- Full LLM routing: watsonx/ibm/granite-4-h-small or provider='watsonx'
- No new dependencies required (uses existing openai + httpx)

Usage:
  llm = LLM(model='watsonx/ibm/granite-4-h-small')
  llm = LLM(model='ibm/granite-4-h-small', provider='watsonx')

Closes OSS-35

linear Bot commented Apr 13, 2026

      "crewai.llms.providers.openai.completion.OpenAICompletion.__init__",
      return_value=None,
  ) as mock_init:
      completion = WatsonxCompletion.__new__(WatsonxCompletion)

cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 4 potential issues.



      api_key=current_token,
      base_url=base_url,
      default_headers=watsonx_headers,
  )


Project ID header never injected into API client

High Severity

The _build_client method injects the critical X-Watsonx-Project-Id header, but it overrides a method that doesn't exist in the parent OpenAICompletion class. The parent creates self.client and self.async_client directly in its __init__ without calling any _build_client hook. As a result, the project ID header is never applied to the actual HTTP clients, and all watsonx.ai API requests will lack the required project context. The WatsonxCompletion.__init__ needs to pass default_headers containing the project ID to super().__init__().
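The fix Bugbot suggests could look like the sketch below. The parent class here is a minimal stand-in that only records its constructor arguments, so the pattern can be shown self-contained; the real `OpenAICompletion` constructs the SDK clients from `default_headers`, and the exact signature is an assumption.

```python
class OpenAICompletion:
    """Minimal stand-in for the real parent, just to show the pattern."""

    def __init__(self, model, default_headers=None, **kwargs):
        self.model = model
        self.default_headers = default_headers or {}


class WatsonxCompletion(OpenAICompletion):
    def __init__(self, model, project_id, **kwargs):
        # Merge the project header into whatever headers the caller passed,
        # then hand everything to the parent so the actual HTTP clients are
        # created with the header — no nonexistent _build_client hook needed.
        headers = dict(kwargs.pop("default_headers", None) or {})
        headers["X-Watsonx-Project-Id"] = project_id
        super().__init__(model=model, default_headers=headers, **kwargs)


llm = WatsonxCompletion("ibm/granite-4-h-small", project_id="proj-123")
```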



      """
      current_token = self._iam_manager.get_token()
      if hasattr(self, "client") and self.client is not None:
          self.client.api_key = current_token


Async client token never refreshed before calls

High Severity

_ensure_fresh_token updates self.client.api_key but never updates self.async_client.api_key. When acall() is invoked, it calls _ensure_fresh_token() then delegates to the parent's acall(), which uses self.async_client for the actual API request. After the initial IAM token expires, all async calls will fail with authentication errors because the async client still holds the stale token.
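A minimal sketch of the fix, using fake client and token-manager stand-ins (the real objects are the OpenAI SDK clients; only `api_key` matters for this issue):

```python
class _FakeClient:
    """Stand-in for an OpenAI SDK client; only api_key matters here."""

    def __init__(self):
        self.api_key = "stale-token"


class _FakeIAMManager:
    def get_token(self):
        return "fresh-token"


class Completion:
    def __init__(self):
        self._iam_manager = _FakeIAMManager()
        self.client = _FakeClient()
        self.async_client = _FakeClient()

    def _ensure_fresh_token(self) -> None:
        # Fix sketch: push the refreshed bearer token into BOTH the sync and
        # async clients, since acall() goes through self.async_client.
        current_token = self._iam_manager.get_token()
        for c in (self.client, self.async_client):
            if c is not None:
                c.api_key = current_token


llm = Completion()
llm._ensure_fresh_token()
```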



      Configuration dict with watsonx-specific fields.
      """
      config = super().to_config_dict()
      config["model"] = f"watsonx/{self.model}" if "/" not in self.model else f"watsonx/{self.model}"


Ternary condition with identical branches is redundant

Low Severity

In to_config_dict, the expression f"watsonx/{self.model}" if "/" not in self.model else f"watsonx/{self.model}" has identical true and false branches, making the condition meaningless. This suggests one branch was intended to handle a different case (e.g., not prepending watsonx/ when the model already contains a provider prefix).
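One plausible intent behind the broken ternary, extracted as a standalone helper for illustration (the function name is an assumption):

```python
def config_model_name(model: str) -> str:
    # Prepend the provider prefix only when it is not already there,
    # so round-tripping a config dict does not double-prefix the model.
    return model if model.startswith("watsonx/") else f"watsonx/{model}"
```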



      "bedrock": "bedrock",
      "aws": "bedrock",
      "watsonx": "watsonx",
      "ibm": "watsonx",


Model name mangled when using ibm/ prefix routing

Medium Severity

Adding "ibm": "watsonx" to the provider_mapping causes LLM(model="ibm/granite-4-h-small") to treat "ibm" as a provider prefix and strip it, passing model="granite-4-h-small" to WatsonxCompletion. However, the watsonx.ai API expects the full model name "ibm/granite-4-h-small" — the "ibm/" is part of the model identifier, not a provider prefix. The _matches_provider_pattern fallback accepts the "granite" prefix, so validation passes silently, but the API call will fail with a model-not-found error.
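One way to avoid the mangling is to re-attach the `ibm/` segment after routing strips it, since it is part of the model identifier rather than a provider prefix. This is a sketch of that idea, not the repo's routing code; the function name and call site are assumptions.

```python
def resolve_watsonx_model(prefix: str, stripped_model: str) -> str:
    # "watsonx/" is purely a routing prefix and should stay stripped, but
    # "ibm/" belongs to the model identifier the watsonx.ai API expects,
    # so it must be restored before the request is built.
    if prefix == "ibm":
        return f"ibm/{stripped_model}"
    return stripped_model
```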


