feat(oci): Add Oracle Cloud Infrastructure (OCI) Generative AI client support #754
Adds OciClient (V1 API) and OciClientV2 (V2 API) for the OCI Generative AI service, following the BedrockClient pattern with httpx event hooks. Authentication: config file, custom profiles, session tokens, direct credentials, instance principal, resource principal. API coverage: embed (all models), chat with streaming (OciClient for Command R family, OciClientV2 for Command A). Lazy-loads oci SDK as an optional dependency; install with `pip install cohere[oci]`.
…profile
- README: remove specific model names from Supported APIs and Model Availability sections (per mkozakov review; they will go out of date)
- tests: default OCI_PROFILE to DEFAULT instead of API_KEY_AUTH
The "stream" in endpoint check was dead code: both the V1 and V2 SDK always route through endpoint "chat" (v1/chat and v2/chat paths). Streaming is reliably signalled via body["stream"], which the SDK always sets.
- Drop the "stream" in endpoint guard on is_stream and isStream detection
- Remove "chat_stream" from action_map, transform, and response branches
- Update unit tests to use the "chat" endpoint (the only real one)
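The detection rule described in this commit can be sketched as follows. This is a minimal illustration, not the code from the PR: `is_streaming_request` is a hypothetical helper name, and the only assumption taken from the commit message is that the endpoint is always "chat" and the body carries a "stream" flag.

```python
from typing import Any, Mapping


def is_streaming_request(endpoint: str, body: Mapping[str, Any]) -> bool:
    """Return True when a chat request asks for a streamed response.

    The endpoint name is "chat" for both streaming and non-streaming calls,
    so only the body's "stream" flag carries the signal.
    """
    return endpoint == "chat" and bool(body.get("stream", False))
```

With this shape there is no separate "chat_stream" endpoint to map; the same "chat" branch handles both cases and only the response handling differs.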
_current_content_type now returns None for events with no message content
(e.g. {"finishReason": "COMPLETE"}). The transition branch in
_transform_v2_event is skipped when event_content_type is None, so a
finish-only event after a thinking block no longer opens a spurious empty
text block before emitting content-end.
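A rough sketch of the behaviour described above, under an assumed OCI event shape (`{"message": {"content": [{"type": ...}]}}`). The function names and the event layout here are illustrative assumptions and may differ from the helpers in `oci_client.py`.

```python
from typing import Any, Dict, Optional


def current_content_type(oci_event: Dict[str, Any]) -> Optional[str]:
    """Return "thinking" or "text" for events with content, else None."""
    content = (oci_event.get("message") or {}).get("content") or []
    if not content:
        # Finish-only events such as {"finishReason": "COMPLETE"} land here.
        return None
    first = content[0]
    is_thinking = str(first.get("type", "")).upper() == "THINKING"
    return "thinking" if is_thinking else "text"


def needs_transition(previous: Optional[str], oci_event: Dict[str, Any]) -> bool:
    """True only when the event carries content of a different type."""
    event_type = current_content_type(oci_event)
    return event_type is not None and event_type != previous
```

Because a finish-only event yields `None`, `needs_transition` is false for it, so no empty text block is opened after a thinking block.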
…urn, and system message
… token per-request
- transform_request_to_oci now raises ValueError for endpoints other than 'embed' and 'chat' instead of silently returning the untransformed body
- Session token auth uses a refreshing wrapper that re-reads the token file before each signing call, so OCI CLI token refreshes are picked up without restarting the client
- Add test_unsupported_endpoint_raises to cover the new explicit error
- Update test_session_auth_prefers_security_token_signer to expect multi-call behaviour from the refreshing signer
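The explicit-error behaviour for unsupported endpoints might look roughly like this. The sketch uses a hypothetical `transform_request` function with placeholder transforms; the real `transform_request_to_oci` builds OCI payloads that are not reproduced here.

```python
from typing import Any, Callable, Dict

Body = Dict[str, Any]

# Placeholder transforms: they only tag the body so the dispatch is visible.
_TRANSFORMS: Dict[str, Callable[[Body], Body]] = {
    "embed": lambda body: {"kind": "embed", **body},
    "chat": lambda body: {"kind": "chat", **body},
}


def transform_request(endpoint: str, body: Body) -> Body:
    """Dispatch on endpoint and fail loudly for anything unsupported."""
    transform = _TRANSFORMS.get(endpoint)
    if transform is None:
        # Raising avoids silently sending an untransformed body upstream.
        raise ValueError(f"Unsupported endpoint for OCI: {endpoint!r}")
    return transform(body)
```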
test_session_token_refreshed_on_subsequent_requests writes a real token file, makes two requests with the file updated between them, and asserts that the second signing call uses the new token — verifying the refreshing signer works end-to-end.
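The refreshing-signer idea can be sketched without the OCI SDK. `RefreshingSigner` and `build_signer` below are hypothetical names; in the actual client the factory would presumably wrap the OCI SDK's security-token signer, which is not shown here.

```python
from pathlib import Path
from typing import Any, Callable

Signer = Callable[[Any], Any]


class RefreshingSigner:
    """Rebuilds the underlying signer from the token file before each call."""

    def __init__(self, token_path: Path, build_signer: Callable[[str], Signer]) -> None:
        self._token_path = token_path
        self._build_signer = build_signer

    def __call__(self, request: Any) -> Any:
        # Re-read on every call so an externally refreshed token is picked up.
        token = self._token_path.read_text().strip()
        return self._build_signer(token)(request)
```

A test in the spirit of the one described above would write a token file, sign once, overwrite the file, sign again, and assert that the second call saw the new token.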
The OCI client code introduced several mypy errors that went unnoticed
because mypy was configured but never enforced in tests or CI.
Type fixes:
- lazy_oci_deps.py: suppress import-untyped for oci SDK (no type stubs)
- oci_client.py: cast response.stream to Iterator[bytes] (httpx types it
as SyncByteStream | AsyncByteStream but it's iterable at runtime)
- oci_client.py: use .get("model", "") to satisfy str expectation
- test_oci_client.py: suppress attr-defined on dynamic module stubs
New test gate (tests/test_oci_mypy.py):
- Runs mypy on OCI source and test files as part of pytest
- Uses --follow-imports=silent to isolate from pre-existing AWS errors
- Skips gracefully if mypy is not on PATH
- Ensures future type regressions fail the test suite immediately
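A mypy gate of this kind could be built along the following lines. This is a sketch under assumptions: the helper names are hypothetical, and only the `--follow-imports=silent` flag and the skip-when-missing behaviour come from the commit message.

```python
import shutil
import subprocess
from typing import List, Optional


def build_mypy_command(files: List[str]) -> Optional[List[str]]:
    """Return the mypy command line, or None when mypy is not on PATH."""
    mypy = shutil.which("mypy")
    if mypy is None:
        return None
    # --follow-imports=silent keeps errors from imported modules out of the report.
    return [mypy, "--follow-imports=silent", *files]


def run_mypy_gate(files: List[str]) -> Optional[subprocess.CompletedProcess]:
    """Run mypy on the given files; return None if mypy is unavailable."""
    command = build_mypy_command(files)
    if command is None:
        return None
    return subprocess.run(command, capture_output=True, text=True)
```

A pytest test would call `run_mypy_gate` on the OCI source and test files, call `pytest.skip` when it returns `None`, and otherwise assert that `returncode == 0`, including the captured output in the failure message.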
…nion
The SDK's EmbedResponse is a discriminated union on response_type (embeddings_floats vs embeddings_by_type). The OCI embed response transformation was missing this field, causing pydantic to return None instead of an EmbedResponse object. This broke V1 embed when the SDK's merge_embed_responses tried to access .meta on None. V1 (flat float arrays) now returns response_type="embeddings_floats", V2 (typed dict) returns response_type="embeddings_by_type".
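The discriminator fix can be illustrated with a small helper. `add_response_type` is a hypothetical name and the payload is reduced to the two fields that matter here; the real transformation carries more fields.

```python
from typing import Any, Dict


def add_response_type(embeddings: Any) -> Dict[str, Any]:
    """Tag an embed payload with the discriminator the SDK's union expects."""
    if isinstance(embeddings, dict):
        # V2 shape: embeddings keyed by type (the key used in the test is an example).
        response_type = "embeddings_by_type"
    else:
        # V1 shape: a flat list of float vectors.
        response_type = "embeddings_floats"
    return {"response_type": response_type, "embeddings": embeddings}
```

Without the `response_type` key, a discriminated union has nothing to select a variant on, which matches the `None` result described above.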
Risk Assessment Response

- "mistakes could break OCI auth, payload mapping, or chat/embed streaming behavior": Mitigated. 56 tests cover exactly this: auth mapping (session token, API key, config file), request/response transformations for every endpoint, stream event lifecycle, thinking transitions, malformed JSON handling, and edge cases like empty streams. Integration tests validate against live OCI GenAI (us-chicago-1).
- "Dependency/lockfile updates may affect packaging/install surfaces": The …
- "introduces a mypy gate that may fail CI if typing drifts": Intentional. Only checks OCI files and skips if mypy isn't on PATH, so it won't break existing CI. If it does fail, that's the point: it already caught a real bug (missing …

One area that could be stronger: there's no unit test for the embed …
        self.lines = lines

    def __iter__(self) -> typing.Iterator[bytes]:
        return self.lines
Duplicate Streamer class already exists in codebase
Low Severity
The new Streamer class in src/cohere/manually_maintained/streaming.py is an exact duplicate of the existing Streamer class in src/cohere/aws_client.py (line 115). Both extend SyncByteStream with identical __init__ and __iter__ implementations. The OCI client imports from the new file instead of reusing the existing class. Having two identical classes increases maintenance burden — a fix in one won't automatically apply to the other.
Reviewed by Cursor Bugbot for commit d171e7a.
src/cohere/oci_client.py
Outdated
        if "finishReason" in oci_event:
            final_v1_finish_reason = oci_event.get("finishReason", final_v1_finish_reason)
        return _emit_v1_event(event)
    return b""
V1 stream missing stream-start event before text generation
Medium Severity
The V1 stream transformation emits text-generation events followed by a stream-end event, but never emits a stream-start event at the beginning of the stream. The standard Cohere V1 streaming chat format begins with a stream-start event (containing generation_id). Consumers of the V1 stream that rely on stream-start to initialize state — such as extracting the generation_id — will not receive it, potentially causing unexpected behavior or missing metadata.
Reviewed by Cursor Bugbot for commit d171e7a.
Adds unit tests for V1 (embeddings_floats) and V2 (embeddings_by_type) response_type presence, and adds assertions to the live integration embed tests. Ensures the discriminated union field can't be silently removed without test failure.
…eam-start
1. Remove duplicate Streamer class (manually_maintained/streaming.py) and import from aws_client.py instead. Both were identical SyncByteStream wrappers.
2. Emit stream-start event with generation_id at the beginning of V1 streams, matching the standard Cohere V1 streaming chat format. Consumers relying on stream-start for state initialization will now receive it before text-generation events.

Updated test_v1_stream_wrapper_preserves_finish_reason to verify stream-start is emitted first.
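The stream-start ordering can be sketched as a small generator. This is an illustration only: `with_stream_start` is a hypothetical name, events are shown as plain dicts instead of encoded bytes, and generating the `generation_id` with `uuid` is an assumption about where the id comes from.

```python
import uuid
from typing import Any, Dict, Iterable, Iterator


def with_stream_start(text_chunks: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield a stream-start event once, before the first text-generation event."""
    emitted_start = False
    for chunk in text_chunks:
        if not emitted_start:
            yield {
                "event_type": "stream-start",
                "generation_id": str(uuid.uuid4()),  # assumed id source
                "is_finished": False,
            }
            emitted_start = True
        yield {"event_type": "text-generation", "text": chunk, "is_finished": False}
```

Emitting the start event lazily, on the first chunk, means an empty upstream produces no events at all.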
@daniel-cohere, here's a summary of everything addressed in this PR since the original #718:
Bug fixes
Code quality
New test coverage
Test results
60/60 passed (~12.5s): 20 integration (live OCI GenAI), 38 unit, 2 mypy gate.
            }
        },
    }
)
V2 stream content-delta dropped for empty-string text chunks
Low Severity
In transform_stream_event, the if content_value: check skips emitting content-delta events when the content is an empty string. Combined with how _transform_v2_event uses _current_content_type to detect type transitions, this means an OCI event that signals a transition from THINKING to TEXT but carries an empty text chunk will correctly trigger a type transition (via _current_content_type) but emit no content-delta. This is mostly benign but breaks symmetry with the Cohere native API, where every content block includes at least one delta.
Reviewed by Cursor Bugbot for commit b1ed6d5.
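The difference between the truthiness check flagged above and an explicit `None` check can be shown in isolation. Both function names are hypothetical, and this is not a claim about which form the PR ended up using.

```python
from typing import Optional


def should_emit_delta_truthy(content_value: Optional[str]) -> bool:
    """Mirrors `if content_value:`; drops "" as well as None."""
    return bool(content_value)


def should_emit_delta_explicit(content_value: Optional[str]) -> bool:
    """Mirrors `if content_value is not None:`; keeps "", drops only missing content."""
    return content_value is not None
```

With the explicit form, an empty-string chunk still yields a content-delta, so every opened content block would carry at least one delta.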
When oci is installed but lacks stubs, mypy raises import-untyped. When oci is not installed (optional dep), mypy raises import-not-found. Cover both cases since cohere[oci] is optional.
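A lazy import for an optional dependency might be sketched like this. `lazy_import` is a hypothetical helper, not the contents of `lazy_oci_deps.py`; the commented line shows how a static import could carry both mypy error codes mentioned above.

```python
import importlib
from types import ModuleType

# A static import would suppress both mypy error codes on one line:
#   import oci  # type: ignore[import-untyped,import-not-found]


def lazy_import(module_name: str, extra: str) -> ModuleType:
    """Import a module on first use, with an install hint if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature; "
            f"install it with `pip install cohere[{extra}]`."
        ) from exc
```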
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 5 total unresolved issues (including 3 from previous reviews).
Reviewed by Cursor Bugbot for commit 17d4647.
                    "finish_reason": final_v1_finish_reason,
                },
            }
        )
V1 stream emits stream-end without stream-start guard
Medium Severity
The V1 [DONE] handler unconditionally emits a stream-end event without checking emitted_start. If OCI sends an empty stream (only [DONE] with no data events), a stream-end is produced without a preceding stream-start, which could crash downstream SDK parsing. The V2 path correctly guards this with if emitted_start: but the V1 else branch at line 1076 lacks the same check.
Reviewed by Cursor Bugbot for commit 17d4647.
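One way to apply the same guard the V2 path uses is sketched below. `v1_done_events` is a hypothetical helper that returns plain dicts; the actual handler emits encoded events and may resolve the issue differently (for example by emitting a stream-start first).

```python
from typing import Any, Dict, List


def v1_done_events(emitted_start: bool, finish_reason: str) -> List[Dict[str, Any]]:
    """Events to emit when the upstream stream signals [DONE]."""
    if not emitted_start:
        # Empty stream: nothing was started, so there is nothing to end.
        return []
    return [
        {
            "event_type": "stream-end",
            "finish_reason": finish_reason,
            "is_finished": True,
        }
    ]
```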
    if "safety_mode" in cohere_body:
        chat_request["safetyMode"] = cohere_body["safety_mode"]
    if "priority" in cohere_body:
        chat_request["priority"] = cohere_body["priority"]
Repetitive V1/V2 chat parameter mapping blocks
Low Severity
The V1 and V2 chat parameter mapping in transform_request_to_oci contains ~15 nearly identical parameter-to-camelCase conversions duplicated across two branches (e.g., temperature, maxTokens, topK, topP, seed, frequencyPenalty, presencePenalty, stopSequences, tools, documents, responseFormat, safetyMode, priority). Extracting the shared mappings into a helper would reduce maintenance burden and the risk of inconsistent updates.
Reviewed by Cursor Bugbot for commit 17d4647.
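The helper suggested in this review comment could take roughly the following shape. The table and function name are hypothetical, and the mapping is a deliberately partial subset limited to straightforward snake_case to camelCase renames; it is not the full parameter list from `transform_request_to_oci`.

```python
from typing import Any, Dict

# Cohere parameter name -> OCI parameter name (illustrative subset).
SHARED_CHAT_PARAMS: Dict[str, str] = {
    "temperature": "temperature",
    "max_tokens": "maxTokens",
    "seed": "seed",
    "frequency_penalty": "frequencyPenalty",
    "presence_penalty": "presencePenalty",
    "stop_sequences": "stopSequences",
    "response_format": "responseFormat",
    "safety_mode": "safetyMode",
    "priority": "priority",
}


def copy_shared_chat_params(
    cohere_body: Dict[str, Any], chat_request: Dict[str, Any]
) -> Dict[str, Any]:
    """Copy every shared parameter that is present, renaming it for OCI."""
    for cohere_key, oci_key in SHARED_CHAT_PARAMS.items():
        if cohere_key in cohere_body:
            chat_request[oci_key] = cohere_body[cohere_key]
    return chat_request
```

Both the V1 and V2 branches could call the same helper and then add only their version-specific fields, so a new shared parameter is added in one place.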


Summary
Continuation of #718. Adds first-class Oracle Cloud Infrastructure (OCI) support to the Cohere Python SDK via two new clients:
- `OciClient` (V1 API): for Command R family models and embeddings using `COHERE` API format
- `OciClientV2` (V2 API): for Command A family models using `COHEREV2` API format with full streaming support

Both clients transparently transform Cohere request/response payloads to/from OCI Generative AI format and handle OCI request signing, so users interact with the standard Cohere SDK interface.

Key capabilities

- `embed`, `chat`, `chat_stream` (V1 and V2)
- `thinking` parameter with Command A Reasoning models
- … `command-a-03-2025`), prefixed (`cohere.command-a-03-2025`), or OCIDs
- `cohere[oci]` optional extra, no hard dependency on `oci` package

New files

- `src/cohere/oci_client.py`: OciClient, OciClientV2, and all request/response/stream transformations
- `src/cohere/manually_maintained/lazy_oci_deps.py`: lazy import for optional OCI SDK
- `tests/test_oci_client.py`: 56 tests (20 integration against live OCI GenAI, 36 unit)
- `tests/test_oci_mypy.py`: mypy type-checking gate for OCI code (prevents type regressions)

Bug fixes included

- `response_type`: Added missing `response_type` field (`embeddings_floats` / `embeddings_by_type`) to OCI embed responses, required by the SDK's discriminated union after Fern regeneration
- `[tool.poetry.extras]`: Merged duplicate TOML sections introduced by the merge with main

Test plan

- `poetry run pytest tests/test_oci_mypy.py -v`: 2/2 passed (mypy gate)
- `poetry run pytest tests/test_oci_client.py::TestOciClientTransformations -v`: 36/36 passed (unit)

Note
Medium Risk
Adds a new OCI transport layer that rewrites and signs requests plus transforms streaming responses, which could affect request correctness and streaming semantics; also introduces a new optional dependency and large lockfile churn.
Overview
Adds new OCI-backed clients (`OciClient`, `OciClientV2`) that route `embed`, `chat`, and `chat_stream` through OCI Generative AI by transforming Cohere payloads to OCI format, signing requests with OCI auth, and mapping OCI responses back to Cohere schemas (including SSE stream event conversion and V2 thinking/content type transitions).

Introduces lazy-loading for the optional `oci` dependency (`cohere[oci]`), exposes the new clients from `cohere.__init__`, and documents OCI setup/auth methods in `README.md`. Updates packaging to add the `oci` optional extra and refreshes `poetry.lock` accordingly.

Reviewed by Cursor Bugbot for commit 17d4647. Bugbot is set up for automated code reviews on this repo.