feat(oci): Add Oracle Cloud Infrastructure (OCI) Generative AI client support #754

Merged
daniel-cohere merged 15 commits into cohere-ai:main from fede-kamel:feat/oci-client on Apr 9, 2026
@fede-kamel
Contributor

@fede-kamel fede-kamel commented Apr 9, 2026

Summary

Continuation of #718. Adds first-class Oracle Cloud Infrastructure (OCI) support to the Cohere Python SDK via two new clients:

  • OciClient (V1 API) — for Command R family models and embeddings using COHERE API format
  • OciClientV2 (V2 API) — for Command A family models using COHEREV2 API format with full streaming support

Both clients transparently transform Cohere request/response payloads to/from OCI Generative AI format and handle OCI request signing, so users interact with the standard Cohere SDK interface.

Key capabilities

  • Authentication: config file, session token, direct credentials, instance principals, resource principals
  • Endpoints: embed, chat, chat_stream (V1 and V2)
  • Streaming: Full SSE stream transformation with content-delta, thinking blocks, and type transitions
  • Thinking/reasoning: Support for thinking parameter with Command A Reasoning models
  • Tool calling: V2 tool_calls/tool_plan conversion between Cohere and OCI formats
  • Model normalization: Accepts plain names (command-a-03-2025), prefixed (cohere.command-a-03-2025), or OCIDs
  • Lazy OCI SDK loading: cohere[oci] optional extra, no hard dependency on oci package
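
The model normalization bullet can be pictured roughly as follows. This is a minimal sketch, not the code in oci_client.py: the function name `normalize_model_id` and the exact prefix rules are assumptions made for illustration, and only the three accepted input forms come from the description above.

```python
def normalize_model_id(model: str) -> str:
    """Hypothetical helper: map a user-supplied model name to an OCI model id."""
    # OCIDs are passed through unchanged (assumed to start with "ocid1.").
    if model.startswith("ocid1."):
        return model
    # Names that already carry the vendor prefix are kept as they are.
    if model.startswith("cohere."):
        return model
    # Plain Cohere names get the vendor prefix added.
    return f"cohere.{model}"


print(normalize_model_id("command-a-03-2025"))
print(normalize_model_id("cohere.command-a-03-2025"))
```

Both calls print the same prefixed name, which is the behaviour the bullet describes for plain and prefixed inputs.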

New files

  • src/cohere/oci_client.py — OciClient, OciClientV2, and all request/response/stream transformations
  • src/cohere/manually_maintained/lazy_oci_deps.py — lazy import for optional OCI SDK
  • tests/test_oci_client.py — 56 tests (20 integration against live OCI GenAI, 36 unit)
  • tests/test_oci_mypy.py — mypy type-checking gate for OCI code (prevents type regressions)

Bug fixes included

  • Embed response_type: Added missing response_type field (embeddings_floats / embeddings_by_type) to OCI embed responses — required by the SDK's discriminated union after Fern regeneration
  • Mypy compliance: Fixed 7 type errors across OCI source and test files (casts, type ignores, default values)
  • Duplicate [tool.poetry.extras]: Merged duplicate TOML sections introduced by the merge with main

Test plan

  • poetry run pytest tests/test_oci_mypy.py -v — 2/2 passed (mypy gate)
  • poetry run pytest tests/test_oci_client.py::TestOciClientTransformations -v — 36/36 passed (unit)
  • Integration tests against live OCI GenAI (us-chicago-1) — 20/20 passed
  • Full suite: 58/58 passed

Note

Medium Risk
Adds a new OCI transport layer that rewrites and signs requests plus transforms streaming responses, which could affect request correctness and streaming semantics; also introduces a new optional dependency and large lockfile churn.

Overview
Adds new OCI-backed clients (OciClient, OciClientV2) that route embed, chat, and chat_stream through OCI Generative AI by transforming Cohere payloads to OCI format, signing requests with OCI auth, and mapping OCI responses back to Cohere schemas (including SSE stream event conversion and V2 thinking/content type transitions).

Introduces lazy-loading for the optional oci dependency (cohere[oci]), exposes the new clients from cohere.__init__, and documents OCI setup/auth methods in README.md. Updates packaging to add the oci optional extra and refreshes poetry.lock accordingly.
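
For readers unfamiliar with the lazy-loading pattern, the idea is roughly the following. This is a sketch under assumptions, not the contents of lazy_oci_deps.py: the function name `lazy_oci`, the module-level cache and the error wording are all hypothetical.

```python
import importlib
from types import ModuleType
from typing import Optional

# Cache so the import is attempted at most once per successful load.
_oci: Optional[ModuleType] = None


def lazy_oci() -> ModuleType:
    """Hypothetical loader: import `oci` on first use and cache the module."""
    global _oci
    if _oci is None:
        try:
            _oci = importlib.import_module("oci")
        except ImportError as exc:
            # Users who never touch the OCI clients never reach this branch.
            raise ImportError(
                "The OCI SDK is required for this client. "
                "Install it with: pip install 'cohere[oci]'"
            ) from exc
    return _oci
```

Because the import happens inside the function, importing the package itself does not require the optional dependency.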

Reviewed by Cursor Bugbot for commit 17d4647. Bugbot is set up for automated code reviews on this repo.

Adds OciClient (V1 API) and OciClientV2 (V2 API) for the OCI Generative
AI service, following the BedrockClient pattern with httpx event hooks.

Authentication: config file, custom profiles, session tokens, direct
credentials, instance principal, resource principal.

API coverage: embed (all models), chat with streaming (OciClient for
Command R family, OciClientV2 for Command A). Lazy-loads oci SDK as an
optional dependency; install with `pip install cohere[oci]`.
…profile

- README: remove specific model names from Supported APIs and Model
  Availability sections (per mkozakov review — will go out of date)
- tests: default OCI_PROFILE to DEFAULT instead of API_KEY_AUTH

The "stream" in endpoint check was dead code: both the V1 and V2 SDKs always
route through endpoint "chat" (v1/chat and v2/chat paths). Streaming is
reliably signalled via body["stream"], which the SDK always sets.

- Drop "stream" in endpoint guard on is_stream and isStream detection
- Remove "chat_stream" from action_map, transform, and response branches
- Update unit tests to use "chat" endpoint (the only real one)
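
A sketch of the detection rule this commit describes, assuming a hypothetical helper name `is_streaming_request` (the real check lives inline in oci_client.py and may look different):

```python
from typing import Any, Dict


def is_streaming_request(endpoint: str, body: Dict[str, Any]) -> bool:
    """Hypothetical check: streaming is decided by the body, not the endpoint name."""
    # Both the V1 and V2 paths arrive as endpoint "chat", so the endpoint
    # string alone cannot tell a streaming call from a blocking one.
    return endpoint == "chat" and bool(body.get("stream", False))


print(is_streaming_request("chat", {"stream": True}))
print(is_streaming_request("chat", {"message": "hi"}))
```
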
_current_content_type now returns None for events with no message content
(e.g. {"finishReason": "COMPLETE"}). The transition branch in
_transform_v2_event is skipped when event_content_type is None, so a
finish-only event after a thinking block no longer opens a spurious empty
text block before emitting content-end.
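
To make the finish-only case concrete, here is a small sketch of the transition rule. The OCI event layout (`message.content[0].type`) and the helper names are assumptions for illustration; only the `{"finishReason": "COMPLETE"}` shape and the "skip the transition when the type is None" rule come from the commit message.

```python
from typing import Any, Dict, List, Optional


def current_content_type(oci_event: Dict[str, Any]) -> Optional[str]:
    """Return "thinking" or "text" for content events, None otherwise."""
    content = oci_event.get("message", {}).get("content") or []
    if not content:
        return None  # e.g. {"finishReason": "COMPLETE"}
    return "thinking" if content[0].get("type") == "THINKING" else "text"


def block_transitions(events: List[Dict[str, Any]]) -> List[str]:
    """List the content-start/content-end markers a stream would produce."""
    out: List[str] = []
    open_type: Optional[str] = None
    for event in events:
        event_type = current_content_type(event)
        # Only switch blocks when the event actually carries content.
        if event_type is not None and event_type != open_type:
            if open_type is not None:
                out.append(f"content-end:{open_type}")
            out.append(f"content-start:{event_type}")
            open_type = event_type
        # A finish event closes whatever block is open, without opening a new one.
        if "finishReason" in event and open_type is not None:
            out.append(f"content-end:{open_type}")
            open_type = None
    return out


events = [
    {"message": {"content": [{"type": "THINKING", "thinking": "hmm"}]}},
    {"finishReason": "COMPLETE"},
]
print(block_transitions(events))
# → ['content-start:thinking', 'content-end:thinking']
```

No empty text block appears between the thinking block and its content-end, which is the behaviour the fix is after.
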
… token per-request

- transform_request_to_oci now raises ValueError for endpoints other than
  'embed' and 'chat' instead of silently returning the untransformed body
- Session token auth uses a refreshing wrapper that re-reads the token file
  before each signing call, so OCI CLI token refreshes are picked up without
  restarting the client
- Add test_unsupported_endpoint_raises to cover the new explicit error
- Update test_session_auth_prefers_security_token_signer to expect multi-call
  behaviour from the refreshing signer

test_session_token_refreshed_on_subsequent_requests writes a real token file,
makes two requests with the file updated between them, and asserts that the
second signing call uses the new token — verifying the refreshing signer works
end-to-end.
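
The refresh-per-call idea can be sketched without the OCI SDK. Everything below is hypothetical (`RefreshingToken`, `sign`); a real signer would build authentication headers rather than a string, but the file is re-read in the same place.

```python
import tempfile
from pathlib import Path


class RefreshingToken:
    """Hypothetical wrapper that re-reads a token file on every access."""

    def __init__(self, token_path: str) -> None:
        self._path = Path(token_path)

    def current(self) -> str:
        # No caching: an external refresh of the file is seen on the next call.
        return self._path.read_text().strip()


def sign(request_id: str, token: RefreshingToken) -> str:
    """Stand-in for a signing call; a real signer would build auth headers."""
    return f"{request_id}:{token.current()}"


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "token"
    path.write_text("token-v1")
    source = RefreshingToken(str(path))
    first = sign("req-1", source)
    path.write_text("token-v2")  # simulates a refresh by an external tool
    second = sign("req-2", source)

print(first, second)
# → req-1:token-v1 req-2:token-v2
```

The trade-off is one small file read per request in exchange for never holding a stale token.
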
The OCI client code introduced several mypy errors that went unnoticed
because mypy was configured but never enforced in tests or CI.

Type fixes:
- lazy_oci_deps.py: suppress import-untyped for oci SDK (no type stubs)
- oci_client.py: cast response.stream to Iterator[bytes] (httpx types it
  as SyncByteStream | AsyncByteStream but it's iterable at runtime)
- oci_client.py: use .get("model", "") to satisfy str expectation
- test_oci_client.py: suppress attr-defined on dynamic module stubs

New test gate (tests/test_oci_mypy.py):
- Runs mypy on OCI source and test files as part of pytest
- Uses --follow-imports=silent to isolate from pre-existing AWS errors
- Skips gracefully if mypy is not on PATH
- Ensures future type regressions fail the test suite immediately
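
A rough sketch of how such a gate can be built, assuming hypothetical helper names (`build_mypy_command`, `run_type_gate`); the actual tests/test_oci_mypy.py is a pytest module and may be structured differently.

```python
import shutil
import subprocess
from typing import List, Optional


def build_mypy_command(paths: List[str]) -> List[str]:
    # --follow-imports=silent still analyses imported modules but suppresses
    # the errors reported inside them, so only the listed files can fail.
    return ["mypy", "--follow-imports=silent", *paths]


def run_type_gate(paths: List[str]) -> Optional[int]:
    """Return mypy's exit code, or None when mypy is not installed."""
    if shutil.which("mypy") is None:
        return None  # a pytest wrapper would call pytest.skip() here
    result = subprocess.run(build_mypy_command(paths), capture_output=True, text=True)
    return result.returncode
```

A test would then assert that the returned exit code is 0 and treat None as a skip.
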
…nion

The SDK's EmbedResponse is a discriminated union on response_type
(embeddings_floats vs embeddings_by_type). The OCI embed response
transformation was missing this field, causing pydantic to return None
instead of an EmbedResponse object. This broke V1 embed when the SDK's
merge_embed_responses tried to access .meta on None.

V1 (flat float arrays) now returns response_type="embeddings_floats",
V2 (typed dict) returns response_type="embeddings_by_type".
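
In sketch form, the fix amounts to tagging the transformed payload with the discriminator. The helper name `add_response_type` and the list-versus-dict test are assumptions; the two discriminator values are the ones named above.

```python
from typing import Any, Dict


def add_response_type(embed_response: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: tag an embed payload with its union discriminator."""
    embeddings = embed_response.get("embeddings")
    tagged = dict(embed_response)  # copy so the caller's dict is not mutated
    if isinstance(embeddings, dict):
        # Typed embeddings keyed by type, e.g. {"float": [[...]]}.
        tagged["response_type"] = "embeddings_by_type"
    else:
        # Flat list of float vectors.
        tagged["response_type"] = "embeddings_floats"
    return tagged


print(add_response_type({"embeddings": [[0.1, 0.2]]})["response_type"])
print(add_response_type({"embeddings": {"float": [[0.1, 0.2]]}})["response_type"])
```
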
@fede-kamel fede-kamel changed the title from "fix(oci): resolve mypy type errors and add type-checking gate" to "feat(oci): Add Oracle Cloud Infrastructure (OCI) Generative AI client support" on Apr 9, 2026
@fede-kamel
Contributor Author

Risk Assessment Response

"mistakes could break OCI auth, payload mapping, or chat/embed streaming behavior"

Mitigated — 56 tests cover exactly this: auth mapping (session token, API key, config file), request/response transformations for every endpoint, stream event lifecycle, thinking transitions, malformed JSON handling, and edge cases like empty streams. Integration tests validate against live OCI GenAI (us-chicago-1).

"Dependency/lockfile updates may affect packaging/install surfaces"

The oci dep is optional (cohere[oci]), lazy-loaded with a clear error message — can't break anyone who isn't opting in. Lockfile churn is mostly from Fern regenerations on main.

"introduces a mypy gate that may fail CI if typing drifts"

Intentional. Only checks OCI files and skips if mypy isn't on PATH, so it won't break existing CI. If it does fail, that's the point — it already caught a real bug (missing response_type on embed responses).

@fede-kamel
Contributor Author

One area that could be stronger: there's no unit test for the embed response_type fix specifically. If someone removes that field, only the integration tests would catch it. I'm adding a unit test in TestOciClientTransformations that asserts response_type is present for both V1 (embeddings_floats) and V2 (embeddings_by_type).

        self.lines = lines

    def __iter__(self) -> typing.Iterator[bytes]:
        return self.lines

Duplicate Streamer class already exists in codebase

Low Severity

The new Streamer class in src/cohere/manually_maintained/streaming.py is an exact duplicate of the existing Streamer class in src/cohere/aws_client.py (line 115). Both extend SyncByteStream with identical __init__ and __iter__ implementations. The OCI client imports from the new file instead of reusing the existing class. Having two identical classes increases maintenance burden — a fix in one won't automatically apply to the other.


Reviewed by Cursor Bugbot for commit d171e7a.

if "finishReason" in oci_event:
final_v1_finish_reason = oci_event.get("finishReason", final_v1_finish_reason)
return _emit_v1_event(event)
return b""

V1 stream missing stream-start event before text generation

Medium Severity

The V1 stream transformation emits text-generation events followed by a stream-end event, but never emits a stream-start event at the beginning of the stream. The standard Cohere V1 streaming chat format begins with a stream-start event (containing generation_id). Consumers of the V1 stream that rely on stream-start to initialize state — such as extracting the generation_id — will not receive it, potentially causing unexpected behavior or missing metadata.


Reviewed by Cursor Bugbot for commit d171e7a.

Adds unit tests for V1 (embeddings_floats) and V2 (embeddings_by_type)
response_type presence, and adds assertions to the live integration
embed tests. Ensures the discriminated union field can't be silently
removed without test failure.
…eam-start

1. Remove duplicate Streamer class (manually_maintained/streaming.py)
   and import from aws_client.py instead. Both were identical
   SyncByteStream wrappers.

2. Emit stream-start event with generation_id at the beginning of V1
   streams, matching the standard Cohere V1 streaming chat format.
   Consumers relying on stream-start for state initialization will now
   receive it before text-generation events.

Updated test_v1_stream_wrapper_preserves_finish_reason to verify
stream-start is emitted first.
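
The resulting event order can be sketched like this. The wrapper name `wrap_v1_stream`, the newline-delimited JSON framing and the hard-coded finish reason are assumptions for illustration; the `event_type` values are the standard Cohere V1 stream event names.

```python
import json
from typing import Iterable, Iterator


def wrap_v1_stream(text_chunks: Iterable[str], generation_id: str) -> Iterator[bytes]:
    """Hypothetical wrapper: yield stream-start, then text events, then stream-end."""
    # stream-start goes out first so consumers can read generation_id up front.
    yield json.dumps(
        {"event_type": "stream-start", "generation_id": generation_id}
    ).encode() + b"\n"
    for chunk in text_chunks:
        yield json.dumps({"event_type": "text-generation", "text": chunk}).encode() + b"\n"
    # Placeholder finish reason; a real wrapper would forward the upstream value.
    yield json.dumps({"event_type": "stream-end", "finish_reason": "COMPLETE"}).encode() + b"\n"


for line in wrap_v1_stream(["Hel", "lo"], "gen-123"):
    print(json.loads(line)["event_type"])
```
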
@fede-kamel
Contributor Author

@daniel-cohere — here's a summary of everything addressed in this PR since the original #718:

Bug fixes

  • Embed response_type: Added missing response_type field (embeddings_floats / embeddings_by_type) to OCI embed responses — required by the SDK's discriminated union after Fern regeneration. Without it, pydantic returned None and merge_embed_responses crashed.
  • V1 stream-start event: V1 streams now emit a stream-start event with generation_id before text-generation events, matching the standard Cohere V1 streaming format.
  • Mypy type errors: Fixed 7 type errors across OCI source and test files (casts, type ignores, default values).
  • Duplicate [tool.poetry.extras]: Merged duplicate TOML sections introduced by the merge with main.

Code quality

  • Deduplicated Streamer: Removed manually_maintained/streaming.py (exact duplicate of aws_client.py:Streamer) and import from the existing class instead.
  • Mypy test gate (tests/test_oci_mypy.py): Runs mypy on OCI source and test files as part of pytest. Uses --follow-imports=silent to isolate from pre-existing AWS errors. Catches type regressions before they accumulate.

New test coverage

  • test_embed_response_includes_response_type_v1 / _v2 — unit tests for embed discriminated union field
  • response_type assertions added to V1 and V2 embed integration tests
  • stream-start assertion added to V1 stream wrapper test

Test results

60/60 passed (~12.5s) — 20 integration (live OCI GenAI), 38 unit, 2 mypy gate.

}
},
}
)

V2 stream content-delta dropped for empty-string text chunks

Low Severity

In transform_stream_event, the if content_value: check skips emitting content-delta events when the content is an empty string. Combined with how _transform_v2_event uses _current_content_type to detect type transitions, this means an OCI event that signals a transition from THINKING to TEXT but carries an empty text chunk will correctly trigger a type transition (via _current_content_type) but emit no content-delta. This is mostly benign but breaks symmetry with the Cohere native API, where every content block includes at least one delta.
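
The difference between the truthiness check and an explicit None check can be seen in isolation. This is only an illustration of the two conditions; the function names are hypothetical and the review comment does not prescribe either form as the fix.

```python
from typing import List, Optional


def deltas_truthy(chunks: List[Optional[str]]) -> List[str]:
    # `if chunk:` drops both None and "".
    return [chunk for chunk in chunks if chunk]


def deltas_not_none(chunks: List[Optional[str]]) -> List[str]:
    # `is not None` keeps "" and drops only missing content.
    return [chunk for chunk in chunks if chunk is not None]


chunks = ["", "Hello", None]
print(deltas_truthy(chunks))    # → ['Hello']
print(deltas_not_none(chunks))  # → ['', 'Hello']
```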


Reviewed by Cursor Bugbot for commit b1ed6d5.

When oci is installed but lacks stubs, mypy raises import-untyped.
When oci is not installed (optional dep), mypy raises import-not-found.
Cover both cases since cohere[oci] is optional.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 5 total unresolved issues (including 3 from previous reviews).

Reviewed by Cursor Bugbot for commit 17d4647.

"finish_reason": final_v1_finish_reason,
},
}
)

V1 stream emits stream-end without stream-start guard

Medium Severity

The V1 [DONE] handler unconditionally emits a stream-end event without checking emitted_start. If OCI sends an empty stream (only [DONE] with no data events), a stream-end is produced without a preceding stream-start, which could crash downstream SDK parsing. The V2 path correctly guards this with if emitted_start: but the V1 else branch at line 1076 lacks the same check.


Reviewed by Cursor Bugbot for commit 17d4647.

if "safety_mode" in cohere_body:
    chat_request["safetyMode"] = cohere_body["safety_mode"]
if "priority" in cohere_body:
    chat_request["priority"] = cohere_body["priority"]

Repetitive V1/V2 chat parameter mapping blocks

Low Severity

The V1 and V2 chat parameter mapping in transform_request_to_oci contains ~15 nearly identical parameter-to-camelCase conversions duplicated across two branches (e.g., temperature, maxTokens, topK, topP, seed, frequencyPenalty, presencePenalty, stopSequences, tools, documents, responseFormat, safetyMode, priority). Extracting the shared mappings into a helper would reduce maintenance burden and the risk of inconsistent updates.
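
One possible shape for such a helper is a lookup table plus a single loop. This is a sketch of the suggestion, not code from the PR: `SHARED_CHAT_PARAMS` and `map_shared_params` are hypothetical names, and the table covers only a subset of the parameters listed above (those that can be copied without further transformation).

```python
from typing import Any, Dict

# Snake_case Cohere keys mapped to the camelCase names listed in the review.
SHARED_CHAT_PARAMS: Dict[str, str] = {
    "temperature": "temperature",
    "max_tokens": "maxTokens",
    "seed": "seed",
    "frequency_penalty": "frequencyPenalty",
    "presence_penalty": "presencePenalty",
    "stop_sequences": "stopSequences",
    "safety_mode": "safetyMode",
    "priority": "priority",
}


def map_shared_params(cohere_body: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: copy the shared parameters that are present."""
    return {
        oci_key: cohere_body[cohere_key]
        for cohere_key, oci_key in SHARED_CHAT_PARAMS.items()
        if cohere_key in cohere_body
    }


print(map_shared_params({"max_tokens": 10, "safety_mode": "STRICT", "other": 1}))
# → {'maxTokens': 10, 'safetyMode': 'STRICT'}
```

Both the V1 and V2 branches could call the helper and then add only their branch-specific fields, so a new shared parameter is added in one place.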

Reviewed by Cursor Bugbot for commit 17d4647.

@daniel-cohere daniel-cohere enabled auto-merge (squash) April 9, 2026 19:56
@daniel-cohere daniel-cohere merged commit 15ac93d into cohere-ai:main Apr 9, 2026
1 check passed
2 participants