**Walkthrough**

The OpenAPI specification for the Cohere API's embedding endpoints has been updated to support more flexible and detailed input structures.

**Changes**
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant Client
    participant API
    participant EmbedModel
    Client->>API: POST /v1/embed or /v2/embed (inputs, max_tokens, output_dimension)
    API->>EmbedModel: Process array of EmbedInput (text/image), apply max_tokens, output_dimension
    EmbedModel-->>API: Embedding results
    API-->>Client: Embedding response
```
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/libs/Cohere/openapi.yaml (2)
**8016-8020**: `output_dimension` parameter: well-defined, but consider enumerating allowed values.

The `output_dimension` property is described as only available for `embed-v4` and newer models, with allowed values of 256, 512, 1024, and 1536. However, the schema currently only specifies `type: integer` and a description.

**Recommendation:**
To improve API documentation and client validation, consider using an `enum` to explicitly list the allowed values for `output_dimension`. This will help prevent invalid requests and make the contract clearer for SDK generators.

Example:

```yaml
output_dimension:
  type: integer
  enum: [256, 512, 1024, 1536]
  description: ...
```
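If the enum is adopted, client code can mirror it before sending a request. A minimal sketch, assuming the allowed values listed above; the constant and function names are illustrative and not part of any Cohere SDK:

```python
# Client-side guard mirroring the proposed enum. The allowed values
# (256, 512, 1024, 1536) come from the review text above; the function
# name is illustrative, not part of the Cohere SDK.
ALLOWED_OUTPUT_DIMENSIONS = {256, 512, 1024, 1536}

def validate_output_dimension(value: int) -> int:
    """Raise if `value` is not a dimension embed-v4 models accept."""
    if value not in ALLOWED_OUTPUT_DIMENSIONS:
        raise ValueError(
            f"output_dimension must be one of "
            f"{sorted(ALLOWED_OUTPUT_DIMENSIONS)}, got {value}"
        )
    return value
```

With the enum in the schema, generated SDKs could enforce this automatically rather than relying on such ad-hoc checks.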
**15271-15275**: `EmbedInput.content` property: description is accurate, but clarify structure for mixed content.

The description for the `content` property ("An array of objects containing the input data for the model to embed.") is accurate, but could be more explicit about the expected structure, especially for clients unfamiliar with multimodal input.

**Recommendation:**
Consider clarifying that each item in the array can be either a text or image object, and provide a short example in the description. This will help API consumers understand how to construct mixed-content inputs.

Example addition:

Each item in the array should be an object of type `EmbedText` or `EmbedImage`. For example:

```json
[{"type": "text", "text": "hello"}, {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}]
```
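To illustrate the recommendation, here is a hedged sketch of building such a mixed-content input on the client side. The field names (`type`, `text`, `image_url`, `content`) follow the example shapes quoted in this review and may differ from the final schema; the helper names are illustrative, not part of the Cohere SDK:

```python
import json

# Helpers producing the part shapes used in the review's example; these
# names are illustrative, not part of the Cohere SDK or the final schema.
def text_part(text: str) -> dict:
    return {"type": "text", "text": text}

def image_part(data_url: str) -> dict:
    return {"type": "image_url", "image_url": {"url": data_url}}

def embed_input(*parts: dict) -> dict:
    # One EmbedInput: a `content` array mixing text and image parts.
    return {"content": list(parts)}

# A request body combining the new fields discussed in this review.
payload = {
    "inputs": [
        embed_input(
            text_part("hello"),
            image_part("data:image/png;base64,..."),
        )
    ],
    "max_tokens": 512,         # truncation limit per input
    "output_dimension": 1024,  # one of the values listed above
}
json_body = json.dumps(payload)
```

A worked example like this in the property description would make the text/image mixing rules unambiguous for API consumers.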
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (8)
- `src/libs/Cohere/Generated/Cohere.CohereClient.Embedv2.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.ICohereClient.Embedv2.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.Models.EmbedContent.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.Models.EmbedImage.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.Models.EmbedInput.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.Models.EmbedText.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.Models.Embedv2Request.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/JsonSerializerContextTypes.g.cs` is excluded by `!**/generated/**`
📒 Files selected for processing (1)
- `src/libs/Cohere/openapi.yaml` (4 hunks)
🔇 Additional comments (2)
src/libs/Cohere/openapi.yaml (2)
**15255-15255**: `EmbedImage` schema: description addition is clear and helpful.

The addition of the description "Image content of the input." to the `EmbedImage` schema is clear and improves the documentation.

**15360-15360**: `EmbedText` schema: description addition is clear and helpful.

The addition of the description "Text content of the input." to the `EmbedText` schema is clear and improves the documentation.
```yaml
inputs:
  type: array
  items:
    $ref: '#/components/schemas/EmbedInput'
  description: An array of inputs for the model to embed. Maximum number of inputs per call is `96`. An input can contain a mix of text and image components.
  x-fern-audiences:
    - public
max_tokens:
  type: integer
  description: 'The maximum number of tokens to embed per input. If the input text is longer than this, it will be truncated according to the `truncate` parameter.'
  x-fern-audiences:
    - public
```
💡 Verification agent
❓ Verification inconclusive
New inputs, max_tokens, and output_dimension fields: strong improvement, but clarify interaction with legacy fields.
The addition of the inputs array (of EmbedInput), max_tokens, and output_dimension to the /v2/embed request schema is well-structured and aligns with modern multimodal embedding API design. The descriptions are clear and the types are correct.
However, there is potential for confusion regarding the interaction between the new inputs field and the legacy texts and images fields, which are still present in the schema. It's not clear:
- Whether `inputs` can be used together with `texts`/`images`, or if they are mutually exclusive.
- What happens if both are provided: does one take precedence, or is it an error?
- If `inputs` is now the preferred way, consider marking `texts`/`images` as deprecated or clarifying in the documentation.
Recommendation:
Add explicit documentation (either in the property descriptions or as a top-level note) clarifying the intended usage and precedence of inputs vs. texts/images. If possible, mark texts and images as deprecated for new clients.
Would you like a suggested wording for this clarification, or a diff to mark the legacy fields as deprecated?
Action Required: Clarify Interaction Between New and Legacy Fields
The new addition of the inputs array (containing EmbedInput objects), max_tokens, and output_dimension to the /v2/embed request schema is well structured and clearly described. However, there remains some ambiguity regarding how these new fields interact with the legacy texts and images fields. In particular, it’s not clear:
- Whether `inputs` is intended to be used exclusively, or if it can be combined with `texts`/`images`.
- Which field takes precedence if both are provided, or whether using them together should trigger an error.
- If `inputs` should be promoted as the standard input method, whether the legacy fields ought to be marked as deprecated.
Recommendation:
Update the OpenAPI documentation by adding an explicit note—either as a property description or a top-level comment—detailing the intended usage and precedence. If appropriate, mark texts and images as deprecated for new clients.
Would you like a suggested diff to implement these clarifications?
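One possible shape for such a diff, sketched as a guess at the spec's layout. OpenAPI's standard `deprecated: true` flag is used; the item types and indentation of the legacy `texts`/`images` properties are assumptions, since those properties are not shown in this excerpt:

```yaml
texts:
  type: array
  items:
    type: string
  deprecated: true
  description: >-
    Deprecated: prefer `inputs`. Retained for backwards compatibility.
images:
  type: array
  items:
    type: string
  deprecated: true
  description: >-
    Deprecated: prefer `inputs`. Retained for backwards compatibility.
```

Marking the fields in the schema (rather than only in prose) lets SDK generators surface the deprecation as compiler or IDE warnings.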