From d98a134c24c872e0da71a2e8989accd807a558a3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Wed, 22 Jan 2025 15:32:17 +0100
Subject: [PATCH 1/3] [DOCS] Adds note about differences between chat completion and stream API.

---
 docs/reference/inference/stream-inference.asciidoc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/reference/inference/stream-inference.asciidoc b/docs/reference/inference/stream-inference.asciidoc
index 4a3ce31909712..b5926b399a0fc 100644
--- a/docs/reference/inference/stream-inference.asciidoc
+++ b/docs/reference/inference/stream-inference.asciidoc
@@ -40,6 +40,8 @@ However, if you do not plan to use the {infer} APIs to use these models or if yo
 The stream {infer} API enables real-time responses for completion tasks by delivering answers incrementally,
 reducing response times during computation.
 It only works with the `completion` and `chat_completion` task types.
+The Chat completion {infer} API and the Stream {infer} API differ in their response structure. If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
+
 [NOTE]
 ====
 include::inference-shared.asciidoc[tag=chat-completion-docs]

From dd63503c7d5bc0aec1e004f3ef6d8b0abccc8bbe Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Wed, 22 Jan 2025 16:03:12 +0100
Subject: [PATCH 2/3] [DOCS] More edits.

---
 docs/reference/inference/chat-completion-inference.asciidoc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/reference/inference/chat-completion-inference.asciidoc b/docs/reference/inference/chat-completion-inference.asciidoc
index 83a8f94634f2f..e29a0aafd7caa 100644
--- a/docs/reference/inference/chat-completion-inference.asciidoc
+++ b/docs/reference/inference/chat-completion-inference.asciidoc
@@ -34,9 +34,11 @@ However, if you do not plan to use the {infer} APIs to use these models or if yo
 The chat completion {infer} API enables real-time responses for chat completion tasks by delivering answers incrementally,
 reducing response times during computation.
 It only works with the `chat_completion` task type for `openai` and `elastic` {infer} services.
+
 [NOTE]
 ====
-The `chat_completion` task type is only available within the _unified API and only supports streaming.
+* The `chat_completion` task type is only available within the _unified API and only supports streaming.
+* The Chat completion {infer} API and the Stream {infer} API differ in their response structure. If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
 ====
 
 [discrete]

From 244f2988c323ed80b78c894e11773a3ccc8a7896 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Thu, 23 Jan 2025 11:39:36 +0100
Subject: [PATCH 3/3] [DOCS] Addresses feedback.

---
 docs/reference/inference/chat-completion-inference.asciidoc | 4 +++-
 docs/reference/inference/stream-inference.asciidoc          | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/docs/reference/inference/chat-completion-inference.asciidoc b/docs/reference/inference/chat-completion-inference.asciidoc
index e29a0aafd7caa..1d7d05b0f7d82 100644
--- a/docs/reference/inference/chat-completion-inference.asciidoc
+++ b/docs/reference/inference/chat-completion-inference.asciidoc
@@ -38,7 +38,9 @@ It only works with the `chat_completion` task type for `openai` and `elastic` {i
 [NOTE]
 ====
 * The `chat_completion` task type is only available within the _unified API and only supports streaming.
-* The Chat completion {infer} API and the Stream {infer} API differ in their response structure. If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
+* The Chat completion {infer} API and the Stream {infer} API differ in their response structure and capabilities.
+The Chat completion {infer} API provides more comprehensive customization options through more fields and function calling support.
+If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
 ====
 
 [discrete]

diff --git a/docs/reference/inference/stream-inference.asciidoc b/docs/reference/inference/stream-inference.asciidoc
index b5926b399a0fc..bfcead654258d 100644
--- a/docs/reference/inference/stream-inference.asciidoc
+++ b/docs/reference/inference/stream-inference.asciidoc
@@ -40,7 +40,9 @@ However, if you do not plan to use the {infer} APIs to use these models or if yo
 The stream {infer} API enables real-time responses for completion tasks by delivering answers incrementally,
 reducing response times during computation.
 It only works with the `completion` and `chat_completion` task types.
-The Chat completion {infer} API and the Stream {infer} API differ in their response structure. If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
+The Chat completion {infer} API and the Stream {infer} API differ in their response structure and capabilities.
+The Chat completion {infer} API provides more comprehensive customization options through more fields and function calling support.
+If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
 
 [NOTE]
 ====
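
For context on the difference the added note describes, here is a minimal sketch of the two request shapes. The {infer} endpoint IDs `my-chat-endpoint` (a `chat_completion` endpoint) and `my-completion-endpoint` (a `completion` endpoint) are hypothetical examples; the bodies use the documented `messages` field for the Chat completion {infer} API and the `input` field for the Stream {infer} API.

[source,console]
------------------------------------------------------------
// Chat completion inference API: OpenAI-compatible request body
POST _inference/chat_completion/my-chat-endpoint/_stream
{
  "messages": [
    {
      "role": "user",
      "content": "What is Elastic?"
    }
  ]
}
------------------------------------------------------------

[source,console]
------------------------------------------------------------
// Stream inference API: simple input string
POST _inference/completion/my-completion-endpoint/_stream
{
  "input": "What is Elastic?"
}
------------------------------------------------------------

Roughly, the chat completion endpoint streams OpenAI-style chunk objects (a `choices` array of deltas) and accepts additional fields such as `tools` for function calling, while the stream endpoint emits plain incremental deltas; that is the difference in response structure and capabilities the added note refers to.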