From d98a134c24c872e0da71a2e8989accd807a558a3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Wed, 22 Jan 2025 15:32:17 +0100
Subject: [PATCH 1/3] [DOCS] Adds note about differences between chat completion and stream API.

---
 docs/reference/inference/stream-inference.asciidoc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/reference/inference/stream-inference.asciidoc b/docs/reference/inference/stream-inference.asciidoc
index 4a3ce31909712..b5926b399a0fc 100644
--- a/docs/reference/inference/stream-inference.asciidoc
+++ b/docs/reference/inference/stream-inference.asciidoc
@@ -40,6 +40,8 @@ However, if you do not plan to use the {infer} APIs to use these models or if yo
 The stream {infer} API enables real-time responses for completion tasks by delivering answers incrementally,
 reducing response times during computation.
 It only works with the `completion` and `chat_completion` task types.
+The Chat completion {infer} API and the Stream {infer} API differ in their response structure. If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
+
 [NOTE]
 ====
 include::inference-shared.asciidoc[tag=chat-completion-docs]

From dd63503c7d5bc0aec1e004f3ef6d8b0abccc8bbe Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Wed, 22 Jan 2025 16:03:12 +0100
Subject: [PATCH 2/3] [DOCS] More edits.

---
 docs/reference/inference/chat-completion-inference.asciidoc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/reference/inference/chat-completion-inference.asciidoc b/docs/reference/inference/chat-completion-inference.asciidoc
index 83a8f94634f2f..e29a0aafd7caa 100644
--- a/docs/reference/inference/chat-completion-inference.asciidoc
+++ b/docs/reference/inference/chat-completion-inference.asciidoc
@@ -34,9 +34,11 @@ However, if you do not plan to use the {infer} APIs to use these models or if yo
 The chat completion {infer} API enables real-time responses for chat completion tasks by delivering answers incrementally,
 reducing response times during computation.
 It only works with the `chat_completion` task type for `openai` and `elastic` {infer} services.
+
 [NOTE]
 ====
-The `chat_completion` task type is only available within the _unified API and only supports streaming.
+* The `chat_completion` task type is only available within the _unified API and only supports streaming.
+* The Chat completion {infer} API and the Stream {infer} API differ in their response structure. If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
 ====
 
 [discrete]

From 244f2988c323ed80b78c894e11773a3ccc8a7896 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Thu, 23 Jan 2025 11:39:36 +0100
Subject: [PATCH 3/3] [DOCS] Addresses feedback.

---
 docs/reference/inference/chat-completion-inference.asciidoc | 4 +++-
 docs/reference/inference/stream-inference.asciidoc          | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/docs/reference/inference/chat-completion-inference.asciidoc b/docs/reference/inference/chat-completion-inference.asciidoc
index e29a0aafd7caa..1d7d05b0f7d82 100644
--- a/docs/reference/inference/chat-completion-inference.asciidoc
+++ b/docs/reference/inference/chat-completion-inference.asciidoc
@@ -38,7 +38,9 @@ It only works with the `chat_completion` task type for `openai` and `elastic` {i
 [NOTE]
 ====
 * The `chat_completion` task type is only available within the _unified API and only supports streaming.
-* The Chat completion {infer} API and the Stream {infer} API differ in their response structure. If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
+* The Chat completion {infer} API and the Stream {infer} API differ in their response structure and capabilities.
+The Chat completion {infer} API provides more comprehensive customization options through more fields and function calling support.
+If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
 ====
 
 [discrete]

diff --git a/docs/reference/inference/stream-inference.asciidoc b/docs/reference/inference/stream-inference.asciidoc
index b5926b399a0fc..bfcead654258d 100644
--- a/docs/reference/inference/stream-inference.asciidoc
+++ b/docs/reference/inference/stream-inference.asciidoc
@@ -40,7 +40,9 @@ However, if you do not plan to use the {infer} APIs to use these models or if yo
 The stream {infer} API enables real-time responses for completion tasks by delivering answers incrementally,
 reducing response times during computation.
 It only works with the `completion` and `chat_completion` task types.
-The Chat completion {infer} API and the Stream {infer} API differ in their response structure. If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
+The Chat completion {infer} API and the Stream {infer} API differ in their response structure and capabilities.
+The Chat completion {infer} API provides more comprehensive customization options through more fields and function calling support.
+If you use the `openai` service or the `elastic` service, use the Chat completion {infer} API.
 
 [NOTE]
 ====
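
For context on the difference the added note describes, here is a minimal sketch of the two request shapes. The {infer} endpoint IDs `my-chat-endpoint` (a `chat_completion` endpoint) and `my-completion-endpoint` (a `completion` endpoint) are hypothetical examples; the bodies use the documented `messages` field for the Chat completion {infer} API and the `input` field for the Stream {infer} API.

[source,console]
------------------------------------------------------------
// Chat completion inference API: OpenAI-compatible request body
POST _inference/chat_completion/my-chat-endpoint/_stream
{
  "messages": [
    {
      "role": "user",
      "content": "What is Elastic?"
    }
  ]
}
------------------------------------------------------------

[source,console]
------------------------------------------------------------
// Stream inference API: simple input string
POST _inference/completion/my-completion-endpoint/_stream
{
  "input": "What is Elastic?"
}
------------------------------------------------------------

Roughly, the chat completion endpoint streams OpenAI-style chunk objects (a `choices` array of deltas) and accepts additional fields such as `tools` for function calling, while the stream endpoint emits plain incremental deltas; that is the difference in response structure and capabilities the added note refers to.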