3 changes: 2 additions & 1 deletion configure/ols-configuring-openshift-lightspeed.adoc
@@ -32,7 +32,8 @@ include::modules/ols-about-the-byo-knowledge-tool.adoc[leveloffset=+1]
include::modules/ols-providing-custom-knowledge-to-the-llm.adoc[leveloffset=+2]
include::modules/ols-about-cluster-interaction.adoc[leveloffset=+1]
include::modules/ols-enabling-cluster-interaction.adoc[leveloffset=+2]
include::modules/ols-enabling-custom-mcp-server.adoc[leveloffset=+2]
include::modules/ols-tokens-and-token-quota-limits.adoc[leveloffset=+1]
include::modules/ols-activating-token-quota-limits.adoc[leveloffset=+2]
include::modules/ols-about-postgresql-persistence.adoc[leveloffset=+1]
include::modules/ols-enabling-postgresql-persistence.adoc[leveloffset=+2]
include::modules/ols-enabling-postgresql-persistence.adoc[leveloffset=+2]
15 changes: 12 additions & 3 deletions modules/ols-about-cluster-interaction.adoc
@@ -5,11 +5,20 @@
[id="about-cluster-interaction_{context}"]
= About cluster interaction

A large language model (LLM) is used with the {ols-long} service to generate responses to questions. Use the cluster interaction feature to enhance the knowledge available to the LLM with information about an {ocp-product-title} cluster. Providing cluster information, such as the namespaces or pods that the cluster contains, enables the LLM to generate highly customized responses for your environment.
The {ols-long} Service uses a large language model (LLM) to generate responses to questions. You can enable the cluster interaction feature to enhance the knowledge available to the LLM with information about your {ocp-product-title} cluster. Providing information about the Kubernetes objects that the cluster contains enables the LLM to generate highly specific responses for your environment.

Function calling, also known as tool calling, is a capability that enables an LLM to interact with external APIs. {ols-long} uses the `auto` setting for the tool choice parameter of the LLM to make API calls to the LLM provider. To activate the cluster interaction feature in the {ols-long} service, tool calling must be enabled in the LLM provider.
The Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context to an LLM. Using the protocol, an MCP server offers a standardized way for an LLM to increase context by requesting and receiving real-time information from external resources.
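
Under the protocol, the client and the MCP server exchange JSON-RPC messages. The following request is a minimal sketch of an MCP `tools/call` message; the tool name and arguments are hypothetical and are not part of {ols-long}:

[source,json]
----
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "list_namespaces",
    "arguments": {}
  }
}
----

The MCP server runs the named tool and returns the result, which the client then passes to the LLM as additional context.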

When you enable cluster interaction, the {ols-long} Operator installs an MCP server. The MCP server provides the {ols-long} Service with access to the {ocp-short-name} API. Through this access, the Service performs read operations to gather more context for the LLM, which enables the Service to answer questions about the Kubernetes objects in your {ocp-short-name} cluster.

[NOTE]
====
The ability of {ols-long} to choose and use a tool effectively is highly sensitive to the large language model (LLM) that you use. In general, a larger model with more parameters performs better, and the best performance comes from an extremely large frontier model that represents the latest AI capabilities. When you use a small model, you might notice poor performance in tool selection or other aspects of cluster interaction.
====

To activate the cluster interaction feature in the {ols-long} Service, you must enable tool calling in the LLM provider.

[NOTE]
====
Enabling tool calling can dramatically increase token usage. When you use public model providers, increased token usage can result in greater billing costs.
Enabling tool calling can dramatically increase token usage. When you use public model providers, increased token usage can increase billing costs.
====
5 changes: 2 additions & 3 deletions modules/ols-enabling-cluster-interaction.adoc
@@ -30,7 +30,6 @@ include::snippets/technology-preview.adoc[]

. Set the `spec.ols.introspectionEnabled` parameter to `true` to enable cluster interaction:
+
.Example `OLSconfig` CR file
[source,yaml,subs="attributes,verbatim"]
----
apiVersion: ols.openshift.io/v1alpha1
@@ -46,6 +45,6 @@ spec:

.Verification

* Access the {ols-long} virtual assistant and submit a question associated with the custom content that was added to the LLM.
* Access the {ols-long} virtual assistant and submit a question associated with your cluster.
+
The {ols-long} virtual assistant generates a response based on the custom content.
The {ols-long} virtual assistant generates a highly refined response specific to your environment.
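
A minimal complete `OLSConfig` CR with cluster interaction enabled might look like the following sketch; the field placement assumes the `spec.ols.introspectionEnabled` path named in the step above:

[source,yaml]
----
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  ols:
    introspectionEnabled: true
----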
72 changes: 72 additions & 0 deletions modules/ols-enabling-custom-mcp-server.adoc
@@ -0,0 +1,72 @@
// Module included in the following assemblies:
// * lightspeed-docs-main/configure/ols-configuring-openshift-lightspeed.adoc

:_mod-docs-content-type: PROCEDURE
[id="ols-enabling-mcp-server_{context}"]
= Enabling a custom MCP server

Add a custom Model Context Protocol (MCP) server that interfaces with a tool in your environment so that the large language model (LLM) can use the tool to generate answers to your questions.

:FeatureName: The cluster interaction feature
include::snippets/technology-preview.adoc[]

.Prerequisites

* You have installed the {ols-long} Operator.

* You have configured a large language model provider.

* You have deployed the {ols-long} Service.

.Procedure

. Open the {ols-long} `OLSConfig` custom resource (CR) for editing by running the following command:
+
[source,terminal]
----
$ oc edit olsconfig cluster
----

. Add `MCPServer` to the `spec.featureGate` list in the CR and include the MCP server information:
+
[source,yaml]
----
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  featureGate:
  - MCPServer <1>
  mcpServers:
  - name: mcp-server-1 <2>
    streamableHTTP:
      url: http://localhost:8080/mcp <3>
      timeout: 30
      sseReadTimeout: 10
      headers:
      - Authorization: Bearer <token>
      - Content-Type: application/json
      - Accept: application/json
      enableSSE: true
  - name: mcp-server-2
    streamableHTTP:
      url: http://localhost:8080/mcp
      timeout: 30 <4>
      sseReadTimeout: 10 <5>
      headers:
      - <key1>: <value1> <6>
      - <key2>: <value2>
      enableSSE: true <7>
----
<1> Enables the MCP server feature.
<2> Specifies the name of the MCP server.
<3> Specifies the URL that the {ols-long} Service uses to communicate with the MCP server.
<4> Specifies the time, in seconds, that the MCP server has to respond to a query. If the client does not receive a response within the specified time, the connection times out. In this example, the timeout is 30 seconds.
<5> Specifies the amount of time, in seconds, that a client waits for new data from a Server-Sent Events (SSE) connection. If the client does not receive data within that time, the client closes the connection.
<6> Specifies an additional header that the client sends with HTTP requests to the MCP server.
<7> When you set `enableSSE` to `true`, the MCP server establishes a one-way channel that it uses to push updates to the client whenever the server has new information. The default setting is `false`.

. Save the file and exit the editor to apply the changes.
+
When you save the file, the changes are applied and the MCP server becomes available to the {ols-long} Service.
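
To confirm that the configuration was applied, you can inspect the CR. The following command is a sketch that assumes the `spec.mcpServers` layout shown in the preceding example:

[source,terminal]
----
$ oc get olsconfig cluster -o jsonpath='{.spec.mcpServers[*].name}'
----

If the changes were applied, the output lists the names of the configured MCP servers, for example `mcp-server-1 mcp-server-2`.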