3 changes: 2 additions & 1 deletion configure/ols-configuring-openshift-lightspeed.adoc
@@ -32,7 +32,8 @@ include::modules/ols-about-the-byo-knowledge-tool.adoc[leveloffset=+1]
include::modules/ols-providing-custom-knowledge-to-the-llm.adoc[leveloffset=+2]
include::modules/ols-about-cluster-interaction.adoc[leveloffset=+1]
include::modules/ols-enabling-cluster-interaction.adoc[leveloffset=+2]
include::modules/ols-enabling-custom-mcp-server.adoc[leveloffset=+2]
include::modules/ols-tokens-and-token-quota-limits.adoc[leveloffset=+1]
include::modules/ols-activating-token-quota-limits.adoc[leveloffset=+2]
include::modules/ols-about-postgresql-persistence.adoc[leveloffset=+1]
include::modules/ols-enabling-postgresql-persistence.adoc[leveloffset=+2]
include::modules/ols-enabling-postgresql-persistence.adoc[leveloffset=+2]
15 changes: 12 additions & 3 deletions modules/ols-about-cluster-interaction.adoc
@@ -5,11 +5,20 @@
[id="about-cluster-interaction_{context}"]
= About cluster interaction

A large language model (LLM) is used with the {ols-long} service to generate responses to questions. Use the cluster interaction feature to enhance the knowledge available to the LLM with information about an {ocp-product-title} cluster. Providing cluster information, such as the namespaces or pods that the cluster contains, enables the LLM to generate highly customized responses for your environment.
The {ols-long} Service uses a large language model (LLM) to generate responses to questions. You can enable the cluster interaction feature to enhance the knowledge available to the LLM with information about your {ocp-product-title} cluster. Providing information about the Kubernetes objects that the cluster contains enables the LLM to generate highly specific responses for your environment.

Function calling, also known as tool calling, is a capability that enables an LLM to interact with external APIs. {ols-long} uses the `auto` setting for the tool choice parameter of the LLM to make API calls to the LLM provider. To activate the cluster interaction feature in the {ols-long} service, tool calling must be enabled in the LLM provider.
The Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context to an LLM. Using the protocol, an MCP server offers a standardized way for an LLM to increase context by requesting and receiving real-time information from external resources.
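
Under the protocol, the client and the MCP server exchange JSON-RPC messages. The following request is a minimal sketch of an MCP `tools/call` message; the tool name and arguments are hypothetical and are not part of {ols-long}:

[source,json]
----
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "list_namespaces",
    "arguments": {}
  }
}
----

The MCP server runs the named tool and returns the result, which the client then passes to the LLM as additional context.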

When you enable cluster interaction, the {ols-long} Operator installs an MCP server. The MCP server provides the {ols-long} Service with access to the {ocp-short-name} API. Through this access, the Service performs read operations to gather more context for the LLM, which enables the Service to answer questions about the Kubernetes objects in your {ocp-short-name} cluster.

[NOTE]
====
The ability of {ols-long} to choose and use a tool effectively is highly sensitive to the large language model (LLM) that you use. In general, a larger model with more parameters performs better, and the best performance comes from an extremely large frontier model that represents the latest AI capabilities. When you use a small model, you might notice poor performance in tool selection or other aspects of cluster interaction.
====

To activate the cluster interaction feature in the {ols-long} Service, you must enable tool calling in the LLM provider.

[NOTE]
====
Enabling tool calling can dramatically increase token usage. When you use public model providers, increased token usage can result in greater billing costs.
Enabling tool calling can dramatically increase token usage. When you use public model providers, increased token usage can increase billing costs.
====
5 changes: 2 additions & 3 deletions modules/ols-enabling-cluster-interaction.adoc
@@ -30,7 +30,6 @@ include::snippets/technology-preview.adoc[]

. Set the `spec.ols.introspectionEnabled` parameter to `true` to enable cluster interaction:
+
.Example `OLSconfig` CR file
[source,yaml,subs="attributes,verbatim"]
----
apiVersion: ols.openshift.io/v1alpha1
@@ -46,6 +45,6 @@ spec:

.Verification

* Access the {ols-long} virtual assistant and submit a question associated with the custom content that was added to the LLM.
* Access the {ols-long} virtual assistant and submit a question associated with your cluster.
+
The {ols-long} virtual assistant generates a response based on the custom content.
The {ols-long} virtual assistant generates a highly refined response specific to your environment.
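
A minimal complete `OLSConfig` CR with cluster interaction enabled might look like the following sketch; the field placement assumes the `spec.ols.introspectionEnabled` path named in the step above:

[source,yaml]
----
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  ols:
    introspectionEnabled: true
----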
72 changes: 72 additions & 0 deletions modules/ols-enabling-custom-mcp-server.adoc
@@ -0,0 +1,72 @@
// Module included in the following assemblies:
// * lightspeed-docs-main/configure/ols-configuring-openshift-lightspeed.adoc

:_mod-docs-content-type: PROCEDURE
[id="ols-enabling-mcp-server_{context}"]
= Enabling a custom MCP server

Add a custom Model Context Protocol (MCP) server that interfaces with a tool in your environment so that the large language model (LLM) can use the tool to generate answers to your questions.

:FeatureName: The cluster interaction feature
include::snippets/technology-preview.adoc[]

.Prerequisites

* You have installed the {ols-long} Operator.

* You have configured a large language model provider.

* You have deployed the {ols-long} Service.

.Procedure

. Open the {ols-long} `OLSConfig` custom resource (CR) for editing by running the following command:
+
[source,terminal]
----
$ oc edit olsconfig cluster
----

. Add `MCPServer` to the `spec.featureGate` list in the CR and include the MCP server information:
+
[source,yaml]
----
apiVersion: ols.openshift.io/v1alpha1
kind: OLSConfig
metadata:
  name: cluster
spec:
  featureGate:
  - MCPServer <1>
  mcpServers:
  - name: mcp-server-1 <2>
    streamableHTTP:
      url: http://localhost:8080/mcp <3>
      timeout: 30
      sseReadTimeout: 10
      headers:
      - Authorization: Bearer <token>
      - Content-Type: application/json
      - Accept: application/json
      enableSSE: true
  - name: mcp-server-2
    streamableHTTP:
      url: http://localhost:8080/mcp
      timeout: 30 <4>
      sseReadTimeout: 10 <5>
      headers:
      - <key1>: <value1> <6>
      - <key2>: <value2>
      enableSSE: true <7>
----
<1> Enables the MCP server feature.
<2> Specifies the name of the MCP server.
<3> Specifies the URL that the {ols-long} Service uses to communicate with the MCP server.
<4> Specifies the time, in seconds, that the MCP server has to respond to a query. If the client does not receive a response within the specified time, the connection times out. In this example, the timeout is 30 seconds.
<5> Specifies the amount of time, in seconds, that a client waits for new data from a Server-Sent Events (SSE) connection. If the client does not receive data within that time, the client closes the connection.
<6> Specifies an additional header that the client sends with HTTP requests to the MCP server.
<7> When you set `enableSSE` to `true`, the MCP server establishes a one-way channel that it uses to push updates to the client whenever the server has new information. The default setting is `false`.

. Save the file and exit the editor to apply the changes.
+
When you save the file, the changes are applied and the MCP server becomes available to the {ols-long} Service.
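
To confirm that the configuration was applied, you can inspect the CR. The following command is a sketch that assumes the `spec.mcpServers` layout shown in the preceding example:

[source,terminal]
----
$ oc get olsconfig cluster -o jsonpath='{.spec.mcpServers[*].name}'
----

If the changes were applied, the output lists the names of the configured MCP servers, for example `mcp-server-1 mcp-server-2`.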