diff --git a/configure/ols-configuring-openshift-lightspeed.adoc b/configure/ols-configuring-openshift-lightspeed.adoc
index a7057e26313a..6b869be8719a 100644
--- a/configure/ols-configuring-openshift-lightspeed.adoc
+++ b/configure/ols-configuring-openshift-lightspeed.adoc
@@ -32,7 +32,8 @@ include::modules/ols-about-the-byo-knowledge-tool.adoc[leveloffset=+1]
 include::modules/ols-providing-custom-knowledge-to-the-llm.adoc[leveloffset=+2]
 include::modules/ols-about-cluster-interaction.adoc[leveloffset=+1]
 include::modules/ols-enabling-cluster-interaction.adoc[leveloffset=+2]
+include::modules/ols-enabling-custom-mcp-server.adoc[leveloffset=+2]
 include::modules/ols-tokens-and-token-quota-limits.adoc[leveloffset=+1]
 include::modules/ols-activating-token-quota-limits.adoc[leveloffset=+2]
 include::modules/ols-about-postgresql-persistence.adoc[leveloffset=+1]
-include::modules/ols-enabling-postgresql-persistence.adoc[leveloffset=+2]
+include::modules/ols-enabling-postgresql-persistence.adoc[leveloffset=+2]
\ No newline at end of file
diff --git a/modules/ols-about-cluster-interaction.adoc b/modules/ols-about-cluster-interaction.adoc
index 9efd4048b7ad..9125d05e7ebd 100644
--- a/modules/ols-about-cluster-interaction.adoc
+++ b/modules/ols-about-cluster-interaction.adoc
@@ -5,11 +5,37 @@
 [id="about-cluster-interaction_{context}"]
 = About cluster interaction
 
-A large language model (LLM) is used with the {ols-long} service to generate responses to questions. Use the cluster interaction feature to enhance the knowledge available to the LLM with information about an {ocp-product-title} cluster. Providing cluster information, such as the namespaces or pods that the cluster contains, enables the LLM to generate highly customized responses for your environment.
+The {ols-long} Service uses a large language model (LLM) to generate responses to questions. You can enable the cluster interaction feature to enhance the knowledge available to the LLM with information about your {ocp-product-title} cluster. Providing information about the Kubernetes objects that the cluster contains enables the LLM to generate highly specific responses for your environment.
 
-Function calling, also known as tool calling, is a capability that enables an LLM to interact with external APIs. {ols-long} uses the `auto` setting for the tool choice parameter of the LLM to make API calls to the LLM provider. To activate the cluster interaction feature in the {ols-long} service, tool calling must be enabled in the LLM provider.
+The Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context to an LLM. Using the protocol, an MCP server gives an LLM a consistent way to increase its context by requesting and receiving real-time information from external resources.
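+
+For example, a client communicates with an MCP server by exchanging JSON-RPC messages. The following sketch shows the general shape of an MCP `tools/call` request. The tool name and arguments are illustrative only and depend on the tools that a particular MCP server exposes:
+
+[source,json]
+----
+{
+  "jsonrpc": "2.0",
+  "id": 1,
+  "method": "tools/call",
+  "params": {
+    "name": "example-read-tool",
+    "arguments": { "namespace": "default" }
+  }
+}
+----
+
+You do not construct these messages yourself; the {ols-long} Service and the MCP server exchange them automatically.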
+
+When you enable cluster interaction, the {ols-long} Operator installs an MCP server. The MCP server provides the {ols-long} Service with access to the {ocp-short-name} API. Through this access, the Service performs read operations to gather more context for the LLM, enabling the Service to answer questions about the Kubernetes objects that reside in your {ocp-short-name} cluster.
+
+[NOTE]
+====
+The ability of {ols-long} to choose and use a tool effectively is highly sensitive to the large language model (LLM) that you use. In general, a larger model with more parameters performs better, and the best performance comes from very large frontier models that represent the latest AI capabilities. When you use a small model, you might notice poor performance in tool selection or other aspects of cluster interaction.
+====
+
+To activate the cluster interaction feature in the {ols-long} Service, tool calling must be enabled in the LLM provider.
 
 [NOTE]
 ====
-Enabling tool calling can dramatically increase token usage. When you use public model providers, increased token usage can result in greater billing costs.
+Enabling tool calling can dramatically increase token usage. When you use public model providers, increased token usage can increase billing costs.
 ====
\ No newline at end of file
diff --git a/modules/ols-enabling-cluster-interaction.adoc b/modules/ols-enabling-cluster-interaction.adoc
index d71f7e923d93..10758ba355ca 100644
--- a/modules/ols-enabling-cluster-interaction.adoc
+++ b/modules/ols-enabling-cluster-interaction.adoc
@@ -30,7 +30,6 @@ include::snippets/technology-preview.adoc[]
 
 . Set the `spec.ols.introspectionEnabled` parameter to `true` to enable cluster interaction:
 +
-.Example `OLSconfig` CR file
 [source,yaml,subs="attributes,verbatim"]
 ----
 apiVersion: ols.openshift.io/v1alpha1
@@ -46,6 +45,6 @@ spec:
 
 .Verification
 
-* Access the {ols-long} virtual assistant and submit a question associated with the custom content that was added to the LLM.
+* Access the {ols-long} virtual assistant and submit a question about your cluster.
 +
-The {ols-long} virtual assistant generates a response based on the custom content.
+The {ols-long} virtual assistant generates a response that is specific to your environment.
diff --git a/modules/ols-enabling-custom-mcp-server.adoc b/modules/ols-enabling-custom-mcp-server.adoc
new file mode 100644
index 000000000000..645049fa2117
--- /dev/null
+++ b/modules/ols-enabling-custom-mcp-server.adoc
@@ -0,0 +1,88 @@
+// Module included in the following assemblies:
+// * lightspeed-docs-main/configure/ols-configuring-openshift-lightspeed.adoc
+
+:_mod-docs-content-type: PROCEDURE
+[id="ols-enabling-mcp-server_{context}"]
+= Enabling a custom MCP server
+
+Add a custom MCP server that interfaces with a tool in your environment so that the large language model (LLM) can use the tool to generate answers to your questions.
+
+:FeatureName: The custom MCP server feature
+include::snippets/technology-preview.adoc[]
+
+.Prerequisites
+
+* You have installed the {ols-long} Operator.
+
+* You have configured a large language model provider.
+
+* You have deployed the {ols-long} service.
+
+.Procedure
+
+. Open the {ols-long} `OLSConfig` custom resource (CR) file by running the following command:
++
+[source,terminal]
+----
+$ oc edit olsconfig cluster
+----
+
+. Add `MCPServer` to the `spec.featureGate` list and include the MCP server information:
++
+[source,yaml]
+----
+apiVersion: ols.openshift.io/v1alpha1
+kind: OLSConfig
+metadata:
+  name: cluster
+spec:
+  featureGate:
+  - MCPServer <1>
+  mcpServers:
+  - name: mcp-server-1 <2>
+    streamableHTTP:
+      url: http://localhost:8080/mcp <3>
+      timeout: 30
+      sseReadTimeout: 10
+      headers:
+      - Authorization: Bearer <token>
+      - Content-Type: application/json
+      - Accept: application/json
+      enableSSE: true
+  - name: mcp-server-2
+    streamableHTTP:
+      url: http://localhost:8080/mcp
+      timeout: 30 <4>
+      sseReadTimeout: 10 <5>
+      headers:
+      - <header-name>: <header-value> <6>
+      - <header-name>: <header-value>
+      enableSSE: true <7>
+----
+<1> Enables the custom MCP server functionality.
+<2> Specifies the name of the MCP server.
+<3> Specifies the URL of the endpoint that the {ols-long} service uses to communicate with the MCP server.
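+<4> Specifies the amount of time, in seconds, that the MCP server has to respond to a request. If the client does not receive a response within the specified time, the request times out. In this example, the timeout is 30 seconds.
+<5> Specifies the amount of time that a client waits for new data from a Server-Sent Events (SSE) connection. If the client does not receive data within that time, the client closes the connection.
+<6> Specifies an additional header that is sent with HTTP requests to the MCP server.
+<7> When you set `enableSSE` to `true`, the MCP server establishes a one-way channel that it uses to push updates to the client whenever new information is available. The default setting is `false`.
+
+. Optional: check that an MCP server endpoint is reachable. The following command uses the example URL from the preceding step and must run from a host that can reach the server. Any HTTP response, including an error status, indicates that the endpoint is reachable:
++
+[source,terminal]
+----
+$ curl -i http://localhost:8080/mcp
+----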
+
+. Save the file to apply the changes.
++
+The MCP server becomes available to the {ols-long} service.
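+
+.Verification
+
+* Verify that the MCP server configuration is present in the `OLSConfig` CR by running the following command:
++
+[source,terminal]
+----
+$ oc get olsconfig cluster -o yaml
+----
\ No newline at end of file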