# Embedding semantic search for issue resolution

The chart below illustrates a workflow that generates solutions to incidents by combining retrieval and generation techniques, using BigQuery ML (BQML) and Large Language Models (LLMs).

### Process Flow

1. **Document Processing:**
   * **Input:** The process starts with raw incident resolution data stored in PDF files. These files are exposed to BigQuery through an object table.
   * **DocAI Processor:** A DocAI Processor extracts structured information from the documents. The output is a parsed table containing key details about the incidents. We call DocAI from within BigQuery.

2. **Retrieval:**
   * **Embedding Generation:**
     * **BQML GENERATE EMBEDDING:** The incident text (or parsed information) is passed through a BQML model to generate an embedding vector. This vector represents the semantic meaning of the incident in a numerical format.
   * **Vector Search:**
     * **BQML VECTOR INDEX:** The embedding vector is used to query a BQML VECTOR INDEX. This index stores pre-computed embeddings of existing incident resolutions or knowledge base articles.
     * **BQML VECTOR SEARCH:** The search retrieves the top K most similar items (resolutions) from the index based on the similarity between the query embedding and the stored embeddings.

3. **Generation:**
   * **Prompt Construction:**
     * A prompt is created for the LLM. This prompt includes:
       * Instruction to produce a solution for the incident
       * The retrieved top K resolutions (or their summaries) as context
   * **LLM Generation:**
     * **BQML GENERATE TEXT:** The LLM processes the prompt and generates a solution (or response) to the incident. The generated text leverages both the information from the retrieved resolutions and the LLM's own language understanding capabilities.
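
The retrieval and prompt construction steps above can be sketched in plain Python. This is only an illustration of the idea, not what BigQuery runs internally: the three-dimensional embeddings, the document contents, and the helper names (`cosine_similarity`, `top_k`, `build_prompt`) are all made up for the example, and cosine similarity is an assumption (the metric BigQuery uses depends on the `distance_type` option).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus, k):
    """Return the k documents whose embeddings are most similar to query_vec."""
    ranked = sorted(
        corpus,
        key=lambda doc: cosine_similarity(query_vec, doc["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(instruction, docs):
    """Concatenate the instruction with the retrieved documents as context."""
    context = "\n\n".join(doc["content"] for doc in docs)
    return f"{instruction}\n\n{context}"


# Toy embeddings, invented for illustration only.
corpus = [
    {"title": "cpu", "content": "Restart the overloaded service.", "embedding": [0.9, 0.1, 0.0]},
    {"title": "net", "content": "Reroute traffic away from the congested link.", "embedding": [0.1, 0.9, 0.0]},
    {"title": "disk", "content": "Free up disk space.", "embedding": [0.0, 0.1, 0.9]},
]
query_vec = [0.8, 0.3, 0.0]  # stands in for the embedding of the incident text

hits = top_k(query_vec, corpus, k=2)
prompt = build_prompt("Produce a step by step solution for the incident.", hits)
print([doc["title"] for doc in hits])
print(prompt)
```

In the notebook itself, the same three steps are delegated to `ML.GENERATE_EMBEDDING`, `VECTOR_SEARCH`, and `ML.GENERATE_TEXT`.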

### Key Components

* **DocAI Processor:** Extracts structured data from unstructured incident documents.
* **BQML:**
    * **GENERATE EMBEDDING:** Creates embedding vectors representing the semantic meaning of text.
    * **VECTOR INDEX:** Stores pre-computed embeddings for efficient similarity search.
    * **VECTOR SEARCH:** Retrieves similar items from the vector index.
    * **GENERATE TEXT:** Generates text using an LLM.
* **LLM:** Large Language Model (Gemini) used for generating the final incident solution based on the retrieved context.

### Benefits of this Approach

* **Leverages Existing Knowledge:** Retrieval from a knowledge base ensures that the generated solution is informed by past experiences and best practices.
* **Improves Accuracy:** The LLM's output is grounded in relevant context, reducing the chances of generating irrelevant or inaccurate solutions.
* **Efficient:** The vector index allows for fast retrieval of relevant information, even from large knowledge bases. 

![gen_ai_bq_00](../../assets/gen_ai_bq_00.png)

In [None]:
import sys
import os

In [None]:
sys.path.append(os.path.dirname(os.getcwd()))
from utils import run_query, load_constants
from IPython.display import IFrame

In [None]:
constants = load_constants()

GOOGLE_CLOUD_PROJECT = constants["GCP"]["GOOGLE_CLOUD_PROJECT"]
GOOGLE_CLOUD_LOCATION = constants["GCP"]["GOOGLE_CLOUD_LOCATION"]
GOOGLE_CLOUD_LOCATION_MULTI_REGION = constants["GCP"]["GOOGLE_CLOUD_LOCATION_MULTI_REGION"]
GOOGLE_CLOUD_GCS_BUCKET = constants["GCP"]["GOOGLE_CLOUD_GCS_BUCKET"]
GOOGLE_CLOUD_GCS_BUCKET_MULTI_REGION = constants["GCP"][
    "GOOGLE_CLOUD_GCS_BUCKET_MULTI_REGION"
]
GOOGLE_GEMINI_MODEL_15 = constants["VERTEX"]["GOOGLE_GEMINI_MODEL_15"]
GOOGLE_GEMINI_MODEL_10 = constants["VERTEX"]["GOOGLE_GEMINI_MODEL_10"]

GOOGLE_CLOUD_BIGQUERY_PROJECT = constants["BIGQUERY"]["GOOGLE_CLOUD_BIGQUERY_PROJECT"]
GOOGLE_CLOUD_BIGQUERY_DATASET = constants["BIGQUERY"]["GOOGLE_CLOUD_BIGQUERY_DATASET"]
GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION = constants["BIGQUERY"][
    "GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION"
]


BASE_TABLE_NAME_EVENTS = constants["BIGQUERY"]["BASE_TABLE_NAME_EVENTS"]
BASE_TABLE_NAME_INCIDENTS = constants["BIGQUERY"]["BASE_TABLE_NAME_INCIDENTS"]

DOC_AI_PROCESSOR_URI = constants["DOC_AI"]["DOC_AI_PROCESSOR_URI"]

Let's have a look at one of our incident resolution documents.

In [None]:
!gcloud storage cp gs://{GOOGLE_CLOUD_GCS_BUCKET_MULTI_REGION}/rca/incident_resolution_20.pdf .

In [None]:
IFrame("incident_resolution_20.pdf", width=800, height=640)

In [None]:
query_cext = f"""CREATE OR REPLACE EXTERNAL TABLE `{GOOGLE_CLOUD_BIGQUERY_PROJECT}.{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.{BASE_TABLE_NAME_INCIDENTS}_docs`
  WITH CONNECTION `{GOOGLE_CLOUD_BIGQUERY_PROJECT}.{GOOGLE_CLOUD_LOCATION_MULTI_REGION}.genai`
  OPTIONS (
    object_metadata = 'SIMPLE',
    uris = ['gs://{GOOGLE_CLOUD_GCS_BUCKET_MULTI_REGION}/rca/*'],
    metadata_cache_mode= 'AUTOMATIC',
    max_staleness= INTERVAL 1 HOUR
  );"""

In [None]:
query_cmodel = f"""
  CREATE OR REPLACE MODEL `{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.rca_processor`
  REMOTE WITH CONNECTION `{GOOGLE_CLOUD_BIGQUERY_PROJECT}.{GOOGLE_CLOUD_LOCATION_MULTI_REGION}.genai`
  OPTIONS (
    remote_service_type = 'CLOUD_AI_DOCUMENT_V1',   
    document_processor='{DOC_AI_PROCESSOR_URI}'
  );"""

In [None]:
run_query(query_cext)

In [None]:
run_query(query_cmodel)

In [None]:
query_parse = f"""
  CREATE OR REPLACE TABLE `{GOOGLE_CLOUD_BIGQUERY_PROJECT}.{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.{BASE_TABLE_NAME_INCIDENTS}_docs_parsed` AS
  SELECT *
  FROM ML.PROCESS_DOCUMENT(
    MODEL `{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.rca_processor`,
    TABLE `{GOOGLE_CLOUD_BIGQUERY_PROJECT}.{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.{BASE_TABLE_NAME_INCIDENTS}_docs`)
  WHERE content_type = 'application/pdf';"""

In [None]:
run_query(query_parse)

In [None]:
query_emodel = f"""
CREATE OR REPLACE MODEL `{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.gecko_embedder`
  REMOTE WITH CONNECTION `{GOOGLE_CLOUD_BIGQUERY_PROJECT}.{GOOGLE_CLOUD_LOCATION_MULTI_REGION}.genai`
  OPTIONS (ENDPOINT = "textembedding-gecko-multilingual");"""

In [None]:
run_query(query_emodel)

In [None]:
query_genembs = f"""
CREATE OR REPLACE TABLE `{GOOGLE_CLOUD_BIGQUERY_PROJECT}.{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.{BASE_TABLE_NAME_INCIDENTS}_docs_embedded` AS
SELECT * FROM ML.GENERATE_EMBEDDING(
  MODEL `{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.gecko_embedder`,
  (
    SELECT  JSON_VALUE(ml_process_document_result, '$.text') AS content, uri as title
    FROM `{GOOGLE_CLOUD_BIGQUERY_PROJECT}.{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.{BASE_TABLE_NAME_INCIDENTS}_docs_parsed`
  )
)
WHERE LENGTH(ml_generate_embedding_status) = 0;"""

In [None]:
run_query(query_genembs)
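
The process flow mentions a vector index, but the search cells later in this notebook call `VECTOR_SEARCH` on the embeddings table directly, which works without an index. If you want to add one, a minimal sketch of the DDL could look like the following. The helper name `build_vector_index_ddl`, the index name, and the example table path are hypothetical, and the `IVF` index type and `COSINE` distance are assumptions to adjust for your data. BigQuery may leave an index unpopulated on small tables, so check the current documentation before relying on it.

```python
def build_vector_index_ddl(table, column, index_name, distance_type="COSINE"):
    """Build a CREATE VECTOR INDEX statement for an embeddings table.

    table: fully qualified table name (project.dataset.table).
    column: name of the column that holds the embedding vectors.
    index_name: hypothetical name for the new index.
    """
    return (
        f"CREATE OR REPLACE VECTOR INDEX {index_name}\n"
        f"ON `{table}`({column})\n"
        f"OPTIONS (index_type = 'IVF', distance_type = '{distance_type}');"
    )


# Example values only; in the notebook the table path would be built from the constants.
query_cindex = build_vector_index_ddl(
    table="my-project.my_dataset.incidents_docs_embedded",
    column="ml_generate_embedding_result",
    index_name="incidents_docs_index",
)
print(query_cindex)
```

The resulting string could then be passed to `run_query`, like the other statements in this notebook.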

In [None]:
# Use a separate variable so the embedding model statement is not overwritten.
query_gmodel = f"""
CREATE OR REPLACE MODEL `{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.gemini_model`
  REMOTE WITH CONNECTION `{GOOGLE_CLOUD_BIGQUERY_PROJECT}.{GOOGLE_CLOUD_LOCATION_MULTI_REGION}.genai`
  OPTIONS (ENDPOINT = '{GOOGLE_GEMINI_MODEL_10}');"""

In [None]:
run_query(query_gmodel)

In [None]:
user_query = 'I am having a high CPU utilization incident together with Network Congestion Alert and High Active Connection Count Alert'
query_search = f"""
SELECT *
FROM VECTOR_SEARCH(
  TABLE `{GOOGLE_CLOUD_BIGQUERY_PROJECT}.{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.{BASE_TABLE_NAME_INCIDENTS}_docs_embedded`,
  'ml_generate_embedding_result',
  (
    -- Embed the user query with the same model used for the documents.
    SELECT ml_generate_embedding_result, content AS query
    FROM ML.GENERATE_EMBEDDING(
      MODEL `{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.gecko_embedder`,
      (SELECT '{user_query}' AS content)
    )
  ),
  top_k => 5);"""

In [None]:
run_query(query_search)
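
The search query interpolates `user_query` straight into a single-quoted SQL string, so a query containing an apostrophe (for example "I'm") would break the statement. One way to guard against that is to escape the text before interpolating it. The helper below is a minimal sketch; `to_sql_string_literal` is a hypothetical name, and it only covers backslashes, single quotes, and line breaks. Query parameters are generally a more robust option, but whether `run_query` supports them is not shown here.

```python
def to_sql_string_literal(text):
    """Render text as a single-quoted GoogleSQL string literal."""
    escaped = (
        text.replace("\\", "\\\\")  # escape backslashes first
        .replace("'", "\\'")        # then single quotes
        .replace("\n", "\\n")       # keep the literal on one line
        .replace("\r", "\\r")
    )
    return f"'{escaped}'"


literal = to_sql_string_literal("I'm seeing high CPU utilization")
print(literal)
```

With this helper, the `'{user_query}'` fragment in the f-string would become `{to_sql_string_literal(user_query)}`, without the surrounding quotes.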

In [None]:
user_query = "I am having a high CPU utilization incident together with Network Congestion Alert and High Active Connection Count Alert"
query_rag = f"""SELECT ml_generate_text_result.candidates[0].content.parts[0].text
FROM ML.GENERATE_TEXT(
  MODEL `{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.gemini_model`,
  (
    -- Build one prompt: the instruction followed by the retrieved articles.
    SELECT CONCAT(
      'Detail how to solve the issue using the following articles, produce a step by step guide ',
      STRING_AGG(base.content)
    ) AS prompt
    FROM VECTOR_SEARCH(
      TABLE `{GOOGLE_CLOUD_BIGQUERY_PROJECT}.{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.{BASE_TABLE_NAME_INCIDENTS}_docs_embedded`,
      'ml_generate_embedding_result',
      (
        SELECT ml_generate_embedding_result, content AS query
        FROM ML.GENERATE_EMBEDDING(
          MODEL `{GOOGLE_CLOUD_BIGQUERY_DATASET_MULTI_REGION}.gecko_embedder`,
          (SELECT '{user_query}' AS content)
        )
      ),
      top_k => 10
    )
  ),
  STRUCT(8192 AS max_output_tokens)
);"""

In [None]:
print(run_query(query_rag)['text'].iloc[0])
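
The RAG query reads the answer from `candidates[0].content.parts[0].text` inside the JSON result. If you select the raw `ml_generate_text_result` column instead, the same path can be walked in Python, with a fallback for responses that carry no candidates. This is a sketch under assumptions: `extract_generated_text` is a hypothetical helper, and the sample payload is invented to mirror the path used in the SQL above rather than copied from a real response.

```python
import json


def extract_generated_text(raw_result):
    """Return the generated text from a JSON result, or None if there is none."""
    payload = json.loads(raw_result) if isinstance(raw_result, str) else raw_result
    candidates = payload.get("candidates") or []
    if not candidates:
        return None
    parts = candidates[0].get("content", {}).get("parts") or []
    text = "".join(part.get("text", "") for part in parts)
    return text or None


# Invented payload that follows the candidates[0].content.parts[0].text path.
sample = json.dumps(
    {"candidates": [{"content": {"parts": [{"text": "Step 1: check the CPU load."}]}}]}
)
answer = extract_generated_text(sample)
print(answer)
```

Returning `None` instead of raising makes it easier to spot rows where the model produced no usable output.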

We have also included a simple web app to explore this lab interactively.

Go back to the Google Cloud Console, open Cloud Shell, and run the following commands from the terminal:

```bash
git clone https://github.com/velascoluis/telco_data_ai_lab.git
cd telco_data_ai_lab/src/gen_ai_docs/webapp
source launch_local_test.sh
```

Click on Web Preview to display the web app in the browser.

![gen_ai_bq_01](../../assets/gen_ai_bq_01.png)

