# Getting Started with AI Observability

To run this notebook, first install the following packages: `snowflake-ml-python`, `snowflake.core`, `trulens-core`, `trulens-providers-cortex`, and `trulens-connectors-snowflake`.

https://github.com/Snowflake-Labs/sfguide-getting-started-with-ai-observability

## Create the database, tables and warehouse

In [None]:
from snowflake.snowpark.context import get_active_session
session = get_active_session()

In [None]:
CREATE DATABASE IF NOT EXISTS cortex_search_tutorial_db;

CREATE OR REPLACE WAREHOUSE cortex_search_tutorial_wh WITH
     WAREHOUSE_SIZE='X-SMALL'
     AUTO_SUSPEND = 120
     AUTO_RESUME = TRUE
     INITIALLY_SUSPENDED=TRUE;

 USE WAREHOUSE cortex_search_tutorial_wh;

Note:

The CREATE DATABASE statement creates a database. The database automatically includes a schema named PUBLIC.

The CREATE WAREHOUSE statement creates an initially suspended warehouse.

## Get PDF data

You will use a sample dataset of the Federal Open Market Committee (FOMC) meeting minutes for this example. This is a sample of twelve 10-page documents with meeting notes from FOMC meetings from 2023 and 2024. Download the files directly from your browser by following this link:

[FOMC minutes sample](https://drive.google.com/file/d/1C6TdVjy6d-GnasGO6ZrIEVJQRcedDQxG/view)

The complete set of FOMC minutes can be found at the [US Federal Reserve’s website](https://www.federalreserve.gov/monetarypolicy/fomccalendars.htm).

Note: In a non-classroom setting, you would bring your own data, possibly already in a Snowflake stage.

## Load data into Snowflake stage

In [None]:
CREATE OR REPLACE STAGE cortex_search_tutorial_db.public.fomc
    DIRECTORY = (ENABLE = TRUE)
    ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE');

In [None]:
-- This removes the need to upload the files manually

CREATE OR REPLACE STAGE MGG_GENAI_OBSERVABILITY
 URL = 's3://mggsnowflake/genaiobservability';

COPY FILES INTO @FOMC
FROM @MGG_GENAI_OBSERVABILITY;

ALTER STAGE FOMC REFRESH;

If you ran the cell above, the files are already in the stage. Otherwise, upload the dataset yourself, either in Snowsight or using SQL. To upload in Snowsight:

1. Sign in to Snowsight.

2. Select Data in the left-side navigation menu.

3. Select your database cortex_search_tutorial_db.

4. Select your schema public.

5. Select Stages and select fomc.

6. On the top right, select the + Files button.

7. Drag and drop files into the UI or select Browse to choose a file from the dialog window.

8. Select Upload to upload your file.
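As an alternative to the Snowsight steps above, the files can also be uploaded from Python. The following is a minimal sketch, assuming the PDFs have been downloaded to a local folder. `upload_pdfs` and `local_dir` are hypothetical names introduced here; the upload itself relies on Snowpark's `session.file.put`. Compression is turned off so the staged files keep their `.pdf` extension.

```python
from pathlib import Path
from typing import Any, List


def upload_pdfs(session: Any, local_dir: str, stage: str) -> List[str]:
    """Upload every PDF in local_dir to the given stage and return the file names."""
    uploaded = []
    for pdf in sorted(Path(local_dir).glob("*.pdf")):
        # auto_compress=False keeps the staged file as .pdf rather than .pdf.gz
        session.file.put(str(pdf), stage, auto_compress=False, overwrite=True)
        uploaded.append(pdf.name)
    return uploaded


# Hypothetical usage inside the notebook (the folder name is an assumption):
# upload_pdfs(session, "./fomc_minutes", "@cortex_search_tutorial_db.public.fomc")
```

After uploading this way, refresh the stage's directory table (`ALTER STAGE ... REFRESH`) so the new files are visible to queries that use `directory(...)`.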

## Verify the PDF Files are uploaded to stage

In [None]:
ls @cortex_search_tutorial_db.public.fomc

## Parse PDF Files

In [None]:
CREATE OR REPLACE TABLE CORTEX_SEARCH_TUTORIAL_DB.PUBLIC.PARSED_FOMC_CONTENT AS SELECT 
      relative_path,
      TO_VARCHAR(
        SNOWFLAKE.CORTEX.PARSE_DOCUMENT(
          @cortex_search_tutorial_db.public.fomc, 
          relative_path, 
          {'mode': 'LAYOUT'}
        ) :content
      ) AS parsed_text
    FROM directory(@cortex_search_tutorial_db.public.fomc)
    WHERE relative_path LIKE '%.pdf'

In [None]:
SELECT * FROM CORTEX_SEARCH_TUTORIAL_DB.PUBLIC.PARSED_FOMC_CONTENT LIMIT 2

Verify that the stage contains the files and refresh it so that the previous query returns parsed text.

## Chunk text

In [None]:
CREATE OR REPLACE TABLE CORTEX_SEARCH_TUTORIAL_DB.PUBLIC.CHUNKED_FOMC_CONTENT (
    file_name VARCHAR,
    CHUNK VARCHAR
);

INSERT INTO CORTEX_SEARCH_TUTORIAL_DB.PUBLIC.CHUNKED_FOMC_CONTENT (file_name, CHUNK)
SELECT
    relative_path,
    c.value AS CHUNK
FROM
    CORTEX_SEARCH_TUTORIAL_DB.PUBLIC.PARSED_FOMC_CONTENT,
    LATERAL FLATTEN( input => SNOWFLAKE.CORTEX.SPLIT_TEXT_RECURSIVE_CHARACTER (
        parsed_text,
        'markdown',
        1800,
        250
    )) c;
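In the call above, `1800` is the maximum chunk size in characters and `250` is the number of characters shared between consecutive chunks. To make those two numbers concrete, here is a simplified fixed-window splitter in plain Python. It is only an illustration, not Snowflake's recursive algorithm (which also tries to break on separators), and `chunk_with_overlap` is a hypothetical helper.

```python
from typing import List


def chunk_with_overlap(text: str, chunk_size: int = 1800, overlap: int = 250) -> List[str]:
    """Split text into windows of at most chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(4000))  # synthetic text
chunks = chunk_with_overlap(sample)
print([len(c) for c in chunks])          # [1800, 1800, 900]
print(chunks[0][-250:] == chunks[1][:250])  # True: neighbours share 250 characters
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.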

In [None]:
SELECT * FROM CORTEX_SEARCH_TUTORIAL_DB.PUBLIC.CHUNKED_FOMC_CONTENT LIMIT 10

## Create Search Service

In [None]:
CREATE OR REPLACE CORTEX SEARCH SERVICE CORTEX_SEARCH_TUTORIAL_DB.PUBLIC.FOMC_SEARCH_SERVICE
    ON chunk
    WAREHOUSE = cortex_search_tutorial_wh
    TARGET_LAG = '1 minute'
    EMBEDDING_MODEL = 'snowflake-arctic-embed-l-v2.0'
    AS (
    SELECT
        file_name,
        chunk
    FROM CORTEX_SEARCH_TUTORIAL_DB.PUBLIC.CHUNKED_FOMC_CONTENT
    );

## Use the Search Service

In [None]:
import os
from snowflake.core import Root
from typing import List
from snowflake.snowpark.session import Session

class CortexSearchRetriever:

    def __init__(self, snowpark_session: Session, limit_to_retrieve: int = 4):
        self._snowpark_session = snowpark_session
        self._limit_to_retrieve = limit_to_retrieve

    def retrieve(self, query: str) -> List[str]:
        # Use the session passed to the constructor rather than the notebook global
        root = Root(self._snowpark_session)

        search_service = (root
          .databases["CORTEX_SEARCH_TUTORIAL_DB"]
          .schemas["PUBLIC"]
          .cortex_search_services["FOMC_SEARCH_SERVICE"]
        )
        resp = search_service.search(
          query=query,
          columns=["chunk"],
          limit=self._limit_to_retrieve
        )

        if resp.results:
            return [curr["chunk"] for curr in resp.results]
        else:
            return []
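The last step of `retrieve` turns the search response into a plain list of strings. As a rough sketch, that step can be isolated into a small function that also tolerates a missing or empty result set. It assumes each hit is a dictionary keyed by the requested column names, as the code above does; `extract_column` is a hypothetical helper, not part of `snowflake.core`.

```python
from typing import Any, Dict, List, Optional


def extract_column(results: Optional[List[Dict[str, Any]]], column: str = "chunk") -> List[str]:
    """Return the requested column from each search hit, skipping hits without it."""
    if not results:
        return []
    return [str(hit[column]) for hit in results if column in hit]


# Placeholder hits, for illustration only
hits = [{"chunk": "first chunk"}, {"file_name": "no chunk here"}, {"chunk": "second chunk"}]
print(extract_column(hits))
```

Keeping this logic separate makes it possible to exercise it without a live search service.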

## Turn on OpenTelemetry Tracing

Before we build the RAG, we want to enable TruLens-OpenTelemetry for tracing and observability.

In [None]:
import os
os.environ["TRULENS_OTEL_TRACING"] = "1"

Create a database and schema to store the traces and evaluations:

In [None]:
create database if not exists observability_db;
create schema if not exists observability_db.observability_schema;

In [None]:
session.use_schema("observability_db.observability_schema")
session.get_current_database() + '.' + session.get_current_schema()

## Create the RAG with instrumentation

Build the RAG application with instrumentation built in. Declaring the span type and attributes on each instrumented method is what allows the evaluations to find the query, the retrieved contexts, and the output in the captured spans.

In [None]:
from snowflake.cortex import complete
from trulens.core.otel.instrument import instrument
from trulens.otel.semconv.trace import SpanAttributes

class RAG:

    def __init__(self):
        self.retriever = CortexSearchRetriever(snowpark_session=session, limit_to_retrieve=4)

    @instrument(
        span_type=SpanAttributes.SpanType.RETRIEVAL,
        attributes={
            SpanAttributes.RETRIEVAL.QUERY_TEXT: "query",
            SpanAttributes.RETRIEVAL.RETRIEVED_CONTEXTS: "return",
            }
    )
    def retrieve_context(self, query: str) -> list:
        """
        Retrieve relevant text from vector store.
        """
        return self.retriever.retrieve(query)


    @instrument(
        span_type=SpanAttributes.SpanType.GENERATION)
    def generate_completion(self, query: str, context_str: list) -> str:
        """
        Generate answer from context.
        """
        prompt = f"""
          You are an expert assistant extracting information from context provided.
          Answer the question in long-form, fully and completely, based on the context. Do not hallucinate.
          If you don't have the information just say so.
          Context: {context_str}
          Question:
          {query}
          Answer:
        """
        response = ""
        stream = complete("mistral-large2", prompt, stream = True)
        for update in stream:
            response += update
            print(update, end="")
        return response

    @instrument(
        span_type=SpanAttributes.SpanType.RECORD_ROOT, 
        attributes={
            SpanAttributes.RECORD_ROOT.INPUT: "query",
            SpanAttributes.RECORD_ROOT.OUTPUT: "return",
        })
    def query(self, query: str) -> str:
        context_str = self.retrieve_context(query)
        return self.generate_completion(query, context_str)


rag = RAG()
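Note that `generate_completion` interpolates the list of retrieved chunks directly into the prompt, so the model sees the Python representation of the list, including brackets and quotes. One possible refinement, shown here only as a sketch, is to join the chunks into a single numbered string first. `format_context` and the separator are assumptions made for this example, and the sample strings are placeholders.

```python
from typing import List


def format_context(chunks: List[str], separator: str = "\n\n---\n\n") -> str:
    """Join retrieved chunks into one numbered context string for the prompt."""
    return separator.join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )


placeholder_chunks = ["first chunk ", " second chunk"]

# Direct interpolation embeds the list's repr, brackets and quotes included
print(f"Context: {placeholder_chunks}")

# Joining first gives the model clean, numbered passages
print(f"Context: {format_context(placeholder_chunks)}")
```

Whether this improves answers depends on the model and the data, which is the kind of question the evaluation runs later in this notebook are meant to answer.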

## Register the App

Set the application metadata, including its name and version, and pass the Snowpark session used to store the experiments.

In [None]:
from trulens.apps.app import TruApp
from trulens.connectors.snowflake import SnowflakeConnector

tru_snowflake_connector = SnowflakeConnector(snowpark_session=session)

app_name = "fed_reserve_rag"
app_version = "cortex_search"

tru_rag = TruApp(
        rag,
        app_name=app_name,
        app_version=app_version,
        connector=tru_snowflake_connector
    )

## Configure and add experiment run

Prepare a set of test queries to evaluate the RAG system.

The test set can be either a dataframe in Python or a table in Snowflake. In this example, we'll use a table in Snowflake.

First, download the [dataset provided](https://github.com/Snowflake-Labs/sfguide-getting-started-with-ai-observability/blob/main/fomc_dataset.csv).

Then, upload `fomc_dataset.csv` to Snowflake:

1. Select Data -> Add Data
2. Choose the tile: Load data into a Table
3. Upload the `fomc_dataset.csv` file downloaded from the GitHub repository linked above.
4. Choose `OBSERVABILITY_DB.OBSERVABILITY_SCHEMA` and create a new table.
5. Name the new table `FOMC_DATA`, then select Next.
6. Update the column names to `QUERY` and `GROUND_TRUTH_RESPONSE`, then select Load.

Set up the configuration for running experiments and add the run to TruLens.

In [None]:
-- This removes the need to perform the previous step manually
CREATE OR REPLACE STAGE "OBSERVABILITY_DB"."OBSERVABILITY_SCHEMA"."MGG_GENAI_OBSERVABILITY_DATA"
 URL = 's3://mggsnowflake/genaiobservabilitydata';

CREATE OR REPLACE TABLE "OBSERVABILITY_DB"."OBSERVABILITY_SCHEMA"."FOMC_DATA" ( QUERY VARCHAR , GROUND_TRUTH_RESPONSE VARCHAR ); 

CREATE TEMP FILE FORMAT "OBSERVABILITY_DB"."OBSERVABILITY_SCHEMA"."CSV"
    TYPE=CSV
    SKIP_HEADER=0
    FIELD_DELIMITER=','
    TRIM_SPACE=TRUE
    FIELD_OPTIONALLY_ENCLOSED_BY='"'
    REPLACE_INVALID_CHARACTERS=TRUE
    DATE_FORMAT=AUTO
    TIME_FORMAT=AUTO
    TIMESTAMP_FORMAT=AUTO; 

copy into "OBSERVABILITY_DB"."OBSERVABILITY_SCHEMA"."FOMC_DATA" 
from @MGG_GENAI_OBSERVABILITY_DATA
file_format=CSV;

In [None]:
from trulens.core.run import Run
from trulens.core.run import RunConfig

run_name = "experiment_1_run"

run_config = RunConfig(
    run_name=run_name,
    dataset_name="FOMC_DATA",
    description="Questions about the Federal Open Market Committee meetings",
    label="fomc_rag_eval",
    source_type="TABLE",
    dataset_spec={
        "input": "QUERY",
        "ground_truth_output":"GROUND_TRUTH_RESPONSE",
    },
)

run: Run = tru_rag.add_run(run_config=run_config)
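The `dataset_spec` maps the run's fields (`input`, `ground_truth_output`) to column names in the dataset table, so a misspelled column name only surfaces once the run starts. A small pre-flight check can catch that earlier. The sketch below assumes unquoted identifiers, which Snowflake stores in upper case; `missing_dataset_columns` is a hypothetical helper.

```python
from typing import Dict, Iterable, List


def missing_dataset_columns(dataset_spec: Dict[str, str], table_columns: Iterable[str]) -> List[str]:
    """Return the columns named in dataset_spec that the table does not have."""
    # Compare case-insensitively (assumes unquoted identifiers)
    available = {name.upper() for name in table_columns}
    return [col for col in dataset_spec.values() if col.upper() not in available]


spec = {"input": "QUERY", "ground_truth_output": "GROUND_TRUTH_RESPONSE"}
print(missing_dataset_columns(spec, ["QUERY", "GROUND_TRUTH_RESPONSE"]))  # []
print(missing_dataset_columns(spec, ["QUERY"]))  # ['GROUND_TRUTH_RESPONSE']
```

In the notebook, the column list could come from `session.table("FOMC_DATA").columns`.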

## Start the run

Start the experiment run with the prepared test set. Doing so will invoke the application in batch using the inputs in the dataset you provided in the run.

In [None]:
import pandas as pd

session.sql(
    f"ALTER WAREHOUSE {session.get_current_warehouse()[1:-1]} \
    SET WAREHOUSE_SIZE='X-Large';"
).collect()

whs = pd.DataFrame(session.sql("show warehouses").collect())[['name', 'size']]
current_wh = session.get_current_warehouse().strip('"')
current_wh_size = whs.loc[whs['name'] == current_wh, 'size'].iloc[0]

print(f"Current Warehouse: {current_wh} ({current_wh_size})")

In [None]:
run.start()

In [None]:
import pandas as pd

session.sql(
    f"ALTER WAREHOUSE {session.get_current_warehouse()[1:-1]} \
    SET WAREHOUSE_SIZE='Small';"
).collect()

whs = pd.DataFrame(session.sql("show warehouses").collect())[['name', 'size']]
current_wh = session.get_current_warehouse().strip('"')
current_wh_size = whs.loc[whs['name'] == current_wh, 'size'].iloc[0]

print(f"Warehouse Downsized: {current_wh} ({current_wh_size})")
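The three cells above scale the warehouse up, start the run, and scale it back down. If `run.start()` raises, the downsizing cell may never be executed and the warehouse stays large. One way to guard against that, sketched here under the assumption that the SQL is executed through a callable you supply, is a context manager with `try`/`finally`. `resized_warehouse` and `run_sql` are hypothetical names.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def resized_warehouse(run_sql: Callable[[str], object], warehouse: str,
                      size: str, restore_size: str) -> Iterator[None]:
    """Scale a warehouse up for the enclosed block, then restore it."""
    run_sql(f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{size}'")
    try:
        yield
    finally:
        # Runs even if the enclosed block raises, so the warehouse is not left large
        run_sql(f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{restore_size}'")


# Hypothetical usage inside the notebook:
# execute = lambda query: session.sql(query).collect()
# with resized_warehouse(execute, "cortex_search_tutorial_wh", "X-Large", "Small"):
#     run.start()
```

Passing the executor in as a callable keeps the helper independent of the session object and easy to try out with a stand-in such as `print`.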

## Compute metrics on the run

In [None]:
run.compute_metrics([
    "answer_relevance",
    "context_relevance",
    "groundedness",
])

## Evaluation Results

To view evaluation results:
* Log in to [Snowsight](https://app.snowflake.com/).
* Navigate to **AI & ML** -> **Evaluations** from the left navigation menu.
* Select the application registered above under the name `fed_reserve_rag` to view its runs, inspect detailed traces, and compare runs.