# Evaluate gen AI apps with Snowflake Cortex AI and TruLens
This notebook demonstrates how AI Observability in Snowflake Cortex AI helps quantitatively measure the performance of RAG applications that use different LLMs. The results give insight into application behavior and help you select the best model for your use case.

### Required Packages
* trulens-core (1.4.5 or above)
* trulens-connectors-snowflake (1.4.5 or above)
* trulens-providers-cortex (1.4.5 or above)
* snowflake.core (1.0.5 or above)


## Session Information
Fetches the current session information and the connection details for the Snowflake account. These connection details are used to ingest application traces and trigger metric computation jobs.
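
Before opening a session, it can help to confirm that every connection variable has a real value. The following is a minimal sketch, assuming the seven environment variable names used in this notebook. `missing_snowflake_vars` is a hypothetical helper written for illustration; it is not part of Snowpark or TruLens.

```python
import os
from typing import Mapping

# Environment variables this notebook reads when building the connection.
REQUIRED_VARS = (
    "SNOWFLAKE_ACCOUNT",
    "SNOWFLAKE_USER",
    "SNOWFLAKE_USER_PASSWORD",
    "SNOWFLAKE_DATABASE",
    "SNOWFLAKE_SCHEMA",
    "SNOWFLAKE_WAREHOUSE",
    "SNOWFLAKE_ROLE",
)


def missing_snowflake_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset, empty, or still '...'.

    Hypothetical helper: treating the "..." placeholder as unset is an
    assumption made for this notebook only.
    """
    return [
        name
        for name in REQUIRED_VARS
        if not env.get(name) or env[name] == "..."
    ]


missing = missing_snowflake_vars()
if missing:
    print("Set these variables before creating the session:", missing)
```

Running this after the environment cell gives an early, readable message instead of a connection error later on.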

In [None]:
import os

os.environ["SNOWFLAKE_ACCOUNT"] = "..."
os.environ["SNOWFLAKE_USER"] = "..."
os.environ["SNOWFLAKE_USER_PASSWORD"] = "..."
os.environ["SNOWFLAKE_DATABASE"] = "..."
os.environ["SNOWFLAKE_SCHEMA"] = "..."
os.environ["SNOWFLAKE_WAREHOUSE"] = "..."
os.environ["SNOWFLAKE_ROLE"] = "..."

In [None]:
from snowflake.snowpark import Session
from trulens.connectors.snowflake import SnowflakeConnector

snowflake_connection_parameters = {
    "account": os.environ["SNOWFLAKE_ACCOUNT"],
    "user": os.environ["SNOWFLAKE_USER"],
    "password": os.environ["SNOWFLAKE_USER_PASSWORD"],
    "database": os.environ["SNOWFLAKE_DATABASE"],
    "schema": os.environ["SNOWFLAKE_SCHEMA"],
    "role": os.environ["SNOWFLAKE_ROLE"],
    "warehouse": os.environ["SNOWFLAKE_WAREHOUSE"],
}
snowpark_session = Session.builder.configs(
    snowflake_connection_parameters
).create()

# A TruSession is no longer required as long as a Snowflake connector exists
sf_connector = SnowflakeConnector(snowpark_session=snowpark_session)

In [None]:
# Simplest Virtual Run approach!
import uuid

from trulens.apps.app import TruApp
from trulens.core.run import RunConfig

APP_NAME = "RAG evaluation run on existing data"
APP_VERSION = "V1"

# Create TruApp with None - no app object or main_method needed!
tru_app = TruApp(
    app=None,  # No app object needed for virtual runs
    app_name=APP_NAME,
    app_version=APP_VERSION,
    connector=sf_connector,
)

# Create run config with dataset specification
run_name = f"virtual_run_{uuid.uuid4()}"

run_config = RunConfig(
    run_name=run_name,
    dataset_name="virtual_run_test",  # Your Snowflake table name
    source_type="TABLE",
    dataset_spec={
        # The dataset_spec maps span attribute paths to column names
        # This creates spans dynamically based on what you define here!
        "record_root.input": "QUERY_STRING",  # Root span input
        "record_root.output": "OUTPUT_STRING",  # Root span output
        "retrieved_contexts": "CONTEXTS",  # Retrieval span contexts
    },
)

# Use the existing add_run() flow
virtual_run = tru_app.add_run(run_config=run_config)

print(f"Created virtual run: {run_name}")

In [None]:
# Start the virtual run - this will create OTEL spans from existing data
virtual_run.start(virtual=True)

## Virtual Run - New Feature!
With the virtual run feature, you can ingest existing data directly into the Event Table without creating a dummy app. This avoids the awkward pattern of wrapping stored data in placeholder app methods.

The example schema used is shown below:
```sql
create table YOUR_TABLE_NAME (
    query_string VARCHAR,
    output_string VARCHAR, 
    contexts VARCHAR
);
```
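
The `dataset_spec` passed to `RunConfig` names these columns in upper case (`QUERY_STRING`, `OUTPUT_STRING`, `CONTEXTS`), because Snowflake stores unquoted identifiers in upper case. The sketch below checks that every column named in a `dataset_spec` exists in the table. `unmatched_columns` is a hypothetical helper, and the case-insensitive comparison is a simplification that assumes the table was created with unquoted column names.

```python
from typing import Iterable, Mapping


def unmatched_columns(
    dataset_spec: Mapping[str, str], table_columns: Iterable[str]
) -> list[str]:
    """Return dataset_spec column names that are absent from the table.

    Hypothetical helper: compares names case-insensitively, which is only
    valid for unquoted Snowflake identifiers.
    """
    available = {column.upper() for column in table_columns}
    return [
        column
        for column in dataset_spec.values()
        if column.upper() not in available
    ]


# The mapping used in this notebook and the columns from the schema above.
dataset_spec = {
    "record_root.input": "QUERY_STRING",
    "record_root.output": "OUTPUT_STRING",
    "retrieved_contexts": "CONTEXTS",
}
table_columns = ["query_string", "output_string", "contexts"]

print(unmatched_columns(dataset_spec, table_columns))  # prints []
```

An empty list means every mapped column is present; any names returned point to a typo in either the table definition or the mapping.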


In [None]:
# Check virtual run status
import time

while virtual_run.get_status() == "INVOCATION_IN_PROGRESS":
    print("Waiting for ingestion to complete...")
    time.sleep(2)

print(f"Virtual run status: {virtual_run.get_status()}")
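
The loop above polls until the status changes, which can wait indefinitely if ingestion stalls. A minimal sketch of the same polling with a timeout follows. `wait_while` is a hypothetical helper, and it assumes the status object compares equal to a plain string, as the loop above already does.

```python
import time
from typing import Callable


def wait_while(
    get_status: Callable[[], str],
    busy_status: str,
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it stops returning busy_status.

    Hypothetical helper: raises TimeoutError if the status is still
    busy_status once timeout_s seconds have elapsed.
    """
    deadline = time.monotonic() + timeout_s
    status = get_status()
    while status == busy_status:
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Status still {busy_status!r} after {timeout_s} seconds"
            )
        sleep(interval_s)
        status = get_status()
    return status
```

In this notebook it could be called as `wait_while(virtual_run.get_status, "INVOCATION_IN_PROGRESS")`; the timeout and interval defaults are arbitrary choices, not recommended values.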

## Compute Metrics

Computes the RAG triad metrics (answer relevance, context relevance, and groundedness) for the virtual run to measure the response quality of the RAG application.

In [None]:
# Compute metrics for the virtual run
virtual_run.compute_metrics([
    "answer_relevance",
    "context_relevance",
    "groundedness",
])

print("Metrics computation started for virtual run!")

## Evaluation Results

To view evaluation results:
* Log in to [Snowsight](https://app.snowflake.com/).
* Navigate to **AI & ML** -> **Evaluations** from the left navigation menu.
* Select “RAG evaluation run on existing data” to view the runs, see detailed traces, and compare runs.