# Basalt Observability End-to-End Playbook

This notebook shows how to combine Basalt's observability helpers with common LLM flows, evaluators,
experiment tagging, and Google AI Studio (Gemini) calls. Each scenario uses the OpenTelemetry-based
tracing built into the Basalt SDK.

## Prerequisites

- Install the SDK in editable mode with dev extras: `uv pip install -e ".[dev]"`
- (Optional) Install the Google Gen AI SDK, which provides the `from google import genai` import used below: `pip install google-genai`
- Set the following environment variables before running the notebook:
  - `BASALT_API_KEY` – your Basalt API key
  - `GOOGLE_API_KEY` – Google AI Studio key (Gemini)
  - `TRACELOOP_TRACE_CONTENT=1` if you want prompts/completions captured
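A quick preflight check can catch missing variables before any cell runs. The sketch below is a minimal example that uses only the standard library; `missing_env_vars` is a hypothetical helper, not part of the Basalt SDK.

```python
import os
from collections.abc import Iterable, Mapping


def missing_env_vars(required: Iterable[str], env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]


# BASALT_API_KEY and GOOGLE_API_KEY are the variables listed above.
missing = missing_env_vars(["BASALT_API_KEY", "GOOGLE_API_KEY"])
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

The `env` parameter exists so the check can be exercised with a plain dict instead of the real process environment.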


In [None]:
import os

from basalt import Basalt
from basalt.observability import evaluate, observe, start_observe

try:
    from google import genai
except ImportError:  # pragma: no cover - optional dependency
    genai = None

## 1. Configure the Basalt client

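Initialize the Basalt client with a telemetry configuration. The example exports spans over gRPC to a local OTLP collector at `127.0.0.1:4317`. Later sections use `start_observe` to create root spans with identity and experiment tracking.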

In [None]:
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

from basalt import TelemetryConfig

# Create telemetry configuration
exporter = OTLPSpanExporter(endpoint="http://127.0.0.1:4317", insecure=True)
telemetry = TelemetryConfig(service_name="notebook", exporter=exporter)

# Initialize Basalt client
basalt_client = Basalt(
    api_key=os.getenv("BASALT_API_KEY", "not-set"),
    telemetry_config=telemetry,
)

basalt_client
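When the collector address differs between machines, the endpoint could be read from the environment instead of being hard-coded. The sketch below assumes the standard OpenTelemetry variable `OTEL_EXPORTER_OTLP_ENDPOINT`. The helper `resolve_otlp_endpoint` is hypothetical, and treating every non-`https` scheme as insecure is a simplification made here.

```python
import os
from collections.abc import Mapping
from urllib.parse import urlparse

# The collector address used in this notebook.
DEFAULT_ENDPOINT = "http://127.0.0.1:4317"


def resolve_otlp_endpoint(env: Mapping[str, str] = os.environ) -> tuple[str, bool]:
    """Return (endpoint, insecure), falling back to the notebook default."""
    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "").strip() or DEFAULT_ENDPOINT
    # Assumption: only an explicit https:// scheme means TLS should be used.
    insecure = urlparse(endpoint).scheme != "https"
    return endpoint, insecure


endpoint, insecure = resolve_otlp_endpoint()
print(endpoint, insecure)
```

The returned pair can then be passed along as `OTLPSpanExporter(endpoint=endpoint, insecure=insecure)`.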

## 2. Decorator-driven LLM spans with identity and evaluators

Use `@start_observe` as the root span decorator with identity tracking, and `@evaluate` to attach evaluator slugs to the span of the function it decorates.

In [None]:
@evaluate(slugs=["accuracy", "toxicity"])
def summarize_with_gemini(prompt: str, *, model: str = "gemini-2.5-flash-lite") -> str | None:
    """
    Summarize text using Gemini with automatic tracing.

    The @evaluate decorator attaches evaluators to the span.
    """
    if genai is None:
        raise RuntimeError("google-genai is not installed")

    client = genai.Client(api_key=os.getenv("GOOGLE_API_KEY", "fake-key"))
    response = client.models.generate_content(model=model, contents=prompt)
    return response.text

# Wrap the call in a root span with identity
@start_observe(
    name="notebook.workflow",
    identity={
        "organization": {"id": "123", "name": "ACME"},
        "user": {"id": "456", "name": "John Doe"}
    },
    experiment={"id": "exp-observability", "name": "demo-agent"},
    metadata={"environment": "notebook", "workspace": "demo"}
)
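def run_gemini_demo():
    """Run the summary inside the root span; return the text or an error dict."""
    try:
        return summarize_with_gemini("Summarize the benefits of synthetic monitoring.")
    except Exception as exc:
        # Record the failure on the active span, then hand the error to the caller.
        observe.fail(exc)
        return {"error": str(exc)}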

gemini_result = run_gemini_demo()
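Because `run_gemini_demo` can return a summary string, `None`, or an error dict, downstream cells need to tell the three apart. The following is a minimal sketch; `describe_result` is a hypothetical helper, and the sample values stand in for a real `gemini_result`.

```python
from typing import Optional, Union


def describe_result(result: Optional[Union[str, dict]]) -> str:
    """Summarize the three possible outcomes of the demo call."""
    if isinstance(result, dict):
        # The demo returns {"error": "..."} when the call raised.
        return f"failed: {result.get('error', 'unknown error')}"
    if not result:
        # None or an empty string: the model produced no text.
        return "empty: the model returned no text"
    # Truncate long summaries so the notebook output stays readable.
    return f"ok: {result[:80]}"


print(describe_result({"error": "google-genai is not installed"}))
print(describe_result("Synthetic monitoring detects outages early."))
```

In the notebook, the same helper would be called as `describe_result(gemini_result)`.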

## 3. Manual spans for orchestrating workflow stages

Use `start_observe` for the root workflow span, then nest `observe(kind=...)` for retrieval, tool, generation, and event spans.

In [None]:
with start_observe(
    name="workflow.rag",
    identity={
        "organization": {"id": "123", "name": "ACME"},
        "user": {"id": "456", "name": "John Doe"}
    },
    # Attach only the experiment id; descriptive fields go in metadata
    experiment={"id": "exp-rag-001"},
    metadata={
        "feature": "support-bot",
        "experiment.name": "support-bot"
    }
):
    # Set evaluators on root span
    observe.evaluate("latency-budget")
    observe.input({"query": "error connecting to database"})

    # Step 1: Retrieval span for vector database search
    with observe(kind="retrieval", name="workflow.rag.retrieve") as ret_span:
        ret_span.set_attribute("query", "database_error")
        ret_span.set_attribute("results_count", 3)
        ret_span.set_attribute("top_k", 5)


    # Step 2: Tool span for external tool call (e.g., web search)
    with observe(kind="tool", name="workflow.rag.tool") as tool_span:
        tool_span.set_attribute("tool_name", "web-search")
        tool_span.set_input({"query": "database connection refused troubleshooting"})
        tool_span.set_output({"summary": "Check credentials and firewall rules."})

    # Step 3: LLM generation span for final answer
    with observe(kind="generation", name="workflow.rag.answer") as llm_span:
        llm_span.set_attribute("model", "gemini-1.5-flash")
        llm_span.set_attribute("prompt", "Provide mitigation steps")
        llm_span.set_attribute("completion", "1. Verify credentials...")
        llm_span.set_attribute("tokens_input", 200)
        llm_span.set_attribute("tokens_output", 180)

    # Step 4: Event span for workflow state tracking
    with observe(kind="event", name="workflow.rag.event") as event_span:
        event_span.set_attribute("event_type", "handoff")
        event_span.set_attribute("payload", {"team": "support", "status": "ready"})

    observe.output({"status": "completed", "steps": 4})
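OpenTelemetry attribute values are restricted to strings, booleans, numbers, and sequences of those. The notebook does not show whether Basalt's span handle serializes a dict such as the `payload` set on the event span. If it does not, one option is to flatten nested values into dotted keys before calling `set_attribute`. The helper `flatten_attributes` below is a hypothetical sketch of that idea, not part of the Basalt SDK.

```python
import json
from typing import Any


def flatten_attributes(prefix: str, value: Any) -> dict[str, Any]:
    """Flatten nested dicts into dotted keys that map to primitive values."""
    if isinstance(value, dict):
        flat: dict[str, Any] = {}
        for key, item in value.items():
            flat.update(flatten_attributes(f"{prefix}.{key}", item))
        return flat
    if isinstance(value, (str, bool, int, float)):
        return {prefix: value}
    # Anything else (None, lists, arbitrary objects) is stored as JSON text.
    return {prefix: json.dumps(value, default=str)}


attributes = flatten_attributes("payload", {"team": "support", "status": "ready"})
print(attributes)

# Inside the event span, each pair could then be set individually:
# for key, val in attributes.items():
#     event_span.set_attribute(key, val)
```

Dotted keys keep each field individually queryable in most tracing backends, whereas a single JSON string would have to be parsed first.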

## 4. Dynamic experiment and metadata updates

Use `observe.experiment()` to attach or override experiment metadata, and `observe.metadata()` to add custom attributes to the current span.

In [None]:
with start_observe(
    name="workflow.ab-test",
    identity={
        "organization": {"id": "123", "name": "ACME"},
        "user": {"id": "456", "name": "John Doe"}
    },
    # Attach only the experiment id; name and feature slug go in metadata
    experiment={"id": "exp-baseline"},
    metadata={
        "experiment.name": "baseline-playbook",
        "feature.slug": "demo-agent"
    }
):
    # Override experiment dynamically (variant stored as attribute on override span)
    observe.experiment("exp-variant", variant="demo-agent-b")

    # Add evaluators and metadata
    observe.evaluate("judge-hallucination")

    observe.metadata({"latency_ms": 245, "variant_override": True})
    observe.output({"status": "ab_test_complete"})
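The cell above hard-codes the override to `exp-variant`. In a real A/B test the variant is usually chosen per user, and the choice should be stable across runs. The sketch below shows one common approach, hashing the user id into a bucket. `pick_variant` and its `salt` parameter are hypothetical names introduced for illustration.

```python
import hashlib
from collections.abc import Sequence


def pick_variant(user_id: str, variants: Sequence[str], salt: str = "ab-test") -> str:
    """Deterministically map a user id to one of the given variants."""
    if not variants:
        raise ValueError("at least one variant is required")
    # Hash salt + user id so the same user always lands in the same bucket.
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


# "456" is the user id from the identity block; the experiment ids come from the cell above.
experiment_id = pick_variant("456", ["exp-baseline", "exp-variant"])
print(experiment_id)
```

The result could be passed to `observe.experiment(...)` in place of the literal id. Changing `salt` reshuffles users between buckets, which is useful when starting a new experiment.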

## 5. Shutdown and cleanup

Call `shutdown()` once the workflows complete so that buffered telemetry is flushed before the process exits.

In [None]:
basalt_client.shutdown()
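If a cell raises before reaching the shutdown call, buffered spans may never be sent. A small context manager can guarantee that `shutdown()` runs on every exit path. This is a sketch under stated assumptions: `shutdown_on_exit` is a hypothetical helper, and `StandInClient` only imitates the `shutdown()` method so the example runs without the SDK.

```python
from collections.abc import Iterator
from contextlib import contextmanager
from typing import Any


@contextmanager
def shutdown_on_exit(client: Any) -> Iterator[Any]:
    """Yield the client and always call its shutdown() afterwards."""
    try:
        yield client
    finally:
        client.shutdown()


class StandInClient:
    """Stand-in for the real client; it only records that shutdown() ran."""

    def __init__(self) -> None:
        self.closed = False

    def shutdown(self) -> None:
        self.closed = True


stand_in = StandInClient()
try:
    with shutdown_on_exit(stand_in):
        raise RuntimeError("simulated workflow failure")
except RuntimeError:
    pass

print(stand_in.closed)
```

With the real client, the workflows from the earlier sections would run inside `with shutdown_on_exit(basalt_client):`.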