# Basalt Observability End-to-End Playbook

This notebook shows how to combine Basalt's observability helpers with common LLM flows, evaluators,
experiment tagging, and Google AI Studio (Gemini) calls. Each scenario uses the OpenTelemetry-based
tracing patterns provided by the Basalt SDK.

## Prerequisites

- Install the SDK in editable mode with dev extras: `uv pip install -e ".[dev]"`
- (Optional) Install the Google Gen AI SDK, which provides the `google.genai` module imported below: `pip install google-genai`
- Set the following environment variables before running the notebook:
  - `BASALT_API_KEY` – your Basalt API key
  - `GOOGLE_API_KEY` – Google AI Studio key (Gemini)
  - `TRACELOOP_TRACE_CONTENT=1` if you want prompts/completions captured
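
Before running the cells, it can help to confirm that the required variables are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical and not part of the Basalt SDK.

```python
import os
from typing import Mapping, Optional, Sequence

# Variable names taken from the prerequisites list above.
REQUIRED_VARS = ("BASALT_API_KEY", "GOOGLE_API_KEY")


def missing_env_vars(
    required: Sequence[str] = REQUIRED_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> list:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

`GOOGLE_API_KEY` only matters for the Gemini cell, so a missing value there is not fatal for the other scenarios.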


In [None]:
import os
from datetime import datetime, timezone

from basalt import Basalt
from basalt.observability import (
    add_default_evaluators,
    attach_trace_experiment,
    configure_trace_defaults,
    trace_event,
    trace_generation,  # Modern naming (replaces trace_llm_call)
    trace_retrieval,
    trace_span,
    trace_tool,
)
from basalt.observability.decorators import trace_generation as trace_generation_decorator

try:
    from google import genai
except ImportError:  # pragma: no cover - optional dependency
    genai = None

## 1. Configure the Basalt client and default trace context

We prime global defaults so that every span carries user, organization, and evaluator metadata.


In [None]:
# Configure defaults before instantiating the client
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

from basalt import TelemetryConfig

configure_trace_defaults(
    user={"id": "user-notebook", "name": "Analyst"},
    organization={"id": "org-research", "name": "Basalt Research"},
    metadata={"environment": "notebook", "workspace": "demo"},
    evaluators=["accuracy"],
)
# add_default_evaluators takes *args, so pass as a positional argument
add_default_evaluators("toxicity")

exporter = OTLPSpanExporter(endpoint="http://127.0.0.1:4317", insecure=True)
telemetry = TelemetryConfig(service_name="notebook", exporter=exporter)

basalt_client = Basalt(
    api_key=os.getenv("BASALT_API_KEY", "not-set"),
    trace_experiment={"id": "exp-observability", "feature_slug": "demo-agent"},
    trace_metadata={"notebook": "observability-playbook"},
    telemetry_config=telemetry,
)

basalt_client
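
The exporter above points at a local OTLP gRPC endpoint (`127.0.0.1:4317`). If nothing is listening there, spans may never reach a collector. As a rough pre-flight check, the sketch below tests plain TCP reachability with the standard library. It does not verify that the listener speaks OTLP, and the helper name `otlp_endpoint_reachable` is hypothetical.

```python
import socket


def otlp_endpoint_reachable(
    host: str = "127.0.0.1", port: int = 4317, timeout: float = 1.0
) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if not otlp_endpoint_reachable():
    print("No listener on 127.0.0.1:4317; spans may not be exported.")
```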

## 2. Decorator-driven LLM spans with evaluators

The `@trace_generation` decorator (imported above as `trace_generation_decorator`) automatically captures prompts, completions, and, when available, token usage for LLM calls.

In [None]:
@trace_generation_decorator(name="notebook.gemini.summarize")
def summarize_with_gemini(prompt: str, *, model: str = "gemini-2.5-flash-lite") -> str | None:
    """
    Summarize text using Gemini with automatic tracing.

    The @trace_generation decorator creates an LLM span and captures:
    - The prompt text
    - The completion/response
    - Model information
    - Token usage (if available)
    """
    if genai is None:
        raise RuntimeError("google-genai is not installed")

    client = genai.Client(api_key=os.getenv("GOOGLE_API_KEY", "fake-key"))
    response = client.models.generate_content(model=model, contents=prompt)
    # Return only the generated text of the response
    return response.text

try:
    gemini_result = summarize_with_gemini("Summarize the benefits of synthetic monitoring.")
except Exception as exc:
    gemini_result = {"error": str(exc)}

gemini_result
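
To build intuition for what a tracing decorator does, here is a toy, standard-library sketch that records the prompt, result, and duration of each call in an in-memory list. It is not Basalt's implementation; `record_generation`, `RECORDED_SPANS`, and `echo_model` are hypothetical names used only for illustration.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory sink standing in for a real span exporter.
RECORDED_SPANS: List[Dict[str, Any]] = []


def record_generation(name: str) -> Callable:
    """Toy decorator: record prompt, result or error, and duration per call."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(prompt: str, *args: Any, **kwargs: Any) -> Any:
            span: Dict[str, Any] = {"name": name, "prompt": prompt}
            start = time.perf_counter()
            try:
                result = func(prompt, *args, **kwargs)
                span["completion"] = result
                return result
            except Exception as exc:
                span["error"] = repr(exc)
                raise
            finally:
                # Runs on success and on failure, so every call is recorded.
                span["duration_s"] = time.perf_counter() - start
                RECORDED_SPANS.append(span)

        return wrapper

    return decorator


@record_generation("demo.echo")
def echo_model(prompt: str) -> str:
    # Stand-in for a real LLM call.
    return prompt.upper()


echo_model("hello")
print(RECORDED_SPANS[-1]["name"], RECORDED_SPANS[-1]["completion"])
```

The wrapped function keeps its own signature and return value, which is why the decorated `summarize_with_gemini` above can still be called like an ordinary function.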

## 3. Manual spans for orchestrating workflow stages

We combine `trace_span`, `trace_retrieval`, `trace_tool`, `trace_generation`, and `trace_event` to follow a retrieval-augmented generation (RAG) pipeline.


In [None]:
with trace_span("workflow.rag", attributes={"feature": "support-bot"}) as span:
    span.add_evaluator("latency-budget")
    span.set_experiment("exp-rag-001", feature_slug="support-bot")

    # Step 1: Retrieval span for vector database search
    with trace_retrieval("workflow.rag.retrieve") as ret_span:
        ret_span.set_query("error connecting to database")
        ret_span.set_results_count(3)
        ret_span.set_top_k(5)

    # Step 2: Tool span for external tool call (e.g., web search)
    with trace_tool("workflow.rag.tool") as tool_span:
        tool_span.set_tool_name("web-search")
        tool_span.set_input({"query": "database connection refused troubleshooting"})
        tool_span.set_output({"summary": "Check credentials and firewall rules."})

    # Step 3: LLM generation span for final answer
    with trace_generation("workflow.rag.answer") as llm_span:
        llm_span.set_model("gemini-1.5-flash")
        llm_span.set_prompt("Provide mitigation steps")
        llm_span.set_completion("1. Verify credentials...")
        llm_span.set_tokens(input=200, output=180)

    # Step 4: Event span for workflow state tracking
    with trace_event("workflow.rag.event") as event_span:
        event_span.set_event_type("handoff")
        event_span.set_payload({"team": "support", "status": "ready"})

## 4. Experiments and trace enrichment APIs

Attach experiment metadata globally, then override within the active span when needed.


In [None]:
attach_trace_experiment("exp-baseline", name="baseline-playbook", feature_slug="demo-agent")

with trace_span("workflow.ab-test") as span:
    span.set_experiment("exp-variant", name="variant-b", feature_slug="demo-agent-b")
    span.add_evaluator("judge-hallucination")
    # Use timezone-aware datetime (modern best practice)
    span.add_event("scoring_started", {"timestamp": datetime.now(timezone.utc).isoformat()})
    span.set_attribute("basalt.metric.latency_ms", 245)
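
The event payload above stores a timezone-aware UTC timestamp as an ISO 8601 string. A short standard-library sketch of why that matters: the offset survives a round trip, so whoever reads the event later does not have to guess the timezone.

```python
from datetime import datetime, timezone

stamp = datetime.now(timezone.utc).isoformat()
parsed = datetime.fromisoformat(stamp)

# The parsed value is still timezone-aware, with a zero UTC offset.
print(parsed.tzinfo is not None, parsed.utcoffset())
```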

## 5. Shutdown and cleanup

Flush telemetry buffers when the workflow completes.

In [None]:
basalt_client.shutdown()
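
If an earlier cell raises, the shutdown cell may never run and buffered spans could be lost. One way to guard against that in script form is a small context manager that always calls `shutdown()`. This is a sketch: `shutdown_on_exit` is a hypothetical helper, and `FakeClient` stands in for the real client so the example runs without the SDK.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def shutdown_on_exit(client: Any) -> Iterator[Any]:
    """Yield the client and always call client.shutdown() afterwards."""
    try:
        yield client
    finally:
        client.shutdown()


class FakeClient:
    """Stand-in for the Basalt client; only shutdown() is modeled."""

    def __init__(self) -> None:
        self.closed = False

    def shutdown(self) -> None:
        self.closed = True


fake = FakeClient()
try:
    with shutdown_on_exit(fake):
        raise RuntimeError("workflow failed")
except RuntimeError:
    pass

print(fake.closed)  # prints True: shutdown ran despite the error
```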
