# Tracing LlamaIndex with OTEL Spans using _TruLens_

This notebook demonstrates the experimental "otel-tracing" feature in _TruLens_,
which collects _OpenTelemetry_ (OTEL) spans during app execution. Data collected
by _TruLens_ is recorded as spans, and spans created by other tools can be made
available alongside them. Spans can be exported via an OTEL exporter to other
tools in the ecosystem.

- Spans demonstrated in this notebook are:

  - OTEL `sqlalchemy` module instrumentation. Note that `sqlalchemy` is used
    internally by _TruLens_ for storage.

  - OTEL `requests` module instrumentation. `requests` is used by TruLens to
    make requests in the _HuggingFace_ provider.

  - _Traceloop_ LlamaIndex and OpenAI instrumentation. See
    [OpenLLMetry](https://github.com/traceloop/openllmetry) for other
    instrumentation supported by _Traceloop_.

  - Arize _OpenInference_ LlamaIndex instrumentation. See
    [OpenInference](https://github.com/Arize-ai/openinference) for other
    instrumentation supported by _OpenInference_.

- OTEL exporters demonstrated in this notebook are:

  - Console exporter (prints exported spans in the console or stream).

  - In-memory exporter. This stores spans in a Python list that you can access
    in this notebook.

  - _Zipkin_ exporter. Setup below includes `docker` commands to download and
    start a _Zipkin_ collector for demonstration purposes. To open the UI for
    this exporter, open _Docker Desktop_, click on the triple dots under
    "Actions" for the zipkin container and select "Open with browser".
  

In [None]:
# python deps, OTEL:
# ! pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp

# OTEL contrib instrumentors
# ! pip install opentelemetry-instrumentation-sqlalchemy opentelemetry-instrumentation-requests

# Traceloop instrumentors
# ! pip install opentelemetry-instrumentation-llamaindex opentelemetry-instrumentation-openai

# Arize openinference instrumentors
# ! pip install "openinference-instrumentation-llama-index>=2"

# OTEL zipkin exporter
# ! pip install opentelemetry-exporter-zipkin-proto-http

# Start the zipkin docker container:
# ! docker run --rm -d -p 9411:9411 --name zipkin openzipkin/zipkin

# Stop the zipkin docker container:
# ! docker stop $(docker ps -a -q --filter ancestor=openzipkin/zipkin)

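Before running the import cell, it can help to confirm that the optional packages are importable. The following is a minimal sketch that uses only the standard library; the helper name `is_importable` is hypothetical, and the module list is taken from the imports used later in this notebook.

```python
import importlib.util

# Module paths imported later in this notebook.
REQUIRED_MODULES = [
    "opentelemetry.sdk",
    "opentelemetry.instrumentation.sqlalchemy",
    "opentelemetry.instrumentation.requests",
    "opentelemetry.instrumentation.llamaindex",
    "opentelemetry.instrumentation.openai",
    "openinference.instrumentation.llama_index",
    "opentelemetry.exporter.zipkin.json",
]


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located without fully importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package of a dotted name is missing or broken.
        return False


missing = [name for name in REQUIRED_MODULES if not is_importable(name)]

if missing:
    print("Missing modules (see the pip commands above):")
    for name in missing:
        print(f"  - {name}")
else:
    print("All instrumentation and exporter modules are importable.")
```
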
In [None]:
# ruff: noqa: F401
from io import StringIO
import json
import os
import re
import urllib.request

import dotenv
from llama_index.core import Settings
from llama_index.core import SimpleDirectoryReader
from llama_index.core import VectorStoreIndex
from llama_index.llms.openai import OpenAI
# arize openinference instrumentor
from openinference.instrumentation.llama_index import (
    LlamaIndexInstrumentor as oi_LlamaIndexInstrumentor,
)
from opentelemetry import trace
from opentelemetry.exporter.zipkin.json import ZipkinExporter  # zipkin exporter
from opentelemetry.instrumentation.llamaindex import (
    LlamaIndexInstrumentor,  # traceloop instrumentors
)
from opentelemetry.instrumentation.openai import (
    OpenAIInstrumentor,  # traceloop instrumentors
)
from opentelemetry.instrumentation.requests import (
    RequestsInstrumentor,  # otel contrib instrumentors
)
from opentelemetry.instrumentation.sqlalchemy import (
    SQLAlchemyInstrumentor,  # otel contrib instrumentors
)
from opentelemetry.sdk.trace.export import (
    ConsoleSpanExporter,  # console exporter
)
from opentelemetry.sdk.trace.export.in_memory_span_exporter import (
    InMemorySpanExporter,  # in-memory exporter
)
from trulens.apps.llamaindex import TruLlama
from trulens.core import Feedback
from trulens.core import Select
from trulens.core.session import TruSession
from trulens.experimental.otel_tracing.core.trace import TracerProvider
from trulens.providers.huggingface import Huggingface

# This is needed due to zipkin issues related to protobuf.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

dotenv.load_dotenv()

In [None]:
# Sets the global default tracer provider to be the trulens one.
trace.set_tracer_provider(TracerProvider())

# Creates a tracer for custom spans below.
tracer = trace.get_tracer(__name__)

In [None]:
# Download some base data for query engine.

url = "https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt"
file_path = "data/paul_graham_essay.txt"

os.makedirs("data", exist_ok=True)

if not os.path.exists(file_path):
    urllib.request.urlretrieve(url, file_path)

In [None]:
# Setup in-memory span exporter.
exporter = InMemorySpanExporter()

# Setup console/file/string exporter
# stream = StringIO()

# Will print A LOT to stdout unless we set a different stream.
# exporter = ConsoleSpanExporter(out=stream)

# Setup zipkin exporter
# exporter = ZipkinExporter(endpoint="http://localhost:9411/api/v2/spans")

# Create a TruLens session.
session = TruSession()

# To export spans to an external OTEL SpanExporter tool, set it here:
session.experimental_otel_exporter = exporter

# (Optional) Enable otel_tracing. Note that this is not required if you set the
# exporter above. If you would like to trace using spans without an exporter,
# this step is required.
session.experimental_enable_feature("otel_tracing")

session.reset_database()
session.start_dashboard()

In [None]:
# enable otel contrib instrumentation
SQLAlchemyInstrumentor().instrument()
RequestsInstrumentor().instrument()

# enable traceloop instrumentation
LlamaIndexInstrumentor().instrument()
OpenAIInstrumentor().instrument()

# enable arize open inference instrumentation
oi_LlamaIndexInstrumentor().instrument()

In [None]:
# Create query engine

Settings.llm = OpenAI()

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine(similarity_top_k=3)

In [None]:
# Create a feedback function and wrap app with trulens recorder.

provider = Huggingface()

f_lang_match = (
    Feedback(provider.language_match)
    .on(
        Select.RecordSpans.trulens.call.query.attributes[
            "trulens.bindings"
        ].str_or_query_bundle
    )
    .on(
        Select.RecordSpans.trulens.call.query.attributes["trulens.ret"].response
    )
)
# The parts of the selector are:
#
# - Select.RecordSpans - The spans organized by span name.
#
# - trulens.call.query - The span name we are interested in. TruLens names all
#   call spans with the name "trulens.call.<methodname>".
#
# - attributes - the attributes of the span.
#
# - "trulens.bindings" - The attribute we are interested in. TruLens puts the
#   call arguments in the attribute called "trulens.bindings".
#
#    - str_or_query_bundle - The call argument.
#
# - "trulens.ret" - The return value of the method call.
#
#    - response - The response key assuming the return value is a dictionary.
#
# - (not shown) "trulens.error" - For calls that do not return and raise an
#   exception instead, that exception is stored in this attribute.

tru_query_engine_recorder = TruLlama(
    query_engine,
    app_name="LlamaIndex_App",
    app_version="base",
    feedbacks=[f_lang_match],
)

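The selector walk-through above can be modeled with plain dictionaries. This is only an illustrative sketch, not the TruLens implementation: it assumes spans are nested under the dot-separated parts of their names, and the helpers `index_spans_by_name` and `select` as well as the sample span contents are hypothetical.

```python
from functools import reduce
from typing import Any, Dict, List


def index_spans_by_name(spans: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Nest each span's attributes under the dot-separated parts of its name."""
    root: Dict[str, Any] = {}
    for span in spans:
        node = root
        for part in span["name"].split("."):
            node = node.setdefault(part, {})
        node["attributes"] = span["attributes"]
    return root


def select(tree: Dict[str, Any], *path: str) -> Any:
    """Follow `path` one key at a time, like chained selector steps."""
    return reduce(lambda node, key: node[key], path, tree)


# Hypothetical spans shaped like the ones described in the comments above.
spans = [
    {"name": "trulens.recording", "attributes": {}},
    {
        "name": "trulens.call.query",
        "attributes": {
            "trulens.bindings": {
                "str_or_query_bundle": "What did the author do growing up?"
            },
            "trulens.ret": {"response": "<response text>"},
        },
    },
]

tree = index_spans_by_name(spans)

# Name parts are separate steps; "trulens.bindings" is a single attribute key.
question = select(
    tree,
    "trulens", "call", "query",
    "attributes", "trulens.bindings", "str_or_query_bundle",
)
answer = select(
    tree, "trulens", "call", "query", "attributes", "trulens.ret", "response"
)
print(question)
print(answer)
```

In this model the span name contributes three steps (`trulens`, `call`, `query`), while the attribute key `"trulens.bindings"` contains a dot and is looked up as one key, which mirrors why the selector above uses bracket indexing for it.
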
In [None]:
# Normal trulens recording usage

with tru_query_engine_recorder as recording:
    # Custom spans can be included:
    with tracer.start_as_current_span("Querying LlamaIndex") as span:
        # With custom attributes.
        span.set_attribute("custom_attribute", "This can be anything.")

        # Query the engine as normal.
        res = query_engine.query("What did the author do growing up?")

In [None]:
# Get the record from the recording.

rec = recording.get()

In [None]:
# Check the feedback result.

rec.feedback_results[0].result()

In [None]:
# Show all spans in the record. Here we are using a selector to retrieve the
# spans from within the record.

rec.get(Select.RecordSpans)

# Alternatively, spans can be accessed directly in the record as a list. The
# above indexes them by name instead.

# rec.experimental_otel_spans

In [None]:
# Check the attributes we used to define the feedback functions.

print(
    rec.get(
        Select.RecordSpans.trulens.call.query.attributes[
            "trulens.bindings"
        ].str_or_query_bundle
    )
)

print(
    rec.get(
        Select.RecordSpans.trulens.call.query.attributes["trulens.ret"].response
    )
)

In [None]:
# All of the spans listed above should be visible in the chosen exporter.

# The InMemorySpanExporter stores the spans in memory. Let's read them back here
# to inspect them:

if "exporter" in locals():
    print(f"Spans exported to {exporter}:")

    if isinstance(exporter, InMemorySpanExporter):
        spans = exporter.get_finished_spans()

        for span in spans:
            print(span.name)

    # The ConsoleSpanExporter writes json dumps of each span. Let's read those
    # back here to inspect them:

    elif isinstance(exporter, ConsoleSpanExporter):
        match_root_json = re.compile(r"(?:(^|\n))\{.+?\n\}", re.DOTALL)

        if "stream" in locals():
            dumps = match_root_json.finditer(stream.getvalue())  # noqa: F821

            for dump in dumps:
                span = json.loads(dump.group())
                print(span["name"])

    elif isinstance(exporter, ZipkinExporter):
        print(
            "The spans should be visible in the zipkin dashboard at http://localhost:9411/zipkin/"
        )

# This should include:
#
# - 0: a special span made by TruLens that indicates a recording context. This
#   is named "trulens.recording".
#
# - 1: the custom span entitled "Querying LlamaIndex" made above.
#
# - 2: the span made by TruLens that corresponds to the call to
#  `query_engine.query`.
#
# - 3,4: two of the spans produced by the two LlamaIndex instrumentors that
#   represent that same call.
#
# - A bunch more spans.

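The regular expression used for the console exporter can be tried out without running the app. The sketch below assumes each span is written as an indented JSON object followed by a newline, so only the root object's closing brace sits at the start of a line; the span dictionaries here are hypothetical stand-ins, not real exporter output.

```python
import json
import re
from io import StringIO

# Same pattern as in the cell above: a root-level "{" at the start of the
# text or of a line, up to the first "}" that starts a line.
match_root_json = re.compile(r"(?:(^|\n))\{.+?\n\}", re.DOTALL)

# Simulate a stream holding several pretty-printed JSON objects back to back.
stream = StringIO()
for name in ["trulens.recording", "Querying LlamaIndex"]:
    fake_span = {
        "name": name,
        "context": {"trace_id": "0x01", "span_id": "0x02"},
        "attributes": {},
    }
    stream.write(json.dumps(fake_span, indent=4) + "\n")

# Nested closing braces are indented, so they never match "\n}".
names = [
    json.loads(match.group())["name"]
    for match in match_root_json.finditer(stream.getvalue())
]
print(names)
```

A match after the first one includes the leading newline; `json.loads` ignores leading whitespace, so no extra stripping is needed.
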
In [None]:
# Check the spans produced by TruLens. Note that span instances created by TruLens
# are represented as:
#
#  <class name>(<name>, <trace_id>/<span_id> -> <parent trace_id>/<parent span_id>)
#
# where trace_id and span_id are only the last byte of each for easier readability.

rec.get(Select.RecordSpans.trulens)

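To make the abbreviated representation concrete, here is a small sketch of how such a string could be built from integer IDs. The helpers `last_byte` and `short_repr` are hypothetical, and rendering the last byte as two hex digits is an assumption; the exact formatting used by TruLens may differ.

```python
def last_byte(identifier: int) -> str:
    """Return the lowest byte of an ID as two hex digits (assumed format)."""
    return f"{identifier & 0xFF:02x}"


def short_repr(
    class_name: str,
    name: str,
    trace_id: int,
    span_id: int,
    parent_trace_id: int,
    parent_span_id: int,
) -> str:
    """Format a span as <class>(<name>, <trace>/<span> -> <ptrace>/<pspan>)."""
    return (
        f"{class_name}({name}, "
        f"{last_byte(trace_id)}/{last_byte(span_id)} -> "
        f"{last_byte(parent_trace_id)}/{last_byte(parent_span_id)})"
    )


# Hypothetical IDs; real OTEL trace IDs are 128-bit and span IDs are 64-bit.
text = short_repr("Span", "trulens.call.query", 0x1234AB, 0x56CD, 0x1234AB, 0x78EF)
print(text)
```
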
In [None]:
# Check details of the main span (representing the call to `query`).

rec.get(Select.RecordSpans.trulens.call.query.attributes)

In [None]:
# Check attributes of the same information as instrumented by OpenInference:

rec.get(Select.RecordSpans.OpenAI.chat.attributes)

In [None]:
# Check attributes of the same information as instrumented by Traceloop:

rec.get(Select.RecordSpans.RetrieverQueryEngine.workflow.attributes)

In [None]:
# Check for spans that were produced outside of the recording. Here we print all
# of the root spans (those that do not have a parent). This should include the
# special TruLens span that corresponds to a recording but also other spans
# produced before and after the recording.

for span in tracer.spans.values():
    if span.parent is None:
        print(span, span.status, span.attributes.keys())

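The root-span filter above can be extended to show how spans nest. The following sketch uses a hypothetical `FakeSpan` dataclass instead of real span objects, and the parent/child layout is illustrative only; it is not taken from an actual run.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FakeSpan:
    """Stand-in for a span: a name, an ID, and an optional parent ID."""

    name: str
    span_id: int
    parent_id: Optional[int] = None


def render_tree(spans: List[FakeSpan]) -> List[str]:
    """Return one indented line per span, with children below their parent."""
    children: Dict[Optional[int], List[FakeSpan]] = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines: List[str] = []

    def visit(span: FakeSpan, depth: int) -> None:
        lines.append("  " * depth + span.name)
        for child in children.get(span.span_id, []):
            visit(child, depth + 1)

    # Root spans are the ones without a parent.
    for root in children.get(None, []):
        visit(root, 0)
    return lines


# Illustrative layout: one span outside the recording, three inside it.
spans = [
    FakeSpan("connect", 1),
    FakeSpan("trulens.recording", 2),
    FakeSpan("Querying LlamaIndex", 3, parent_id=2),
    FakeSpan("trulens.call.query", 4, parent_id=3),
]

root_names = [span.name for span in spans if span.parent_id is None]
print(root_names)
print("\n".join(render_tree(spans)))
```
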
In [None]:
# Check some of the specific spans.

# SQLAlchemy spans:

for span in tracer.spans.values():
    if span.name == "connect":
        print(span, span.status, span.attributes)

In [None]:
# requests spans:

for span in tracer.spans.values():
    if span.name in ["POST", "GET"]:
        print(span, span.status, span.attributes)

## Stream spans

In [None]:
chat_engine = index.as_chat_engine(similarity_top_k=3)
tru_chat_engine_recorder = TruLlama(
    chat_engine, app_name="LlamaIndex_App", app_version="chat"
)

In [None]:
with tru_chat_engine_recorder as recording:
    response = chat_engine.stream_chat("What did the author do growing up?")
    for chunk in response.response_gen:
        print(chunk, end="")

record = recording.get()

In [None]:
# Check the main span:
record.get(Select.RecordSpans.trulens.call.stream_chat.attributes)