

<center> <img src="https://storage.googleapis.com/arize-assets/arize-logo-white.jpg" width="300"/> </center>

# <center>Tracing via OTLP using Arize</center>

This guide demonstrates how to use Arize for monitoring and debugging your LLM using Traces and Spans. We're going to build a simple question-and-answer application using LangChain and retrieval-augmented generation (RAG) to answer questions about the [Arize documentation](https://docs.arize.com/arize/). Arize makes your LLM applications observable by visualizing the underlying structure of each call to your query engine and surfacing problematic `spans` of execution based on latency, token count, or other evaluation metrics. You can read more about LLM tracing [here](https://docs.arize.com/arize/llm-large-language-models/llm-traces).

In this tutorial, you will:
1. Use OpenTelemetry and [OpenInference](https://github.com/Arize-ai/openinference/tree/main) to instrument your application and send traces via OTLP to Arize.
2. Build a simple question-and-answer application using LangChain and RAG to answer questions about the Arize documentation.
3. Inspect the traces and spans of your application to identify sources of latency and cost.

ℹ️ This notebook requires:
- An OpenAI API key
- An Arize Space & API Key (explained below)


## Step 1: Install Dependencies 📚
Let's set up the notebook with the dependencies it needs.

In [None]:
# Dependencies needed to build the LangChain application
!pip install -q "langchain>=0.1.0" langchain-openai

# Dependencies needed to export spans and send them to our collector, Arize
!pip install -q opentelemetry-exporter-otlp 'openinference-instrumentation-langchain>=0.1.15'

## Step 2: OTLP Instrumentation

### Step 2.a: Define an exporter to Arize
Creating an Arize exporter requires two things:
* Space and API keys, which are sent as headers
* Model ID and version, which are sent as resource attributes

Copy the Arize `API_KEY` and `SPACE_KEY` from your Space Settings page (shown below) into the variables in the cell below. We will also set the model ID and version, which identify every trace sent from this notebook.

<center><img src="https://storage.googleapis.com/arize-assets/fixtures/copy-keys.png" width="700"></center>

In [None]:
SPACE_KEY = "SPACE_KEY" # Change this line
API_KEY = "API_KEY" # Change this line

model_id = "tutorial-otlp-tracing-langchain-rag"
model_version = "1.0"

if SPACE_KEY == "SPACE_KEY" or API_KEY == "API_KEY":
    raise ValueError("❌ NEED TO CHANGE SPACE AND/OR API_KEY")
else:
    print("✅ Space and API keys are set! Now we can start using Arize!")

Next, we create an OTLP exporter that points at the Arize endpoint defined in the cell below. Note that we use gRPC to export traces to Arize, which acts as a collector.

In [None]:
import os
from opentelemetry.sdk.resources import Resource
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

In [None]:
# Set the Space and API keys as headers
os.environ['OTEL_EXPORTER_OTLP_TRACES_HEADERS']=f"space_key={SPACE_KEY},api_key={API_KEY}"

# Set the model id and version as resource attributes
resource = Resource(
    attributes={
        "model_id":model_id,
        "model_version":model_version,
    }
)

endpoint = "https://otlp.arize.com/v1"
span_exporter = OTLPSpanExporter(endpoint=endpoint)
span_processor = SimpleSpanProcessor(span_exporter=span_exporter)
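
The header value set above is a comma-separated list of `key=value` pairs. As a minimal sketch, the helper below assembles that string from a mapping and fails early on an empty value, so a missing key is caught before any spans are exported. The name `build_otlp_headers` and the placeholder key values are hypothetical and are not part of OpenTelemetry or Arize.

```python
from typing import Mapping


def build_otlp_headers(headers: Mapping[str, str]) -> str:
    """Join header pairs into a comma-separated key=value string.

    Hypothetical helper for illustration; not an OpenTelemetry or Arize API.
    """
    parts = []
    for name, value in headers.items():
        if not value:
            raise ValueError(f"Header {name!r} has no value")
        # Assumption: a raw comma would be read as a separator between pairs
        if "," in name or "," in value:
            raise ValueError(f"Header {name!r} contains a comma")
        parts.append(f"{name}={value}")
    return ",".join(parts)


# Placeholder values standing in for the keys copied from Space Settings
example_headers = build_otlp_headers(
    {"space_key": "my-space-key", "api_key": "my-api-key"}
)
print(example_headers)
```

The resulting string could be assigned to `os.environ['OTEL_EXPORTER_OTLP_TRACES_HEADERS']` in place of the f-string used above.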

### Step 2.b: Define a trace provider and initiate the instrumentation


In [None]:
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry import trace as trace_api
from openinference.instrumentation.langchain import LangChainInstrumentor

In [None]:
tracer_provider = trace_sdk.TracerProvider(resource=resource)
tracer_provider.add_span_processor(span_processor=span_processor)
trace_api.set_tracer_provider(tracer_provider=tracer_provider)

In [None]:
# If you are running the instrumentation from a Colab environment, set skip_dep_check to True
# For more information check https://github.com/Arize-ai/openinference/issues/100
try:
    import google.colab  # noqa: F401

    IN_COLAB = True
except ImportError:
    IN_COLAB = False

LangChainInstrumentor().instrument(skip_dep_check=IN_COLAB)
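
The Colab check above works by importing `google.colab`. As an alternative sketch, the standard library can look the module up without importing it. The function name `running_in_colab` is hypothetical, and like the cell above it assumes that the presence of the `google.colab` module means the notebook is running in Colab.

```python
import importlib.util


def running_in_colab() -> bool:
    """Return True if the google.colab module can be found, without importing it."""
    try:
        return importlib.util.find_spec("google.colab") is not None
    except (ImportError, ValueError):
        # Raised when the parent package `google` is missing or has no spec
        return False


print(running_in_colab())
```

The returned boolean could be passed as `skip_dep_check` in the same way as `IN_COLAB`.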

## Step 3: Build Your LangChain RAG Application 📁
Let's import the dependencies we need.

In [None]:
import pandas as pd
import numpy as np
from langchain.chains import RetrievalQA
from langchain.retrievers import KNNRetriever
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

Set your OpenAI API key if it is not already set as an environment variable.

In [None]:
from getpass import getpass

if not (openai_api_key := os.getenv("OPENAI_API_KEY")):
    openai_api_key = getpass("🔑 Enter your OpenAI API key: ")

os.environ["OPENAI_API_KEY"] = openai_api_key

This example uses a `RetrievalQA` chain over a pre-built index of the Arize documentation, but you can use whatever LangChain application you like.

Download the pre-built index from cloud storage and use it to instantiate a KNN retriever and the `RetrievalQA` chain.

In [None]:
df = pd.read_parquet(
    "http://storage.googleapis.com/arize-phoenix-assets/datasets/"
    "unstructured/llm/context-retrieval/langchain-pinecone/database.parquet"
)
knn_retriever = KNNRetriever(
    index=np.stack(df["text_vector"]),
    texts=df["text"].tolist(),
    embeddings=OpenAIEmbeddings(),
)
chain_type = "stuff"  # stuff, refine, map_reduce, and map_rerank
chat_model_name = "gpt-3.5-turbo"
llm = ChatOpenAI(model_name=chat_model_name)
chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type=chain_type,
    retriever=knn_retriever,
    metadata={"application_type": "question_answering"},
)
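
To make the retrieval step more concrete, here is a pure-Python sketch of the general idea behind nearest-neighbor retrieval: score every stored vector against the query vector and keep the `k` best-matching texts. This is not LangChain's `KNNRetriever` implementation. The use of cosine similarity, the function names, and the toy two-dimensional vectors are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_texts(
    query_vector: Sequence[float],
    index: Sequence[Sequence[float]],
    texts: Sequence[str],
    k: int = 2,
) -> List[Tuple[float, str]]:
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for vector, text in zip(index, texts)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data standing in for the embedding matrix and document chunks
toy_index = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
toy_texts = ["tracing", "billing", "spans"]
toy_matches = top_k_texts([1.0, 0.0], toy_index, toy_texts, k=2)
print([text for _, text in toy_matches])
```

In the chain above, the retrieved texts are then passed to the LLM as context for answering the question.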

Let's test our app by asking a question about the Arize documentation:

In [None]:
response = chain.invoke("What is Arize and how can it help me as an AI Engineer?")
print(response['result'])

Great! Our application works!

## Step 4: Use our instrumented chain

We will download a dataset of queries for our RAG application to answer and see the traces appear in Arize.

In [None]:
from urllib.request import urlopen
import json

queries_url = "http://storage.googleapis.com/arize-phoenix-assets/datasets/unstructured/llm/context-retrieval/arize_docs_queries.jsonl"
queries = []
with urlopen(queries_url) as response:
    for line in response:
        line = line.decode("utf-8").strip()
        data = json.loads(line)
        queries.append(data["query"])

queries[:5]
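
The loop above assumes every line of the file is a JSON object with a `query` field. As a hedged sketch, a slightly more defensive parser can skip blank lines and records without that field instead of raising. The function name `parse_query_lines` and the sample lines are hypothetical and only illustrate the JSONL format.

```python
import json
from typing import Iterable, List


def parse_query_lines(lines: Iterable[bytes], field: str = "query") -> List[str]:
    """Extract one field from each JSONL line, skipping blanks and missing fields."""
    parsed = []
    for raw_line in lines:
        text = raw_line.decode("utf-8").strip()
        if not text:
            continue  # tolerate blank lines, e.g. a trailing newline
        record = json.loads(text)
        if field in record:
            parsed.append(record[field])
    return parsed


# In-memory sample standing in for the downloaded file
sample_lines = [
    b'{"query": "How do I send traces?"}\n',
    b"\n",
    b'{"other": 1}\n',
    b'{"query": "What is a span?"}\n',
]
sample_queries = parse_query_lines(sample_lines)
print(sample_queries)
```

Because the object returned by `urlopen` can be iterated line by line, it could be passed to this function directly.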

In [None]:
from tqdm import tqdm
from openinference.instrumentation import using_attributes

N1 = 5 # Number of traces for your first session
SESSION_ID_1 = "session-id-1" # Identifier for your first session
USER_ID_1 = "john_smith" # Identifier for the user of your first session
METADATA = {
    "key_bool": True,
    "key_str": "value1",
    "key_int": 1
}

qa_pairs = []
for query in tqdm(queries[:N1]):
    with using_attributes(
        session_id=SESSION_ID_1,
        user_id=USER_ID_1,
        metadata=METADATA,
    ):
        resp = chain.invoke(query)
        qa_pairs.append((query,resp['result']))

In [None]:
N2 = 3 # Number of traces for your second session
SESSION_ID_2 = "session-id-2" # Identifier for your second session
USER_ID_2 = "jane_doe" # Identifier for the user of your second session

for query in tqdm(queries[N1:N1+N2]):
    with using_attributes(
        session_id=SESSION_ID_2,
        user_id=USER_ID_2,
        metadata=METADATA
    ):
        resp = chain.invoke(query)
        qa_pairs.append((query,resp['result']))
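
The two cells above slice the query list by hand, once per session. With more sessions, the slicing could be factored out. The sketch below shows one way to do that with a hypothetical helper, `split_into_sessions`, that assigns consecutive batches of items to session IDs. The demo data is made up and no chain is invoked.

```python
from typing import Dict, List, Mapping, Sequence


def split_into_sessions(
    items: Sequence[str], session_sizes: Mapping[str, int]
) -> Dict[str, List[str]]:
    """Assign consecutive slices of items to each session ID, in order."""
    sessions = {}
    start = 0
    for session_id, size in session_sizes.items():
        sessions[session_id] = list(items[start : start + size])
        start += size
    return sessions


# Made-up questions standing in for the downloaded queries
demo_queries = [f"question {i}" for i in range(10)]
demo_sessions = split_into_sessions(
    demo_queries, {"session-id-1": 5, "session-id-2": 3}
)
for session_id, batch in demo_sessions.items():
    print(session_id, len(batch))
```

Each batch could then be run inside `using_attributes(session_id=session_id, ...)`, as in the cells above.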

In [None]:
for q,a in qa_pairs:
    q_msg = f">> QUESTION: {q}"
    print(f"{'-'*len(q_msg)}")
    print(q_msg)
    print(f">> ANSWER: {a}\n")

## Step 5: Log into Arize and explore your application traces 🚀

Log into your Arize account and look for the model with the same `model_id`. If this is a brand-new model, you will likely see the following page while Arize processes your data. Your model will be available shortly, and you can then explore your traces.

<center><img src="https://storage.googleapis.com/arize-assets/fixtures/Embeddings/GENERATIVE/model-loading-tutorial-otlp-langchain.png" width="700"></center>

Once the timer completes, you are ready to navigate and explore your traces.

<center><img src="https://storage.googleapis.com/arize-assets/fixtures/Embeddings/GENERATIVE/llm-tracing-overview-langchain.png" width="700"></center>

<center><img src="https://storage.googleapis.com/arize-assets/fixtures/Embeddings/GENERATIVE/llm-tracing-detail-langchain.png" width="700"></center>

