<center>
    <p style="text-align:center">
        <img alt="phoenix logo" src="https://storage.googleapis.com/arize-assets/phoenix/assets/phoenix-logo-light.svg" width="200"/>
        <br>
        <a href="https://docs.arize.com/phoenix/">Docs</a>
        |
        <a href="https://github.com/Arize-ai/phoenix">GitHub</a>
        |
        <a href="https://join.slack.com/t/arize-ai/shared_invite/zt-1px8dcmlf-fmThhDFD_V_48oU7ALan4Q">Community</a>
    </p>
</center>
<h1 align="center">Tracing a LangChain and VertexAI Application</h1>

LLM orchestration frameworks such as LangChain provide abstractions that enable users to build powerful applications in a few lines of code. However, the same abstractions can make it difficult to understand what is going on under the hood and to pinpoint the cause of issues.

Phoenix makes your LLM applications *observable* by visualizing the underlying structure of each call to your chain and surfacing problematic "spans" of execution based on latency, token count, or other evaluation metrics.
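
To make the idea of a "span" concrete, here is a minimal sketch of a trace as a list of timed steps. The `Span` class, its field names, and the timing values are hypothetical and chosen for illustration; they are not the OpenInference schema or real measurements.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """Hypothetical, simplified span record (not the OpenInference schema)."""

    name: str
    span_kind: str  # e.g. "CHAIN", "RETRIEVER", "LLM"
    start_ms: float
    end_ms: float
    parent: Optional[str] = None  # name of the enclosing span, if any

    @property
    def latency_ms(self) -> float:
        # Latency is simply the elapsed time between start and end.
        return self.end_ms - self.start_ms


def slowest_spans(spans: List[Span], top_n: int = 1) -> List[Span]:
    """Return the top_n spans with the highest latency."""
    return sorted(spans, key=lambda s: s.latency_ms, reverse=True)[:top_n]


# Illustrative values only; these are not measurements.
trace = [
    Span("RetrievalQA", "CHAIN", 0.0, 3050.0),
    Span("KNNRetriever", "RETRIEVER", 5.0, 420.0, parent="RetrievalQA"),
    Span("ChatVertexAI", "LLM", 430.0, 3040.0, parent="RetrievalQA"),
]

# Compare only the child steps, since the root span always contains them all.
child_spans = [s for s in trace if s.parent is not None]
slowest_child = slowest_spans(child_spans)[0]
print(slowest_child.name, slowest_child.latency_ms)
```

Ranking the child spans of a single chain call by latency is roughly the kind of question the Phoenix UI lets you answer visually.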

In this tutorial, you will:
- Build a simple retrieval-augmented generation application over the Arize documentation using LangChain and VertexAI, with "textembedding-gecko" for embeddings and "chat-bison" for chat,
- Record trace data in OpenInference format,
- Inspect the traces and spans of your application to identify uncaught exceptions and sources of latency and cost.

ℹ️ This notebook requires access to the [Vertex AI API](https://cloud.google.com/vertex-ai/docs).

## 1. Install Dependencies and Import Libraries

Downgrade Colab's pre-installed version of `shapely` for compatibility reasons.

⚠️ If you run into a version compatibility error in a later cell, try restarting the runtime and re-running the notebook.

In [1]:
!pip install --force-reinstall "shapely<2.0.0"

Collecting shapely<2.0.0
  Using cached Shapely-1.8.5.post1-cp38-cp38-macosx_10_9_x86_64.whl (1.2 MB)
Installing collected packages: shapely
  Attempting uninstall: shapely
    Found existing installation: Shapely 1.8.5.post1
    Uninstalling Shapely-1.8.5.post1:
      Successfully uninstalled Shapely-1.8.5.post1
Successfully installed shapely-1.8.5.post1


Install Phoenix, LangChain, and the Google Cloud AI Platform SDK.

In [2]:
!pip install -qq "arize-phoenix[experimental]" google-api-python-client "google-cloud-aiplatform[preview]" langchain

Import libraries.

In [3]:
import json
from urllib.request import urlopen

import numpy as np
import pandas as pd
import phoenix as px
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatVertexAI
from langchain.embeddings import VertexAIEmbeddings
from langchain.retrievers import KNNRetriever
from phoenix.experimental.evals import (
    VertexAIModel,
    compute_precisions_at_k,
    run_relevance_eval
)
from phoenix.trace.langchain import OpenInferenceTracer
from tqdm import tqdm

try:
    from google.colab.auth import authenticate_user

    IS_COLAB = True
except ImportError:
    IS_COLAB = False

## 2. Set Configuration and Authenticate with Vertex AI

If you are running this notebook in Colab, sign in with your Google account credentials. If you are running locally, ensure that the `gcloud` CLI is configured and authenticated for use with Vertex AI.

In [4]:
if IS_COLAB:
    authenticate_user()
else:
    print(
        "If running locally, ensure that your gcloud is correctly configured to run with Vertex AI."
    )

If running locally, ensure that your gcloud is correctly configured to run with Vertex AI.


Enter your project ID and location:
* `project_id`: The default project to use when making API calls.
* `location`: The default location to use when making API calls. If not set, it defaults to `us-central1`.

In [5]:
project_id = input("Enter your GCP project ID and press enter:\n")
location = input("Enter your GCP location and press enter:\n")

Enter your GCP project ID and press enter:
 primal-oxide-268801
Enter your GCP location and press enter:
 us-central1
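
Because `input` returns an empty string when you press enter without typing, you may want to apply the default region yourself. The helper below is a minimal sketch; `resolve_location` and `DEFAULT_LOCATION` are hypothetical names that are not part of any SDK, and the fallback value is the region shown in the example input above.

```python
# Assumption: fall back to the region used in the example input above.
DEFAULT_LOCATION = "us-central1"


def resolve_location(raw: str, default: str = DEFAULT_LOCATION) -> str:
    """Return the trimmed location, or the default when the input is blank."""
    cleaned = raw.strip()
    return cleaned if cleaned else default
```

You could then write `location = resolve_location(location)` before passing the value to the Vertex AI clients.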


## 3. Launch Phoenix

You can run Phoenix in the background to collect trace data emitted by any LangChain application that has been instrumented with the `OpenInferenceTracer`.

Launch Phoenix and follow the instructions in the cell output to open the Phoenix UI (the UI should be empty because we have yet to run our LangChain application).

In [6]:
session = px.launch_app()

🌍 To view the Phoenix app in your browser, visit http://127.0.0.1:6060/
📺 To view the Phoenix app in a notebook, run `px.active_session().view()`
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix


## 4. Instantiate Your OpenInference Tracer

Instantiate a tracer to record your trace data in [OpenInference format](https://arize-ai.github.io/open-inference-spec/), an open standard for capturing and storing AI model inferences that enables production LLM application servers to integrate seamlessly with LLM observability solutions such as Phoenix.

In [7]:
tracer = OpenInferenceTracer()

## 5. Build Your LLM Application

Define a `RetrievalQA` chain leveraging "textembedding-gecko" and "chat-bison" from the VertexAI API. The knowledge base of this chain is built over the Arize documentation.

In [8]:
embeddings = VertexAIEmbeddings(
    model_name="textembedding-gecko",
    project=project_id,
    location=location,
)
database_df = pd.read_parquet(
    "http://storage.googleapis.com/arize-assets/phoenix/datasets/unstructured/llm/context-retrieval/langchain-pinecone-vertexai/database.parquet"
)
knn_retriever = KNNRetriever(
    index=np.stack(database_df["text_vector"]),
    texts=database_df["text"].tolist(),
    embeddings=embeddings,
)
llm = ChatVertexAI(
    model_name="chat-bison",
    project=project_id,
    location=location,
)
chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=knn_retriever,
)
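
For intuition about the retrieval step, the following is a minimal sketch of nearest-neighbor lookup by cosine similarity in plain Python. It is not LangChain's `KNNRetriever` implementation; the function names, toy vectors, and toy texts are assumptions made up for illustration.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    index: Sequence[Sequence[float]],
    texts: Sequence[str],
    k: int = 2,
) -> List[str]:
    """Return the k texts whose vectors are most similar to the query vector."""
    scored = sorted(
        zip(texts, index),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]


# Toy two-dimensional "embeddings"; real embedding vectors are much longer.
toy_index = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
toy_texts = ["drift metrics", "dashboards", "embeddings"]
results = top_k([0.9, 0.1], toy_index, toy_texts, k=2)
print(results)
```

In the chain above, the query vector comes from the embedding model and the index comes from the precomputed `text_vector` column of `database_df`.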

## 6. Run the Chain

Download a small dataset of user queries to ask your application.

In [9]:
url = "http://storage.googleapis.com/arize-assets/phoenix/datasets/unstructured/llm/context-retrieval/arize_docs_queries.jsonl"
queries = []
with urlopen(url) as response:
    for line in response:
        line = line.decode("utf-8").strip()
        data = json.loads(line)
        queries.append(data["query"])
queries[:10]

['How do I use the SDK to upload a ranking model?',
 'What drift metrics are supported in Arize?',
 'Does Arize support batch models?',
 'Does Arize support training data?',
 'How do I configure a threshold if my data has seasonality trends?',
 'How are clusters in the UMAP calculated? When are the clusters refreshed?',
 'How does Arize calculate AUC?',
 'Can I send truth labels to Arize separtely? ',
 'How do I send embeddings to Arize?',
 'Can I copy a dashboard']

Run your chain against the first ten queries. Notice that the tracer is passed to each `run` call through the `callbacks` argument.

In [None]:
for query in tqdm(queries[:10]):
    chain.run(query, callbacks=[tracer])

 70%|███████   | 7/10 [00:16<00:09,  3.05s/it]

Check out your traces in Phoenix!

In [None]:
print(f"Open the Phoenix UI if you haven't already: {session.url}")

## 7. Export and Evaluate Your Trace Data

You can export your trace data as a pandas dataframe for further analysis and evaluation.

In this case, we will export our retriever spans and view them in a pandas dataframe.

In [None]:
trace_df = px.active_session().get_spans_dataframe('span_kind == "RETRIEVER"')
trace_df

Evaluate your retrieval spans and surface problematic spans:

- Make LLM calls to classify each retrieved document as relevant or irrelevant to the corresponding query,
- Compute precision@k for each retrieval span, with one value per k (k = 1, 2, ...),
- Sort your spans by precision at the largest k to surface the most problematic spans.


In [None]:
model = VertexAIModel(
    project=project_id,
    location=location,
    model_name="text-bison",
    temperature=0.0,
)

trace_df["llm_assisted_relevance"] = run_relevance_eval(trace_df, model=model)
trace_df["llm_assisted_precision_at_k"] = trace_df["llm_assisted_relevance"].map(
    lambda x: compute_precisions_at_k(x) if x else float("nan")
)
trace_df = trace_df.sort_values(
    by="llm_assisted_precision_at_k",
    key=lambda col: col.map(lambda x: x[-1] if isinstance(x, list) else 0.0),
    ascending=True,
)
trace_df[
    [
        "attributes.input.value",
        "attributes.retrieval.documents",
        "llm_assisted_relevance",
        "llm_assisted_precision_at_k",
    ]
]
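
To clarify what the precision@k values mean, here is a minimal sketch of the metric in plain Python. It is not Phoenix's `compute_precisions_at_k` implementation; `precisions_at_k` is a hypothetical stand-in that assumes every document has been classified as either relevant or irrelevant.

```python
from typing import List, Sequence


def precisions_at_k(relevance: Sequence[bool]) -> List[float]:
    """Return precision@k for k = 1..n, given relevance flags in rank order."""
    precisions = []
    hits = 0
    for k, is_relevant in enumerate(relevance, start=1):
        hits += int(is_relevant)
        # precision@k = relevant documents among the top k, divided by k
        precisions.append(hits / k)
    return precisions


# Four retrieved documents: the second one was judged irrelevant.
example = precisions_at_k([True, False, True, True])
print(example)
```

A span whose last value is low retrieved mostly irrelevant documents, which is why sorting in ascending order brings the weakest retrievals to the top.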