# Distributed Tracing with LastMile Tracing SDK

In this example notebook, we showcase how to implement distributed tracing using the LastMile Tracing SDK.

## Notebook Outline
* [Introduction](#intro)
* [Setup](#setup)
* [Step 1: Instrument server code](#step1)
* [Step 2: Instrument client code](#step2)
* [Step 3: View Trace Data in UI](#step3)

<a name="intro"></a>
# Introduction
**Distributed tracing** allows you to track and analyze the flow of requests across multiple services or components in a distributed system.

In this example, a client sends a request to a server to generate a riddle. The server uses OpenAI's `gpt-3.5-turbo` to generate the riddle and returns it to the client. The LastMile Tracing SDK is used to instrument both sides and capture trace information.
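
To make the idea concrete, here is a minimal, library-free sketch of what a trace looks like as data: every span carries the same trace ID, and a child span points at its parent. The `Span` class and the `start_span` and `render_trace` helpers are hypothetical names made up for illustration; they are not part of the LastMile SDK, which records far more detail.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    """Toy span record (hypothetical); real SDKs also store timings and attributes."""
    name: str
    trace_id: str
    span_id: str
    parent_id: Optional[str] = None


def start_span(name: str, parent: Optional[Span] = None) -> Span:
    """Create a root span, or a child that inherits the parent's trace ID."""
    trace_id = parent.trace_id if parent else secrets.token_hex(16)
    parent_id = parent.span_id if parent else None
    return Span(name, trace_id, secrets.token_hex(8), parent_id)


def render_trace(spans: List[Span]) -> List[str]:
    """Return one indented line per span, with children nested under parents."""
    children: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for span in children.get(parent_id, []):
            lines.append("  " * depth + span.name)
            walk(span.span_id, depth + 1)

    walk(None, 0)  # root spans have no parent
    return lines


# The client starts the trace; the server continues it as a child span.
client_span = start_span("client")
server_span = start_span("generate_endpoint", parent=client_span)
print("\n".join(render_trace([server_span, client_span])))
```

The order in which spans arrive does not matter: the parent links alone are enough to rebuild the tree, which is why spans recorded by different processes can be stitched into one trace.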

<a name="setup"></a>

# Setup

To begin, we need to install the `lastmile-eval` library and a couple of other libraries used in this example.

In [3]:
!pip install lastmile-eval --upgrade
# fork of multiprocessing, uses dill instead of pickle.
!pip install multiprocess
!pip install flask


Before we start this tutorial, we need the following tokens/keys:

* LastMile AI API Token: Go to the [LastMile Settings page](https://lastmileai.dev/settings?page=tokens). You will need to first create a LastMile AI account.
* OpenAI API Key: Go to the [OpenAI API Keys page](https://platform.openai.com/account/api-keys) to create and access your OpenAI API Key.

We're using Google Colab's Secret Manager to set our tokens in this notebook.

In [2]:
from google.colab import userdata
import os

os.environ['OPENAI_API_KEY'] =  userdata.get('OPENAI_API_KEY')
os.environ['LASTMILE_API_TOKEN'] =  userdata.get('LASTMILE_API_TOKEN')
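
Outside Colab, the same two variables can be set in the shell before starting Python. A small check like the following sketch can catch a missing key early; `missing_env_vars` is a hypothetical helper written for this example, not something provided by the SDK.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(names: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or empty in the given environment."""
    return [name for name in names if not env.get(name)]


REQUIRED = ["OPENAI_API_KEY", "LASTMILE_API_TOKEN"]

# Demonstrated with a stand-in mapping so the example never touches real secrets.
example_env = {"OPENAI_API_KEY": "sk-example", "LASTMILE_API_TOKEN": ""}
print(missing_env_vars(REQUIRED, example_env))  # ['LASTMILE_API_TOKEN']
```

Calling `missing_env_vars(REQUIRED)` with no second argument checks the real process environment.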

<a name="step1"></a>
# Step 1: Instrument Server Code
In this step, we start a Flask server that exposes an endpoint for generating riddles. The server uses a `LastMileTracer` for distributed tracing, so that its span can join the trace started by the client.


In [4]:
import json
import os

import multiprocess
from flask import Flask, request
from openai import OpenAI

from lastmile_eval.rag.debugger.api import LastMileTracer
from lastmile_eval.rag.debugger.tracing import get_lastmile_tracer


def server():
    # Instantiate LastMile Tracer
    tracer: LastMileTracer = get_lastmile_tracer(
        tracer_name="generate_riddle",
    )

    app = Flask(__name__)

    @app.route("/generate")
    def generate() -> str:
        """
        Endpoint that generates a riddle using OpenAI's gpt-3.5-turbo model.
        """
        # The client is expected to send its exported span context in the "span" header.
        span_context = request.headers.get("span")

        # Start the server span under the context received from the client.
        with tracer.start_as_current_span("generate_endpoint", context=span_context):
            response = OpenAI().chat.completions.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": "tell me a riddle"}],
            )
            riddle = response.choices[0].message.content
            return json.dumps(riddle)

    app.run(port=1234, debug=False)

# Start Server in a Subprocess to avoid blocking execution of succeeding cells
process = multiprocess.Process(target=server)
process.start()
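
Because the server starts in a separate process, a request sent immediately afterwards may arrive before Flask is listening. One way to guard against that is to poll the port first. The sketch below uses only the standard library; `wait_for_port` is a hypothetical helper, and the demo runs against a throwaway listener so that it works on its own.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.1) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; wait briefly and try again.
            time.sleep(interval)
    return False


# Demo against a temporary listener; in this notebook you would call
# wait_for_port("127.0.0.1", 1234) after process.start() instead.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
listener.listen(1)
demo_port = listener.getsockname()[1]

server_ready = wait_for_port("127.0.0.1", demo_port, timeout=2.0)
listener.close()
print(server_ready)
```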



<a name="step2"></a>
# Step 2: Instrument Client Code
To distribute the trace context, we need to export the client's current span context to a string and pass it to the server as a request header.

In [6]:
import requests
from opentelemetry import trace

from lastmile_eval.rag.debugger.api import LastMileTracer
from lastmile_eval.rag.debugger.tracing import export_span, get_lastmile_tracer

# Instantiate Tracer2
tracer2: LastMileTracer = get_lastmile_tracer(
    tracer_name="generate_riddle", # project name
)

@tracer2.start_as_current_span("client")
def client_say_riddle():
    try:
        print("Message: Sending request to subprocess server, with span context.")

        # Export Span
        response = requests.get('http://127.0.0.1:1234/generate', headers={"span": export_span(trace.get_current_span())})
        return response.text

    except Exception as e:
        print(e)

client_say_riddle()



Message: Sending request to subprocess server, with span context.


'"I am taken from a mine, and shut up in a wooden case, from which I am never released, and yet I am used by almost every person. What am I?"'
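
The notebook does not show what the string produced by `export_span` looks like, so the following is only an illustration of the general technique: serialize the identifiers of the current span into a header value, then parse them back on the server. It assumes the W3C Trace Context `traceparent` layout (`version-traceid-spanid-flags`), which may differ from what the LastMile SDK actually sends. `ToyContext`, `encode_context`, and `decode_context` are hypothetical names.

```python
import re
import secrets
from typing import NamedTuple, Optional


class ToyContext(NamedTuple):
    """Hypothetical stand-in for a span context."""
    trace_id: str  # 32 lowercase hex characters
    span_id: str   # 16 lowercase hex characters
    sampled: bool


_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def encode_context(ctx: ToyContext) -> str:
    """Serialize a context as a traceparent-style header value."""
    flags = "01" if ctx.sampled else "00"
    return f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"


def decode_context(header: Optional[str]) -> Optional[ToyContext]:
    """Parse a header value; return None if it is missing or malformed."""
    if not header:
        return None
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero identifiers are treated as invalid.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return ToyContext(trace_id, span_id, bool(int(flags, 16) & 0x01))


# Client side: build the header value from the current span's identifiers.
ctx = ToyContext(secrets.token_hex(16), secrets.token_hex(8), sampled=True)
header = encode_context(ctx)

# Server side: recover the context, or get None and start a fresh trace.
print(decode_context(header) == ctx)
print(decode_context(None))
```

Returning `None` for a missing or malformed header lets the receiving side decide what to do, for example starting a new root span instead of failing the request.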

<a name="step3"></a>

# Step 3: View Trace Data in UI
Now we can view our distributed trace in the RAG Debugger UI!
#### From your terminal:

Export your `LASTMILE_API_TOKEN`:

```bash
export LASTMILE_API_TOKEN="<your-api-token>"
```

Run this CLI command to access the UI:

```bash
rag-debug launch
```
Navigate to the 'Traces' page, which lists all traces, including the distributed trace we set up in this example.

<img width="973" alt="Screen Shot 2024-05-28 at 1 28 58 AM" src="https://github.com/lastmile-ai/aiconfig/assets/81494782/35393b82-4d1e-4346-afc3-e4f6d99cd765"/>

Let's click into the Trace.

<img width="973" alt="Screen Shot 2024-05-28 at 1 30 29 AM" src="https://github.com/lastmile-ai/aiconfig/assets/81494782/d25b9b96-2e30-4dfe-8245-076ae860c50c"/>

Here we can see that the server's `generate_endpoint` span is linked to the client's `client` span within a single trace, which means the span context was successfully propagated from the client to the server. This shows that we're able to trace across multiple services or components in a distributed system.