## Initial Setup


In [1]:
from dotenv import load_dotenv
import os

from ddtrace.llmobs import LLMObs

load_dotenv()

LLMObs.enable(
    api_key=os.environ.get("DD_API_KEY"),
    site=os.environ.get("DD_SITE", "datadoghq.com"),
    ml_app="test-onboarding-app",
    agentless_enabled=True,
)
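
The setup cell reads its credentials from the environment, and `os.environ.get` returns `None` without complaint when a variable is missing. A minimal sketch of a pre-flight check follows; the helper name `missing_env_vars` and the `REQUIRED_ENV_VARS` tuple are hypothetical and not part of `ddtrace` or `python-dotenv`.

```python
import os

# Variables this notebook reads; DD_SITE is left out because it has a default.
REQUIRED_ENV_VARS = ("DD_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(names=REQUIRED_ENV_VARS, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this right after `load_dotenv()` surfaces a misconfigured `.env` file before any request is made.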

## Creating and tracing a simple LLM service

In this notebook, we build a service that takes a free-text query about art from a user and passes it to the Metropolitan Museum of Art API to retrieve a list of artworks.

The steps are:

1. Take a query from a user
2. Parse that query via a call to OpenAI
3. Send the parsed query to the [Metropolitan Museum of Art API](https://metmuseum.github.io/#search)
4. Return a list of URLs to the user


### 1. Creating the tool to fetch data from the Met API

In the next cell, we create and instrument a "tool": a function that sends a query to the Met API's `/search` endpoint. The query itself is generated by an LLM call in a later cell.


In [2]:
import requests
from ddtrace.llmobs.decorators import tool, workflow

SEARCH_ENDPOINT = "https://collectionapi.metmuseum.org/public/collection/v1/search"
MAX_RESULTS = 5


# learn more about instrumenting tool calls in our docs:
# https://docs.datadoghq.com/tracing/llm_observability/sdk/#tool-span
@tool()
def fetch_met_urls(query_parameters):
    LLMObs.annotate(
        input_data=query_parameters,
    )
    # a timeout keeps the cell from hanging if the API is unresponsive
    response = requests.get(SEARCH_ENDPOINT, params=query_parameters, timeout=10)
    response.raise_for_status()
    object_ids = response.json().get("objectIDs")
    objects_to_return = object_ids[:MAX_RESULTS] if object_ids else []
    urls = [
        f"https://www.metmuseum.org/art/collection/search/{objectId}"
        for objectId in objects_to_return
    ]
    LLMObs.annotate(
        output_data=urls,
    )
    return urls
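
To see what the tool would send without making a network call, the search URL can be previewed with the standard library. This is a sketch under assumptions: `normalize_params` and `preview_search_url` are hypothetical helpers, and lowercasing booleans is a precaution rather than a documented requirement. Python's `urlencode` renders `True` as `True`, while the Met API examples use lowercase `true`, and this notebook does not establish whether the capitalized form is accepted.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://collectionapi.metmuseum.org/public/collection/v1/search"


def normalize_params(query_parameters):
    """Drop None values and render booleans as lowercase 'true'/'false'."""
    normalized = {}
    for key, value in query_parameters.items():
        if value is None:
            continue
        if isinstance(value, bool):
            normalized[key] = "true" if value else "false"
        else:
            normalized[key] = value
    return normalized


def preview_search_url(query_parameters):
    """Build the full search URL without sending a request."""
    return f"{SEARCH_ENDPOINT}?{urlencode(normalize_params(query_parameters))}"


print(preview_search_url({"q": "sunflowers", "isOnView": True}))
```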



In [3]:
# https://metmuseum.github.io/#search
fetch_met_urls_schema = {
    "type": "function",
    "function": {
        "name": "fetch_met_urls",
        "description": "Submits a query to the Met API and returns URLs of relevant artworks",
        "parameters": {
            "type": "object",
            "properties": {
                "query_parameters": {
                    "type": "object",
                    "properties": {
                        "q": {
                            "type": "string",
                            "description": "Represents the user's query. Required. Add as many search terms from the query as you can. 'medieval portraits', 'french impressionist paintings', etc.",
                        },
                        "title": {
                            "type": "boolean",
                            "description": "Limits the query to only apply to the title field.",
                        },
                        "tags": {
                            "type": "boolean",
                            "description": "Limits the query to only apply to the tags field.",
                        },
                        "isOnView": {
                            "type": "boolean",
                            "description": "Returns objects that match the query and are on view in the museum.",
                        },
                        "artistOrCulture": {
                            "type": "boolean",
                            "description": "Returns objects that match the query, specifically searching against the artist name or culture field for objects.",
                        },
                        "medium": {
                            "type": "string",
                            "description": 'Returns objects that match the query and are of the specified medium or object type. Examples include: "Ceramics", "Furniture", "Paintings", "Sculpture", "Textiles", etc.',
                        },
                        "geoLocation": {
                            "type": "string",
                            "description": 'Returns objects that match the query and the specified geographic location. Examples include: "Europe", "France", "Paris", "China", "New York", etc.',
                        },
                        "dateBegin": {
                            "type": "number",
                            "description": "You must use both dateBegin and dateEnd, or neither. Returns objects that match the query and fall between the dateBegin and dateEnd parameters. Examples include: dateBegin=1700&dateEnd=1800 for objects from 1700 A.D. to 1800 A.D., dateBegin=-100&dateEnd=100 for objects between 100 B.C. to 100 A.D.",
                        },
                        "dateEnd": {
                            "type": "number",
                            "description": "You must use both dateBegin and dateEnd, or neither. Returns objects that match the query and fall between the dateBegin and dateEnd parameters. Examples include: dateBegin=1700&dateEnd=1800 for objects from 1700 A.D. to 1800 A.D., dateBegin=-100&dateEnd=100 for objects between 100 B.C. to 100 A.D.",
                        },
                    },
                    "required": ["q"],
                },
            },
        },
    },
}
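
The schema only guides the model; nothing in it is enforced at call time. A minimal sketch of checking generated arguments against a trimmed copy of the schema is shown below. `validation_errors` is a hypothetical helper, the type table covers only the JSON Schema types used in this notebook, and the date-pairing rule is taken from the `dateBegin`/`dateEnd` descriptions above.

```python
# JSON Schema type names used in this notebook, mapped to Python types.
JSON_TYPES = {"string": str, "boolean": bool, "number": (int, float), "object": dict}

# Trimmed copy of the "query_parameters" sub-schema, for illustration only.
query_schema = {
    "type": "object",
    "properties": {
        "q": {"type": "string"},
        "isOnView": {"type": "boolean"},
        "dateBegin": {"type": "number"},
        "dateEnd": {"type": "number"},
    },
    "required": ["q"],
}


def validation_errors(arguments, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name in schema.get("required", []):
        if name not in arguments:
            errors.append(f"missing required parameter: {name}")
    properties = schema.get("properties", {})
    for name, value in arguments.items():
        if name not in properties:
            errors.append(f"unknown parameter: {name}")
            continue
        expected = properties[name]["type"]
        type_ok = isinstance(value, JSON_TYPES.get(expected, object))
        # bool is a subclass of int in Python, so reject it explicitly for "number".
        if expected == "number" and isinstance(value, bool):
            type_ok = False
        if not type_ok:
            errors.append(f"{name} should be of type {expected}")
    # The schema descriptions require dateBegin and dateEnd together or not at all.
    if ("dateBegin" in arguments) != ("dateEnd" in arguments):
        errors.append("dateBegin and dateEnd must be used together")
    return errors


print(validation_errors({"q": "hats", "dateBegin": 1700}, query_schema))
```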

### 2. Creating an LLM call to parse user input into a standardized query


In [4]:
from openai import OpenAI
import json

oai_client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

system_prompt = """
Example query inputs and outputs for the fetch_met_urls function:

query: medieval french tapestry painting
output: {'q': 'medieval french tapestry painting', 'geoLocation': 'France', 'medium': 'Textiles', 'dateBegin': 1000, 'dateEnd': 1500}

query: etruscan urns
output: {'q': 'etruscan urn', 'geoLocation': 'Italy', 'medium': 'Travertine'}

query: Cambodian hats from the 18th and 19th centuries
output: {'q': 'Cambodian hats', 'geoLocation': 'Cambodia', 'dateBegin': 1700, 'dateEnd': 1900}

"""


def parse_query(message):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": message},
    ]
    response_message = (
        oai_client.chat.completions.create(
            messages=messages,
            model="gpt-3.5-turbo",
            tools=[fetch_met_urls_schema],
            tool_choice={"type": "function", "function": {"name": "fetch_met_urls"}},
        )
        .choices[0]
        .message
    )
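    if not response_message.tool_calls:
        # tool_choice forces a call to fetch_met_urls, but fail clearly if none is returned
        raise ValueError("The model response did not include a tool call")
    arguments = json.loads(response_message.tool_calls[0].function.arguments)
    return arguments["query_parameters"]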
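
The response-handling step can be exercised without calling OpenAI by feeding it a stand-in object with the same attribute shape as a chat completion message. The sketch below is illustrative only: `extract_query_parameters` is a hypothetical helper, the `SimpleNamespace` object is a hand-built fake rather than a real OpenAI response, and falling back to the top-level dict when the `query_parameters` wrapper is absent is an assumption about how a model might deviate from the schema.

```python
import json
from types import SimpleNamespace


def extract_query_parameters(response_message):
    """Pull the query parameters out of the first tool call, failing loudly."""
    tool_calls = getattr(response_message, "tool_calls", None)
    if not tool_calls:
        raise ValueError("model response contained no tool calls")
    try:
        arguments = json.loads(tool_calls[0].function.arguments)
    except json.JSONDecodeError as exc:
        raise ValueError("tool call arguments were not valid JSON") from exc
    if not isinstance(arguments, dict):
        raise ValueError("tool call arguments were not a JSON object")
    # Assumption: if the wrapper key is missing, treat the object as the parameters.
    return arguments.get("query_parameters", arguments)


# Hand-built fake with the same attribute path as a chat completion message.
fake_message = SimpleNamespace(
    tool_calls=[
        SimpleNamespace(
            function=SimpleNamespace(
                arguments='{"query_parameters": {"q": "etruscan urn"}}'
            )
        )
    ]
)

print(extract_query_parameters(fake_message))
```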

### 3. Creating the `find_artworks` function

Finally, we create a `find_artworks` function that ties the LLM call and the tool call together. We annotate it as a workflow span:


In [5]:
# learn more about instrumenting workflow spans in our docs:
# https://docs.datadoghq.com/tracing/llm_observability/sdk/#workflow-span
@workflow()
def find_artworks(question):
    LLMObs.annotate(
        input_data=question,
    )
    query = parse_query(question)
    print("Parsed query parameters", query)
    urls = fetch_met_urls(query)
    LLMObs.annotate(
        output_data=urls,
    )
    return urls

Let's try it out:


In [6]:
urls = find_artworks("paintings of the french revolution")

Parsed query parameters {'q': 'paintings french revolution', 'dateBegin': 1789, 'dateEnd': 1799, 'geoLocation': 'France', 'medium': 'Paintings'}


In [7]:
import pprint

pprint.pp(urls)

['https://www.metmuseum.org/art/collection/search/437900',
 'https://www.metmuseum.org/art/collection/search/436875',
 'https://www.metmuseum.org/art/collection/search/193808',
 'https://www.metmuseum.org/art/collection/search/200414',
 'https://www.metmuseum.org/art/collection/search/193975']
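
Each returned URL ends in the object ID that came back from the `/search` endpoint, so the ID can be recovered for follow-up lookups. The sketch below assumes the Met API's per-object endpoint lives at `/public/collection/v1/objects/[objectID]`, as described in the API documentation linked earlier; the helper names are hypothetical and no request is sent.

```python
OBJECT_ENDPOINT = "https://collectionapi.metmuseum.org/public/collection/v1/objects"


def object_id_from_url(url):
    """Extract the trailing numeric object ID from a collection page URL."""
    return int(url.rstrip("/").rsplit("/", 1)[-1])


def object_api_url(url):
    """Build the (assumed) API URL for the object's full record."""
    return f"{OBJECT_ENDPOINT}/{object_id_from_url(url)}"


example = "https://www.metmuseum.org/art/collection/search/437900"
print(object_id_from_url(example), object_api_url(example))
```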


## Viewing the trace in Datadog

Now open the LLM Observability interface in Datadog. You should see a trace describing the workflow we just ran.
