# Tracing for Different Types of Runs

### Types of Runs

LangSmith supports several types of Runs. You specify a Run's type with the run_type argument of the @traceable decorator. The run types are:

- LLM: Invokes an LLM
- Retriever: Retrieves documents from databases or other sources
- Tool: Executes actions with function calls
- Chain: Default type; combines multiple Runs into a larger process
- Prompt: Hydrates a prompt to be used with an LLM
- Parser: Extracts structured data
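
As a rough sketch of how these types fit together, the example below nests a prompt, an LLM, and a parser step inside a chain. The function names and the canned model reply are hypothetical, tracing is switched off so the code runs offline, and the fallback decorator is only there so the sketch also runs where langsmith is not installed.

```python
import json
import os

# Tracing is disabled so this sketch runs offline; set it to "true" to send traces.
os.environ["LANGSMITH_TRACING"] = "false"

try:
    from langsmith import traceable
except ImportError:
    # Illustrative no-op stand-in, used only when langsmith is not installed.
    def traceable(*dargs, **dkwargs):
        if len(dargs) == 1 and callable(dargs[0]) and not dkwargs:
            return dargs[0]
        return lambda fn: fn


@traceable(run_type="prompt")
def build_prompt(question: str) -> list:
    # Hydrates the messages that will be sent to the model.
    return [
        {"role": "system", "content": "Answer with a JSON object containing an 'answer' key."},
        {"role": "user", "content": question},
    ]


@traceable(run_type="llm")
def fake_llm(messages: list) -> dict:
    # Hypothetical stand-in for a real model call; returns a canned reply.
    return {"role": "assistant", "content": json.dumps({"answer": "42"})}


@traceable(run_type="parser")
def parse_answer(message: dict) -> dict:
    # Extracts structured data from the model's text output.
    return json.loads(message["content"])


@traceable(run_type="chain")
def answer_question(question: str) -> dict:
    # The chain run is the parent; the three calls below become its child runs.
    messages = build_prompt(question)
    reply = fake_llm(messages)
    return parse_answer(reply)


result = answer_question("What is six times seven?")
print(result)
```

With tracing enabled, the three inner calls should appear as child runs of the answer_question chain, each rendered according to its run type.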

### Setup

In [None]:
# You can set them inline!
import os
os.environ["OPENAI_API_KEY"] = ""
os.environ["LANGSMITH_API_KEY"] = ""
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "langsmith-academy"

In [1]:
# Or you can use a .env file
from dotenv import load_dotenv
load_dotenv(dotenv_path="../../.env", override=True)

True

### LLM Runs for Chat Models

LangSmith provides special rendering and processing for LLM traces. To make the most of this feature, log your LLM traces in a specific format.

For chat-style models, inputs must be a list of messages in OpenAI-compatible format, represented as Python dictionaries or TypeScript objects. Each message must contain the keys role and content.
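
The input requirement can be checked before a function is traced. The helper below is a hypothetical convenience, not part of LangSmith; it only tests for the list structure and the role and content keys described above.

```python
def validate_chat_messages(messages):
    """Return a list of problems; an empty list means the input looks well formed."""
    if not isinstance(messages, list):
        return ["messages must be a list"]
    problems = []
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {i} is not a dictionary")
            continue
        for key in ("role", "content"):
            if key not in message:
                problems.append(f"message {i} is missing '{key}'")
    return problems


good = [
    {"role": "system", "content": "You are a helpful programming tutor."},
    {"role": "user", "content": "What is a list comprehension?"},
]
bad = [{"role": "user"}, "just a string"]

print(validate_chat_messages(good))
print(validate_chat_messages(bad))
```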

The output is accepted in any of the following formats:

- A dictionary/object that contains the key choices with a value that is a list of dictionaries/objects. Each dictionary/object must contain the key message, which maps to a message object with the keys role and content.
- A dictionary/object that contains the key message with a value that is a message object with the keys role and content.
- A tuple/array of two elements, where the first element is the role and the second element is the content.
- A dictionary/object that contains the keys role and content.

The input to your function should be named messages.
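
To make the four accepted output shapes concrete, the sketch below reduces each of them to the same role/content dictionary. normalize_llm_output is a hypothetical helper written for illustration; LangSmith performs its own handling of these shapes, and this code does not claim to mirror it.

```python
def normalize_llm_output(output):
    """Collapse any of the four accepted output shapes into {"role", "content"}."""
    # Shape 3: a two-element tuple/list of (role, content).
    if isinstance(output, (tuple, list)) and len(output) == 2:
        role, content = output
        return {"role": role, "content": content}
    if isinstance(output, dict):
        # Shape 1: {"choices": [{"message": {...}}]}; only the first choice is used.
        if "choices" in output:
            return normalize_llm_output(output["choices"][0]["message"])
        # Shape 2: {"message": {...}}
        if "message" in output:
            return normalize_llm_output(output["message"])
        # Shape 4: {"role": ..., "content": ...}
        if "role" in output and "content" in output:
            return {"role": output["role"], "content": output["content"]}
    raise ValueError("unrecognized LLM output shape")


message = {"role": "assistant", "content": "Hello!"}
shapes = [
    {"choices": [{"message": message}]},
    {"message": message},
    ("assistant", "Hello!"),
    message,
]
normalized = [normalize_llm_output(shape) for shape in shapes]
print(normalized)
```

All four inputs collapse to the same dictionary, which is why any of them can be returned from a function traced with run_type="llm".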

You can also provide the following metadata fields to help LangSmith identify the model and calculate costs. If you use LangChain or the OpenAI wrapper, these fields are populated automatically.
- ls_provider: The provider of the model, e.g. "openai", "anthropic", etc.
- ls_model_name: The name of the model, e.g. "gpt-4o-mini", "claude-3-opus-20240229", etc.

In [8]:
from langsmith import traceable

# Custom input for a tutoring assistant scenario
inputs = [
  {"role": "system", "content": "You are a helpful programming tutor specializing in Python."},
  {"role": "user", "content": "Can you explain what a dictionary comprehension is?"},
]

# Custom output for the tutoring scenario
output = {
  "choices": [
      {
          "message": {
              "role": "assistant", 
              "content": "A dictionary comprehension is a concise way to create dictionaries in Python. It follows the syntax {key: value for item in iterable}. For example, {x: x**2 for x in range(5)} creates a dictionary mapping numbers to their squares."
          }
      }
  ]
}

# Can also use one of:
# output = {
#     "message": {
#         "role": "assistant",
#         "content": "A dictionary comprehension is a concise way to create dictionaries..."
#     }
# }
#
# output = {
#     "role": "assistant",
#     "content": "A dictionary comprehension is a concise way to create dictionaries..."
# }
#
# output = ["assistant", "A dictionary comprehension is a concise way to create dictionaries..."]

@traceable(
  # Set run_type="llm" and pass ls_provider and ls_model_name as metadata
  run_type="llm",
  metadata={"ls_provider": "custom_tutor", "ls_model_name": "python_tutor_v1", "subject": "programming"}
)
def programming_tutor_chat(messages: list):
  return output

programming_tutor_chat(inputs)

{'choices': [{'message': {'role': 'assistant',
    'content': 'A dictionary comprehension is a concise way to create dictionaries in Python. It follows the syntax {key: value for item in iterable}. For example, {x: x**2 for x in range(5)} creates a dictionary mapping numbers to their squares.'}}]}

### Handling Streaming LLM Runs

For streaming, you can "reduce" the outputs into the same format as the non-streaming version. This is currently only supported in Python.

In [9]:
def _reduce_study_chunks(chunks: list):
    all_text = "".join([chunk["choices"][0]["message"]["content"] for chunk in chunks])
    return {"choices": [{"message": {"content": all_text, "role": "assistant"}}]}

@traceable(
    run_type="llm",
    metadata={"ls_provider": "study_assistant", "ls_model_name": "streaming_tutor", "mode": "interactive"},
    # reduce_fn combines the streamed chunks into a single output for the trace
    reduce_fn=_reduce_study_chunks,
)
def my_streaming_study_assistant(messages: list):
    # Simulate streaming response for study assistance
    for chunk in ["Great question! ", "Let me break this down for you step by step..."]:
        yield {
            "choices": [
                {
                    "message": {
                        "content": chunk,
                        "role": "assistant",
                    }
                }
            ]
        }

list(
    my_streaming_study_assistant(
        [
            {"role": "system", "content": "You are a helpful study assistant. Break down complex topics clearly."},
            {"role": "user", "content": "machine learning concepts"},
        ],
    )
)

[{'choices': [{'message': {'content': 'Great question! ',
     'role': 'assistant'}}]},
 {'choices': [{'message': {'content': 'Let me break this down for you step by step...',
     'role': 'assistant'}}]}]
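
Calling the generator returns the individual chunks, as shown above. To see what the reducer makes of them, the sketch below restates it as a standalone function (named reduce_chunks here for illustration) and applies it to the same two chunks. Presumably this combined value, not the list of chunks, is what appears as the run's output in LangSmith.

```python
def reduce_chunks(chunks: list) -> dict:
    # Concatenate the content of every chunk, in order, into one assistant message.
    text = "".join(chunk["choices"][0]["message"]["content"] for chunk in chunks)
    return {"choices": [{"message": {"role": "assistant", "content": text}}]}


chunks = [
    {"choices": [{"message": {"role": "assistant", "content": "Great question! "}}]},
    {"choices": [{"message": {"role": "assistant", "content": "Let me break this down for you step by step..."}}]},
]

reduced = reduce_chunks(chunks)
print(reduced["choices"][0]["message"]["content"])
```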

### Retriever Runs + Documents

Many LLM applications look up documents from vector databases, knowledge graphs, or other types of indexes. Retriever traces log the documents that the retriever returns. LangSmith provides special rendering for retrieval steps in traces to make it easier to understand and diagnose retrieval issues. For retrieval steps to render correctly, do the following:

1. Annotate the retriever step with run_type="retriever".
2. Return a list of Python dictionaries or TypeScript objects from the retriever step. Each dictionary should contain the following keys:
    - page_content: The text of the document.
    - type: This should always be "Document".
    - metadata: A Python dictionary or TypeScript object containing metadata about the document. This metadata will be displayed in the trace.

In [10]:
from langsmith import traceable

def _convert_study_docs(results):
  return [
      {
          "page_content": r,
          "type": "Document", # This is the correct format for document type
          "metadata": {"subject": "computer_science", "difficulty": "beginner", "source": "custom_db"}
      }
      for r in results
  ]

@traceable(
    # Set run_type="retriever" so LangSmith renders the returned documents
    run_type="retriever",
    metadata={"database_type": "educational_content", "retrieval_method": "semantic_search"}
)
def retrieve_study_materials(query):
  # Custom retriever returning educational content about programming
  # In production, this could be a real educational database or knowledge base
  contents = [
      "Python functions are reusable blocks of code that perform specific tasks",
      "Variables in Python are containers for storing data values", 
      "Loops allow you to execute code repeatedly based on certain conditions"
  ]
  return _convert_study_docs(contents)

retrieve_study_materials("Python programming basics")

[{'page_content': 'Python functions are reusable blocks of code that perform specific tasks',
  'type': 'Document',
  'metadata': {'subject': 'computer_science',
   'difficulty': 'beginner',
   'source': 'custom_db'}},
 {'page_content': 'Variables in Python are containers for storing data values',
  'type': 'Document',
  'metadata': {'subject': 'computer_science',
   'difficulty': 'beginner',
   'source': 'custom_db'}},
 {'page_content': 'Loops allow you to execute code repeatedly based on certain conditions',
  'type': 'Document',
  'metadata': {'subject': 'computer_science',
   'difficulty': 'beginner',
   'source': 'custom_db'}}]
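
The example above attaches the same metadata to every document. In practice each result often carries its own metadata, such as a source or a similarity score. The sketch below assumes a hypothetical retriever backend that yields (text, metadata) pairs; the function name and the metadata keys are illustrative only.

```python
def to_retriever_documents(results):
    """Convert (text, metadata) pairs into the document format described above."""
    return [
        {
            "page_content": text,
            "type": "Document",  # Always the literal string "Document"
            "metadata": dict(metadata),  # Copy so the caller's dict is not shared
        }
        for text, metadata in results
    ]


raw_results = [
    ("Python functions are reusable blocks of code", {"source": "chapter_1", "score": 0.91}),
    ("Loops allow you to execute code repeatedly", {"source": "chapter_3", "score": 0.78}),
]

documents = to_retriever_documents(raw_results)
for document in documents:
    print(document["metadata"]["source"], document["page_content"])
```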

### Tool Calling

LangSmith has custom rendering for Tool Calls made by the model to make it clear when provided tools are being used.
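
Before involving a live model, it may help to see the tool-call round trip in isolation. The sketch below uses a plain dictionary that mirrors the shape of an OpenAI Chat Completions tool call (the SDK itself returns objects with the same fields), looks the tool up in a registry, and builds the role "tool" message that is sent back to the model. TOOL_REGISTRY, run_tool_call, and the call ID are hypothetical names chosen for illustration.

```python
import json


def get_coding_difficulty(topic: str, language: str) -> str:
    # Simplified copy of the tool used in this section.
    difficulty_map = {"variables": "beginner", "loops": "intermediate", "recursion": "advanced"}
    return difficulty_map.get(topic.lower(), "intermediate")


# Maps the tool name the model sees to the Python function that implements it.
TOOL_REGISTRY = {"get_coding_difficulty": get_coding_difficulty}


def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool call and wrap the result in a tool message."""
    name = tool_call["function"]["name"]
    # The model supplies arguments as a JSON-encoded string.
    arguments = json.loads(tool_call["function"]["arguments"])
    result = TOOL_REGISTRY[name](**arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],  # Links the result to the originating call
        "content": json.dumps({"difficulty": result}),
    }


example_call = {
    "id": "call_123",
    "type": "function",
    "function": {
        "name": "get_coding_difficulty",
        "arguments": json.dumps({"topic": "recursion", "language": "Python"}),
    },
}

tool_message = run_tool_call(example_call)
print(tool_message)
```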

In [11]:
from langsmith import traceable
from openai import OpenAI
from typing import List, Optional
import json

openai_client = OpenAI()

@traceable(
  # Set run_type="tool" so LangSmith renders this call as a tool run
  run_type="tool",
  metadata={"tool_category": "educational", "subject": "programming"}
)
def get_coding_difficulty(topic: str, language: str):
    """Tool to assess the difficulty level of a programming topic"""
    difficulty_map = {
        "variables": "beginner",
        "functions": "beginner", 
        "loops": "intermediate",
        "recursion": "advanced",
        "algorithms": "advanced"
    }
    return difficulty_map.get(topic.lower(), "intermediate")

@traceable(run_type="llm", metadata={"purpose": "educational_assistant"})
def call_openai_tutor(
    messages: List[dict], tools: Optional[List[dict]]
):
  # Returns the full ChatCompletion object, not a plain string
  return openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.2,  # Low temperature keeps the responses mostly deterministic
    tools=tools
  )

@traceable(run_type="chain", metadata={"workflow": "educational_assessment"})
def assess_learning_topic(inputs, tools):
  response = call_openai_tutor(inputs, tools)
  tool_call = response.choices[0].message.tool_calls[0]
  tool_call_args = json.loads(tool_call.function.arguments)
  topic = tool_call_args["topic"]
  language = tool_call_args["language"]
  # Call the tool once so the trace contains a single tool run
  difficulty = get_coding_difficulty(topic, language)
  tool_response_message = {
    "role": "tool",
    "content": json.dumps({
        "topic": topic,
        "language": language,
        "difficulty": difficulty,
        "recommendation": f"This is a {difficulty} level topic in {language}"
    }),
    "tool_call_id": tool_call.id
  }
  inputs.append(response.choices[0].message)
  inputs.append(tool_response_message)
  output = call_openai_tutor(inputs, None)
  return output

# Custom tools for educational assessment
educational_tools = [
    {
      "type": "function",
      "function": {
        "name": "get_coding_difficulty",
        "description": "Assess the difficulty level of a programming topic for learning purposes",
        "parameters": {
          "type": "object",
          "properties": {
            "topic": {
              "type": "string",
              "description": "The programming topic to assess, e.g., variables, functions, loops"
            },
            "language": {
              "type": "string",
              "enum": ["Python", "JavaScript", "Java", "C++"],
              "description": "The programming language context"
            }
          },
          "required": ["topic", "language"]
        }
      }
    }
]

# Custom input for educational assessment
educational_inputs = [
  {"role": "system", "content": "You are an educational assistant that helps assess programming topics."},
  {"role": "user", "content": "I want to learn about recursion in Python. Is it suitable for my level?"},
]

assess_learning_topic(educational_inputs, educational_tools)

ChatCompletion(id='chatcmpl-CMwPcZ6fLmx5pLFUqYAqOFvxhW3Xo', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="Recursion in Python is considered an advanced topic. It involves functions that call themselves to solve problems, which can be quite powerful but also complex. If you're comfortable with basic programming concepts like functions, loops, and data structures, you might be ready to tackle recursion. However, if you're still getting familiar with these foundational concepts, it might be beneficial to strengthen those skills first before diving into recursion. Would you like some resources or a brief overview of recursion?", refusal=None, role='assistant', annotations=[], audio=None, function_call=None, tool_calls=None))], created=1759583788, model='gpt-4o-mini-2024-07-18', object='chat.completion', service_tier='default', system_fingerprint='fp_560af6e559', usage=CompletionUsage(completion_tokens=91, prompt_tokens=104, total_token