# Tracing for Different Types of Runs

### Types of Runs

LangSmith supports several types of Runs. You specify a Run's type with the run_type argument of the @traceable decorator. The types of Runs are:

- LLM: Invokes an LLM
- Retriever: Retrieves documents from databases or other sources
- Tool: Executes actions with function calls
- Chain: Default type; combines multiple Runs into a larger process
- Prompt: Hydrates a prompt to be used with an LLM
- Parser: Extracts structured data
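The rest of this notebook shows the LLM, Retriever, Tool, and Chain types, so here is a minimal sketch of the Prompt and Parser types. The function names (build_prompt, parse_key_value, pipeline) and the stand-in "completion" are hypothetical, chosen for illustration only. The try/except fallback is also an assumption, added so the sketch runs without langsmith installed; in that case no tracing happens.

```python
try:
    from langsmith import traceable
except ImportError:
    # Fallback so this sketch still runs without langsmith installed:
    # a decorator factory that returns the function unchanged (no tracing).
    def traceable(**kwargs):
        def wrap(fn):
            return fn
        return wrap


@traceable(run_type="prompt")
def build_prompt(question: str) -> list:
    # Hydrate a fixed template with the user's question.
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": question},
    ]


@traceable(run_type="parser")
def parse_key_value(text: str) -> dict:
    # Extract structured data from a "key: value" string.
    key, _, value = text.partition(":")
    return {key.strip(): value.strip()}


@traceable(run_type="chain")
def pipeline(question: str) -> dict:
    messages = build_prompt(question)
    # Stand-in for a model call; a real application would invoke an LLM here.
    fake_completion = "question_length: " + str(len(messages[1]["content"]))
    return parse_key_value(fake_completion)


result = pipeline("What is a Run?")
```

When tracing is enabled, the prompt and parser steps should appear as child Runs nested under the chain Run.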

### Setup

In [1]:
# You can set the required environment variables inline with os.environ
import os

In [2]:
# Or you can use a .env file
from dotenv import load_dotenv
load_dotenv(dotenv_path="../.env", override=True)

True

### LLM Runs for Chat Models

LangSmith provides special rendering and processing for LLM traces. To make the most of this feature, log your LLM traces in a specific format.

For chat-style models, inputs must be a list of messages in OpenAI-compatible format, represented as Python dictionaries or TypeScript objects. Each message must contain the keys role and content.

The output is accepted in any of the following formats:

- A dictionary/object that contains the key choices with a value that is a list of dictionaries/objects. Each dictionary/object must contain the key message, which maps to a message object with the keys role and content.
- A dictionary/object that contains the key message with a value that is a message object with the keys role and content.
- A tuple/array of two elements, where the first element is the role and the second element is the content.
- A dictionary/object that contains the keys role and content.

The input to your function should be named messages.
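To make the four accepted output shapes concrete, here is a small sketch that reduces each of them to a single role/content dictionary. The helper normalize_chat_output is hypothetical and written only for illustration. It is not part of LangSmith and does not claim to mirror how LangSmith parses outputs internally.

```python
from typing import Any, Dict


def normalize_chat_output(output: Any) -> Dict[str, str]:
    """Reduce any of the four output shapes listed above to {"role", "content"}."""
    # Tuple/array shape: a two-element (role, content) pair.
    if isinstance(output, (tuple, list)):
        if len(output) != 2:
            raise ValueError("expected a (role, content) pair")
        role, content = output
        return {"role": role, "content": content}
    if not isinstance(output, dict):
        raise TypeError(f"unsupported output type: {type(output).__name__}")
    if "choices" in output:
        # {"choices": [{"message": {...}}, ...]}: take the first choice.
        message = output["choices"][0]["message"]
    elif "message" in output:
        # {"message": {...}}
        message = output["message"]
    else:
        # The message object itself.
        message = output
    if "role" not in message or "content" not in message:
        raise ValueError("message must contain the keys 'role' and 'content'")
    return {"role": message["role"], "content": message["content"]}


expected = {"role": "assistant", "content": "Hi!"}
variants = [
    {"choices": [{"message": expected}]},
    {"message": expected},
    ("assistant", "Hi!"),
    expected,
]
results = [normalize_chat_output(v) for v in variants]
```

All four variants carry the same information, so any one of them can be returned from a function traced with run_type="llm".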

You can also provide the following metadata fields to help LangSmith identify the model and calculate costs. If you use LangChain or the OpenAI wrapper, these fields are populated automatically.

- ls_provider: The provider of the model, e.g. "openai", "anthropic", etc.
- ls_model_name: The name of the model, e.g. "gpt-4o-mini", "claude-3-opus-20240229", etc.

In [7]:
from langsmith import traceable

inputs = [
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "I'd like to book a table for two."},
]

output = {
  "choices": [
      {
          "message": {
              "role": "assistant",
              "content": "Sure, what time would you like to book the table for?"
          }
      }
  ]
}

# Can also use one of:
# output = {
#     "message": {
#         "role": "assistant",
#         "content": "Sure, what time would you like to book the table for?"
#     }
# }
#
# output = {
#     "role": "assistant",
#     "content": "Sure, what time would you like to book the table for?"
# }
#
# output = ["assistant", "Sure, what time would you like to book the table for?"]

model_metadata = {
    "ls_provider": "Meta Llama",
    "ls_model_name": "llama 3.2",
}

@traceable(
    run_type="llm",
    metadata=model_metadata,
)
def chat_model(messages: list):
  return output

chat_model(inputs)

{'choices': [{'message': {'role': 'assistant',
    'content': 'Sure, what time would you like to book the table for?'}}]}

### Handling Streaming LLM Runs

For streaming, you can pass a reduce_fn to @traceable to "reduce" the streamed chunks into the same output format as the non-streaming version. This is currently only supported in Python.

In [8]:
def _reduce_chunks(chunks: list):
    all_text = "".join([chunk["choices"][0]["message"]["content"] for chunk in chunks])
    return {"choices": [{"message": {"content": all_text, "role": "assistant"}}]}

@traceable(
    run_type="llm",
    metadata=model_metadata,
    reduce_fn=_reduce_chunks,
)
def my_streaming_chat_model(messages: list):
    for chunk in ["Hello, " + messages[1]["content"]]:
        yield {
            "choices": [
                {
                    "message": {
                        "content": chunk,
                        "role": "assistant",
                    }
                }
            ]
        }

list(
    my_streaming_chat_model(
        [
            {"role": "system", "content": "You are a helpful assistant. Please greet the user."},
            {"role": "user", "content": "polly the parrot"},
        ],
    )
)

[{'choices': [{'message': {'content': 'Hello, polly the parrot',
     'role': 'assistant'}}]}]
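The generator above yields a single chunk, so the effect of the reducer is easy to miss. The following sketch runs the same reduction logic over several chunks without LangSmith. The names fake_stream and reduce_chunks are hypothetical stand-ins for a streaming model and for the function passed as reduce_fn.

```python
def reduce_chunks(chunks: list) -> dict:
    # Concatenate the content of every streamed chunk, in arrival order.
    all_text = "".join(chunk["choices"][0]["message"]["content"] for chunk in chunks)
    return {"choices": [{"message": {"content": all_text, "role": "assistant"}}]}


def fake_stream(pieces):
    # Stand-in for a streaming model: yields one chunk per text piece.
    for piece in pieces:
        yield {"choices": [{"message": {"content": piece, "role": "assistant"}}]}


chunks = list(fake_stream(["Hello, ", "polly ", "the parrot"]))
reduced = reduce_chunks(chunks)
```

The caller still receives the three individual chunks. The reduced dictionary is the single combined output that would be logged for the Run.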

### Retriever Runs + Documents

Many LLM applications look up documents from vector databases, knowledge graphs, or other types of indexes. Retriever traces log the documents that the retriever returns. LangSmith provides special rendering for retrieval steps in traces to make it easier to understand and diagnose retrieval issues. For retrieval steps to render correctly, follow these two steps:

1. Annotate the retriever step with run_type="retriever".
2. Return a list of Python dictionaries or TypeScript objects from the retriever step. Each dictionary should contain the following keys:
    - page_content: The text of the document.
    - type: This should always be "Document".
    - metadata: A Python dictionary or TypeScript object containing metadata about the document. This metadata will be displayed in the trace.

In [9]:
from langsmith import traceable

def _convert_docs(results):
  return [
      {
          "page_content": r,
          "type": "Document",  # The value must always be "Document"
          "metadata": {"foo": "bar"}
      }
      for r in results
  ]

@traceable(
    run_type="retriever"
)
def retrieve_docs(query):
  # Retriever returning hardcoded dummy documents.
  # In production, this could be a real vector database or other document index.
  contents = ["Document contents 1", "Document contents 2", "Document contents 3"]
  return _convert_docs(contents)

retrieve_docs("User query")

[{'page_content': 'Document contents 1',
  'type': 'Document',
  'metadata': {'foo': 'bar'}},
 {'page_content': 'Document contents 2',
  'type': 'Document',
  'metadata': {'foo': 'bar'}},
 {'page_content': 'Document contents 3',
  'type': 'Document',
  'metadata': {'foo': 'bar'}}]
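A misspelled key is easy to miss, so it can help to check the shape of the documents before returning them from a retriever step. The sketch below is a hypothetical helper, not part of LangSmith. It assumes page_content is a string, which matches "the text of the document" above.

```python
def check_retriever_docs(docs) -> list:
    """Return a list of problems; an empty list means the shape looks right."""
    if not isinstance(docs, list):
        return ["retriever output must be a list"]
    problems = []
    for i, doc in enumerate(docs):
        if not isinstance(doc, dict):
            problems.append(f"item {i}: not a dictionary")
            continue
        if not isinstance(doc.get("page_content"), str):
            problems.append(f"item {i}: page_content must be a string")
        if doc.get("type") != "Document":
            problems.append(f'item {i}: type must be "Document"')
        if not isinstance(doc.get("metadata"), dict):
            problems.append(f"item {i}: metadata must be a dictionary")
    return problems


good_docs = [
    {"page_content": "Document contents 1", "type": "Document", "metadata": {"foo": "bar"}}
]
# Wrong key name: "Type" instead of "type".
bad_docs = [{"page_content": "Document contents 1", "Type": "Document", "metadata": {}}]

good_problems = check_retriever_docs(good_docs)
bad_problems = check_retriever_docs(bad_docs)
```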

### Tool Calling

LangSmith has custom rendering for tool calls made by the model, which makes it clear when the provided tools are being used.

In [46]:
from langsmith import traceable
from typing import List, Optional
import json
from langchain_core.messages import AIMessage
# llmhelper is assumed to be a local helper module that returns an Ollama chat model.
from llmhelper import get_chatmodel_from_ollama

llm_client = get_chatmodel_from_ollama()

@traceable(run_type="tool")
def get_current_temperature(location: str, unit: str):
    # Exact match only: any other unit string (e.g. "F") falls through to 17.
    return 65 if unit == "Fahrenheit" else 17

@traceable(run_type="llm")
def call_model(
    messages: List[dict], tools: Optional[List[dict]]
) -> AIMessage:
    # Bind the tool definitions to the model, then invoke it with the messages.
    llm_client_with_tools = llm_client.bind(tools=tools)
    response = llm_client_with_tools.invoke(messages)
    print("response --------->> \n ", response)
    return response

@traceable(run_type="chain")
def ask_about_the_weather(inputs, tools):
    # First call: the model decides whether to call a tool.
    response = call_model(inputs, tools)
    print("response ->> ", response.response_metadata["model"])
    print("response ->> ", response.response_metadata["message"])
    # The parsed tool call is a dict with the keys "name", "args", and "id".
    tool_call = response.tool_calls[0]
    location = tool_call["args"]["location"]
    unit = tool_call["args"]["unit"]
    # Run the tool and package its result as a tool message.
    tool_input_message = {
        "role": "tool",
        "content": json.dumps({
            "location": location,
            "unit": unit,
            "temperature": get_current_temperature(location, unit),
        }),
        "tool_call_id": tool_call["id"],
    }
    tool_msg = response.response_metadata["message"]
    print("message type ===> ", tool_msg.role)
    print("message type ===> ", tool_msg.tool_calls[0])
    inputs.append(tool_input_message)
    # Second call: the model answers using the tool result.
    output = call_model(inputs, None)
    return output

tools = [
    {
      "type": "function",
      "function": {
        "name": "get_current_temperature",
        "description": "Get the current temperature for a specific location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g., San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["Celsius", "Fahrenheit"],
              "description": "The temperature unit to use. Infer this from the user's location."
            }
          },
          "required": ["location", "unit"]
        }
      }
    }
]
inputs = [
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "What is the weather today in New York City IN Fahrenheit?"},
]

ask_about_the_weather(inputs, tools)

response --------->> 
  content='' additional_kwargs={} response_metadata={'model': 'llama3.2:3b', 'created_at': '2024-12-21T10:49:35.567642Z', 'done': True, 'done_reason': 'stop', 'total_duration': 958233167, 'load_duration': 27014667, 'prompt_eval_count': 209, 'prompt_eval_duration': 444000000, 'eval_count': 28, 'eval_duration': 485000000, 'message': Message(role='assistant', content='', images=None, tool_calls=[ToolCall(function=Function(name='get_current_temperature', arguments={'location': 'New York City, NY', 'unit': 'F'}))])} id='run-cca3fefd-24b2-47a6-ba96-aaf0c22e958a-0' tool_calls=[{'name': 'get_current_temperature', 'args': {'location': 'New York City, NY', 'unit': 'F'}, 'id': '6b53284d-78dd-4978-95a8-568395e8e8d8', 'type': 'tool_call'}] usage_metadata={'input_tokens': 209, 'output_tokens': 28, 'total_tokens': 237}
response ->>  llama3.2:3b
response ->>  role='assistant' content='' images=None tool_calls=[ToolCall(function=Function(name='get_current_temperature', arguments={

AIMessage(content='The current temperature in New York City is 17°F.', additional_kwargs={}, response_metadata={'model': 'llama3.2:3b', 'created_at': '2024-12-21T10:49:35.923588Z', 'done': True, 'done_reason': 'stop', 'total_duration': 341666083, 'load_duration': 11713875, 'prompt_eval_count': 71, 'prompt_eval_duration': 117000000, 'eval_count': 13, 'eval_duration': 212000000, 'message': Message(role='assistant', content='', images=None, tool_calls=None)}, id='run-692bc2dd-cca3-4a13-998b-5acaeb2f3781-0', usage_metadata={'input_tokens': 71, 'output_tokens': 13, 'total_tokens': 84})
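In the recorded run above, the model passed the unit as 'F' rather than 'Fahrenheit', so the exact string comparison in get_current_temperature fell through to 17 and the final answer reads 17°F. One way to guard against this, sketched here under assumptions, is to normalize the unit before comparing and to dispatch tool calls through a small registry. UNIT_ALIASES, normalize_unit, TOOL_REGISTRY, run_tool_call, and the id "call-1" are all hypothetical names introduced for illustration. The sketch needs neither LangSmith nor a model.

```python
import json

# Assumed aliases a model might emit for each unit.
UNIT_ALIASES = {
    "f": "Fahrenheit",
    "fahrenheit": "Fahrenheit",
    "c": "Celsius",
    "celsius": "Celsius",
}


def normalize_unit(unit: str) -> str:
    # Map loose spellings to the two names used in the tool schema.
    key = unit.strip().lower()
    if key not in UNIT_ALIASES:
        raise ValueError(f"unknown temperature unit: {unit!r}")
    return UNIT_ALIASES[key]


def get_current_temperature(location: str, unit: str) -> int:
    # Same hardcoded values as the example above, with a tolerant comparison.
    return 65 if normalize_unit(unit) == "Fahrenheit" else 17


TOOL_REGISTRY = {"get_current_temperature": get_current_temperature}


def run_tool_call(tool_call: dict) -> dict:
    """Execute one parsed tool call and wrap the result as a tool message."""
    tool = TOOL_REGISTRY[tool_call["name"]]
    result = tool(**tool_call["args"])
    return {
        "role": "tool",
        "content": json.dumps({**tool_call["args"], "temperature": result}),
        "tool_call_id": tool_call["id"],
    }


# Same dict shape as the tool_calls entry in the printed response above.
tool_call = {
    "name": "get_current_temperature",
    "args": {"location": "New York City, NY", "unit": "F"},
    "id": "call-1",  # hypothetical id
    "type": "tool_call",
}
tool_message = run_tool_call(tool_call)
```

With the unit normalized, the 'F' argument resolves to the Fahrenheit branch instead of silently falling through to the Celsius value.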