# Tracing Basics

### Setup

Make sure you set your environment variables, including your OpenAI and LangSmith API keys.

In [2]:
# You can set them inline
import os
os.environ["OPENAI_API_KEY"] = ""
os.environ["LANGSMITH_API_KEY"] = ""
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "langsmith-academy"

In [7]:
# Or you can use a .env file
from dotenv import load_dotenv
load_dotenv(dotenv_path="../../.env", override=True)

True
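
Empty values are easy to miss, and the inline cell above leaves both API keys as empty strings until you fill them in. Below is a minimal sketch of a check you could run before continuing. The helper name `missing_env_vars` and the `REQUIRED_VARS` list are assumptions made for illustration; they are not part of LangSmith or the course utilities.

```python
import os

# Assumed list of variables this notebook relies on; adjust to your setup.
REQUIRED_VARS = ["OPENAI_API_KEY", "LANGSMITH_API_KEY", "LANGSMITH_TRACING"]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing or empty:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.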

### Tracing with @traceable

The @traceable decorator is the simplest way to log traces with the LangSmith Python SDK: decorate any function with @traceable, and each call to that function is recorded as a run.

The decorator works by creating a run tree for you each time the function is called and inserting it within the current trace. The function inputs, name, and other information are then streamed to LangSmith. Whether the function returns a response or raises an error, that result is also added to the tree, and the update is patched to LangSmith so you can detect and diagnose sources of errors. All of this happens on a background thread to avoid blocking your app's execution.
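
To make the idea of a run tree concrete, here is a toy decorator that records each call's name, inputs, output or error, and nested child calls. It is a simplified illustration under stated assumptions, not LangSmith's implementation: `toy_traceable` and `finished_roots` are hypothetical names, nothing is sent over the network, and there is no background thread.

```python
import contextvars
import functools

# Tracks the run that is currently executing so nested calls can attach to it.
_current_run = contextvars.ContextVar("current_run", default=None)

# Stand-in for the tracing backend: completed top-level runs end up here.
finished_roots = []


def toy_traceable(func):
    """Record name, inputs, output or error, and child runs for each call."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        run = {
            "name": func.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "outputs": None,
            "error": None,
            "children": [],
        }
        parent = _current_run.get()
        if parent is not None:
            # Insert this run within the current trace.
            parent["children"].append(run)
        token = _current_run.set(run)
        try:
            run["outputs"] = func(*args, **kwargs)
            return run["outputs"]
        except Exception as exc:
            # Errors are recorded on the run, then re-raised unchanged.
            run["error"] = repr(exc)
            raise
        finally:
            _current_run.reset(token)
            if parent is None:
                finished_roots.append(run)

    return wrapper


@toy_traceable
def double(x):
    return x * 2


@toy_traceable
def pipeline(x):
    return double(x) + 1


result = pipeline(3)
root = finished_roots[-1]
print(root["name"], [child["name"] for child in root["children"]], root["outputs"])
# prints: pipeline ['double'] 7
```

The nesting falls out of the call stack: because `double` runs while `pipeline` is the current run, it is attached as a child without either function knowing about the other's tracing.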

In [13]:
# TODO: Import traceable
from langsmith import traceable
from openai import OpenAI
from typing import List
import nest_asyncio
from utils import get_vector_db_retriever

MODEL_PROVIDER = "openai"
MODEL_NAME = "gpt-4o-mini"
APP_VERSION = 2.0  # Updated version for custom implementation
RAG_SYSTEM_PROMPT = """You are a helpful AI assistant specializing in educational content. 
Use the following pieces of retrieved context to answer the latest question in the conversation. 
If you don't know the answer, just say that you don't know. 
Keep your responses clear, informative, and limited to four sentences maximum.
"""

openai_client = OpenAI()
nest_asyncio.apply()
retriever = get_vector_db_retriever()

# TODO: Set up tracing for each function
@traceable
def retrieve_documents(question: str):
    return retriever.invoke(question)   # NOTE: This is a LangChain vector db retriever, so this .invoke() call will be traced automatically

@traceable
def generate_response(question: str, documents):
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    messages = [
        {
            "role": "system",
            "content": RAG_SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": f"Context: {formatted_docs} \n\n Question: {question}"
        }
    ]
    return call_openai(messages)

@traceable
def call_openai(
    messages: List[dict], model: str = MODEL_NAME, temperature: float = 0.1  # Low temperature keeps answers mostly deterministic
):  # Returns the full ChatCompletion object; the caller extracts the message text
    return openai_client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )

@traceable
def my_custom_rag_system(question: str):  # Renamed function to be more personal
    documents = retrieve_documents(question)
    response = generate_response(question, documents)
    return response.choices[0].message.content

@traceable handles the RunTree lifecycle for you!

In [14]:
question = "What are the best practices for implementing traceable decorators in LangSmith?"
ai_answer = my_custom_rag_system(question)
print(ai_answer)

Best practices for implementing traceable decorators in LangSmith include using the `@traceable` decorator to log traces from your functions easily. Ensure that your environment variables, specifically `LANGSMITH_TRACING` set to 'true' and `LANGSMITH_API_KEY` set to your API key, are configured correctly to enable tracing. When implementing custom chat models, make sure your inputs and outputs conform to the required format recognized by LangSmith, including the correct structure for messages. Additionally, if wrapping synchronous functions, use the `await` keyword to ensure traces are logged properly.


##### Let's take a look in LangSmith!

### Adding Metadata

LangSmith supports sending arbitrary metadata along with traces.

Metadata is a collection of key-value pairs that can be attached to runs. You can use it to store additional information about a run, such as the version of the application that generated it, the environment in which it was generated, or anything else you want to associate with it. As with tags, you can use metadata to filter runs in the LangSmith UI and to group runs together for analysis.
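
The filtering and grouping described above can be sketched in plain Python. The run records below are hypothetical stand-ins written by hand, not data fetched from LangSmith, and `filter_runs` and `group_runs` are illustrative helpers rather than SDK functions.

```python
from collections import defaultdict

# Hand-written stand-ins for traced runs; real runs carry many more fields.
runs = [
    {"name": "retrieve_documents", "metadata": {"vectordb": "sklearn"}},
    {"name": "call_openai", "metadata": {"model_provider": "openai", "model_name": "gpt-4o-mini"}},
    {"name": "call_openai", "metadata": {"model_provider": "openai", "model_name": "gpt-4o"}},
]


def filter_runs(runs, key, value):
    """Keep only the runs whose metadata has key set to value."""
    return [run for run in runs if run["metadata"].get(key) == value]


def group_runs(runs, key):
    """Group run names by the value stored under a metadata key."""
    groups = defaultdict(list)
    for run in runs:
        if key in run["metadata"]:
            groups[run["metadata"][key]].append(run["name"])
    return dict(groups)


print(filter_runs(runs, "vectordb", "sklearn"))
print(group_runs(runs, "model_name"))
```

Choosing a small, consistent set of keys (for example a model name and an application version) is what makes this kind of slicing useful later.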

In [15]:
from langsmith import traceable

@traceable(
    # TODO: Add Metadata
    metadata={"vectordb": "sklearn", "custom_implementation": "akshat_version"}  # Added custom metadata
)
def retrieve_documents(question: str):
    return retriever.invoke(question)

@traceable(
    metadata={"response_style": "educational", "max_sentences": 4}  # Added custom metadata for response generation
)
def generate_response(question: str, documents):
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    messages = [
        {
            "role": "system",
            "content": RAG_SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": f"Context: {formatted_docs} \n\n Question: {question}"
        }
    ]
    return call_openai(messages)

@traceable(
    # TODO: Add Metadata
    metadata={"model_name": MODEL_NAME, "model_provider": MODEL_PROVIDER, "temperature_setting": "low_creativity"}  # Enhanced metadata
)
def call_openai(
    messages: List[dict], model: str = MODEL_NAME, temperature: float = 0.1
):  # Returns the full ChatCompletion object; the caller extracts the message text
    return openai_client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )

@traceable(
    metadata={"system_version": APP_VERSION, "creator": "akshat"}  # Added personal metadata
)
def my_custom_rag_system(question: str):
    documents = retrieve_documents(question)
    response = generate_response(question, documents)
    return response.choices[0].message.content

In [16]:
question = "How can I effectively use metadata in LangSmith for better debugging and analysis?"
ai_answer = my_custom_rag_system(question)
print(ai_answer)

To effectively use metadata in LangSmith for better debugging and analysis, you should define a series of transformations that collect relevant metadata from your traces and add them to your dataset. This allows you to analyze the context of your experiments, such as configurations and performance metrics. Additionally, you can view and compare multiple experiments associated with your dataset, which helps identify patterns or issues. Utilizing these features will enhance your ability to diagnose problems and optimize your application.


You can also add metadata at runtime by passing a langsmith_extra keyword argument when you call a traced function!

In [17]:
question = "What are the advantages of adding runtime metadata versus static metadata in tracing?"
ai_answer = my_custom_rag_system(question, langsmith_extra={"metadata": {"runtime_query_type": "comparison", "difficulty_level": "intermediate"}})
print(ai_answer)

Adding runtime metadata allows for dynamic information to be captured during the execution of a trace, such as the environment details or specific conditions at the time of the run. This can provide valuable context for analyzing performance issues or understanding the behavior of the application under different circumstances. In contrast, static metadata typically contains fixed information, like version numbers or configuration settings, which may not reflect the real-time state of the application. Utilizing both types of metadata can enhance the depth of analysis and improve the observability of the system.
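
Runtime metadata is most useful when its values are computed per call. Here is a minimal sketch of a helper that builds such a dictionary; `build_runtime_metadata`, its keys, and the `user_tier` parameter are all hypothetical and chosen only for illustration. LangSmith does not require any particular keys.

```python
def build_runtime_metadata(question: str, user_tier: str = "free") -> dict:
    """Assemble per-call metadata from values known only at call time."""
    lowered = question.lower()
    return {
        "question_length": len(question),
        "is_comparison": " versus " in lowered or " vs " in lowered,
        "user_tier": user_tier,
    }


question = "What are the advantages of adding runtime metadata versus static metadata in tracing?"
extra = {"metadata": build_runtime_metadata(question)}
print(extra)

# The dictionary is passed the same way as in the cell above:
# my_custom_rag_system(question, langsmith_extra=extra)
```

Keeping the values to simple strings, numbers, and booleans makes them straightforward to filter on afterwards.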


##### Let's take a look in LangSmith!