# Conversational Threads

Many LLM applications have a chatbot-like interface in which the user and the application hold a multi-turn conversation. To track these conversations, you can use the Threads feature in LangSmith.

This is relevant to our RAG application, which should maintain context from prior conversations with users.

### Setup

In [5]:
# You can set them inline
import os
os.environ["OPENAI_API_KEY"] = ""
os.environ["LANGSMITH_API_KEY"] = ""
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "langsmith-academy"  # If you don't set this, traces will go to the Default project

In [6]:
# Or you can use a .env file
from dotenv import load_dotenv
load_dotenv(dotenv_path="../../.env", override=True)

False
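`load_dotenv` returns `False` when it did not set any variables (for example, when the path does not point to a `.env` file), so it can be worth checking that the keys are actually present before continuing. A minimal sketch, where `REQUIRED_VARS` and `missing_env_vars` are hypothetical names chosen for this example and not part of any library:

```python
import os

# Assumption: these are the two keys this notebook needs; adjust as required.
REQUIRED_VARS = ("OPENAI_API_KEY", "LANGSMITH_API_KEY")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print(f"Missing or empty environment variables: {', '.join(missing)}")
```

An empty string counts as missing here, which catches the placeholder values from the inline setup cell.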

### Group traces into threads


A Thread is a sequence of traces representing a single conversation. Each response is represented as its own trace, but these traces are linked together by being part of the same thread.

To group traces into a thread, attach a special metadata key to each trace. The key name should be one of:

- session_id
- thread_id
- conversation_id

The value is the unique identifier for that conversation and should be a UUID.
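As a rough sketch of what this metadata looks like in practice, the snippet below builds the dictionary that can be passed as `langsmith_extra` when calling a `@traceable` function. `THREAD_KEYS` and `build_thread_metadata` are hypothetical helpers written for this illustration; they are not part of the LangSmith SDK.

```python
import uuid
from typing import Optional

# The three key names listed above.
THREAD_KEYS = ("session_id", "thread_id", "conversation_id")


def build_thread_metadata(thread_id: Optional[str] = None, key: str = "thread_id") -> dict:
    """Return a one-entry metadata dict that identifies a conversation."""
    if key not in THREAD_KEYS:
        raise ValueError(f"key must be one of {THREAD_KEYS}, got {key!r}")
    if thread_id is None:
        # Start a new conversation with a fresh identifier.
        thread_id = str(uuid.uuid4())
    else:
        # Raises ValueError if the supplied string is not a valid UUID.
        uuid.UUID(thread_id)
    return {key: thread_id}


# Shape of the argument passed at call time to a @traceable function.
langsmith_extra = {"metadata": build_thread_metadata()}
print(langsmith_extra)
```

Reusing the same identifier for every turn of a conversation is what links the individual traces into one thread.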

In [7]:
# Imports, setup, and initialization

import uuid
import os
from dotenv import load_dotenv
from langsmith import traceable
from openai import OpenAI
from typing import List, Optional
import nest_asyncio
from utils import get_vector_db_retriever

# Load environment variables from .env file for API keys, project and tracing flags
load_dotenv(dotenv_path="../../.env", override=True)

os.environ["LANGSMITH_TRACING"] = "true"  # Enable tracing globally

# Generate a unique thread_id for conversation linking
thread_id = str(uuid.uuid4())
print(f"Generated thread_id for trace grouping: {thread_id}")

# Initialize OpenAI client and other dependencies
openai_client = OpenAI()
nest_asyncio.apply()
retriever = get_vector_db_retriever()


In [10]:
# Document retriever with error handling

@traceable(run_type="chain")
def retrieve_documents(question: str):
    print(f"Retrieving documents for question: {question}")
    try:
        docs = retriever.invoke(question)
        print(f"Retrieved {len(docs)} documents.")
    except Exception as ex:
        print(f"Error during retrieval: {ex}")
        docs = []

    # Return only the documents: generate_response expects a list of documents.
    # Putting metadata in the return value does not attach it to the trace;
    # thread metadata is supplied at call time via
    # langsmith_extra={"metadata": {"thread_id": thread_id}}.
    return docs



### Now let's run our application twice with this thread_id

In [None]:
# Generate Response Function with Metadata, Debug, and Error Management

@traceable(run_type="chain")
def generate_response(question: str, documents: List, extra_metadata: Optional[dict] = None):
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    print(f"Formatting documents, length: {len(formatted_docs)} characters")

    rag_system_prompt = """You are an assistant for question-answering tasks.
    Use the following pieces of retrieved context to answer the latest question in the conversation.
    If you don't know the answer, just say that you don't know.
    Use three sentences maximum and keep the answer concise.
    """
    messages = [
        {"role": "system", "content": rag_system_prompt},
        {"role": "user", "content": f"Context: {formatted_docs} \n\n Question: {question}"}
    ]

    # Inject or update metadata with thread_id for tracing linkage
    metadata = {"thread_id": thread_id}
    if extra_metadata:
        metadata.update(extra_metadata)

    try:
        response = call_openai(messages, langsmith_extra={"metadata": metadata})
        print("OpenAI response received.")
    except Exception as e:
        print(f"Error calling OpenAI: {e}")
        response = None

    return response


In [None]:
# OpenAI call with timing
import time

@traceable(run_type="llm")
def call_openai(messages: List[dict], model: str = "gpt-4o-mini", temperature: float = 0.0):
    print(f"Calling OpenAI with model {model}...")
    start = time.perf_counter()

    try:
        # langsmith_extra is consumed by the @traceable wrapper, so it is neither
        # declared as a parameter here nor forwarded to the OpenAI client, which
        # would reject it as an unexpected keyword argument.
        response = openai_client.chat.completions.create(
            model=model,
            messages=messages,
            temperature=temperature,
        )
        duration = time.perf_counter() - start
        print(f"OpenAI call duration: {duration:.2f} seconds")
    except Exception as error:
        print(f"OpenAI API call failed with error: {error}")
        raise

    return response
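
With the pieces defined, the application can be run twice under the same `thread_id`. The sketch below shows the calling pattern only: `retrieve_documents_stub` and `langsmith_rag_sketch` are hypothetical stand-ins, so the block runs without an OpenAI key or a vector store, and the fallback decorator exists purely so the example still executes when `langsmith` is not installed. Traces are sent to LangSmith only when tracing is enabled, as in the setup cells.

```python
import uuid

try:
    from langsmith import traceable
except ImportError:
    # Stand-in so the sketch still runs without langsmith installed:
    # it records nothing and only discards the `langsmith_extra` keyword.
    def traceable(*_args, **_kwargs):
        def decorator(func):
            def wrapper(*args, langsmith_extra=None, **kwargs):
                return func(*args, **kwargs)
            return wrapper
        return decorator


@traceable(run_type="chain")
def retrieve_documents_stub(question: str) -> list:
    """Hypothetical stand-in for the notebook's retriever."""
    return [f"(stub document about: {question})"]


@traceable(run_type="chain")
def langsmith_rag_sketch(question: str) -> str:
    """Hypothetical top-level entry point; each call is one trace."""
    documents = retrieve_documents_stub(question)
    # A real pipeline would call generate_response(question, documents) here.
    return f"Answered {question!r} using {len(documents)} document(s)"


# One identifier per conversation, reused for every turn.
thread_id = str(uuid.uuid4())

first_answer = langsmith_rag_sketch(
    "How do I add metadata to a trace?",
    langsmith_extra={"metadata": {"thread_id": thread_id}},
)
second_answer = langsmith_rag_sketch(
    "How do I add tags to a trace?",
    langsmith_extra={"metadata": {"thread_id": thread_id}},
)
print(first_answer)
print(second_answer)
```

Passing the metadata on the top-level call, instead of inside each helper, keeps the thread identifier on the trace itself, with the nested retrieval and generation steps recorded underneath it.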


### Let's take a look in LangSmith!