# Conversational Threads

Many LLM applications have a chatbot-like interface in which the user and the application engage in a multi-turn conversation. To track these conversations, you can use the Threads feature in LangSmith.

This is relevant to our RAG application, which should maintain context from prior conversations with users.

### Setup

In [1]:
# Load environment variables (for example, API keys) from a .env file
from dotenv import load_dotenv
load_dotenv()

True

### Group traces into threads


A Thread is a sequence of traces representing a single conversation. Each response is recorded as its own trace, and the traces are linked by belonging to the same thread.

To associate traces with a thread, pass a special metadata key whose value is the unique identifier for that conversation. The key name should be one of:

- session_id
- thread_id
- conversation_id

The value should be a UUID.
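
To make the grouping rule concrete, here is a minimal sketch of how traces that share a metadata value could be collected into threads. It is an illustration only, not LangSmith's implementation: the `group_into_threads` helper and the dictionary shape used for each trace are assumptions made for this example.

```python
from collections import defaultdict
import uuid

# The three metadata key names listed above
THREAD_KEYS = ("session_id", "thread_id", "conversation_id")

def group_into_threads(traces):
    """Group trace names by the first thread key found in their metadata.

    Hypothetical helper for illustration; traces without any of the
    keys are left out of the result.
    """
    threads = defaultdict(list)
    for trace in traces:
        metadata = trace.get("metadata", {})
        for key in THREAD_KEYS:
            if key in metadata:
                threads[str(metadata[key])].append(trace["name"])
                break
    return dict(threads)

conversation_a = uuid.uuid4()
conversation_b = uuid.uuid4()
traces = [
    {"name": "turn 1", "metadata": {"thread_id": conversation_a}},
    {"name": "turn 2", "metadata": {"thread_id": conversation_a}},
    {"name": "other chat", "metadata": {"session_id": conversation_b}},
    {"name": "untagged", "metadata": {}},
]
threads = group_into_threads(traces)
print(threads[str(conversation_a)])
```

Both turns tagged with the same identifier end up in one group, while the trace with no thread key is not assigned to any thread.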

In [3]:
import uuid
# One ID for the whole conversation; reuse it for every trace in the thread
thread_id = uuid.uuid4()

In [4]:
from langsmith import traceable
from openai import OpenAI
from typing import List
import nest_asyncio
from utils import get_vector_db_retriever

openai_client = OpenAI()
nest_asyncio.apply()
retriever = get_vector_db_retriever()

@traceable(run_type="chain")
def retrieve_documents(question: str):
    return retriever.invoke(question)

@traceable(run_type="chain")
def generate_response(question: str, documents):
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    rag_system_prompt = """You are an assistant for question-answering tasks. 
    Use the following pieces of retrieved context to answer the latest question in the conversation. 
    If you don't know the answer, just say that you don't know. 
    Use three sentences maximum and keep the answer concise.
    """
    messages = [
        {
            "role": "system",
            "content": rag_system_prompt
        },
        {
            "role": "user",
            "content": f"Context: {formatted_docs} \n\n Question: {question}"
        }
    ]
    return call_openai(messages)

@traceable(run_type="llm")
def call_openai(
    messages: List[dict], model: str = "gpt-4o-mini", temperature: float = 0.0
):
    # Returns the full chat completion object; langsmith_rag extracts the text
    return openai_client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )

@traceable(run_type="chain")
def langsmith_rag(question: str):
    documents = retrieve_documents(question)
    response = generate_response(question, documents)
    return response.choices[0].message.content
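
The pipeline above sends only the latest question to the model. One way to carry earlier turns forward is to place prior exchanges in the message list before the new question. The sketch below assumes a hypothetical `build_messages` helper and a `history` list of `(question, answer)` pairs; neither is part of the original application.

```python
from typing import List, Tuple

def build_messages(
    system_prompt: str,
    history: List[Tuple[str, str]],
    question: str,
    formatted_docs: str,
) -> List[dict]:
    """Assemble chat messages: system prompt, prior turns, then the new question."""
    messages = [{"role": "system", "content": system_prompt}]
    for past_question, past_answer in history:
        messages.append({"role": "user", "content": past_question})
        messages.append({"role": "assistant", "content": past_answer})
    # The retrieved context is attached only to the newest question
    messages.append(
        {
            "role": "user",
            "content": f"Context: {formatted_docs} \n\n Question: {question}",
        }
    )
    return messages

history = [("What is a thread?", "A sequence of traces for one conversation.")]
messages = build_messages(
    "You are an assistant for question-answering tasks.",
    history,
    "How do I create one?",
    "(retrieved documents would go here)",
)
print([m["role"] for m in messages])
```

The resulting list could be passed to `call_openai` in place of the two-message list built in `generate_response`.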




### Now let's run our application twice with this thread_id

In [7]:
question = "Trace Number 1 for thread why is name thread_id or session_id or conversation_id preferred over other names for the metadata?"
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"thread_id": thread_id}})
print(ai_answer)

The names "thread_id," "session_id," and "conversation_id" are preferred for metadata because they provide clear and specific identifiers for tracking conversations in a structured manner. Using these standardized names helps ensure consistency and makes it easier to filter and manage traces related to specific interactions. Additionally, they align with common practices in conversation tracking, enhancing interoperability across different systems.


In [8]:
question = "Trace Number 2 for thread what is the purpose of the UUID and why is it better than giving any random value to the session_id?"
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"thread_id": thread_id}})
print(ai_answer)

The UUID serves as a unique identifier for the trace, ensuring that each run can be distinctly recognized and tracked within the system. Using a UUID is better than a random value for the session_id because it guarantees uniqueness and avoids potential collisions, which can lead to confusion in trace management. This structured approach enhances the reliability and integrity of the tracing process.
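
Since every call in a conversation must reuse the same identifier, it can help to keep the ID in one place. The following is a small sketch, assuming a hypothetical `Conversation` class (not part of LangSmith) that creates one UUID per conversation and produces a dictionary in the same shape as the `langsmith_extra` argument used in the calls above.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Conversation:
    """Hypothetical helper that holds one thread ID for a whole conversation."""
    thread_id: uuid.UUID = field(default_factory=uuid.uuid4)

    def langsmith_extra(self) -> dict:
        # Same shape as the langsmith_extra argument used above
        return {"metadata": {"thread_id": self.thread_id}}

first_chat = Conversation()
second_chat = Conversation()

# Every turn of one conversation carries the same ID...
turn_1 = first_chat.langsmith_extra()
turn_2 = first_chat.langsmith_extra()
print(turn_1 == turn_2)

# ...while a separate conversation gets its own ID
print(first_chat.thread_id != second_chat.thread_id)
```

With the application above, a call would then look like `langsmith_rag(question, langsmith_extra=first_chat.langsmith_extra())`.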


### Let's take a look in LangSmith!

In [9]:
import threading
import time
import uuid

# Note: these are OS-level threads from Python's threading module,
# not LangSmith conversation threads.
def worker(worker_index):
    # Generate a unique ID for this task
    task_id = uuid.uuid4()
    print(f"Thread-{worker_index} started | Task ID: {task_id}")
    time.sleep(0.5)  # simulate some work
    print(f"Thread-{worker_index} finished | Task ID: {task_id}")

# Create and start multiple threads
threads = []
for i in range(5):  # launching 5 threads
    t = threading.Thread(target=worker, args=(i,))
    threads.append(t)
    t.start()

# Wait for all threads to finish
for t in threads:
    t.join()

print("All threads completed.")


Thread-0 started | Task ID: 8f15db95-7d5a-41b9-be7b-760dbd71daba
Thread-1 started | Task ID: 12010e73-4651-45f0-b605-29d3a5cd1572
Thread-2 started | Task ID: 8fa10f3a-5770-41bf-b99b-a1377f725d07
Thread-3 started | Task ID: ad3fb0da-ce7b-47d0-b1eb-3acc1558582f
Thread-4 started | Task ID: c4b89704-8348-41e4-9da8-cc475e3f9a39
Thread-0 finished | Task ID: 8f15db95-7d5a-41b9-be7b-760dbd71daba
Thread-1 finished | Task ID: 12010e73-4651-45f0-b605-29d3a5cd1572
Thread-2 finished | Task ID: 8fa10f3a-5770-41bf-b99b-a1377f725d07
Thread-3 finished | Task ID: ad3fb0da-ce7b-47d0-b1eb-3acc1558582f
Thread-4 finished | Task ID: c4b89704-8348-41e4-9da8-cc475e3f9a39
All threads completed.
