# Tracing vanilla code with LangSmith

In [21]:
from dotenv import load_dotenv
import os
print(load_dotenv('../.env'))
print(os.environ['LANGSMITH_PROJECT'])
os.environ['LANGSMITH_TRACING']="true"
os.environ['USER_AGENT'] = 'myagent'

True
agentic-ops
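
The cells below read several settings with `os.environ[...]`, which raises a bare `KeyError` when one is missing. A small up-front check can make that failure easier to read. This is only a sketch: `missing_env_vars` is a hypothetical helper, and the two API key names are assumptions about what the `.env` file contains, since the notebook never shows them.

```python
import os

# Variables this notebook reads directly, plus the API keys that the OpenAI
# and LangSmith clients are assumed to pick up from the environment.
REQUIRED_VARS = [
    "LANGSMITH_PROJECT",
    "EMBEDDING_MODEL",
    "OPENAI_MODEL",
    "OPENAI_API_KEY",     # assumption: not shown in the notebook
    "LANGSMITH_API_KEY",  # assumption: not shown in the notebook
]

def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing a plain dict as `env` makes the helper easy to try without touching the real environment.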


# Ingest Documents

In [8]:
from langchain_community.document_loaders import HuggingFaceDatasetLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

def preprocess_dataset(docs_list):
    text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=700,
        chunk_overlap=50,
        disallowed_special=()
    )
    doc_splits = text_splitter.split_documents(docs_list)
    return doc_splits

# Load the dataset (https://huggingface.co/datasets/m-ric/transformers_documentation_en),
# keep the first 50 documents, and split them into chunks.
transformers_doc = HuggingFaceDatasetLoader("m-ric/transformers_documentation_en", "text")
docs = preprocess_dataset(transformers_doc.load()[:50])
docs[0]

Document(metadata={'filename': 'perf_train_cpu_many.md'}, page_content='"\\n\\n# Efficient Training on Multiple CPUs\\n\\nWhen training on a single CPU is too slow, we can use multiple CPUs. This guide focuses on PyTorch-based DDP enabling distributed CPU training efficiently.\\n\\n## Intel\\u00ae oneCCL Bindings for PyTorch\\n\\n[Intel\\u00ae oneCCL](https://github.com/oneapi-src/oneCCL) (collective communications library) is a library for efficient distributed deep learning training implementing such collectives like allreduce, allgather, alltoall. For more information on oneCCL, please refer to the [oneCCL documentation](https://spec.oneapi.com/versions/latest/elements/oneCCL/source/index.html) and [oneCCL specification](https://spec.oneapi.com/versions/latest/elements/oneCCL/source/index.html).\\n\\nModule `oneccl_bindings_for_pytorch` (`torch_ccl` before version 1.12)  implements PyTorch C10D ProcessGroup API and can be dynamically loaded as external ProcessGroup and only works on
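
To make the `chunk_size` and `chunk_overlap` arguments concrete, here is a simplified sliding-window sketch over a list of tokens. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter breaks on separators and then merges pieces, so real chunks are not fixed-width windows); `split_with_overlap` is a hypothetical function that only illustrates what the two parameters mean.

```python
def split_with_overlap(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of `chunk_size` that share `chunk_overlap` tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks

tokens = list(range(10))
chunks = split_with_overlap(tokens, chunk_size=4, chunk_overlap=1)
print(chunks)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

Each window repeats the last token of the previous one, which is the role the 50-token overlap plays for the 700-token chunks above.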

In [9]:
from langchain_openai import OpenAIEmbeddings
from langchain_qdrant import QdrantVectorStore

vectorstore = QdrantVectorStore.from_documents(
    docs,
    OpenAIEmbeddings(model=os.environ["EMBEDDING_MODEL"]),
    location=":memory:",
    collection_name="documentations",
)
retriever = vectorstore.as_retriever()

In [10]:
retriever.invoke("what is huggingface ? ")[0]

Document(metadata={'filename': 'philosophy.md', '_id': '4efe0c99d8744e159d41bdded1c0530b', '_collection_name': 'documentations'}, page_content="provided by the official authors\\n\\n    of said architecture.\\n\\n  - The code is usually as close to the original code base as possible which means some PyTorch code may be not as\\n\\n    *pytorchic* as it could be as a result of being converted TensorFlow code and vice versa.\\n\\nA few other goals:\\n\\n- Expose the models' internals as consistently as possible:\\n\\n  - We give access, using a single API, to the full hidden-states and attention weights.\\n\\n  - The preprocessing classes and base model APIs are standardized to easily switch between models.\\n\\n- Incorporate a subjective selection of promising tools for fine-tuning and investigating these models:\\n\\n  - A simple and consistent way to add new tokens to the vocabulary and embeddings for fine-tuning.\\n\\n  - Simple ways to mask and prune Transformer heads.\\n\\n- Easily

### Simple RAG application

In [11]:
from langsmith import traceable
from openai import OpenAI
from typing import List

MODEL_PROVIDER = "openai"
MODEL_NAME = os.environ["OPENAI_MODEL"]
APP_VERSION = 1.0
RAG_SYSTEM_PROMPT = """You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context to answer the latest question in the conversation. 
If you don't know the answer, just say that you don't know. 
Use three sentences maximum and keep the answer concise.
"""

@traceable
def retrieve_documents(question: str):
    return retriever.invoke(question)

@traceable
def generate_response(question: str, documents):
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    messages = [
        {
            "role": "system",
            "content": RAG_SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": f"Context: {formatted_docs} \n\n Question: {question}"
        }
    ]
    return call_openai(messages)

@traceable
def call_openai(
    messages: List[dict], model: str = MODEL_NAME, temperature: float = 0.0
):
    # Returns the full ChatCompletion object; langsmith_rag extracts the text.
    return OpenAI().chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )

@traceable
def langsmith_rag(question: str):
    documents = retrieve_documents(question)
    response = generate_response(question, documents)
    return response.choices[0].message.content


In [12]:
question = "What is huggingface used for?"
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"sample_runtime": "metadata added"}})
print(ai_answer)

Hugging Face is used for providing easy access to state-of-the-art machine learning models, especially in natural language processing (NLP), computer vision, and audio tasks. It offers tools and libraries to download, fine-tune, and deploy pretrained models for tasks like translation, text generation, summarization, and more. The platform also enables sharing and collaboration on models through the Hugging Face Hub.


https://smith.langchain.com/

# See tracing UI

# Tracing with metadata

In [13]:

@traceable(metadata={"vectordb": "qdrant"})
def retrieve_documents(question: str):
    return retriever.invoke(question)

@traceable
def generate_response(question: str, documents):
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    messages = [
        {
            "role": "system",
            "content": RAG_SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": f"Context: {formatted_docs} \n\n Question: {question}"
        }
    ]
    return call_openai(messages)

@traceable(metadata={"model_name": MODEL_NAME, "model_provider": MODEL_PROVIDER})
def call_openai(
    messages: List[dict], model: str = MODEL_NAME, temperature: float = 0.0
):
    # Returns the full ChatCompletion object; langsmith_rag extracts the text.
    return OpenAI().chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )

@traceable
def langsmith_rag(question: str):
    documents = retrieve_documents(question)
    response = generate_response(question, documents)
    return response.choices[0].message.content

In [14]:
question = "What is huggingface used for?"
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"sample_runtime": "metadata added"}})
print(ai_answer)

Hugging Face is used for providing easy access to state-of-the-art machine learning models, especially in natural language processing (NLP) tasks like translation, text generation, and language modeling. It offers a unified library (Transformers) that allows users to download, fine-tune, and deploy pretrained models for various tasks using simple APIs. The platform also supports sharing and collaborating on models through the Hugging Face Hub.


# Use run_type to mark a run as a distinct category in LangSmith

In [15]:
from langsmith import traceable

inputs = [
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "I'd like to book a table for two."},
]

output = {
  "choices": [
      {
          "message": {
              "role": "assistant",
              "content": "Sure, what time would you like to book the table for?"
          }
      }
  ]
}

@traceable(run_type="llm")
def chat_model(messages: list):
    # Stand-in for a real model call: always returns the canned output above.
    return output

chat_model(inputs)

{'choices': [{'message': {'role': 'assistant',
    'content': 'Sure, what time would you like to book the table for?'}}]}
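
LangSmith's documentation describes special rendering for `llm` runs whose inputs and outputs follow an OpenAI-style message shape like the one above. If a custom model returns plain text, a small wrapper can produce that shape before the value is returned from the traced function. This is a sketch under that assumption; `as_chat_output` is a hypothetical helper, not part of LangSmith.

```python
def as_chat_output(text, role="assistant"):
    """Wrap plain text in the OpenAI-style `choices` structure used above."""
    return {"choices": [{"message": {"role": role, "content": text}}]}

wrapped = as_chat_output("Sure, what time would you like to book the table for?")
print(wrapped["choices"][0]["message"]["content"])
```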

# Threads

In [22]:
import uuid
# Store the ID as a string so it serializes cleanly as trace metadata.
thread_id = str(uuid.uuid4())

In [23]:
from langsmith import traceable
from openai import OpenAI
from typing import List

openai_client = OpenAI()

@traceable(run_type="chain")
def retrieve_documents(question: str):
    return retriever.invoke(question)

@traceable(run_type="chain")
def generate_response(question: str, documents):
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    rag_system_prompt = """You are an assistant for question-answering tasks. 
    Use the following pieces of retrieved context to answer the latest question in the conversation. 
    If you don't know the answer, just say that you don't know. 
    Use three sentences maximum and keep the answer concise.
    """
    messages = [
        {
            "role": "system",
            "content": rag_system_prompt
        },
        {
            "role": "user",
            "content": f"Context: {formatted_docs} \n\n Question: {question}"
        }
    ]
    return call_openai(messages)

@traceable(run_type="llm")
def call_openai(
    messages: List[dict], model: str = MODEL_NAME, temperature: float = 0.0
):
    # Pass the `model` argument through instead of ignoring it.
    return openai_client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )

@traceable(run_type="chain")
def langsmith_rag(question: str):
    documents = retrieve_documents(question)
    response = generate_response(question, documents)
    return response.choices[0].message.content


https://smith.langchain.com/

## Running twice to show two turns

In [24]:
question = "What is huggingface ?"
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"thread_id": thread_id}})
print(ai_answer)

Hugging Face is a company and open-source platform that provides state-of-the-art machine learning models and tools, especially for natural language processing, computer vision, audio, and multimodal tasks. Their Transformers library supports easy downloading, training, and sharing of pretrained models across frameworks like PyTorch, TensorFlow, and JAX. The Hugging Face Model Hub allows users to share, version, and collaborate on models with a GitHub-like interface.


In [25]:
question = "what are datasets in huggingface then ? "
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"thread_id": thread_id}})
print(ai_answer)

In Hugging Face, datasets are collections of data (such as text, images, or audio) organized for machine learning tasks, often stored on disk to efficiently handle large sizes. They can be easily loaded, processed, and transformed using the 🤗 Datasets library, which provides tools for mapping preprocessing functions, batching, and streaming data without inflating memory usage. These datasets are commonly used for training, evaluating, and testing models in NLP, vision, and audio tasks.
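
The shared `thread_id` groups the two traces into one thread in LangSmith, but `langsmith_rag` as written sends only the current question to the model, so the second call does not see the first turn. One way to carry earlier turns forward is to build the message list from an explicit history. The sketch below assumes the caller keeps that history itself; `build_messages` and the `history` list are hypothetical names, and the shortened prompt and answer strings are placeholders.

```python
def build_messages(system_prompt, history, question, context):
    """Assemble a chat payload: system prompt, earlier turns, then the new question."""
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(history)  # earlier {"role": ..., "content": ...} turns, oldest first
    messages.append(
        {"role": "user", "content": f"Context: {context} \n\n Question: {question}"}
    )
    return messages

history = [
    {"role": "user", "content": "What is huggingface ?"},
    {"role": "assistant", "content": "Hugging Face is a company and open-source platform."},
]
messages = build_messages(
    "You are an assistant for question-answering tasks.",
    history,
    "what are datasets in huggingface then ?",
    context="(retrieved documents go here)",
)
print([m["role"] for m in messages])  # ['system', 'user', 'assistant', 'user']
```

The resulting list could be passed to `call_openai` in place of the two-message list built in `generate_response`.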


## Traces UI tab