# Experiments

### Setup

In [None]:
# You can set them inline
import os
os.environ["GROQ_API_KEY"] = ""
os.environ["LANGSMITH_API_KEY"] = ""
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_PROJECT"] = "langsmith-academy"
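
Hard-coding keys in a notebook makes them easy to leak when the notebook is shared. A minimal sketch of an alternative is shown below, assuming you are comfortable being prompted at run time. The helper `set_env_if_missing` is hypothetical (it is not part of the course code) and only asks for a value when the variable is not already set.

```python
import getpass
import os
from typing import Callable


def set_env_if_missing(name: str, prompt: Callable[[str], str] = getpass.getpass) -> bool:
    """Prompt for `name` only when it is unset or empty; return True if we prompted."""
    if os.environ.get(name):
        return False
    os.environ[name] = prompt(f"{name}: ")
    return True


# Usage (commented out so the cell does not block when run non-interactively):
# for key in ("GROQ_API_KEY", "LANGSMITH_API_KEY"):
#     set_env_if_missing(key)
```

Because the prompt function is a parameter, the same helper can be exercised in tests without any interactive input.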

In [2]:
!pip install langchain_openai



In [3]:
!pip install groq



In [4]:
!pip install sentence-transformers



Here is the RAG application that we've been working with throughout this course, adapted here to use Groq for generation and HuggingFace embeddings for retrieval.

In [6]:
import os
import tempfile
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders.sitemap import SitemapLoader
from langchain_community.vectorstores import SKLearnVectorStore
from langchain_community.embeddings import HuggingFaceEmbeddings # Changed from OpenAIEmbeddings
from langsmith import traceable
from groq import Groq
from typing import List
import nest_asyncio

# TODO: Configure this model!
MODEL_NAME = "openai/gpt-oss-120b"
MODEL_PROVIDER = "groq"
APP_VERSION = 1.0
RAG_SYSTEM_PROMPT = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the latest question in the conversation.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.
"""

groq_client = Groq()

def get_vector_db_retriever():
    persist_path = os.path.join(tempfile.gettempdir(), "union.parquet")
    # Changed from OpenAIEmbeddings to HuggingFaceEmbeddings
    embd = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

    # If vector store exists, then load it
    if os.path.exists(persist_path):
        vectorstore = SKLearnVectorStore(
            embedding=embd,
            persist_path=persist_path,
            serializer="parquet"
        )
        return vectorstore.as_retriever(lambda_mult=0)

    # Otherwise, index LangSmith documents and create new vector store
    ls_docs_sitemap_loader = SitemapLoader(web_path="https://docs.smith.langchain.com/sitemap.xml", continue_on_failure=True)
    ls_docs = ls_docs_sitemap_loader.load()

    text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=500, chunk_overlap=0
    )
    doc_splits = text_splitter.split_documents(ls_docs)

    vectorstore = SKLearnVectorStore.from_documents(
        documents=doc_splits,
        embedding=embd,
        persist_path=persist_path,
        serializer="parquet"
    )
    return vectorstore.as_retriever(lambda_mult=0)

nest_asyncio.apply()
retriever = get_vector_db_retriever()

"""
retrieve_documents
- Returns documents fetched from a vectorstore based on the user's question
"""
@traceable(run_type="chain")
def retrieve_documents(question: str):
    return retriever.invoke(question)

"""
generate_response
- Calls `call_groq` to generate a model response after formatting inputs
"""
@traceable(run_type="chain")
def generate_response(question: str, documents):
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    messages = [
        {
            "role": "system",
            "content": RAG_SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": f"Context: {formatted_docs} \n\n Question: {question}"
        }
    ]
    return call_groq(messages)

"""
call_groq
- Returns the chat completion output from Groq
"""
@traceable(
    run_type="llm",
    metadata={
        "ls_provider": MODEL_PROVIDER,
        "ls_model_name": MODEL_NAME
    }
)
def call_groq(messages: List[dict]):  # returns the full chat completion object, not a str
    return groq_client.chat.completions.create(
        model=MODEL_NAME,
        messages=messages,
    )

"""
langsmith_rag
- Calls `retrieve_documents` to fetch documents
- Calls `generate_response` to generate a response based on the fetched documents
- Returns the model response
"""
@traceable(run_type="chain")
def langsmith_rag(question: str):
    documents = retrieve_documents(question)
    response = generate_response(question, documents)
    return response.choices[0].message.content

Fetching pages: 100%|##########| 197/197 [00:24<00:00,  8.13it/s]


### Experiment

Here is a code snippet that should look similar to the starter code!

There are two important components here.

1. We have defined an evaluator, `is_concise_enough`, which scores an output as concise if it is shorter than 1.5 times the length of the reference output.
2. We use a target function to map each dataset example (a dict) to the input that our function `langsmith_rag` takes (a str).

In [7]:
from langsmith import evaluate, Client

client = Client()
dataset_name = "RAG Application Golden Dataset"

def is_concise_enough(reference_outputs: dict, outputs: dict) -> dict:
    score = len(outputs["output"]) < 1.5 * len(reference_outputs["output"])
    return {"key": "is_concise", "score": int(score)}

def target_function(inputs: dict):
    return langsmith_rag(inputs["question"])

evaluate(
    target_function,
    data=dataset_name,
    evaluators=[is_concise_enough],
    experiment_prefix="gpt-4o"
)

View the evaluation results for experiment: 'gpt-4o-60018c4c' at:
https://smith.langchain.com/o/1a41bdfe-bec8-4ccc-a389-3f16500469f2/datasets/ac0db9f7-30c1-40a1-8df0-ba6b04ba4bc2/compare?selectedSessions=c782d797-b744-43ee-9ed4-51b419b74420





Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How can I trace with the @traceable decorator?,"Import `traceable` from the LangSmith SDK, set...",,To trace with the @traceable decorator in Pyth...,1,3.676107,0c6f44c1-5117-4fa7-9b4a-85669e582637,03fb0215-2bac-4a32-9f90-0867a465400f
1,How do I pass metadata in with @traceable?,Pass a dictionary of key‑value pairs to the `m...,,You can pass metadata with the @traceable deco...,1,3.703276,239741ad-5ea5-4196-af7b-3c62f667e122,a2a1e762-2cbf-4f5a-a909-62d937eb311f
2,What is LangSmith used for in three sentences?,"LangSmith is a platform for collecting, storin...",,LangSmith is a platform designed for the devel...,1,2.567167,2da9cb5f-f3f9-4fa0-9684-4d3a30ddbdbc,30381d12-a2ed-4be9-8481-fd83ed004a72
3,Can LangSmith be used for finetuning and model...,No. LangSmith is a platform for LLM observabil...,,"Yes, LangSmith can be used for fine-tuning and...",1,2.636382,823bedbc-f7b2-4c70-a0f8-aae5d9d4749c,7a45b3d9-99a6-4b26-b6b2-76c87bf07aaf
4,Does LangSmith support online evaluation?,The documentation only describes running evalu...,,"Yes, LangSmith supports online evaluation as a...",1,2.80218,92e5f718-cd77-470d-b21e-ff75ba375e31,de8f4dde-7746-4130-8268-8eba40bad8b0
5,Does LangSmith support offline evaluation?,"Yes, LangSmith supports offline evaluation. Af...",,"Yes, LangSmith supports offline evaluation thr...",1,5.124403,c84f0c56-f395-4187-ae59-967f0919c84a,b4376678-fcdd-4a53-8e27-75b650910e34
6,Can LangSmith be used to evaluate agents?,Yes. LangSmith provides built‑in support for e...,,"Yes, LangSmith can be used to evaluate agents....",1,3.874486,cd589a03-d384-48bf-b8de-bf742424d1b3,b8cf5d05-d6b3-4106-84db-7b9fe7a07eef
7,How do I create user feedback with the LangSmi...,Use the LangSmith SDK’s feedback endpoint to p...,,To create user feedback with the LangSmith SDK...,1,0.959755,f8c4d4a1-68f3-4c13-944e-a936014e7227,2d8149ed-d9b3-410b-832c-ddc0f72f39f0
8,How do I set up tracing to LangSmith if I'm us...,Install the LangSmith package (with OpenTeleme...,,To set up tracing to LangSmith while using Lan...,0,1.635614,f9075a65-85e3-4e74-816f-37066a79786c,1ffe4e77-6eff-4b70-99fa-eb04d6c03c40
9,What testing capabilities does LangSmith have?,LangSmith lets you **trace** your LLM app to s...,,LangSmith offers capabilities for creating dat...,1,0.736741,fbc5becc-ece2-4b6b-a77c-178d8506b112,68d32f4c-cdfe-4c8a-9314-cf7edd7cc286
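
To see what `is_concise_enough` actually measures, it can help to call it directly on hand-written dictionaries before running a full experiment. The sketch below repeats the evaluator from the cell above. The sample strings are made up for illustration, and the `{"output": ...}` shape mirrors the `outputs.output` and `reference.output` columns in the results table.

```python
def is_concise_enough(reference_outputs: dict, outputs: dict) -> dict:
    # "Concise" means strictly shorter than 1.5x the reference answer, by character count.
    score = len(outputs["output"]) < 1.5 * len(reference_outputs["output"])
    return {"key": "is_concise", "score": int(score)}


reference = {"output": "Use the @traceable decorator."}
short_answer = {"output": "Decorate the function with @traceable."}
long_answer = {"output": "x" * 100}

print(is_concise_enough(reference, short_answer))  # scored as concise
print(is_concise_enough(reference, long_answer))   # scored as not concise
```

Note that the check only compares lengths, so a short but unhelpful answer such as "I don't know." still receives a score of 1. An evaluator like this is best paired with one that also checks correctness.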


### Modifying your Application

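Now, let's change our model to gpt-3.5-turbo (or another model available from your provider) and see how it performs!

Make this change by updating `MODEL_NAME` in the application cell and re-running it, and then run this code snippet!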

In [8]:
from langsmith import evaluate, Client
from langsmith.schemas import Example, Run

def target_function(inputs: dict):
    return langsmith_rag(inputs["question"])

evaluate(
    target_function,
    data=dataset_name,
    evaluators=[is_concise_enough],
    experiment_prefix="gpt-3.5-turbo"
)

View the evaluation results for experiment: 'gpt-3.5-turbo-88169779' at:
https://smith.langchain.com/o/1a41bdfe-bec8-4ccc-a389-3f16500469f2/datasets/ac0db9f7-30c1-40a1-8df0-ba6b04ba4bc2/compare?selectedSessions=cd5a40d1-9d1a-46e8-9ba1-25e0c3b1d98c





Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How can I trace with the @traceable decorator?,Set `LANGSMITH_TRACING=true` (and `LANGSMITH_A...,,To trace with the @traceable decorator in Pyth...,1,0.815906,0c6f44c1-5117-4fa7-9b4a-85669e582637,7422272e-4d65-41bc-aa54-8a6537b4f56b
1,How do I pass metadata in with @traceable?,Pass a dictionary to the `metadata` argument o...,,You can pass metadata with the @traceable deco...,1,0.858754,239741ad-5ea5-4196-af7b-3c62f667e122,f19a9e96-8cad-4d1f-82ad-b0bbc49c0381
2,What is LangSmith used for in three sentences?,"LangSmith is a platform for collecting, storin...",,LangSmith is a platform designed for the devel...,1,0.966331,2da9cb5f-f3f9-4fa0-9684-4d3a30ddbdbc,f0cc3e5b-628c-43fc-8e9b-54a94cb9bc32
3,Can LangSmith be used for finetuning and model...,"No. LangSmith is an observability, evaluation,...",,"Yes, LangSmith can be used for fine-tuning and...",1,0.663049,823bedbc-f7b2-4c70-a0f8-aae5d9d4749c,cdcc7640-fa97-4007-9dc3-62fb3517d6f1
4,Does LangSmith support online evaluation?,I don’t know.,,"Yes, LangSmith supports online evaluation as a...",1,0.68393,92e5f718-cd77-470d-b21e-ff75ba375e31,7681f295-c479-4316-a7a2-3dd47d9862a0
5,Does LangSmith support offline evaluation?,"Yes, LangSmith supports offline evaluation. Af...",,"Yes, LangSmith supports offline evaluation thr...",1,1.337342,c84f0c56-f395-4187-ae59-967f0919c84a,0aa245a4-ce95-4b56-b9ec-f8fa3fb3c45f
6,Can LangSmith be used to evaluate agents?,Yes. LangSmith provides built‑in support for e...,,"Yes, LangSmith can be used to evaluate agents....",1,0.644802,cd589a03-d384-48bf-b8de-bf742424d1b3,1627cc2f-0979-4e5b-a620-053e455f49fd
7,How do I create user feedback with the LangSmi...,You can create feedback by calling the SDK’s `...,,To create user feedback with the LangSmith SDK...,1,0.917319,f8c4d4a1-68f3-4c13-944e-a936014e7227,df9cfbc8-9ac2-4402-8d3a-369fe356c0b5
8,How do I set up tracing to LangSmith if I'm us...,1. Install the LangSmith SDK (with OpenTelemet...,,To set up tracing to LangSmith while using Lan...,0,6.196731,f9075a65-85e3-4e74-816f-37066a79786c,14f6b920-e20a-4282-9d27-2daf40a64fb0
9,What testing capabilities does LangSmith have?,LangSmith lets you trace an LLM application to...,,LangSmith offers capabilities for creating dat...,1,5.942248,fbc5becc-ece2-4b6b-a77c-178d8506b112,01dd00f3-144e-4957-93b3-b31e006aefe4
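
The experiment prefixes in the two cells above are typed by hand, so they can drift from the model that `MODEL_NAME` actually points to. One possible way to keep them in sync is sketched below. `prefix_for_model` is a hypothetical helper, and replacing `/` with `-` is a cosmetic choice for readability rather than a documented LangSmith requirement.

```python
import re


def prefix_for_model(model_name: str) -> str:
    """Turn a model identifier into a readable experiment prefix."""
    # Collapse anything other than letters, digits, dots, and dashes into a single dash.
    return re.sub(r"[^A-Za-z0-9.\-]+", "-", model_name).strip("-")


print(prefix_for_model("openai/gpt-oss-120b"))
print(prefix_for_model("gpt-3.5-turbo"))
```

Passing `experiment_prefix=prefix_for_model(MODEL_NAME)` to `evaluate` would then label each experiment with the model that was actually configured.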


### Running over Different Pieces of Data

##### Dataset Version

You can run an experiment on a specific version of a dataset in the SDK by passing the `as_of` parameter to `list_examples`.

Let's try running on just our initial dataset.
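
One caveat before running this: if the value passed to `as_of` does not resolve to any examples, the experiment has nothing to iterate over and can fail in a way that is hard to interpret. A small guard, sketched below, materializes the examples first and raises a descriptive error when none are found. `examples_as_of` is a hypothetical helper, and `client` is assumed to be the `langsmith.Client` created earlier. As far as the SDK documentation describes it, `as_of` accepts either a version tag or a `datetime` timestamp, so a timestamp may be a workaround when no tag exists, but check this against the SDK version you have installed.

```python
from datetime import datetime, timezone
from typing import Any, List, Union


def examples_as_of(client: Any, dataset_name: str, as_of: Union[str, datetime]) -> List[Any]:
    """Fetch examples for a dataset version, failing loudly if the version is empty."""
    examples = list(client.list_examples(dataset_name=dataset_name, as_of=as_of))
    if not examples:
        raise ValueError(
            f"No examples found in {dataset_name!r} as of {as_of!r}; "
            "check that the tag exists or try a timestamp instead."
        )
    return examples


# Usage with the real client (commented out because it needs network access):
# cutoff = datetime(2025, 1, 1, tzinfo=timezone.utc)  # hypothetical timestamp
# evaluate(
#     target_function,
#     data=examples_as_of(client, dataset_name, cutoff),
#     evaluators=[is_concise_enough],
# )
```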

In [11]:
evaluate(
    target_function,
    data=client.list_examples(dataset_name=dataset_name, as_of="initial dataset"),   # We use as_of to specify a version
    evaluators=[is_concise_enough],
    experiment_prefix="initial dataset version"
)

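StopIteration: 

This error occurs because no dataset version tagged "initial dataset" exists: the "+ Tag this version" option was not available in the LangSmith UI (Examples section), so the tag could not be created.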

##### Dataset Split

You can run an experiment on a specific split of your dataset. Let's try running on the Crucial Examples split.

In [12]:
evaluate(
    target_function,
    data=client.list_examples(dataset_name=dataset_name, splits=["Crucial Examples"]),  # We pass in a list of Splits
    evaluators=[is_concise_enough],
    experiment_prefix="Crucial Examples split"
)

View the evaluation results for experiment: 'Crucial Examples split-1b99d5fe' at:
https://smith.langchain.com/o/1a41bdfe-bec8-4ccc-a389-3f16500469f2/datasets/ac0db9f7-30c1-40a1-8df0-ba6b04ba4bc2/compare?selectedSessions=6416c0d8-c3b2-4ee8-862c-056b5977da01





Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How do I pass metadata in with @traceable?,You can supply a metadata dictionary directly ...,,You can pass metadata with the @traceable deco...,1,0.901693,239741ad-5ea5-4196-af7b-3c62f667e122,f3c8bfef-1e20-4cc4-9fc8-ac41a957d891
1,Can LangSmith be used for finetuning and model...,"No, LangSmith is designed for observability, e...",,"Yes, LangSmith can be used for fine-tuning and...",1,0.842128,823bedbc-f7b2-4c70-a0f8-aae5d9d4749c,db232b28-32d3-445c-a93e-c6f249aeceea
2,Does LangSmith support offline evaluation?,Yes. LangSmith allows you to run offline evalu...,,"Yes, LangSmith supports offline evaluation thr...",1,0.942875,c84f0c56-f395-4187-ae59-967f0919c84a,d79d7da4-3c99-4921-a530-1db9429fbad4
3,How do I create user feedback with the LangSmi...,Use the LangSmith SDK’s `Client` to call `crea...,,To create user feedback with the LangSmith SDK...,1,0.909015,f8c4d4a1-68f3-4c13-944e-a936014e7227,b6f27a84-f750-4c50-9650-4b1f84a5df4a
4,How do I set up tracing to LangSmith if I'm us...,1. Install the LangSmith package with OpenTele...,,To set up tracing to LangSmith while using Lan...,0,0.885426,f9075a65-85e3-4e74-816f-37066a79786c,97c24d11-c875-4130-82b6-de8bfbac10a9


##### Specific Data Points

You can also run an experiment over individual data points by passing their example IDs.

In [13]:
evaluate(
    target_function,
    data=client.list_examples(
        dataset_name=dataset_name,
        example_ids=[   # We pass in a specific list of example_ids
            # TODO: You will need to paste in your own example ids for this to work!
            "",
            ""
        ]
    ),
    evaluators=[is_concise_enough],
    experiment_prefix="two specific example ids"
)

LangSmithError: Failed to GET /examples in LangSmith API. HTTPError('422 Client Error: unknown for url: https://api.smith.langchain.com/examples?offset=0&id=&id=&inline_s3_urls=True&limit=100&dataset=ac0db9f7-30c1-40a1-8df0-ba6b04ba4bc2', '{"detail":["query.id.0: Input should be a valid UUID, invalid length: expected length 32 for simple format, found 0","query.id.1: Input should be a valid UUID, invalid length: expected length 32 for simple format, found 0"]}')
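
The 422 response above is the API rejecting the two empty placeholder strings, since every entry in `example_ids` must be a valid UUID. A small local check, sketched here with the hypothetical helper `valid_example_ids`, catches this before any request is sent. The sample ID is taken from the `example_id` column of the results tables in this notebook; your own dataset will have different IDs.

```python
import uuid
from typing import Iterable, List


def valid_example_ids(candidates: Iterable[str]) -> List[str]:
    """Keep only strings that parse as UUIDs, normalized to canonical form."""
    ids = []
    for candidate in candidates:
        try:
            ids.append(str(uuid.UUID(candidate)))
        except (ValueError, AttributeError, TypeError):
            continue  # skip blanks and malformed values
    return ids


example_ids = valid_example_ids(["", "239741ad-5ea5-4196-af7b-3c62f667e122"])
if not example_ids:
    raise ValueError("Paste at least one real example ID before running the experiment.")
print(example_ids)
```

The filtered list can then be passed as `example_ids=example_ids` in the `list_examples` call.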

### Other Parameters

##### Repetitions

You can run an experiment several times to check that your results are consistent.

In [14]:
evaluate(
    target_function,
    data=dataset_name,
    evaluators=[is_concise_enough],
    experiment_prefix="two repetitions",
    num_repetitions=2   # This field defaults to 1
)

View the evaluation results for experiment: 'two repetitions-3cc22313' at:
https://smith.langchain.com/o/1a41bdfe-bec8-4ccc-a389-3f16500469f2/datasets/ac0db9f7-30c1-40a1-8df0-ba6b04ba4bc2/compare?selectedSessions=ddb8905c-e8b0-4792-82a9-c7023b603e05





Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How do I pass metadata in with @traceable?,Use the decorator’s `metadata` parameter. For ...,,You can pass metadata with the @traceable deco...,1,0.918834,239741ad-5ea5-4196-af7b-3c62f667e122,a59b3e45-427a-4925-9d31-5b2d206da0e9
1,Can LangSmith be used for finetuning and model...,"No. LangSmith is designed for observability, e...",,"Yes, LangSmith can be used for fine-tuning and...",1,0.831448,823bedbc-f7b2-4c70-a0f8-aae5d9d4749c,155e55f7-dcfb-4a35-8145-dbc6025d4374
2,Does LangSmith support offline evaluation?,Yes. LangSmith allows you to run offline evalu...,,"Yes, LangSmith supports offline evaluation thr...",1,0.89914,c84f0c56-f395-4187-ae59-967f0919c84a,7035eec6-189c-436b-bedb-6a72aa2725e3
3,How do I create user feedback with the LangSmi...,You can create feedback directly from the SDK ...,,To create user feedback with the LangSmith SDK...,1,0.96098,f8c4d4a1-68f3-4c13-944e-a936014e7227,af4602c0-5629-4f2d-a323-02962982979f
4,How do I set up tracing to LangSmith if I'm us...,Install the LangSmith package with OpenTelemet...,,To set up tracing to LangSmith while using Lan...,0,0.835902,f9075a65-85e3-4e74-816f-37066a79786c,93566abb-9945-4a9c-8177-6d2a643d9a66
5,How can I trace with the @traceable decorator?,Import the decorator (`from langsmith import t...,,To trace with the @traceable decorator in Pyth...,1,0.71518,0c6f44c1-5117-4fa7-9b4a-85669e582637,cd15498f-f055-4879-9ec9-3735ea37844d
6,What is LangSmith used for in three sentences?,"LangSmith is a platform that collects, stores,...",,LangSmith is a platform designed for the devel...,1,0.738611,2da9cb5f-f3f9-4fa0-9684-4d3a30ddbdbc,bc367dc3-3a8a-4545-8706-5f0d5ea81b6a
7,Does LangSmith support online evaluation?,"No, LangSmith currently provides offline, batc...",,"Yes, LangSmith supports online evaluation as a...",1,0.668861,92e5f718-cd77-470d-b21e-ff75ba375e31,69adda0f-03b2-465b-84bf-dc7c75fedc25
8,Can LangSmith be used to evaluate agents?,Yes. LangSmith provides built‑in support for e...,,"Yes, LangSmith can be used to evaluate agents....",1,5.864117,cd589a03-d384-48bf-b8de-bf742424d1b3,82bc5f36-4f04-4040-a734-b002257aeba6
9,What testing capabilities does LangSmith have?,LangSmith lets you **trace** an LLM applicatio...,,LangSmith offers capabilities for creating dat...,1,5.977301,fbc5becc-ece2-4b6b-a77c-178d8506b112,5ea3caf4-7257-4106-a028-93fa91d3d014
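
With repetitions, each example is scored more than once, so a natural follow-up is to check whether the scores agree across runs. The sketch below groups feedback scores by `example_id` and reports the mean and whether all repetitions matched. The rows are invented for illustration (they are not results from this notebook), and `summarize_repetitions` is a hypothetical helper that operates on plain dictionaries rather than on the object returned by `evaluate`.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List


def summarize_repetitions(
    rows: Iterable[dict], score_key: str = "feedback.is_concise"
) -> Dict[str, dict]:
    """Group scores by example and report their mean and whether they all agree."""
    scores: Dict[str, List[float]] = defaultdict(list)
    for row in rows:
        scores[row["example_id"]].append(row[score_key])
    return {
        example_id: {"mean": mean(values), "consistent": len(set(values)) == 1}
        for example_id, values in scores.items()
    }


# Made-up rows: two repetitions for each of two examples.
rows = [
    {"example_id": "ex-1", "feedback.is_concise": 1},
    {"example_id": "ex-1", "feedback.is_concise": 1},
    {"example_id": "ex-2", "feedback.is_concise": 1},
    {"example_id": "ex-2", "feedback.is_concise": 0},
]
summary = summarize_repetitions(rows)
print(summary)
```

Examples flagged as inconsistent are good candidates for a closer look, since their score depends on sampling variation rather than on the application change being tested.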


##### Concurrency

You can also kick off concurrent threads of execution to make your experiments finish faster!

In [15]:
evaluate(
    target_function,
    data=dataset_name,
    evaluators=[is_concise_enough],
    experiment_prefix="concurrency",
    max_concurrency=3,  # This defaults to None, so this is an improvement!
)

View the evaluation results for experiment: 'concurrency-c5e22a6e' at:
https://smith.langchain.com/o/1a41bdfe-bec8-4ccc-a389-3f16500469f2/datasets/ac0db9f7-30c1-40a1-8df0-ba6b04ba4bc2/compare?selectedSessions=99abec25-0828-4ef0-a25e-8f40635a0736





ERROR:langsmith.evaluation._runner:Error running target function: Error code: 429 - {'error': {'message': 'Rate limit reached for model `openai/gpt-oss-120b` in organization `org_01k3jxpyfmfc29vdryjk9aj4vr` service tier `on_demand` on tokens per minute (TPM): Limit 8000, Used 7568, Requested 1424. Please try again in 7.438s. Need more tokens? Upgrade to Dev Tier today at https://console.groq.com/settings/billing', 'type': 'tokens', 'code': 'rate_limit_exceeded'}}
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/dist-packages/langsmith/evaluation/_runner.py", line 1924, in _forward
    fn(*args, langsmith_extra=langsmith_extra)
  File "/tmp/ipython-input-862480817.py", line 5, in target_function
    return langsmith_rag(inputs["question"])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/ipython-input-2317952141.py", line 111, in langsmith_rag
    response = generate_response(question, documents)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "

Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id,feedback.wrapper
0,Can LangSmith be used for finetuning and model...,"No. LangSmith is focused on LLM observability,...",,"Yes, LangSmith can be used for fine-tuning and...",1.0,4.362311,823bedbc-f7b2-4c70-a0f8-aae5d9d4749c,3a899d90-fd9a-4bba-ae5c-0094e08225c3,
1,Does LangSmith support offline evaluation?,"Yes, LangSmith supports offline evaluation. Af...",,"Yes, LangSmith supports offline evaluation thr...",1.0,12.029278,c84f0c56-f395-4187-ae59-967f0919c84a,051d5d99-f739-4285-8559-0a3c107e5ef4,
2,How do I pass metadata in with @traceable?,,"RateLimitError(""Error code: 429 - {'error': {'...",You can pass metadata with the @traceable deco...,,14.579454,239741ad-5ea5-4196-af7b-3c62f667e122,f582c7ab-9e7d-4dce-b2f2-223ae9d618ca,
3,How do I create user feedback with the LangSmi...,,"RateLimitError(""Error code: 429 - {'error': {'...",To create user feedback with the LangSmith SDK...,,14.542827,f8c4d4a1-68f3-4c13-944e-a936014e7227,b8ad786d-c6a6-4c09-898b-ad30a2d1a769,
4,How can I trace with the @traceable decorator?,Set the `LANGSMITH_TRACING` environment variab...,,To trace with the @traceable decorator in Pyth...,1.0,5.646962,0c6f44c1-5117-4fa7-9b4a-85669e582637,d4d9b246-bea3-47f0-85fb-3ca4421b3e60,
5,What is LangSmith used for in three sentences?,LangSmith is a platform that stores and proces...,,LangSmith is a platform designed for the devel...,1.0,7.62316,2da9cb5f-f3f9-4fa0-9684-4d3a30ddbdbc,d884fd42-5cde-40e5-b885-4bd50477a2f2,
6,How do I set up tracing to LangSmith if I'm us...,,"RateLimitError(""Error code: 429 - {'error': {'...",To set up tracing to LangSmith while using Lan...,,16.527214,f9075a65-85e3-4e74-816f-37066a79786c,fbc031a6-bb70-46b9-9b68-2c2c4486eed9,
7,What testing capabilities does LangSmith have?,LangSmith lets you trace an application to see...,,LangSmith offers capabilities for creating dat...,1.0,0.796839,fbc5becc-ece2-4b6b-a77c-178d8506b112,9bb9f249-9dde-49e1-8d4e-aa795b44b147,
8,Does LangSmith support online evaluation?,,"RateLimitError(""Error code: 429 - {'error': {'...","Yes, LangSmith supports online evaluation as a...",,9.531488,92e5f718-cd77-470d-b21e-ff75ba375e31,f3d2d135-c1f6-4d65-b482-6d60406c4f71,
9,Can LangSmith be used to evaluate agents?,Yes. LangSmith provides built‑in support for e...,,"Yes, LangSmith can be used to evaluate agents....",1.0,11.074101,cd589a03-d384-48bf-b8de-bf742424d1b3,a9963c2f-5de4-4756-9500-23ffbcb562d3,
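
The rate-limit failures above suggest that three concurrent workers exceeded the tokens-per-minute quota. Besides lowering `max_concurrency`, one option is to retry the failing call with exponential backoff. The following is a minimal sketch: `with_retries` is a hypothetical helper, the delays are arbitrary, and in the notebook you would presumably pass `retry_on=(groq.RateLimitError,)`, assuming the installed Groq SDK exposes that exception class (the `RateLimitError(...)` entries in the error column suggest it does).

```python
import time
from functools import wraps
from typing import Callable, Tuple, Type


def with_retries(
    fn: Callable,
    max_attempts: int = 4,
    base_delay: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> Callable:
    """Wrap `fn` so that matching exceptions trigger a retry with exponential backoff."""

    @wraps(fn)
    def wrapped(*args, **kwargs):
        for attempt in range(max_attempts):
            try:
                return fn(*args, **kwargs)
            except retry_on:
                if attempt == max_attempts - 1:
                    raise  # out of attempts: surface the original error
                sleep(base_delay * 2 ** attempt)  # waits 2s, 4s, 8s, ...

    return wrapped


# Usage in the notebook (commented out; needs the Groq client and network access):
# import groq
# robust_rag = with_retries(langsmith_rag, retry_on=(groq.RateLimitError,))
# def target_function(inputs: dict):
#     return robust_rag(inputs["question"])
```

Retries make the experiment slower but keep rate-limited rows from ending up with empty outputs and missing feedback scores.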


##### Metadata

You can (and should) add metadata to your experiments to make them easier to find in the UI.

In [16]:
evaluate(
    target_function,
    data=dataset_name,
    evaluators=[is_concise_enough],
    experiment_prefix="metadata added",
    metadata={  # We can pass custom metadata for the experiment, such as the model name
        "model_name": MODEL_NAME
    }
)

View the evaluation results for experiment: 'metadata added-dd0195ee' at:
https://smith.langchain.com/o/1a41bdfe-bec8-4ccc-a389-3f16500469f2/datasets/ac0db9f7-30c1-40a1-8df0-ba6b04ba4bc2/compare?selectedSessions=99be9366-472f-475f-829e-9e71e5d38a51





Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How do I pass metadata in with @traceable?,Pass a dictionary to the metadata parameter of...,,You can pass metadata with the @traceable deco...,1,10.678925,239741ad-5ea5-4196-af7b-3c62f667e122,d93155da-0e4f-4dfc-8a29-1c87cf175994
1,Can LangSmith be used for finetuning and model...,"LangSmith is focused on LLM observability, eva...",,"Yes, LangSmith can be used for fine-tuning and...",1,5.032482,823bedbc-f7b2-4c70-a0f8-aae5d9d4749c,19319b61-fc00-4f0b-b8ba-c0ed0b0081ca
2,Does LangSmith support offline evaluation?,Yes. LangSmith allows you to run offline evalu...,,"Yes, LangSmith supports offline evaluation thr...",1,8.216073,c84f0c56-f395-4187-ae59-967f0919c84a,97bd45b3-2a5d-44b7-ac54-53e1dd785000
3,How do I create user feedback with the LangSmi...,You can add feedback from code by calling the ...,,To create user feedback with the LangSmith SDK...,1,8.564137,f8c4d4a1-68f3-4c13-944e-a936014e7227,bb661d2b-d249-4055-964b-d5bbf1195e7f
4,How do I set up tracing to LangSmith if I'm us...,Install the LangSmith client (with OpenTelemet...,,To set up tracing to LangSmith while using Lan...,0,10.270089,f9075a65-85e3-4e74-816f-37066a79786c,74c5d813-9d91-4b8b-86d0-6a5566f0be65
5,How can I trace with the @traceable decorator?,"To trace a function, set `LANGSMITH_TRACING=tr...",,To trace with the @traceable decorator in Pyth...,1,9.045512,0c6f44c1-5117-4fa7-9b4a-85669e582637,59eef109-17e2-4f73-aa3a-64ff53c74ea6
6,What is LangSmith used for in three sentences?,LangSmith is a platform that stores and proces...,,LangSmith is a platform designed for the devel...,1,3.095262,2da9cb5f-f3f9-4fa0-9684-4d3a30ddbdbc,23fcf313-3d6e-4d4e-9efd-5e537d26a091
7,Does LangSmith support online evaluation?,I don’t know.,,"Yes, LangSmith supports online evaluation as a...",1,8.602845,92e5f718-cd77-470d-b21e-ff75ba375e31,fce519bc-486e-422b-b3a3-eb1d8f17d1f7
8,Can LangSmith be used to evaluate agents?,Yes. LangSmith provides built‑in support for e...,,"Yes, LangSmith can be used to evaluate agents....",1,6.873095,cd589a03-d384-48bf-b8de-bf742424d1b3,a9a6843b-f6e4-4399-8c1a-b307809eccf7
9,What testing capabilities does LangSmith have?,LangSmith lets you **trace** each step of your...,,LangSmith offers capabilities for creating dat...,1,6.282547,fbc5becc-ece2-4b6b-a77c-178d8506b112,ee8b78ea-314b-4686-a9d4-404e7c4dd024
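
The application cell already defines `MODEL_PROVIDER` and `APP_VERSION` alongside `MODEL_NAME`, so the same pattern extends naturally to a fuller metadata dictionary. The sketch below is one possible shape. The key names are illustrative choices, not names that LangSmith requires, and `build_experiment_metadata` is a hypothetical helper.

```python
from typing import Any, Dict

# Values copied from the application cell of this notebook.
MODEL_NAME = "openai/gpt-oss-120b"
MODEL_PROVIDER = "groq"
APP_VERSION = 1.0


def build_experiment_metadata(**extra: Any) -> Dict[str, Any]:
    """Combine the application's configuration with any per-experiment extras."""
    metadata = {
        "model_name": MODEL_NAME,
        "model_provider": MODEL_PROVIDER,
        "app_version": APP_VERSION,
    }
    metadata.update(extra)
    return metadata


print(build_experiment_metadata(chunk_size=500))
```

Passing `metadata=build_experiment_metadata(chunk_size=500)` to `evaluate` would record the retrieval chunk size next to the model details, which makes experiments easier to filter later.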
