# Experiments

### Setup

In [1]:
# Or you can use a .env file
from dotenv import load_dotenv
load_dotenv(override=True)

True

Here is the RAG application that we've been working with throughout this course.

In [4]:
import os
import tempfile
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders.sitemap import SitemapLoader
from langchain_community.vectorstores import SKLearnVectorStore
from langchain_openai import OpenAIEmbeddings
from langsmith import traceable
from openai import OpenAI
from typing import List
import nest_asyncio

# TODO: Configure this model!
MODEL_NAME = "gpt-3.5-turbo"
MODEL_PROVIDER = "openai"
APP_VERSION = 1.0
RAG_SYSTEM_PROMPT = """You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context to answer the latest question in the conversation. 
If you don't know the answer, just say that you don't know. 
Use three sentences maximum and keep the answer concise.
"""

openai_client = OpenAI()

def get_vector_db_retriever():
    persist_path = os.path.join(tempfile.gettempdir(), "union.parquet")
    embd = OpenAIEmbeddings()

    # If vector store exists, then load it
    if os.path.exists(persist_path):
        vectorstore = SKLearnVectorStore(
            embedding=embd,
            persist_path=persist_path,
            serializer="parquet"
        )
        return vectorstore.as_retriever(lambda_mult=0)

    # Otherwise, index LangSmith documents and create new vector store
    ls_docs_sitemap_loader = SitemapLoader(web_path="https://docs.smith.langchain.com/sitemap.xml", continue_on_failure=True)
    ls_docs = ls_docs_sitemap_loader.load()

    text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=500, chunk_overlap=0
    )
    doc_splits = text_splitter.split_documents(ls_docs)

    vectorstore = SKLearnVectorStore.from_documents(
        documents=doc_splits,
        embedding=embd,
        persist_path=persist_path,
        serializer="parquet"
    )
    vectorstore.persist()
    return vectorstore.as_retriever(lambda_mult=0)

nest_asyncio.apply()
retriever = get_vector_db_retriever()

"""
retrieve_documents
- Returns documents fetched from a vectorstore based on the user's question
"""
@traceable(run_type="chain")
def retrieve_documents(question: str):
    return retriever.invoke(question)

"""
generate_response
- Calls `call_openai` to generate a model response after formatting inputs
"""
@traceable(run_type="chain")
def generate_response(question: str, documents):
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    messages = [
        {
            "role": "system",
            "content": RAG_SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": f"Context: {formatted_docs} \n\n Question: {question}"
        }
    ]
    return call_openai(messages)

"""
call_openai
- Returns the chat completion output from OpenAI
"""
@traceable(
    run_type="llm",
    metadata={
        "ls_provider": MODEL_PROVIDER,
        "ls_model_name": MODEL_NAME
    }
)
def call_openai(messages: List[dict]):
    return openai_client.chat.completions.create(
        model=MODEL_NAME,
        messages=messages,
    )

"""
langsmith_rag
- Calls `retrieve_documents` to fetch documents
- Calls `generate_response` to generate a response based on the fetched documents
- Returns the model response
"""
@traceable(run_type="chain")
def langsmith_rag(question: str):
    documents = retrieve_documents(question)
    response = generate_response(question, documents)
    return response.choices[0].message.content


### Experiment

Here is a code snippet that should look similar to what you see in the starter code!

There are a few important components here.

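1. We have defined an evaluator, `is_concise_enough`, which scores a response by comparing its length to the reference output.
2. We use a target function to map each dataset example's inputs (a dict) to the input that our function `langsmith_rag` takes (a str).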
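To see what the evaluator does before involving LangSmith, you can call it directly on plain dictionaries. The sketch below repeats `is_concise_enough` from this notebook and feeds it hand-written, hypothetical outputs. It assumes that a string returned by the target function ends up under the `"output"` key, which is consistent with the `outputs.output` column in the results.

```python
def is_concise_enough(reference_outputs: dict, outputs: dict) -> dict:
    # Pass when the answer is shorter than 1.5x the reference answer (in characters)
    score = len(outputs["output"]) < 1.5 * len(reference_outputs["output"])
    return {"key": "is_concise", "score": int(score)}

# Hypothetical example data, written by hand for illustration
reference = {"output": "Yes, there is a Javascript LangSmith SDK!"}
short_answer = {"output": "Yes, there is a JavaScript SDK for LangSmith."}
long_answer = {"output": "Yes. " + "LangSmith ships SDKs for several languages. " * 3}

short_result = is_concise_enough(reference, short_answer)
long_result = is_concise_enough(reference, long_answer)
print(short_result, long_result)
```

Note that this evaluator only measures length, so a very short answer such as "I don't know." also passes, regardless of whether it is correct.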

In [3]:
from langsmith import evaluate, Client

client = Client()
dataset_name = "RAG Application Golden Dataset"

def is_concise_enough(reference_outputs: dict, outputs: dict) -> dict:
    score = len(outputs["output"]) < 1.5 * len(reference_outputs["output"])
    return {"key": "is_concise", "score": int(score)}

def target_function(inputs: dict):
    return langsmith_rag(inputs["question"])

evaluate(
    target_function,
    data=dataset_name,
    evaluators=[is_concise_enough],
    experiment_prefix="gpt-4o"
)

View the evaluation results for experiment: 'gpt-4o-33c0ef6e' at:
https://smith.langchain.com/o/54f14e87-ff07-44be-8054-7d3057dedd08/datasets/8a1d2349-001b-4389-9d2f-1761f770180f/compare?selectedSessions=4847442a-524a-46ed-b4f1-f308082e6ee9




15it [00:42,  2.81s/it]


Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How do I pass metadata in with @traceable?,"To pass metadata in with `@traceable`, you sho...",,You can pass metadata with the @traceable deco...,1,4.456727,013f5c2a-d656-4b5c-ac2a-7aa6d9851055,d2e5f2e4-242e-4ebf-8671-d2094b421ec9
1,How do I create user feedback with the LangSmi...,To create user feedback using the LangSmith SD...,,To create user feedback with the LangSmith SDK...,1,3.261547,66dc8ce4-b64b-435c-8402-c51655526c49,3cfe86a8-0e4f-4ce2-bbf4-77bcb79d92ea
2,What integrations does LangSmith offer for dat...,LangSmith integrates with ClickHouse and Postg...,,LangSmith offers integrations with various dat...,1,2.609381,694aeaeb-ea3e-4211-8c51-ea1c06225c0f,3f1c5f4c-4305-4190-a132-cfc761360b39
3,Can LangSmith be used for finetuning and model...,LangSmith is primarily a platform for building...,,"Yes, LangSmith can be used for fine-tuning and...",1,2.900767,af19e03e-4214-4b09-9297-ef19e9769cd6,6c76c54b-a948-4e8b-aa16-7f305fd4ef95
4,What are the benefits of using LangSmith for L...,LangSmith offers several benefits for LLM appl...,,The benefits of using LangSmith for LLM applic...,1,3.487851,b1bc04fe-c97d-433a-b395-23b9de408e93,81a820de-36b2-441e-b3e3-d24fa5abd4fe
5,How do I log custom events with LangSmith?,"To log custom events with LangSmith, you need ...",,"To log custom events with LangSmith, use the `...",1,3.605289,d7464063-ba20-4f02-acc5-30c33c37df23,d2defd7f-2352-42d7-a48f-143a7564ac7f
6,Is there a Javascript LangSmith SDK?,"Yes, there is a JavaScript SDK for LangSmith.",,"Yes, there is a Javascript LangSmith SDK!",1,1.203041,d4490e4c-780d-41e4-b5af-41f4f55acf2d,3af41539-ceee-41b9-8ac8-a0db3c60e4a3
7,How do I set up tracing to LangSmith if I'm us...,To set up tracing with LangSmith while using L...,,To set up tracing to LangSmith using LangChain...,1,2.539009,62953086-630f-44c5-b909-21194c629f93,a8c0629d-2322-4447-bef5-5e104b93b91a
8,How do I set up tracing to LangSmith if I'm us...,To set up tracing to LangSmith when using Lang...,,To set up tracing to LangSmith while using Lan...,1,3.245654,1f16e22f-6abe-443a-aa4e-21598e1c1ab6,c1cd238a-052f-40b0-a608-87d429a7986a
9,Does LangSmith support online evaluation?,"Yes, LangSmith supports online evaluation. It ...",,"Yes, LangSmith supports online evaluation as a...",1,3.61324,3420d01a-4d5f-4a66-92ff-46de047122ec,f86094d9-b54d-4247-b4a3-97d29b4fcd86


### Modifying your Application

Now, let's change our model to `gpt-3.5-turbo` and see how it performs!

Make this change (set `MODEL_NAME` in the setup cell and re-run it), and then run this code snippet!

In [5]:
from langsmith import evaluate, Client
from langsmith.schemas import Example, Run

def target_function(inputs: dict):
    return langsmith_rag(inputs["question"])

evaluate(
    target_function,
    data=dataset_name,
    evaluators=[is_concise_enough],
    experiment_prefix="gpt-3.5-turbo"
)

View the evaluation results for experiment: 'gpt-3.5-turbo-c93504cb' at:
https://smith.langchain.com/o/54f14e87-ff07-44be-8054-7d3057dedd08/datasets/8a1d2349-001b-4389-9d2f-1761f770180f/compare?selectedSessions=4f7c7447-122b-4714-b7d9-33d8f6a0a708




15it [00:28,  1.91s/it]


Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How do I pass metadata in with @traceable?,"In @traceable, you can pass metadata by includ...",,You can pass metadata with the @traceable deco...,1,1.860785,013f5c2a-d656-4b5c-ac2a-7aa6d9851055,b4a90595-0106-4039-9a7f-f508437807e6
1,How do I create user feedback with the LangSmi...,To create user feedback using the LangSmith SD...,,To create user feedback with the LangSmith SDK...,1,1.487922,66dc8ce4-b64b-435c-8402-c51655526c49,bd98dadc-20ba-4d01-9a11-97ac3b0f8776
2,What integrations does LangSmith offer for dat...,"LangSmith offers integrations with ClickHouse,...",,LangSmith offers integrations with various dat...,1,2.984898,694aeaeb-ea3e-4211-8c51-ea1c06225c0f,2ab60c10-a43a-47ad-9b3e-d80329dbeb66
3,Can LangSmith be used for finetuning and model...,LangSmith is primarily designed for monitoring...,,"Yes, LangSmith can be used for fine-tuning and...",1,2.865167,af19e03e-4214-4b09-9297-ef19e9769cd6,3f894b81-d279-4b2a-a363-18bc43cebed6
4,What are the benefits of using LangSmith for L...,LangSmith allows users to easily run multiple ...,,The benefits of using LangSmith for LLM applic...,1,1.656999,b1bc04fe-c97d-433a-b395-23b9de408e93,78465643-906e-45c7-b96c-5321276b2bd0
5,How do I log custom events with LangSmith?,"To log custom LLM traces with LangSmith, you n...",,"To log custom events with LangSmith, use the `...",0,2.508037,d7464063-ba20-4f02-acc5-30c33c37df23,92653328-fc96-44f8-b0a0-e0da7e20f20e
6,Is there a Javascript LangSmith SDK?,"LangSmith offers a JavaScript/TypeScript SDK, ...",,"Yes, there is a Javascript LangSmith SDK!",0,0.940141,d4490e4c-780d-41e4-b5af-41f4f55acf2d,cb4571c6-e3d1-4f0e-ba0a-fa2b0fc906e0
7,How do I set up tracing to LangSmith if I'm us...,To set up tracing to LangSmith when using Lang...,,To set up tracing to LangSmith using LangChain...,1,1.824752,62953086-630f-44c5-b909-21194c629f93,efcac1f9-8d91-4be9-a266-7ad4c47de52b
8,How do I set up tracing to LangSmith if I'm us...,To set up tracing to LangSmith when using Lang...,,To set up tracing to LangSmith while using Lan...,0,1.577835,1f16e22f-6abe-443a-aa4e-21598e1c1ab6,cfe24242-4261-448f-8622-8516f2ec68dc
9,Does LangSmith support online evaluation?,"Yes, LangSmith supports online evaluations whi...",,"Yes, LangSmith supports online evaluation as a...",0,2.419033,3420d01a-4d5f-4a66-92ff-46de047122ec,872a2f3c-fd40-4099-a2b4-f085fbf7570c


### Running over Different Pieces of Data

##### Dataset Version

You can run an experiment on a specific version of a dataset from the SDK by passing the `as_of` parameter to `list_examples`.

Let's try running on just the version of our dataset tagged `initial dataset`.
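A version can also be identified by time rather than by tag. The sketch below assumes that `list_examples` accepts a timezone-aware `datetime` for `as_of` in addition to a tag string; check this against your installed SDK version. The helper name `examples_as_of` and its `days_ago` parameter are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

def examples_as_of(client, dataset_name: str, days_ago: int) -> list:
    """Return the dataset's examples as they existed `days_ago` days ago.

    Assumes `client` is a `langsmith.Client` and that `list_examples`
    accepts a timezone-aware datetime for `as_of`.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(days=days_ago)
    return list(client.list_examples(dataset_name=dataset_name, as_of=cutoff))
```

The returned list could then be passed as `data=` to `evaluate`, in the same way as the `list_examples` call in the cell that follows.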

In [8]:
evaluate(
    target_function,
    data=client.list_examples(dataset_name=dataset_name, as_of="initial dataset"),   # We use as_of to specify a version
    evaluators=[is_concise_enough],
    experiment_prefix="initial dataset version"
)

View the evaluation results for experiment: 'initial dataset version-c590bb67' at:
https://smith.langchain.com/o/54f14e87-ff07-44be-8054-7d3057dedd08/datasets/8a1d2349-001b-4389-9d2f-1761f770180f/compare?selectedSessions=dac83a30-fa0e-4a20-9471-1b139da419fb




10it [00:35,  3.55s/it]


Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How do I pass metadata in with @traceable?,To pass metadata along with the @traceable fun...,,You can pass metadata with the @traceable deco...,1,1.576719,013f5c2a-d656-4b5c-ac2a-7aa6d9851055,1fea40cd-6944-4245-815c-133495b18c61
1,How do I set up tracing to LangSmith if I'm us...,To set up tracing to LangSmith when using Lang...,,To set up tracing to LangSmith while using Lan...,0,1.691673,1f16e22f-6abe-443a-aa4e-21598e1c1ab6,749b364e-2f68-4b79-9741-f0d7c84c0e6f
2,Does LangSmith support online evaluation?,"Yes, LangSmith supports online evaluation thro...",,"Yes, LangSmith supports online evaluation as a...",0,2.032186,3420d01a-4d5f-4a66-92ff-46de047122ec,458ab30b-89a2-4758-bf68-47439c1d9b92
3,What is LangSmith used for in three sentences?,LangSmith is a platform for building productio...,,LangSmith is a platform designed for the devel...,1,2.143363,4bea205a-ea82-42d0-a15f-6bb5b5129521,5a4624b0-010d-4a57-a2ab-5970522841d7
4,Can LangSmith be used to evaluate agents?,"Yes, LangSmith can be used to evaluate agents ...",,"Yes, LangSmith can be used to evaluate agents....",1,1.251195,4e202564-da5a-4bc5-bc7e-f187d5b97788,272bd6de-4b3e-45d2-aa24-098cb9d3adee
5,How do I create user feedback with the LangSmi...,To create user feedback using the LangSmith SD...,,To create user feedback with the LangSmith SDK...,1,18.382388,66dc8ce4-b64b-435c-8402-c51655526c49,a2a52c2f-3796-4132-8f2a-49d091289386
6,How can I trace with the @traceable decorator?,"To trace with the @traceable decorator, you ne...",,To trace with the @traceable decorator in Pyth...,1,1.675385,96e3dc14-8655-4b76-9f77-194b2ebe33e1,00522bab-6ce3-4f0e-8f98-dd861217dea2
7,Does LangSmith support offline evaluation?,LangSmith does not explicitly mention support ...,,"Yes, LangSmith supports offline evaluation thr...",1,2.321673,a03ea8d2-4ed9-4160-8b64-5c16c01ce4a5,edd896f9-b479-47b6-9516-3025f95c69f5
8,Can LangSmith be used for finetuning and model...,"LangSmith is primarily focused on monitoring, ...",,"Yes, LangSmith can be used for fine-tuning and...",1,1.988585,af19e03e-4214-4b09-9297-ef19e9769cd6,6434cdc6-cbaf-4e46-99e0-4412cf834c56
9,What testing capabilities does LangSmith have?,LangSmith in LangChain offers testing capabili...,,LangSmith offers capabilities for creating dat...,1,1.844225,bb5d51d9-2c74-4b17-9f5e-413064e943fd,dc556aca-1b44-4f53-af95-71191a51fcdc


##### Dataset Split

You can run an experiment on a specific split of your dataset. Let's try running on the Crucial Examples split.

In [9]:
evaluate(
    target_function,
    data=client.list_examples(dataset_name=dataset_name, splits=["Crucial Examples"]),  # We pass in a list of Splits
    evaluators=[is_concise_enough],
    experiment_prefix="Crucial Examples split"
)

View the evaluation results for experiment: 'Crucial Examples split-6043a43e' at:
https://smith.langchain.com/o/54f14e87-ff07-44be-8054-7d3057dedd08/datasets/8a1d2349-001b-4389-9d2f-1761f770180f/compare?selectedSessions=8f352334-4149-4f9c-b4d5-cd3b9030a035




5it [00:08,  1.64s/it]


Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How do I pass metadata in with @traceable?,To pass metadata with `@traceable` in the give...,,You can pass metadata with the @traceable deco...,1,1.848248,013f5c2a-d656-4b5c-ac2a-7aa6d9851055,dbaed195-67e3-4b34-80fc-1a0ae354af7a
1,How do I create user feedback with the LangSmi...,To create user feedback with the LangSmith SDK...,,To create user feedback with the LangSmith SDK...,1,1.438165,66dc8ce4-b64b-435c-8402-c51655526c49,574faa3f-b8bc-4646-84e7-18f37cf52579
2,What integrations does LangSmith offer for dat...,"LangSmith offers integrations with ClickHouse,...",,LangSmith offers integrations with various dat...,1,1.17778,694aeaeb-ea3e-4211-8c51-ea1c06225c0f,5b2beaeb-46d3-4fe7-85be-7c3b2e783d4f
3,Can LangSmith be used for finetuning and model...,LangSmith is primarily focused on observabilit...,,"Yes, LangSmith can be used for fine-tuning and...",0,1.741863,af19e03e-4214-4b09-9297-ef19e9769cd6,b7ecedf8-f699-48cf-8c20-c65f405570a1
4,What are the benefits of using LangSmith for L...,LangSmith allows for easy management and compa...,,The benefits of using LangSmith for LLM applic...,1,1.412495,b1bc04fe-c97d-433a-b395-23b9de408e93,d3861330-8c31-4b0a-8ee1-d318e8c6ccb6


##### Specific Data Points

You can also specify individual data points to run an experiment over.

In [10]:
evaluate(
    target_function,
    data=client.list_examples(
        dataset_name=dataset_name, 
        example_ids=[   # We pass in a specific list of example_ids
            # TODO: You will need to paste in your own example ids for this to work!
            "3420d01a-4d5f-4a66-92ff-46de047122ec",
            "b1bc04fe-c97d-433a-b395-23b9de408e93"
        ]
    ),
    evaluators=[is_concise_enough],
    experiment_prefix="two specific example ids"
)

View the evaluation results for experiment: 'two specific example ids-900059c8' at:
https://smith.langchain.com/o/54f14e87-ff07-44be-8054-7d3057dedd08/datasets/8a1d2349-001b-4389-9d2f-1761f770180f/compare?selectedSessions=f98f9008-1b43-46fa-ab37-ce6ee88b569a




2it [00:04,  2.07s/it]


Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,Does LangSmith support online evaluation?,"Yes, LangSmith supports online evaluation thro...",,"Yes, LangSmith supports online evaluation as a...",0,1.473262,3420d01a-4d5f-4a66-92ff-46de047122ec,f80f7ba1-643c-44ad-8776-07efae1a94d9
1,What are the benefits of using LangSmith for L...,LangSmith offers the capability to easily view...,,The benefits of using LangSmith for LLM applic...,0,1.962131,b1bc04fe-c97d-433a-b395-23b9de408e93,77e59d4b-c01b-4081-8fea-eef7fdfbab30


### Other Parameters

##### Repetitions

You can run an experiment several times to make sure you have consistent results.
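One way to read repeated results is to group the scores by example and look for examples whose repetitions disagree. The sketch below does this on hypothetical `(example_id, score)` rows; the IDs and scores are invented for illustration and are not taken from the runs in this notebook.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (example_id, is_concise score) rows from a run with num_repetitions=2
rows = [
    ("example-a", 1), ("example-a", 1),
    ("example-b", 1), ("example-b", 0),
]

scores_by_example = defaultdict(list)
for example_id, score in rows:
    scores_by_example[example_id].append(score)

# Mean score per example, plus the examples whose repetitions disagree
mean_scores = {ex: mean(s) for ex, s in scores_by_example.items()}
unstable = [ex for ex, s in scores_by_example.items() if len(set(s)) > 1]
print(mean_scores, unstable)
```

An example that flips between 0 and 1 across repetitions is a sign that the score reflects model variance rather than a stable property of the application.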

In [11]:
evaluate(
    target_function,
    data=dataset_name,
    evaluators=[is_concise_enough],
    experiment_prefix="two repetitions",
    num_repetitions=2   # This field defaults to 1
)

View the evaluation results for experiment: 'two repetitions-c9c31784' at:
https://smith.langchain.com/o/54f14e87-ff07-44be-8054-7d3057dedd08/datasets/8a1d2349-001b-4389-9d2f-1761f770180f/compare?selectedSessions=84b0aef3-2a50-4f53-8a15-ea40ad21a48e




30it [00:50,  1.67s/it]


Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How do I pass metadata in with @traceable?,"To pass metadata with ""@traceable,"" you can cu...",,You can pass metadata with the @traceable deco...,1,2.340202,013f5c2a-d656-4b5c-ac2a-7aa6d9851055,eca61435-9024-4565-ae29-2fe79814ac90
1,How do I create user feedback with the LangSmi...,To create user feedback with the LangSmith SDK...,,To create user feedback with the LangSmith SDK...,1,2.380154,66dc8ce4-b64b-435c-8402-c51655526c49,89be7e96-b3cb-4e81-a418-4ca88b38b5ba
2,What integrations does LangSmith offer for dat...,"LangSmith offers integrations with ClickHouse,...",,LangSmith offers integrations with various dat...,1,2.314238,694aeaeb-ea3e-4211-8c51-ea1c06225c0f,8c9bae24-e1b0-4417-b3cf-336fdd8a6802
3,Can LangSmith be used for finetuning and model...,"LangSmith is mainly focused on monitoring, eva...",,"Yes, LangSmith can be used for fine-tuning and...",1,1.417022,af19e03e-4214-4b09-9297-ef19e9769cd6,61d4fc23-570d-436b-a82b-9cb2e7bd0ecb
4,What are the benefits of using LangSmith for L...,LangSmith allows running and comparing multipl...,,The benefits of using LangSmith for LLM applic...,1,1.564615,b1bc04fe-c97d-433a-b395-23b9de408e93,088e0b35-99d8-4395-9b4e-a9ded0b724d5
5,How do I log custom events with LangSmith?,"To log custom events with LangSmith, you must ...",,"To log custom events with LangSmith, use the `...",1,2.137431,d7464063-ba20-4f02-acc5-30c33c37df23,4abad699-c774-4834-b567-adb72852f7ec
6,Is there a Javascript LangSmith SDK?,"Yes, there is a JavaScript (JS/TS) SDK for Lan...",,"Yes, there is a Javascript LangSmith SDK!",0,1.626113,d4490e4c-780d-41e4-b5af-41f4f55acf2d,23606d94-cf7f-4cf1-9253-60f04867c6b1
7,How do I set up tracing to LangSmith if I'm us...,To set up tracing to LangSmith while using Lan...,,To set up tracing to LangSmith using LangChain...,1,1.327752,62953086-630f-44c5-b909-21194c629f93,973c1ca3-2dd9-4100-af85-1766ee67e5c4
8,How do I set up tracing to LangSmith if I'm us...,To enable distributed tracing across multiple ...,,To set up tracing to LangSmith while using Lan...,0,1.989898,1f16e22f-6abe-443a-aa4e-21598e1c1ab6,af6ec952-2322-4235-bc04-2406e442154c
9,Does LangSmith support online evaluation?,"Yes, LangSmith supports online evaluation thro...",,"Yes, LangSmith supports online evaluation as a...",0,1.327372,3420d01a-4d5f-4a66-92ff-46de047122ec,9c1170a1-a3a7-4952-84e0-195dce3486b3


##### Concurrency
You can also kick off concurrent threads of execution to make your experiments finish faster!
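To build intuition for why this helps, here is a small standalone sketch using Python's standard `ThreadPoolExecutor`. It is an illustration of running I/O-bound calls on three worker threads, not a description of how `evaluate` is implemented internally; `fake_target` is a hypothetical stand-in for a slow LLM call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_target(question: str) -> str:
    # Stand-in for a slow, I/O-bound call such as an LLM request
    time.sleep(0.1)
    return f"answer to {question}"

questions = [f"q{i}" for i in range(6)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    # map preserves input order even though the calls overlap in time
    answers = list(pool.map(fake_target, questions))
elapsed = time.perf_counter() - start
print(len(answers), round(elapsed, 2))
```

Because the calls spend most of their time waiting, three workers should finish six calls in roughly the time two sequential calls would take. Keep provider rate limits in mind when raising the concurrency.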

In [12]:
evaluate(
    target_function,
    data=dataset_name,
    evaluators=[is_concise_enough],
    experiment_prefix="concurrency",
    max_concurrency=3,  # This defaults to None, so this is an improvement!
)

View the evaluation results for experiment: 'concurrency-fb852ad3' at:
https://smith.langchain.com/o/54f14e87-ff07-44be-8054-7d3057dedd08/datasets/8a1d2349-001b-4389-9d2f-1761f770180f/compare?selectedSessions=573b6621-6b9d-4ba2-aa6f-402bcf45b1da




15it [00:09,  1.60it/s]


Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How do I pass metadata in with @traceable?,"To pass metadata in with @traceable, you can i...",,You can pass metadata with the @traceable deco...,1,1.396615,013f5c2a-d656-4b5c-ac2a-7aa6d9851055,9b0cde5d-86a9-49c4-845d-40bb5fe939e1
1,How do I create user feedback with the LangSmi...,To create user feedback with the LangSmith SDK...,,To create user feedback with the LangSmith SDK...,1,1.459603,66dc8ce4-b64b-435c-8402-c51655526c49,32b4e16d-fe3b-4e6f-bd6f-c99761c02767
2,What integrations does LangSmith offer for dat...,"LangSmith offers integrations with ClickHouse,...",,LangSmith offers integrations with various dat...,1,1.811433,694aeaeb-ea3e-4211-8c51-ea1c06225c0f,96b00277-4dbc-45cb-8874-2db0f24fcf57
3,What are the benefits of using LangSmith for L...,LangSmith allows users to run multiple experim...,,The benefits of using LangSmith for LLM applic...,1,1.514606,b1bc04fe-c97d-433a-b395-23b9de408e93,1252a890-20cc-4e1e-a2eb-0ffcb986eeb1
4,Can LangSmith be used for finetuning and model...,"LangSmith is primarily focused on monitoring, ...",,"Yes, LangSmith can be used for fine-tuning and...",1,1.843963,af19e03e-4214-4b09-9297-ef19e9769cd6,b972614a-2a4d-4d40-9363-7913502915c5
5,How do I log custom events with LangSmith?,"To log custom LLM traces with LangSmith, you m...",,"To log custom events with LangSmith, use the `...",0,2.016821,d7464063-ba20-4f02-acc5-30c33c37df23,6b1b3266-e2dd-4887-aa03-7f2cbd0dbbe9
6,Is there a Javascript LangSmith SDK?,"Yes, there is a JavaScript SDK (or JS/TS SDK) ...",,"Yes, there is a Javascript LangSmith SDK!",0,1.750938,d4490e4c-780d-41e4-b5af-41f4f55acf2d,2f694fd3-541b-4b3d-ba9d-638bc7970b14
7,How do I set up tracing to LangSmith if I'm us...,To set up tracing to LangSmith while using Lan...,,To set up tracing to LangSmith using LangChain...,1,1.761287,62953086-630f-44c5-b909-21194c629f93,7a6138c6-f61e-455c-97a0-0e5339ddc418
8,How do I set up tracing to LangSmith if I'm us...,To set up tracing to LangSmith when using Lang...,,To set up tracing to LangSmith while using Lan...,0,1.577226,1f16e22f-6abe-443a-aa4e-21598e1c1ab6,5cf2a48c-c002-4621-a5df-e7524f59d213
9,What is LangSmith used for in three sentences?,LangSmith is a platform designed for building ...,,LangSmith is a platform designed for the devel...,1,1.301769,4bea205a-ea82-42d0-a15f-6bb5b5129521,50766913-8d6c-416e-ac8c-9d3aaf3a4299


##### Metadata 

You can (and should) add metadata to your experiments to make them easier to find in the UI.
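A small helper can keep experiment metadata consistent across runs. This is a minimal sketch: `build_experiment_metadata` is a hypothetical helper, and the constants simply repeat the values from the setup cell so the block runs on its own.

```python
# Values repeated from the setup cell so this block is self-contained
MODEL_NAME = "gpt-3.5-turbo"
MODEL_PROVIDER = "openai"
APP_VERSION = 1.0

def build_experiment_metadata(**extra) -> dict:
    """Collect the settings that distinguish one experiment from another."""
    metadata = {
        "model_name": MODEL_NAME,
        "model_provider": MODEL_PROVIDER,
        "app_version": APP_VERSION,
    }
    metadata.update(extra)  # e.g. the dataset split being evaluated
    return metadata

experiment_metadata = build_experiment_metadata(split="Crucial Examples")
print(experiment_metadata)
```

The resulting dictionary can be passed as `metadata=experiment_metadata` in the `evaluate` call.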

In [13]:
evaluate(
    target_function,
    data=dataset_name,
    evaluators=[is_concise_enough],
    experiment_prefix="metadata added",
    metadata={  # We can pass custom metadata for the experiment, such as the model name
        "model_name": MODEL_NAME
    }
)

View the evaluation results for experiment: 'metadata added-c8eeb25f' at:
https://smith.langchain.com/o/54f14e87-ff07-44be-8054-7d3057dedd08/datasets/8a1d2349-001b-4389-9d2f-1761f770180f/compare?selectedSessions=bc9992b1-ce08-48da-b673-6792ebb88071




15it [00:24,  1.62s/it]


Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,How do I pass metadata in with @traceable?,"To pass metadata in with @traceable, you can a...",,You can pass metadata with the @traceable deco...,1,1.627154,013f5c2a-d656-4b5c-ac2a-7aa6d9851055,c0d32d93-5361-47c7-9093-b2930952ca91
1,How do I create user feedback with the LangSmi...,To create user feedback using the LangSmith SD...,,To create user feedback with the LangSmith SDK...,1,1.389851,66dc8ce4-b64b-435c-8402-c51655526c49,1becf79d-66d9-4712-95f5-7948d0d742f4
2,What integrations does LangSmith offer for dat...,"LangSmith offers integrations with ClickHouse,...",,LangSmith offers integrations with various dat...,1,1.294607,694aeaeb-ea3e-4211-8c51-ea1c06225c0f,e2f60401-d84e-4188-b30f-e9c929e80dfe
3,Can LangSmith be used for finetuning and model...,LangSmith is primarily designed for monitoring...,,"Yes, LangSmith can be used for fine-tuning and...",1,1.480968,af19e03e-4214-4b09-9297-ef19e9769cd6,d2434901-9832-4e62-b71d-43270d3d5a8e
4,What are the benefits of using LangSmith for L...,LangSmith allows users to run multiple experim...,,The benefits of using LangSmith for LLM applic...,1,1.295257,b1bc04fe-c97d-433a-b395-23b9de408e93,72c3304e-6855-42a3-bc98-3e4bf91f038e
5,How do I log custom events with LangSmith?,"To log custom LLM traces with LangSmith, you m...",,"To log custom events with LangSmith, use the `...",0,1.604803,d7464063-ba20-4f02-acc5-30c33c37df23,3e2ed6e9-7798-4162-ac9c-745c37046817
6,Is there a Javascript LangSmith SDK?,"Yes, there is a JS/TS (JavaScript/TypeScript) ...",,"Yes, there is a Javascript LangSmith SDK!",0,1.551659,d4490e4c-780d-41e4-b5af-41f4f55acf2d,66ef0f4a-9a3d-43ff-8592-b058cbd830d6
7,How do I set up tracing to LangSmith if I'm us...,"To set up tracing to LangSmith with LangChain,...",,To set up tracing to LangSmith using LangChain...,0,1.632549,62953086-630f-44c5-b909-21194c629f93,d24feea0-71c4-40d1-a1a6-3adac8a8619d
8,How do I set up tracing to LangSmith if I'm us...,To set up tracing to LangSmith when using Lang...,,To set up tracing to LangSmith while using Lan...,0,1.709414,1f16e22f-6abe-443a-aa4e-21598e1c1ab6,797c400f-ea7d-4477-b408-a6eb2cef9e54
9,Does LangSmith support online evaluation?,"Yes, LangSmith supports online evaluations thr...",,"Yes, LangSmith supports online evaluation as a...",0,1.816592,3420d01a-4d5f-4a66-92ff-46de047122ec,fd2cd6ba-521b-4ce7-a9b3-f8cb26caacb6


# Running an experiment on my own dataset, MAT496

In [17]:
from langsmith import evaluate, Client

client = Client()
dataset_name2 = "MAT496"

def is_concise_enough(reference_outputs: dict, outputs: dict) -> dict:
    score = len(outputs["output"]) < 1.5 * len(reference_outputs["output"])
    return {"key": "is_concise", "score": int(score)}

def target_function(inputs: dict):
    return langsmith_rag(inputs["question"])

evaluate(
    target_function,
    data=dataset_name2,  # Use the MAT496 dataset defined above
    evaluators=[is_concise_enough],
    experiment_prefix="gpt-3.5-turbo"
)

View the evaluation results for experiment: 'gpt-3.5-turbo-737965f2' at:
https://smith.langchain.com/o/54f14e87-ff07-44be-8054-7d3057dedd08/datasets/d8968fe0-60ed-4e5a-ac64-8051f66e8189/compare?selectedSessions=f27b7b29-843d-42d7-884c-4759379a3010




10it [00:13,  1.33s/it]


Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,Why do cats purr?,I don't know the answer to that question.,,Cats purr as a form of communication and self-...,1,1.175081,2fe9c5c2-19eb-4adb-9388-e471dffd1e1a,7a1b7cda-8ec8-414b-92cb-e9d5741fdb6f
1,What is gravity?,"I don't have information on ""gravity"" based on...",,Gravity is the force that attracts two bodies ...,1,1.31088,cab986e6-fd37-4c62-baa6-abfad4d4e3ac,c54d72dc-4354-443e-9444-83d25977d754
2,What is the tallest mountain in the world?,The tallest mountain in the world is Mount Eve...,,Mount Everest is the tallest mountain in the w...,0,1.678963,e4cb3445-15fd-4876-a162-e90f1877e262,3ed157be-66a0-4e62-a706-1cb02e5ec253
3,How do airplanes stay in the air?,I don't know the answer to that question.,,"Airplanes stay in the air because of lift, whi...",1,1.046344,2495beef-cc80-4642-a68b-4aac98984f7e,171f8853-6753-471b-9aec-edcd3fe3272e
4,How does photosynthesis work?,I don't know the answer to that question.,,Photosynthesis is the process by which green p...,1,1.079036,62cf05d1-eb91-4d00-94bf-86255131684d,d99d817e-59d7-441d-bfd0-98b9f808df0b
5,Why is the sky blue?,"I don't know the answer to ""Why is the sky blue?""",,The sky appears blue because of a phenomenon c...,1,1.232643,ac646e2a-f047-40d7-8836-bcfc78ec3389,3ec91c3c-7a38-4483-8483-9d44a19bb521
6,What causes rainbows?,"Rainbows are caused by the reflection, refract...",,"Rainbows are caused by the refraction, dispers...",0,1.925579,c4a2d2be-2329-4ceb-b20b-ab4c3e28c6fb,4dabe616-5e5f-4b70-9b82-5a0095b19ce5
7,Why do leaves change color in autumn?,I don't know the answer to that question.,,"In autumn, chlorophyll in leaves breaks down, ...",1,1.124792,d277c736-7c05-4dff-ad6a-95ebf3912dcd,e88c030d-b6c3-4b31-b53b-a8d52cfde8b3
8,How many bones are in the human body?,I don't know.,,"An adult human has 206 bones. At birth, there ...",1,1.11669,e1bc6e2f-3728-4f0a-a6c8-c648a5db7e0d,e6655307-c55a-4527-956c-995b6647f57f
9,Why do we sleep?,I don't know the answer to that question.,,"Sleep helps the body repair, consolidate memor...",1,1.055027,f59b802d-1f43-4148-8f87-12fa183285ad,0fa6c75b-1e5e-452d-946e-cb28ea92436f


# Running an evaluation on a split of my own dataset

In [21]:
evaluate(
    target_function,
    data=client.list_examples(dataset_name=dataset_name2, splits=["Crucial Examples"]),  # We pass in a list of Splits
    evaluators=[is_concise_enough],
    experiment_prefix="gpt-3.5-turbo"
)

View the evaluation results for experiment: 'gpt-3.5-turbo-03155b3c' at:
https://smith.langchain.com/o/54f14e87-ff07-44be-8054-7d3057dedd08/datasets/d8968fe0-60ed-4e5a-ac64-8051f66e8189/compare?selectedSessions=34a352d3-1389-446a-ac5c-a59bf9c27474




3it [00:04,  1.39s/it]


Unnamed: 0,inputs.question,outputs.output,error,reference.output,feedback.is_concise,execution_time,example_id,id
0,Why do cats purr?,"I don't know the answer to ""Why do cats purr?""",,Cats purr as a form of communication and self-...,1,1.451055,2fe9c5c2-19eb-4adb-9388-e471dffd1e1a,aa4eb1b4-49b8-4c63-9e51-e18ebe687599
1,What is gravity?,I don't have information on gravity in the pro...,,Gravity is the force that attracts two bodies ...,1,1.093875,cab986e6-fd37-4c62-baa6-abfad4d4e3ac,bd991088-0b04-42ce-9bde-83b96697bd32
2,What is the tallest mountain in the world?,I don't know the answer to that question.,,Mount Everest is the tallest mountain in the w...,1,1.130349,e4cb3445-15fd-4876-a162-e90f1877e262,5e2b8c20-cb3f-4411-9e3a-953d53a57430
