# Running Chains on Traced Datasets

Developing applications with language models can be uniquely challenging. To manage this complexity and ensure reliable performance, LangChain provides tracing and evaluation functionality. This notebook demonstrates how to run Chains, which are language model functions, on previously captured datasets or traces. Some common use cases for this approach include:

- Running an evaluation chain to grade previous runs.
- Comparing different chains, LLMs, and agents on traced datasets.
- Executing a stochastic chain multiple times over a dataset to generate metrics before deployment.

Please note that this notebook assumes you have LangChain+ tracing running in the background. To set it up, follow the [tracing directions here](../../tracing/local_installation.md).


In [1]:
from langchain.client import LangChainClient

client = LangChainClient(
    api_url="http://localhost:8000",
    api_key=None,
)

## Seed an example dataset

If you have been using LangChainPlus already, you may have datasets available. To view all saved datasets, run:

```
datasets = client.list_datasets()
datasets
```
Datasets can be created in a number of ways, most often by collecting `Run`s captured through the LangChain tracing API.

However, this notebook assumes you're running locally for the first time, so we'll start by uploading an example evaluation dataset.

In [2]:
# !pip install datasets > /dev/null
# !pip install pandas > /dev/null

In [3]:
import pandas as pd
from langchain.evaluation.loading import load_dataset

dataset = load_dataset("agent-search-calculator")
df = pd.DataFrame(dataset, columns=["question", "answer"])
df.columns = ["input", "output"]  # The chain we want to evaluate below expects inputs with the "input" key


In [4]:
df.head()

|   | input | output |
|---|---|---|
| 0 | How many people live in canada as of 2023? | approximately 38,625,801 |
| 1 | who is dua lipa's boyfriend? what is his age r... | her boyfriend is Romain Gravas. his age raised... |
| 2 | what is dua lipa's boyfriend age raised to the... | her boyfriend is Romain Gravas. his age raised... |
| 3 | how far is it from paris to boston in miles | approximately 3,435 mi |
| 4 | what was the total number of points scored in ... | approximately 2.682651500990882 |


In [5]:
dataset_name = "calculator_example.csv"

In [6]:
# Upload the dataframe only if a dataset with this name does not exist yet
if dataset_name not in {ds.name for ds in client.list_datasets()}:
    dataset = client.upload_dataframe(
        df,
        name=dataset_name,
        description="A calculator example dataset",
        input_keys=["input"],
        output_keys=["output"],
    )

## Running a Chain on a Traced Dataset

Once you have a dataset, you can run a chain over it to see its results. The run traces will automatically be associated with the dataset for easy attribution and analysis.

**First, we'll define the chain we wish to run over the dataset.**

In this case, we're using an agent, but it can be any simple chain.

In [7]:
from langchain.chat_models import ChatOpenAI
from langchain.agents import initialize_agent, load_tools
from langchain.agents import AgentType

llm = ChatOpenAI(temperature=0)
tools = load_tools(['serpapi', 'llm-math'], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=False)

**Now we're ready to run the chain!**

In [8]:
chain_results = await client.arun_chain_on_dataset(
    dataset_name=dataset_name,
    chain=agent,
    num_workers=5, # Optional, sets the number of examples to run at a time
    session_name="Calculator Dataset Runs", # Optional. Will be sent to the 'default' session otherwise
)

Chain failed for example d015179a-f4a7-40e1-ba2f-6ca90ed45fed. Error: unknown format from LLM: Assuming we don't have any information about the actual number of points scored in the 2023 super bowl, we cannot provide a mathematical expression to solve this problem.
Chain failed for example d151c47c-fe80-4b58-8a28-5d5523d42b27. Error: 'age'. Please try again with a valid numerical expression
Chain failed for example 0d63e5f0-83ce-49b2-826b-ced0ab4ca701. Error: invalid syntax. Perhaps you forgot a comma? (<expr>, line 1). Please try again with a valid numerical expression
Chain failed for example 2dc584d8-dbe8-4146-8872-f92440f1048e. Error: 'VariableNode' object is not callable. Please try again with a valid numerical expression
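
The `num_workers` argument caps how many examples run at once, and the log above shows that a failing example is reported without stopping the rest. The sketch below illustrates that general pattern with `asyncio.Semaphore`. It is not LangChain's implementation: `run_over_examples` and `fake_chain` are hypothetical names introduced only for this illustration, and the error format simply mirrors the output shown in this notebook.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


async def run_over_examples(
    examples: Dict[str, str],
    run_one: Callable[[str], Awaitable[Any]],
    num_workers: int = 5,
) -> Dict[str, Any]:
    """Run `run_one` on every example, at most `num_workers` at a time."""
    semaphore = asyncio.Semaphore(num_workers)
    results: Dict[str, Any] = {}

    async def worker(example_id: str, text: str) -> None:
        async with semaphore:  # limits the number of concurrent calls
            try:
                results[example_id] = await run_one(text)
            except Exception as exc:
                # Record the failure instead of aborting the whole batch.
                print(f"Chain failed for example {example_id}. Error: {exc}")
                results[example_id] = {"Error": str(exc)}

    await asyncio.gather(*(worker(eid, text) for eid, text in examples.items()))
    return results


async def fake_chain(text: str) -> str:
    """Hypothetical stand-in for a chain call; fails on empty input."""
    await asyncio.sleep(0)
    if not text:
        raise ValueError("empty input")
    return text.upper()


# In a plain script; inside a notebook, use `await run_over_examples(...)` instead.
demo_results = asyncio.run(
    run_over_examples({"a": "hello", "b": ""}, fake_chain, num_workers=2)
)
```

Catching the exception inside each worker is what lets one bad example produce an error entry while the remaining examples still complete.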


## Reviewing the Chain Results

The method called above returns a dictionary mapping Example IDs to the output of the chain. Examples whose run failed map to a dictionary with an `Error` key instead.
You can directly inspect the results below.

In [9]:
chain_results

{'1781b1db-45cb-428f-8000-00482ccf7142': 'The distance between Paris and Boston is 3448 miles.',
 'd015179a-f4a7-40e1-ba2f-6ca90ed45fed': {'Error': "unknown format from LLM: Assuming we don't have any information about the actual number of points scored in the 2023 super bowl, we cannot provide a mathematical expression to solve this problem."},
 'd151c47c-fe80-4b58-8a28-5d5523d42b27': {'Error': "'age'. Please try again with a valid numerical expression"},
 'd437c17f-85b6-4d30-a4ff-8f2b7456ae43': 'The current population of Canada as of 2023 is 38,677,281.',
 'a4d4ddbd-eb82-4300-9473-97715baf8f26': "Anwar Hadid's age raised to the .43 power is approximately 2.68.",
 '0d63e5f0-83ce-49b2-826b-ced0ab4ca701': {'Error': 'invalid syntax. Perhaps you forgot a comma? (<expr>, line 1). Please try again with a valid numerical expression'},
 '7702d8f5-96d1-46ad-9919-cb40efed224c': '1.9347796717823205',
 'dcf59228-f3d4-4359-83d4-9032ec00a950': '0',
 'd93f5863-cdb1-4529-ae33-71013c8e5854': "Devin Bo
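
Because failed examples appear as `{'Error': ...}` entries alongside plain string outputs, it can help to separate the two before computing any metrics. The following is a minimal sketch that assumes the result shape shown above; `split_results` is a hypothetical helper, and `sample` copies three entries from the output rather than calling the client.

```python
from typing import Any, Dict, Tuple


def split_results(results: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Split chain results into successful outputs and error messages."""
    successes: Dict[str, Any] = {}
    failures: Dict[str, str] = {}
    for example_id, output in results.items():
        if isinstance(output, dict) and "Error" in output:
            failures[example_id] = output["Error"]
        else:
            successes[example_id] = output
    return successes, failures


# Three entries copied from the results printed above.
sample = {
    "1781b1db-45cb-428f-8000-00482ccf7142": "The distance between Paris and Boston is 3448 miles.",
    "d151c47c-fe80-4b58-8a28-5d5523d42b27": {
        "Error": "'age'. Please try again with a valid numerical expression"
    },
    "7702d8f5-96d1-46ad-9919-cb40efed224c": "1.9347796717823205",
}

successes, failures = split_results(sample)
failure_rate = len(failures) / len(sample)
print(f"{len(failures)} of {len(sample)} examples failed ({failure_rate:.0%})")
```

With the real output you would pass `chain_results` in place of `sample`.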

In [10]:
# You can navigate to the UI by clicking on the link below
client

In [11]:
# You can review all the chain runs over a given example as follows:
example_id = next(iter(chain_results))
example = client.read_example(example_id)

In [12]:
# For example, view the chain runs on this example
example.chain_runs

[{'child_runs': [{'child_runs': [{'start_time': '2023-05-04T01:40:58.004292',
      'end_time': '2023-05-04T01:41:00.124338',
      'extra': {},
      'execution_order': 3,
      'serialized': {'name': 'ChatOpenAI'},
      'session_id': 2,
      'example_id': '1781b1db-45cb-428f-8000-00482ccf7142',
      'error': None,
      'parent_chain_run_id': 2,
      'parent_tool_run_id': None,
      'prompts': ['Human: Answer the following questions as best you can. You have access to the following tools:\n\nSearch: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.\nCalculator: Useful for when you need to answer questions about math.\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [Search, Calculator]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Act
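
The run records above are nested through `child_runs`, so a small recursive walk makes them easier to read. This is a sketch under assumptions: it relies only on the `child_runs`, `serialized`, `start_time`, and `end_time` keys visible in the output, `iter_runs` and `duration_seconds` are hypothetical helpers, and the outer run in `sample_run` (its name and timestamps) is made up for illustration. Only the inner `ChatOpenAI` timestamps are copied from the output.

```python
from datetime import datetime
from typing import Any, Dict, Iterator, Tuple


def iter_runs(run: Dict[str, Any], depth: int = 0) -> Iterator[Tuple[int, Dict[str, Any]]]:
    """Yield (depth, run) for a run and all of its nested child runs."""
    yield depth, run
    for child in run.get("child_runs", []):
        yield from iter_runs(child, depth + 1)


def duration_seconds(run: Dict[str, Any]) -> float:
    """Elapsed time of a run, parsed from its ISO-format timestamps."""
    start = datetime.fromisoformat(run["start_time"])
    end = datetime.fromisoformat(run["end_time"])
    return (end - start).total_seconds()


sample_run = {
    "serialized": {"name": "AgentExecutor"},      # hypothetical outer run
    "start_time": "2023-05-04T01:40:58.000000",   # hypothetical
    "end_time": "2023-05-04T01:41:01.000000",     # hypothetical
    "child_runs": [
        {
            "serialized": {"name": "ChatOpenAI"},
            "start_time": "2023-05-04T01:40:58.004292",
            "end_time": "2023-05-04T01:41:00.124338",
        }
    ],
}

summary = [
    (depth, run["serialized"]["name"], duration_seconds(run))
    for depth, run in iter_runs(sample_run)
]
for depth, name, seconds in summary:
    print(f"{'  ' * depth}{name}: {seconds:.3f}s")
```

Since `example.chain_runs` is a list, you would loop over it and call `iter_runs` on each element.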