In [1]:
import os
import pandas as pd
from athina.evals import ContextContainsEnoughInformation
from athina.loaders import RagLoader
from athina.keys import AthinaApiKey, OpenAiApiKey
from athina.interfaces.athina import AthinaExperiment
from athina.datasets import yc_query_mini


### Configure your API keys

Evals use OpenAI, so you need to configure your OpenAI API key.

To view the results in Athina's UI and keep a historical record of your experiments, you also need an Athina API key.

In [2]:
from dotenv import load_dotenv

load_dotenv()

OpenAiApiKey.set_key(os.getenv('OPENAI_API_KEY'))
AthinaApiKey.set_key(os.getenv('ATHINA_API_KEY')) # Optional, recommended
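`os.getenv` returns `None` when a variable is missing, so a misconfigured `.env` file may only surface later as a less obvious error. A minimal sketch of a fail-fast check, using only the standard library; `require_env` is a hypothetical helper and not part of the Athina SDK:

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration; not part of the Athina SDK.
    `env` defaults to os.environ and can be swapped for a dict in tests.
    """
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        # Covers both a missing variable and an empty string
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

With this in place, the required key could be set as `OpenAiApiKey.set_key(require_env("OPENAI_API_KEY"))`, while the optional Athina key can keep using `os.getenv`.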

### Load your dataset

You can use one of our `loaders` to load the data from a dictionary, a CSV file, or a JSON file.

Here's an example:
```
from athina.loaders import RagLoader

dataset = RagLoader().load_dict(raw_data)
```

Here is the complete [documentation](https://docs.athina.ai/evals/running_evals/loading_data) specifying the various ways you can load your dataset.
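The dataset used in this notebook has three fields per row: `query`, `context`, and `response`. As a rough sketch, assuming the loader expects exactly those fields (see the linked documentation for the authoritative list), you could check your own rows before loading them. `REQUIRED_FIELDS` and `missing_fields` are illustrative names, not part of Athina:

```python
# Assumed from the columns shown in this notebook, not taken from the Athina docs
REQUIRED_FIELDS = ("query", "context", "response")


def missing_fields(rows):
    """Return (row_index, missing_field_names) for every incomplete row."""
    problems = []
    for index, row in enumerate(rows):
        missing = [field for field in REQUIRED_FIELDS if field not in row]
        if missing:
            problems.append((index, missing))
    return problems


raw_data = [
    {
        "query": "In which city is YC located?",
        "context": "Y Combinator is located in Mountain View, California.",
        "response": "Y Combinator is located in San Francisco",
    },
    {
        # Deliberately incomplete row: "response" is left out
        "query": "What is YC's motto?",
        "context": "Y Combinator's motto is 'Make something people want'.",
    },
]

print(missing_fields(raw_data))  # → [(1, ['response'])]
```

An empty list means every row has all three fields and can be passed to `RagLoader().load_dict(raw_data)`.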

In [3]:
# Create or load batch dataset
raw_data = yc_query_mini.data
dataset = RagLoader().load_dict(raw_data)

pd.DataFrame(dataset)

Unnamed: 0,query,context,response
0,What are some successful companies that went t...,Y Combinator has invested in companies in vari...,"Airbnb, Dropbox, Stripe, Reddit, Coinbase, Ins..."
1,In which city is YC located?,"Y Combinator is located in Mountain View, Cali...",Y Combinator is located in San Francisco
2,How much equity does YC take?,Y Combinator invests $500k in 200 startups twi...,YC invests $150k for 7%.
3,How much equity does YC take?,Y Combinator invests $500k in 200 startups twi...,I cannot answer this question as I do not have...
4,Who founded YC and when was it founded?,Y Combinator was founded in March 2005 by Paul...,Y Combinator was founded in 2005
5,Does Y Combinator invest in startups outside t...,Y Combinator invests in startups from all over...,"Yes, Y Combinator invests in international sta..."
6,How much does YC invest in startups?,YC invests $150k for 7%.,$150k
7,What is YC's motto?,Y Combinator's motto is 'Make something people...,Make something people want


### Describe your experiment metadata fields (optional)
These metadata fields are only used as identifiers when we save your experiment on Athina Develop.
This helps you search, sort and filter through past experimentation runs.

Currently, this includes your:
- `experiment_name`: (string) The name of your experiment
- `experiment_description`: (string) A description of this iteration of your experiment
- `language_model_provider`: (string) `openai`
- `language_model_id`: (string) The language model used for the LLM inference (ex: `gpt-3.5-turbo`)
- `prompt_template`: (object) An object representing the prompt you are sending to the LLM (for example, the `messages` array in OpenAI)
- `dataset_name`: (string) An identifier for the dataset you are using.

In [4]:
# Define your experiment parameters
prompt_template = [
    {
        "role": "system",
        "content": "You are an expert at answering questions about Y Combinator. If you do not know the answer, say I don't know. Be direct and concise in your responses",
    },
    {
        "role": "user",
        "content": "{query}",
    },
]
experiment = AthinaExperiment(
    experiment_name="ContextRelevance",
    experiment_description="Checking retrieval scores for YC dataset with a simple zero-shot prompt",
    language_model_provider="openai",
    language_model_id="gpt-3.5-turbo",
    prompt_template=prompt_template,
    dataset_name="yc_dataset_mini",
)

### Run your evaluation

Instantiate the evaluator class you wish to use, and call `run_batch` to run the eval.

##### Run evals in parallel (much faster)

You may specify `max_parallel_evals` to run multiple LLM evaluation inferences in parallel.
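To illustrate what a cap on parallel evaluations means in general, here is a minimal sketch using a thread pool from the standard library. It is not Athina's implementation: `run_in_parallel` and `fake_eval` are hypothetical names, and `fake_eval` stands in for a real LLM call.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(eval_fn, rows, max_parallel_evals=5):
    """Apply eval_fn to every row, with at most max_parallel_evals running at once."""
    with ThreadPoolExecutor(max_workers=max_parallel_evals) as pool:
        # pool.map returns results in the same order as the input rows
        return list(pool.map(eval_fn, rows))


def fake_eval(row):
    """Stand-in for an LLM-backed evaluation (hypothetical)."""
    return {"query": row["query"], "passed": bool(row["context"])}


rows = [
    {"query": "In which city is YC located?",
     "context": "Y Combinator is located in Mountain View, California."},
    {"query": "What is YC's motto?",
     "context": "Y Combinator's motto is 'Make something people want'."},
]

outcomes = run_in_parallel(fake_eval, rows, max_parallel_evals=2)
```

Because LLM calls spend most of their time waiting on the network, threads are usually enough to get a speed-up. A higher limit finishes sooner but is more likely to hit provider rate limits.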

##### View as a dataframe
Call `.to_df()` on the results to view them as a dataframe.


##### Log results to Athina Develop (Dashboard UI)
If you have specified an `AthinaApiKey`, then results will automatically be logged to the dashboard.

In [5]:
# Checks if the retrieved context contains enough information to answer the user query
results = ContextContainsEnoughInformation().configure_experiment(experiment).run_batch(
    data=dataset,
    max_parallel_evals=5 # Run up to 5 evals in parallel
)

results.to_df()

Unnamed: 0,query,context,response,display_name,failed,grade_reason,runtime,model,passed
0,What are some successful companies that went through YC?,"Y Combinator has invested in companies in various fields like FinTech, Healthcare, AI, etc.","Airbnb, Dropbox, Stripe, Reddit, Coinbase, Instacart.",Context Contains Enough Information,True,"The context provided does not contain specific examples of successful companies that went through Y Combinator (YC). It only mentions that YC has invested in companies across various fields like FinTech, Healthcare, AI, etc. Without specific company names or examples, the chatbot cannot answer the user's query based solely on the context given.",4595,gpt-4-1106-preview,0.0
1,In which city is YC located?,"Y Combinator is located in Mountain View, California.",Y Combinator is located in San Francisco,Context Contains Enough Information,False,"The context clearly states that Y Combinator is located in Mountain View, California. Therefore, the chatbot can use this information to answer the user's query about the city in which YC is located, which is Mountain View.",2962,gpt-4-1106-preview,1.0
2,How much equity does YC take?,Y Combinator invests $500k in 200 startups twice a year.,YC invests $150k for 7%.,Context Contains Enough Information,True,"The context provided states the amount Y Combinator invests in startups and how often, but it does not specify the percentage of equity YC takes in return for its investment. Therefore, the chatbot cannot answer the user's query about the amount of equity YC takes based solely on the provided context.",11666,gpt-4-1106-preview,0.0
3,How much equity does YC take?,Y Combinator invests $500k in 200 startups twice a year.,I cannot answer this question as I do not have enough information.,Context Contains Enough Information,True,"The context provided only states the amount Y Combinator invests in startups and how often, but it does not specify the percentage of equity YC takes in return for its investment. Therefore, the chatbot cannot answer the user's query about equity with the information given in the context.",4183,gpt-4-1106-preview,0.0
4,Who founded YC and when was it founded?,"Y Combinator was founded in March 2005 by Paul Graham, Jessica Livingston, Trevor Blackwell, and Robert Tappan Morris.",Y Combinator was founded in 2005,Context Contains Enough Information,False,"The context provided contains all the necessary information to answer the user's query. It specifies that Y Combinator (YC) was founded in March 2005 and lists the founders as Paul Graham, Jessica Livingston, Trevor Blackwell, and Robert Tappan Morris. Therefore, the chatbot can use this information to directly answer the question about who founded YC and when it was founded.",7534,gpt-4-1106-preview,1.0
5,Does Y Combinator invest in startups outside the US?,Y Combinator invests in startups from all over the world.,"Yes, Y Combinator invests in international startups as well as US startups.",Context Contains Enough Information,False,"The context directly answers the user's query. It states that Y Combinator invests in startups from all over the world, which implies that they do invest in startups outside the US.",3781,gpt-4-1106-preview,1.0
6,How much does YC invest in startups?,YC invests $150k for 7%.,$150k,Context Contains Enough Information,False,"The context provides the exact amount YC invests in startups, which is $150k for 7% equity. This information directly answers the user's query about the investment amount by YC in startups.",5912,gpt-4-1106-preview,1.0
7,What is YC's motto?,Y Combinator's motto is 'Make something people want'.,Make something people want,Context Contains Enough Information,False,"The context directly provides the answer to the user's query. The user asked for Y Combinator's motto, and the context states that Y Combinator's motto is 'Make something people want'. Therefore, the chatbot can use the context information to answer the user's query accurately.",2908,gpt-4-1106-preview,1.0
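For a quick summary of a run like the one above, a small sketch in plain Python can compute the pass rate and list the failing queries. It assumes each row is a dict with the `query` and `passed` fields shown in the output (for example, from `results.to_df().to_dict("records")`); `summarize` is an illustrative helper, not part of Athina.

```python
def summarize(rows):
    """Return the pass count, pass rate, and failing queries for eval rows."""
    total = len(rows)
    passed = sum(1 for row in rows if row["passed"] == 1.0)
    failing = [row["query"] for row in rows if row["passed"] != 1.0]
    return {
        "total": total,
        "passed": passed,
        "pass_rate": passed / total if total else 0.0,
        "failing_queries": failing,
    }


# `query` and `passed` values copied from the output table above
rows = [
    {"query": "What are some successful companies that went through YC?", "passed": 0.0},
    {"query": "In which city is YC located?", "passed": 1.0},
    {"query": "How much equity does YC take?", "passed": 0.0},
    {"query": "How much equity does YC take?", "passed": 0.0},
    {"query": "Who founded YC and when was it founded?", "passed": 1.0},
    {"query": "Does Y Combinator invest in startups outside the US?", "passed": 1.0},
    {"query": "How much does YC invest in startups?", "passed": 1.0},
    {"query": "What is YC's motto?", "passed": 1.0},
]

summary = summarize(rows)
print(summary["pass_rate"])  # → 0.625
```

Here 5 of the 8 rows pass. The three failing rows are the ones where the retrieved context lacks the requested detail, which points at retrieval rather than generation as the place to improve.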
