### Setup

In [None]:
%pip install --no-cache-dir -qU ragas nltk langchain-openai

### Import dataset

In [1]:
from datasets import load_dataset
dataset = load_dataset(
    "explodinggradients/amnesty_qa",
    "english_v3",
    trust_remote_code=True
)

Repo card metadata block was not found. Setting CardData to empty.


### Load eval dataset

In [2]:
from ragas import EvaluationDataset

eval_dataset = EvaluationDataset.from_hf_dataset(dataset["eval"])

### Select eval metrics

In [9]:
from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness, SemanticSimilarity
from ragas import evaluate, RunConfig

### Set up LLM evaluator

In [13]:
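from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

# ChatOpenAI talks to a local OpenAI-compatible endpoint that serves
# gemini-1.5-flash; api_key is the key that endpoint accepts.
llm = ChatOpenAI(
    model="gemini-1.5-flash",
    timeout=900,
    api_key="sk-1",
    base_url="http://localhost:4000/v1",
)

# Smoke test: confirm the endpoint responds before running the evaluation.
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="How do I use ChatOpenAI?"),
]
response = llm.invoke(messages)
print(response.content)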

ChatOpenAI doesn't exist as a standalone product or service.  It's likely you're referring to using the OpenAI API, specifically models like `gpt-3.5-turbo` or `gpt-4`, which power conversational AI applications.  These models are accessed through the OpenAI API, not a separate "ChatOpenAI" platform.

To use OpenAI's conversational AI models, you'll need to:

1. **Create an OpenAI Account:**  Go to the OpenAI website (openai.com) and sign up for an account.  You'll likely need to provide payment information, even if you only use the free trial.

2. **Obtain an API Key:** Once your account is created, navigate to your account settings to find your API keys.  You'll need this key to authenticate your requests to the OpenAI API.  Keep this key secret; do not share it publicly.

3. **Choose a Programming Language and Library:** OpenAI's API can be accessed using various programming languages. Popular choices include Python (with the `openai` library), Node.js, and others.  You'll need to i

In [14]:
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_openai import OpenAIEmbeddings

# Wrap the LangChain objects so ragas metrics can call them.
evaluator_llm = LangchainLLMWrapper(llm)
# No api_key is passed, so OpenAIEmbeddings reads OPENAI_API_KEY from the environment.
evaluator_embeddings = LangchainEmbeddingsWrapper(
    OpenAIEmbeddings(model="nomic-embed-text", base_url="http://localhost:11434/v1")
)

### Running Evaluation

In [15]:
metrics = [
    LLMContextRecall(llm=evaluator_llm), 
    FactualCorrectness(llm=evaluator_llm), 
    Faithfulness(llm=evaluator_llm),
    # SemanticSimilarity(embeddings=evaluator_embeddings)
]

results = evaluate(dataset=eval_dataset, metrics=metrics, batch_size=1, 
                   run_config=RunConfig(timeout=900))

Evaluating: 100%|██████████| 60/60 [36:33<00:00, 36.57s/it]
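
The progress bar gives enough information for a rough runtime estimate on a larger dataset. The sketch below assumes ragas schedules one job per (sample, metric) pair and that jobs run one at a time, as with `batch_size=1` here; the 100-sample figure and the helper name `estimate_runtime_seconds` are hypothetical and used only for illustration.

```python
def estimate_runtime_seconds(num_samples, num_metrics, seconds_per_job):
    """Estimate sequential wall time, assuming one job per (sample, metric)."""
    return num_samples * num_metrics * seconds_per_job


# Figures read off the progress bar above: 60 jobs in 36 min 33 s.
jobs_done = 60
elapsed_seconds = 36 * 60 + 33
seconds_per_job = elapsed_seconds / jobs_done

# Hypothetical larger run: 100 samples scored with the same 3 metrics.
estimate = estimate_runtime_seconds(100, 3, seconds_per_job)
print(f"{seconds_per_job:.2f} s/job, about {estimate / 3600:.1f} h for 100 samples")
```

Because each job is an LLM call against a local endpoint, the real figure will vary with prompt length and server load, so treat the result as an order-of-magnitude guide.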


### Export and analyze results

In [16]:
df = results.to_pandas()
df.head()

| | user_input | retrieved_contexts | response | reference | context_recall | factual_correctness | faithfulness |
|---|---|---|---|---|---|---|---|
| 0 | What are the global implications of the USA Su... | [- In 2022, the USA Supreme Court handed down ... | The global implications of the USA Supreme Cou... | The global implications of the USA Supreme Cou... | 1.0 | 0.5 | 0.095238 |
| 1 | Which companies are the main contributors to G... | [In recent years, there has been increasing pr... | According to the Carbon Majors database, the m... | According to the Carbon Majors database, the m... | 1.0 | 0.11 | 0.166667 |
| 2 | Which private companies in the Americas are th... | [The issue of greenhouse gas emissions has bec... | According to the Carbon Majors database, the l... | The largest private companies in the Americas ... | 1.0 | 0.59 | 0.5 |
| 3 | What action did Amnesty International urge its... | [In the case of the Ogoni 9, Amnesty Internati... | Amnesty International urged its supporters to ... | Amnesty International urged its supporters to ... | 1.0 | 0.25 | 0.166667 |
| 4 | What are the recommendations made by Amnesty I... | [In recent years, Amnesty International has fo... | Amnesty International made several recommendat... | The recommendations made by Amnesty Internatio... | 1.0 | 0.36 | 0.318182 |
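
A per-metric average is usually the first thing to look at after exporting. The sketch below is a minimal standard-library illustration that uses only the five rows shown by `df.head()`, so the numbers are not the averages for the full evaluation set; the names `head_scores` and `summarize` are introduced here for illustration.

```python
from statistics import mean

# Scores copied from the five rows shown by df.head() above; the full
# DataFrame has more rows, so these are not the overall averages.
head_scores = {
    "context_recall": [1.0, 1.0, 1.0, 1.0, 1.0],
    "factual_correctness": [0.5, 0.11, 0.59, 0.25, 0.36],
    "faithfulness": [0.095238, 0.166667, 0.5, 0.166667, 0.318182],
}


def summarize(scores):
    """Return the mean of each metric column, rounded to 3 decimals."""
    return {name: round(mean(values), 3) for name, values in scores.items()}


summary = summarize(head_scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In the notebook itself, `df[["context_recall", "factual_correctness", "faithfulness"]].mean()` should give the same kind of summary over all rows.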
