[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/weaviate/recipes/blob/main/integrations/operations/ragas/ragas-demo.ipynb)

In [1]:
import weaviate
import json
import os

client = weaviate.Client(
    url=os.getenv("WEAVIATE_URL"),  # Set WEAVIATE_URL to your cluster URL
    auth_client_secret=weaviate.AuthApiKey(api_key=os.getenv("WEAVIATE_API_KEY")),  # Set WEAVIATE_API_KEY to your Weaviate instance API key
    additional_headers={
        "X-OpenAI-Api-Key": os.getenv("OPENAI_API_KEY")  # Set OPENAI_API_KEY to your inference API key
    }
)
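
The client above reads its settings with `os.getenv`, which returns `None` for any name that is not set. A minimal sketch of a pre-flight check, assuming a hypothetical helper `missing_env_vars` that is not part of the Weaviate client:

```python
import os

# Variable names taken from the cell above.
REQUIRED_VARS = ("WEAVIATE_URL", "WEAVIATE_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]
```

Calling `missing_env_vars()` before creating the client and stopping when the list is non-empty gives a clearer failure than a connection error later on.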

In [2]:
with open("faq.json", "r") as f:
    json_data = json.load(f)

queries = [{"question": item["question"], "answer": item["answer"]} for item in json_data["questions"]]
queries

[{'question': 'Why would I use Weaviate as my vector database?',
  'answer': 'Our goal is three-folded. Firstly, we want to make it as easy as possible for others to create their own semantic systems or vector search engines (hence, our APIs are GraphQL based). Secondly, we have a strong focus on the semantic element (the knowledge in vector databases, if you will). Our ultimate goal is to have Weaviate help you manage, index, and understand your data so that you can build newer, better, and faster applications. And thirdly, we want you to be able to run it everywhere. This is the reason why Weaviate comes containerized.'},
 {'question': 'What is the difference between Weaviate and for example Elasticsearch?',
  'answer': 'Other database systems like Elasticsearch rely on inverted indices, which makes search super fast. Weaviate also uses inverted indices to store data and values. But additionally, Weaviate is also a vector-native search database, which means that data is stored as vec
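
The list comprehension in this cell implies that `faq.json` has a top-level `questions` list whose items carry `question` and `answer` keys. The sketch below shows that shape with placeholder content; `extract_queries` is a hypothetical helper that skips items missing either key, which the original comprehension does not do.

```python
import json

# Placeholder content; only the key names come from the cell above.
sample = json.loads("""
{
  "questions": [
    {"question": "Example question?", "answer": "Example answer."},
    {"question": "Item without an answer"}
  ]
}
""")


def extract_queries(json_data):
    """Keep only the items that have both a question and an answer."""
    return [
        {"question": item["question"], "answer": item["answer"]}
        for item in json_data.get("questions", [])
        if "question" in item and "answer" in item
    ]


queries_sample = extract_queries(sample)
```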

In [15]:
# Ragas expects the columns ['question', 'answer', 'contexts', 'ground_truths'],
# each holding one entry per question:
# {
#     "question": ['What is ref2vec?', ...],  <-- question from the FAQ doc
#     "answer": [],          <-- answer from the generated result
#     "contexts": [],        <-- retrieved content
#     "ground_truths": []    <-- answer from the FAQ doc
# }
questions = []
answers = []
contexts = []
ground_truths = []

for query in queries:
    question = query["question"]
    graphql_query = """
    {
        Get {
            Document(
                hybrid: {
                    query: "%s",
                    alpha: 1
                },
                limit: 5
            ){
                content
                source
                title
                _additional {
                    generate(
                        groupedResult: {
                            task: "Please answer the question %s. Make sure your answer is based on the following search results."
                        }
                    ){
                        groupedResult
                        error
                    }
                }
            }
        }
    }""" % (question, question)

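    questions.append(question)
    # Each ground truth is wrapped in a list, one list per question
    ground_truths.append([query["answer"]])
    responses = client.query.raw(graphql_query)["data"]["Get"]["Document"]
    # The grouped generation is read from the first returned object
    new_answer = responses[0]["_additional"]["generate"]["groupedResult"]
    answers.append(new_answer)
    # Keep the retrieved chunks as the contexts for this question
    new_contexts = [response["content"] for response in responses]
    contexts.append(new_contexts)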
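
The loop above interpolates each question into the GraphQL text with `%s`, so a question containing a double quote or a backslash would end the string literal early. One possible guard, offered as an assumption and not as the notebook's method, is to escape the text with `json.dumps` first, since JSON's string escapes are also accepted in GraphQL string literals. `graphql_escape` and `build_query` are hypothetical helpers, and the query is a shortened version of the one above.

```python
import json


def graphql_escape(text):
    """Escape text for use inside a double-quoted GraphQL string literal."""
    # json.dumps adds surrounding quotes; strip them and keep the escaped body.
    return json.dumps(text)[1:-1]


def build_query(question, limit=5):
    """Build a shortened version of the hybrid-search query from the cell above."""
    escaped = graphql_escape(question)
    return """
    {
        Get {
            Document(
                hybrid: { query: "%s", alpha: 1 },
                limit: %d
            ){
                content
                _additional {
                    generate(
                        groupedResult: {
                            task: "Please answer the question %s. Make sure your answer is based on the following search results."
                        }
                    ){
                        groupedResult
                        error
                    }
                }
            }
        }
    }""" % (escaped, limit, escaped)
```

The returned string could then be passed to `client.query.raw` in the same way as `graphql_query` in the loop.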

In [16]:
data = {
    "question": questions,
    "answer": answers,
    "contexts": contexts,
    "ground_truths": ground_truths
}
data

{'question': ['Why would I use Weaviate as my vector database?',
  'What is the difference between Weaviate and for example Elasticsearch?',
  'Do I need to know about Docker (Compose) to use Weaviate?',
  'What happens when the Weaviate Docker container restarts? Is my data in the Weaviate database lost?',
  "Are there any 'best practices' or guidelines to consider when designing a schema?",
  'Is it possible to create one-to-many relationships in the schema?',
  'Do Weaviate classes have namespaces?',
  'Are there restrictions on UUID formatting? Do I have to adhere to any standards?',
  'If I do not specify a UUID during adding data objects, will Weaviate create one automatically?',
  'Can I use Weaviate to create a traditional knowledge graph?',
  'Why does Weaviate have a schema and not an ontology?',
  'How can I retrieve the total object count in a class?',
  "How do I get the cosine similarity from Weaviate's certainty?",
  'What is the best way to iterate through objects? Can 
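
Before handing `data` to `Dataset.from_dict`, it can help to confirm that the four columns line up. The sketch below is a hypothetical check, not something Ragas provides: it assumes every column must have the same length and that each `contexts` and `ground_truths` entry is a list, as built in the loop above.

```python
def check_ragas_columns(data):
    """Return a list of problems found in the column dictionary (empty if none)."""
    expected = ["question", "answer", "contexts", "ground_truths"]
    problems = [f"missing column: {name}" for name in expected if name not in data]

    # All columns should describe the same number of questions.
    lengths = {name: len(data[name]) for name in expected if name in data}
    if len(set(lengths.values())) > 1:
        problems.append(f"column lengths differ: {lengths}")

    # These two columns hold one list per question.
    for name in ("contexts", "ground_truths"):
        for i, value in enumerate(data.get(name, [])):
            if not isinstance(value, list):
                problems.append(f"{name}[{i}] should be a list of strings")
    return problems
```

An empty return value means the dictionary has the layout described in the comment at the top of the previous cell.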

## Ragas part

In [19]:
from datasets import Dataset
dataset = Dataset.from_dict(data)

In [20]:
from ragas import evaluate
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_precision,
    context_recall
)

result = evaluate(
    dataset = dataset,
    metrics = [faithfulness, answer_relevancy, context_precision, context_recall]
)

evaluating with [faithfulness]


100%|██████████| 2/2 [01:55<00:00, 57.98s/it]


evaluating with [answer_relevancy]


  0%|          | 0/2 [00:00<?, ?it/s]
 50%|█████     | 1/2 [00:11<00:11, 11.99s/it]

evaluating with [context_precision]


100%|██████████| 2/2 [00:04<00:00,  2.35s/it]


evaluating with [context_recall]


100%|██████████| 2/2 [00:53<00:00, 26.87s/it]


In [21]:
df = result.to_pandas()
df

| | question | contexts | answer | ground_truths | faithfulness | answer_relevancy | context_precision | context_recall |
|---|---|---|---|---|---|---|---|---|
| 0 | Why would I use Weaviate as my vector database? | [ware, customers using the free service will a... | You might consider using Weaviate as your vect... | [Our goal is three-folded. Firstly, we want to... | 0.9 | 0.967471 | 1.0 | 1.0 |
| 1 | What is the difference between Weaviate and fo... | [reason why Weaviate comes containerized.\n\n\... | The difference between Weaviate and Elasticsea... | [Other database systems like Elasticsearch rel... | 0.833333 | 0.950363 | 1.0 | 0.5 |
| 2 | Do I need to know about Docker (Compose) to us... | [\n\n## Overview\n\nWeaviate supports deployme... | Based on the search results provided, it is no... | [Weaviate uses Docker images as a means to dis... | 0.714286 | 0.990731 | 1.0 | 0.5 |
| 3 | What happens when the Weaviate Docker containe... | [arts? Is my data in the Weaviate database los... | When the Weaviate Docker container restarts, w... | [There are three levels: You have no volume co... | 0.666667 | 0.950248 | 1.0 | 0.333333 |
| 4 | Are there any 'best practices' or guidelines t... | [ start up with a volume, all your data will b... | When designing a schema, there are indeed seve... | [As a rule of thumb, the smaller the units, th... | 1.0 | 0.979785 | 1.0 | 0.0 |
| 5 | Is it possible to create one-to-many relations... | [.\n\n\n\n#### Q: Is it possible to create one... | Yes, it is possible to create one-to-many rela... | [Yes, it is possible to reference to one or mo... | 0.5 | 1.0 | 0.416667 | 0.5 |
| 6 | Do Weaviate classes have namespaces? | [ses in weave maybe also for new listeners we ... | Yes, Weaviate classes act as namespaces. This ... | [Yes. Each class itself acts like namespaces. ... | 0.666667 | 0.937631 | 0.533333 | 0.333333 |
| 7 | Are there restrictions on UUID formatting? Do ... | [ID formatting? Do I have to adhere to any sta... | Yes, there are restrictions on UUID formatting... | [The UUID must be presented as a string matchi... | 0.375 | 0.961429 | 1.0 | 1.0 |
| 8 | If I do not specify a UUID during adding data ... | [terministically determine them based on some ... | Yes, if you do not specify a UUID when adding ... | [Yes, a UUID will be created if not specified.] | 1.0 | 0.949678 | 1.0 | 1.0 |
| 9 | Can I use Weaviate to create a traditional kno... | [viate to create a traditional knowledge graph... | Yes, you can use Weaviate to create a traditio... | [Yes, you can! Weaviate support ontology, RDF-... | 0.8 | 0.950145 | 1.0 | 0.666667 |
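
To see which metric drags the results down, the per-question scores can be averaged and the weakest row located. The sketch below copies the scores from the ten rows shown above and uses only the standard library; `summarize` is a hypothetical helper. With the DataFrame itself, `df[metric_columns].mean()` should give the same averages, where `metric_columns` is an assumed list of the four metric names.

```python
from statistics import mean

# Scores copied from the ten rows shown above.
scores = {
    "faithfulness": [0.9, 0.833333, 0.714286, 0.666667, 1.0, 0.5, 0.666667, 0.375, 1.0, 0.8],
    "answer_relevancy": [0.967471, 0.950363, 0.990731, 0.950248, 0.979785, 1.0, 0.937631, 0.961429, 0.949678, 0.950145],
    "context_precision": [1.0, 1.0, 1.0, 1.0, 1.0, 0.416667, 0.533333, 1.0, 1.0, 1.0],
    "context_recall": [1.0, 0.5, 0.5, 0.333333, 0.0, 0.5, 0.333333, 1.0, 1.0, 0.666667],
}


def summarize(scores):
    """Return (mean score, row index of the lowest score) for each metric."""
    return {
        metric: (mean(values), values.index(min(values)))
        for metric, values in scores.items()
    }


summary = summarize(scores)
for metric, (average, worst_row) in summary.items():
    print(f"{metric}: mean={average:.3f}, lowest score at row {worst_row}")
```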


In [22]:
# Adjust the output path for your environment
df.to_csv('/Users/erikacardenas/Desktop/ragas1.csv', index=True)