<a target="_blank" href="https://colab.research.google.com/github/okareo-ai/okareo-python-sdk/blob/main/examples/retrieval_eval.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

## Generate a retrieval-RAG evaluation scenario from your data!

Get your API token from [https://app.okareo.com/](https://app.okareo.com/) and set it in the cell below. 👇
   (Note: You will need to register first.)





In [7]:
OKAREO_API_KEY = "<YOUR-OKAREO-API-TOKEN>"

In [None]:
%pip install okareo

**Load documents from your RAG database as an Okareo scenario.**  
- These documents will be used to generate synthetic user questions.
- As an example, we use documents about WebBizz, a fictitious web business. Each line of the .jsonl file contains ```"result": "<ID of the document>", "input": "<document text>"```
- Replace this with your own data in the same .jsonl format.
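
If your documents are not yet in this format, the following is a minimal sketch of how such a file could be written with the standard library. The document IDs, texts, file name, and the `write_jsonl` helper are hypothetical placeholders, not part of the Okareo SDK; it assumes each line is a complete JSON object with `result` and `input` keys, as described above.

```python
import json
import os
import tempfile

# Hypothetical documents: the IDs and texts are placeholders, not real WebBizz data.
documents = [
    {"result": "doc-001", "input": "Text of the first document."},
    {"result": "doc-002", "input": "Text of the second document."},
]


def write_jsonl(rows, path):
    """Write one JSON object per line (the JSON Lines layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


# Hypothetical output location; any writable path works.
jsonl_path = os.path.join(tempfile.gettempdir(), "my_documents.jsonl")
write_jsonl(documents, jsonl_path)

# Read the file back to confirm that every line parses as JSON.
with open(jsonl_path, encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
```

The resulting path could then be passed as `file_path` to `okareo.upload_scenario_set`, in place of the temporary file used in the next cell.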

In [None]:
import os
import tempfile
import random
import string
from okareo import Okareo

random_string = ''.join(random.choices(string.ascii_letters, k=5))

okareo = Okareo(OKAREO_API_KEY)
# WebBizz is an example web business.
# We load short documents about different aspects of the business as the source scenario.
webbizz_documents = os.popen('curl https://raw.githubusercontent.com/okareo-ai/okareo-python-sdk/main/examples/webbizz_30_articles.jsonl').read()

with tempfile.NamedTemporaryFile(suffix="webbizz_30_articles.jsonl", mode="w+", delete=True) as temp_file:
    temp_file.write(webbizz_documents)
    temp_file.seek(0)  # Rewind to the beginning; this also flushes the buffered write to disk

    # Upload the documents to Okareo from the temporary file
    document_scenario = okareo.upload_scenario_set(file_path=temp_file.name, scenario_name=f"WebBizz Documents - {random_string}")

**Generate retrieval questions from the documents using the Okareo Text Reverse Question Generator**

In [18]:

from okareo import Okareo
from okareo_api_client.models.generation_tone import GenerationTone
from okareo_api_client.models.scenario_set_generate import ScenarioSetGenerate
from okareo_api_client.models.scenario_type import ScenarioType

okareo = Okareo(OKAREO_API_KEY)
random_string = ''.join(random.choices(string.ascii_letters, k=5))

# Use the scenario set of documents to generate a scenario of questions
generated_scenario = okareo.generate_scenario_set(
    ScenarioSetGenerate(
        name=f"Retrieval - Generated Scenario - {random_string}",
        source_scenario_id=document_scenario.scenario_id,
        number_examples=4,  # Number of questions to generate for each document
        generation_type=ScenarioType.TEXT_REVERSE_QUESTION,  # Generates questions from the document text
        generation_tone=GenerationTone.INFORMAL,  # Tone of the generated questions
        post_template="""{"question": "{generation.input}", "document": "{input}"}""",  # For easy validation, each generated question is stored next to its source document
    )
)

# Print a link back to the Okareo app to see the generated scenario
print(f"See generated scenario in Okareo app: {generated_scenario.app_link}")

See generated scenario in Okareo app: https://app.okareo.com/project/394c2c12-be7a-47a6-911b-d6c673bc543b/scenario/76bdab7c-ab96-4eb2-8a63-06e9bb33dfba
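
To make the `post_template` above more concrete, the sketch below mimics the placeholder substitution locally with plain string replacement. This is only an assumption about how the placeholders are filled, not Okareo's implementation; the `render_row` helper and the sample question and document are made up for illustration.

```python
import json

# Same template string as in the generation call above.
POST_TEMPLATE = '{"question": "{generation.input}", "document": "{input}"}'


def render_row(template, generated_question, source_document):
    """Hypothetical helper: fill the two placeholders by simple string replacement."""
    return (
        template
        .replace("{generation.input}", generated_question)
        .replace("{input}", source_document)
    )


# Made-up sample values, used only to show the shape of a rendered row.
row = render_row(
    POST_TEMPLATE,
    "how do i get in touch with support?",
    "Example document text about customer support.",
)

# The rendered row is a JSON string holding the question next to its source document.
parsed = json.loads(row)
print(parsed["question"])
print(parsed["document"])
```

Keeping the question and its source document in one row makes it easy to spot-check whether a generated question can actually be answered from that document.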
