In [1]:
from uptrain import EvalAssistant
import os



Let's set the inputs for your AI assistant:
- `user_bot_name`: The name of your bot

- `user_bot_instructions`: The instructions (system prompt) that define how your assistant should behave

- `user_bot_file_list`: The paths to the files that act as your knowledge base

- `user_bot_model` (optional): The LLM you want to use (we will use `gpt-4-1106-preview` by default)
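
Since `user_bot_file_list` is a list of file paths, it can help to confirm that every path exists before starting a simulation. The snippet below is a minimal sketch of such a check; `find_missing_files` is a hypothetical helper written for this notebook, not part of UpTrain.

```python
from pathlib import Path


def find_missing_files(paths):
    """Return the entries of `paths` that do not point to an existing file."""
    return [p for p in paths if not Path(p).is_file()]


# Same paths as used for the knowledge base in this notebook.
user_bot_file_list = [
    "context_docs/nurse_doc.docx",
    "context_docs/covid_faq.pdf",
    "context_docs/malaria.pdf",
]

missing = find_missing_files(user_bot_file_list)
if missing:
    print("Missing knowledge-base files:", missing)
```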

In [2]:
user_bot_name = 'Nurse Bot v1'

user_bot_instructions = "You are an expert, professional nurse who is supposed to answer patient queries on different medical scenarios to patients."

user_bot_file_list = ['context_docs/nurse_doc.docx','context_docs/covid_faq.pdf','context_docs/malaria.pdf']

Let's set the arguments for the evaluator:

- `user_bot_purpose`: A short description of the purpose of your bot

- `evaluator_persona`: A list of personas (or scenarios) you wish to test your bot against

- `evaluator_bot_model` (optional): The LLM you want to use (we will use `gpt-4-1106-preview` by default)
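
When enabling more than one persona in the list, make sure every entry ends with a comma. Python silently concatenates adjacent string literals, so a missing comma merges two personas into one. A small illustration, using two persona strings from this notebook:

```python
# Missing comma: Python joins the two adjacent string literals into one entry.
personas_missing_comma = [
    "Anxious patient preparing for surgery"
    "New parent asking about infant feeding",
]

# A comma after every entry keeps the personas separate.
personas_ok = [
    "Anxious patient preparing for surgery",
    "New parent asking about infant feeding",
]

print(len(personas_missing_comma))  # 1
print(len(personas_ok))             # 2
```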

In [3]:
####### Scenario to evaluate #########

user_bot_purpose = 'Answer patient queries on different medical scenarios to patients'

evaluator_persona = [
    "Elderly patient asking about the symptoms of COVID 19",
    # "Anxious patient preparing for surgery",
    # "A mother whose teenage son is suffering from Malaria",
    # "An anxious patient irritated about the pain he is facing due to chicken pox medicines",
    # "New parent asking about infant feeding",
    # "Chronic pain patient managing arthritis",
    # "Teenager seeking advice on acne treatment",
    # "Caregiver looking for tips on dementia care",
    # "Busy professional with flu symptoms",
    # "Non-native speaker asking about medication side effects",
    # "A patient who talks in pronouns"
]

In [4]:
user_bot_purpose

'Answer patient queries on different medical scenarios to patients'

Let's simulate conversations based on these personas.

By default, 4 conversation pairs are generated for each scenario. To change this, pass the `trial_count` argument, for example `trial_count = 10` for 10 conversation pairs. The cell below uses a `trial_count` of 2.
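
The next cell reads the OpenAI key from `os.environ["OPENAI_API_KEY"]`, which raises a bare `KeyError` if the variable is not set. A minimal sketch of a friendlier check is shown here; `require_env` is a hypothetical helper, not an UpTrain function.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Possible usage before creating the client:
# openai_api_key = require_env("OPENAI_API_KEY")
```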

In [5]:
assistant_eval_client = EvalAssistant(openai_api_key=os.environ["OPENAI_API_KEY"])

message = assistant_eval_client.simulate_conversation(
    user_bot_name = user_bot_name,
    user_bot_instructions = user_bot_instructions,
    user_bot_purpose = user_bot_purpose,
    user_bot_file_list = user_bot_file_list,
    evaluator_persona_list = evaluator_persona,
    trial_count = 2
)

2024-04-03 19:08:23.501 | INFO     | uptrain.framework.eval_assistant.assistant_evals_utils:simulate_conversation:202 - Step 1 of 1 Completed


0      Symptoms: Provide information on common sympto...
1       Include advice on when to seek medical attent...
2       COVID-19 Information: Explain the symptoms, t...
3       Include details on testing and quarantine gui...
4       Medication Information: Provide guidance on c...
                             ...                        
651                                                 936)
652     \n• Management of severe malaria: a practical...
653                        \n• World malaria report 2016
654             Geneva: World Health Org anization; 2016
655                                                  \n 
Name: chunk, Length: 656, dtype: object

In [1]:
from sentence_transformers import SentenceTransformer
from uptrain.utilities import lazy_load_dep

# Scratch check: encode an arbitrary string with the sentence encoder
text = "efvhefvehfv b3jr 3r3  3r3r3r3r3r r3r3gg3 3r3r3r 3r3r3r "

encoder = SentenceTransformer("paraphrase-mpnet-base-v2")
vectors = encoder.encode(text)



Now let's evaluate these simulated conversations.

We will use UpTrain's [Conversation Satisfaction](https://docs.uptrain.ai/predefined-evaluations/conversation-evals/user-satisfaction) to test whether the user seems satisfied with the assistant's responses, together with the factual accuracy, response relevance, context relevance, and response conciseness checks.

In [None]:
from uptrain import ConversationSatisfaction, Evals

results = assistant_eval_client.evaluate(
    data = message,
    checks = [
        ConversationSatisfaction(llm_persona = user_bot_purpose),
        Evals.FACTUAL_ACCURACY,
        Evals.RESPONSE_RELEVANCE,
        Evals.CONTEXT_RELEVANCE,
        Evals.RESPONSE_CONCISENESS
    ]
)

In [None]:
results[0]['conversation']
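
Once `results` is available, a per-check average gives a quick overview across conversations. The sketch below assumes that `results` is a list of dictionaries and that numeric scores live under keys starting with `score_`; both are assumptions about the output format, and the example rows are made up for illustration. `average_scores` is a hypothetical helper, not part of UpTrain.

```python
from collections import defaultdict


def average_scores(rows):
    """Average every numeric field whose key starts with 'score_' (assumed naming)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        for key, value in row.items():
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if key.startswith("score_") and is_number:
                totals[key] += value
                counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Made-up rows standing in for `results`; missing scores (None) are skipped.
example_rows = [
    {"score_factual_accuracy": 1.0, "score_response_relevance": 0.5},
    {"score_factual_accuracy": 0.5, "score_response_relevance": None},
]
print(average_scores(example_rows))
```

If the real output uses different key names, only the `startswith("score_")` filter needs to change.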