Description
Is your feature request related to a problem?
We are currently in the process of migrating from a single unified assistant (X) to three cadre-specific assistants (Y1, Y2, Y3) for Antara, as the latter setup has been yielding better results.
Please note that while the prompts for each of these assistants are different, they initially shared the same knowledge base (i.e., the same Vector Store ID was used across all assistants).
For this new setup, we ran the first round of evaluations on the older Kaapi Konsole. The results can be accessed here: Click here
Following this, we made updates to the prompts and decided to add additional guidelines to the testing assistants.
All the versions of the prompts can be accessed here: Click here
Since the same Vector Store ID was shared between the testing assistants and the live assistant in production, we created a copy of the Vector Store and attached it to the testing assistants from the backend (outside the Glific UI). This ensured that any additional files used for testing would not be added to the assistant currently live in production. More details about this process can be accessed here: Click here
After adding the additional files to the testing assistants' KB, we observed a considerable dip in the evaluation scores. This is unexpected, since all the documents added are new guidelines.
Describe the solution you'd like
- Investigate the cause of the dip in evaluation scores.
- Compare the performance metrics before and after adding the new files.
- Consider rollback options or adjustments to the prompts if necessary.
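As a starting point for the before/after comparison, here is a minimal sketch in Python. It assumes each evaluation run can be exported as a mapping from a question ID to a numeric score; that export format, the function name `compare_runs`, and the sample values are all hypothetical and not taken from the Kaapi Konsole. The idea is to look at per-question deltas rather than only the overall average, so the questions that regressed most can be checked against the newly added guideline files.

```python
from statistics import mean


def compare_runs(before, after, threshold=0.0):
    """Compare per-question scores from two evaluation runs.

    `before` and `after` map a question ID to a numeric score
    (hypothetical export format). Only question IDs present in
    both runs are compared.
    """
    shared = sorted(set(before) & set(after))
    if not shared:
        raise ValueError("no question IDs in common between the two runs")

    # Positive delta = score improved, negative delta = score dropped.
    deltas = {qid: after[qid] - before[qid] for qid in shared}

    # Questions whose score dropped by more than `threshold`,
    # largest drop first.
    regressions = sorted(
        ((qid, d) for qid, d in deltas.items() if d < -threshold),
        key=lambda item: item[1],
    )

    return {
        "mean_before": mean(before[q] for q in shared),
        "mean_after": mean(after[q] for q in shared),
        "mean_delta": mean(deltas.values()),
        "regressions": regressions,
    }


if __name__ == "__main__":
    # Made-up illustrative scores, not real evaluation results.
    run_before = {"q1": 0.9, "q2": 0.8, "q3": 0.7}
    run_after = {"q1": 0.5, "q2": 0.8, "q3": 0.75}
    report = compare_runs(run_before, run_after)
    print(report["mean_delta"])
    for qid, delta in report["regressions"]:
        print(qid, round(delta, 3))
```

Running this once per assistant (Y1, Y2, Y3) would show whether the dip is spread evenly or concentrated in a few questions, which in turn suggests whether the new files are crowding out previously retrieved content.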