Evaluation: Improving scores for Educate Girls

**Is your feature request related to a problem?**  

Educate Girls' existing config, prompt, and vector store setup currently yields a low evaluation score.

**Describe the solution you'd like**  
- Audit the current setup and baseline evaluation score.  
- Identify factors hurting performance, such as the prompt, model choice, the files used to set up the vector store, or poor golden question-answer pairs.  
- Iterate on improvements based on findings from similar NGOs like CBC and Antara, focusing on prompt and configuration changes.  
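
One way to structure the audit is to change a single factor at a time and record how the score moves relative to the baseline. The sketch below is only an illustration: `evaluate` is a hypothetical callable standing in for whatever runs the evaluation and returns one aggregate score, and the config keys are placeholders, not the actual Educate Girls config fields.

```python
from typing import Any, Callable, Dict

Config = Dict[str, Any]


def score_deltas(
    baseline: Config,
    variants: Dict[str, Config],
    evaluate: Callable[[Config], float],
) -> Dict[str, float]:
    """Return the score change versus the baseline for each named variant.

    Each variant is a partial override (for example, only the prompt),
    so every delta can be attributed to one factor.
    `evaluate` is a hypothetical stand-in for the real eval run.
    """
    baseline_score = evaluate(baseline)
    deltas: Dict[str, float] = {}
    for name, override in variants.items():
        # Apply the override on top of the baseline, leaving all else fixed.
        candidate = {**baseline, **override}
        deltas[name] = evaluate(candidate) - baseline_score
    return deltas
```

A positive delta suggests the changed factor was holding the score back. A delta near zero suggests looking elsewhere, for example at the golden question-answer pairs themselves.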


**Additional Context**  
In parallel with the Gemini-based eval runner work, we should also test Gemini as the underlying model and switch to it if it shows a meaningful improvement. Cosine similarity and the LLM-as-judge score should continue to use OpenAI so that the scoring framework stays constant.
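
To illustrate what keeping the scoring framework constant means for the cosine similarity part, here is a minimal sketch. It assumes answers from each candidate model are embedded with the same function before being compared to the golden answers. `embed` is a hypothetical callable standing in for the existing OpenAI embedding step, and the LLM-as-judge score is not shown.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_similarity(
    answers: Sequence[str],
    golden_answers: Sequence[str],
    embed: Callable[[str], Vector],
) -> float:
    """Average cosine similarity between model answers and golden answers.

    `embed` is a hypothetical stand-in for the embedding call; passing the
    same function for every model under test keeps the scores comparable.
    """
    if len(answers) != len(golden_answers):
        raise ValueError("answers and golden_answers must have the same length")
    if not answers:
        raise ValueError("at least one answer is required")
    scores = [
        cosine_similarity(embed(answer), embed(golden))
        for answer, golden in zip(answers, golden_answers)
    ]
    return sum(scores) / len(scores)
```

Calling `mean_similarity` once with the current model's answers and once with Gemini's answers, using the same `embed` and the same golden answers, means any difference in the result comes from the model and not from the scorer.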


Evaluation: Improving scores for Educate Girls #855
