Evaluation: Improving scores for Educate Girls #855

@AkhileshNegi

Description

Is your feature request related to a problem?

Educate Girls' existing config, prompt, and vector store setup currently produces a low evaluation score.

Describe the solution you'd like

  • Audit the current setup and baseline evaluation score.
  • Identify factors hurting performance, such as the prompt, the model choice, the files used to set up the vector store, or poor golden question-answer pairs.
  • Iterate on improvements based on findings from similar NGOs like CBC and Antara, focusing on prompt and configuration changes.

Additional Context
In parallel with the Gemini-based eval runner work, we should also test Gemini as the underlying model and switch if it shows a meaningful improvement. OpenAI should still be used for the cosine similarity calculation and the LLM-as-judge score, so that the scoring framework stays constant.
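
To make "keep the scoring framework constant" concrete, here is a minimal sketch of a comparison harness in which only the answer-generating model is swapped while the scorer stays fixed. Everything in it is an assumption for illustration and not the project's actual code: the names `GoldenItem`, `make_cosine_scorer`, `evaluate`, and `meaningful_improvement` are hypothetical, and the `min_gain` threshold is a placeholder, since this issue does not define what counts as a meaningful improvement.

```python
import math
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch: zero vector scores 0
    return dot / (norm_a * norm_b)


@dataclass(frozen=True)
class GoldenItem:
    """One golden question with its expected answer (hypothetical shape)."""
    question: str
    expected_answer: str


def make_cosine_scorer(
    embed_fn: Callable[[str], Vector],
) -> Callable[[str, str], float]:
    """Build a scorer from a fixed embedding function.

    The same embed_fn is reused for every run so scores stay comparable.
    """
    def score(generated: str, expected: str) -> float:
        return cosine_similarity(embed_fn(generated), embed_fn(expected))
    return score


def evaluate(
    answer_fn: Callable[[str], str],
    score_fn: Callable[[str, str], float],
    golden: Sequence[GoldenItem],
) -> float:
    """Mean score of answer_fn over the golden set under a fixed score_fn."""
    if not golden:
        raise ValueError("golden set is empty")
    return mean(
        score_fn(answer_fn(item.question), item.expected_answer)
        for item in golden
    )


def meaningful_improvement(
    baseline: float, candidate: float, min_gain: float = 0.05
) -> bool:
    """True if the candidate beats the baseline by at least min_gain.

    min_gain is an assumed placeholder threshold, not a project setting.
    """
    return candidate - baseline >= min_gain
```

In practice `answer_fn` would wrap the current pipeline for the baseline run and the Gemini-backed pipeline for the candidate run, while `embed_fn` would call the same OpenAI embedding model in both. An LLM-as-judge scorer could be passed as another `score_fn` with the same `(generated, expected) -> float` shape.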


Status

To Do
