Feat: Row-level eval details #35

Merged
JeanKaddour merged 2 commits into main from feat/detailed_evals on Dec 2, 2024
@JeanKaddour
Contributor

This pull request adds row-level evaluation details. It changes the backend evaluation logic, cleans up the frontend components, and adds a dedicated results page.

Backend Changes:

  1. Evaluation Logic Updates:
    • Renamed task_id to example_id in async def evaluate_dataset_batch so the identifier better reflects what is being evaluated.
    • Added full_prompt to the loop in async def evaluate_dataset_batch so the full prompt is included in the evaluation.
    • Updated response storage and logging to use example_id instead of task_id.
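
The backend changes above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only the names evaluate_dataset_batch, example_id, and full_prompt come from the description. The RowResult shape, the prompt_template parameter, the exact-match correctness check, and the fake_model stand-in are all assumptions made for the example.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class RowResult:
    """One row of eval detail (hypothetical shape, not the repo's schema)."""
    example_id: str
    full_prompt: str
    predicted: str
    ground_truth: str
    is_correct: bool


async def fake_model(full_prompt: str) -> str:
    """Hypothetical stand-in for the real model call; always answers '4'."""
    await asyncio.sleep(0)
    return "4"


async def evaluate_dataset_batch(batch, prompt_template="Q: {problem}\nA:"):
    """Evaluate a batch and return results keyed by example_id."""
    responses = {}
    for example_id, problem, ground_truth in batch:
        # Build and keep the full prompt so it can be shown per row later.
        full_prompt = prompt_template.format(problem=problem)
        predicted = (await fake_model(full_prompt)).strip()
        # Store under example_id (assumed exact-match scoring).
        responses[example_id] = RowResult(
            example_id=example_id,
            full_prompt=full_prompt,
            predicted=predicted,
            ground_truth=ground_truth,
            is_correct=predicted == ground_truth,
        )
    return responses


batch = [("ex-1", "2 + 2", "4"), ("ex-2", "3 + 3", "6")]
results = asyncio.run(evaluate_dataset_batch(batch))
```

Keying the stored responses by example_id means each row can later be looked up and displayed individually, together with the prompt that produced it.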

Frontend Changes:

  1. Component Cleanup:
    • Removed unused imports from Header.jsx to clean up the code.
  2. Evaluation Page Updates:
    • Simplified the handleViewResults function in evals.jsx to navigate to a dedicated results page instead of using a modal.
    • Removed the modal for displaying evaluation results from evals.jsx.
  3. New Results Page:
    • Added a new results page (evals/[id].js) to display evaluation results in a table format, including example ID, problem, predicted answer, ground truth, and correctness status. (frontend/src/pages/evals/[id].js R1-R164)

JeanKaddour merged commit 20b3a12 into main on Dec 2, 2024
JeanKaddour deleted the feat/detailed_evals branch on December 28, 2024 at 01:53