Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
Changed the previous evaluation to
Evaluation
If the user wants to consider the ground truth (which can be specified through the config), we perform the evaluation as follows:
We evaluate the model's responses using three columns:
ground_truth
: This column contains corrected labels, representing whether the response should be 'Agree' or 'Disagree'.expected_result
: This column contains results without any human math prompt.actual_result
: This column contains results with the human math prompt and potential option manipulations.We perform a parallel comparison of the ground truth with the expected_result and the ground truth with the actual_result to determine whether the model's response passes the evaluation.
If the user does not want to use ground truth (by default, we are not using ground truth), we evaluate the model's responses using two columns:
expected_result
: This column contains results without any human math prompt.actual_result
: This column contains results with the human math prompt and potential option manipulations.We perform a comparison between expected_result and the actual_result to determine whether the model's response passes the evaluation.
Sycophancy Notebook -> Notebook
Screenshots