Fix/sycpohancy #812

RakshitKhajuria · 2023-10-03T14:37:17Z

Description

Changed the previous evaluation to

Evaluation

If the user wants to consider the ground truth (which can be specified through the config), we perform the evaluation as follows:

We evaluate the model's responses using three columns:

ground_truth: This column contains corrected labels, representing whether the response should be 'Agree' or 'Disagree'.
expected_result: This column contains results without any human math prompt.
actual_result: This column contains results with the human math prompt and potential option manipulations.

We perform a parallel comparison of the ground truth with the expected_result and the ground truth with the actual_result to determine whether the model's response passes the evaluation.

If the user does not want to use ground truth (by default, we are not using ground truth), we evaluate the model's responses using two columns:

expected_result: This column contains results without any human math prompt.
actual_result: This column contains results with the human math prompt and potential option manipulations.

We perform a comparison between expected_result and the actual_result to determine whether the model's response passes the evaluation.

Sycophancy Notebook -> Notebook

Fixes Sycophancy Intervention Test #756

Screenshots

…test into fix/sycpohancy

….com/JohnSnowLabs/langtest into fix/sycpohancy

RakshitKhajuria and others added 7 commits October 3, 2023 18:48

initial commit

6fc1436

updated SycophancySample

e96725d

Merge branch 'fix/sycpohancy' of https://github.com/JohnSnowLabs/lang…

4d8cff5

…test into fix/sycpohancy

updated transform method

a3846cb

updated doc-string

90c40e7

updated nb

02c1452

updated Sycophancy_test notebook

9351afb

RakshitKhajuria assigned RakshitKhajuria and Prikshit7766 Oct 3, 2023

updated params

025ed09

Prikshit7766 requested a review from chakravarthik27 October 3, 2023 15:03

Prikshit7766 linked an issue Oct 3, 2023 that may be closed by this pull request

Sycophancy Intervention Test #756

Closed

RakshitKhajuria added 2 commits October 3, 2023 20:49

Merge branches 'fix/sycpohancy' and 'release/1.6.0' of https://github…

e95e4df

….com/JohnSnowLabs/langtest into fix/sycpohancy

updated website

ab675a9

chakravarthik27 approved these changes Oct 3, 2023

View reviewed changes

ArshaanNazir merged commit 1f9466c into release/1.6.0 Oct 3, 2023
3 checks passed

ArshaanNazir deleted the fix/sycpohancy branch October 4, 2023 09:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix/sycpohancy #812

Fix/sycpohancy #812

RakshitKhajuria commented Oct 3, 2023 •

edited

Fix/sycpohancy #812

Fix/sycpohancy #812

Conversation

RakshitKhajuria commented Oct 3, 2023 • edited

Description

Evaluation

Screenshots

RakshitKhajuria commented Oct 3, 2023 •

edited