clarification on annotation entries from alpaca_eval #223

Closed
xwinxu opened this issue Feb 1, 2024 · 4 comments · Fixed by #224
Comments

@xwinxu
Contributor

xwinxu commented Feb 1, 2024

I wanted to clarify the notation in the alpaca_eval command. Let's say I passed in --model_outputs my_model_A.json and --reference_outputs my_model_B.json. The resulting annotations.json will contain the keys output_1, output_2, and a ranking under raw_completion:

    "raw_completion":{
      "concise_explanation":"....",
      "ordered_models":[
        {
          "model":"M",
          "rank":1
        },
        {
          "model":"m",
          "rank":2
        }
      ]
    }

According to this template, m corresponds to output_1 and M corresponds to output_2, correct? Would this then mean that model_outputs also corresponds to m? I'm not sure which interpretation is correct.

Thanks!

xwinxu changed the title from "clarification on model outputs used for evaluation" to "clarification on annotation entries from alpaca_eval" on Feb 1, 2024
@rtaori
Collaborator

rtaori commented Feb 1, 2024

Do you see a field referenced_models alongside? It should look something like this:
"referenced_models": { "M": "output_1", "m": "output_2" }
which should be easier to understand and hopefully resolves the confusion.

@YannDubs
Collaborator

YannDubs commented Feb 1, 2024

@xwinxu check out the readme: the rank tells you which model was preferred. So if preference = 2 (i.e. output_2 is preferred) and M has rank 1 (i.e. it is the preferred model), then you know that M corresponds to output_2. The reason all of this is a bit complicated is that we randomize the order of the outputs when evaluating.

I'll add the referenced_models field that Rohan is referring to; it is currently only added for sample sheets, so you won't see it locally.

Note that your reference model will always be output_1.
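For what it's worth, here is a minimal sketch of that logic in Python (assuming an annotations.json structured like the snippet above, with a per-row preference field; the file path is a placeholder and ties are ignored):

    import json

    # Placeholder path; point this at the annotations.json written by alpaca_eval.
    with open("annotations.json") as f:
        annotations = json.load(f)

    for ann in annotations:
        raw = ann["raw_completion"]
        # 1 -> output_1 preferred, 2 -> output_2 preferred (ties not handled here).
        preference = int(ann["preference"])

        # The rank-1 model is the preferred one, so it maps to the output that
        # `preference` points at; the rank-2 model maps to the other output.
        by_rank = {m["rank"]: m["model"] for m in raw["ordered_models"]}
        mapping = {
            by_rank[1]: f"output_{preference}",
            by_rank[2]: f"output_{3 - preference}",
        }
        print(mapping)  # e.g. {'M': 'output_2', 'm': 'output_1'}

This recovers the same mapping that the referenced_models field will spell out directly.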

@YannDubs
Collaborator

YannDubs commented Feb 1, 2024

Done @xwinxu. To add referenced_models you need to (1) update alpaca_eval and (2) rerun the parsing using the flag --is_reapply_parsing True. Note that this won't rerun the actual OAI annotations, which are cached.
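For example, assuming alpaca_eval was installed via pip, the re-run could look something like this (the output paths reuse the placeholders from the original question):

    pip install -U alpaca_eval
    alpaca_eval --model_outputs my_model_A.json \
        --reference_outputs my_model_B.json \
        --is_reapply_parsing True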

@xwinxu
Contributor Author

xwinxu commented Feb 1, 2024

Thanks Yann!
