Closed
Labels: enhancement (New feature or request), good first issue (Good for newcomers)
Description
In the quiz-and-judge section, specifically the judge step, I've noticed that the yes/no probabilities are not normalized.
For example, suppose the trainee model predicts Token("yes", p=0.1) while the ground truth is "no". The code then assumes the model's probability for "no" is p("no") = 1 - 0.1 = 0.9 and computes loss = -log(0.9). This yields a small loss, as if the classifier had assigned a high probability (0.9) to the correct class, yet the model's actual verdict ("yes") was incorrect. The probability mass missing from p("yes") may belong to tokens other than "no", so 1 - p("yes") overestimates p("no").
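A minimal sketch of the behavior described above. The probability values other than p("yes") = 0.1 are hypothetical, chosen only to illustrate how normalizing over the model's actual {"yes", "no"} probabilities would change the loss:

```python
import math

# Reported behavior: the trainee model emits "yes" with p = 0.1
# over the full vocabulary; the ground truth is "no".
p_yes = 0.1

# Current (reported) assumption: p("no") = 1 - p("yes").
p_no_assumed = 1.0 - p_yes
loss_assumed = -math.log(p_no_assumed)  # -log(0.9), a small loss

# Hypothetical alternative: use the model's raw probability for "no"
# (say 0.05, with the remaining 0.85 spread over unrelated tokens)
# and normalize only over the two classes of interest.
p_no_raw = 0.05  # hypothetical value, not from the issue
p_no_norm = p_no_raw / (p_yes + p_no_raw)
loss_norm = -math.log(p_no_norm)  # -log(1/3), a much larger loss

print(round(loss_assumed, 3), round(loss_norm, 3))
```

Under this (assumed) normalization the loss would better reflect that the verdict was wrong, rather than crediting probability mass the model never placed on "no".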
Could you please comment on this behavior?
tpoisonooo