Metrics support #55

Merged
merged 10 commits into from
Sep 6, 2023


AleHD
Collaborator

@AleHD AleHD commented Aug 31, 2023

Added support for calculating custom metrics during evaluation (e.g. accuracy).

  • Added the command-line argument --metrics. When set, it should be followed by one or more names of metric functions. The metrics supported for now are perplexity, accuracy, count_loss_mask (the number of nonzero elements per sample in the loss_mask), count_instruct_mask (the same count, but excluding the extra tokens that surround the messages in the instruction tuning setting, i.e. <|im_begin|> and <|im_end|>), and instruct_accuracy.
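To illustrate how a flag like --metrics might map names to metric functions, here is a minimal sketch in plain Python. It is not the code from this PR: the registry `METRICS`, the helpers `parse_args` and `evaluate_sample`, and the per-sample function signature are all assumptions made for illustration, and the instruct variants are left out because they depend on the instruction-tuning masks.

```python
import argparse
import math

# Hypothetical signature: each metric receives, for ONE sample, the per-token
# losses, predicted token ids, label ids, and a 0/1 loss mask, and returns a float.

def perplexity(losses, preds, labels, loss_mask):
    """exp of the mean per-token loss over the unmasked positions."""
    n = sum(loss_mask)
    if n == 0:
        return float("nan")
    total = sum(loss * m for loss, m in zip(losses, loss_mask))
    return math.exp(total / n)

def accuracy(losses, preds, labels, loss_mask):
    """Fraction of unmasked positions where the prediction equals the label."""
    n = sum(loss_mask)
    if n == 0:
        return float("nan")
    correct = sum(m for p, y, m in zip(preds, labels, loss_mask) if p == y)
    return correct / n

def count_loss_mask(losses, preds, labels, loss_mask):
    """Number of nonzero elements in the sample's loss mask."""
    return float(sum(1 for m in loss_mask if m != 0))

# Assumed registry: metric name on the command line -> metric function.
METRICS = {
    "perplexity": perplexity,
    "accuracy": accuracy,
    "count_loss_mask": count_loss_mask,
}

def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # nargs="+" means the flag, when given, needs at least one metric name.
    parser.add_argument("--metrics", nargs="+", default=[],
                        choices=sorted(METRICS))
    return parser.parse_args(argv)

def evaluate_sample(metric_names, losses, preds, labels, loss_mask):
    """Compute every requested metric for a single sample."""
    return {name: METRICS[name](losses, preds, labels, loss_mask)
            for name in metric_names}

if __name__ == "__main__":
    args = parse_args(["--metrics", "accuracy", "count_loss_mask"])
    print(evaluate_sample(args.metrics,
                          losses=[0.5, 0.1, 2.0, 0.3],
                          preds=[1, 2, 3, 4],
                          labels=[1, 2, 0, 4],
                          loss_mask=[1, 1, 1, 0]))
```

Keeping the metrics in a name-to-function table means that adding a metric only requires registering one more function, and argparse rejects unknown names before evaluation starts.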

Thanks to @andreaskoepf and his fork for serving as inspiration for this implementation.

AleHD and others added 4 commits August 29, 2023 01:30
Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
Co-authored-by: Andreas Koepf <andreas.koepf@provisio.com>
@AleHD AleHD marked this pull request as draft August 31, 2023 22:41
@AleHD AleHD mentioned this pull request Sep 2, 2023
6 tasks
@AleHD
Collaborator Author

AleHD commented Sep 2, 2023

Waiting on the instruction tuning PR. Once #40 merges with main successfully, merging this branch will be easier.

@AleHD AleHD mentioned this pull request Sep 2, 2023
@AleHD AleHD marked this pull request as ready for review September 4, 2023 19:31
@AleHD AleHD merged commit a8feb5b into main Sep 6, 2023

2 participants