Skip to content

Conversation

@nathan-weinberg
Copy link
Member

This PR introduces the Evaluator parent class and the MMLUEvaluator and MTBenchEvaluator child classes

It also introduces the EvalError parent exception class for all eval custom exceptions and the ModelNotFoundError child exception class

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Copy link
Member

@alinaryan alinaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM - are PR-MMLU and PR-Bench going to live in the MMLU and MT-Bench child classes respectively or is the idea to add them in later as their own child classes?

@nathan-weinberg
Copy link
Member Author

This LGTM - are PR-MMLU and PR-Bench going to live in the MMLU and MT-Bench child classes respectively or is the idea to add them in later as their own child classes?

Good question! I think we could go with either approach - either extending the current child classes or adding new child classes but in the same files.

Could you do an explicit approval? I've set the repo so two are required for merge. @alimaredia same for you.

@danmcp and @xukai92 feel free to weigh-in as well - this foundation will define subsequent work.

Copy link
Member

@alinaryan alinaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I lean towards new child classes since the inputs/dependencies differ across PR benchmarks. You could add them in this PR or it can be handled in a follow-up PR by anyone.
Thanks for working on this!!

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
@danmcp
Copy link
Member

danmcp commented Jun 14, 2024

This LGTM - are PR-MMLU and PR-Bench going to live in the MMLU and MT-Bench child classes respectively or is the idea to add them in later as their own child classes?
@danmcp and @xukai92 feel free to weigh-in as well - this foundation will define subsequent work.

It would be good to understand the signatures before deciding. But I'm pretty sure they should be new classes. The other thing to consider whether the modeling makes more sense as a is-a or has-a relationship. Has-a might give some useful flexibility.

@nathan-weinberg
Copy link
Member Author

This LGTM - are PR-MMLU and PR-Bench going to live in the MMLU and MT-Bench child classes respectively or is the idea to add them in later as their own child classes?
@danmcp and @xukai92 feel free to weigh-in as well - this foundation will define subsequent work.

It would be good to understand the signatures before deciding. But I'm pretty sure they should be new classes. The other thing to consider whether the modeling makes more sense as a is-a or has-a relationship. Has-a might give some useful flexibility.

I pushed up two additional child classes based on the specs in the doc - lmk your thoughts!

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
@nathan-weinberg nathan-weinberg merged commit 6632002 into instructlab:main Jun 17, 2024
@nathan-weinberg nathan-weinberg deleted the skeleton branch June 17, 2024 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants