Refactoring of Evaluation and adding of evaluate command #264
Conversation
fast_llm/engine/training/config.py
```python
    hint=FieldHint.feature,
    valid=skip_valid_if_none(check_field(Assert.gt, 0)),
)
class TrainingEvaluatorConfig(EvaluatorConfigBase):
```
It should still inherit from IntervalConfig so it's simpler and we don't have the redundant run_interval.
I prefer composition over multiple inheritance here, as IntervalConfig is a property of the evaluator wrapper (which TrainingEvaluatorConfig is), and not of the new entity derived from EvaluatorConfigBase. In my opinion, this makes the code much more readable.
However, multiple inheritance is already used in many places. So if you still prefer that approach after my explanation, I’m happy to refactor accordingly.
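To make the two options under discussion concrete, here is a minimal sketch. The class and field names follow the thread, but the bodies are simplified stand-ins for the actual Fast-LLM config classes, not the real implementation.

```python
# Hypothetical sketch of the two designs discussed above; bodies are
# simplified stand-ins, not the actual Fast-LLM config classes.

class Config:
    pass

class IntervalConfig(Config):
    def __init__(self, interval: int = 0):
        self.interval = interval

class EvaluatorConfigBase(Config):
    def get_evaluator(self):
        raise NotImplementedError

# Option A: composition -- the interval is a field on the wrapper.
class TrainingEvaluatorConfigComposed(EvaluatorConfigBase):
    def __init__(self, run_interval: IntervalConfig):
        self.run_interval = run_interval

# Option B: multiple inheritance -- the wrapper *is* an IntervalConfig,
# so the separate run_interval field disappears.
class TrainingEvaluatorConfigInherited(IntervalConfig, EvaluatorConfigBase):
    pass

composed = TrainingEvaluatorConfigComposed(IntervalConfig(interval=100))
inherited = TrainingEvaluatorConfigInherited(interval=100)
print(composed.run_interval.interval, inherited.interval)  # -> 100 100
```

Composition keeps the interval clearly scoped to the wrapper; inheritance removes one level of nesting in the resulting config at the cost of a wider MRO.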
Multiple inheritance is only needed because of the EvaluatorConfigBase mixin, which is arguably not even necessary. I'd rather prioritize better usage (configs) over marginally simpler code.
OK, I will use multiple inheritance but will retain EvaluatorConfigBase so I don't need to rewrite EvaluatorRunner. Then we will create a separate config for the evaluate command, in addition to it accepting the training config. We need to discuss what it should look like, so I’ve created an issue for it: #285.
```python
    else 0
)

def get_evaluator(
```
Seems unnecessary, we can just call `evaluator.get_evaluator()`.
I explained this in detail in #222 (comment) and #222 (comment), but in short, it is to maintain proper encapsulation.
There is no encapsulation needed though. TrainingEvaluatorConfig is a fixed class with an `evaluator: EvaluatorConfig` field, which is dynamic but has a well-defined get_evaluator method.
Encapsulation would be needed if we allowed for a more generic scenario where evaluator.get_evaluator doesn't exist or has a different signature, e.g. if we allowed for a more generic evaluator or a generalized TrainingEvaluatorConfig that doesn't have an evaluator. I don't really see this happening anytime soon...
Another thing I can't really work around here is that I need to return TrainingEvaluator and not evaluator.get_evaluator(). This is because TrainingEvaluator is responsible for handling whether evaluators should or should not run during training. Neither the concrete evaluators nor the EvaluatorRunner are aware of this—they simply execute.
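A minimal sketch of the gating responsibility described above; the names follow the thread, but the logic is a simplified stand-in for the actual Fast-LLM implementation:

```python
# Hypothetical sketch: only the wrapper knows the training schedule;
# the wrapped evaluator is unaware of it and simply executes.
from typing import Optional

class Evaluator:
    """A concrete evaluator: it just runs when asked."""
    def run(self, iteration: int) -> str:
        return f"evaluated at iteration {iteration}"

class TrainingEvaluator:
    """Wraps an evaluator and decides *whether* it should run now."""
    def __init__(self, evaluator: Evaluator, run_interval: int):
        self.evaluator = evaluator
        self.run_interval = run_interval

    def run(self, iteration: int) -> Optional[str]:
        # Gate on the training schedule; skip if not on the interval.
        if self.run_interval and iteration % self.run_interval == 0:
            return self.evaluator.run(iteration)
        return None

gate = TrainingEvaluator(Evaluator(), run_interval=100)
print(gate.run(100))  # runs
print(gate.run(101))  # skipped -> None
```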
```python
@config_class()
class EvaluatorConfigBase(Config):
```
I don't think that's necessary
I explained this in detail in #222 (comment) and #222 (comment), but in short, it is to maintain proper encapsulation.
I will move it in #282, so we can close this faster.
```python
# @pytest.mark.extra_slow
@requires_cuda
def test_loss_validation_vs_inference(model_and_tokenizer):
```
I don't think this test is worth it; it's kind of trivial.
✨ Description
Creates an Evaluator abstraction so additional evaluators beyond Loss can be added.
Adds an `evaluate` command that accepts the same training config and enables evaluation on the last checkpoint.
Includes some fixes.
Example: specifying multiple LossEvaluators
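For illustration, a hypothetical config fragment with two LossEvaluators. The exact key names are assumptions for this sketch, not taken from the Fast-LLM schema:

```yaml
# Hypothetical sketch only -- key names are assumptions.
training:
  evaluators:
    validation:
      interval: 100
      evaluator:
        type: loss
        iterations: 25
    stack_3b:
      interval: 100
      evaluator:
        type: loss
        iterations: 25
```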
🔍 Type of change
Select all that apply:
📝 Changes
✅ Checklist
Make sure the following tasks are completed before submitting the PR:
General
Testing