Add extensible tests to transformers #663
# Inexact, but at least make sure not too much was produced
assert len(lm["answer"]) < 8, f"Output: {lm['answer']}"
I like the idea, but model quality checks like this (asserting against unconstrained generations) are a bit dangerous: you'd be surprised how often a model with minor updates on Hugging Face spits out an answer that, e.g., isn't a number. I'd either add a regex constraint to the gen and check that the number produced can be cast to an integer, or just check that the model generated something at all.
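A rough sketch of the "can be cast to an integer" check suggested here, in plain Python. The helper name `check_integer_answer` and its `max_len` parameter are hypothetical, and the string passed in stands in for `lm["answer"]`; the regex constraint on the generation itself is not shown.

```python
import re


def check_integer_answer(answer: str, max_len: int = 8) -> int:
    """Hypothetical helper: validate a generated answer and return it as an int.

    Raises AssertionError if the answer is not digits-only or is too long.
    """
    # Digits only, so that int() below cannot fail on stray text.
    if not re.fullmatch(r"\d+", answer):
        raise AssertionError(f"Expected digits only, got: {answer!r}")
    # Same inexact length bound as the assertion under review.
    if len(answer) >= max_len:
        raise AssertionError(f"Output too long: {answer!r}")
    return int(answer)
```

In a test, the call would replace the bare length assertion, for example `check_integer_answer(lm["answer"])`.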
Shouldn't len() always return an integer?
assert lm["answer"] in ["p", "t", "w"]
glad to see a forced grammar test here!
My background idea is to have a smoke test for each of the ways a generation (be that gen(), select(), etc.) can be invoked.
Add basic smoke tests to transformers which are parameterised with the name of a Hugging Face model. This is to prevent things like #609 reoccurring.
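As an illustration only, the shape of a smoke test parameterised by model name might look like the sketch below. It uses the standard library's `unittest` with `subTest` rather than whatever test framework the PR itself uses, the model names are placeholders, and `fake_generate` is a hypothetical stand-in for loading a model and running a generation.

```python
import unittest

# Placeholder names; a real suite would supply actual Hugging Face model names.
MODEL_NAMES = ["model-a", "model-b"]


def fake_generate(model_name: str, prompt: str) -> str:
    """Hypothetical stand-in for loading a model and generating text."""
    return f"[{model_name}] reply to {prompt!r}"


class SmokeTests(unittest.TestCase):
    def test_generates_something(self):
        # One sub-test per model, so a failure reports which model broke.
        for name in MODEL_NAMES:
            with self.subTest(model=name):
                output = fake_generate(name, "Count to three:")
                # Smoke check only: the model produced some output at all.
                self.assertGreater(len(output), 0)


def run_smoke_tests() -> unittest.TestResult:
    """Run the smoke tests programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SmokeTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The assertion deliberately checks only that something was generated, in line with the review comment about avoiding quality checks on unconstrained output.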