Remove kwargs and add explicit runinference_args #21806
Conversation
Can one of the admins verify this patch?
Looks good overall, just some minor points
      examples, self._model, extra_kwargs)
else:
  result_generator = self._model_handler.run_inference(
      examples, self._model)
Can't we just always pass extra_kwargs through?
Yeah, I originally had it this way because I didn't replace **kwargs with extra_kwargs in base or sklearn, only in pytorch. Changed.
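The rename under discussion can be sketched as a before/after of the handler signature (illustrative toy classes, not the actual Beam code):

```python
from typing import Any, Dict, Iterable, Optional, Sequence


# Before: an open-ended **kwargs made it unclear which extra
# arguments a given framework's handler actually accepts.
class ModelHandlerBefore:
  def run_inference(self, batch: Sequence[Any], model: Any,
                    **kwargs) -> Iterable[Any]:
    return [model(example, **kwargs) for example in batch]


# After: a single explicit, optional dict of extra arguments,
# making the parameter visible in the signature.
class ModelHandlerAfter:
  def run_inference(self, batch: Sequence[Any], model: Any,
                    extra_kwargs: Optional[Dict[str, Any]] = None
                   ) -> Iterable[Any]:
    extra_kwargs = extra_kwargs or {}
    return [model(example, **extra_kwargs) for example in batch]
```

Callers that need framework-specific parameters (e.g. pytorch) pass the dict; everyone else can omit it.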
PTAL @TheNeuralBit
Codecov Report
@@ Coverage Diff @@
## master #21806 +/- ##
==========================================
- Coverage 74.06% 74.01% -0.06%
==========================================
Files 698 699 +1
Lines 92600 92675 +75
==========================================
+ Hits 68585 68592 +7
- Misses 22760 22828 +68
Partials 1255 1255
Run PythonDocs PreCommit
It seems the actual error in PythonDocs is hidden by all the dataframe cruft, sorry about that. I filed #21845 to fix that.
  keys, unkeyed_batch = zip(*batch)
  return zip(
-     keys, self._unkeyed.run_inference(unkeyed_batch, model, **kwargs))
+     keys, self._unkeyed.run_inference(unkeyed_batch, model, extra_kwargs))
These arguments should be passed through if they exist and not otherwise.
why?
I'm worried about growing the interface in a lot of places to handle arguments that are only applicable to a single framework.
If an argument shouldn't be supported by a framework, ideally it should fail with an error.
But the fact is the interface has grown: extra_kwargs is an argument on ModelHandler.run_inference.
I don't mean to slow things down here, I'll defer to what you all want to do. But I just want to play devil's advocate a bit: If it's a part of the interface it should be a part of the interface. Making it into a dynamic, optional parameter just makes the carve out worse (IMO).
Done.
Sorry, I didn't refresh this page before posting, so I didn't see the new comments. I'm leaning towards having extra_kwargs b/c it's part of the interface, even if it isn't used in sklearn.
Caught up with @ryanthompson591. We're going to stick with passing (renamed) inference_args for all frameworks, but for sklearn, raise an exception if a non-empty value is passed.
      self,
      batch: Sequence[numpy.ndarray],
      model: BaseEstimator,
      extra_kwargs: Optional[Dict[str,
No. Let's not put these extra args into sklearn. The model doesn't want them, so they don't need to be there.
They would just be extra unused parameters, and they don't need to be in tensorflow implementations either.
If anything, the only thing this PR should do is remove **kwargs from this interface.
If someone puts extra args into this interface that it isn't expecting, it should fail instead of working silently.
> If someone puts extra args into this interface that it isn't expecting it should fail instead of work silently.

A ModelHandler that doesn't support extra_kwargs could raise an error. We don't need to rely on Python argument matching to raise the error.
Good point. Removed.
        model=model,
        prediction_params=prediction_params)
    for actual, expected in zip(predictions, KWARGS_TORCH_PREDICTIONS):
        batch=KEYED_TORCH_EXAMPLES, model=model, extra_kwargs=extra_kwargs)
I think we should keep these args as anonymous
Can you clarify what you mean by anonymous args?
OK, I've been thinking about this:

I was playing around with ways to make arguments specific to run_inference, and I think there are only three ways: either what you have done, anonymous args, or an if statement:

if inference_args:
  model_handler.run_inference(model, batch, inference_args)
else:
  model_handler.run_inference(model, batch)

I'm not sure what I prefer now that I'm looking at it.
The if statement has the advantage of allowing clients that don't expect this argument to fail or pass without modifications.
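For comparison, the three call-site shapes being weighed can be put side by side (a sketch with a toy handler, not the Beam internals):

```python
class ToyHandler:
  # Toy stand-in for a ModelHandler; not the actual Beam class.
  def run_inference(self, batch, model, inference_args=None):
    inference_args = inference_args or {}
    return [model(x, **inference_args) for x in batch]


def call_explicit(handler, batch, model, inference_args=None):
  # Option 1 (what the PR does): always pass the explicit,
  # possibly-None dict through.
  return handler.run_inference(batch, model, inference_args)


def call_anonymous(handler, batch, model, **kwargs):
  # Option 2: collect anonymous **kwargs and forward them.
  return handler.run_inference(batch, model, kwargs)


def call_conditional(handler, batch, model, inference_args=None):
  # Option 3: only pass the dict when non-empty, so a handler
  # whose signature lacks the parameter raises TypeError only
  # when the user actually supplied extra args.
  if inference_args:
    return handler.run_inference(batch, model, inference_args)
  return handler.run_inference(batch, model)
```

Option 1 keeps the interface uniform; option 3 lets unmodified handlers keep working until someone passes extra args.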
Run Python PreCommit
      model: BaseEstimator,
      inference_args: Optional[Dict[str, Any]] = None
  ) -> Iterable[PredictionResult]:
    _validate_inference_args(inference_args)
(not urgent). This validation will happen at pipeline execution, right? Can it be done at construction time to reduce the feedback loop?
Yes. I've filed this issue #21894 as something to do later.
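One possible shape for moving the check to construction time (purely a hypothetical sketch; the capability flag and class names here are invented, and the eventual Beam change may look different):

```python
from typing import Any, Dict, Optional


class SklearnLikeHandler:
  # Hypothetical capability flag; not part of the actual Beam API.
  SUPPORTS_INFERENCE_ARGS = False


class RunInferenceLike:
  """Toy transform that validates inference_args when constructed."""

  def __init__(self, handler,
               inference_args: Optional[Dict[str, Any]] = None):
    # Failing here surfaces the mistake at pipeline construction
    # time instead of at execution time on a worker, shortening
    # the feedback loop.
    if inference_args and not getattr(
        handler, 'SUPPORTS_INFERENCE_ARGS', True):
      raise ValueError(
          '%s does not support inference_args' % type(handler).__name__)
    self._handler = handler
    self._inference_args = inference_args
```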
Co-authored-by: tvalentyn <tvalentyn@users.noreply.github.com>
Removing **kwargs from RunInference and adding an explicit extra_runinference_args argument. Only frameworks, like PyTorch, that need the extra params will use it. Other frameworks won't need to pass in these extra_runinference_args args.