
Remove kwargs and add explicit runinference_args #21806

Merged · 14 commits merged into apache:master on Jun 15, 2022

Conversation

@yeandy (Contributor) commented Jun 10, 2022

Removes **kwargs from RunInference and adds an explicit extra_runinference_args argument. Only frameworks, like PyTorch, that need the extra parameters will use it; other frameworks won't need to pass these extra args at all.
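A minimal sketch of the before/after shape of the handler interface (simplified from the real Beam ModelHandler; the argument started as extra_runinference_args and was renamed inference_args later in review, so the final name is used here):

```python
from typing import Any, Dict, Iterable, Optional, Sequence


class ModelHandler:
    """Sketch of the changed interface (illustrative, not the full Beam API)."""

    # Before this PR, extra framework parameters arrived as opaque **kwargs:
    #   def run_inference(self, batch, model, **kwargs): ...

    # After, they travel in one explicit, optional dictionary:
    def run_inference(
        self,
        batch: Sequence[Any],
        model: Any,
        inference_args: Optional[Dict[str, Any]] = None,
    ) -> Iterable[Any]:
        raise NotImplementedError
```

Frameworks that need per-call parameters (e.g. PyTorch forward-pass keyword arguments) read them from the dictionary; others can ignore or reject it.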


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

See the Contributor Guide for more tips on how to make the review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md


@yeandy (Contributor, Author) commented Jun 10, 2022

@asf-ci

asf-ci commented Jun 10, 2022

Can one of the admins verify this patch?

3 similar comments

@TheNeuralBit (Member) left a comment

Looks good overall, just some minor points

sdks/python/apache_beam/ml/inference/base.py (outdated thread, resolved)
examples, self._model, extra_kwargs)
else:
result_generator = self._model_handler.run_inference(
examples, self._model)
Member

Can't we just always pass extra_kwargs through?

Contributor Author

Yeah, I originally had it this way because I didn't replace **kwargs with extra_kwargs in base or sklearn, only in pytorch. Changed.

@yeandy commented Jun 13, 2022

PTAL @TheNeuralBit

@codecov bot commented Jun 13, 2022

Codecov Report

Merging #21806 (fe84987) into master (8c8c431) will decrease coverage by 0.05%.
The diff coverage is 72.22%.

@@            Coverage Diff             @@
##           master   #21806      +/-   ##
==========================================
- Coverage   74.06%   74.01%   -0.06%     
==========================================
  Files         698      699       +1     
  Lines       92600    92675      +75     
==========================================
+ Hits        68585    68592       +7     
- Misses      22760    22828      +68     
  Partials     1255     1255              
Flag Coverage Δ
python 83.65% <72.22%> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...thon/apache_beam/ml/inference/pytorch_inference.py 0.00% <0.00%> (ø)
sdks/python/apache_beam/ml/inference/base.py 95.29% <100.00%> (-0.54%) ⬇️
...thon/apache_beam/ml/inference/sklearn_inference.py 95.23% <100.00%> (+0.69%) ⬆️
...eam/runners/portability/fn_api_runner/execution.py 92.44% <0.00%> (-0.65%) ⬇️
...hon/apache_beam/runners/worker/bundle_processor.py 93.17% <0.00%> (-0.38%) ⬇️
...am/examples/inference/pytorch_language_modeling.py 0.00% <0.00%> (ø)
sdks/python/apache_beam/runners/common.py 88.92% <0.00%> (+0.24%) ⬆️
sdks/python/apache_beam/transforms/combiners.py 93.43% <0.00%> (+0.38%) ⬆️
sdks/python/apache_beam/utils/interactive_utils.py 97.56% <0.00%> (+2.43%) ⬆️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8c8c431...fe84987.

@TheNeuralBit

Run PythonDocs PreCommit

@TheNeuralBit (Member) commented Jun 14, 2022

It seems the actual error in PythonDocs is hidden by all the dataframe cruft, sorry about that. I filed #21845 to fix that.

15:21:58 /home/jenkins/jenkins-slave/workspace/beam_PreCommit_PythonDocs_Phrase/src/sdks/python/test-suites/tox/pycommon/build/srcs/sdks/python/target/.tox-py38-docs/py38-docs/lib/python3.8/site-packages/apache_beam/ml/inference/base.py:docstring of apache_beam.ml.inference.base.RunInference:5: WARNING: Unexpected indentation.

  keys, unkeyed_batch = zip(*batch)
  return zip(
-     keys, self._unkeyed.run_inference(unkeyed_batch, model, **kwargs))
+     keys, self._unkeyed.run_inference(unkeyed_batch, model, extra_kwargs))
Contributor

These arguments should be passed through if they exist and not otherwise.

Member

why?

Contributor

I'm worried about growing the interface in a lot of places to handle arguments that are only applicable to a single framework.

If an argument shouldn't be in a framework, ideally it should fail and give an error.

Member

But the fact is the interface has grown, extra_kwargs is an argument on ModelHandler.run_inference.

Member

I don't mean to slow things down here, I'll defer to what you all want to do. But I just want to play devil's advocate a bit: If it's a part of the interface it should be a part of the interface. Making it into a dynamic, optional parameter just makes the carve out worse (IMO).

Contributor Author

Done.

Contributor Author

Sorry, I didn't refresh this page before posting so I didn't see the new comments. I'm leaning towards having extra_kwargs b/c it's part of the interface, even if it isn't used in sklearn.

Contributor Author

Caught up with @ryanthompson591. We're going to stick with passing (renamed) inference_args for all frameworks, but for sklearn, raise an exception if a non-empty value is passed.

self,
batch: Sequence[numpy.ndarray],
model: BaseEstimator,
extra_kwargs: Optional[Dict[str,
Contributor

No. Let's not put these extra args into sklearn. The model doesn't want them, so they don't need to be there.

These would be extra unused parameters; they don't need to be here, and they don't need to be in tensorflow implementations either.

Don't put them here. If anything, the only thing this PR should do is remove **kwargs from this interface.

If someone puts extra args into this interface that it isn't expecting, it should fail instead of working silently.

Member

If someone puts extra args into this interface that it isn't expecting it should fail instead of work silently.

A ModelHandler that doesn't support extra_kwargs could raise an error. We don't need to rely on Python argument matching to raise the error.

Contributor Author

Good point. Removed.

model=model,
prediction_params=prediction_params)
for actual, expected in zip(predictions, KWARGS_TORCH_PREDICTIONS):
batch=KEYED_TORCH_EXAMPLES, model=model, extra_kwargs=extra_kwargs)
Contributor

I think we should keep these args as anonymous

Contributor Author

Can you clarify what you mean by anonymous args?

Contributor

ok I've been thinking about this:

I was playing around with ways to make arguments specific to run_inference, and I think there are only three: what you have done, anonymous args, or an if statement:

if inference_args:
model_handler.run_inference(model, batch, inference_args)
else:
model_handler.run_inference(model, batch)

I'm not sure what I prefer now that I'm looking at it.

The if statement has the advantage of allowing clients that don't expect this argument to fail or pass without modifications.
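That advantage can be seen with a handler written before the new parameter existed (a toy sketch; LegacyHandler and dispatch are illustrative names, not Beam APIs):

```python
from typing import Any, Sequence


class LegacyHandler:
    # Written before the change: run_inference has no inference_args parameter.
    def run_inference(self, batch: Sequence[int], model: Any):
        return [x * 2 for x in batch]


def dispatch(handler, batch, model, inference_args=None):
    # Forward the extra args only when present, so legacy handlers keep
    # working, and fail loudly if args are supplied to one anyway.
    if inference_args:
        return handler.run_inference(batch, model, inference_args)
    return handler.run_inference(batch, model)


print(dispatch(LegacyHandler(), [1, 2], model=None))  # -> [2, 4]
```

Supplying inference_args to LegacyHandler raises a TypeError from Python's own argument matching, which is exactly the "fail instead of work silently" behavior discussed above.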

sdks/python/apache_beam/ml/inference/base.py (outdated thread, resolved)
@TheNeuralBit

Run Python PreCommit

@tvalentyn tvalentyn merged commit 2de4c2c into apache:master Jun 15, 2022
model: BaseEstimator,
inference_args: Optional[Dict[str, Any]] = None
) -> Iterable[PredictionResult]:
_validate_inference_args(inference_args)
Contributor

(not urgent). This validation will happen at pipeline execution, right? Can it be done at construction time to reduce the feedback loop?
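One way to move the check earlier, sketched with a hypothetical handler (the method name and wiring are illustrative; the idea is that the transform would call it while the pipeline graph is built, not on workers):

```python
from typing import Any, Dict, Optional


class SklearnStyleHandler:
    """Hypothetical handler that rejects inference_args up front."""

    def validate_inference_args(
            self, inference_args: Optional[Dict[str, Any]]) -> None:
        # Invoked at pipeline construction, so a bad argument surfaces
        # immediately instead of at execution time on a worker.
        if inference_args:
            raise ValueError('this framework does not accept inference_args.')
```

This shortens the feedback loop: the user sees the error when building the pipeline rather than minutes later when a remote worker runs the first bundle.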

Contributor Author

Yes. I've filed this issue #21894 as something to do later.

bullet03 pushed a commit to akvelon/beam that referenced this pull request Jun 20, 2022
Co-authored-by: tvalentyn <tvalentyn@users.noreply.github.com>
prodriguezdefino pushed a commit to prodriguezdefino/beam-pabs that referenced this pull request Jun 21, 2022
Co-authored-by: tvalentyn <tvalentyn@users.noreply.github.com>
5 participants