
[BEAM-10427] Benchmark runtime typechecking for the Python SDK #12242

Conversation

saavan-google (Contributor) commented Jul 13, 2020

This PR adds a load test for evaluating the performance of the runtime typechecking system in the Python SDK. It works by comparing the performance of a pipeline with runtime typechecking enabled against the same pipeline with it disabled. The load test also lets the user specify whether the pipeline should run with simple typehints (e.g. str) or complex, nested typehints (e.g. Tuple[str, int, Iterable[bool]]).
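For context, a minimal stdlib sketch (not Beam's actual typechecking code; the check functions below are hypothetical) of why a nested hint like Tuple[str, int, Iterable[bool]] is more expensive to verify at runtime than a simple str hint:

```python
import time

def check_simple(value):
    # Simple hint (str): a single isinstance check.
    return isinstance(value, str)

def check_nested(value):
    # Nested hint (Tuple[str, int, Iterable[bool]]): check the container,
    # then every component, including each element of the Iterable.
    if not (isinstance(value, tuple) and len(value) == 3):
        return False
    s, n, flags = value
    return (isinstance(s, str)
            and isinstance(n, int)
            and all(isinstance(b, bool) for b in flags))

element = ("key", 42, [True, False, True])

start = time.perf_counter()
for _ in range(100_000):
    check_simple(element[0])
simple_t = time.perf_counter() - start

start = time.perf_counter()
for _ in range(100_000):
    check_nested(element)
nested_t = time.perf_counter() - start

print(f"simple: {simple_t:.4f}s  nested: {nested_t:.4f}s")
```

The nested check does strictly more work per element, which is the performance difference the load test's simple/nested switch is designed to surface.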

This PR is ready for review.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make the review process smoother.

Post-Commit Tests Status (on master branch): badge table for the Go, Java, Python, and XLang SDKs across the Dataflow, Flink, Samza, Spark, and Twister2 runners (build-status badges omitted).

Pre-Commit Tests Status (on master branch): badge table for the Java, Python, Go, and Website jobs, non-portable and portable (build-status badges omitted).

See .test-infra/jenkins/README for the trigger phrase, status, and link of every Jenkins job.

saavan-google force-pushed the BEAM-10427-benchmark-runtime-typechecking branch from 1cc5ca7 to 91c3afd on July 29, 2020 at 18:07
saavan-google (Contributor, Author) commented:

R: @udim
R: @robertwb

udim (Member) commented Aug 1, 2020:

Please update the PR description to state whether it's ready for review.

saavan-google (Contributor, Author) commented:

R: @udim
R: @robertwb

PTAL - this is ready for review

robertwb (Contributor) left a review comment:

Sorry I didn't have a chance to look at this sooner.

Since this is a purely local change, rather than adding all this infrastructure I would suggest creating a variant of (or even adding an option to)

https://github.com/apache/beam/blob/master/sdks/python/apache_beam/tools/map_fn_microbenchmark.py

that enables runtime type checking. This will give much cleaner numbers and a much faster turnaround time. (It would probably also be valuable to run this before and after your changes with typechecking turned off, to make sure you're not adding overhead in that case too.)
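In that spirit, a hedged stdlib sketch of what such a micro-benchmark could look like (the type_checked wrapper below is a hypothetical stand-in for the SDK's runtime type-check decorator, not its actual API):

```python
import timeit

def fn(x):
    return x * 2

def type_checked(f, in_type, out_type):
    # Hypothetical stand-in for a runtime type-check wrapper: validate the
    # input and output of every call against the declared types.
    def wrapper(x):
        assert isinstance(x, in_type), f"bad input: {x!r}"
        result = f(x)
        assert isinstance(result, out_type), f"bad output: {result!r}"
        return result
    return wrapper

checked_fn = type_checked(fn, int, int)

n = 200_000
plain = timeit.timeit(lambda: fn(7), number=n)
checked = timeit.timeit(lambda: checked_fn(7), number=n)
print(f"plain: {plain:.4f}s  checked: {checked:.4f}s  "
      f"overhead: {checked / plain:.2f}x")
```

Because everything runs in-process with no runner or I/O in the loop, run-to-run noise is far lower than in a full pipeline, which is the point of the micro-benchmark suggestion.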

udim (Member) commented Aug 6, 2020:

> Since this is a purely local change, I would suggest rather than adding all this infrastructure simply creating a variant of (or even adding an option to) https://github.com/apache/beam/blob/master/sdks/python/apache_beam/tools/map_fn_microbenchmark.py that enables runtime type checking. This will give much cleaner numbers and a much faster turnaround time.

My goal in suggesting this framework was to gather historical numbers, and to test for performance regressions in future PRs.

robertwb (Contributor) left a review comment:

What do the numbers look like now? I'm still concerned a full integration test may be rather noisy to capture this data well and a micro-benchmark would be better suited for this.


```python
class RunTimeTypeCheckOffTest(BaseRunTimeTypeCheckTest):
  def __init__(self):
    self.runtime_type_check = False
```
robertwb (Contributor) commented on this diff:

Rather than passing this as an attribute of self, can't we pass it as an additional flag to the test? That would let us get rid of much, if not all, of this duplication.
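A sketch of the flag-based approach, assuming a hypothetical --runtime_type_check flag parsed with parse_known_args before the test object is constructed (the flag name and class here are illustrative, not the test's actual CLI):

```python
import argparse

# Parse our one extra flag first; everything argparse doesn't recognize is
# returned untouched and can be handed on to the pipeline/load-test options.
parser = argparse.ArgumentParser()
parser.add_argument('--runtime_type_check', action='store_true')
known, pipeline_args = parser.parse_known_args(
    ['--runtime_type_check', '--other_load_test_arg=1'])

class RunTimeTypeCheckTest:
    def __init__(self, runtime_type_check):
        # One class parameterized by a boolean, instead of an On/Off
        # subclass pair that differ only in this attribute.
        self.runtime_type_check = runtime_type_check

test = RunTimeTypeCheckTest(known.runtime_type_check)
print(test.runtime_type_check)
```

With one parameterized class, the On/Off duplication collapses into a single flag value.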

saavan-google (Contributor, Author) replied:

We looked into this, but the CLI flags are only parsed after the LoadTest has been initialized, and by that point it's too late to modify the pipeline options, because that happens in the LoadTest constructor. Or can I still modify the options?

It happens here

```groovy
]
],
[
title : 'Runtime Type Checking Python Load Test: On | Nested Type Hints',
```
robertwb (Contributor) commented on this diff:

Load-test capacity is a limited resource; could you test both in the same pipeline? That should be sufficient for the purpose of noticing regressions.

saavan-google (Contributor, Author) replied:

Yeah sure

I think the original goal of having nested type hints and simple type hints as separate tests was to see whether there was any performance difference between the two when runtime type checking was on. That would help us narrow down whether the performance drop comes from the overhead of the decorator or from the actual type check itself, and it also lets us test for regressions separately.

I can merge them if it's okay with @udim
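A minimal sketch of how that decomposition could be measured (the wrappers below are hypothetical, not the SDK's decorator): timing a no-op wrapper isolates the call-frame overhead the decorator adds, while the checking wrapper adds the per-element check on top of it.

```python
import timeit

def fn(x):
    return len(x)

def noop_wrapper(f):
    # Isolates decorator overhead: one extra call frame, no type check.
    def wrapper(x):
        return f(x)
    return wrapper

def checking_wrapper(f):
    # Decorator overhead plus an actual nested check on every element.
    def wrapper(x):
        assert isinstance(x, tuple) and all(isinstance(i, int) for i in x)
        return f(x)
    return wrapper

element = (1, 2, 3)
wrapped = noop_wrapper(fn)
checked = checking_wrapper(fn)

n = 100_000
base = timeit.timeit(lambda: fn(element), number=n)
frame = timeit.timeit(lambda: wrapped(element), number=n)
check = timeit.timeit(lambda: checked(element), number=n)
print(f"base: {base:.4f}s  +frame: {frame:.4f}s  +check: {check:.4f}s")
```

Comparing frame against base attributes cost to the wrapper itself; comparing check against frame attributes the remainder to the type check.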

saavan-google (Contributor, Author) commented:

Closing this PR because we've collectively reached a consensus that micro-benchmarking will provide better precision and consume fewer resources than load testing. This PR can serve as a historical marker in case we decide to add a load test in the future to check for regressions. The micro-benchmark will therefore be added in a different PR.
