[BEAM-3042] Adding time tracking of batch side inputs [low priority]#5309
pabloem merged 4 commits into apache:master
|
Well, it seems that this code is going to disappear. Oh well..... |
|
The code is not going away after all. Reopening. |
|
Run Python PostCommit |
|
r: @aaltay I understand that @boyuanzz has done extensive research on diagnosing the performance impact of new code changes. |
|
Change LGTM. The PR description refers to a flag, but I do not see the flag in the code. What are you referring to? |
|
Gating on a new flag. I've added a commit for that. How does it all look? |
|
Flag pushed. |
|
With the 2.5.0 branch cut, I'd like to merge this soon. LMK if things look good. |
|
ping |
|
Was there any change after my LGTM on May 11? If not, feel free to self-merge when ready.
This PR improves Cython tags for some classes, and uses them to track the time spent reading side inputs.
It also adds a state sampler thread to make the side input microbenchmark a more realistic representation of the worker environment (as there may be contention due to the state switching).
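For readers unfamiliar with the idea, here is a minimal sketch of sampling-based time tracking, assuming a toy design rather than the actual Beam worker code. `SimpleStateSampler`, `scoped_state`, `read_side_input`, and the `enabled` parameter are all hypothetical names invented for illustration; the real state sampler and flag in this PR may look quite different.

```python
import threading
import time
from collections import defaultdict
from contextlib import contextmanager


class SimpleStateSampler:
    """Toy sampler: a background thread charges elapsed time to the active state."""

    def __init__(self, period_s=0.005):
        self._period_s = period_s
        self._current = "idle"
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self.time_by_state = defaultdict(float)
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        last = time.monotonic()
        # Wake up every period and attribute the elapsed slice to the
        # state that is active at sampling time.
        while not self._stop.wait(self._period_s):
            now = time.monotonic()
            with self._lock:
                self.time_by_state[self._current] += now - last
            last = now

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    @contextmanager
    def scoped_state(self, name):
        # Switching state takes the same lock as the sampler thread, which
        # is the kind of contention a benchmark may want to include.
        with self._lock:
            previous, self._current = self._current, name
        try:
            yield
        finally:
            with self._lock:
                self._current = previous


def read_side_input(sampler, values, enabled=True):
    """Hypothetical side input read, tracked only when `enabled` is set."""
    if not enabled:
        return list(values)
    with sampler.scoped_state("read-side-input"):
        return list(values)
```

With `enabled=False` the read skips the state switch entirely, which mirrors the idea of gating the extra tracking cost behind a flag. Running a benchmark loop with the sampler thread started is what makes the state-switching overhead visible.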
NOTE: This PR should add flag versioning before merging in any case.
Current performance with 500 runs:
With change and flag deactivated:
With change and flag activated:
This represents a really small regression in a microbenchmark that specifically exercises this feature.