Skip to content

[BEAM-3042] Adding time tracking of batch side inputs [low priority]#5309

Merged
pabloem merged 4 commits intoapache:masterfrom
pabloem:sideinput-msecs
Jun 15, 2018
Merged

[BEAM-3042] Adding time tracking of batch side inputs [low priority]#5309
pabloem merged 4 commits intoapache:masterfrom
pabloem:sideinput-msecs

Conversation

@pabloem
Copy link
Member

@pabloem pabloem commented May 8, 2018

This PR improves Cython tags for some classes, and uses them for tracking of time spent reading side inputs.

It also adds a state sampler thread to make the side input microbenchmark a more realistic representation of the worker environment (as there may be contention due to the state switching).

NOTE: This PR should add flag versioning before merging in any case.

Current performance with 500 runs:

  • Average runtime: 0.604471780777
  • Time per element: 3.77794862986e-05
  • Regression: 0% (it's the baseline)

With change and flag deactivated:

  • Average runtime: 0.614184889793
  • Time per element: 3.83865556121e-05
  • Regression: 1.6%

With change and flag activated:

  • Average runtime: 0.611651549339
  • Time per element: 3.82282218337e-05
  • Regression: 1.187%

This represents a really small regression in a microbenchmark that specifically exercises this feature.

@pabloem pabloem closed this May 8, 2018
@pabloem pabloem reopened this May 8, 2018
@pabloem pabloem closed this May 8, 2018
@pabloem pabloem reopened this May 8, 2018
@pabloem pabloem force-pushed the sideinput-msecs branch from 15f35b1 to fa7ea73 Compare May 8, 2018 23:59
@pabloem
Copy link
Member Author

pabloem commented May 9, 2018

Well, it seems that this code is going to disappear. Oh well.....

@pabloem pabloem closed this May 9, 2018
@pabloem
Copy link
Member Author

pabloem commented May 9, 2018

The code is not going away after all. Reopening.

@pabloem pabloem reopened this May 9, 2018
@pabloem
Copy link
Member Author

pabloem commented May 9, 2018

Run Python PostCommit

@pabloem
Copy link
Member Author

pabloem commented May 10, 2018

r: @aaltay
Preempting a question that you may have: I believe the reason why this change is faster compared to the previous attempt is the improved cython annotations in the state sampler, and in the counter classes.

I understand that @boyuanzz has made extensive research on perf diagnosing of a new code change.
What do you think is a good path for validating whether the perf footprint is small enough be accepted in? (beyond validating with the microbenchmark)

@pabloem pabloem changed the title [BEAM-3042] Adding time tracking of batch side inputs [BEAM-3042] Adding time tracking of batch side inputs [low priority] May 10, 2018
@aaltay
Copy link
Member

aaltay commented May 11, 2018

Change LGTM. The PR description refers to a flag but I do not see the flag in the code, what are you referring to ?

@pabloem pabloem force-pushed the sideinput-msecs branch from 779d786 to f4fdb4e Compare May 14, 2018 17:16
@pabloem
Copy link
Member Author

pabloem commented May 14, 2018

Gating on a new flag. I've added a commit for that. How does it all look?

@pabloem pabloem force-pushed the sideinput-msecs branch from f4fdb4e to b62be14 Compare May 17, 2018 00:13
@pabloem
Copy link
Member Author

pabloem commented May 18, 2018

Flag pushed.

@pabloem pabloem force-pushed the sideinput-msecs branch from 5e1975c to 187092a Compare June 4, 2018 16:09
@pabloem pabloem force-pushed the sideinput-msecs branch from 187092a to bb46ab6 Compare June 4, 2018 17:06
@pabloem
Copy link
Member Author

pabloem commented Jun 11, 2018

With 2.5.0 branch cut, I'd like to merge this soon. LMK if things look good

@pabloem
Copy link
Member Author

pabloem commented Jun 14, 2018

ping

@aaltay
Copy link
Member

aaltay commented Jun 14, 2018

Was there any change after my LGTM on May 11? If not feel free to self merge when ready.

@pabloem pabloem merged commit 9cdbbe0 into apache:master Jun 15, 2018
@pabloem pabloem deleted the sideinput-msecs branch June 15, 2018 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants