[pantsd] Repair end to end runtracker timing for pantsd runs. #5526

Merged
kwlzn merged 4 commits into pantsbuild:master from twitter:kwlzn/v2_timing on Feb 28, 2018

Conversation

kwlzn (Member) commented on Feb 27, 2018:

Fixes #5522

stuhood (Member) left a comment

Thanks Kris.

"""Creates a synthetic workunit with a pre-computed timing.
- name: A short name for this work. E.g., 'resolve', 'compile', 'scala', 'zinc'.
- timing: The pre-compted timing for this workunit.


stuhood (Member) commented on Feb 27, 2018:

Typo: "pre-compted" should be "pre-computed".

@@ -53,6 +54,7 @@ def _setup_pailgun(self):
    """Sets up a PailgunServer instance."""
    # Constructs and returns a runnable PantsRunner.
    def runner_factory(sock, arguments, environment):
      start_time = time.time()


stuhood (Member) commented on Feb 27, 2018:

Would be good to add a comment indicating how you chose this location for start_time: basically, it's as close to the connection arriving off the socket as we can get?

nit: Could use a small class Timer(object) helper here? Maybe overkill, but.
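
For reference, a minimal sketch of the kind of Timer helper being suggested; the class body and the idea of handing it to runner_factory are illustrative assumptions, not code from this PR:

```python
import time


class Timer(object):
  """Tiny helper that captures a start time and reports elapsed seconds.

  Sketch only: the PR as posted uses a bare `start_time = time.time()` instead.
  """

  def __init__(self):
    # Captured as close as possible to the moment the connection arrives off the socket.
    self.start_time = time.time()

  @property
  def elapsed(self):
    """Seconds elapsed since construction, as a float."""
    return time.time() - self.start_time
```

runner_factory could then construct a Timer as soon as it is invoked and pass it (or just timer.start_time) down to the post-fork runner.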

@@ -119,6 +119,7 @@ def __init__(self, run_info_dir, parent, name, labels=None, cmd='', log_config=N
    # In seconds since the epoch. Doubles, to account for fractional seconds.
    self.start_time = 0
    self.end_time = 0


stuhood (Member) commented on Feb 27, 2018:

Because these properties are public, it would be good to defend against people accessing them directly rather than consuming def duration. Perhaps you could convert them into @propertys that fail if _synthetic_duration is set? Or convert to @propertys and then deprecate the properties?
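
One hedged sketch of what that guard could look like (the class and attribute names other than _synthetic_duration and duration are assumptions for illustration, not the actual WorkUnit implementation):

```python
class GuardedWorkUnit(object):
  """Sketch: refuse direct start/end access when a synthetic duration was supplied."""

  def __init__(self, synthetic_duration=None):
    # In seconds since the epoch. Doubles, to account for fractional seconds.
    self._start_time = 0
    self._end_time = 0
    self._synthetic_duration = synthetic_duration

  @property
  def start_time(self):
    if self._synthetic_duration is not None:
      raise AttributeError('Synthetic workunits have no start_time; use duration() instead.')
    return self._start_time

  @property
  def end_time(self):
    if self._synthetic_duration is not None:
      raise AttributeError('Synthetic workunits have no end_time; use duration() instead.')
    return self._end_time

  def duration(self):
    """Returns the pre-computed duration if set, otherwise end_time - start_time."""
    if self._synthetic_duration is not None:
      return self._synthetic_duration
    return self._end_time - self._start_time
```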

@@ -75,6 +79,7 @@ def _run(self):
    repro.capture(run_tracker.run_info.get_as_dict())

    # Record the preceding product graph size.
    run_tracker.new_pre_timed_workunit('graph_warmth', self._graph_warmth_timing)


stuhood (Member) commented on Feb 27, 2018:

Will this render under main? I think it would need to in order to actually be counted toward the cumulative timings for main. Should be able to check easily in ./pants server --open.


kwlzn (Member) commented on Feb 28, 2018:

Apparently not: the self_timings show up implicitly as main:graph_warmth, but tests reveal that the cumulative_timings for main < self_timings for main:graph_warmth.

stuhood (Member) commented on Feb 27, 2018:

Actually, sorry to do this now, but I have another potential approach here that I didn't think of until after posting.

How would you feel about having a native "startup" workunit created by RunTracker.start, and giving def start another parameter that is something like "actual start time"? Basically, I think the set of things that cannot be tracked via the contextmanager-based workunit approach with RunTracker is pretty small, and might only be this pre-start work... so the synthetic workunit might be too general.

Anyway, I don't feel strongly about it... but I think it is a minor change, and the upside would be that we would have a guarantee that all work between start = time.time() and the beginning of the RunTracker's main workunit is covered. As it stands, this PR might miss the fork timing.
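
As a rough illustration of that alternative (the class, method, and parameter names below are assumptions, not the code that eventually landed):

```python
import time


class RunTrackerSketch(object):
  """Illustrative only: start() accepts the earliest-known start time and folds the
  pre-start gap into a native 'startup' timing entry."""

  def __init__(self):
    self.timings = {}

  def start(self, run_start_time=None):
    tracker_start = time.time()
    if run_start_time is not None:
      # Everything between the true start (e.g. the moment the request came off the
      # pailgun socket) and tracker creation is attributed to a 'startup' workunit,
      # so cumulative timings cover the full end-to-end run, including fork time.
      self.timings['startup'] = tracker_start - run_start_time
```

The caller would capture run_start_time as early as possible (in runner_factory with pantsd, or at process start without it) and pass it straight to start().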

kwlzn (Member) commented on Feb 27, 2018:

> How would you feel about having a native "startup" workunit created by RunTracker.start, and giving def start another parameter that is something like "actual start time"?

Makes sense to me, but I see them as two different things we'd want to time, i.e. I think we should probably do both?

stuhood (Member) commented on Feb 27, 2018:

> Makes sense to me, but I see them as two different things we'd want to time, i.e. I think we should probably do both?

Maybe... I think that the other way to look at it would be that we want to create the RunTracker as early as possible... if we think that we will have multiple relevant workunits running before the RunTracker is started, then that might be justified.

Perhaps a better lens to look at this through is: what will we want to do when a lot more work is happening in the daemon pre-fork? ... thinking about it from that angle, it would almost seem like we want to create the RunTracker itself pre-fork...?

kwlzn (Member) commented on Feb 27, 2018:

> thinking about it from that angle, it would almost seem like we want to create the RunTracker itself pre-fork...?

Totally makes sense down the road - I was thinking we'd probably want to port a "v2" RunTracker over for that as part of the Python pipeline porting. But right now I'd love to avoid initializing any kind of v1 Subsystem pre-fork if we can help it, so it depends how much we want to lean in on this particular issue right now, I guess. My take is that we just capture some timings and pass those into a post-fork RunTracker for now and revisit the bigger reporting port later.

stuhood (Member) commented on Feb 27, 2018:

> Totally makes sense down the road - I was thinking we'd probably want to port a "v2" RunTracker over for that as part of the Python pipeline porting.

I agree that it seems unlikely that we'll be able to use the RunTracker completely as is to track the number of units we're expecting. But I want to hedge our bets and make sure that both evolving and replacing remain open options until we're ready to dive in there.

> But right now I'd love to avoid initializing any kind of v1 Subsystem pre-fork if we can help it.

Agreed. If you want to push forward with an approach that assumes multiple synthetic workunits created pre-fork and then recorded post-fork, that would be fine with me.

kwlzn force-pushed the twitter:kwlzn/v2_timing branch from bdb64a4 to 5748152 on Feb 28, 2018

kwlzn changed the title from "[pantsd] Add graph warmth timing to runtracker for pantsd runs." to "[pantsd] Repair end to end timing to runtracker for pantsd runs." on Feb 28, 2018

kwlzn requested a review from stuhood on Feb 28, 2018

kwlzn (Member) commented on Feb 28, 2018:

@stuhood revised the approach - PTAL.

stuhood (Member) left a comment

I like it.

@@ -41,6 +41,10 @@ def run(self):
    with hard_exit_handler(), maybe_profiled(profile_path):
      self._run()

  def _maybe_get_client_start_time_from_env(self):
    client_start_time = self._env.pop('PANTSD_RUNTRACKER_CLIENT_START_TIME', None)


stuhood (Member) commented on Feb 28, 2018:

Neat.
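
For context, a hedged sketch of the env-var handoff this diff relies on; the setter helper and both function bodies below are illustrative assumptions, and only the PANTSD_RUNTRACKER_CLIENT_START_TIME key and the env.pop come from the diff above:

```python
import time


def set_client_start_time_in_env(env, start_time):
  # Pre-fork/client side (illustrative): stash the earliest-known start time in the
  # environment that will be forwarded to the pantsd-spawned run.
  env['PANTSD_RUNTRACKER_CLIENT_START_TIME'] = repr(start_time)


def maybe_get_client_start_time_from_env(env):
  # Post-fork side: pop the value so it does not leak any further, then convert it
  # back to a float; returns None if the client never set it.
  client_start_time = env.pop('PANTSD_RUNTRACKER_CLIENT_START_TIME', None)
  return float(client_start_time) if client_start_time is not None else None


if __name__ == '__main__':
  env = {}
  set_client_start_time_in_env(env, time.time())
  print(maybe_get_client_start_time_from_env(env))  # the recovered float start time
  print(maybe_get_client_start_time_from_env(env))  # None: the value was already popped
```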

kwlzn changed the title from "[pantsd] Repair end to end timing to runtracker for pantsd runs." to "[pantsd] Repair end to end runtracker timing for pantsd runs." on Feb 28, 2018

stuhood (Member) commented on Feb 28, 2018:

Oh, sorry again: one more thing before merging. I'm guessing that we're failing to capture some time even when running without pantsd.

Can we use this same start codepath to record a start time as early as possible both with and without pantsd? Basically, make the new argument required.

kwlzn (Member) commented on Feb 28, 2018:

> Can we use this same start codepath to record a start time as early as possible both with and without pantsd?

I assumed this was just a few seconds at most, but added it - how's that look?

stuhood (Member) left a comment

Thanks!

kwlzn merged commit 1cee589 into pantsbuild:master on Feb 28, 2018

1 check passed: continuous-integration/travis-ci/pr (The Travis CI build passed)

stuhood added a commit that referenced this pull request on Mar 1, 2018
