Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pantsd] Repair end to end runtracker timing for pantsd runs. #5526

Merged
merged 4 commits into from Feb 28, 2018

Conversation

kwlzn
Copy link
Member

@kwlzn kwlzn commented Feb 27, 2018

Fixes #5522

Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Kris.

"""Creates a synthetic workunit with a pre-computed timing.

- name: A short name for this work. E.g., 'resolve', 'compile', 'scala', 'zinc'.
- timing: The pre-compted timing for this workunit.
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

computed

@@ -53,6 +54,7 @@ def _setup_pailgun(self):
"""Sets up a PailgunServer instance."""
# Constructs and returns a runnable PantsRunner.
def runner_factory(sock, arguments, environment):
start_time = time.time()
Copy link
Sponsor Member

@stuhood stuhood Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to add a comment indicating how you chose this location for start_time: basically, it's as close to the connection arriving off the socket as we can get?

nit: Could use

class Timer(object):
? Maybe overkill, but.

@@ -119,6 +119,7 @@ def __init__(self, run_info_dir, parent, name, labels=None, cmd='', log_config=N
# In seconds since the epoch. Doubles, to account for fractional seconds.
self.start_time = 0
self.end_time = 0
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because these properties are public, it would be good to defend against people accessing them directly rather than consuming def duration. Perhaps you could convert them into @propertys that fail if _synthetic_duration is set? Or convert to @propertys and then deprecate the properties?

@@ -75,6 +79,7 @@ def _run(self):
repro.capture(run_tracker.run_info.get_as_dict())

# Record the preceding product graph size.
run_tracker.new_pre_timed_workunit('graph_warmth', self._graph_warmth_timing)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this render under main? I think it would need to to actually be counted toward cumulative timings for main. Should be able to check easily in ./pants server --open.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apparently not: the self_timings show up implicitly as main:graph_warmth, but tests reveal that the cumulative_timings for main < self_timings for main:graph_warmth..

@stuhood
Copy link
Sponsor Member

stuhood commented Feb 27, 2018

Actually, sorry to do this now, but I have another potential approach here that I didn't think of until after posting.

How would you feel about having a native "startup" workunit created by RunTracker.start, and giving def start another parameter that is something like "actual start time". Basically, I think that things that cannot be tracked via the contextmanager-based workunit approach with RunTracker is pretty small, and might only be this pre-start work... so the synthetic workunit might be too general.

Anyway, don't feel strongly about it... but i think it is a minor change, and the upside would be that we would have a guarantee that all work between start = time.time() and the beginning of the RunTracker's main workunit is covered. As it stands, this PR might miss the fork timing.

@kwlzn
Copy link
Member Author

kwlzn commented Feb 27, 2018

How would you feel about having a native "startup" workunit created by RunTracker.start, and giving def start another parameter that is something like "actual start time".

makes sense to me, but I see them as two different things we'd want to time. i.e. I think we should probably do both?

@stuhood
Copy link
Sponsor Member

stuhood commented Feb 27, 2018

makes sense to me, but I see them as two different things we'd want to time. i.e. I think we should probably do both?

Maybe... I think that the other way to look at it would be that we want to create the RunTracker as early as possible... if we think that we will have multiple relevant workunits running before the RunTracker is started, then that might be justified.

Perhaps a better lens to look at this through is: what will we want to do when a lot more work is happening in the daemon pre-fork? ... thinking about it from that angle, it would almost seem like we want to create the RunTracker itself pre-fork...?

@kwlzn
Copy link
Member Author

kwlzn commented Feb 27, 2018

thinking about it from that angle, it would almost seem like we want to create the RunTracker itself pre-fork...?

totally makes sense down the road - I was thinking we'd probably want to port a "v2" RunTracker over for that as part of the Python pipeline porting. but right now I'd love to avoid initializing any kind of v1 Subsystem pre-fork if we can help it. so it depends how much we want to lean in on this particular issue right now I guess. my take is that we just capture some timings and pass those into a post-fork RunTracker for now and revisit the bigger reporting port later.

@stuhood
Copy link
Sponsor Member

stuhood commented Feb 27, 2018

totally makes sense down the road - I was thinking we'd probably want to port a "v2" RunTracker over for that as part of the Python pipeline porting.

I agree that it seems unlikely that we'll be able to use the RunTracker completely as is to track the number of units we're expecting. But I want to hedge our bets and make sure that both evolving and replacing remain open options until we're ready to dive in there.

but right now I'd love to avoid initializing any kind of v1 Subsystem pre-fork if we can help it.

Agreed. If you want to push forward with an approach that assumes muiltiple synthetic workunits created pre-fork and then recorded post-fork, that would be fine with me.

@kwlzn kwlzn changed the title [pantsd] Add graph warmth timing to runtracker for pantsd runs. [pantsd] Repair end to end timing to runtracker for pantsd runs. Feb 28, 2018
@kwlzn kwlzn requested a review from stuhood February 28, 2018 02:17
@kwlzn
Copy link
Member Author

kwlzn commented Feb 28, 2018

@stuhood revised the approach - PTAL.

Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it.

@@ -41,6 +41,10 @@ def run(self):
with hard_exit_handler(), maybe_profiled(profile_path):
self._run()

def _maybe_get_client_start_time_from_env(self):
client_start_time = self._env.pop('PANTSD_RUNTRACKER_CLIENT_START_TIME', None)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat.

@kwlzn kwlzn changed the title [pantsd] Repair end to end timing to runtracker for pantsd runs. [pantsd] Repair end to end runtracker timing for pantsd runs. Feb 28, 2018
@stuhood
Copy link
Sponsor Member

stuhood commented Feb 28, 2018

Oh, sorry again: one more thing before merging. I'm guessing that we're failing to capture some time even when running without pantsd:

Can we use this same start codepath to record a start time as early as possible both with and without pantsd? Basically, make the new argument required.

@kwlzn
Copy link
Member Author

kwlzn commented Feb 28, 2018

Can we use this same start codepath to record a start time as early as possible both with and without pants?

I assumed this was just a few seconds at most, but added - how's that look?

Copy link
Sponsor Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@kwlzn kwlzn merged commit 1cee589 into pantsbuild:master Feb 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants