Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a Hierarchical timer class #96

Merged
merged 15 commits into from
Apr 6, 2020

Conversation

michaelbynum
Copy link
Contributor

Summary/Motivation:

This Pr adds a class for hierarchical timing.

Legal Acknowledgement

By contributing to this software project, I agree to the following terms and conditions for my contribution:

  1. I agree my contributions are submitted under the BSD license.
  2. I represent I am authorized to make the contributions and grant the license. If my employer has rights to intellectual property that includes these contributions, I represent that I have received permission to make contributions and grant the required license on behalf of that employer.

@coveralls
Copy link

coveralls commented Apr 3, 2020

Coverage Status

Coverage increased (+0.8%) to 61.847% when pulling 2feaa83 on michaelbynum:hierarchical_timer into ad6043c on PyUtilib:master.

Copy link
Contributor

@jsiirola jsiirola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am eager for this functionality, but have concerns that it duplicates much of the functionality in the TicTocTimer but in a different way. Would you consider refactoring this to have the _HierarchicalHelper use TicTocTimers? There are some features in the TicTocTimer that provide more robustness (notably, higher precision timers on Windows).

parent.timers[identifier] = _HierarchicalHelper()
return parent.timers[identifier]

def start_increment(self, identifier):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TicTocTimer uses start() and stop(). Can we make the two APIs consistent?

s += name_formatter.format(name=name)
s += '{0:>15.2e}'.format(timer.total_time)
s += '{0:>15d}'.format(timer.n_calls)
s += '{0:>15.2e}\n'.format(timer.total_time/timer.n_calls)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can n_calls ever be 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, n_calls can never be 0. The timer would just not exist in that case.

stack = identifier.split('.')
timer = self._get_timer_from_stack(stack)
parent = self._get_timer_from_stack(stack[:-1])
return timer.total_time / parent.total_time * 100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For timing very short times on machines with low precision, total_time can be 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed this in the printing methods. However, I think a division by 0 error is appropriate here. Other methods are available for users to easily get the numerator and denominator individually. Although, given what I just said, it might make sense to remove the percent methods. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think that either get_relative_percent should return 0 when total_time is 0 so that clients that are doing the equivalent of '%d' % (t.get_relative_percent) will not get exceptions. It is numerically incorrect, but doesn't meaningfully change the interpretation of the results.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another edge case: if you call get_relative_percent from the top of the stack, you should get 100.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. That should now be fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree - I think returning 0 from get_relative_percent_time when parent.total_time is 0 does change the interpretation of the results. I would rather the user get an exception. However, I am happy to remove this method completely and let users do what they want manually (with calls to get_total_time).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we could return math.nan, which is accurate and does not raise an exception. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelbynum Great idea! Just make it float('nan') (for compatibility).

@codecov-io
Copy link

codecov-io commented Apr 4, 2020

Codecov Report

Merging #96 into master will increase coverage by 0.79%.
The diff coverage is 93.75%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #96      +/-   ##
==========================================
+ Coverage   63.13%   63.93%   +0.79%     
==========================================
  Files          87       87              
  Lines        8788     8916     +128     
==========================================
+ Hits         5548     5700     +152     
+ Misses       3240     3216      -24     
Impacted Files Coverage Δ
pyutilib/misc/timing.py 76.76% <93.75%> (+76.76%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ad6043c...913df89. Read the comment docs.

@michaelbynum
Copy link
Contributor Author

@jsiirola Great suggestions. I never realized the TicTocTimer tracked cumulative time and number of calls. I also did not know about time.perf_counter or time.clock, so thanks. Let me know what you think of the new changes.

ab 4.84e-02 10 4.84e-03 42.0%
other 8.00e-04 N/A N/A 0.7%
b 1.39e-02 10 1.39e-03 10.8%
other 3.49e-04 N/A N/A 0.3%
Copy link
Contributor

@jsiirola jsiirola Apr 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you consider changing this format to more closely match the results from the Python profiler? For reference, the profiler generates:

Function                                             called...
                                                         ncalls  tottime  cumtime
{posix.waitpid}                                      ->
sre_compile.py:228(_compile_charset)                 ->     122    0.026    0.065  sre_compile.py:256(_optimize_charset)
                                                            502    0.000    0.000  {method 'append' of 'list' objects}
                                                             20    0.000    0.000  {method 'extend' of 'list' objects}

I am thinking about a format similar to:

Identifier                               ncalls  tottime  percall      %
------------------------------------------------------------------------
all                                           1    0.129    0.129  100.0
    --------------------------------------------------------------------
    a                                        10    0.115    0.115   89.0
        ----------------------------------------------------------------
        aa                                   50    0.066    0.001   57.3
        ab                                   10    0.048    0.005   42.0
        other                               n/a    0.001      n/a    0.7
        ================================================================
    b                                        10    0.014    0.014   10.8
    other                                   n/a    0.000      n/a    0.3
    ====================================================================
========================================================================

...I am happy to draft the patch if you want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this output way more than what I had. I honestly couldn't come up with anything I really liked, so I just gave up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can take a stab at it right now.

@michaelbynum
Copy link
Contributor Author

@jsiirola I think this is ready now. Thanks for all the great suggestions. I am very happy with this timer now.

@jsiirola
Copy link
Contributor

jsiirola commented Apr 6, 2020

@michaelbynum: some questions/comments as I go through this:

  • Why scientific notation for timing? profile uses '%.3f', and I think that is a good precedent: timings under a millisecond are meaningless, and I (at least) get lost tracking exponents
  • I would like to propose the table be in '%8d %8.3f %8.3f %5.1f': this makes things more compact, yet ensures in the odd chance that timings run over 10000 seconds that there is still a space between numbers.
  • (This is a nit): why use .format()? It is verbose, harder to parse, and generally 2x slower that classic formatting (using '%'). We can move to f-strings when we finally drop 2.x and 3.5 (f-strings are faster than '%' and more concise, but not available until 3.6)
  • Is there a reason you did not have _HierarchicalHelper directly inherit from TicTocTimer? If you did then you would only have to add the timers dict and the printing function.
  • Similarly, I feel like the HierarchicalTimer should itself be a _HierarchicalHelper object, that just overwrites the start / stop methods to take an optional argument. Maybe everything could be implemented in a single class? That way you could get reports from anywhere in the hierarchy.
  • Related to the above, I feel like the info methods (like get_total_time and get relative_precent_time could be calculated locally within the Helper. This actually motivated the previous comment, as I was working through the complexity of the printer (the logic is basically duplicated in both the HierarchicalTimer and in the _HierarchicalHelper
  • You should not directly access the _cumul attribute, as it will return the wrong answer if the timer is currently running. Always access it with toc(msg='')

I have some ideas around addressing the above. Would you like me to draft something as a PR to your branch?

@michaelbynum
Copy link
Contributor Author

@jsiirola

  • I agreed with most of your formatting comments, so I made some updates. I used 12 and 10 rather than 8 and 5 (8 was to short for the titles).
  • I disagree about format - I think it is clear, not verbose. It is my personal preference. Also, I would argue that the performance of printing a HierarchicalTimer is not terribly important.
  • You seem to be favoring inheritance over containment. In this situation, I do not really see the benefit of one over the other, but I am not opposed to inheritance if you want to make some changes an create a PR.
  • The only problem with putting get_relative_percent_time in the helper object is that the helper object does not currently have a pointer to its parent. Again, I am fine with these changes if you want to make a PR.
  • I fixed the issue with accessing _cumul.

@michaelbynum
Copy link
Contributor Author

@jsiirola Looks good. Thanks for these changes.

@carldlaird
Copy link
Contributor

I agree. Thank you both for this - this looks great - and will definitely get used in a few of our projects.

Copy link
Contributor

@jsiirola jsiirola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine to merge. Remaining comments have been archived as #97.

@jsiirola jsiirola changed the title Hierarchical timer Add a Hierarchical timer class Apr 6, 2020
@jsiirola jsiirola merged commit d74cd96 into PyUtilib:master Apr 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants