python timers #2180

chriselion · 2019-06-24T23:27:04Z

Simple nested timers for profiling long-running jobs. To use, just import hierarchical_timer and use it as a context manager:

with hierarchical_timer("my_stuff"):
    do_stuff()

or

@timed  # will generate a "my_func" timer
def my_func(x, y):
    return x + y

class Foo:
    @timed  # will generate a "Foo.bar" timer
    def bar(self):
        ...

Some follow-ups to do after the initial PR:

See if we can capture what the existing TrainerMetrics is doing, and delete the old code there
Track timings in the subprocess workers, and send stats back to the "main" process and merge them there.

CLAassistant · 2019-06-24T23:27:12Z

All committers have signed the CLA.

chriselion · 2019-07-09T23:52:17Z

ml-agents-envs/mlagents/envs/timers.py

+    Represents the time spent in a block of code.
+    """
+
+    __slots__ = ["children", "total", "count"]


Possibly a premature optimization, but this helps reduce the memory footprint and time overhead to the overall system.

Wow I didn't know this exists. This will be super useful once we use the profiling in a ton of places.
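
For readers who have not used `__slots__` before, the following sketch shows its observable effect. The `PlainNode` and `SlottedNode` classes are made up for the comparison; only the `__slots__` list matches the diff above.

```python
class PlainNode:
    """Ordinary class: every instance carries its own __dict__."""

    def __init__(self) -> None:
        self.children = {}
        self.total = 0.0
        self.count = 0


class SlottedNode:
    """Same attributes, but stored in fixed slots instead of a __dict__."""

    __slots__ = ["children", "total", "count"]

    def __init__(self) -> None:
        self.children = {}
        self.total = 0.0
        self.count = 0


plain = PlainNode()
slotted = SlottedNode()

plain_has_dict = hasattr(plain, "__dict__")
slotted_has_dict = hasattr(slotted, "__dict__")

# A slotted instance rejects attributes that are not listed in __slots__.
try:
    slotted.typo = 1
    rejected = False
except AttributeError:
    rejected = True
```

Dropping the per-instance `__dict__` is where the memory saving comes from; the trade-off is that attributes cannot be added dynamically.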

chriselion · 2019-07-10T00:25:00Z

ml-agents/mlagents/trainers/trainer_controller.py

@@ -118,6 +120,11 @@ def _write_training_metrics(self):
            if brain_name in self.trainer_metrics:
                self.trainers[brain_name].write_training_metrics()

+    def _write_timing_tree(self) -> None:
+        timing_path = f"{self.summaries_dir}/{self.run_id}_timers.json"


I'm open to suggestions on where to put this/what to call it. But this will put it in the same path as the current TrainerMetrics csv files.
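
As a rough illustration of the path convention in the diff above, the sketch below writes a timing tree to `{summaries_dir}/{run_id}_timers.json`. The `write_timing_tree` function, its parameters, and the sample tree are assumptions; the PR's method lives on the trainer controller and reads these values from `self`.

```python
import json
import os
import tempfile
from typing import Any, Dict


def write_timing_tree(tree: Dict[str, Any], summaries_dir: str, run_id: str) -> str:
    # Hypothetical helper that mirrors the "<summaries_dir>/<run_id>_timers.json" path.
    os.makedirs(summaries_dir, exist_ok=True)
    timing_path = os.path.join(summaries_dir, f"{run_id}_timers.json")
    with open(timing_path, "w") as f:
        json.dump(tree, f, indent=2)
    return timing_path


# Write to a temporary directory and read the file back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = write_timing_tree({"name": "root", "total": 0.0, "count": 0}, tmp_dir, "demo")
    with open(path) as f:
        loaded = json.load(f)
```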

chriselion · 2019-07-10T16:54:14Z

ml-agents/mlagents/trainers/trainer_controller.py

@@ -333,6 +343,7 @@ def advance(self, env: SubprocessEnvManager) -> int:
                trainer.increment_step(len(new_step_infos))
                if trainer.is_ready_update():
                    # Perform gradient descent with experience buffer
-                    trainer.update_policy()
+                    with hierarchical_timer("update_policy"):


The decorator is more concise and doesn't need the whitespace change, but this is probably better for parts of the code where we're making calls to an abstract class.

chriselion · 2019-07-10T16:54:42Z

ml-agents-envs/mlagents/envs/timers.py

+
+
+@contextmanager
+def hierarchical_timer(name: str, timer_stack: TimerStack = None) -> Generator:


Should this just be called "timer" or is that too generic?

I think this is good, better to be precise to help people understand.

chriselion · 2019-07-10T16:54:54Z

ml-agents-envs/mlagents/envs/timers.py

@@ -0,0 +1,172 @@
+# # Unity ML-Agents Toolkit


Better suggestions on where this should live?

This seems good to me for the short term.

chriselion · 2019-07-10T17:01:09Z

Based on the current PR, here's what the timer output for hallway looks like today:

{
  "total": 0.0,
  "count": 0,
  "self": 0.0,
  "children": [
    {
      "name": "TrainerController.advance",
      "total": 32.731460376,
      "count": 957,
      "self": 1.4661682279999795,
      "children": [
        {
          "name": "env_step",
          "total": 14.123654692000024,
          "count": 957,
          "self": 0.1792060960000832,
          "children": [
            {
              "name": "SubprocessEnvManager._take_step",
              "total": 1.2891531169999557,
              "count": 957,
              "self": 0.05311957899995745,
              "children": [
                {
                  "name": "PPOPolicy.evaluate",
                  "total": 1.2360335379999983,
                  "count": 957,
                  "self": 1.2360335379999983
                }
              ]
            },
            {
              "name": "recv",
              "total": 12.655295478999985,
              "count": 957,
              "self": 12.655295478999985
            }
          ]
        },
        {
          "name": "update_policy",
          "total": 17.141637455999998,
          "count": 14,
          "self": 0.13363037500003472,
          "children": [
            {
              "name": "PPOPolicy.update",
              "total": 17.008007080999963,
              "count": 678,
              "self": 17.008007080999963
            }
          ]
        }
      ]
    }
  ]
}
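
The numbers in this output are consistent with "self" being a node's total minus the totals of its direct children. The sketch below checks that for the "env_step" node; `self_time` is a hypothetical helper and the dictionary copies only the fields it needs.

```python
from typing import Any, Dict


def self_time(node: Dict[str, Any]) -> float:
    # Time spent in the node itself, excluding its direct children.
    children = node.get("children", [])
    return node["total"] - sum(child["total"] for child in children)


# Values copied from the "env_step" node in the output above.
env_step = {
    "name": "env_step",
    "total": 14.123654692000024,
    "children": [
        {"name": "SubprocessEnvManager._take_step", "total": 1.2891531169999557},
        {"name": "recv", "total": 12.655295478999985},
    ],
}

computed = self_time(env_step)
reported = 0.1792060960000832  # the "self" value reported for env_step
matches = abs(computed - reported) < 1e-9
```

Read this way, the tree shows that almost all of the time under "env_step" is spent in "recv", and almost all of the time under "update_policy" is spent in "PPOPolicy.update".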

xiaomaogy · 2019-07-10T23:34:36Z

ml-agents-envs/mlagents/envs/timers.py

+        self.total: float = 0.0
+        self.count: int = 0
+
+    def get_child(self, name: str) -> "TimerNode":


Maybe this is a dumb question, but why is the return type written as the string "TimerNode" instead of TimerNode here?

Not a dumb question at all. This explains it better than I can: https://mypy.readthedocs.io/en/latest/kinds_of_types.html#class-name-forward-references

Without the string, the annotation can't be evaluated when the module is imported:

______________________________________ ERROR collecting mlagents/trainers/tests/test_reward_signals.py ______________________________________
venv/lib/python3.7/site-packages/py/_path/local.py:701: in pyimport
    __import__(modname)
ml-agents/mlagents/trainers/__init__.py:6: in <module>
    from .trainer import *
ml-agents/mlagents/trainers/trainer.py:8: in <module>
    from mlagents.trainers import TrainerMetrics
ml-agents/mlagents/trainers/__init__.py:8: in <module>
    from .trainer_controller import *
ml-agents/mlagents/trainers/trainer_controller.py:16: in <module>
    from mlagents.envs.subprocess_env_manager import SubprocessEnvManager
ml-agents-envs/mlagents/envs/subprocess_env_manager.py:9: in <module>
    from mlagents.envs.timers import timed, hierarchical_timer
ml-agents-envs/mlagents/envs/timers.py:35: in <module>
    class TimerNode:
ml-agents-envs/mlagents/envs/timers.py:47: in TimerNode
    def get_child(self, name: str) -> TimerNode:
E   NameError: name 'TimerNode' is not defined

It looks like they're not needed in __init__ though, so I'll remove those ones.
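
A small sketch of the forward-reference pattern discussed here. Inside the class body the name TimerNode is not bound yet, so the annotation is written as a string and stored unevaluated. The method body below is an assumption made for the example, not the PR's implementation.

```python
from typing import Dict


class TimerNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TimerNode"] = {}

    # The quoted annotation is kept as a plain string, so it does not need
    # the class name to exist while the class body is still executing.
    def get_child(self, name: str) -> "TimerNode":
        if name not in self.children:
            self.children[name] = TimerNode()
        return self.children[name]


return_annotation = TimerNode.get_child.__annotations__["return"]
child = TimerNode().get_child("env_step")
```

On Python 3.7 and later, placing `from __future__ import annotations` at the top of the module is an alternative that defers evaluation of all annotations, so the quotes can be dropped.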

xiaomaogy · 2019-07-11T00:01:49Z

ml-agents-envs/mlagents/envs/timers.py

+        if child_list:
+            res["children"] = child_list
+
+        return res


It seems that, based on this implementation, the root TimerNode is treated as a special node (its total and count are always 0, and it doesn't have a name). Since this is a recursive call, should we set a base case so that the root TimerNode also has the corresponding values?

I agree, the handling of the root node is a little strange but I'm not sure what the best way to format the output is. A few ways I was thinking about:

1. Omit total, self, and count keys from the root
2. Just return the list of the root's children
3. For the root, set total = program execution time, self = total - sum(children), and count = 1

Sounds like you're suggesting the third one? Any other ways to do it?

Yes, I'm suggesting the third one. I'm also suggesting adding a "name" attribute to the TimerNode class and giving the root TimerNode the key-value pair "name": "root". Right now the "name" attribute is added to the child TimerNode on line 104, which was a little confusing.

Sounds good, I'll do that (and I forgot about adding "name":"root")

I'll add some comments about the node names - since the parent has the name as the key, the node doesn't actually store it.
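
One way the agreed approach could look, as a sketch only: the serializer receives each node's name from its parent, because the parent stores children keyed by name, and the caller supplies "root" for the top-level node. `node_to_dict` and the sample values are assumptions; the PR's actual output code may differ.

```python
from typing import Any, Dict


class TimerNode:
    __slots__ = ["children", "total", "count"]

    def __init__(self) -> None:
        self.children: Dict[str, "TimerNode"] = {}
        self.total: float = 0.0
        self.count: int = 0


def node_to_dict(node: TimerNode, name: str) -> Dict[str, Any]:
    # The node does not store its own name; the parent (or caller) passes it in.
    child_total = sum(child.total for child in node.children.values())
    res: Dict[str, Any] = {
        "name": name,
        "total": node.total,
        "count": node.count,
        "self": node.total - child_total,
    }
    child_list = [
        node_to_dict(child, child_name)
        for child_name, child in node.children.items()
    ]
    if child_list:
        res["children"] = child_list
    return res


# Option 3 from the discussion: the root gets real values instead of zeros.
root = TimerNode()
root.total = 10.0  # stand-in for total program execution time
root.count = 1
step = root.children.setdefault("step", TimerNode())
step.total = 4.0
step.count = 2

tree = node_to_dict(root, "root")
```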

xiaomaogy · 2019-07-11T17:53:14Z

ml-agents-envs/mlagents/envs/timers.py

+
+The total time and counts are tracked for each block of code; in this example "foo" and "context.foo" are considered
+distinct blocks, and are tracked separately.
+"""


Maybe it would be helpful to point out (or give an example) that @timed is equivalent to "with hierarchical_timer()" when the name passed in equals the function's name. That would make it easier to understand for people who are not familiar with decorators or who don't want to read the implementation.
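
The equivalence can be sketched as follows. This is an illustration, not the PR's code: the `hierarchical_timer` here is a stand-in that only records the timer name in a list, and `timed` is written on the assumption that it uses the function's `__qualname__` as the timer name, as the commit history suggests.

```python
import functools
from contextlib import contextmanager
from typing import Any, Callable, Generator, List

recorded: List[str] = []  # stand-in for a real timer tree


@contextmanager
def hierarchical_timer(name: str) -> Generator[None, None, None]:
    # Stand-in: records the timer name instead of measuring elapsed time.
    try:
        yield
    finally:
        recorded.append(name)


def timed(func: Callable[..., Any]) -> Callable[..., Any]:
    # Wrap each call in a timer named after the function's qualified name.
    @functools.wraps(func)
    def wrapped(*args: Any, **kwargs: Any) -> Any:
        with hierarchical_timer(func.__qualname__):
            return func(*args, **kwargs)

    return wrapped


@timed
def my_func(x: int, y: int) -> int:
    return x + y


class Foo:
    @timed
    def bar(self) -> str:
        return "bar"


result = my_func(1, 2)  # timed via the decorator
Foo().bar()             # the qualified name includes the class

# The explicit form of the my_func call above.
with hierarchical_timer("my_func"):
    explicit_result = 1 + 2
```

When the functions are defined at module level, the decorator produces the timer names "my_func" and "Foo.bar", so the decorated call and the explicit `with` block land in the same timer.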

xiaomaogy · 2019-07-11T17:54:17Z

Overall looks good to me. I've learned a lot during the review.

ervteng · 2019-07-11T17:56:31Z

ml-agents/mlagents/trainers/trainer_controller.py

@@ -284,8 +296,10 @@ def start_learning(
        env_manager.close()
        if self.train_model:
            self._write_training_metrics()
+            self._write_timing_tree()


We might want to write the timing tree even when we're not training (i.e., doing inference). I think if we just move it out of this if statement, it will do that.

ervteng

Looks good! Might change the timer tree write to be always rather than just while training. This is going to be very useful.

Chris Elion added 3 commits June 24, 2019 15:03

Timer proof-of-concept

92d18cb

micro optimizations

231bfea

add some timers

c0c07dc

cleanup, add asserts

a656b05

chriselion mentioned this pull request Jun 26, 2019

C# hierarchical timers #2198

Merged

Chris Elion added 5 commits July 3, 2019 10:36

Merge remote-tracking branch 'origin/develop' into develop-timers-poc

809ae80

Cleanup (no start/end methods) and handle exceptions

5393a3f

Merge remote-tracking branch 'origin/develop' into develop-timers-poc

7017ece

unit test and decorator

e4bdb25

move output code, add a decorator

42959e2

chriselion commented Jul 9, 2019

Chris Elion added 3 commits July 9, 2019 16:57

cleanup

95ca049

module docstring

a75d172

actually write the timings when done with training

ac6b85e

chriselion commented Jul 10, 2019

Chris Elion added 3 commits July 10, 2019 09:30

use __qualname__ instead

f6714f9

Merge remote-tracking branch 'origin/develop' into develop-timers-poc

cc3043a

add a few more timers

ae5f155

chriselion commented Jul 10, 2019

Chris Elion added 2 commits July 10, 2019 10:02

fix mock import

2a38af0

fix unit test

d896d50

chriselion changed the title ~~[proof of concept] hierarchical timers~~ python timers Jul 10, 2019

chriselion requested review from xiaomaogy and ervteng July 10, 2019 17:21

xiaomaogy reviewed Jul 10, 2019

don't need fwd reference

ec887ee

xiaomaogy reviewed Jul 11, 2019

ervteng reviewed Jul 11, 2019

ervteng approved these changes Jul 11, 2019

Chris Elion added 3 commits July 11, 2019 11:12

cleanup root

d339ccb

always write timers, add comments

b2096bb

undo accidental change

c9c8273

chriselion merged commit 3e242c7 into develop Jul 11, 2019

chriselion deleted the develop-timers-poc branch July 11, 2019 22:25

github-actions bot locked as resolved and limited conversation to collaborators May 18, 2021
