
More Timer refinement #46023

Closed
wants to merge 27 commits into master from gh/taylorrobie/timer_papercuts

Conversation

@robieta commented Oct 8, 2020

This PR just adds more polish to the benchmark utils:

  1. common.py, timer.py, and valgrind_wrapper/timer_interface.py are now mypy strict compliant (except for three violations due to external deps). Compare and Fuzzer will be covered in a future PR.
  2. CallgrindStats now uses TaskSpec rather than accepting the individual fields which brings it closer to Measurement.
  3. Some __repr__ logic has been moved into TaskSpec (which Measurement and CallgrindStats use in their own __repr__s) for a more unified feel and less horrible f-string hacking, and the __repr__s have been given a cleanup pass.
  4. Tuple[FunctionCount, ...] has been formalized as the FunctionCounts class, which has a much nicer __repr__ than just the raw tuple, as well as some convenience methods (__add__, __sub__, filter, transform) for easier DIY stat exploration (a usage sketch follows at the end of this description). (I find myself using the latter two a lot now.) My personal experience is that manipulating FunctionCounts is massively more pleasant than the raw tuples of FunctionCount. (Though it's still possible to get at the raw data if you want.)
  5. Better support for multi-line stmt and setup.
  6. Compare now also supports rowwise coloring, which is often the more natural layout for A/B testing.
  7. Limited support for globals in collect_callgrind. This should make it easier to benchmark JIT models. (CC @ZolotukhinM)
  8. More unit tests, including extensive tests for the Callgrind stats manipulation APIs.
  9. Mitigate an issue with MKL_THREADING_LAYER when run in Jupyter. (See #37377: PyTorch 1.5.0 installed from conda errors with complaints about incompatibility between MKL and libgomp when using PyTorch's multiprocessing.)

Test plan: changes should be covered by existing and new unit tests.
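A rough sketch of the FunctionCounts workflow from item 4 (a sketch only; stmt/setup values are illustrative, exact signatures may differ slightly from what this PR lands, and collect_callgrind requires a valgrind install):

```python
from torch.utils.benchmark import Timer

# Collect instruction counts for a trivial statement (illustrative values).
stats = Timer(
    stmt="x + 1",
    setup="import torch; x = torch.ones((8, 8))",
).collect_callgrind()

counts = stats.stats(inclusive=False)   # FunctionCounts rather than a raw tuple
torch_only = counts.filter(lambda name: "torch" in name)

# `transform` rewrites function names and merges entries that collide,
# which is handy for stripping file/line prefixes before comparing runs.
merged = counts.transform(lambda name: name.split(":")[-1])

# FunctionCounts supports arithmetic, so A/B deltas are just subtraction:
#     delta = counts_a - counts_b
print(torch_only)  # compact, readable __repr__
```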

@robieta requested review from ezyang and ngimel on October 8, 2020 at 07:36
@robieta (Author) commented Oct 8, 2020

Test breakage is due to where I put test_callgrind_artifacts.json. I'll sort it out.

@robieta (Author) commented Oct 8, 2020

I moved the benchmark utils tests into a separate file, as they are now non-trivial. However, I realize this makes it more difficult to review. The changes apart from the move are:

  • Add a multi-line test to test_timer (see the sketch after this list)
  • Add a JIT'd function and an int to globals in test_collect_callgrind to make sure artifact transfer works properly.
  • test_manipulate_callgrind_stats is all new.
  • Add test for rowwise coloring in test_compare
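As a quick sketch of the multi-line stmt/setup usage the new test exercises (illustrative values, assuming the dedent behavior this PR adds; not the actual test code):

```python
from torch.utils.benchmark import Timer

# Multi-line stmt and setup are dedented and run as blocks, so small
# helper statements can live inline (values here are illustrative).
timer = Timer(
    stmt="""
        y = x + 1
        z = y.sum()
    """,
    setup="""
        import torch
        x = torch.ones((64, 64))
    """,
)
print(timer.timeit(100))  # returns a Measurement
```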

@robieta (Author) commented Oct 8, 2020

Also CC @heitorschueroff
I'm starting to flesh out the API docstrings, and a couple bits of this PR will allow us to formulate the latter parts of our recipe a little more elegantly.

@codecov bot commented Oct 9, 2020

Codecov Report

Merging #46023 into master will increase coverage by 0.04%.
The diff coverage is 67.84%.


@@            Coverage Diff             @@
##           master   #46023      +/-   ##
==========================================
+ Coverage   68.33%   68.37%   +0.04%     
==========================================
  Files         410      411       +1     
  Lines       53795    53937     +142     
==========================================
+ Hits        36760    36881     +121     
- Misses      17035    17056      +21     
Impacted Files Coverage Δ
...enchmark/utils/valgrind_wrapper/compat_bindings.py 0.00% <0.00%> (ø)
...enchmark/utils/valgrind_wrapper/timer_interface.py 48.90% <58.92%> (+23.11%) ⬆️
torch/utils/benchmark/utils/timer.py 85.88% <83.33%> (+7.05%) ⬆️
torch/utils/benchmark/utils/common.py 98.63% <97.43%> (+2.23%) ⬆️
torch/utils/benchmark/__init__.py 100.00% <100.00%> (ø)
torch/utils/benchmark/utils/compare.py 97.96% <100.00%> (+0.62%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@dr-ci bot commented Oct 9, 2020

💊 CI failures summary and remediations

As of commit 931d85b (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

1 failure not recognized by patterns:

Job: CircleCI binary_linux_libtorch_3_7m_cpu_devtoolset7_shared-with-deps_build
Step: Build
Action: 🔁 rerun

test_callgrind_artifacts.json
@@ -0,0 +1,1187 @@
{
Contributor:

Are you sure you actually want to check this in? Need to keep reading to find out why this is used, but this seems a bit fragile.

Author (robieta):

It's to hermetically mock a collected Callgrind run so that some of the supporting functionality can be tested as a fast unit test.

Contributor:

I see now that this is a test fixture.

Contributor:

It would have been nice if this fixture were smaller. Not sure if this is actually possible; most of the lines are coming from Python.

@ezyang (Contributor) commented Oct 9, 2020

You say that it is strict safe, but I don't see adjustments to the mypy config to ensure these keep getting checked as strict. CI typechecking is important for ensuring people continue to keep things strict. If there are upstream type problems I suggest suppressing them.

"counter()",
globals={"counter": pickle.loads(pickle.dumps(counter))}
).timeit(20)
print(counter.value) # Still 10
Contributor:

I must admit, I'm not terribly convinced by this argument. Yes, side effects run inside the timer may get disregarded. But there isn't really any reason to presuppose that the user cared about the side effects at all in the first place. After all, they're calling a timer on a piece of code in a loop. It's incredibly unlikely that they actually wanted to call the operation 10 times; if they are doing a side-effectful operation, they're just doing it to exercise some piece of code that they're interested in timing.

If the user doesn't care about the side effects, then blacklisting Tensor seems like it's just unnecessarily making people's lives harder when they have some nontrivial setup that they don't want to have to handle inside Timer.

A better reason to block globals is if the serialization/deserialization process perturbs the representation of a tensor in such a way that the timing would be different. This is not an idle concern; for example, if a tensor lives in pinned CPU memory, I'm reasonably certain this wouldn't get preserved by a dump, and that will change the performance of certain CUDA operations.

Contributor:

You don't have to do this in this PR, but some more thoughts: it seems better and more explicit to make the user say that they are doing some sort of serialization. Allow the transfer of primitive types (where the serialization is well defined) but then make the user actually do serialization/deserialization if they want to. It will make it more obvious that something is going on (and if there is a bug in the user's serialization code, it will save them a lot of heartbreak).

Author (robieta):

I've updated the language, and added a CopyIfCallgrind wrapper for users to declare that they're willing to have their classes serialized. I realized that there's a slight chicken-and-egg problem. You might want setup to execute before globals are loaded so you can set up the environment for unpickle to succeed, but you would want the reverse if you plan to use setup to revive bytes in globals. timeit has the latter behavior (I thought it was the former, so I need to switch the codegen order), but that seems to imply that CopyIfCallgrind also has to allow an optional per-variable setup.

Author (robieta):

"model": benchmark_utils.CopyIfCallgrind(
    MyModule(),
    setup=f"""\
    import sys
    sys.path.append({repr(os.path.split(os.path.abspath(__file__))[0])})
    from test_benchmark_utils import MyModule
    """
)

Isn't pickle fun?

Author (robieta):

(Also, because pickle recursively unpickles, I don't think we can automagically generate the CopyIfCallgrind.setup code.)

}

with open("/tmp/test_callgrind_artifacts.json", "wt") as f:
    json.dump(artifacts, f, indent=4)
Contributor:

The fixture for generating the test data should be actual code and be executed in CI (with some basic sanity test on the output) to prevent it from bitrotting.

Author (robieta):

It's now a proper function, though unlike expecttest it doesn't auto-regen since the diff can be quite large due to changed build dir prefixes.

stats_no_data, stats_with_data = load_test_example()

self.assertEqual(stats_no_data.counts(), 8869966)
self.assertEqual(stats_no_data.counts(denoise=True), 8728096)
Contributor:

Should have instructions for how to update the test numbers (and check that they're right!) if you refresh the fixture

("pass", 8e-9),
("cheap_fn()", 4e-6),
("expensive_fn()", 20e-6),
)
Contributor:

Can't you just define this as a dictionary in the first place?

Author (robieta):

I'm always wary of adding mutable class fields, so it's just habit at this point.
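For context, a quick illustration of the pitfall that motivates that habit (a hypothetical class, not code from this PR):

```python
# Mutable class-level fields are shared across instances, so a mutation
# made through one instance silently leaks into all of the others.
class Config:
    overrides = {}  # shared dict: every Config instance sees the same object

a, b = Config(), Config()
a.overrides["threads"] = 4
print(b.overrides)  # {'threads': 4} even though b was never touched
```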

@robieta (Author) commented Oct 9, 2020

I added a load_inline path for the binding macros so folks can back-test on older versions of PyTorch. Just drop the entire torch.utils.benchmark folder into the old version and it should just work. It's a hack, but should be a useful shim for the transition period. (After which we can easily rip it out.)

CC @ailzhang @albanD
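For readers unfamiliar with the trick, here is a rough sketch of the load_inline idea (not the actual compat_bindings.py code; names are illustrative, and it assumes valgrind headers plus a working compiler are available at runtime):

```python
from torch.utils import cpp_extension

# Compile the Callgrind control macros on the fly when pre-built bindings
# are unavailable (e.g. when the folder is dropped into an older PyTorch).
source = """
#include <valgrind/callgrind.h>

void toggle_collect() { CALLGRIND_TOGGLE_COLLECT; }
void zero_stats() { CALLGRIND_ZERO_STATS; }
"""

bindings = cpp_extension.load_inline(
    name="callgrind_compat_bindings",
    cpp_sources=[source],
    functions=["toggle_collect", "zero_stats"],
)
bindings.toggle_collect()  # toggle Callgrind collection around the timed region
```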

@robieta (Author) commented Oct 13, 2020

@ezyang I updated the way testing works. I have a sneaking suspicion that you'll either love it or hate it. (Hopefully the former...) The TL;DR is that for all of the "string check heavy" tests, the test and regeneration passes are 99% the same, just swapping a store-to-golden-file step for a check-against-golden-file step. One of the properties of the current tests that I wanted to preserve was the ability to read the tests to sanity check them. To that end, the generation pass emits two artifacts per test: foo.json, which the unit test actually runs against, and foo.txt, which is formatted for human consumption. (They are emitted together so there's no drift.) This also (hopefully) has the added benefit that the diffs for the human-readable files will make it easier to review any future changes.

This of course all needs to be documented in the test file itself, but I figured I'd give you a chance to digest the high level approach while I write docstrings and type annotations.
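A minimal sketch of the store-vs-check pattern described above (the helper name, paths, and JSON layout are illustrative, not the actual test code):

```python
import json
import os

GOLDEN_DIR = "callgrind_artifacts"  # illustrative location for golden files

def check_or_regenerate(name: str, actual: str, regenerate: bool = False) -> None:
    """Check `actual` against a golden file, or rewrite the golden artifacts."""
    json_path = os.path.join(GOLDEN_DIR, f"{name}.json")
    txt_path = os.path.join(GOLDEN_DIR, f"{name}.txt")
    if regenerate:
        # Emit the machine-readable and human-readable copies together
        # so they can never drift apart.
        with open(json_path, "wt") as f:
            json.dump({"result": actual}, f, indent=4)
        with open(txt_path, "wt") as f:
            f.write(actual)
    else:
        with open(json_path, "rt") as f:
            expected = json.load(f)["result"]
        assert actual == expected, f"{name}: output does not match the golden file"
```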

@robieta (Author) commented Oct 14, 2020

> You say that it is strict safe, but I don't see adjustments to the mypy config to ensure these keep getting checked as strict. CI typechecking is important for ensuring people continue to keep things strict. If there are upstream type problems I suggest suppressing them.

Done. I had to change mypy-strict.ini to exclude torch (and numpy) to avoid transitive failures.

@robieta (Author) commented Oct 14, 2020

Test failure appears unrelated.

@robieta (Author) commented Oct 14, 2020

Clang-tidy build is failing due to an unrelated issue. #46315 seems like it should fix it; it just hasn't been picked up by fbcode/warm yet.

@facebook-github-bot left a comment

@robieta has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

torch/utils/benchmark/utils/valgrind_wrapper/*.py

[mypy-torch.utils.benchmark.utils.*]
follow_imports = normal
Contributor:

Is this follow_imports line really necessary? (If it is, does that mean we also need it for tools.codegen.gen?)

Author (robieta):

This is because of the

[mypy-torch.*]
follow_imports = skip

block. I added a comment explaining why.

@facebook-github-bot left a comment

@robieta has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@robieta (Author) commented Oct 15, 2020

Thanks for the reviews!

@facebook-github-bot left a comment

@robieta has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

robieta pushed a commit that referenced this pull request Oct 15, 2020
@robieta (Author) commented Oct 15, 2020

binary_linux_libtorch_3_7m_cpu_devtoolset7_shared-with-deps_build and docker-pytorch-linux-xenial-py3-clang5-android-ndk-r19c failures are unrelated. (I observed that they happen on other PRs)

@facebook-github-bot left a comment

@robieta has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
@robieta merged this pull request in dda95e6.

@robieta deleted the gh/taylorrobie/timer_papercuts branch on January 11, 2021 at 21:09