
[Pass] Profiling TVM compiler passes #7500

Merged: 5 commits merged into apache:main on Mar 3, 2021

Conversation

@altanh (Contributor) commented Feb 22, 2021

This is a basic prototype; I'm looking to have some discussion about the design and what API people would like. The profiler handles nested passes.

cc @tkonolige @masahi @mbrookhart @jroesch

@altanh (Contributor Author) commented Feb 22, 2021

example output:

tests/python/frontend/pytorch/test_object_detection.py sequential: 16636769us (94.70%)
	RemoveUnusedFunctions: 747us (0.00%)
	ToBasicBlockNormalForm: 16075us (0.10%)
	sequential: 179031us (1.08%)
		InferType: 38727us (21.63%)
		Legalize: 50240us (28.06%)
			InferType: 39085us (77.80%)
		InferType: 39895us (22.28%)
		Legalize: 50149us (28.01%)
			InferType: 38553us (76.88%)
	InferType: 40402us (0.24%)
	Legalize: 67676us (0.41%)
		InferType: 39490us (58.35%)
	EtaExpand: 10162us (0.06%)
	InferType: 40605us (0.24%)
	SimplifyInference: 49953us (0.30%)
		InferType: 39228us (78.53%)
	InferType: 40696us (0.24%)
	EliminateCommonSubexpr: 117125us (0.70%)
		InferType: 36413us (31.09%)
	InferType: 36998us (0.22%)
	SimplifyExpr: 424215us (2.55%)
		InferType: 73673us (17.37%)
		InferType: 74374us (17.53%)
		InferType: 74723us (17.61%)
		InferType: 75850us (17.88%)
		InferType: 37884us (8.93%)
	InlinePrimitives: 58960us (0.35%)
		Inline: 10235us (17.36%)
		DeadCodeElimination: 48721us (82.63%)
			InferType: 37255us (76.47%)
	FoldConstant: 3839048us (23.08%)
		sequential: 1074us (0.03%)
			InferType: 710us (66.11%)
			FuseOps: 158us (14.71%)
				InferType: 88us (55.70%)
			ToANormalForm: 61us (5.68%)
			InferType: 137us (12.76%)
			...

@mbrookhart (Contributor) commented

After a first pass, this looks mostly good to me. Any idea how much overhead this brings to running the passes?

@tkonolige (Contributor) commented

Is the percentage after the pass relative to the global scope or to the parent?

@altanh (Contributor Author) commented Feb 22, 2021

> After a first pass, this looks mostly good to me. Any idea how much overhead this brings to running the passes?

I'll do some basic measurements, but it should be pretty negligible. With a runtime flag, the worst case would be something like 2 additional boolean comparisons per pass invocation when profiling is disabled.

@altanh (Contributor Author) commented Feb 22, 2021

> Is the percentage after the pass relative to the global scope or to the parent?

Parent; that made the most sense to me, but I can see the global scope also being useful (maybe more so, actually). This would be easy to change. Something that I think people will generally want is a way of exporting the profiler data across the FFI to enable more flexible Python-based data analysis (rather than, e.g., having to parse stdout or a string), although I'm not sure how the data should be represented.

edit: I could probably just patch up the PassProfile object to use TVM's FFI Object and ObjectRef interface, but I'm not sure how the C++ chrono types would be handled.

@tkonolige (Contributor) left a comment

Looks mostly good!

Review comments on src/ir/transform.cc (outdated, resolved)
@altanh (Contributor Author) commented Feb 24, 2021

> After a first pass, this looks mostly good to me. Any idea how much overhead this brings to running the passes?

I did a few compilation runs on a large model that takes ~50 seconds and didn't notice any large performance hit; it seemed to be mostly hidden by normal compilation variance. In any case, profiling is disabled by default, and I added an API to enable/disable it as needed. I think this should be good to go for now.
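
For illustration, here is a minimal sketch of what the enable/disable workflow might look like from Python. The entry-point names (`enable_pass_profiling`, `disable_pass_profiling`) are assumptions made for this sketch and may not match the final API:

```python
import tvm
from tvm import relay

# Hypothetical toggle names, assumed for illustration; see the PR diff for the real API.
tvm.transform.enable_pass_profiling()

# Build a tiny Relay module and run a couple of passes so the profiler records something.
x = relay.var("x", shape=(1, 16))
mod = tvm.IRModule.from_expr(relay.nn.relu(x))

seq = tvm.transform.Sequential(
    [relay.transform.InferType(), relay.transform.SimplifyInference()]
)
mod = seq(mod)

tvm.transform.disable_pass_profiling()
```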

@altanh marked this pull request as ready for review on February 24, 2021, 19:44
@altanh (Contributor Author) commented Feb 24, 2021

I updated the printer to additionally show the time spent in the pass itself (excluding sub-passes), along with the percentage relative to the total time. I described the exact formatting in the Python docstring, but I'm open to changing it. Also, long term I'm planning on exposing the profiling data through the Object FFI so that users can customize output/analysis, but I'll do that in a separate PR.

Here's new example output:

InferType: 242us [242us] (0.03%; 0.03%)
InferType: 278us [278us] (0.04%; 0.04%)
InferType: 2501us [2501us] (0.34%; 0.34%)
sequential: 1us [1us] (0.00%; 0.00%)
sequential: 678773us [90us] (91.82%; 91.82%)
	RemoveUnusedFunctions: 92us [92us] (0.01%; 0.01%)
	ToBasicBlockNormalForm: 1219us [1219us] (0.16%; 0.18%)
	sequential: 11724us [12us] (1.59%; 1.73%)
		InferType: 2573us [2573us] (0.35%; 21.95%)
		Legalize: 3059us [738us] (0.41%; 26.09%)
			InferType: 2322us [2322us] (0.31%; 75.89%)
		InferType: 2619us [2619us] (0.35%; 22.34%)
		Legalize: 3460us [865us] (0.47%; 29.51%)
			InferType: 2595us [2595us] (0.35%; 75.00%)
	InferType: 2783us [2783us] (0.38%; 0.41%)
	Legalize: 4525us [2064us] (0.61%; 0.67%)
		InferType: 2461us [2461us] (0.33%; 54.38%)
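
To make the layout concrete, here is how the last two lines above decompose. This is my reading of the format (total time, self-time in brackets, then percent of the overall total and percent of the parent), inferred from the numbers and the description above:

```python
# Reading the last two lines of the output:
#   Legalize: 4525us [2064us] (0.61%; 0.67%)
#       InferType: 2461us [2461us] (0.33%; 54.38%)
legalize_total = 4525
child_infer_type = 2461
parent_sequential = 678773            # the enclosing top-level "sequential" pass
overall_total = 678773 / 0.9182       # top-level sequential is 91.82% of the overall total

print(legalize_total - child_infer_type)          # 2064us -> the bracketed self-time
print(100 * legalize_total / overall_total)       # ~0.61  -> first percentage (of overall total)
print(100 * legalize_total / parent_sequential)   # ~0.67  -> second percentage (of parent)
```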

@altanh changed the title from "[WIP][Pass] Profiling TVM compiler passes" to "[Pass] Profiling TVM compiler passes" on Feb 24, 2021
@zhiics (Member) left a comment

Thanks for the nice work. Could you add a unit test so that people can easily see how to use the profiler?

@altanh (Contributor Author) commented Feb 25, 2021

> Thanks for the nice work. Could you add a unit test so that people can easily see how to use the profiler?

Thanks! How should I do this? I can add something like tests/python/relay/test_pass_profiler.py that just shows how to use it, but there isn't really a "correctness" property I can check. I'll go ahead and do this, but let me know if you have something more specific in mind.

@zhiics (Member) commented Feb 25, 2021

> > Thanks for the nice work. Could you add a unit test so that people can easily see how to use the profiler?

> Thanks! How should I do this? I can add something like tests/python/relay/test_pass_profiler.py that just shows how to use it, but there isn't really a "correctness" property I can check. I'll go ahead and do this, but let me know if you have something more specific in mind.

Yeah, this should be fine. Is it possible to assert that the passes being observed are printed?

@altanh (Contributor Author) commented Feb 25, 2021

> Yeah, this should be fine. Is it possible to assert that the passes being observed are printed?

Hmm, I think I'll change the API to return a String rather than printing to stdout. This should make things more flexible, and I'll be able to check the output for the passes in the unit test. I will do this tomorrow, thanks!
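
A rough sketch of what such a test might look like, assuming a render-to-string entry point; the function names used here (`enable_pass_profiling`, `render_pass_profiles`, `disable_pass_profiling`) are placeholders for illustration, not necessarily the final API:

```python
import tvm
from tvm import relay


def test_pass_profiler():
    x = relay.var("x", shape=(1, 16))
    mod = tvm.IRModule.from_expr(relay.nn.relu(x))

    tvm.transform.enable_pass_profiling()           # placeholder name
    mod = relay.transform.InferType()(mod)
    mod = relay.transform.SimplifyInference()(mod)
    report = tvm.transform.render_pass_profiles()   # placeholder: assumed to return the rendered String
    tvm.transform.disable_pass_profiling()          # placeholder name

    # There is no numeric "correct" answer to check, but the observed passes should show up.
    assert "InferType" in report
    assert "SimplifyInference" in report
```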

@altanh (Contributor Author) commented Mar 1, 2021

bump @zhiics, sorry for the delay

This should be good for merging now

@jroesch merged commit 3a02e0b into apache:main on Mar 3, 2021
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request May 6, 2021
* basic pass profiler prototype

* allow enable/disable of pass profiling

* lint

* add example pass profiler usage as test

* render pass profiles to String instead of stdout
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request May 11, 2021
* basic pass profiler prototype

* allow enable/disable of pass profiling

* lint

* add example pass profiler usage as test

* render pass profiles to String instead of stdout