[fx/profiler] debug the fx.profiler / add an example test script for fx.profiler #1730
Conversation
colossalai/fx/profiler/memory.py
Outdated
```diff
         return n.target in [torch.nn.functional.relu, torch.nn.functional.softmax]
     elif n.op == 'call_module':
-        return type(n.graph.owning_module.get_submodule(n.target)) in [torch.nn.ReLU]
+        return type(n.graph.owning_module.get_submodule(n.target)) in [torch.nn.ReLU, torch.nn.Softmax]
```
As I mentioned, softmax is another relu_like node.
I think this is going to confuse future maintainers: what is a relu_like node?
I will think about this later. Unfortunately, I am also confused lol. Maybe I should add this to constants.py?
At least you should define this category of ops so that its semantics are clear.
Okay, I will add some comments there.
```python
    return forward_mem, param_mem


@pytest.mark.skip("Test for performance, no need for CI")
```
Mark this with colossalai.testing.run_on_environment_flag so that we can still run the test in the future without code changes.
Should it be colossalai.testing.run_on_environment_flag('fx.profiler')?
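If the suggestion is adopted, the change might look roughly like this. A minimal sketch only: the flag name `FX_PROFILER` and the `name=` keyword are assumptions, since the decorator's exact signature isn't shown in this thread.

```python
from colossalai.testing import run_on_environment_flag

# Assumed usage: the decorator skips the test unless the named environment
# flag is set, so the test stays runnable without code changes.
# 'FX_PROFILER' and the `name=` keyword are hypothetical here.
@run_on_environment_flag(name='FX_PROFILER')
def test_fx_profiler_performance():
    ...  # the performance test body from this PR
```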
```python
    del model, gm


@pytest.mark.skip("Test for performance, no need for CI")
```
This is the same.
colossalai/fx/profiler/profiler.py
Outdated
```python
if target in RELU_LIKE_OPS:
    do_not_cache = True
```
Here I also made some modifications.
"""Check if a node is a ReLU-like node. | ||
ReLU-like nodes have the following properties: | ||
- They are either `call_function` or `call_module` | ||
- Their output tensors are directly saved for backward | ||
- Their input tensors are not saved for backward | ||
|
||
An example is `torch.nn.functional.softmax` which has (forward + backward): | ||
def forward(self, input_2): | ||
_softmax_default = torch.ops.aten._softmax.default(input_2, None, None); input_2 = None | ||
zeros_like_default = torch.ops.aten.zeros_like.default(_softmax_default, dtype = None, layout = None, device = None, pin_memory = None) | ||
detach_default = torch.ops.aten.detach.default(_softmax_default); _softmax_default = None | ||
_softmax_backward_data_default = torch.ops.aten._softmax_backward_data.default(zeros_like_default, detach_default, None, None); zeros_like_default = detach_default = None | ||
detach_default_1 = torch.ops.aten.detach.default(_softmax_backward_data_default); _softmax_backward_data_default = None | ||
detach_default_2 = torch.ops.aten.detach.default(detach_default_1); detach_default_1 = None | ||
|
||
Args: | ||
n (Node): A node from the graph | ||
|
||
Returns: | ||
bool: Whether the node is a ReLU-like node | ||
""" |
A docstring has been added.
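As an aside, the "output saved for backward, input not saved" property the docstring describes can be checked empirically with PyTorch's saved-tensor hooks (available since PyTorch 1.10). This sketch is illustrative only and not part of the PR:

```python
import torch

saved = []  # every tensor autograd saves for the backward pass lands here

def pack(t):
    saved.append(t)
    return t

x = torch.randn(4, requires_grad=True)
with torch.autograd.graph.saved_tensors_hooks(pack, lambda t: t):
    y = torch.nn.functional.softmax(x, dim=0)

# softmax saves its output, not its input, for backward:
print(any(s is x for s in saved))             # expected: False (input not saved)
print(any(torch.equal(s, y) for s in saved))  # expected: True  (output saved)
```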
colossalai/fx/profiler/constants.py
Outdated
```python
RELU_LIKE_OPS = [
    torch.nn.functional.relu,
    torch.nn.functional.softmax,
]
```
Still, you should define a proper term for this kind of op; relu_like does not convey any useful information.
Or call them something like OUTPUT_SAVED_OPS?
I put a docstring in the function that uses this constant to illustrate its semantics.
Yup, I see it. I just think it could be clearer. I can accept it for now, but I hope it can be improved.
Okay, I will come back to refactoring the profiler after @Cypher30 finishes the backward estimation. I already have some ideas for removing some of these constants without affecting performance.
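For reference, a hedged sketch of what the reviewer's OUTPUT_SAVED_OPS naming could look like, stitched together from the diffs in this thread (the module list `OUTPUT_SAVED_MOD` and the function name are illustrative, not the merged code):

```python
import torch
from torch.fx import Node

# Ops whose *output* (rather than input) is saved for the backward pass,
# per the docstring above. Name proposed in review; merged code may differ.
OUTPUT_SAVED_OPS = [
    torch.nn.functional.relu,
    torch.nn.functional.softmax,
]

# call_module counterparts (illustrative list).
OUTPUT_SAVED_MOD = [
    torch.nn.ReLU,
    torch.nn.Softmax,
]

def is_output_saved_node(n: Node) -> bool:
    """True if `n` saves its output, not its input, for backward."""
    if n.op == 'call_function':
        return n.target in OUTPUT_SAVED_OPS
    elif n.op == 'call_module':
        return type(n.graph.owning_module.get_submodule(n.target)) in OUTPUT_SAVED_MOD
    return False
```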
```diff
@@ -71,14 +74,35 @@ def calculate_fwd_tmp(n: Node) -> int:
         fwd_tmp (int): the result of `fwd_tmp`
     """

-def is_relu_node(n: Node) -> bool:
+def is_relu_like_node(n: Node) -> bool:
```
And the rename is here.
What's new?
I provide an example script for the `fx.profiler` performance test. It is marked as skip.
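The script itself is not reproduced in this thread. As a rough sketch of the skip-marked structure visible in the diffs (the model choice, timing logic, and function name are assumptions; the real script presumably also runs the colossalai.fx profiler passes):

```python
import time

import pytest
import torch
import torchvision.models as tm


@pytest.mark.skip("Test for performance, no need for CI")
def test_fx_profiler_performance():
    model = tm.resnet18()
    data = torch.rand(8, 3, 224, 224)

    # Trace with torch.fx and time one forward pass through the GraphModule.
    gm = torch.fx.symbolic_trace(model)
    start = time.time()
    gm(data)
    print(f'forward time: {time.time() - start:.4f}s')

    del model, gm
```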
Remarks

After I upgraded transformers to the newest version, Hugging Face's GPT-2 no longer uses `torch.nn.MultiheadAttention` for its transformer blocks, so `torch.nn.functional.softmax` caused some unexpected problems. They also use the `torch.finfo` object, which is weird. I managed to fix these issues.

Tests