[XLA:CPU] Improve Trace emitter #10164

sachinmuradi · 2024-02-29T23:51:50Z

This PR improves tracing a bit.

Without this PR, you would see operations in trace view as :

This way , it s difficult to see a) target of custom call b) can't group operations based on type of hlo operation and hence makes op list huge

This PR addresses these problems and now you should see tensorboard trace view as :

ddunl · 2024-03-12T21:39:25Z

LGTM but interested in @penpornk 's review. Thanks for the PR!!

kamaljeeti · 2024-03-25T17:08:31Z

Hi @penpornk , can you review this PR? Thanks.

penpornk

Thank you for the PR and sorry for the delay! This fell through the crack.

penpornk · 2024-03-25T19:19:12Z

xla/service/cpu/ir_emitter.cc

+    // For custom call, passing custom call target is more informative
+    hlo_name_str = hlo->custom_call_target();
+  } else {
+    hlo_name_str = HloOpcodeString(hlo->opcode());


I'm not sure about this part since it would mean we can't tell the difference between each of the same HLO instructions anymore, e.g., dot.123 vs dot.42, etc. Maybe it would be better to improve the trace summarizer instead?

penpornk · 2024-03-25T19:29:44Z

xla/service/cpu/ir_emitter.cc

+  auto hlo_name_str = hlo->name();
+  if (hlo->opcode() == HloOpcode::kCustomCall) {
+    // For custom call, passing custom call target is more informative
+    hlo_name_str = hlo->custom_call_target();


For custom call, it's probably still good to know the custom call id. Having custom-call.<id> as prefix should still be helpful, e.g., custom-call.46.__onednn$matmul.

sachinmuradi · 2024-03-29T20:32:53Z

@penpornk Thank you for review, I modified the code a bit , to encode more information to TraceMe API, hope this addresses your comments, let me know your thoughts on this.

I am trying to avoid the long list generated at ops summary table in trace view, which sometimes is very slow to scroll if we have many hlo ops. It may be better if we just show the hlo opcodes in table and show which type of op is spending more time in execution, this is similar to what I have seen when using Tensorflow - Tensorboard trace view.

As to distinguish the hlo op name, I encoded the hlo name and some metadata information as well, that way user will be able to point the exact hlo op name for that op type. Here is an example screenshot :

1) When clicked on reduce-window op in trace (TF-xla example) :

2) When clicked on onednn matmul (TF-XLA example) :

3) Additional example of onednn softmax with JAX-XLA :

penpornk

This is great! Thank you very much!

Imported from GitHub PR openxla/xla#10164 This PR improves tracing a bit. Without this PR, you would see operations in trace view as : ![image](https://github.com/openxla/xla/assets/43043975/f5ae19c9-851d-44fe-bf75-9a456425a387) This way , it s difficult to see a) target of custom call b) can't group operations based on type of hlo operation and hence makes op list huge This PR addresses these problems and now you should see tensorboard trace view as : ![image](https://github.com/openxla/xla/assets/43043975/5a10b3ef-ea61-4ac5-90e3-92ead1354aa4) Copybara import of the project: -- 8cd6363890107ce7af2ed3630d577ad4d20a745b by Sachin Muradi <sachin.muradi@intel.com>: Update Trace emitter -- a2813ad7439da8934b98b12623193c3b120eb9ca by Sachin Muradi <sachin.muradi@intel.com>: more update Merging this change closes #10164 FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca PiperOrigin-RevId: 622874494

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca PiperOrigin-RevId: 622356332

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca PiperOrigin-RevId: 623012044

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca PiperOrigin-RevId: 622332776

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca PiperOrigin-RevId: 622357346

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca PiperOrigin-RevId: 622513596

Imported from GitHub PR openxla/xla#10164 This PR improves tracing a bit. Without this PR, you would see operations in trace view as : ![image](https://github.com/openxla/xla/assets/43043975/f5ae19c9-851d-44fe-bf75-9a456425a387) This way , it s difficult to see a) target of custom call b) can't group operations based on type of hlo operation and hence makes op list huge This PR addresses these problems and now you should see tensorboard trace view as : ![image](https://github.com/openxla/xla/assets/43043975/5a10b3ef-ea61-4ac5-90e3-92ead1354aa4) Copybara import of the project: -- 8cd6363890107ce7af2ed3630d577ad4d20a745b by Sachin Muradi <sachin.muradi@intel.com>: Update Trace emitter -- a2813ad7439da8934b98b12623193c3b120eb9ca by Sachin Muradi <sachin.muradi@intel.com>: more update Merging this change closes #10164 PiperOrigin-RevId: 623095394

Update Trace emitter

8cd6363

github-actions bot added the kokoro:force-run Forces CI to rerun label Feb 29, 2024

github-actions bot assigned kamaljeeti and xla-rotation Feb 29, 2024

kokoro-team removed the kokoro:force-run Forces CI to rerun label Feb 29, 2024

kamaljeeti requested a review from sergeykozub March 1, 2024 07:25

ddunl requested a review from penpornk March 12, 2024 21:38

penpornk suggested changes Mar 25, 2024

View reviewed changes

sachinmuradi added 2 commits March 29, 2024 12:28

Merge branch 'main' into sachin/tracemeupdate

7f7010c

more update

a2813ad

github-actions bot added the kokoro:force-run Forces CI to rerun label Mar 29, 2024

kokoro-team removed the kokoro:force-run Forces CI to rerun label Mar 29, 2024

penpornk approved these changes Apr 8, 2024

View reviewed changes

penpornk added the kokoro:force-run Forces CI to rerun label Apr 8, 2024

kokoro-team removed the kokoro:force-run Forces CI to rerun label Apr 8, 2024

copybara-service bot mentioned this pull request Apr 8, 2024

PR #10164: [XLA:CPU] Improve Trace emitter tensorflow/tensorflow#65240

Closed

copybara-service bot closed this in 9945521 Apr 9, 2024

copybara-service bot mentioned this pull request Apr 9, 2024

Automated Code Change tensorflow/tensorflow#65294

Closed

copybara-service bot mentioned this pull request Apr 9, 2024

Automated Code Change tensorflow/tensorflow#65298

Merged

copybara-service bot mentioned this pull request Apr 9, 2024

Automated Code Change tensorflow/tensorflow#65181

Merged

copybara-service bot mentioned this pull request Apr 9, 2024

Automated Code Change tensorflow/tensorflow#65307

Merged

copybara-service bot mentioned this pull request Apr 9, 2024

Automated Code Change tensorflow/tensorflow#65198

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[XLA:CPU] Improve Trace emitter #10164

[XLA:CPU] Improve Trace emitter #10164

sachinmuradi commented Feb 29, 2024

ddunl commented Mar 12, 2024

kamaljeeti commented Mar 25, 2024

penpornk left a comment

penpornk Mar 25, 2024

penpornk Mar 25, 2024

sachinmuradi commented Mar 29, 2024

penpornk left a comment

[XLA:CPU] Improve Trace emitter #10164

[XLA:CPU] Improve Trace emitter #10164

Conversation

sachinmuradi commented Feb 29, 2024

ddunl commented Mar 12, 2024

kamaljeeti commented Mar 25, 2024

penpornk left a comment

Choose a reason for hiding this comment

penpornk Mar 25, 2024

Choose a reason for hiding this comment

penpornk Mar 25, 2024

Choose a reason for hiding this comment

sachinmuradi commented Mar 29, 2024

penpornk left a comment

Choose a reason for hiding this comment