Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[XLA:CPU] Improve Trace emitter #10164

Closed

Conversation

sachinmuradi
Copy link
Contributor

This PR improves tracing a bit.

Without this PR, you would see operations in trace view as :
image

This way , it s difficult to see a) target of custom call b) can't group operations based on type of hlo operation and hence makes op list huge

This PR addresses these problems and now you should see tensorboard trace view as :
image

@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label Feb 29, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Feb 29, 2024
@ddunl ddunl requested a review from penpornk March 12, 2024 21:38
@ddunl
Copy link
Member

ddunl commented Mar 12, 2024

LGTM but interested in @penpornk 's review. Thanks for the PR!!

@kamaljeeti
Copy link
Contributor

Hi @penpornk , can you review this PR? Thanks.

Copy link
Member

@penpornk penpornk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR and sorry for the delay! This fell through the crack.

// For custom call, passing custom call target is more informative
hlo_name_str = hlo->custom_call_target();
} else {
hlo_name_str = HloOpcodeString(hlo->opcode());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this part since it would mean we can't tell the difference between each of the same HLO instructions anymore, e.g., dot.123 vs dot.42, etc. Maybe it would be better to improve the trace summarizer instead?

auto hlo_name_str = hlo->name();
if (hlo->opcode() == HloOpcode::kCustomCall) {
// For custom call, passing custom call target is more informative
hlo_name_str = hlo->custom_call_target();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For custom call, it's probably still good to know the custom call id. Having custom-call.<id> as prefix should still be helpful, e.g., custom-call.46.__onednn$matmul.

@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label Mar 29, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Mar 29, 2024
@sachinmuradi
Copy link
Contributor Author

@penpornk Thank you for review, I modified the code a bit , to encode more information to TraceMe API, hope this addresses your comments, let me know your thoughts on this.

I am trying to avoid the long list generated at ops summary table in trace view, which sometimes is very slow to scroll if we have many hlo ops. It may be better if we just show the hlo opcodes in table and show which type of op is spending more time in execution, this is similar to what I have seen when using Tensorflow - Tensorboard trace view.

As to distinguish the hlo op name, I encoded the hlo name and some metadata information as well, that way user will be able to point the exact hlo op name for that op type. Here is an example screenshot :

1) When clicked on reduce-window op in trace (TF-xla example) :

image

2) When clicked on onednn matmul (TF-XLA example) :

image

3) Additional example of onednn softmax with JAX-XLA :

image

Copy link
Member

@penpornk penpornk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Thank you very much!

@penpornk penpornk added the kokoro:force-run Forces CI to rerun label Apr 8, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Apr 8, 2024
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 8, 2024
Imported from GitHub PR openxla/xla#10164

This PR improves tracing a bit.

Without this PR, you would see operations in trace view as :
![image](https://github.com/openxla/xla/assets/43043975/f5ae19c9-851d-44fe-bf75-9a456425a387)

This way , it s difficult to see a) target of custom call b) can't group operations based on type of hlo operation and hence makes op list huge

This PR addresses these problems and now you should see tensorboard trace view as :
![image](https://github.com/openxla/xla/assets/43043975/5a10b3ef-ea61-4ac5-90e3-92ead1354aa4)

Copybara import of the project:

--
8cd6363890107ce7af2ed3630d577ad4d20a745b by Sachin Muradi <sachin.muradi@intel.com>:

Update Trace emitter

--
a2813ad7439da8934b98b12623193c3b120eb9ca by Sachin Muradi <sachin.muradi@intel.com>:

more update

Merging this change closes #10164

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca
PiperOrigin-RevId: 622874494
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 9, 2024
Imported from GitHub PR openxla/xla#10164

This PR improves tracing a bit.

Without this PR, you would see operations in trace view as :
![image](https://github.com/openxla/xla/assets/43043975/f5ae19c9-851d-44fe-bf75-9a456425a387)

This way , it s difficult to see a) target of custom call b) can't group operations based on type of hlo operation and hence makes op list huge

This PR addresses these problems and now you should see tensorboard trace view as :
![image](https://github.com/openxla/xla/assets/43043975/5a10b3ef-ea61-4ac5-90e3-92ead1354aa4)

Copybara import of the project:

--
8cd6363890107ce7af2ed3630d577ad4d20a745b by Sachin Muradi <sachin.muradi@intel.com>:

Update Trace emitter

--
a2813ad7439da8934b98b12623193c3b120eb9ca by Sachin Muradi <sachin.muradi@intel.com>:

more update

Merging this change closes #10164

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca
PiperOrigin-RevId: 622874494
@copybara-service copybara-service bot closed this in 9945521 Apr 9, 2024
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 9, 2024
FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca
PiperOrigin-RevId: 622356332
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 9, 2024
FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca
PiperOrigin-RevId: 623012044
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 9, 2024
FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca
PiperOrigin-RevId: 622332776
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 9, 2024
FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca
PiperOrigin-RevId: 622357346
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 9, 2024
FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#10164 from Intel-tensorflow:sachin/tracemeupdate a2813ad7439da8934b98b12623193c3b120eb9ca
PiperOrigin-RevId: 622513596
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 9, 2024
Imported from GitHub PR openxla/xla#10164

This PR improves tracing a bit.

Without this PR, you would see operations in trace view as :
![image](https://github.com/openxla/xla/assets/43043975/f5ae19c9-851d-44fe-bf75-9a456425a387)

This way , it s difficult to see a) target of custom call b) can't group operations based on type of hlo operation and hence makes op list huge

This PR addresses these problems and now you should see tensorboard trace view as :
![image](https://github.com/openxla/xla/assets/43043975/5a10b3ef-ea61-4ac5-90e3-92ead1354aa4)

Copybara import of the project:

--
8cd6363890107ce7af2ed3630d577ad4d20a745b by Sachin Muradi <sachin.muradi@intel.com>:

Update Trace emitter

--
a2813ad7439da8934b98b12623193c3b120eb9ca by Sachin Muradi <sachin.muradi@intel.com>:

more update

Merging this change closes #10164

PiperOrigin-RevId: 623095394
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants