Skip to content

Conversation

wfu-fb
Copy link
Contributor

@wfu-fb wfu-fb commented Jan 25, 2024

There is a need for users to pick their own addr2line binary in their deployment due to reasons like default addr2line being too slow etc... This option would allow user quickly experiment other alternatives.

@wfu-fb wfu-fb requested a review from aaronenyeshi as a code owner January 25, 2024 23:06
Copy link

pytorch-bot bot commented Jan 25, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/118328

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 2b03626 with merge base 25f7219 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Copy link

linux-foundation-easycla bot commented Jan 25, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

Copy link
Contributor

@zdevito zdevito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making it possible to use faster addr2line implementations. I have a few minor comments below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Torch already has some built in functions to get environment variables. Is one of them appropriate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, only reason I did this is to copy the style in torch/csrc/utils/cpp_stacktraces.cpp. Will be happy to use these built-ins, which one you are referring to here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of ways this binary might not work (mightn't be a file, might not be executable, might not exist). But this only checks for one. It might be preferable to just let the call to addr2line fail when invoked rather than try to enumerate all possible failures here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was also wondering what's the best expectation of this "precheck" step when writing this part. Also I am wondering if we should do this limited scope precheck in the very beginning of the torch initialization phase, because I want to best mitigate the case that after a long run in the end when we have something interesting to dump only to find out we had a typo in the binary path which ruined the chance to collect info. I was originally planning to fail the execution completely on the spot to make it "fail fast" instead of the current falling back to the default addr2line with a few warning lines. Curious to see your take here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The argument here can just be replaced with addr2line_binary_

@XilunWu
Copy link
Contributor

XilunWu commented Jan 26, 2024

@pytorchbot rebase

@pytorchmergebot
Copy link
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Copy link
Collaborator

Successfully rebased main onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout main && git pull --rebase)

@wfu-fb
Copy link
Contributor Author

wfu-fb commented Jan 27, 2024

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jan 27, 2024
@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team Raised by workflow job

@wfu-fb
Copy link
Contributor Author

wfu-fb commented Jan 28, 2024

@pytorchbot label "topic: not user facing"

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Jan 28, 2024
@aaronenyeshi
Copy link
Member

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

jeffdaily pushed a commit to ROCm/pytorch that referenced this pull request Feb 8, 2024
There is a need for users to pick their own addr2line binary in their deployment due to reasons like default addr2line being too slow etc... This option would allow user quickly experiment other alternatives.
Pull Request resolved: pytorch#118328
Approved by: https://github.com/zdevito, https://github.com/aaronenyeshi
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request Merged topic: not user facing topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants