@colesbury
Member

Summary: Shrink binary size to reduce relocation overflows. The most important change is to split intrusive_ptr::reset_() into two functions and mark the bigger one as C10_NOINLINE.

Differential Revision: D87308588
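The split described in the summary can be sketched roughly as follows. This is a minimal illustration of the general hot/cold splitting technique, not the actual `intrusive_ptr` code: `Counted`, `SketchPtr`, `release_slow`, and `SKETCH_NOINLINE` are hypothetical names, and the macro is only assumed to behave like `C10_NOINLINE`. The small inlined `reset()` does the null check, while the larger release logic lives in a single out-of-line function so it is not duplicated at every call site.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for C10_NOINLINE (assumption, not the real macro).
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define SKETCH_NOINLINE __declspec(noinline)
#else
#define SKETCH_NOINLINE
#endif

// Hypothetical refcounted object; not c10::intrusive_ptr_target.
struct Counted {
  std::atomic<std::uint32_t> refcount{1};
  bool* destroyed = nullptr;  // optional flag so callers can observe deletion
  ~Counted() {
    if (destroyed != nullptr) {
      *destroyed = true;
    }
  }
};

// Hypothetical smart pointer that adopts an object whose refcount is already 1.
class SketchPtr {
 public:
  explicit SketchPtr(Counted* p) noexcept : target_(p) {}
  SketchPtr(const SketchPtr& other) noexcept : target_(other.target_) {
    if (target_ != nullptr) {
      target_->refcount.fetch_add(1, std::memory_order_relaxed);
    }
  }
  SketchPtr& operator=(const SketchPtr&) = delete;
  ~SketchPtr() { reset(); }

  // Small part: cheap enough to inline at every call site.
  void reset() noexcept {
    if (target_ != nullptr) {
      release_slow(target_);
      target_ = nullptr;
    }
  }

 private:
  // Bigger part: emitted once, out of line, to keep call sites small.
  SKETCH_NOINLINE static void release_slow(Counted* t) noexcept {
    assert(t != nullptr);
    if (t->refcount.fetch_sub(1, std::memory_order_acq_rel) == 1) {
      delete t;
    }
  }

  Counted* target_;
};
```

Where exactly the real change draws the line between the inlined and out-of-line parts is not stated in this conversation; the trade-off is an extra function call on the release path in exchange for less duplicated code.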

@pytorch-bot

pytorch-bot bot commented Nov 18, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/168080

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 5a562a7 with merge base a4e0720 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-codesync

meta-codesync bot commented Nov 18, 2025

@colesbury has exported this pull request. If you are a Meta employee, you can view the originating Diff in D87308588.

@meta-codesync

meta-codesync bot commented Nov 18, 2025

@colesbury has imported this pull request. If you are a Meta employee, you can view this in D87308588.

Collaborator

@albanD albanD left a comment

Sounds good if we can't measure impact on practical workloads!

@pytorch-bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Nov 18, 2025
- if (C10_UNLIKELY(
-         detail::has_pyobject(combined) &&
-         detail::refcount(combined) == 2)) {
+ if (detail::has_pyobject(combined) && detail::refcount(combined) == 2) {
Collaborator

Why drop unlikely?

Member Author

It increases the binary size without a corresponding improvement in performance. There wasn't really a good reason to add it in the first place.
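For readers unfamiliar with branch hints, here is a small sketch of what such a macro typically does. It assumes a `__builtin_expect`-based definition on GCC and Clang; `SKETCH_UNLIKELY`, the helper functions, and the bit layout of `combined` are all hypothetical stand-ins, not the real `C10_UNLIKELY` or `detail::` helpers. The hint changes only how the compiler lays out code, never the result of the condition.

```cpp
#include <cstdint>

// Hypothetical stand-in for C10_UNLIKELY (assumed __builtin_expect-based).
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(expr) (__builtin_expect(static_cast<bool>(expr), 0))
#else
#define SKETCH_UNLIKELY(expr) (expr)
#endif

// Invented bit layout for illustration only: top bit flags a Python object,
// low 32 bits hold the reference count.
constexpr std::uint64_t kHasPyObjectBit = 1ull << 63;

inline bool has_pyobject(std::uint64_t combined) {
  return (combined & kHasPyObjectBit) != 0;
}

inline std::uint32_t refcount(std::uint64_t combined) {
  return static_cast<std::uint32_t>(combined & 0xffffffffull);
}

// With the hint: the compiler is told the branch is rarely taken.
inline bool check_with_hint(std::uint64_t combined) {
  if (SKETCH_UNLIKELY(has_pyobject(combined) && refcount(combined) == 2)) {
    return true;
  }
  return false;
}

// Without the hint: same semantics, layout left to the compiler.
inline bool check_without_hint(std::uint64_t combined) {
  if (has_pyobject(combined) && refcount(combined) == 2) {
    return true;
  }
  return false;
}
```

Both functions return the same value for every input, so dropping the hint is purely a code-generation decision.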

@Skylion007
Collaborator

If we really care about binary size, a much faster win would be to enable Link Time Optimization on the binaries. I have an open issue for this; it was last blocked by IntelOneLib/DnnLib having issues with a really large statically linked library on MSVC, and by old clang compiler issues (the latter should be fixed now?).

Summary:
Shrink binary size to reduce relocation overflows. The most important change is to split `intrusive_ptr::reset_()` into two functions and mark the bigger one as `C10_NOINLINE`.


Reviewed By: albanD

Differential Revision: D87308588

Pulled By: colesbury
@colesbury
Member Author

Ugh... I messed up the CI by pushing the same commit to another PR, so I'm re-exporting now.

  friend class pybind11::class_;

- void retain_() {
+ void retain_() noexcept {
Contributor

it's not actually noexcept in debug build 🤣

Member Author

I think that noexcept means that any escaping exceptions aren't recoverable (they trigger std::terminate). I don't think it means that the function never throws an exception.

The other non-trivial functions here (including reset_() and ~intrusive_ptr()) can also throw exceptions in rare circumstances, such as broken assertions.
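A small sketch of the `noexcept` semantics discussed here. The function names are hypothetical, and `check_positive` merely stands in for a debug-only assertion that throws. The point is that a `noexcept` function may still call code that throws: the specifier changes what happens to an escaping exception (`std::terminate`) and what the `noexcept` operator reports, not whether a throw can occur inside the body.

```cpp
#include <stdexcept>

// Hypothetical helper standing in for a debug-only assertion that throws.
inline void check_positive(int refcount) {
  if (refcount <= 0) {
    throw std::logic_error("refcount must be positive");
  }
}

// Declared noexcept even though a callee can throw. If check_positive()
// throws here, the exception cannot escape: std::terminate is called.
inline int retain_sketch(int refcount) noexcept {
  check_positive(refcount);
  return refcount + 1;
}

// Same body without noexcept: the exception propagates to the caller.
inline int retain_sketch_throwing(int refcount) {
  check_positive(refcount);
  return refcount + 1;
}

// The noexcept operator reflects the declaration, not the function body.
static_assert(noexcept(retain_sketch(1)), "declared noexcept");
static_assert(!noexcept(retain_sketch_throwing(1)), "not declared noexcept");
```

Calling `retain_sketch(0)` would terminate the program instead of unwinding, which matches the "not recoverable" reading above.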

@facebook-github-bot
Contributor

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@colesbury colesbury deleted the export-D87308588 branch November 19, 2025 05:34

Labels

ciflow/trunk, fb-exported, Merged, meta-exported, topic: not user facing

7 participants