benjaminglass1
Collaborator

@benjaminglass1 benjaminglass1 commented May 28, 2025

Stack from ghstack (oldest at bottom):

The main change of this commit is building cpp_wrapper code asynchronously, with a thread for the wrapper code, a thread for the kernel code, and a final thread to link. This improves build speeds, even while enabling an LTO link to improve performance.
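
As a rough illustration of the three-stage build described above (not the code in this PR), the pipeline could be sketched with `concurrent.futures`. The names `compile_unit`, `link_objects`, and `build_async` are hypothetical, and the compile and link steps are stand-ins that only return strings rather than invoking a compiler.

```python
from __future__ import annotations

from concurrent.futures import Future, ThreadPoolExecutor


def compile_unit(name: str) -> str:
    # Hypothetical stand-in for compiling one C++ source file to an object file.
    return f"{name}.o"


def link_objects(objects: list[str]) -> str:
    # Hypothetical stand-in for the final link step (where LTO would run).
    return "+".join(objects) + ".so"


def build_async(pool: ThreadPoolExecutor) -> Future[str]:
    # Compile the wrapper code and the kernel code concurrently.
    wrapper = pool.submit(compile_unit, "wrapper")
    kernel = pool.submit(compile_unit, "kernel")

    def link() -> str:
        # Only this third task blocks until both object files are ready.
        return link_objects([wrapper.result(), kernel.result()])

    # The link task is submitted last, after both compile tasks are queued.
    return pool.submit(link)


with ThreadPoolExecutor(max_workers=3) as pool:
    artifact = build_async(pool).result()
```

Returning a `Future` lets the caller keep doing other work and only wait on the linked artifact when it is actually needed.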

Additionally, significantly reduce the cold-start time for detecting compatible precompiled headers. We had accidentally been preprocessing the underlying header with -O3, which is unnecessary when the goal is simply to detect content changes.
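
A minimal sketch of the idea, under stated assumptions: the header is preprocessed without optimization flags and the output is hashed to decide whether a cached precompiled header can be reused. The helper names and the hashing scheme are assumptions for illustration, not Inductor's actual implementation; `-E` and `-P` are the standard GCC/Clang options for preprocess-only output without linemarkers.

```python
from __future__ import annotations

import hashlib
import subprocess


def preprocess_command(compiler: str, header: str, flags: list[str]) -> list[str]:
    # Drop optimization flags such as -O3; they only add cost when the goal
    # is to detect content changes (assumption made for this sketch).
    cheap_flags = [f for f in flags if not f.startswith("-O")]
    # -E stops after preprocessing; -P omits linemarkers from the output.
    return [compiler, *cheap_flags, "-E", "-P", header]


def preprocess(compiler: str, header: str, flags: list[str]) -> str:
    # Runs the compiler; requires the compiler and header to exist.
    cmd = preprocess_command(compiler, header, flags)
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def header_fingerprint(preprocessed: str) -> str:
    # Hash the preprocessed text so it can be compared against a cached value.
    return hashlib.sha256(preprocessed.encode()).hexdigest()


def pch_is_reusable(preprocessed: str, cached_fingerprint: str) -> bool:
    # The cached precompiled header is reusable only if the content is unchanged.
    return header_fingerprint(preprocessed) == cached_fingerprint
```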

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @Lucaskabela


pytorch-bot bot commented May 28, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154551

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 04c49e9 with merge base e785c08:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

benjaminglass1 added a commit that referenced this pull request May 29, 2025
Appears to help compile-time performance on CPU.

ghstack-source-id: 94d2cad
Pull Request resolved: #154551
benjaminglass1 added a commit that referenced this pull request May 30, 2025
Appears to help compile-time performance on CPU.

ghstack-source-id: d589a24
Pull Request resolved: #154551
benjaminglass1 added a commit that referenced this pull request May 31, 2025
Appears to help compile-time performance on CPU.

ghstack-source-id: 921544f
Pull Request resolved: #154551
benjaminglass1 added a commit that referenced this pull request Jun 4, 2025
Appears to help compile-time performance on CPU.

ghstack-source-id: a4cb258
Pull Request resolved: #154551
benjaminglass1 added a commit that referenced this pull request Jun 9, 2025
Appears to help compile-time performance on CPU.

ghstack-source-id: 38da1c6
Pull Request resolved: #154551
benjaminglass1 added a commit that referenced this pull request Jun 10, 2025
Appears to help compile-time performance on CPU.

ghstack-source-id: a5363e6
Pull Request resolved: #154551
benjaminglass1 added a commit that referenced this pull request Jul 10, 2025
Appears to help compile-time performance on CPU.

ghstack-source-id: 80067c1
Pull Request resolved: #154551
Comment on lines +1606 to +1631
+        if _IS_WINDOWS:
+            # https://learn.microsoft.com/en-us/cpp/build/walkthrough-compile-a-c-program-on-the-command-line?view=msvc-1704
+            # https://stackoverflow.com/a/31566153
+            cmd = (
+                f"{self._compiler} {self._include_dirs_args} {self._definitions_args} "
+                f"{self._cflags_args} {self._sources_args} "
+                f"{self._passthrough_parameters_args} {self._output}"
+            )
+            if self._do_link:
+                cmd += (
+                    f" /LD /link {self._libraries_dirs_args} {self._libraries_args} "
+                    f"{self._ldflags_args}"
+                )
-            if self._do_link:
-                cmd += f" {ldflags_args} {libraries_args} {libraries_dirs_args}"
-            return cmd
-
-        command_line = format_build_command(
-            compiler=self._compiler,
-            sources=self._sources_args,
-            include_dirs_args=self._include_dirs_args,
-            definitions_args=self._definitions_args,
-            cflags_args=self._cflags_args,
-            ldflags_args=self._ldflags_args,
-            libraries_args=self._libraries_args,
-            libraries_dirs_args=self._libraries_dirs_args,
-            passthrough_args=self._passthrough_parameters_args,
-            output=self._output,
+            return normalize_path_separator(cmd)
+
+        cmd = (
+            f"{self._compiler} {self._sources_args} {self._definitions_args} "
+            f"{self._cflags_args} {self._include_dirs_args} "
+            f"{self._passthrough_parameters_args} {self._output}"
         )
-        return command_line
+        if self._do_link:
+            cmd += (
+                f" {self._ldflags_args} {self._libraries_args} "
+                f"{self._libraries_dirs_args}"
+            )
+        return cmd
Collaborator Author

This is functionally neutral; it just removes an interior function that added extra indentation and was only called here.

benjaminglass1 added a commit that referenced this pull request Jul 15, 2025
Appears to help compile-time performance on CPU.

ghstack-source-id: a80d586
Pull Request resolved: #154551
benjaminglass1 added a commit that referenced this pull request Jul 28, 2025
Appears to help compile-time performance on CPU.

ghstack-source-id: ffb990e
Pull Request resolved: #154551
benjaminglass1 added a commit that referenced this pull request Jul 29, 2025
Appears to help compile-time performance on CPU.

ghstack-source-id: 6a69685
Pull Request resolved: #154551
benjaminglass1 added a commit that referenced this pull request Jul 30, 2025
The main change of this commit is building cpp_wrapper code asynchronously, with a thread for the wrapper code, a thread for the kernel code, and a final thread to link. This improves build speeds, even while enabling an LTO link to improve performance.

Additionally, significantly reduce the cold-start time for detecting compatible precompiled headers. We had accidentally preprocessed the underlying header in -O3, which is deeply unnecessary to simply detect content changes.

ghstack-source-id: ad74c56
Pull Request resolved: #154551
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Sep 28, 2025