[pull] main from pytorch:main#87

Merged
pull[bot] merged 4 commits into ais-developer:main from pytorch:main
Jun 3, 2025

Conversation

pull[bot] commented Jun 3, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

BoyuanFeng and others added 4 commits June 3, 2025 07:40
With graph partition, `write_get_raw_stream_header_once` runs only once, so the autotune code may not contain the header. This PR additionally calls `write_get_raw_stream_header` in `codegen_device_guard_enter` before `get_raw_stream` is used.

Pull Request resolved: #154698
Approved by: https://github.com/oulgen
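
The "write once" problem described above can be illustrated with a toy emitter. This is only a sketch under stated assumptions: `HeaderEmitterSketch`, its buffers, and the placeholder header string are hypothetical and are not Inductor's actual classes or generated code. It shows why a header written once into one buffer can be missing from a separately emitted buffer, and how writing it again on device-guard entry makes that buffer self-contained.

```python
class HeaderEmitterSketch:
    """Toy stand-in for a wrapper code generator (hypothetical, not Inductor's class)."""

    # Placeholder for whatever line makes get_raw_stream available.
    HEADER = "from some_runtime import get_raw_stream  # placeholder"

    def __init__(self):
        self.main_lines = []      # code for the main wrapper
        self.autotune_lines = []  # code emitted separately for autotuning
        self._header_written = False

    def write_header_once(self):
        # Writes the header into the main buffer a single time only.
        if not self._header_written:
            self.main_lines.append(self.HEADER)
            self._header_written = True

    def write_header(self, lines):
        # Unconditionally writes the header into the given buffer.
        lines.append(self.HEADER)

    def device_guard_enter(self, lines, device_idx):
        # Emit the header right before get_raw_stream is used, so the
        # target buffer does not rely on the once-only header elsewhere.
        self.write_header(lines)
        lines.append(f"stream{device_idx} = get_raw_stream({device_idx})")
```

In this sketch the header may appear more than once across buffers; that duplication is harmless for an import-style line and is the price of each buffer being runnable on its own.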
As noted in #111471 (comment), the tests are failing because of hypothesis. This PR skips those tests.
Pull Request resolved: #152819
Approved by: https://github.com/eqy
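
For readers unfamiliar with test skips, a minimal sketch using the standard library's `unittest.skip` decorator follows. The class and test names are hypothetical, and the PR may use a different skip mechanism; the sketch only shows the general pattern of marking a known-failing test as skipped with a reason.

```python
import unittest


class SkippedTestSketch(unittest.TestCase):
    # Hypothetical test: the reason string points at the tracking issue.
    @unittest.skip("fails due to hypothesis; see #111471")
    def test_known_failing(self):
        self.fail("not reached while the skip is in place")

    def test_unaffected(self):
        # Other tests in the same class keep running normally.
        self.assertEqual(1 + 1, 2)
```

A skipped test is reported separately from failures, so the suite still counts as successful while the skip is in place.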
#154764)

Profiler traces showed that guard overhead at runtime was higher than
what this profiling function reported at compile time. After
investigation, we found that f_locals were already in the cache, which
made the guard overhead look much smaller when profiled during
compilation. To be more realistic, we flush the cache here.

Profiling the guard overhead during compilation (in addition to at
runtime) allows faster iteration and enables logging to tlparse and
internal databases.

Pull Request resolved: #154764
Approved by: https://github.com/zou3519, https://github.com/jansel, https://github.com/StrongerXi
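
The warm-cache effect described in this commit can be sketched in plain Python. Everything here is an assumption for illustration: `evict_cpu_cache`, `profile_guard`, and `example_guard` are hypothetical names, and copying a large buffer is only a crude heuristic for pushing data out of CPU caches; it is not the mechanism the PR uses.

```python
import time

# Assumed to be larger than the last-level CPU cache on typical machines.
_EVICTION_BUFFER = bytearray(64 * 1024 * 1024)


def evict_cpu_cache():
    # Copying the whole buffer reads and writes enough memory that
    # previously cached data is likely (not guaranteed) to be evicted.
    return len(bytes(_EVICTION_BUFFER))


def example_guard(f_locals):
    # Hypothetical guard: checks the type and value of one local.
    return isinstance(f_locals["x"], int) and f_locals["x"] == 3


def profile_guard(guard, f_locals, n_iters=10, flush=True):
    """Return the mean wall-clock seconds per guard call."""
    total = 0.0
    for _ in range(n_iters):
        if flush:
            evict_cpu_cache()  # time the guard against a colder cache
        start = time.perf_counter()
        guard(f_locals)
        total += time.perf_counter() - start
    return total / n_iters
```

Comparing `profile_guard(example_guard, {"x": 3}, flush=False)` with `flush=True` gives a rough sense of how much a warm cache can flatter the measurement, though results vary by machine and Python build.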
pull[bot] added the ⤵️ pull label Jun 3, 2025
pull[bot] merged commit 635b73e into ais-developer:main Jun 3, 2025
4 participants