Closed · 715 commits
de893e9
Always build USE_DISTRIBUTED. (#160449)
ezyang Sep 4, 2025
01edcd4
Make distributed modules importable even when backend not built (#159…
ezyang Sep 4, 2025
2fa0520
[BE][pytree] cleanup parameterized pytree tests (#160842)
XuehaiPan Sep 5, 2025
92a4302
[cutlass backend] Add FP8 tests for multiple linears (#160782)
henrylhtsang Sep 4, 2025
771f369
[Inductor] Improve RoPE (#161420)
BoyuanFeng Sep 5, 2025
c10195e
[C10d][Gloo] Enable complex datatype support in ProcessGroupGloo (#15…
shunzhiwen Sep 5, 2025
a00cdc1
[CD][BE] Get rid of SETUPTOOLS and PYYAML extra pins (#162266)
malfet Sep 5, 2025
70d36e0
Making batching rule for F.embedding DTensor-aware (#162117)
zou3519 Sep 4, 2025
79fcd52
symbolic cpp channels_last_contiguous (#160402)
laithsakka Sep 5, 2025
01ab325
[DCP][Quantization] Fix the issue when scale vector is in a different…
saumishr Sep 5, 2025
e0a62b2
[aot-precompile] default-filter global guards (#162090)
dolpm Sep 5, 2025
8d50355
[CD][EZ] Update libtorch python version to 3.10 (#162297)
malfet Sep 5, 2025
9c03d6b
[CD][BE] Delete Python-3.9 case (#162265)
malfet Sep 5, 2025
4d4abec
allow user to pass in custom partitioner function (#157580)
xuanzhang816 Sep 5, 2025
486b20b
Add return-max-scores to flex-attention (#161667)
drisspg Sep 5, 2025
081cab0
Resize to 0 if not going to be used (#161730)
drisspg Sep 5, 2025
1463714
[dynamo] Graph break on on user-defined class in compiled region (#16…
rtimpe Sep 4, 2025
4f72d93
re-land triton runtime implementation" (#162217)
dolpm Sep 6, 2025
0f45aaf
Disable autocast when running joint graph passes (#162304)
yf225 Sep 6, 2025
7f4ff79
remove deprecated vllm test (#162306)
yangw-dev Sep 6, 2025
291cd11
[inductor] estimate peak memory in codegen only when buffer reuse (#1…
ruisizhang123 Sep 6, 2025
145a3a7
[CUDA 13][cuDNN] Bump CUDA 13 to cuDNN 9.13.0 (#162268)
eqy Sep 6, 2025
c3ceca2
codebase structure documentation to include torchgen (#162261)
Raman-RH Sep 6, 2025
20629b1
Add contiguous subgraph transformation threshold (#162192)
exclamaforte Sep 6, 2025
b2b4add
Docs on export joint with descriptors (#159006)
ezyang Aug 12, 2025
c0983e6
[Graph Partition] interface for custom cg wrapper (#162207)
BoyuanFeng Sep 6, 2025
a3e5466
Revert "Resize to 0 if not going to be used (#161730)"
pytorchmergebot Sep 6, 2025
da4db4b
Fix `DeviceMesh._flatten` docstring example (#162277)
mariosasko Sep 6, 2025
20b47ac
[fx] fix qualified name for methods of torch.Tensor (#162224)
isuruf Sep 4, 2025
aac1a50
Add api info for torch._C._nn.pyi (#162148)
orangeH25 Sep 6, 2025
bc50597
torch.zeros bound checks for symint (#161976)
morrison-turnansky Sep 6, 2025
c98ddac
Fixed comment to match logic in distributed_c10d.py (#162158)
Codeboi007 Sep 6, 2025
28f4ab0
Add -Wno-ctad-maybe-unsupported compiler flag (#162223)
0xjeffro Sep 6, 2025
0ff8eab
Revert "[dynamo] Graph break on on user-defined class in compiled reg…
pytorchmergebot Sep 6, 2025
9aedb3c
[AOTI-FX] Support registering custom FX backends (#162317)
blaine-rister Sep 6, 2025
5985e28
[CUDA 13][cuDNN][Windows] Roll back cuDNN upgrade from 9.13 to 9.12 o…
eqy Sep 6, 2025
b6d0a9e
MXFP8 grouped GEMM support for torch._scaled_grouped_mm + submodule b…
danielvegamyhre Sep 6, 2025
ae0edc1
[3/N] Enable 6 fsdp test on Intel GPU (#161601)
daisyden Sep 6, 2025
047603d
New export implementation with flat inp/out (#162167)
tugsbayasgalan Sep 4, 2025
541aa23
[inductor] fix TemplateBuffer.extract_read_writes (#162221)
shunting314 Sep 6, 2025
1a588ac
[inductor] rename deps during refreshing (#162303)
shunting314 Sep 6, 2025
5927a70
NLLLoss: validate target is 0D when input is 1D (#161412)
mansiag05 Sep 6, 2025
48e3be3
[while_loop][autograd] add hop while_loop_stack_output (#160467)
ydwu4 Sep 5, 2025
2b8a839
[while_loop][autograd] support autograd_key of while_loop (#160483)
ydwu4 Sep 5, 2025
5211f1f
[export] Move example inputs in move_to_device_pass (#162301)
yiming0416 Sep 6, 2025
e3068cd
[dynamo] Use relaxed CLOSURE_MATCH guard then ID_MATCH (#162247)
anijain2305 Sep 5, 2025
b919560
[nativert] AOTI lowering and packaging as NativeRT delegate (#162285)
yiming0416 Sep 7, 2025
2a45837
[inductor] fuse for scalar shared data (#162311)
shunting314 Sep 6, 2025
fea2077
[vllm hash update] update the pinned vllm hash (#162314)
pytorchupdatebot Sep 7, 2025
eac3d6f
Revert "[inductor] fuse for scalar shared data (#162311)"
pytorchmergebot Sep 7, 2025
104f268
Revert "Add return-max-scores to flex-attention (#161667)"
pytorchmergebot Sep 7, 2025
93fb23d
Build vLLM nightly wheels (#162000)
huydhn Sep 7, 2025
ada43ed
Revert "[inductor] pdl inductor option (disabled by default) (#160928)"
pytorchmergebot Sep 7, 2025
7a83cf4
Revert " [while_loop][autograd] support autograd_key of while_loop (#…
pytorchmergebot Sep 7, 2025
9ad5e8e
Improve typing of ONNX decorators with ParamSpec (#162332)
Vinayak-Pawar Sep 7, 2025
4348db0
Revert "[inductor][ez] V.choices.get_mm_configs returns list of Choic…
pytorchmergebot Sep 7, 2025
093ab5f
Revert "[inductor] add kernel template choice (ktc) (#161347)"
pytorchmergebot Sep 7, 2025
df59c21
Revert "[BE] Cleanup stale comments/copy from `gemm` (#162001)"
pytorchmergebot Sep 7, 2025
e246a85
Revert "[1/N] Port 5 _composable/fsdp distributed test cases to Intel…
pytorchmergebot Sep 7, 2025
8235c4f
Revert "[ROCm] Enabling several UTs (#161715)"
pytorchmergebot Sep 7, 2025
ff2de5d
Revert "[2/N]Port several test files under test/distributed to Intel …
pytorchmergebot Sep 7, 2025
ec2e368
[while_loop][autograd] support autograd_key of while_loop (#160483)
ydwu4 Sep 7, 2025
eb9073a
[easy] [precompile] Convert CompileArtifacts to callable (#162169)
jamesjwu Sep 7, 2025
5babb4d
Add BundledAOTAutogradSerializableCallable (#162170)
jamesjwu Sep 7, 2025
103f725
[associative_scan] Autograd separated (#139939)
bohnstingl Sep 8, 2025
c9ac8c2
[audio hash update] update the pinned audio hash (#162315)
pytorchupdatebot Sep 8, 2025
29e09a6
Revert "Make distributed modules importable even when backend not bui…
pytorchmergebot Sep 8, 2025
1e0656f
Revert "Always build USE_DISTRIBUTED. (#160449)"
pytorchmergebot Sep 8, 2025
fb0afa8
[inductor][triton] more JITCallable._hash_lock support (#162244)
davidberard98 Sep 5, 2025
31d5c67
[inductor][triton] support static cuda launcher after triton # 7866 (…
davidberard98 Sep 5, 2025
5b90e85
[AsyncTP] Fixes AsyncMM (#162040)
fegin Sep 8, 2025
32911ff
[xla hash update] update the pinned xla hash (#162372)
pytorchupdatebot Sep 8, 2025
e101411
Update slow tests (#161395)
pytorchupdatebot Sep 8, 2025
3f59933
[upstream triton] update triton pin to triton 3.5 (#162278)
davidberard98 Sep 5, 2025
25c170b
[inductor] Runtime estimations: use nccl estimator; mm only benchmark…
IvanKobzarev Sep 8, 2025
53297f6
Revert "[audio hash update] update the pinned audio hash (#162315)"
pytorchmergebot Sep 8, 2025
a92773e
Revert "Use vectorized stores for all dtypes in cat (#161649)"
pytorchmergebot Sep 8, 2025
f044fa2
[AsyncTP] Use assertEqual instead of allClose for bf16 tests (#162041)
fegin Sep 8, 2025
8e076d8
Don't call check_has_torch_dispatch in THPVariable_NewWithVar if we a…
swolchok Sep 6, 2025
49c446c
Add C++ function for torch.distributed.tensor._op_schema.is_view_op (…
swolchok Sep 6, 2025
5793dd7
[Intel GPU] Integrate OneDNN SDPA training forward and backward (#161…
LuFinch Sep 8, 2025
ebd29a1
[inductor] fuse for scalar shared data (#162311)
shunting314 Sep 8, 2025
72e6717
Avoid crash with release_available_cached_blocks (#162269)
morrison-turnansky Sep 8, 2025
de5dc1f
[cuDNN][SDPA][Nested Tensor] add forward/backward caching support for…
eqy Sep 8, 2025
bc4176c
CD Windows CUDA 13.0 build - fix packaging of cuda dlls (#162383)
atalman Sep 8, 2025
314d47a
[audio hash update] update the pinned audio hash (#162315)
pytorchupdatebot Sep 8, 2025
fbcabb4
Handle f([]) vs. f() in fake tensor caching (#162284)
angelayi Sep 8, 2025
d80297a
Always build USE_DISTRIBUTED. (#160449)
ezyang Sep 8, 2025
a0d0266
Make distributed modules importable even when backend not built (#159…
ezyang Sep 8, 2025
4e50651
[DTensor] fix F.one_hot (#162307)
zou3519 Sep 8, 2025
9c991b6
[CD] [aarch64] Add CUDA 12.6 and 12.8 to build matrix, remove 12.9 bu…
tinglvv Sep 8, 2025
8ec01f3
[bucketing] custom_ops mode to hide inductor copies overhead (#161499)
IvanKobzarev Sep 8, 2025
ec2c137
[BE]: Update cudnn frontend submodule to 1.14.1 (#162347)
Skylion007 Sep 8, 2025
8f11465
Add std::any_of to ConvParams struct (#162334)
benjaminglass1 Sep 6, 2025
26a1b9c
[dynamo] fix resume_execution.py KeyError in Python 3.11+ (#162318)
williamwen42 Sep 8, 2025
015423b
Add fp16-overflow regression test (#162401)
malfet Sep 8, 2025
5d819f3
Revert "[associative_scan] Autograd separated (#139939)"
pytorchmergebot Sep 8, 2025
dd44faa
Revert "Modify ROCm MI2xx-based workflows to run on cron schedule (#1…
pytorchmergebot Sep 8, 2025
fecd968
Graph split event tracker (#159795)
haowu14 Sep 8, 2025
85fe94e
make should_swap more dde friendly (#162099)
laithsakka Sep 8, 2025
2c538c9
rewrite __maybe_broadcast should_expand check for unbacked (#162109)
laithsakka Sep 8, 2025
711c8c8
shape guards (#161178)
avikchaudhuri Sep 8, 2025
ac9ccd0
Add return-max-scores to flex-attention (#161667)
drisspg Sep 8, 2025
5fd6b6a
[refactor] add helper sizevars function, is_size_one, for size==1 che…
ColinPeppler Sep 8, 2025
189a054
Remove guard_size_oblivious from default contiguity python check, and…
laithsakka Sep 8, 2025
07f0730
[associative_scan] Autograd separated (#139939)
bohnstingl Sep 8, 2025
c0fc86b
Fix aarch64 wheel pack (#159481)
atalman Sep 8, 2025
8485aac
[precompile] Fix inlined source tracking with generators. (#162389)
zhxchen17 Sep 9, 2025
897c4e7
Move to small wheel approach for CUDA SBSA wheel (#160720)
tinglvv Sep 9, 2025
ed77e23
Revert "[dynamo] Constant fold torch.autograd._profiler_enabled (#158…
pytorchmergebot Sep 9, 2025
6eb14ac
[Inductor] Fix cross-device scalar lowering - cpu scalar with cuda te…
karthickai Sep 8, 2025
a951f43
Avoid redundant PyTuple_GetSize call in _maybe_handle_torch_function …
swolchok Sep 8, 2025
eab2afe
fastpath type Tensor in THPVariable_NewWithVar (#161634)
swolchok Sep 8, 2025
12db2a7
Call checkLong in is_int_or_symint, completing TODO (#161692)
swolchok Sep 8, 2025
a8a187b
Overload _get_operation_for_overload_or_packet & friends to accept Ar…
swolchok Sep 8, 2025
e025c0f
Dynamo: set_eval_frame microoptimization (#162220)
swolchok Sep 8, 2025
583bbf7
[MPS] Add `native_dropout` and `native_dropout_backward` (#162108)
kurtamohler Sep 8, 2025
a965f09
[export] Update PT2 archive docs (#162308)
yiming0416 Sep 9, 2025
d8b6622
testing infra and some fixes (#162183)
tugsbayasgalan Sep 8, 2025
7b8a645
[inductor] fix 3d tiled online softmax (#162341)
shunting314 Sep 8, 2025
1641606
Revert "Add BundledAOTAutogradSerializableCallable (#162170)"
pytorchmergebot Sep 9, 2025
4c45090
[DTensor] Check if tracing for sharding propagation to handle unhasha…
azahed98 Sep 9, 2025
98ecc0f
[SymmMem] Add team pool to hold duplicated teams for the same rank gr…
kwen2501 Sep 8, 2025
065c446
[SymmMem] Use global pe for put and get (#162394)
kwen2501 Sep 8, 2025
847d7f2
[CUDA-13] Implement workaround for cudaErrorNotSupported (#162412)
malfet Sep 9, 2025
f216d64
[SymmMem] Better tuning of A2AV based on accurate node boundary (#162…
kwen2501 Sep 9, 2025
607327b
[vllm hash update] update the pinned vllm hash (#162356)
pytorchupdatebot Sep 9, 2025
7ad40de
[audio hash update] update the pinned audio hash (#162437)
pytorchupdatebot Sep 9, 2025
8494afb
Add missing fstream include to fix std::ofstream compilation error (#…
0xjeffro Sep 9, 2025
4590438
[fx] fix qualified name for methods of torch.Tensor (#162407)
isuruf Sep 8, 2025
60d0092
Revert "testing infra and some fixes (#162183)"
pytorchmergebot Sep 9, 2025
7feb8fc
[SymmMEM] Allow to import _SymmetricMemory when NVSHMEM is not availa…
fegin Sep 9, 2025
d85392a
Add BundledAOTAutogradSerializableCallable (#162170)
jamesjwu Sep 7, 2025
d49205f
Add more tests for vllm and clean out the old vllm test (#162292)
yangw-dev Sep 9, 2025
4840a1a
Run vLLM tests on all trunk commits before 2.9 branch cut (#161797)
huydhn Sep 9, 2025
cd2c98a
[Release 2.9] Release only changes (#162493)
atalman Sep 9, 2025
ce928e1
CUDA 13.0 Windows Nvidia Driver Update to 580.88 (#162501)
pytorchbot Sep 9, 2025
c31a818
[CD] Aarch64 Fix packaging ``libarm_compute.so`` and other libraries …
pytorchbot Sep 10, 2025
152383b
fix typo: summit -> submit (#162597)
pytorchbot Sep 12, 2025
0ac9fa4
[ez][CI] Fix docs push in nightly workflow (#163085)
pytorchbot Sep 16, 2025
1076941
[ONNX] Fix rotary_embedding_23 implementation (#163041)
pytorchbot Sep 17, 2025
44baf2f
fix deterministic scatter_add path for multi-d tensors (#162977)
pytorchbot Sep 17, 2025
aebf427
[Release 2.9] Update torch-xpu-ops commit pin (#162935)
CuiYifeng Sep 17, 2025
7f8ba48
Fix the regression issue caused by non-arrch64 platforms not hitting …
pytorchbot Sep 17, 2025
9718af1
Support vmap + custom autograd function/improve DTensor constructor i…
pytorchbot Sep 17, 2025
baab5c6
[ONNX] Update export docstring & Set fallback=False by default (#162637)
pytorchbot Sep 17, 2025
ffa6f63
Revert "Make distributed modules importable even when backend not bui…
Camyll Sep 19, 2025
bc158eb
[SymmMem] Fix NVSHMEM plugin + Triton 3.5 (#163262)
pytorchbot Sep 19, 2025
76bebf3
[Release 2.9] [cuDNN][SDPA][submodule] Roll-back cuDNN frontend upgra…
eqy Sep 19, 2025
b1aae80
[Cherry Pick][Graph Partition] allow sharing default device context (…
BoyuanFeng Sep 19, 2025
25d8c0b
Add decomp rule to assert_tensor_metadata for BatchedTensors (#163361)
pytorchbot Sep 19, 2025
a576d48
Skip test_ind_worker_queue on Windows and macOS (flaky) (#163363)
pytorchbot Sep 19, 2025
35c55da
[Graph Partition] improve custom op output alias (#163380)
pytorchbot Sep 19, 2025
ddd5074
[CI] Update NVIDIA driver to `580.82.07` (#163522)
pytorchbot Sep 22, 2025
f83cf07
[graph partition] Add way to register custom rule (#163310) (#163395)
zou3519 Sep 23, 2025
7cf37ae
[2.9 cherry pick][triton] update 3.5 pin to bbb06c0334a6772b92d24bde5…
davidberard98 Sep 23, 2025
579794e
[SymmMem] Fix put_signal + wait_until hang (#163458)
pytorchbot Sep 23, 2025
4966d05
CUDA 13.0 Warning update for supported architectures (#163633)
pytorchbot Sep 23, 2025
47cb45e
Update pytorch_sphinx_theme2 to latest hash (#163655)
pytorchbot Sep 23, 2025
715dca6
[export] Remove .contiguous() when saving weights to raw bytes (#163662)
pytorchbot Sep 23, 2025
6c058c1
Move ROCM trunk wheel builds to 3.10 (#163804)
pytorchbot Sep 24, 2025
1dadb61
[BE] Introduce `CONDA_ROOT_DIR` (#163805)
pytorchbot Sep 24, 2025
5322dab
Update pytorch.org links in docs/conf.py (#163703)
pytorchbot Sep 24, 2025
be29c5b
Add analytics ID to cpp docs (#163695)
pytorchbot Sep 24, 2025
7d024a6
[BE] Update Python min version to 3.10 (#162310) (#163802)
Camyll Sep 24, 2025
96f0c0f
Fix some edge cases (#163106)
pytorchbot Sep 25, 2025
300bade
[Cherry-Pick] [CD] CUDA 13 specific followup changes. Remove sm50-70 …
atalman Sep 25, 2025
9952b87
[CD] CUDA 13.0 fix preload logic to include nvidia/cu13/lib/ (#163766)
pytorchbot Sep 25, 2025
c0577aa
Use cuda nvrtc so file based on cuda version used by torch (#163642) …
atalman Sep 25, 2025
b0dc908
[CD] Simplify NVIDIA driver installation step (#163349) (#163790)
atalman Sep 25, 2025
87c5d4a
[cherrypick] [CI] Move Windows build/tests to Python-3.10 #162862 (#1…
Camyll Sep 25, 2025
132d9fa
Revert "[BE] Update Python min version to 3.10 (#162310)" (#163882)
atalman Sep 25, 2025
0154ca1
[BE] Update Python min version to 3.10 (#162310) (#163885)
Camyll Sep 25, 2025
49dab18
[CD] Add statically linked windows libraries to exclude list (#163862)
pytorchbot Sep 25, 2025
fc8bf12
Fix cpp build (#163887)
pytorchbot Sep 25, 2025
824d59f
[CI] Install libuv for Win testing (#163907)
pytorchbot Sep 26, 2025
63da9d2
[Release 2.9] Update torch-xpu-ops commit pin (#163622)
CuiYifeng Sep 26, 2025
57dc688
[CI] Fix test_triton_wait_until hang (#163914)
pytorchbot Sep 26, 2025
f9e495f
Move inductor jobs 3.9->3.10 (#163954)
pytorchbot Sep 26, 2025
7cadf8a
[Inductor][Intel GPU] Save `threads_per_warp` from tirton compiled ke…
pytorchbot Sep 26, 2025
5340e74
[Reland][163423] Promote `@requires_nvshmem` instead of `enable_trito…
pytorchbot Sep 26, 2025
9993043
[dist] handle discontiguous allgather/reducescatter inputs (#163987)
pytorchbot Sep 26, 2025
daa3d04
[SymmMem] Fix memory allocation hold-up (#163375)
pytorchbot Sep 26, 2025
d7a703e
[SymmMem] Barrier on team instead of world (#163376)
pytorchbot Sep 26, 2025
37e2626
Update the operator benchmarking, to benchmark using torch.compile (#…
pytorchbot Sep 29, 2025
45e257f
[cuDNN][conv][64-bit] Disable cuDNN for 64-bit depthwise convs again …
pytorchbot Sep 29, 2025
11f776c
[cuDNN][SDPA] Disable dropout for cuDNN SDPA on 9.11 - 9.13 (#164026)
pytorchbot Sep 29, 2025
709f4f6
[cuDNN][Convolution] Disable cuDNN for 3D convolutions with kernel si…
pytorchbot Sep 29, 2025
b64fc8e
Fix operator benchmark issue#162708 (#164140)
pytorchbot Sep 29, 2025
a2c7704
Add operator benchmarking run to CI nightly (#164151)
pytorchbot Sep 29, 2025
20100b7
[c10d] P2P tensors must be dense (#163981)
pytorchbot Sep 29, 2025
d1b63e2
Skip test_conv3d_cudnn_broken on ROCM (#164163)
pytorchbot Sep 29, 2025
22d46b5
[CUDA] revert PR 130472 (#163379)
pytorchbot Sep 29, 2025
21fec65
Use linux.g4dn.4xlarge.nvidia.gpu for cuda 12.4 legacy driver tests (…
pytorchbot Sep 29, 2025
a21a4bf
[CI] Move libtorch-cpu-shared-with-deps-release-build to python 3.10 …
pytorchbot Sep 29, 2025
72cf48e
[AARCH64][CD][CUDA13][Triton][PTXAS] Turn on BUILD_BUNDLE_PTXAS=1 (…
pytorchbot Sep 30, 2025
005e3e8
Clean up obsoleted vLLM tests (#164282)
pytorchbot Sep 30, 2025
e70d9f5
[vllm hash update] update the pinned vllm hash (#164190) (#164312)
huydhn Oct 1, 2025
71282c8
Update Sphinx theme (#164147) (#164254)
svekars Oct 1, 2025
a5feacb
[SDPA] [MPS] Fixes regression in 2.8.0 for scaled_dot_product_attenti…
pytorchbot Oct 1, 2025
f227c88
[MPSHooks] Release pending command encoder (#164365)
pytorchbot Oct 1, 2025
3abee62
Fix warn message (#164367)
pytorchbot Oct 1, 2025
3e8a062
Update Microsoft C++ Redistributable to the latest version (#164369)
pytorchbot Oct 1, 2025
764f655
[MPS] Chunk fillBuffer into 4Gb slices (#164370)
pytorchbot Oct 1, 2025
881c2cc
Update Gloo submodule (#164371)
pytorchbot Oct 1, 2025
1cd83de
[Flex attention] Fix flex attention head broadcast (#164368)
pytorchbot Oct 1, 2025
31c72b8
[a2av] Separate in/out splits into two tensors (#164028)
pytorchbot Oct 1, 2025
10b501f
[Flex] Fix silent correctness w/ backpropping grads (#164366)
pytorchbot Oct 1, 2025
d6e8411
Make sure Windows CUDA 12.8 build follow same arches as Linux builds …
pytorchbot Oct 2, 2025
017d857
fix pickling for BitwiseFn (#163861)
pytorchbot Oct 2, 2025
2f6387e
[CherrryPick][2.9] Cherry pick request for `Reapply "Make functionali…
Lucaskabela Oct 2, 2025
fd36458
[Cherry-Pick] Work Around exposing statically linked libstdc++ CXX11 …
atalman Oct 2, 2025
c74f057
Pin conda version for Docker builds (#164579)
pytorchbot Oct 3, 2025
3b57315
[ROCm] Increase binary build timeout to 5 hours (300 minutes) (#164770)
pytorchbot Oct 6, 2025
d4c4307
Fix docker build issue after 164575 (#164779)
pytorchbot Oct 6, 2025
b015422
fix cpp extension distributed warning spew (#164785)
pytorchbot Oct 6, 2025
42f0c2c
update the baseline data for the operator benchmark (#164789)
pytorchbot Oct 7, 2025
6f12be2
CUDA 13.0 builds fix on Amazon Linux 2023 (#164893)
pytorchbot Oct 8, 2025
26e023a
[MPS] Update OS version in error message (#164949)
pytorchbot Oct 8, 2025
0fabc3b
CUDA aarch64 12.6 and 12.8 builds fix triton constraints (#165022)
pytorchbot Oct 9, 2025
476d3c1
[release/2.8] Enable wheels
jithunnair-amd Apr 22, 2025
8e6d64f
[release/2.8] Upgrade numpy versions; Use different package versions …
jithunnair-amd Jul 16, 2025
3ff7844
Use ROCm/triton and update triton.txt
jithunnair-amd Jul 16, 2025
fe59c33
Add related_commits file (#2396)
pragupta Jul 22, 2025
14b0f0e
Add QA automation scripts for running PyTorch unit tests
jithunnair-amd Feb 19, 2025
99627b5
[AUTOGENERATED] [release/2.5] [ROCm][layer_norm] Use __builtin_amdgcn…
rocm-mici Dec 18, 2024
275c05b
[release/2.7] Update test_binary_ufuncs.py after numpy upgrade (#2289)
ethanwee1 Jul 1, 2025
214e0f3
Clean up CUDA state between tests (#2335)
rraminen Jul 14, 2025
4361e47
[AUTOGENERATED] [release/2.7] [release/2.6] Fix dtype before comparin…
okakarpa Jul 15, 2025
b324b36
[release/2.8] enable py3.13 (#2366)
ethanwee1 Jul 17, 2025
edcb143
[SWDEV-539076] Initial naive foreach autotune support (#2377)
jataylo Jul 18, 2025
4142eef
[release/2.7] [SWDEV-543214] Reland #2416 Fix warps runtime (#2421)
jataylo Jul 30, 2025
5e67be1
[AUTOGENERATED] [release/2.8] [release/2.7] [SWDEV-543214] Reland #24…
okakarpa Aug 4, 2025
c58ceb1
[AUTOGENERATED] [release/2.8] [SWDEV-539215] - Autotune support for p…
okakarpa Aug 11, 2025
406100f
[SWDEV-539119] [release/2.8] Add fast_tanh support (#2484)
jataylo Aug 12, 2025
fcc0d85
[AUTOGENERATED] [release/2.8] Change triton package name depending on…
dhonnappa-amd Aug 15, 2025
16b8239
[release/2.8] Define uint32 t when ROCM_VERSION >= 70000 (#2513)
rraminen Aug 19, 2025
2711b3e
[AUTOGENERATED] [release/2.8] [ROCm] OffsetCalc Unroll Optimization (…
dhonnappa-amd Sep 3, 2025
5b2a37c
Bug fix and optimisation for persistent reduction kernel tuning (#2596)
jataylo Sep 4, 2025
7b8bc05
[ROCm] Fix indexing_backward_kernel perf (#2667)
jerrymannil Sep 22, 2025
506d5ce
[ROCm] Improve perf for elementwise broadcast with mixed dtype (#2672)
jerrymannil Sep 23, 2025
55b2445
[ROCm] Implement float32 copy kernel (#2683)
jerrymannil Sep 25, 2025
123b638
Bump triton to 3.5.x
pragupta Oct 9, 2025
426b2e8
Update fbgemm submodule to avoid ck errors
pragupta Oct 9, 2025
31b3b8e
Merge remote-tracking branch 'upstream/release/2.9' into release/2.9_…
github-actions[bot] Oct 14, 2025
06ee6e4
Merge pull request #2709 from ROCm/release/2.9_IFU_2025-10-14
jithunnair-amd Oct 14, 2025
c126ff5
Update version to 2.9.0
jithunnair-amd Oct 15, 2025
4fe15f2
[ROCm] Fix non-stride-one backwards indexing performance
rocm-repo-management-api[bot] Oct 16, 2025
9bb5bae
[release/2.9] remove amdgpu-coerce-illegal-types=1 (#2720)
ethanwee1 Oct 17, 2025
fa57f9c
[ROCm] Adjust grid size for non-unit stride backwards indexing
rocm-repo-management-api[bot] Oct 17, 2025
178 changes: 178 additions & 0 deletions .automation_scripts/parse_xml_results.py
@@ -0,0 +1,178 @@
""" The Python PyTorch testing script.
##
# Copyright (c) 2024 Advanced Micro Devices, Inc. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
"""

import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Any, Dict, Tuple

# Backends list
BACKENDS_LIST = [
"dist-gloo",
"dist-nccl"
]

TARGET_WORKFLOW = "--rerun-disabled-tests"

def get_job_id(report: Path) -> int:
# [Job id in artifacts]
# Retrieve the job id from the report path. In our GHA workflows, we append
# the job id to the end of the report name, so `report` looks like:
# unzipped-test-reports-foo_5596745227/test/test-reports/foo/TEST-foo.xml
# and we want to get `5596745227` out of it.
try:
return int(report.parts[0].rpartition("_")[2])
except ValueError:
return -1

def is_rerun_disabled_tests(root: ET.ElementTree) -> bool:
"""
Check if the test report is coming from rerun_disabled_tests workflow
"""
skipped = root.find(".//*skipped")
# Need to check against None here, if not skipped doesn't work as expected
if skipped is None:
return False

message = skipped.attrib.get("message", "")
return TARGET_WORKFLOW in message or "num_red" in message

def parse_xml_report(
tag: str,
report: Path,
workflow_id: int,
workflow_run_attempt: int,
work_flow_name: str
) -> Dict[Tuple[str], Dict[str, Any]]:
"""Convert a test report xml file into a JSON-serializable list of test cases."""
print(f"Parsing {tag}s for test report: {report}")

job_id = get_job_id(report)
print(f"Found job id: {job_id}")

test_cases: Dict[Tuple[str], Dict[str, Any]] = {}

root = ET.parse(report)
# TODO: unlike unittest, pytest-flakefinder used by rerun disabled tests for test_ops
# includes skipped messages multiple times (50 times by default). This slows down
# this script too much (O(n)) because it tries to gather all the stats. This should
# be fixed later in the way we use pytest-flakefinder. A zipped test report from rerun
# disabled test is only few MB, but will balloon up to a much bigger XML file after
# extracting from a dozen to few hundred MB
if is_rerun_disabled_tests(root):
return test_cases

for test_case in root.iter(tag):
case = process_xml_element(test_case)
if tag == 'testcase':
case["workflow_id"] = workflow_id
case["workflow_run_attempt"] = workflow_run_attempt
case["job_id"] = job_id
case["work_flow_name"] = work_flow_name

# [invoking file]
# The name of the file that the test is located in is not necessarily
# the same as the name of the file that invoked the test.
# For example, `test_jit.py` calls into multiple other test files (e.g.
# jit/test_dce.py). For sharding/test selection purposes, we want to
# record the file that invoked the test.
#
# To do this, we leverage an implementation detail of how we write out
# tests (https://bit.ly/3ajEV1M), which is that reports are created
# under a folder with the same name as the invoking file.
case_name = report.parent.name
for ind in range(len(BACKENDS_LIST)):
if BACKENDS_LIST[ind] in report.parts:
case_name = case_name + "_" + BACKENDS_LIST[ind]
break
case["invoking_file"] = case_name
test_cases[ ( case["invoking_file"], case["classname"], case["name"], case["work_flow_name"] ) ] = case
elif tag == 'testsuite':
case["work_flow_name"] = work_flow_name
case["invoking_xml"] = report.name
case["running_time_xml"] = case["time"]
case_name = report.parent.name
for ind in range(len(BACKENDS_LIST)):
if BACKENDS_LIST[ind] in report.parts:
case_name = case_name + "_" + BACKENDS_LIST[ind]
break
case["invoking_file"] = case_name

test_cases[ ( case["invoking_file"], case["invoking_xml"], case["work_flow_name"] ) ] = case

return test_cases

def process_xml_element(element: ET.Element) -> Dict[str, Any]:
"""Convert a test suite element into a JSON-serializable dict."""
ret: Dict[str, Any] = {}

# Convert attributes directly into dict elements.
# e.g.
# <testcase name="test_foo" classname="test_bar"></testcase>
# becomes:
# {"name": "test_foo", "classname": "test_bar"}
ret.update(element.attrib)

# The XML format encodes all values as strings. Convert to ints/floats if
# possible to make aggregation possible in Rockset.
for k, v in ret.items():
try:
ret[k] = int(v)
except ValueError:
pass
try:
ret[k] = float(v)
except ValueError:
pass

# Convert inner and outer text into special dict elements.
# e.g.
# <testcase>my_inner_text</testcase> my_tail
# becomes:
# {"text": "my_inner_text", "tail": " my_tail"}
if element.text and element.text.strip():
ret["text"] = element.text
if element.tail and element.tail.strip():
ret["tail"] = element.tail

# Convert child elements recursively, placing them at a key:
# e.g.
# <testcase>
# <foo>hello</foo>
# <foo>world</foo>
# <bar>another</bar>
# </testcase>
# becomes
# {
# "foo": [{"text": "hello"}, {"text": "world"}],
# "bar": {"text": "another"}
# }
for child in element:
if child.tag not in ret:
ret[child.tag] = process_xml_element(child)
else:
# If there are multiple tags with the same name, they should be
# coalesced into a list.
if not isinstance(ret[child.tag], list):
ret[child.tag] = [ret[child.tag]]
ret[child.tag].append(process_xml_element(child))
return ret
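To make the element-to-dict conversion concrete, here is a minimal, self-contained sketch of the same attribute-coercion and child-coalescing logic applied to a toy report. The helper name `element_to_dict` is ours, not part of the script above; it condenses `process_xml_element` (tail handling omitted for brevity):

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict

def element_to_dict(element: ET.Element) -> Dict[str, Any]:
    # Condensed mirror of process_xml_element: attributes become dict keys,
    # numeric strings are coerced, repeated child tags coalesce into a list.
    ret: Dict[str, Any] = dict(element.attrib)
    for k, v in ret.items():
        # As in the script, both conversions are attempted in sequence,
        # so any numeric attribute ends up stored as a float.
        try:
            ret[k] = int(v)
        except ValueError:
            pass
        try:
            ret[k] = float(v)
        except ValueError:
            pass
    if element.text and element.text.strip():
        ret["text"] = element.text
    for child in element:
        if child.tag not in ret:
            ret[child.tag] = element_to_dict(child)
        else:
            if not isinstance(ret[child.tag], list):
                ret[child.tag] = [ret[child.tag]]
            ret[child.tag].append(element_to_dict(child))
    return ret

suite = element_to_dict(ET.fromstring(
    '<testsuite name="TestFoo" tests="2" time="0.5">'
    '  <testcase name="test_a" classname="TestFoo" time="0.2"/>'
    '  <testcase name="test_b" classname="TestFoo" time="0.3"/>'
    '</testsuite>'
))
print(suite["name"])           # TestFoo
print(suite["tests"])          # 2.0
print(len(suite["testcase"]))  # 2
```

Coalescing repeated tags into a list is what lets `parse_xml_report` iterate a suite's many `<testcase>` children uniformly, whether a suite contains one case or hundreds.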