Merged
277 commits
12c1246
[ROCm][CI] remove amdgpu from install_rocm.sh (#166575)
jeffdaily Oct 30, 2025
b4403bf
Add waitcounters for torch.compile subprocess pool (#164527)
c00w Oct 30, 2025
e380028
[inductor][choices] lookup table choices 1/3 (#164978)
coconutruben Oct 27, 2025
cf7756d
Bump uv from 0.9.5 to 0.9.6 in /.ci/lumen_cli (#166578)
dependabot[bot] Oct 30, 2025
311ea0d
shrink_group implementation to expose ncclCommShrink API (#164518)
brchang24 Oct 30, 2025
0187db8
[ROCm][CI] Create periodic-rocm-mi200.yml (#166544)
amdfaa Oct 30, 2025
5cbdade
Fix a syntactic error in test_indexing.py (#166390)
cyyever Oct 30, 2025
791ca80
Enable local tensor mode for DTensor attention and convolution tests …
dzmitry-huba Oct 29, 2025
9051940
address DDE in matmul decomp (#166541)
laithsakka Oct 29, 2025
0918bf3
[xpu][test] Reuse native_mm and mix_order_reduction for Intel GPU. (#…
etaf Oct 30, 2025
845da9c
[ONNX] Ignore pyrefly errors in torchlib (#166588)
justinchuby Oct 30, 2025
476b149
bwd pass (#164504)
liangel-02 Oct 29, 2025
75f798e
[inductor][mi350] add tech specs for MI350 (#166576)
nmacchioni Oct 30, 2025
f20bf77
[audio hash update] update the pinned audio hash (#166597)
pytorchupdatebot Oct 30, 2025
f5cb9a4
[user-streams] Fix stream graph output semantics (#164819)
mlazos Oct 30, 2025
79aee77
[user-streams] Add current stream source (#165211)
mlazos Oct 30, 2025
a533526
[user-streams] Track symbolic current stream (#165212)
mlazos Oct 30, 2025
d46d8d6
[triton][sigmoid] Fix kernel cache and serialization issue for triton…
XueningXu Oct 30, 2025
f1af679
[user-streams] Handle returning the current stream with/without devic…
mlazos Oct 30, 2025
2829d48
[xpu][test][1/N] Port 3 fsdp distributed test cases to Intel GPU (#16…
libohao1201 Oct 30, 2025
39e5cdd
[2/N] Add strict parameter to Python zip calls (#166257)
cyyever Oct 30, 2025
3292092
[xpu][fix] [Inductor] Avoid using tl.sqrt_rn on XPU before triton is …
jianyizh Oct 30, 2025
369f2d6
[3/N] fix typo in other folders (#166606)
lingebeng Oct 30, 2025
2de4cf2
[1/N] Remove unused loop variables (#166258)
cyyever Oct 30, 2025
f607510
Revert "[2/N] Add strict parameter to Python zip calls (#166257)"
pytorchmergebot Oct 30, 2025
9ee1afb
Revert "[user-streams] Handle returning the current stream with/witho…
pytorchmergebot Oct 30, 2025
95b5534
Revert "[user-streams] Track symbolic current stream (#165212)"
pytorchmergebot Oct 30, 2025
fa8e073
Revert "[triton][sigmoid] Fix kernel cache and serialization issue fo…
pytorchmergebot Oct 30, 2025
7563f61
Make bucketing aware of collective LIFO semantics (#166324)
eellison Oct 29, 2025
ad02bd1
Revert "[user-streams] Add current stream source (#165211)"
pytorchmergebot Oct 30, 2025
ad55907
[triton][sigmoid] Fix kernel cache and serialization issue for triton…
XueningXu Oct 30, 2025
6a5a436
DTensor: C++ compute_global_tensor_info (#162990)
swolchok Oct 30, 2025
bbb7d22
[inductor] print 0.0 as 0 for triton (#164291)
isuruf Oct 22, 2025
3f18247
Revert "Fix comparing inductor actual strides vs bw graph for activat…
pytorchmergebot Oct 30, 2025
08b0a8f
[Inductor] Fix an inductor_provenance bug (#166432)
desertfire Oct 29, 2025
2df2c31
[devx] Fix invalid symbol definition emitted in fx_graph_runnable.py …
desertfire Oct 29, 2025
fb545fb
Add MXFP4 grouped gemm support via. FBGEMM kernels (#166530)
slayton58 Oct 29, 2025
e83be70
Fix pyrefly errors on main (#166548)
maggiemoss Oct 30, 2025
0a3ac47
Revert "[user-streams] Fix stream graph output semantics (#164819)"
pytorchmergebot Oct 30, 2025
c37802a
use multi-dtype bucketing (#166527)
eellison Oct 30, 2025
629293f
bucket all reduce (#166528)
eellison Oct 30, 2025
694d205
Revert "shrink_group implementation to expose ncclCommShrink API (#16…
pytorchmergebot Oct 30, 2025
ba71e9c
[DeviceMesh] Isolate pg creation logic in Device Mesh into a separate…
fduwjj Oct 30, 2025
a553ea9
Fix missing symbol when printing guards (#165723)
aorenste Oct 29, 2025
a5c3c08
[Pytorch] Use exp_u20 for aarch64's erf (#166594)
Nicoshev Oct 30, 2025
8f40a0c
Revert "address DDE in matmul decomp (#166541)"
pytorchmergebot Oct 30, 2025
4acc66f
Make PT2 compile backprop through custom op without autograd key a ha…
ezyang Oct 29, 2025
fcd5f8c
[CodeClean] Remove the Unused MACRO for AOT Inductor Runtime (#165139)
fffrog Oct 30, 2025
398775a
[CodeClean] Replace std::runtime_error with TORCH_CHECK (#165119)
fffrog Oct 30, 2025
639a0b1
Remove torch.distributed.tensor.OpSchema.has_symints (#163667)
swolchok Oct 30, 2025
694db5f
Use 'is' in callable comparisons (#166624)
cyyever Oct 30, 2025
b939de2
Avoid writing temporary modules to disk (#157713)
apmorton Oct 30, 2025
8221ee6
[xpu] Fix type annotation for ProcessGroupXCCL (#166418)
frost-intel Oct 30, 2025
0ec0549
Introduce a new API torch.xpu.get_per_process_memory_fraction (#165511)
guangyey Oct 15, 2025
181ee3b
fix: Add missing signals_to_handle to launcher logging (#166631)
leopold-tzafon Oct 30, 2025
a7fd0b4
[ROCm][CI] fix disk space message (#166645)
amdfaa Oct 30, 2025
ad3a56a
Add a compile-time flag to trigger verbose logging for device-side as…
drdarshan Oct 30, 2025
56838ba
[CP][BE][1/2] Refactor the code structure (#166456)
fegin Oct 30, 2025
52db601
Enable verify_dynamo on Python 3.13 (#166497)
cyyever Oct 30, 2025
f911d64
[CUDA] xFail `max-autotune` grouped gemm tests on devices with insuff…
eqy Oct 30, 2025
99b05d1
Better 1x128, 128x128 error handling on non-Hopper (#166639)
slayton58 Oct 30, 2025
0d50e5d
[3/N] Fix unused loop variables (#166509)
cyyever Oct 30, 2025
80ba6e4
Add warning when users have incomplete setup for type checking (#166603)
maggiemoss Oct 30, 2025
df71b70
[cuDNN][conv] Re-enable cuDNN for 3D convolutions (fixed in 9.15+) (#…
eqy Oct 30, 2025
7692fa0
[Code Clean] Clean asserts in torch/ao/quantization/fx/* (#165420)
zhudada0120 Oct 30, 2025
5fc2c7a
[ROCm][inductor] More configs for pointwise kernels. (#166470)
naromero77amd Oct 30, 2025
f5543e3
[wip] fix searchsorted non dense (#165064)
eellison Oct 30, 2025
45c3f02
[ROCm] moved gfx1100 back to experimental status for AOTriton (#166397)
k-artem Oct 30, 2025
7e3b9d1
[CP][BE][2/2] Refactor the code structure (#166501)
fegin Oct 30, 2025
b9bcb37
[DebugMode] store stringify args by default (#166347)
pianpwk Oct 29, 2025
984e64b
[inductor] Fix constant folder (#166655)
angelayi Oct 30, 2025
7a0cd8e
[ROCm] Disable `__builtin_amdgcn_rcpf` for gfx90a (#166454)
pragupta Oct 30, 2025
bfb47ec
[dynamo] support tracing new typing union syntax X | Y (#166599)
williamwen42 Oct 30, 2025
5d288bc
[BE] Move GreenContext implementation details to cpp (#166462)
malfet Oct 31, 2025
98d640b
Remove AT_USE_HIPSPARSE_GENERIC_API (#166393)
cyyever Oct 31, 2025
47f0024
[CI][BE] Factor out repeated test code (#166481)
malfet Oct 30, 2025
3206677
Fix torch.full with dynamic tensor fill_value in torch.compile (#166554)
amaldevh Oct 31, 2025
24b6eb7
[Inductor] Enable Custom op Autotune Decompositions and Parameter Tun…
tianrengao Oct 31, 2025
1257706
[MPS] Fix crash when max/min ops called for complex types (#166214)
malfet Oct 28, 2025
a6b1ef1
[GraphPartition] cache get_free_symbol_uses (#166338)
BoyuanFeng Oct 31, 2025
1129605
[ROCm][CI] create ROCm 7.1 images for binary builds (#166665)
jeffdaily Oct 31, 2025
d3be06c
[MTIAGraph][Pytorch][2/n] Add binding for Python to C++, and hook for…
andyanwang Oct 31, 2025
d3e511f
[Inductor] support masked vectorization for the tail_loop for fp8 dat…
jiayisunx Oct 30, 2025
f1e4c42
[BE][Typing][Dynamo] Type misc files in `torch/_dynamo/variables/` (#…
Lucaskabela Oct 31, 2025
e3ae059
Add CUDA MXFP4 scaled mm support via. FBGEMM (#166526)
slayton58 Oct 30, 2025
7d39401
Revert "[BE][Typing][Dynamo] Type misc files in `torch/_dynamo/variab…
pytorchmergebot Oct 31, 2025
797cd80
[dynamo, nested graph breaks] codegen dead nested cells correctly (#1…
williamwen42 Oct 31, 2025
1dec8a6
[dynamo, nested graph breaks] add disable_nested_graph_breaks decorat…
williamwen42 Oct 31, 2025
267d019
[dynamo] fix error_on_graph_break bug where non-empty checkpoint resu…
williamwen42 Oct 31, 2025
85b035c
[nativert] Downcast triton double arguments to floats (#166620)
minjang Oct 31, 2025
7d67a41
make FXConverter.generate use V.fake_mode instead of _detect_fake_mod…
jazlyn5 Oct 31, 2025
030de07
[2/N] Use 'is' in callable comparisons (#166685)
cyyever Oct 31, 2025
fc8ac12
[4/N] Remove unused loop variables in tests (#166690)
cyyever Oct 31, 2025
108bb22
[pytree] add `treespec_{leaf,tuple,dict}` functions for args_spec mod…
XuehaiPan Oct 31, 2025
0d3a4f7
[CD] Enable Inductor performance test for xpu (#166289)
chuanqi129 Oct 31, 2025
fd68d40
[xpu][feature] Integrate OneDNN SDPA training forward/backward into X…
LuFinch Oct 31, 2025
c01636e
Fixes the sparse tensor issue (#163535)
arkadip-maitra Oct 31, 2025
b083193
[inductor] Mark / restrict tests that only work if ATen is used for m…
kundaMwiza Oct 31, 2025
657f8c3
Revert "Fix torch.full with dynamic tensor fill_value in torch.compil…
pytorchmergebot Oct 31, 2025
26534e9
Revert "[GraphPartition] cache get_free_symbol_uses (#166338)"
pytorchmergebot Oct 31, 2025
4e8ba37
Revert "[BE] Move GreenContext implementation details to cpp (#166462)"
pytorchmergebot Oct 31, 2025
5bcfdae
Revert "Make PT2 compile backprop through custom op without autograd …
pytorchmergebot Oct 31, 2025
160ab53
Update weight tensor initialization in RMSNormalization (#166550)
justinchuby Oct 31, 2025
034e951
[CUDA][cuBLASLt] addmm -- extend bias fusions to cases with (1 by n) …
nikitaved Oct 31, 2025
69be99e
Remove manually synced arch versions in `tools/nightly.py` (#166616)
XuehaiPan Oct 30, 2025
24e94e0
[ROCm][CI] create ROCm 7.1 magma tarball (#166693)
jeffdaily Oct 31, 2025
fee7624
[PT2] set choice handler in config (#166607)
xuanzhang816 Oct 31, 2025
1e3600b
[MPS] Move `logaddexp/logaddexp2` to Metal and support complex (#166670)
kurtamohler Oct 30, 2025
c3b71d5
[ROCm][CI] remove relaxed tolerance for tf32 tests (#166478)
jeffdaily Oct 31, 2025
aa9c96a
[BE][Typing][Dynamo] Type misc files in `torch/_dynamo/variables/` (#…
Lucaskabela Oct 31, 2025
1212359
update Node.is_impure check if subgraph contains impure ops (#166609)
jazlyn5 Oct 31, 2025
fcc1063
Revert "[BE][Typing][Dynamo] Type misc files in `torch/_dynamo/variab…
pytorchmergebot Oct 31, 2025
365ed62
Document LibTorch ABI more, add README to headeronly (#166661)
janeyx99 Oct 30, 2025
ffaa657
Revise deprecation warning for ONNX exporter (#166692)
justinchuby Oct 31, 2025
239e7b5
[ROCm][CI] upgrade nightly wheels to ROCm 7.1 (#166730)
jeffdaily Oct 31, 2025
0947765
Cache even more work for return_and_correct_aliasing (#166365)
swolchok Oct 30, 2025
b71966f
[PyTorch] Improve aarch64 performance of bfloat16 ops - retry (#16602…
Nicoshev Oct 31, 2025
85b85f6
Revert "[pytree] add `treespec_{leaf,tuple,dict}` functions for args_…
pytorchmergebot Oct 31, 2025
b470e59
partitioner option to ignore partitioner_tag for abstract usage (#166…
IvanKobzarev Oct 31, 2025
30157d3
Add regional aot eager support to AOTAutogradCacheEntry (#166650)
jamesjwu Oct 30, 2025
08f4535
Refactor AOTAutogradCacheEntry into AOTAutogradResult (#166656)
jamesjwu Oct 31, 2025
d2be06f
[cpu][fix] Update ACL version to fix crashes with tensor sizes > 2^3…
fadara01 Oct 31, 2025
ef8d97e
fix broken nn_convolution test (#166666)
Camyll Oct 31, 2025
856a7a5
Add missing device to namedtensor tests (#166717)
cyyever Oct 31, 2025
cf9a834
[BE] Move GreenContext implementation details to cpp (#166462)
malfet Oct 31, 2025
70aeb49
[dynamo] clarify graph break handling/logging in symbolic_convert (#1…
williamwen42 Oct 31, 2025
8209a05
[Pytorch] Enable aarch64 convert autovec only on clang (#166739)
Nicoshev Oct 31, 2025
4a7bc1d
[BE][Typing][Dynamo] Type misc files in `torch/_dynamo/variables/` (#…
Lucaskabela Oct 31, 2025
e404388
[dynamo, 3.14] fix segfault due to improper create_call_function_ex (…
williamwen42 Oct 31, 2025
d97144d
[5/N] Remove unused loop variables in tests (#166716)
cyyever Oct 31, 2025
93a70c7
Revert "Add CUDA MXFP4 scaled mm support via. FBGEMM (#166526)"
pytorchmergebot Oct 31, 2025
4e7232c
[MPS] Fix `smooth_l1_loss` backward for fp16 (#166687)
malfet Oct 31, 2025
b09fb48
[CD] Upgrade GCC version to 13 for XPU build (#162474)
chuanqi129 Oct 31, 2025
dfebdca
[GraphPartition] cache get_free_symbol_uses (#166338)
BoyuanFeng Oct 31, 2025
9970fb9
Fix Tril Triu SymInt (#166627)
parsshar-RH Oct 31, 2025
2699f54
Revert "[xpu][feature] Integrate OneDNN SDPA training forward/backwar…
pytorchmergebot Oct 31, 2025
5166743
[FlexFlash] Wire up mask_mod + blockmask to flash impl (#166359)
drisspg Oct 31, 2025
d80ae73
compile_worker: Make a timer class (#166465)
c00w Oct 31, 2025
9261a1f
[MPS] Error out when BatchNorm is called for Complex (#166215)
malfet Oct 31, 2025
fd5da81
[AI Codemod][DevmateFBSourceTestFailureBot] Fix for T243177299 ("Your…
pdesupinski Oct 31, 2025
8d59904
add shape check for avg_pool2d (#161952)
jiayisunx Oct 30, 2025
83cc38d
[precompile] Preserve default arguments for dynamo capture (#166654)
zhxchen17 Nov 1, 2025
e2dc32f
Replace decltype(auto) with auto (#166537)
cyyever Nov 1, 2025
f91899c
[2/N] Add strict parameter to Python zip calls (#166257)
cyyever Nov 1, 2025
3dc92d6
Remove setup-env instructions; it's confusing (#166749)
ezyang Nov 1, 2025
60333de
Revert "Remove setup-env instructions; it's confusing (#166749)"
pytorchmergebot Nov 1, 2025
e8fadba
[pytree] add `treespec_{leaf,tuple,dict}` functions for args_spec mod…
XuehaiPan Nov 1, 2025
9d6597b
Correctly use test parameters (#166726)
cyyever Nov 1, 2025
4316df8
[3.14] Fix torch.package.importer (#166767)
malfet Oct 31, 2025
f0745dd
Replace c10::call_once with static initialization (#166381)
cyyever Nov 1, 2025
1aef88c
Avoid DDE in narrow with unbacked start (#166361)
laithsakka Oct 29, 2025
4cc64d6
[inductor] pre grad graph bisecting (#166344)
shunting314 Nov 1, 2025
b3861ac
[reland] Warn if AccumulateGrad stream does not match producer node s…
soulitzer Nov 1, 2025
84776e1
Make PT2 compile backprop through custom op without autograd key a ha…
ezyang Nov 1, 2025
3b5d38a
Fix comparing inductor actual strides vs bw graph for activations sho…
laithsakka Oct 29, 2025
82d86ba
[inductor] track reduction before splitting (#166053)
shunting314 Oct 31, 2025
13549e0
Revert "Avoid DDE in narrow with unbacked start (#166361)"
pytorchmergebot Nov 1, 2025
401c2f9
[FP8][H100][TF32] Disable tf32 for emulated reference computation in …
eqy Nov 1, 2025
82fafb3
Revert "Make PT2 compile backprop through custom op without autograd …
pytorchmergebot Nov 1, 2025
0d81bb7
[3/N] Use 'is' in callable comparisons (#166780)
cyyever Nov 1, 2025
764c54e
[DebugMode] dispatch call hooks (#166348)
pianpwk Oct 31, 2025
a663eb9
[FlexFlash] CuteDSL flat indexer needs to be colexigraphic in coordin…
drisspg Nov 1, 2025
0573747
[inductor] more aggressive mix order reduction (#166382)
shunting314 Oct 31, 2025
04d6a6f
[inductor] Make mix-order-reduction split size not depends on split-r…
shunting314 Oct 31, 2025
c3dc0c7
[Inductor] mix order reduction heuristics and tuning (#166585)
shunting314 Oct 31, 2025
a19e92d
report geomean for norm bwd benchmarking (#166675)
shunting314 Oct 31, 2025
9f9dbe0
add a curve for customized compilation in the kernel benchmarking scr…
shunting314 Oct 31, 2025
b7d348a
[vision hash update] update the pinned vision hash (#166771)
pytorchupdatebot Nov 2, 2025
0674e0a
Fix: list index out of range with softmax when using 0 dim (#166547)
krastogi-in Nov 2, 2025
f013e80
[user-streams] Fix stream graph output semantics (#164819)
mlazos Nov 2, 2025
bc03d7c
[user-streams] Add current stream source (#165211)
mlazos Nov 2, 2025
cee0363
[user-streams] Track symbolic current stream (#165212)
mlazos Nov 2, 2025
76780b1
[user-streams] Handle returning the current stream with/without devic…
mlazos Nov 2, 2025
d962bed
[user-streams] Add basic stream tests (#164523)
mlazos Nov 2, 2025
18f4259
[dynamo] Remove retrieving objects by ID (#162905)
mlazos Nov 2, 2025
e471800
[user-streams] cleanup StreamVariable signature (#166471)
mlazos Nov 2, 2025
2986666
[user-streams] Switch to fx annotations at trace time (#166472)
mlazos Nov 2, 2025
5e05a0a
Revert "Fix: list index out of range with softmax when using 0 dim (#…
pytorchmergebot Nov 2, 2025
bb54296
Fix source_fn_stack being None (#166728)
tugsbayasgalan Nov 2, 2025
6c7cad6
Use Python 3.10 typing (#148418)
cyyever Nov 2, 2025
23b57a4
Remove setup-env instructions; it's confusing (#166749)
ezyang Nov 2, 2025
c8adc08
[Fix] Optimize max unpooling index validation using aminmax (#165394)
lingebeng Nov 2, 2025
16212f0
[Sparse] support for exp op (#166801)
Isalia20 Nov 2, 2025
6268883
[MPS] Refactor `torch.cat` and add fast path for contiguous inputs (#…
kurtamohler Oct 30, 2025
9c22bbb
Add min/max support for barebones uint types (#166813)
ezyang Nov 2, 2025
3ca216a
Add claude skills for uint support and AT_DISPATCH_V2 (#166814)
ezyang Nov 2, 2025
7c203b8
[BE] Using std::move to reduce copy constructor calls by one. (#163599)
thenumberouscode Nov 2, 2025
3eddf04
Revert "Add min/max support for barebones uint types (#166813)"
pytorchmergebot Nov 2, 2025
3b43159
[export] Fix static_input_indices for aot_export_joint (#166761)
angelayi Nov 3, 2025
4a7fefd
[dynamo] fix pos-only names should can be collected in `**kwargs` (#1…
XuehaiPan Nov 2, 2025
fee1ac9
[DebugMode] add stack traces (#166440)
pianpwk Nov 2, 2025
392acee
[6/N] Remove unused loop variables in tests (#166785)
cyyever Nov 3, 2025
1c4ced2
[2/N] Correctly use test parameters (#166783)
cyyever Nov 3, 2025
69fb3eb
Fix: type promotion in FakeTensor (#166522)
krastogi-in Nov 3, 2025
a5f0007
torch.cond supports autograd now (#165908)
ezyang Nov 3, 2025
5a3930a
Revert "Back out "Do not decompose in functionalization/proxy tensor …
ezyang Nov 2, 2025
3f54010
[3/N] Add clang-tidy readability checks (#164692)
cyyever Nov 3, 2025
e1d011d
[2/N] Change C-style casts to static_cast or reinterpret_cast (#165891)
cyyever Nov 3, 2025
e0791fc
Give full Dynamo stack traces in CI (#160417)
ezyang Aug 31, 2025
9501405
[caffe2] Ignore -Wswitch-enum warnings (#166760)
NSProgrammer Nov 3, 2025
061fa73
Reapply "Back out "Do not decompose in functionalization/proxy tensor…
pytorchmergebot Nov 3, 2025
defac66
[xla hash update] update the pinned xla hash (#166845)
pytorchupdatebot Nov 3, 2025
ae038f8
[inductor] Collectives estimations: option to use nccl estimator for …
IvanKobzarev Nov 3, 2025
a4077b5
Revert "[MPS] Error out when BatchNorm is called for Complex (#166215)"
pytorchmergebot Nov 3, 2025
5d62307
Revert "Give full Dynamo stack traces in CI (#160417)"
pytorchmergebot Nov 3, 2025
1656b25
Revert "[MPS] Fix `smooth_l1_loss` backward for fp16 (#166687)"
pytorchmergebot Nov 3, 2025
61bcc8d
Revert "Fixes torch.compile(nn.ModuleList()) changes bool() behavior …
pytorchmergebot Nov 3, 2025
d177900
[Code Clean] Clean asserts in torch/ao/quantization (root, quantizer,…
zhudada0120 Nov 3, 2025
a2da693
Remove nightly pth check from pyrefly (#166857)
ezyang Nov 3, 2025
76bb27e
Revert "Back out "Do not decompose in functionalization/proxy tensor …
ezyang Nov 2, 2025
335b5c7
Avoid std::copy_n in CopyKernel and IndexKernel (#143544)
cyyever Nov 3, 2025
73da7a4
[MPS] Error out when BatchNorm is called for Complex (#166215)
malfet Nov 3, 2025
f33abae
Switch to pyrefly as only type checker (#166197)
maggiemoss Nov 3, 2025
3f6538f
Remove tools from BC linter (#166858)
ezyang Nov 3, 2025
94f2657
[Inductor] addmm with bias -> unfuse bias if there is a pointwise/red…
nikitaved Nov 3, 2025
104b868
Fix build error by checking cuda version in CUDAGreenContext (#166800)
irshadcc Nov 3, 2025
984b096
[ROCm][CI] Change rocm.yml and inductor-rocm.yml cron schedule to run…
amdfaa Nov 3, 2025
f3fa560
Integrate NVIDIA cuSolver backend into ATen/Linalg (initial implement…
johannesz-codes Nov 3, 2025
7b29926
Update test jobs in pull workflow to c7i (#165646)
zxiiro Nov 3, 2025
5b17ef3
Update docs-build to c7i (#166727)
zxiiro Nov 3, 2025
bcad4f2
[FSDP][Replicate] final version integrating 1D device mesh replicate …
anshul-si Oct 28, 2025
d67d807
[FSDP][Replicate] added two replicate overload declarations and chang…
anshul-si Oct 28, 2025
2f3f88f
Revert "[FSDP][Replicate] added two replicate overload declarations a…
pytorchmergebot Nov 3, 2025
fa0fd6b
Revert "[FSDP][Replicate] final version integrating 1D device mesh re…
pytorchmergebot Nov 3, 2025
aa4a8c9
[Inductor][Triton][FP8] Support tile-wise (1x128) scaling in Inductor…
jananisriram Nov 3, 2025
e3bd7bd
[FP8] Enable FP16 output support for torch scaled_mm when using CUTLA…
oyye Nov 3, 2025
c761999
Avoid DDE in narrow with unbacked start (#166361)
laithsakka Nov 2, 2025
71a2e93
[cuDNN][SDPA] Check-in test for #166211 (#166570)
eqy Nov 3, 2025
3af1f7b
[easy][MTIAGraph][Pytorch] clang-format files (#166805)
andyanwang Nov 3, 2025
612ead1
[distributed] Replace assert statements with AssertionError exception…
RohitRathore1 Nov 3, 2025
ee1bc3f
Manylinux ROCm docker images. use devtoolset-13 (#166764)
atalman Nov 3, 2025
68e31e2
[CUDA] Skip pynvml test on platforms that don't have complete support…
eqy Nov 3, 2025
c10975d
Revert "Avoid DDE in narrow with unbacked start (#166361)"
pytorchmergebot Nov 3, 2025
5125872
Fix unused assignments (#166791)
cyyever Nov 3, 2025
83cd626
[opaque_obj_v2] make_fx support (#165005)
angelayi Nov 3, 2025
77b9399
[random] Add `generator` arg to `rand*_like` APIs (#166160)
KarhouTam Nov 3, 2025
3a38ec7
[inductor] Expand use of generic benchmark function (#164938)
kundaMwiza Nov 3, 2025
6725ee8
Fix cuda blas build error due to extra && (#166811)
irshadcc Nov 3, 2025
b8855e7
Add conv ops to operator microbenchmark (#166331)
jainapurva Nov 3, 2025
01d8d85
[MTIAGraph][Pytorch][2.1/n] Add API to destroy graph C++ instance (#1…
andyanwang Nov 3, 2025
27cfdd9
[export] Return more information from tracing context in graph captur…
zhxchen17 Nov 1, 2025
7d1b976
[export] Make dict_keys_getitem tracable. (#166776)
zhxchen17 Nov 1, 2025
11f73d7
[export] Downgrade captured buffers as normal constants. (#166777)
zhxchen17 Nov 1, 2025
eea8ff2
Fix torch.full with dynamic tensor fill_value in torch.compile (#166554)
amaldevh Nov 3, 2025
86b2d82
Revert "[Inductor] addmm with bias -> unfuse bias if there is a point…
pytorchmergebot Nov 3, 2025
6c98657
Add some Triton related suppressions that don't show on CI (#166868)
ezyang Nov 3, 2025
2b7e4c3
[DCP] Add option to use PrefixStore to create checkpoint background p…
kevinmtang Nov 3, 2025
616314c
[FSDP][Replicate] final version integrating 1D device mesh replicate …
anshul-si Nov 3, 2025
2eea9c4
Merge remote-tracking branch 'upstream/main' into develop_IFU_20251103
github-actions[bot] Nov 3, 2025
86a7a33
Fix merge conflict
pragupta Nov 4, 2025
1 change: 1 addition & 0 deletions .bc-linter.yml
@@ -13,3 +13,4 @@ exclude:
- "**/benchmarks/**"
- "**/test_*.py"
- "**/*_test.py"
- "tools/**"
5 changes: 4 additions & 1 deletion .ci/docker/build.sh
@@ -195,13 +195,16 @@ case "$tag" in
NINJA_VERSION=1.9.0
TRITON=yes
;;
pytorch-linux-jammy-xpu-n-py3)
pytorch-linux-jammy-xpu-n-py3 | pytorch-linux-jammy-xpu-n-py3-inductor-benchmarks)
ANACONDA_PYTHON_VERSION=3.10
GCC_VERSION=11
VISION=yes
XPU_VERSION=2025.2
NINJA_VERSION=1.9.0
TRITON=yes
if [[ $tag =~ "benchmarks" ]]; then
INDUCTOR_BENCHMARKS=yes
fi
;;
pytorch-linux-jammy-py3-gcc11-inductor-benchmarks)
ANACONDA_PYTHON_VERSION=3.10
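Note on the combined case arm above: a minimal sketch, assuming bash, of how one arm can serve both image tags and still switch on the benchmarks variant. The function name `resolve_xpu_image` is hypothetical and does not exist in `.ci/docker/build.sh`; it only mirrors the pattern alternation and the substring test from the hunk.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of build.sh): mirrors the case arm that now
# matches both the plain XPU tag and its inductor-benchmarks variant.
resolve_xpu_image() {
  local tag="$1"
  local INDUCTOR_BENCHMARKS=""
  case "$tag" in
    pytorch-linux-jammy-xpu-n-py3 | pytorch-linux-jammy-xpu-n-py3-inductor-benchmarks)
      # A quoted right-hand side of =~ is matched as a literal substring.
      if [[ $tag =~ "benchmarks" ]]; then
        INDUCTOR_BENCHMARKS=yes
      fi
      ;;
    *)
      echo "unknown"
      return 0
      ;;
  esac
  echo "INDUCTOR_BENCHMARKS=${INDUCTOR_BENCHMARKS:-no}"
}

resolve_xpu_image pytorch-linux-jammy-xpu-n-py3
resolve_xpu_image pytorch-linux-jammy-xpu-n-py3-inductor-benchmarks
```

Only the tag that contains "benchmarks" sets the flag; the plain tag falls through with it empty, which this sketch reports as "no".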
2 changes: 1 addition & 1 deletion .ci/docker/common/install_acl.sh
@@ -3,7 +3,7 @@

set -eux

ACL_VERSION=${ACL_VERSION:-"v25.02"}
ACL_VERSION=${ACL_VERSION:-"v52.6.0"}
ACL_INSTALL_DIR="/acl"

# Clone ACL
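Note on the `ACL_VERSION` default above: a small sketch, assuming bash, of the `${VAR:-default}` expansion the script relies on. The function `acl_version` is hypothetical and exists only for illustration; the two version strings are the ones shown in the hunk.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in install_acl.sh): shows the ${VAR:-default}
# expansion that supplies the pinned ACL version when none is provided.
acl_version() {
  local v=${ACL_VERSION:-"v52.6.0"}
  echo "$v"
}

unset ACL_VERSION
acl_version                       # falls back to the pinned default
ACL_VERSION="v25.02" acl_version  # a caller-supplied value takes precedence
```

Because the default is applied only when the variable is unset or empty, a Docker build argument or CI environment variable can still override the pin without editing the script.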
10 changes: 9 additions & 1 deletion .ci/docker/common/install_conda.sh
@@ -49,12 +49,20 @@ if [ -n "$ANACONDA_PYTHON_VERSION" ]; then
export SYSROOT_DEP="sysroot_linux-64=2.17"
fi

# Install correct Python version
# Also ensure sysroot is using a modern GLIBC to match system compilers
if [ "$ANACONDA_PYTHON_VERSION" = "3.14" ]; then
as_jenkins conda create -n py_$ANACONDA_PYTHON_VERSION -y\
python="3.14.0" \
${SYSROOT_DEP} \
-c conda-forge
else
# Install correct Python version
# Also ensure sysroot is using a modern GLIBC to match system compilers
as_jenkins conda create -n py_$ANACONDA_PYTHON_VERSION -y\
python="$ANACONDA_PYTHON_VERSION" \
${SYSROOT_DEP}

fi
# libstdcxx from conda default channels are too old, we need GLIBCXX_3.4.30
# which is provided in libstdcxx 12 and up.
conda_install libstdcxx-ng=12.3.0 --update-deps -c conda-forge
4 changes: 0 additions & 4 deletions .ci/docker/common/install_rocm.sh
@@ -40,11 +40,7 @@ EOF

# Default url values
rocm_baseurl="http://repo.radeon.com/rocm/apt/${ROCM_VERSION}"
amdgpu_baseurl="https://repo.radeon.com/amdgpu/${ROCM_VERSION}/ubuntu"

# Add amdgpu repository
UBUNTU_VERSION_NAME=`cat /etc/os-release | grep UBUNTU_CODENAME | awk -F= '{print $2}'`
echo "deb [arch=amd64] ${amdgpu_baseurl} ${UBUNTU_VERSION_NAME} main" > /etc/apt/sources.list.d/amdgpu.list

# Add rocm repository
wget -qO - http://repo.radeon.com/rocm/rocm.gpg.key | apt-key add -
4 changes: 2 additions & 2 deletions .ci/docker/common/install_rocm_magma.sh
@@ -12,8 +12,8 @@ function do_install() {

rocm_version_nodot=${rocm_version//./}

# https://github.com/icl-utk-edu/magma/pull/65
MAGMA_VERSION=d6e4117bc88e73f06d26c6c2e14f064e8fc3d1ec
# post merge of https://github.com/icl-utk-edu/magma/pull/65
MAGMA_VERSION=c0792ae825fb36872784892ea643dd6f3456bc5f
magma_archive="magma-rocm${rocm_version_nodot}-${MAGMA_VERSION}-1.tar.bz2"

rocm_dir="/opt/rocm"
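Note on the archive name above: a sketch, assuming bash, of how the ROCm version and the new MAGMA commit combine into the tarball name. The function `magma_archive_name` is hypothetical; the expansion and the name template are copied from the hunk.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in install_rocm_magma.sh): rebuilds the archive
# name with the same expansions that do_install uses.
magma_archive_name() {
  local rocm_version="$1"
  local MAGMA_VERSION=c0792ae825fb36872784892ea643dd6f3456bc5f
  # ${var//./} deletes every "." so "7.1" becomes "71".
  local rocm_version_nodot=${rocm_version//./}
  echo "magma-rocm${rocm_version_nodot}-${MAGMA_VERSION}-1.tar.bz2"
}

magma_archive_name 7.1
```

Since the commit hash is part of the file name, bumping `MAGMA_VERSION` changes the archive name that is requested for every ROCm version.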
2 changes: 1 addition & 1 deletion .ci/docker/manywheel/Dockerfile_2_28
@@ -149,7 +149,7 @@ FROM cpu_final as rocm_final
ARG ROCM_VERSION=6.0
ARG PYTORCH_ROCM_ARCH
ENV PYTORCH_ROCM_ARCH ${PYTORCH_ROCM_ARCH}
ARG DEVTOOLSET_VERSION=11
ARG DEVTOOLSET_VERSION=13
ENV LDFLAGS="-Wl,-rpath=/opt/rh/gcc-toolset-${DEVTOOLSET_VERSION}/root/usr/lib64 -Wl,-rpath=/opt/rh/gcc-toolset-${DEVTOOLSET_VERSION}/root/usr/lib"
# Somewhere in ROCm stack, we still use non-existing /opt/rocm/hip path,
# below workaround helps avoid error
2 changes: 1 addition & 1 deletion .ci/docker/manywheel/build.sh
@@ -97,7 +97,7 @@ case ${image} in
manylinux2_28-builder:xpu)
TARGET=xpu_final
GPU_IMAGE=amd64/almalinux:8
DOCKER_GPU_BUILD_ARG=" --build-arg DEVTOOLSET_VERSION=11"
DOCKER_GPU_BUILD_ARG=" --build-arg DEVTOOLSET_VERSION=13"
MANY_LINUX_VERSION="2_28"
;;
*)
23 changes: 14 additions & 9 deletions .ci/docker/requirements-ci.txt
@@ -136,10 +136,11 @@ numba==0.61.2 ; python_version > "3.9"
#test_nn.py, test_namedtensor.py, test_linalg.py, test_jit_cuda_fuser.py,
#test_jit.py, test_indexing.py, test_datapipe.py, test_dataloader.py,
#test_binary_ufuncs.py
numpy==2.0.2 ; python_version == "3.9"
numpy==2.1.2 ; python_version > "3.9"
numpy==2.1.2; python_version > "3.9" and python_version < "3.14"
numpy==2.3.4; python_version >= "3.14"

pandas==2.2.3
pandas==2.2.3; python_version >= "3.9" and python_version < "3.14"
pandas==2.3.3; python_version >= "3.14"

#onnxruntime
#Description: scoring engine for Open Neural Network Exchange (ONNX) models
@@ -151,7 +152,8 @@ opt-einsum==3.3
#Pinned versions: 3.3
#test that import: test_linalg.py

optree==0.13.0
optree==0.13.0 ; python_version < "3.14"
optree==0.17.0 ; python_version >= "3.14"
#Description: A library for tree manipulation
#Pinned versions: 0.13.0
#test that import: test_vmap.py, test_aotdispatch.py, test_dynamic_shapes.py,
@@ -249,8 +251,8 @@ scikit-image==0.22.0
#Pinned versions: 0.20.3
#test that import:

scipy==1.13.1 ; python_version == "3.9"
scipy==1.14.1 ; python_version > "3.9"
scipy==1.14.1 ; python_version > "3.9" and python_version < "3.14"
scipy==1.16.2 ; python_version >= "3.14"
# Pin SciPy because of failing distribution tests (see #60347)
#Description: scientific python
#Pinned versions: 1.10.1
@@ -321,7 +323,8 @@ pywavelets==1.7.0 ; python_version >= "3.12"
#Pinned versions: 1.4.1
#test that import:

lxml==5.3.0
lxml==5.3.0 ; python_version < "3.14"
lxml==6.0.2 ; python_version >= "3.14"
#Description: This is a requirement of unittest-xml-reporting

PyGithub==2.3.0
@@ -331,7 +334,9 @@ sympy==1.13.3
#Pinned versions:
#test that import:

onnx==1.19.1
onnx==1.19.1 ; python_version < "3.14"
# Unpin once Python 3.14 is supported. See onnxruntime issue 26309.
onnx==1.18.0 ; python_version == "3.14"
#Description: Required by onnx tests, and mypy and test_public_bindings.py when checking torch.onnx._internal
#Pinned versions:
#test that import:
@@ -356,7 +361,7 @@ pwlf==2.2.1
#test that import: test_sac_estimator.py

# To build PyTorch itself
pyyaml==6.0.2
pyyaml==6.0.3
pyzstd
setuptools==78.1.1
packaging==23.1
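Note on the version-conditional pins in this file: a rough sketch, assuming bash, of which numpy pin applies for a given interpreter. The function `numpy_pin_for` is hypothetical and only approximates the markers on the two numpy lines; it assumes a "3.N" version string, whereas pip evaluates the real markers under PEP 508 rules.

```shell
#!/usr/bin/env bash
# Hypothetical helper: approximates the environment markers on the numpy pins.
# Assumes a CPython "3.N" version string; this is not how pip evaluates markers.
numpy_pin_for() {
  local py="$1"
  local minor=${py#3.}   # strip the leading "3." so "3.12" becomes "12"
  if (( minor >= 14 )); then
    echo "numpy==2.3.4"
  elif (( minor > 9 )); then
    echo "numpy==2.1.2"
  else
    # No numpy line matches here: the 3.9 pin is no longer listed.
    echo "none"
  fi
}

numpy_pin_for 3.9
numpy_pin_for 3.12
numpy_pin_for 3.14
```

The same split at 3.14 recurs for pandas, optree, scipy, lxml, and onnx in the hunks above, so each interpreter version resolves to exactly one pin per package.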
5 changes: 4 additions & 1 deletion .ci/docker/ubuntu-xpu/Dockerfile
@@ -54,12 +54,15 @@ ENV OPENSSL_DIR /opt/openssl
RUN rm install_openssl.sh

ARG INDUCTOR_BENCHMARKS
ARG ANACONDA_PYTHON_VERSION
ENV ANACONDA_PYTHON_VERSION=$ANACONDA_PYTHON_VERSION
COPY ./common/install_inductor_benchmark_deps.sh install_inductor_benchmark_deps.sh
COPY ./common/common_utils.sh common_utils.sh
COPY ci_commit_pins/huggingface-requirements.txt huggingface-requirements.txt
COPY ci_commit_pins/timm.txt timm.txt
COPY ci_commit_pins/torchbench.txt torchbench.txt
RUN if [ -n "${INDUCTOR_BENCHMARKS}" ]; then bash ./install_inductor_benchmark_deps.sh; fi
RUN rm install_inductor_benchmark_deps.sh common_utils.sh timm.txt huggingface-requirements.txt
RUN rm install_inductor_benchmark_deps.sh common_utils.sh timm.txt huggingface-requirements.txt torchbench.txt

# Install XPU Dependencies
ARG XPU_VERSION
2 changes: 1 addition & 1 deletion .ci/lumen_cli/pyproject.toml
@@ -6,7 +6,7 @@ dependencies = [
"GitPython==3.1.45",
"docker==7.1.0",
"pytest==7.3.2",
"uv==0.9.5"
"uv==0.9.6"
]

[tool.setuptools]
8 changes: 7 additions & 1 deletion .ci/magma-rocm/Makefile
@@ -1,7 +1,7 @@
SHELL=/usr/bin/env bash

DOCKER_CMD ?= docker
DESIRED_ROCM ?= 7.0
DESIRED_ROCM ?= 7.1
DESIRED_ROCM_SHORT = $(subst .,,$(DESIRED_ROCM))
PACKAGE_NAME = magma-rocm
# inherit this from underlying docker image, do not pass this env var to docker
@@ -16,6 +16,7 @@ DOCKER_RUN = set -eou pipefail; ${DOCKER_CMD} run --rm -i \
magma-rocm/build_magma.sh

.PHONY: all
all: magma-rocm71
all: magma-rocm70
all: magma-rocm64

@@ -24,6 +25,11 @@ clean:
$(RM) -r magma-*
$(RM) -r output

.PHONY: magma-rocm71
magma-rocm71: DESIRED_ROCM := 7.1
magma-rocm71:
$(DOCKER_RUN)

.PHONY: magma-rocm70
magma-rocm70: DESIRED_ROCM := 7.0
magma-rocm70:
6 changes: 3 additions & 3 deletions .ci/magma-rocm/build_magma.sh
@@ -6,8 +6,8 @@ set -eou pipefail
# The script expects DESIRED_CUDA and PACKAGE_NAME to be set
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"

# https://github.com/icl-utk-edu/magma/pull/65
MAGMA_VERSION=d6e4117bc88e73f06d26c6c2e14f064e8fc3d1ec
# post merge of https://github.com/icl-utk-edu/magma/pull/65
MAGMA_VERSION=c0792ae825fb36872784892ea643dd6f3456bc5f

# Folders for the build
PACKAGE_FILES=${ROOT_DIR}/magma-rocm/package_files # metadata
@@ -20,7 +20,7 @@ mkdir -p ${PACKAGE_DIR} ${PACKAGE_OUTPUT}/linux-64 ${PACKAGE_BUILD} ${PACKAGE_RE

# Fetch magma sources and verify checksum
pushd ${PACKAGE_DIR}
git clone https://github.com/jeffdaily/magma
git clone https://github.com/icl-utk-edu/magma
pushd magma
git checkout ${MAGMA_VERSION}
popd
2 changes: 1 addition & 1 deletion .ci/pytorch/build.sh
@@ -426,7 +426,7 @@ fi
if [[ "$BUILD_ENVIRONMENT" != *libtorch* && "$BUILD_ENVIRONMENT" != *bazel* ]]; then
# export test times so that potential sharded tests that'll branch off this build will use consistent data
# don't do this for libtorch as libtorch is C++ only and thus won't have python tests run on its build
python tools/stats/export_test_times.py
PYTHONPATH=. python tools/stats/export_test_times.py
fi
# don't do this for bazel or s390x or riscv64 as they don't use sccache
if [[ "$BUILD_ENVIRONMENT" != *s390x* && "$BUILD_ENVIRONMENT" != *riscv64* && "$BUILD_ENVIRONMENT" != *-bazel-* ]]; then
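Note on the `PYTHONPATH=.` prefix above: a minimal sketch, assuming bash, of how a per-command assignment is scoped. The reason for the change is not stated in the diff; presumably it makes the repository root importable for `tools/stats/export_test_times.py`. The `sh -c` child below is a stand-in for the Python invocation.

```shell
#!/usr/bin/env bash
# Illustration only: an assignment placed before a command is exported to
# that one command and does not persist in the calling shell.
unset PYTHONPATH

# Stand-in for "PYTHONPATH=. python tools/stats/export_test_times.py".
PYTHONPATH=. sh -c 'echo "inside: ${PYTHONPATH}"'

# Back in the calling shell the variable is still unset.
echo "after: ${PYTHONPATH:-<unset>}"
```

Scoping the variable to the single command keeps later steps in build.sh from inheriting a modified import path.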
8 changes: 6 additions & 2 deletions .ci/pytorch/test.sh
@@ -572,6 +572,8 @@ fi

if [[ "${TEST_CONFIG}" == *cpu* ]]; then
DYNAMO_BENCHMARK_FLAGS+=(--device cpu)
elif [[ "${TEST_CONFIG}" == *xpu* ]]; then
DYNAMO_BENCHMARK_FLAGS+=(--device xpu)
else
DYNAMO_BENCHMARK_FLAGS+=(--device cuda)
fi
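Note on the device selection above: a sketch, assuming bash, of the if/elif chain after the XPU branch is added. The function `benchmark_device` and the sample config names are hypothetical; only the glob tests and device names come from the hunk.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in test.sh): mirrors the chain that picks the
# value passed to --device in DYNAMO_BENCHMARK_FLAGS.
benchmark_device() {
  local TEST_CONFIG="$1"
  if [[ "${TEST_CONFIG}" == *cpu* ]]; then
    echo cpu
  elif [[ "${TEST_CONFIG}" == *xpu* ]]; then
    echo xpu
  else
    echo cuda
  fi
}

# Sample config names are made up for illustration.
benchmark_device cpu_inductor_torchbench
benchmark_device inductor_torchbench_perf_xpu
benchmark_device inductor_huggingface
```

The branches are checked in order, so a config name containing both "cpu" and "xpu" would resolve to cpu, and anything matching neither still defaults to cuda.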
@@ -665,6 +667,8 @@ test_perf_for_dashboard() {
device=cuda_b200
elif [[ "${TEST_CONFIG}" == *rocm* ]]; then
device=rocm
elif [[ "${TEST_CONFIG}" == *xpu* ]]; then
device=xpu
fi

for mode in "${modes[@]}"; do
@@ -1649,7 +1653,7 @@ test_operator_microbenchmark() {

cd "${TEST_DIR}"/benchmarks/operator_benchmark

for OP_BENCHMARK_TESTS in matmul mm addmm bmm; do
for OP_BENCHMARK_TESTS in matmul mm addmm bmm conv; do
$TASKSET python -m pt.${OP_BENCHMARK_TESTS}_test --tag-filter long \
--output-json-for-dashboard "${TEST_REPORTS_DIR}/operator_microbenchmark_${OP_BENCHMARK_TESTS}_compile.json" \
--benchmark-name "PyTorch operator microbenchmark" --use-compile
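Note on the benchmark loop above: a dry-run sketch, assuming bash, of what adding `conv` to the list expands to. The function `list_operator_benchmarks` and the fallback value for `TEST_REPORTS_DIR` are assumptions; the module and report name templates are taken from the hunk.

```shell
#!/usr/bin/env bash
# Hypothetical dry run (not in test.sh): prints the module and report path
# for each entry instead of running the benchmarks.
TEST_REPORTS_DIR="${TEST_REPORTS_DIR:-test/test-reports}"  # assumed fallback

list_operator_benchmarks() {
  local OP_BENCHMARK_TESTS
  for OP_BENCHMARK_TESTS in matmul mm addmm bmm conv; do
    echo "pt.${OP_BENCHMARK_TESTS}_test -> ${TEST_REPORTS_DIR}/operator_microbenchmark_${OP_BENCHMARK_TESTS}_compile.json"
  done
}

list_operator_benchmarks
```

Each list entry maps to one `pt.<name>_test` module and one JSON report, so the new entry adds a `pt.conv_test` run and a matching `operator_microbenchmark_conv_compile.json` file.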
@@ -1757,7 +1761,7 @@ elif [[ "${TEST_CONFIG}" == *torchbench* ]]; then
else
# Do this after checkout_install_torchbench to ensure we clobber any
# nightlies that torchbench may pull in
if [[ "${TEST_CONFIG}" != *cpu* ]]; then
if [[ "${TEST_CONFIG}" != *cpu* && "${TEST_CONFIG}" != *xpu* ]]; then
install_torchrec_and_fbgemm
fi
PYTHONPATH=/torchbench test_dynamo_benchmark torchbench "$id"
2 changes: 2 additions & 0 deletions .clang-tidy
@@ -60,9 +60,11 @@ performance-*,
readability-container-size-empty,
readability-delete-null-pointer,
readability-duplicate-include,
readability-named-parameter,
readability-misplaced-array-index,
readability-redundant*,
readability-simplify-subscript-expr,
readability-static-definition-in-anonymous-namespace
readability-string-compare,
-readability-redundant-access-specifiers,
-readability-redundant-control-flow,