Pull requests: NVIDIA/TransformerEngine

Pull requests list

Added MCore FSDP support for TE
#1890 opened Jun 18, 2025 by sanandaraj5597
Handle dtypes more carefully in multi-tensor Adam (label: bug)
#1888 opened Jun 17, 2025 by timmoon10
6 of 13 tasks
[Pytorch] CP + THD + chunked attention support.
#1887 opened Jun 17, 2025 by pggPL Draft
1 of 13 tasks
pipeline aware cpu offload
#1886 opened Jun 17, 2025 by liuzhenhai93
8 tasks done
[Draft] Test PR for CI
#1885 opened Jun 16, 2025 by jberchtold-nvidia
13 tasks
Tongliu/router fusion
#1883 opened Jun 16, 2025 by Autumn1998
13 tasks
[PyTorch] Limit max time for distributed PyTorch tests (label: testing)
#1877 opened Jun 13, 2025 by timmoon10
6 of 14 tasks
Optimize reshaping tensors in the te.ops.Sequential implementation
#1876 opened Jun 12, 2025 by janekb04
7 of 13 tasks
Fix cppunittest test.sh for editable installs
#1869 opened Jun 11, 2025 by jberchtold-nvidia
8 of 13 tasks
[PyTorch|common] Optimize unpadding kernel for FP8
#1866 opened Jun 11, 2025 by xiaoxi-wangfj
9 tasks
[PyTorch] Add save_original_input in Linear/GroupedLinear to save memory
#1865 opened Jun 11, 2025 by hxbai
1 of 13 tasks
[JAX] GEMM custom op 2.5.0
#1855 opened Jun 6, 2025 by denera
8 of 13 tasks
[PyTorch Debug] Fixed the empty tensor bug in statistics computation
#1843 opened Jun 3, 2025 by pggPL
8 of 13 tasks
TE Gemma tutorial attempt#2
#1839 opened Jun 2, 2025 by sudhakarsingh27 Draft
1 task done
Make quantize_ respect the usages of the quantizer
#1836 opened May 31, 2025 by ptrendx
13 tasks
Add cuBLASMp-backed GEMM-like API to TE common
#1824 opened May 27, 2025 by mk-61
4 of 13 tasks
[PyTorch][MoE] Reduce CPU Overhead By Fuse Torch Empty Calls (label: performance)
#1793 opened May 16, 2025 by zhongbozhu
1 of 13 tasks
[PyTorch] Update PyTorch FSDP2 test to cover all TE layer types (label: testing)
#1777 opened May 12, 2025 by denera
8 of 13 tasks
[PyTorch] Draft of new activation offloading API
#1762 opened May 8, 2025 by pggPL Draft
13 tasks
cache sequence chunk ids for reordering
#1757 opened May 7, 2025 by xrennvidia Draft
13 tasks
Zr te doc edits
#1745 opened May 2, 2025 by zredeaux07
12 tasks
[PyTorch] Refactor activation offloading of quantized tensors.
#1738 opened Apr 30, 2025 by pggPL
8 of 13 tasks