Issues: NVIDIA/Fuser
Issues list
max_persistent_buffer_size may be smaller than total_reduction_numel
bug
Something isn't working
#4075
opened Mar 13, 2025 by
naoyam
Persistent buffer with broadcast results in inconsistent parallelization
bug
Something isn't working
#4074
opened Mar 13, 2025 by
naoyam
Refactor IndexLowering::handle(const LoadStoreOp* ldst)
enhancement
New feature or request
#4058
opened Mar 11, 2025 by
rdspring1
RFE: Take contiguity caching into nvFuser
enhancement
New feature or request
#4043
opened Mar 7, 2025 by
csarofeen
inplace update done via aliased outputs should have more strict checks
#4036
opened Mar 7, 2025 by
jjsjann123
benchmarking suite should initialize cuda graphs / profiler interaction
Python Benchmarks
#4008
opened Mar 4, 2025 by
tfogal
take_along_axis validation error (race condition?)
bug
Something isn't working
#4003
opened Mar 3, 2025 by
naoyam
checking for compatible allocation domain on Fusion::replaceOutput
#3994
opened Feb 28, 2025 by
jjsjann123
cudaErrorMisalignedAddress when sweeping matmul problems with NN, TN, and TT layouts.
Matmuls
#3966
opened Feb 25, 2025 by
rdspring1
Incorrect results when problem size M is not divisible by 16.
Matmuls
#3963
opened Feb 25, 2025 by
rdspring1
Optimize TMA Store logic to handle pipelining and aliasing.
Matmuls
#3961
opened Feb 25, 2025 by
rdspring1
Missing block sync after epilogue compute but before stmatrix (Correctness)
Matmuls
#3960
opened Feb 25, 2025 by
rdspring1
Swizzle tiles in matmul without introducing larger grid due to nondivisible splits
Matmuls
#3942
opened Feb 21, 2025 by
jacobhinkle
Allow separate sub-DAG for load and compute warp groups with warp-specialized circular buffering.
Matmuls
TMA
#3941
opened Feb 21, 2025 by
rdspring1
MarkAliasesPrepare to recognize meta ops with DID loop split.
allocation domain
issues related to allocation domain support
Multi-GPU
#3902
opened Feb 15, 2025 by
wujingyue
Fix ReorderShardedAxis and MakeReshardingContiguous for DID loop split.
Multi-GPU
#3900
opened Feb 15, 2025 by
wujingyue
Feature request: Consider privatization instead of forwarding in fusion segmentation
Segmentation
Issues related to nvFuser Segmentation
#3832
opened Feb 5, 2025 by
naoyam