ParCIS Lab, BUPT
Popular repositories
- FlashSparse Public
FlashSparse significantly reduces the computation redundancy for unstructured sparsity (for SpMM and SDDMM) on Tensor Cores through a Swap-and-Transpose mapping strategy. FlashSparse is accepted by…
- DNN-cpp-proxies Public
C++/MPI proxies for distributed training of deep neural networks.
Repositories
- FlashSparse Public
FlashSparse significantly reduces computation redundancy for unstructured sparsity (in SpMM and SDDMM) on Tensor Cores through a Swap-and-Transpose mapping strategy. FlashSparse was accepted at PPoPP 2025.
- Chimera Public
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
- Ok-Topk Public
Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k communication volume, which is asymptotically optimal) with the decentralized parallel Stochastic Gradient Descent (SGD) optimizer. Its convergence is proved theoretically and validated empirically.
- Magicube Public
Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.
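
The FlashSparse and Magicube entries above both refer to SpMM and SDDMM without defining them. As a rough reference, the sketch below shows what the two operations compute, in plain Python. The dict-of-nonzeros layout and the function names `spmm` and `sddmm` are assumptions made for illustration only; they are not the APIs of FlashSparse or Magicube, and nothing here models Tensor Cores, quantization, or the Swap-and-Transpose mapping.

```python
# Reference semantics only: plain Python, no GPU, no quantization.
# The sparse layout (one {column: value} dict per row) is a hypothetical
# choice for readability, not the format used by either library.

def spmm(sparse_rows, dense, n_cols):
    """SpMM: C = A @ B, where A is sparse and B is dense.

    sparse_rows[i] holds the nonzeros of row i of A as {column: value}.
    dense is B as a list of rows; n_cols is the number of columns of B.
    """
    out = [[0.0] * n_cols for _ in sparse_rows]
    for i, row in enumerate(sparse_rows):
        for k, a_ik in row.items():
            for j in range(n_cols):
                out[i][j] += a_ik * dense[k][j]
    return out


def sddmm(mask_rows, lhs, rhs):
    """SDDMM: C = S * (X @ Y), evaluated only where S is nonzero.

    mask_rows[i] holds the nonzeros of row i of S as {column: value}.
    lhs is X (m x d) and rhs is Y (d x n), both as lists of rows.
    """
    out = []
    for i, row in enumerate(mask_rows):
        out_row = {}
        for j, s_ij in row.items():
            dot = sum(lhs[i][k] * rhs[k][j] for k in range(len(rhs)))
            out_row[j] = s_ij * dot
        out.append(out_row)
    return out


A = [{0: 2.0}, {1: 3.0, 2: 1.0}]          # 2 x 3 sparse matrix
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 x 2 dense matrix
C = spmm(A, B, n_cols=2)

S = [{0: 1.0}, {1: 2.0}]                  # 2 x 2 sparse matrix
X = [[1.0, 2.0], [3.0, 4.0]]              # 2 x 2 dense matrix
Y = [[5.0, 6.0], [7.0, 8.0]]              # 2 x 2 dense matrix
D = sddmm(S, X, Y)
```

In both functions the work scales with the number of nonzeros rather than with the full matrix size, which is the property that sparse kernels exploit.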