Publications
Ryan Swann, Muhammad Osama, Karthik Sangaiah, and Jalal Mahmud. Seer: Predictive Runtime Kernel Selection for Irregular Problems. In Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, CGO 2024, March 2024.
Muhammad Osama, Serban D. Porumbescu, and John D. Owens. A Programming Model for GPU Load Balancing. In Proceedings of the 28th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2023, February–March 2023.
Muhammad Osama, Duane Merrill, Cris Cecka, Michael Garland, and John D. Owens. Stream-K: Work-centric Parallel Decomposition for Dense Matrix-Matrix Multiplication on the GPU. arXiv, January 2023. Appeared as a poster paper in Proceedings of the 28th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2023, February–March 2023.
Muhammad Osama. GPU Load Balancing. Doctoral dissertation, University of California, Davis, December 2022.
Cameron Shinn, Collin Michael McCarthy, Muhammad Osama, Saurav Muralidharan, and John D. Owens. The Sparsity Roofline. arXiv.
Muhammad Osama, Serban D. Porumbescu, and John D. Owens. Essentials of Parallel Graph Analytics. In Proceedings of the 36th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2022, pages 314–317, May 2022.
Muhammad Osama, Minh Truong, Carl Yang, Aydin Buluç, and John D. Owens. Graph Coloring on the GPU. In Proceedings of the 33rd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019, pages 231–240, May 2019.
Yuechao Pan, Muhammad Osama, and John D. Owens. Synchronous vs. Asynchronous GPU Graph Frameworks. The 7th Workshop on Multi-core and Rack Scale Systems (MARS), April 2017.
Yangzihao Wang, Yuechao Pan, Andrew Davidson, Yuduo Wu, Carl Yang, Leyuan Wang, Muhammad Osama, Chenshan Yuan, Weitang Liu, Andy T. Riffel, and John D. Owens. Gunrock: GPU Graph Analytics. ACM Transactions on Parallel Computing (TOPC), January 2017.
© 2023 Muhammad Osama
🐧 Home
- mosama at ucdavis dot edu
- github/neoblizz
- Resume
Ph.D. works and important research contributions.
- Gunrock - GPU Graph Analytics
- Load-balancing - Irregular-Parallel Computations on GPUs
- Essentials - GPU/C++ Graph Analytics Simplified
- Essentials of Parallel Graph Analytics
- 🔒Improved Scheduling for Dense Linear-Algebra
- 🔒Load balancing Sparse-Tensor Tensor (SpTT) Contractions
- 🔒GPU Fusion/Mosaic: Multi-GPU Virtualization in CUDART and CUDA Driver
Some research, some fun stuff.
- Boids++
- Blender & Inverse Kinematics
- CUDAGL
- CUPTI++
- Capturing Conditional Inheritance in C++
- Cybersecurity: Netflow Graph Processing using Gunrock
Projects I have planned for the future.
- __ignore__ keyword (CUDA)
- Guide to CUDA Optimizations
- ...
- gunrock/loops
- gunrock/gunrock
- stdgraph/graph-v2
- gunrock/essentials
- moderngpu/moderngpu
- spmv
- gunrock/essentials-cpp
- cupti-plus-plus
My personal blog.
- ...
Random, useful notes.
- HIPYFY.me
- Speed-of-light analysis of SpMM
- Streams synchronization by Stephen Jones
- Streaming data as iterators
- lb
- C++ Tricks by Daisy Hollman