cuda_filter sets cudaStream in global scope, breaking multi-stream support #5153
Labels
Release Blocker (issues that block release)
Comments
yaox12 added a commit that referenced this issue on Jan 12, 2023
frozenbugs added a commit that referenced this issue on Jan 17, 2023, a squash merge whose message includes:
* [Bugfix] Fix the call of tar.extractall() method (#5139)
* [Sparse] Add elementwise mul and div method, and polish the python doc. (#5142)
* Polish (#5149)
* polish (#5150)
* blabla (#5152)
* typo (#5133)
* fix (#5151)
* [Sparse] Reduce peak memory of SpSpAdd. (#5143)
* [Sparse] Use efficient implementation for Diag @ Sparse and Sparse @ Diag. (#5147)
* change regex (#5158)
* [Bugfix] Replace global cudaStream in Filter with runtime calls (fix #5153) (#5157)
  * Add failing unit test
  * Add fix
  * skip cpu test
* [Sparse] Use NotImplemented to let Python dispatch types (#5160)
* Delete train_sampling_unsupervised.py (#5161)
* [CI] fix device configure when run on GPU (#5154)
* [Doc] remove deprected Node/EdgeDataLoader (#5171)
* [Sparse] Fix sddmm docstring (#5169)
* [Sparse] Make functions compatible with PyTorch scalar tensors (#5163)
* [Sparse] Rename as_sparse to to_sparse, dense to to_dense. (#5170)
* [Sparse] Refactor matmul interface. (#5162)
* [Sparse] Support sparse matrix dividing scalar (#5173)
* [Sparse] Use X to represent dense tensors in sddmm (#5174)
* [Sparse] Sparse matrix subtraction (#5164)
* [BUILD] enable dgl sparse build in default (#5175)
* [Doc] sparse lib tutorial and colab links (#5181)
DominikaJedynak pushed a commit to DominikaJedynak/dgl that referenced this issue on Mar 12, 2024
🐛 Bug

This line:

https://github.com/dmlc/dgl/blob/master/src/array/cuda/cuda_filter.cu#L21

should be inside the function bodies, so that it captures the CUDA stream that is current at call time.
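To make the failure concrete, here is a minimal sketch of the buggy pattern versus the fix. All identifiers are illustrative stand-ins rather than DGL's actual code; the toy getCurrentCUDAStream/setCurrentCUDAStream pair only mimics the runtime getter introduced by #4480/#4503:

```cpp
#include <cuda_runtime.h>

// Toy stand-in for a per-thread "current stream" registry; in DGL the real
// getter lives in the runtime. These names are hypothetical.
static thread_local cudaStream_t g_current_stream = nullptr;
cudaStream_t getCurrentCUDAStream() { return g_current_stream; }
void setCurrentCUDAStream(cudaStream_t s) { g_current_stream = s; }

__global__ void FilterKernel(const int* in, int* out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = in[i];  // placeholder for the real filter logic
}

// Buggy pattern: the initializer runs once, during static initialization,
// so every later launch uses whatever stream was current at program startup.
cudaStream_t stream = getCurrentCUDAStream();

void FilterBuggy(const int* in, int* out, int n) {
  FilterKernel<<<(n + 255) / 256, 256, 0, stream>>>(in, out, n);
}

// Fixed pattern (the spirit of #5157): query the stream inside the function,
// so each call picks up the stream selected by the caller.
void FilterFixed(const int* in, int* out, int n) {
  cudaStream_t s = getCurrentCUDAStream();
  FilterKernel<<<(n + 255) / 256, 256, 0, s>>>(in, out, n);
}
```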
I believe this has been an issue since multi-stream support was officially added: what was once a constexpr default stream declared at global scope was converted by #4480 and #4503 into a call that fetches the current CUDA stream, but the call remained at global scope, so it runs once during static initialization rather than on every invocation. This bug essentially causes undefined behavior, because kernels launched on the stale stream are not ordered with respect to work on alternate CUDA streams and may therefore access uninitialized GPU memory.