Avoid unintended eager cuda initialization #199

dkasapp · 2021-12-24T19:22:21Z

We noticed that importing the sru package eagerly triggers CUDA initialization because of the following chain of module imports: sru.modules -> sru.ops -> cuda_functional. The last module calls the load function of torch.utils.cpp_extension at import time.

We detected this while running the server framework in SUBPROCESS_MODE, which forks a new process to run the model. The error complained that CUDA had already been initialized in the parent process. That initialization was unnecessary, because the parent process is not meant to run inference on the model.

This PR makes the loading lazy. Concretely, we changed sru.modules to avoid the eager import of sru.ops and instead postpone it until the first SRUCell is instantiated.

We tested the changes by checking out this branch on an AWS instance with a GPU and running pytest -sv test, which resulted in 141 passed, 161 warnings, and no failures. We therefore understand this to be working as expected for both CPU and GPU settings.
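The regression test added in this PR is not shown in this conversation. As a hedged sketch of how such a check might look, the helper below starts a fresh interpreter, imports a package, and reports whether a given module was pulled in as a side effect. The helper name `is_imported_eagerly` is hypothetical, and checking `sys.modules` is only a proxy: a test against the real package would more likely query `torch.cuda.is_initialized()` after `import sru`.

```python
import subprocess
import sys


def is_imported_eagerly(package: str, module: str) -> bool:
    """Return True if importing `package` in a fresh interpreter also imports `module`."""
    # The probe runs in a child process so modules already loaded in the
    # current process cannot influence the result.
    probe = (
        "import importlib, sys\n"
        f"importlib.import_module({package!r})\n"
        f"print({module!r} in sys.modules)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    # Use the last line in case the package prints something on import.
    return result.stdout.strip().splitlines()[-1] == "True"
```

Under the assumption that the lazy import works as described, a call such as `is_imported_eagerly("sru", "sru.ops")` would be expected to return `False`. Running the probe in a separate interpreter also matters for ordering: once any test in the same process has instantiated a cell, the eager state can no longer be observed.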

ghost · 2021-12-27T14:29:56Z

sru/modules.py

+        SRUCell.init_elementwise_recurrence_funcs()
+
+    @classmethod
+    def init_elementwise_recurrence_funcs(cls):


Please add a docstring to this method.

ghost · 2021-12-27T14:31:09Z

sru/modules.py

@@ -27,6 +23,11 @@ class SRUCell(nn.Module):
    scale_x: Tensor
    weight_proj: Optional[Tensor]

+    initialized = False
+    elementwise_recurrence_inference = None


Please add a note about this function initialization to the SRUCell docstring or to a module docstring for sru/modules.py (the module currently lacks a docstring and probably should have one).

hpasapp and others added 10 commits December 21, 2021 16:56

fix importing sru loads cuda

38b48ec

this doesnt work, because of torchscript

07bb9bb

use env var to guard initialiaiton

8648c1f

Change circleCI to python 3.8

91fff61

migraet the init test into modules.py

7510373

Merge branch 'hp/fix-import-cuda-init' of github.com:asappresearch/sru into hp/fix-import-cuda-init

744a718

init function works

35f7322

automtically initialize, even in presence of env var

9fd778a

Revert "Change circleCI to python 3.8"

cbce3e2

This reverts commit 91fff61.

Changes for modules only

ce87ed5

dkasapp requested a review from hpasapp December 24, 2021 19:28

dkasapp assigned hpasapp Dec 24, 2021

dkasapp added 2 commits December 24, 2021 16:29

Remove unused import

ca4e006

Add a test for no eager cuda init to be run first of all

1f819c5

dkasapp requested a review from xzhang-asapp December 27, 2021 13:12

dkasapp assigned xzhang-asapp Dec 27, 2021

ghost reviewed Dec 27, 2021

dkasapp added 5 commits December 27, 2021 11:46

Added docstring

3823c4f

Attempt to fix pipeline: Bump orb and python versions

ca7aa32

Attempt to fix pipeline: Update torch wheel

edf84c6

Attempt to fix pipeline: Fix cmake installation

d18d912

Module javadoc

f54ef3d

ghost approved these changes Dec 27, 2021

dkasapp added 4 commits December 27, 2021 12:50

Bump version

24a6370

Add additional init in apply_recurrence (multi gpu issue)

62f229f

Revert "Add additional init in apply_recurrence (multi gpu issue)"

eb0da19

This reverts commit 62f229f.

Change version to 2.7.0-rc1

dabdc1a
