-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] rmm unconditionally imports both torch and cupy if they are in the environment #1211
Comments
wence-
added
? - Needs Triage
Need team to review and classify
bug
Something isn't working
labels
Feb 16, 2023
Great find. That would be great, thanks! |
harrism
added
Python
Related to RMM Python API
and removed
? - Needs Triage
Need team to review and classify
labels
Feb 22, 2023
3 tasks
rapids-bot bot
pushed a commit
that referenced
this issue
Feb 27, 2023
…1221) RMM provides callbacks to configure third-party libraries to use RMM for memory allocation. Previously, these were defined in the top-level package, but that requires (potentially expensive) import of the package we're providing a hook for, since typically we must import that package to define the callback. This makes importing RMM expensive. To avoid this, move the callbacks into (not imported by default) sub-modules in `rmm.allocators`. So, if we want to configure the CuPy allocator, we now import `rmm_cupy_allocator` from `rmm.allocators.cupy`, and don't pay the price of importing pytorch. This change **deprecates** the use of the allocator callbacks in the top-level `rmm` module in favour of explicit imports from the relevant `rmm.allocators.XXX` sub-module. Before these changes, a sampling trace of `import rmm` with pyinstrument shows: $ pyinstrument -i 0.01 importrmm.py _ ._ __/__ _ _ _ _ _/_ Recorded: 10:19:56 Samples: 67 /_//_/// /_\ / //_// / //_'/ // Duration: 0.839 CPU time: 0.837 / _/ v4.4.0 Program: importrmm.py 0.839 <module> importrmm.py:1 └─ 0.839 <module> rmm/__init__.py:1 ├─ 0.315 <module> rmm/allocators/torch.py:1 │ └─ 0.315 <module> torch/__init__.py:1 │ [96 frames hidden] torch, <built-in>, enum, inspect, tok... ├─ 0.297 <module> rmm/mr.py:1 │ └─ 0.297 <module> rmm/_lib/__init__.py:1 │ ├─ 0.216 <module> numba/__init__.py:1 │ │ [140 frames hidden] numba, abc, <built-in>, importlib, em... │ ├─ 0.040 <module> numba/cuda/__init__.py:1 │ │ [34 frames hidden] numba, asyncio, ssl, <built-in>, re, ... │ ├─ 0.030 __new__ enum.py:180 │ │ [5 frames hidden] enum, <built-in> │ └─ 0.011 [self] None └─ 0.227 <module> rmm/allocators/cupy.py:1 └─ 0.227 <module> cupy/__init__.py:1 [123 frames hidden] cupy, pytest, _pytest, attr, <built-i... That is, almost a full second to import things, most of which is spent importing pytorch and cupy. These modules are not needed in normal usage of RMM, so we can defer the imports. Numba is a little bit trickier, but we can also defer up-front imports, with a final result that after these changes the same `import rmm` call takes just a tenth of a second: $ pyinstrument -i 0.01 importrmm.py _ ._ __/__ _ _ _ _ _/_ Recorded: 10:37:40 Samples: 9 /_//_/// /_\ / //_// / //_'/ // Duration: 0.099 CPU time: 0.099 / _/ v4.4.0 Program: importrmm.py 0.099 <module> importrmm.py:1 └─ 0.099 <module> rmm/__init__.py:1 └─ 0.099 <module> rmm/mr.py:1 └─ 0.099 <module> rmm/_lib/__init__.py:1 ├─ 0.059 <module> numpy/__init__.py:1 │ [31 frames hidden] numpy, re, sre_compile, <built-in>, s... ├─ 0.020 __new__ enum.py:180 │ [2 frames hidden] enum ├─ 0.010 <module> ctypes/__init__.py:1 │ [3 frames hidden] ctypes, <built-in> └─ 0.010 _EnumDict.__setitem__ enum.py:89 [3 frames hidden] enum Closes #1211. Authors: - Lawrence Mitchell (https://github.com/wence-) - Bradley Dice (https://github.com/bdice) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Bradley Dice (https://github.com/bdice) URL: #1221
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the bug
As investigation in rapidsai/cudf#627, it turns out that RMM always imports cupy and torch, even if the user is never going to use them. This is expensive in the sense that it makes "import rmm" take more than half a second (even though the package itself is very small).
It would be nice to load those packages lazily/optionally.
Happy to take this on.
The text was updated successfully, but these errors were encountered: