-
Notifications
You must be signed in to change notification settings - Fork 214
Add conda-specific search into load_nvidia_dynamic_lib()
#1003
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add conda-specific search into load_nvidia_dynamic_lib()
#1003
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
f1482bd
to
5b3c5eb
Compare
/ok to test |
This comment has been minimized.
This comment has been minimized.
…ad_nvidia_dynamic_lib_helper.py more robust to make it less likely that bugs like 2642d35 slip in.
/ok to test |
cuda_pathfinder/cuda/pathfinder/_dynamic_libs/find_nvidia_dynamic_lib.py
Outdated
Show resolved
Hide resolved
self._find_using_lib_dir(cuda_home_lib_dir) | ||
|
||
def _find_using_lib_dir(self, lib_dir: str) -> None: | ||
if IS_WINDOWS: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is pre-existing, but it seems like this is begging for some sort of interface that you can isolate into modules and then import at the top level based on platform, similar to how to the os
and pathlib
modules work. This would likely avoid a lot of in-method or in-function branching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this twice before but backed off both times after seeing where it's going, and determining it's a net loss in terms of code organization and readability.
In contrast, it was clearly a net win for the rest of the code underneath load_nvidia_dynamic_lib.py
, which also started out as interleaved code.
if result.returncode != 0: | ||
raise_child_process_failed() | ||
assert not result.stderr | ||
if result.stdout.startswith("CHILD_LOAD_NVIDIA_DYNAMIC_LIB_HELPER_DYNAMIC_LIB_NOT_FOUND_ERROR:"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some reason not to use stderr here, since that is its intended use (communicating errors)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was a deliberate choice I made while working on commit edff6ec.
See the bug-fix commit right before (2642d35) for why I worked on it.
The requirements on my mind:
-
I didn't want to start fiddling with exit codes (
result.returncode
). -
I wanted to make it so that a non-zero error code always leads to an exception.
-
I also wanted to get away from searching for substrings that are not guaranteed to be conclusive:
"DynamicLibNotFoundError: Failure finding " in result.stderr
So I introduced a tag that is practically certain to be unique: CHILD_LOAD_NVIDIA_DYNAMIC_LIB_HELPER_DYNAMIC_LIB_NOT_FOUND_ERROR:
, to be sent via stdout
for the reasons above.
I added assert os.path.isfile(abs_path) # double-check the abs_path
to be sure if I miss the tag somehow, it'll not slip through unnoticed.
2. **Conda environment** | ||
- Conda installations are discovered via ``CONDA_PREFIX``, which is | ||
predefined in activated conda environments (see |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
predefined in activated conda environments (see | |
defined in activated conda environments (see |
I don't think the pre
prefix is making a useful distinction beyond just "defined".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to "defined automatically": commit bd5e2ae
This is for people not-so-familiar with Conda. That one extra word is to reassure them that they don't have to think about defining the variable themselves.
Local testing: Linux, Conda, CTK13
|
…o `_load_lib_no_cache()`. Rename `found` variable to `finder` to make the design change (de-centralizing the search order) more obvious.
I'm intentionally holding off re-running the tests, in case there are more requests for changes. I retested locally with /usr/local/cuda CTK 13, site-packages CTK 12 & 13, Conda Windows & Linux x CTK 12 & 13. Therefore it seems very unlikely that CI testing will fail. |
cuda_pathfinder/cuda/pathfinder/_dynamic_libs/load_nvidia_dynamic_lib.py
Outdated
Show resolved
Hide resolved
…d not find anything.
cuda_pathfinder/cuda/pathfinder/_dynamic_libs/find_nvidia_dynamic_lib.py
Outdated
Show resolved
Hide resolved
cuda_pathfinder/cuda/pathfinder/_dynamic_libs/find_nvidia_dynamic_lib.py
Outdated
Show resolved
Hide resolved
I'll see if I can fix #1007 before triggering the CI here again. |
…module:: cuda.pathfinder` in cuda_pathfinder/docs/source/release/*-notes.rst
…warning: /home/rgrossekunst/forked/cuda-python/cuda_pathfinder/docs/source/contribute.rst:14: WARNING: Bullet list ends without a blank line; unexpected unindent. [docutils]
WARNING: html_static_path entry '_static' does not exist
/ok to test |
|
Add a Conda-specific search into
load_nvidia_dynamic_lib()
. This makes the search more intuitive and predictable.This PR is essentially two in one:
1. Production code changes:
2. Bug fix in test and related cleanup:
The production code changes are best understood by reviewing:
load_nvidia_dynamic_lib
docstringThe changes are surgical, based on reusing existing logic for locating dynamic libraries under
CUDA_HOME
.Currently CI testing does not include testing with Conda. The interesting parts of the outputs from additional manual local testing are below:
Piggy-backed:
Closes #1007