-
Notifications
You must be signed in to change notification settings - Fork 213
Support LoadedDL.from_distribution
#1049
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
return LoadedDL(abs_path, False, handle._handle, Distribution("system", None)) | ||
return None | ||
|
||
def find_with_system_search_linux(libname: str) -> Optional[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things stand out here:
-
This function is an almost exact copy of
load_with_system_search()
and could be implemented asreturn load_with_system_search(libname).abs_path
-
But I wouldn't want to give this function a name that starts with "find" because it is actually loading the library, which is a side-effect with very strong consequences. — See also Add
find_nvidia_dynamic_library
to find the DSO location without loading #757
To maintain that the code reflects actual (inconvenient) realities, it'd be great to avoid adding the find_with_system_search_*
functions and try_with_system_search()
in find_nvidia_dynamic_lib.py
, even if that means extra wrinkles elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we switch to a version of find_with_system_search
that doesn't load, but does some kind of existence check on the file? That would allow us to have a really clean separation between "finding" and "loading"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately not, beyond the subprocess idea that's tracked already under #757.
To double-check, I just asked ChatGPT 5 Pro (which didn't exist last time I was working on it):
This was my prompt:
I believe I asked before, but could you please give this another very thorough look?
Platform: Linux only (please ignore any other platforms)
Python versions: 3.13 (please ignore any older Python versions)
Consider this example code:
import ctypes
import os
soname = "libcublas.so.13"
cdll_mode = os.RTLD_NOW | os.RTLD_GLOBAL
try:
ctypes.CDLL(soname, cdll_mode)
except OSError:
print(f"Failure loading {soname}")
else:
print(f"Success loading {soname}")
On my workstation it prints Success loading libcublas.so.13
My question: Is there any way to implement
-
find the dynamic library (libcublas.so.13 in the example) in exactly the same way,
-
but do not actually load it,
-
WITHOUT reverse engineering the search for the file to load?
It reasoned for 15 minutes!
For the full answer, please see https://chatgpt.com/share/68dc09d0-c928-8008-9a55-a5aede0dbbf8
Bottom line
-
In‑process, no‑load, exact resolution: there isn’t a public API.
-
Practical solution that’s exact but safe: spawn a one‑shot helper ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To track some pointers here:
When asking ChatGPT about a brute-force idea, surprisingly suggested a new approach:
The interesting part from there:
A practical, non‑subprocess alternative (doesn’t load the target)
If you’re willing to slightly relax “no reverse‑engineering,” there is a clean way that doesn’t load the library and doesn’t guess the search order:
Ask the dynamic linker for the exact directory search list it would use for a dlopen()
from your process, then walk that list in order to find the first directory containing libcublas.so.13
.
On glibc, this is what dlinfo(..., RTLD_DI_SERINFO, ...)
is for: it returns the actual run‑time search path the linker will follow (including LD_LIBRARY_PATH
, RUNPATH
/RPATH
, $ORIGIN
expansions, and system directories configured via ld.so.conf
). You then check those directories in order for the file name. That way, you’re not re‑implementing the rules—you’re reading the linker’s own computed search list—and you never map the target library.
I then asked it to generate the code, which it did in a couple minutes:
But running that code, it turns out:
RTLD_DI_SERINFO
returns the directory walk order the dynamic linker would use when it actually needs to scan directories (e.g., fromLD_LIBRARY_PATH
or aDT_RPATH
/DT_RUNPATH
on the handle you pass). It does not dump everything from/etc/ld.so.conf.d/*
. For most system/extra dirs (like your CUDA entry), glibc doesn’t “search” them at runtime; it consults the binary cache/etc/ld.so.cache
to map the SONAME → full path directly. Because there’s no directory walk involved, those directories don’t appear in the SERINFO list.
The only non-reverse-engineering way to get at the binary cache is to run ldconfig -p
. So we need a subprocess and string parsing — not great IMO.
I then asked it about the equivalent on Windows, for which it also quickly generated code, but there are also caveats.
I decided to stop here:
-
Implementing
find_nvidia_dynamic_lib()
that's mostly equivalent toload_nvidia_dynamic_lib()
seems feasible now, but it's far from straightforward. -
I'm really not sure if the value of having that independent find function justifies dealing with the limitations and extra complications.
return dl | ||
|
||
def load_with_site_packages(self) -> Optional[LoadedDL]: | ||
return self._load_simple(self.finder.try_site_packages, "site-packages") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this changes the behavior in a subtle but deep way: loading of dependencies should happen in all cases. The interactions are tricky, but that's just a consequence of the realities. The existing order
- find specific lib
- load_dependencies unconditionally (no matter what was found!)
- load specific lib
is very intentional. See the existing comments in existing lines 39-39.
I guess we could change the order to
- load_dependencies unconditionally
- find specific lib
- load specific lib
but I didn't do that because there could be exceptions in the "find specific lib" step, and we'd have the side-effects from loading the dependencies already. It's not clear-cut if that's better or worse; I decided side-effects later is probably better.
I'm thinking it'd be better to be more conservative with the code organization when adding in the Distribution
code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. lets unpack the four ways of loading things back out into their own methods then. Updates to follow.
return self._load_simple(self.finder.try_with_conda_prefix, "conda") | ||
|
||
def load_with_system_search(self) -> Optional[LoadedDL]: | ||
return self._load_with_dependencies(self.finder.try_with_system_search, "system") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, if we say "system"
here, it's not actually so much about "Distribution" but our chosen approach to searching.
Two ideas to explain what I have in mind, for discussion:
-
Instead of
Distribution
we could call the new memberFoundVia
. Technically that's more to the point. -
Or we could analyze what we found and say aha, that's in a CTK package installed from a runfile, or apt, or ... distribution.
The aha part is way more tricky. Is there any value to it, for the purposes of numba-cuda?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's enough for numba-cuda
to report the source and the path. Right now we also report the name of the package the library was found from (nvcc in this case) but if the path is reported I think it's less critical to have that as well.
Finding nvvm from NVIDIA NVCC Wheel
Located at /raid/mambaforge/envs/numba-cuda/lib/python3.12/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
self.error_messages: list[str] = [] | ||
self.attachments: list[str] = [] | ||
self.abs_path: Optional[str] = None | ||
self.distribution: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not used anymore?
@dataclass | ||
class FoundVia: | ||
name: str | ||
version: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently not used: I'd lean towards not adding that in this PR, but only when we have an actual use case, which might give us ideas for a different name (or names).
abs_path: Optional[str] | ||
was_already_loaded_from_elsewhere: bool | ||
_handle_uint: int # Platform-agnostic unsigned pointer value | ||
foundvia: Optional[FoundVia] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
found_via
highly preferred (reads better, and seems to be most common when mapping from camel case to snake case).
finder.raise_not_found_error() | ||
|
||
return load_with_abs_path(libname, abs_path) | ||
def _load_lib_no_cache(libname: str) -> LoadedDL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to move this below the class _LoadNvidiaDynamicLib
code (it's more in line with how I organized the rest of the pathfinder code).
Also, this could be a one-liner:
return _LoadNvidiaDynamicLib(libname).load_lib()
loaded = load_with_system_search(self.libname) | ||
if loaded is not None: | ||
return loaded | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return load_with_system_search(self.libname)
self.finder = _FindNvidiaDynamicLib(libname) | ||
self.libname = self.finder.libname | ||
|
||
def _maybe_return_loaded(self, check_if_loaded_from_elsewhere: bool = True) -> Optional[LoadedDL]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method should run only once. Currently it's run up to four times.
check_if_loaded_from_elsewhere
→ have_abs_path
have_abs_path
should depend only on the outcome of _FindNvidiaDynamicLib(libname)
I realize this is all a bit twisty, but it's a consequence of all the exceedingly twisty logic that is provided by the operating systems.
I spent another hour or so today looking into find_nvidia_dynamic_lib()
as the primitive, which would make most of the logic here obsolete, but — again unfortunately — it turned out to be gnarly. Let's discuss tomorrow.
if loaded := self._maybe_return_loaded(True): | ||
return loaded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove in this group of four methods and call once at the top of load_lib()
.
load_dependencies(self.libname, load_nvidia_dynamic_lib) | ||
|
||
if loaded is not None: | ||
return loaded | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return load_dependencies(self.libname, load_nvidia_dynamic_lib)
seems equivalent.
return dl | ||
if dl := self.load_with_cuda_home(): | ||
dl.foundvia = FoundVia("CUDA_HOME") | ||
return dl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the unifying group of four methods, you could use a loop like this:
if loaded := self._maybe_return_loaded(self.finder.abs_path is not None):
return loaded
for load_meth, found_via in (
(self.load_with_site_packages, "site-packages"),
(self.load_with_conda_prefix, "conda"),
(self.load_with_system_search, "system-search"),
(self.load_with_cuda_home, "CUDA_HOME"),
):
if loaded := load_meth():
loaded.foundvia = FoundVia(found_via)
return loaded
|
||
from typing import Optional | ||
|
||
def _load_lib_no_cache(libname: str) -> LoadedDL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect it'll be simpler (less code) to wrap this function in a helper that adds found_via
into LoadedDL
.
def _load_lib_no_cache_impl(libname: str) -> LoadedDL, str
existing code, but we also keep track of `found_via` and return (loaded, found_via)
def _load_lib_no_cache(libname: str) -> LoadedDL:
loaded, found_via = _load_lib_no_cache_impl(libname)
loaded.found_via = FoundVia(found_via)
return loaded
Alternatively, we could pass found_via
down along with abs_path
, so that the LoadedDL
instances can be constructed with it, rather than retrofitting it in later. That's a bit more surgical, not sure how it'll shake out, but it would seem cleanest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a great direction and just needs some discussion of details and minimal testing (meeting briefly might be most efficient).
Thoughts:
LoadedDL.found_via
could just be a string, I'd try if we can achieve "not optional"?- One additional
found_via
value would be "was_already_loaded_from_elsewhere", which means the existingbool
is basically redundant, but we need to keep it to avoid an API breakage. - Minor but nice I think:
found_via = "system-search"
instead of"system"
- The
version
we can handle separately later. Coincidentally I did a little bit of related work under PR #1085 (_get_cuda_version_from_cuda_h
). However, in general this will be quite tricky, e.g. conda and site-packages could have libs and/or headers from more than one CTK version. Maybe the best way to handle this will actually be in the context of a "scoped search" as outlined under #1038.
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
WIP, just sketching things out currently
Closes #759