Fix install script for all dependencies without breaking conda (Real!) #121
Fixing OSS install
... while committing coding atrocities that should get me locked up.
What's da problem?
After running the installer, `conda` crashed. Cool, I fixed that, but the fix then exposed a second error when running the GRPO app.
What are the root causes here?
OpenSSL collision
The activation script put `${CONDA_PREFIX}/lib` into the global `LD_LIBRARY_PATH`. `/usr/bin/conda` uses system Python, which then loaded conda’s `libcrypto.so.3` (older symbol set) instead of the system one built for `OPENSSL_3.4.0` → `_hashlib` import failed.

Missing `libpython` at import time
The Monarch Rust extension links dynamically against `libpython3.10.so.1.0`. After removing `${CONDA_PREFIX}/lib` from the global `LD_LIBRARY_PATH`, Python could no longer find `libpython` when importing Monarch.

What options did I consider while trying not to break everything?
A. Scope LD only to Python invocations, aka a shim around python
Wrap `python`/`python3` to add `${CONDA_PREFIX}/lib` just for that process. This keeps system tools (e.g., `/usr/bin/conda`) clean while satisfying extensions that need `libpython`.
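
Option A can be sketched as shell functions sourced from an `activate.d` script. This is a minimal sketch and not necessarily the exact shim in this PR: the helper name `_with_conda_lib` and the script location are assumptions made for illustration.

```shell
# Hypothetical activate.d snippet, e.g. a file sourced from
# ${CONDA_PREFIX}/etc/conda/activate.d/ (the file name is up to you).

# Run a command with ${CONDA_PREFIX}/lib prepended to LD_LIBRARY_PATH
# for that one process only; the calling shell's value is left untouched.
_with_conda_lib() {
  LD_LIBRARY_PATH="${CONDA_PREFIX}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" command "$@"
}

# Shell functions shadow the executables of the same name; `command`
# bypasses function lookup, so these wrappers do not recurse.
python()  { _with_conda_lib python "$@"; }
python3() { _with_conda_lib python3 "$@"; }
```

Because these are shell functions, they only affect `python` typed in an activated shell. Scripts that call the interpreter by absolute path or through a shebang bypass them.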

B. Rebuild Monarch with a stable ABI (preferred long-term)
Build the Rust extension with PyO3’s `extension-module` + `abi3` (e.g., `abi3-py310`) so it doesn’t link to `libpython` at all. This eliminates the need for any LD hacks and yields cross-minor CPython compatibility.

What this PR does (current fix)
- Removes `${CONDA_PREFIX}/lib` from the global `LD_LIBRARY_PATH` (prevents OpenSSL collisions).
- Wraps `python`/`python3` in `activate.d`, so only Python gets `${CONDA_PREFIX}/lib`.
- Limits `LD_LIBRARY_PATH` to CUDA `compat/` only (driver shims) to avoid CUDA/version issues without touching unrelated system libs.

Why Option A now?
- Restores `conda` functionality and unblocks GRPO for OSS users immediately.

Long-term plan (Option B)
- Rebuild Monarch with PyO3 (`features = ["extension-module", "abi3", "abi3-py310"]`) and ship an `abi3` wheel.
- Benefits: no `libpython` dependency, fewer version pinning issues, simpler manylinux compliance, no `LD_LIBRARY_PATH` tricks.

HOW DO YOU KNOW IT WORKS?!?!?!
- `conda --version` works (no OpenSSL error).
- `python -c "import torch, vllm; ..."` succeeds.
- Ran `python -m apps.grpo.main`.
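
The first two checks above can be bundled into a small smoke-test script. This is a sketch under assumptions: `check` and `run_smoke_checks` are hypothetical helper names, and the import check covers only `import torch, vllm`, not the elided remainder of the original one-liner.

```shell
# Run a command quietly and report PASS/FAIL with a label.
# Returns non-zero when the command fails.
check() {
  label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: ${label}"
  else
    echo "FAIL: ${label}"
    return 1
  fi
}

# Hypothetical wrapper around the checks listed above; call it from an
# activated environment. The GRPO app is left out here on the assumption
# that it is long-running; run it separately.
run_smoke_checks() {
  status=0
  check "conda --version exits cleanly" conda --version || status=1
  check "torch and vllm import" python -c "import torch, vllm" || status=1
  return "$status"
}
```

Calling `run_smoke_checks` prints one PASS/FAIL line per check and returns non-zero if any of them failed, so it can gate a CI step.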

Mic-drop, Joe out