[Build] - Only use pre-compiled Legion when Python library in use is stored at /opt/conda/lib/libpython3.9.so #465

Merged
goliaro merged 3 commits into flexflow:master from fix_python_path on Nov 18, 2022

goliaro
Collaborator

@goliaro goliaro commented Nov 10, 2022

Description of changes:

This PR introduces additional checks to ensure that we only use the pre-compiled Legion library when the configuration of the user's machine matches the one used when pre-compiling. In particular, we only use the pre-built library if the version of Python in use has its library file stored at /opt/conda/lib/libpython3.9.so. This is a bit limiting, but unfortunately the absolute path to the Python library is hard-coded deep in the Legion codebase at compile time, and there is no easy way to override it (or recompile the relevant part of the Legion code) when building FlexFlow.
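
For readers who want to see what such a check amounts to, here is a minimal sketch in Python. It is not the code from this PR (the actual check lives in the build configuration); the function names `libpython_path` and `can_use_prebuilt_legion` are hypothetical, and the sketch assumes the shared library is named `libpython<major.minor>.so` inside the interpreter's `LIBDIR`.

```python
import posixpath
import sysconfig

# Path that was hard-coded into the pre-compiled Legion library (from the PR description).
EXPECTED_LIBPYTHON = "/opt/conda/lib/libpython3.9.so"


def libpython_path(config_vars=None):
    """Return the expected location of the interpreter's shared library, or None.

    `config_vars` is a hypothetical hook for testing; by default the values
    come from the running interpreter via sysconfig.
    """
    cfg = sysconfig.get_config_vars() if config_vars is None else config_vars
    libdir = cfg.get("LIBDIR")
    version = cfg.get("py_version_short")
    if not libdir or not version:
        # Some platforms do not define LIBDIR; treat that as "unknown".
        return None
    # Assumption: the library follows the libpython<major.minor>.so naming scheme.
    return posixpath.join(libdir, f"libpython{version}.so")


def can_use_prebuilt_legion(config_vars=None, expected=EXPECTED_LIBPYTHON):
    """True only if the interpreter's library sits at the hard-coded path."""
    return libpython_path(config_vars) == expected


if __name__ == "__main__":
    print("libpython path:", libpython_path())
    print("use pre-built Legion:", can_use_prebuilt_legion())
```

On a machine where Python comes from Miniconda installed under /opt/conda with Python 3.9, the check would be expected to pass; anywhere else it falls back to building Legion from source.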

After this change, we should still be able to use the pre-built version of Legion in CI, in Docker, and on any other machine where Python is installed at the default path through Miniconda.

Related Issues:

Linked Issues:

Issues closed by this PR:

Before merging:

  • Did you update the flexflow-third-party repo, if modifying any of the CMake files, the build configs, or the submodules?

@williamberman
Collaborator

Don't have a tremendous amount of context here, but would it be possible to make an actual code change to the flexflow branch of Legion?

I feel like the options in order of preference should be

  1. Code update to Legion
  2. A .patch file that is applied on top of the Legion branch we use. That patch is preferably applied during CI when the package is built. If it can't be reapplied, it is distributed along with the package and then applied at FlexFlow build time.
  3. FlexFlow maintains a .patch file in its source that is applied to Legion at build time

Apologies if I'm missing something! It's late and I'm scanning this on my phone :)

@lockshaw
Collaborator

@gabrieleoliaro Still getting the same error

@goliaro
Collaborator Author

goliaro commented Nov 11, 2022

@lockshaw It looks like the /opt/conda/lib/libpython3.9.so string is also baked into the pre-compiled librealm.so binary, so the solution from this PR is not enough. I'll look into a fix for this and keep pushing to this branch until it hopefully works.
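
One way to confirm that a path is baked into a compiled binary is to search its raw bytes for the string, much like running `strings` on the file and filtering the output. The sketch below is only an illustration of that idea, not something taken from this PR; the helper names `bytes_embed_path` and `file_embeds_path` are hypothetical, and the location of librealm.so on disk is left to the caller.

```python
import sys
from pathlib import Path

# The path reported in this thread as baked into the pre-compiled binary.
HARDCODED_PATH = b"/opt/conda/lib/libpython3.9.so"


def bytes_embed_path(data, needle=HARDCODED_PATH):
    """Return True if the byte string `needle` occurs anywhere in `data`."""
    return needle in data


def file_embeds_path(binary_path, needle=HARDCODED_PATH):
    """Read a binary file fully into memory and search it for `needle`."""
    return bytes_embed_path(Path(binary_path).read_bytes(), needle)


if __name__ == "__main__":
    # Usage: python check_embedded_path.py <path to librealm.so>
    if len(sys.argv) != 2:
        sys.exit("usage: check_embedded_path.py <shared library>")
    found = file_embeds_path(sys.argv[1])
    print("hard-coded path found" if found else "hard-coded path not found")
```

A match only shows that the string is present in the file; it does not by itself show how or when the library uses it.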

@goliaro goliaro marked this pull request as draft November 11, 2022 18:56
auto-merge was automatically disabled November 11, 2022 18:56

Pull request was converted to draft

@goliaro goliaro marked this pull request as ready for review November 18, 2022 05:22
@goliaro goliaro changed the title from "[Build] - Fix path to Python library when using pre-built Legion binaries" to "[Build] - Only use pre-compiled Legion when Python library in use is stored at /opt/conda/lib/libpython3.9.so" Nov 18, 2022
@goliaro
Collaborator Author

goliaro commented Nov 18, 2022

OK, this PR is now ready, and the issues above should be fixed for now. I opened a new issue (#478) to track the changes we will need to make in the future to fully support pre-compiled Legion in the general case, where the Python version is not 3.9 or the path to the Python library is not /opt/conda/lib/libpython3.9.so.

@goliaro goliaro enabled auto-merge (squash) November 18, 2022 05:39
@goliaro goliaro merged commit cdee5fa into flexflow:master Nov 18, 2022
@goliaro goliaro deleted the fix_python_path branch November 18, 2022 16:15
Successfully merging this pull request may close these issues.

Runtime Hang after finishing all tasks
dlopen error loading libpython3.9 when using pre-built legion
4 participants