bin/legate doesn't find legion_top in non-editable installation #393

Closed
rohany opened this issue Sep 28, 2022 · 10 comments · Fixed by #394

@rohany
Contributor

rohany commented Sep 28, 2022

I installed legate.core using ./install.py --install-dir ../install --verbose --no-build-isolation, and when running bin/legate I see the error:

[0 - 113f0adc0]    0.000081 {4}{threads}: reservation ('CPU proc 1d00000000000003') cannot be satisfied
[0 - 700001ade000]    0.050624 {6}{python}: unable to import Python module legion_top
ModuleNotFoundError: No module named 'legion_top'
Assertion failed: (0), function find_or_import_function, file /Users/rohany/Documents/nvidia/legate.core/_skbuild/macosx-10.15-x86_64-3.9/cmake-build/_deps/legion-src/runtime/realm/python/python_module.cc, line 230.
Signal 6 received by node 0, process 30628 (thread 700001ade000) - obtaining backtrace
Signal 6 received by process 30628 (thread 700001ade000) at: stack trace: 0 frames
@bryevdv
Contributor

bryevdv commented Sep 28, 2022

@rohany can you provide the --verbose output for both the old and the new versions? That should give the complete invocation (and the environment, for the new version), so the two can be compared.

@rohany
Contributor Author

rohany commented Sep 28, 2022

verbose output from the legate driver or from the install?

@bryevdv
Contributor

bryevdv commented Sep 28, 2022

the driver, e.g.

bldtest ❯ legate --verbose                  
WARNING: Disabling control replication for interactive run

--- Legion Python Configuration ------------------------------------------------

Legate paths:
  legate_dir       : /home/bryan/work/legate.core
  legate_build_dir : /home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build
  bind_sh_path     : /home/bryan/work/legate.core/bind.sh
  legate_lib_path  : /home/bryan/work/legate.core/build/lib

Legion paths:
  legion_bin_path       : /home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-build/bin
  legion_lib_path       : /home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-build/lib
  realm_defines_h       : /home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-build/runtime/realm_defines.h
  legion_defines_h      : /home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-build/runtime/legion_defines.h
  legion_spy_py         : /home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-src/tools/legion_spy.py
  legion_prof_py        : /home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-src/tools/legion_prof.py
  legion_python         : /home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-build/bin/legion_python
  legion_module         : /home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-src/bindings/python/build/lib
  legion_jupyter_module : /home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-src/jupyter_notebook

Command:
  /home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-build/bin/legion_python --nocr -ll:py 1 -lg:local 0 -ll:cpu 4 -ll:util 2 -ll:csize 4000 -level openmp=5 -lg:eager_alloc_percentage 50

Customized Environment:
  GASNET_MPI_THREAD=MPI_THREAD_MULTIPLE
  LD_LIBRARY_PATH=/home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-build/lib:/home/bryan/work/legate.core/build/lib
  LEGATE_MAX_DIM=4
  LEGATE_MAX_FIELDS=256
  NCCL_LAUNCH_MODE=PARALLEL
  PYTHONDONTWRITEBYTECODE=1
  PYTHONPATH=/home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-src/bindings/python/build/lib:/home/bryan/work/legate.core/_skbuild/linux-x86_64-3.9/cmake-build/_deps/legion-src/jupyter_notebook:/home/bryan/work/legate.core/legate
  REALM_BACKTRACE=1

--------------------------------------------------------------------------------

Welcome to Legion Python interactive console
>>> 

@rohany
Contributor Author

rohany commented Sep 28, 2022

Before:

➜  legate.sparse git:(main) ✗ ../install/bin/legate --verbose                                                                                                               (legate) 12:47:11
WARNING: Disabling control replication for interactive run
legate:           ../install/bin/legate
legate_dir:       /Users/rohany/Documents/nvidia/legate.sparse/../install/lib/python3.9/site-packages
bind_sh_path:     ../install/bin/bind.sh
legate_lib_path:  ../install/lib
legate_build_dir: None
legion_lib_path:  ../install/lib
realm_defines_h:  ../install/include/realm_defines.h
legion_defines_h: ../install/include/legion_defines.h
legion_spy_py:    ../install/bin/legion_spy.py
legion_prof_py:   ../install/bin/legion_prof.py
legion_python:    ../install/bin/legion_python
legion_module:    ../install/lib/python3.9/site-packages
legion_jupyter_module:    ../install/lib/python3.9/site-packages
Running: ../install/bin/legion_python --nocr -ll:py 1 -lg:local 0 -ll:cpu 4 -ll:util 2 -ll:csize 4000 -level openmp=5 -lg:eager_alloc_percentage 50
[0 - 11f66cdc0]    0.000093 {4}{threads}: reservation ('CPU proc 1d00000000000003') cannot be satisfied
Welcome to Legion Python interactive console
>>>

After:

➜  legate.core git:(legate-sparse) ✗ ../install/bin/legate --verbose                                                                                                        (legate) 12:49:34
WARNING: Disabling control replication for interactive run

--- Legion Python Configuration ------------------------------------------------

Legate paths:
  legate_dir       : /Users/rohany/Documents/nvidia/legate.core/../install/lib/python3.9/site-packages
  legate_build_dir : None
  bind_sh_path     : ../install/bin/bind.sh
  legate_lib_path  : ../install/lib

Legion paths:
  legion_bin_path       : ../install/bin
  legion_lib_path       : ../install/lib
  realm_defines_h       : ../install/include/realm_defines.h
  legion_defines_h      : ../install/include/legion_defines.h
  legion_spy_py         : ../install/bin/legion_spy.py
  legion_prof_py        : ../install/bin/legion_prof.py
  legion_python         : ../install/bin/legion_python
  legion_module         : None
  legion_jupyter_module : None

Command:
  ../install/bin/legion_python --nocr -ll:py 1 -lg:local 0 -ll:cpu 4 -ll:util 2 -ll:csize 4000 -level openmp=5 -lg:eager_alloc_percentage 50

Customized Environment:
  DYLD_LIBRARY_PATH=../install/lib:../install/lib
  GASNET_MPI_THREAD=MPI_THREAD_MULTIPLE
  LEGATE_MAX_DIM=4
  LEGATE_MAX_FIELDS=256
  NCCL_LAUNCH_MODE=PARALLEL
  PYTHONDONTWRITEBYTECODE=1
  PYTHONPATH=/Users/rohany/Documents/nvidia/legate.core/../install/lib/python3.9/site-packages/legate
  REALM_BACKTRACE=1

--------------------------------------------------------------------------------

[0 - 11958fdc0]    0.000095 {4}{threads}: reservation ('CPU proc 1d00000000000003') cannot be satisfied
[0 - 70000ff3f000]    0.061901 {6}{python}: unable to import Python module legion_top
ModuleNotFoundError: No module named 'legion_top'
Assertion failed: (0), function find_or_import_function, file /Users/rohany/Documents/nvidia/legate.core/_skbuild/macosx-10.15-x86_64-3.9/cmake-build/_deps/legion-src/runtime/realm/python/python_module.cc, line 230.
Signal 6 received by node 0, process 47751 (thread 70000ff3f000) - obtaining backtrace
Signal 6 received by process 47751 (thread 70000ff3f000) at: stack trace: 0 frames
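
In the "After" output, legion_module is None and PYTHONPATH contains only the .../site-packages/legate entry, which is consistent with the ModuleNotFoundError for legion_top. One way to check whether a given PYTHONPATH value can resolve a module, without launching legion_python, is sketched below. The helper name `find_module_on` is hypothetical and not part of legate; `PathFinder.find_spec` is from the Python standard library.

```python
import os
from importlib.machinery import PathFinder


def find_module_on(name, pythonpath):
    """Return the file that `name` would be loaded from, or None.

    Only the entries of the given PYTHONPATH-style string are searched;
    sys.path and already-imported modules are ignored.
    """
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    spec = PathFinder.find_spec(name, entries)
    return None if spec is None else spec.origin
```

Calling `find_module_on("legion_top", os.environ.get("PYTHONPATH", ""))` with the environment printed by `legate --verbose` should return None when the directory holding the module is missing from the path.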

@bryevdv
Contributor

bryevdv commented Sep 28, 2022

OK well presumably this is the proximate problem:

  legion_module         : None

I lightly updated the path discovery code, so maybe something got broken. Unlike the rest of the new driver, these two functions are very difficult to unit test. I will try to reproduce this afternoon.

@bryevdv
Contributor

bryevdv commented Sep 28, 2022

@rohany I am still trying to repro this locally in the same way. In the meantime, there is one function responsible for computing all those paths; it is here:

https://github.com/nv-legate/legate.core/blob/branch-22.10/legate/driver/util.py#L285-L413

Perhaps you can do some investigation / printf-debugging on your end?

@bryevdv
Contributor

bryevdv commented Sep 28, 2022

OK, I have been able to reproduce this.

@bryevdv
Contributor

bryevdv commented Sep 28, 2022

@rohany can you try with this patch?

diff --git a/legate/driver/util.py b/legate/driver/util.py
index fafe306..e96b5ae 100644
--- a/legate/driver/util.py
+++ b/legate/driver/util.py
@@ -327,24 +327,24 @@ def get_legion_paths(legate_paths: LegatePaths) -> LegionPaths:
         if legion_module is None:
             legion_lib_dir = legion_dir / "lib"
             for f in legion_lib_dir.iterdir():
-                if legion_lib_dir.joinpath(f / "site-packages").exists():
-                    legion_module = legion_lib_dir / f / "site-packages"
+                if f.joinpath("site-packages").exists():
+                    legion_module = f / "site-packages"
                     break
 
-            legion_bin_path = legion_dir / "bin"
-            legion_include_path = legion_dir / "include"
-
-            return LegionPaths(
-                legion_bin_path=legion_bin_path,
-                legion_lib_path=legion_lib_dir,
-                realm_defines_h=legion_include_path / "realm_defines.h",
-                legion_defines_h=legion_include_path / "legion_defines.h",
-                legion_spy_py=legion_bin_path / "legion_spy.py",
-                legion_prof_py=legion_bin_path / "legion_prof.py",
-                legion_python=legion_bin_path / "legion_python",
-                legion_module=legion_module,
-                legion_jupyter_module=legion_module,
-            )
+        legion_bin_path = legion_dir / "bin"
+        legion_include_path = legion_dir / "include"
+
+        return LegionPaths(
+            legion_bin_path=legion_bin_path,
+            legion_lib_path=legion_lib_dir,
+            realm_defines_h=legion_include_path / "realm_defines.h",
+            legion_defines_h=legion_include_path / "legion_defines.h",
+            legion_spy_py=legion_bin_path / "legion_spy.py",
+            legion_prof_py=legion_bin_path / "legion_prof.py",
+            legion_python=legion_bin_path / "legion_python",
+            legion_module=legion_module,
+            legion_jupyter_module=legion_module,
+        )
 
         raise RuntimeError("Could not determine legion paths")
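
A likely reason the original loop found nothing in this installation: `Path.iterdir()` yields entries that already carry the directory prefix, so joining them onto `legion_lib_dir` again doubles a relative prefix such as `../install/lib`. With an absolute directory the second join is harmless, because pathlib discards everything before an absolute component. The sketch below reproduces that behavior in a temporary directory; the function names and the directory layout are illustrative assumptions, not legate code.

```python
import os
import tempfile
from pathlib import Path


def find_site_packages_old(lib_dir):
    # Mirrors the removed lines: `f` already starts with `lib_dir`.
    for f in lib_dir.iterdir():
        if lib_dir.joinpath(f / "site-packages").exists():
            return lib_dir / f / "site-packages"
    return None


def find_site_packages_new(lib_dir):
    # Mirrors the added lines: use the entry from iterdir() directly.
    for f in lib_dir.iterdir():
        if f.joinpath("site-packages").exists():
            return f / "site-packages"
    return None


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        # Assumed layout: <root>/install/lib/python3.9/site-packages
        (root / "install" / "lib" / "python3.9" / "site-packages").mkdir(parents=True)
        (root / "work").mkdir()
        old_cwd = os.getcwd()
        os.chdir(root / "work")
        try:
            relative = Path("../install/lib")
            absolute = relative.resolve()
            return {
                "old_relative": find_site_packages_old(relative),
                "new_relative": find_site_packages_new(relative),
                "old_absolute": find_site_packages_old(absolute),
            }
        finally:
            os.chdir(old_cwd)
```

In this sketch the old lookup returns None only for the relative directory, which matches the `legion_module : None` line in the report, while the patched lookup succeeds for both forms.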

@rohany
Contributor Author

rohany commented Sep 28, 2022

patch works for me

@bryevdv
Contributor

bryevdv commented Sep 28, 2022

@rohany can you submit a quick PR? I had to edit the installed file for the patch and I am already on a branch that is moving some related files

rohany added a commit to rohany/legate.core that referenced this issue Sep 28, 2022
Fixes nv-legate#393.

This commit fixes a bug where the legate driver could not find some
internal modules.
rohany added a commit that referenced this issue Sep 28, 2022
* legate/driver: fix driver legion_module path

Fixes #393.

This commit fixes a bug where the legate driver could not find some
internal modules.

* legate/driver: remove erroneous raise statement
bryevdv added a commit to bryevdv/legate.core that referenced this issue Sep 29, 2022
bryevdv added a commit to bryevdv/legate.core that referenced this issue Sep 29, 2022
bryevdv added a commit to bryevdv/legate.core that referenced this issue Sep 29, 2022
bryevdv added a commit that referenced this issue Oct 5, 2022
* initial import of test driver code

* consolidate some utils and types

* ignore vscode workspace for now at least

* parse_command_args -> parse_library_command_args

* factor out colorama

* Consolidate types

* consolidate ui modules

* consolidate system classes

* get rid of driver.util

* temp compat imports

* probable fix for #393

* bail if legate_module cannot be determined

* use singular util

* use cwd for default test_root

* move custom argparse action to util

* fix test after merge