Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pip and Python3 #96

Closed
floringogianu opened this issue May 11, 2018 · 13 comments
Closed

Pip and Python3 #96

floringogianu opened this issue May 11, 2018 · 13 comments

Comments

@floringogianu
Copy link

floringogianu commented May 11, 2018

Informed by #32, #92 and #52 I tried to configure Bazel on a conda environment, with python 3. bazel build //:deepmind_lab.so appears to run successfully, however bazel run //:python_random_agent is not able to import deepmind_lab

INFO: Running command line: bazel-out/k8-py3-fastbuild/bin/python_random_agent
ImportError: numpy.core.multiarray failed to import
Traceback (most recent call last):
  File "/home/florin/.cache/bazel/_bazel_florin/329274937dbc60c7c6c49b959689f873/execroot/org_deepmind_lab/bazel-out/k8-py3-fastbuild/bin/python_random_agent.runfiles/org_deepmind_lab/python/random_agent.py", line 26, in <module>
    import deepmind_lab
ImportError: numpy.core.multiarray failed to import
ERROR: Non-zero return code '1' from command: Process exited with status 1

You can find the changes bellow. @tkoeppe Does this seem right? Also, are there any log files I can provide you with in order to help with the debugging?

diff --git a/BUILD b/BUILD
index 9e274c5..3972e07 100644
--- a/BUILD
+++ b/BUILD
@@ -984,6 +984,7 @@ py_binary(
     data = [":deepmind_lab.so"],
     main = "python/random_agent.py",
     visibility = ["//python/tests:__subpackages__"],
+    default_python_version = "PY3"
 )
 
 LOAD_TEST_SCRIPTS = [
diff --git a/WORKSPACE b/WORKSPACE
index fa8da47..967c712 100644
--- a/WORKSPACE
+++ b/WORKSPACE
@@ -91,5 +91,5 @@ new_local_repository(
 new_local_repository(
     name = "python_system",
     build_file = "python.BUILD",
-    path = "/usr",
+    path = "/home/florin/Tools/miniconda3/envs/torch30",
 )
diff --git a/python.BUILD b/python.BUILD
index f0b3f9a..b6a6fa8 100644
--- a/python.BUILD
+++ b/python.BUILD
@@ -5,7 +5,11 @@
 
 cc_library(
     name = "python",
-    hdrs = glob(["include/python2.7/*.h"]),
-    includes = ["include/python2.7"],
+    hdrs = glob(["include/python3.6m/*.h",
+                 "lib/python3.6/site-packages/numpy/core/include/**/*.h"
+    ]),
+    includes = ["include/python3.6m",
+                "lib/python3.6/site-packages/numpy/core/include"
+    ],
     visibility = ["//visibility:public"],
 )
@gchlodzinski
Copy link

Ok, I did some testing today with Anaconda 3.6 environment and here is what I found:

  • the problem is in dmlab_module.c file
  • in particular the following line causes problem:
    PyObject *v = PyObject_GetAttrString(m,"file");
  • it return NULL object and (although it is not documented) it raises SystemError
  • it confuses Python enough to return Numpy error (you can invoke import_array earlier and still get error, different error)
  • to make exception go away we need to add next line with the following command:
    PyErr_Clear();

With it I was able to run random angent via bazel (bazel run //:python_random_agent) in my Anaconda environment.
I was also able to run lab via pip by doing the following:

  • created separate directory for lab runtime files
  • modified dmlab_module.c to point it it.
    But now I need to modify code each time I need to switch between bazel and pip.
    The only real fix would be to make PyObject_GetAttrString working.

So all in all the problem is WHY PyObject_GetAttrString / PyModule_GetFilenameObject functions do not work.
This is strange as running deepmind_lab.file in works fine in Python.
I hope it helps you - it helped me to get Lab up and running.

@floringogianu
Copy link
Author

@gchlodzinski Thanks for looking into this, much appreciated! I was able to run via bazel following your instructions. I will give it a try through pip also.

@tkoeppe
Copy link
Collaborator

tkoeppe commented May 18, 2018

Thank you, that looks promising! Indeed, the attribute fetching seems to fail in Py3.

@tkoeppe
Copy link
Collaborator

tkoeppe commented May 18, 2018

Clearing the error does indeed seem to work, but I wonder why there is an error in the first place. Why is __file__ an attribute in Py2 but not in Py3?

@gchlodzinski
Copy link

gchlodzinski commented May 18, 2018

@tkoeppe, a couple of notes and more findings:

  • since PyObject_GetAttString when returning NULL raises SystemError - you need to clear it anyway (after checking its if it indeed returned NULL).
    Here is documentation for equivalent function - Python3 PyModule_GetFilenameObject: https://docs.python.org/3/c-api/module.html and it clearly states NULL behaviour

  • not all calls for PyObject_GetAttrString behaves that way - if you want to get __dict__ - it will obey your order and return it.

  • so it looks like problem is that __file__ attribute is missing in Python3 for your module
    And indeed I did one more test today to return every __dict__ item and there is no __file__ there. Here are all keys returned in my test function called from dmlab_module.c:

__name__
__doc__
__package__
__loader__
__spec__
version
runfiles_path
set_runfiles_path
Lab

And here is what I get when I list 'dict' from Python:
['__name__', '__doc__', '__package__', '__loader__', '__spec__', 'version', 'runfiles_path', 'set_runfiles_path', 'Lab', '__file__']
You can clearly see that the difference is absence of __file__ entry.

So my GUESS is that we need to do something extra, do some more or different module configuration for python3.
Maybe this document https://docs.python.org/3/extending/building.html could help but I do not have experience in that matter (other than what I learned by looking at your code) so I can be only guessing.

I hope it helps a bit.

@tkoeppe
Copy link
Collaborator

tkoeppe commented May 20, 2018

The problem seems to be indeed that the module's __file__ attribute isn't set in Py3. I believe that in order to have this, we need to use multiphase initialization and manually populate the field from the module spec's origin attribute.

I've changed the module initialization code on the beta branch (in the experimental commit 0e4e3f5) to effect this. Could you please take a look if this helps? If yes, I'll try to polish that code up and integrate it into the master branch.

Thanks!

@tkoeppe tkoeppe changed the title Building and runing Lab in a Conda Environment Pip and Python3 May 20, 2018
@gchlodzinski
Copy link

Hi,
Right now I only able to run it without looking inside. So after integrating your changed 'dmlab_module.c' file into my Python3 version od lab (whole project requires a few more changes in order to make it working with Python3) here is what I am getting:

  • running bazel run //:python_random_agent - no output (which is bad - it should produce some output)
  • running pip version python python/random_agent.py, here is output:
Got origin value '/home/gchlodzinski/anaconda3/lib/python3.6/site-packages/deepmind_lab/deepmind_lab.so'
Segmentation fault (core dumped)

Based on my earlier tests segmentation fault occurs when there is a problem with accessing lab runtime files.
So it looks like your change is not fully working. I can spend some time debugging it today evening if it help.

@tkoeppe
Copy link
Collaborator

tkoeppe commented May 21, 2018

I see, I can reproduce that. Let me take a look. Thanks!

@tkoeppe
Copy link
Collaborator

tkoeppe commented May 21, 2018

It looks like the module lookup doesn't work the way I hoped it would. (Update: indeed, I failed to read the documentation. Module lookup doesn't work with multiphase initialization.) I was briefly contemplating not using per-module state and just keeping the existing global state. It would make a few things simpler, but on the other hand, we might as well do it right.

@tkoeppe
Copy link
Collaborator

tkoeppe commented May 21, 2018

I pushed a fix to the beta branch that looks up the module more correctly.

Please note that none of the Python code except dmlab_module_test.py has actually been updated to work with Python 3; there'd be some amount of work needed to add module six everywhere etc.

@tkoeppe
Copy link
Collaborator

tkoeppe commented May 21, 2018

For your information: the one migrated test can be run under Python 3 as follows:

bazel test --python_path=/usr/bin/python3 -c opt //python/tests:python_module_test

(But that's not using PIP.)

@gchlodzinski
Copy link

The new fix works fine for me in both: bazel and pip.
Thanks for fixing this issue!

@tkoeppe
Copy link
Collaborator

tkoeppe commented May 22, 2018

No problem, and thank you for the analysis that got us to understand the problem!

I'll close this issue for now; do feel free to reopen if there are further related problems. (Or start a new issue.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants