tensorflow: 'module' object has no attribute 'argv' #81

Closed
c-spencer opened this issue Jun 3, 2017 · 15 comments

Comments

@c-spencer

c-spencer commented Jun 3, 2017

Caused when trying to import tensorflow (v1.2). Most likely caused by PySys_SetArgv not having been called?

Java 8, Python 2.7, Jep 3.6.3, running from within an IntelliJ IDEA scala project.

Full error:

scala> 
import jep.Jep
val jep = new Jep(false)
jep.runScript("src/main/python/test.py")

scala> jep: jep.Jep = jep.Jep@4c531172

scala> jep.JepException: <type 'exceptions.AttributeError'>: 'module' object has no attribute 'argv'
  at /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/argparse.__init__(argparse.py:1586)
  at /Users/chris/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/flags.<module>(flags.py:25)
  at /Users/chris/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/app.<module>(app.py:23)
  at /Users/chris/tensorflow/lib/python2.7/site-packages/tensorflow/python/__init__.<module>(__init__.py:98)
  at /Users/chris/tensorflow/lib/python2.7/site-packages/tensorflow/__init__.<module>(__init__.py:24)
  at src/main/python/test.<module>(test.py:3)

test.py:

import tensorflow as tf
@bsteffensmeier
Member

I think it would be good to make it so the args can be set in PyConfig

@ndjensen
Member

ndjensen commented Jun 4, 2017

Can you set Python's sys.argv before calling runScript? That may be a workaround.

PyConfig is for pre-init parameters that apply to the entire Python interpreter. argv is potentially applicable to a single script, especially when called through Jep.runScript(String). We should overload the method Jep.runScript(String) to take more arguments.

@c-spencer
Author

Setting sys.argv before (or at the top of) runScript gets around this, thanks. Passing args through runScript sounds like a good solution.
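For anyone landing here with the same error, the workaround can be sketched roughly as below. This is only an illustration: `ensure_argv` is a made-up helper name, not part of Jep or tensorflow, and the default of a single empty string is an assumption. The `module` parameter exists only so the helper can be exercised without touching the real `sys`.

```python
import sys


def ensure_argv(default=("",), module=sys):
    """Give `module` an argv list if the embedding host never set one.

    Hypothetical helper for illustration; not a Jep or tensorflow API.
    """
    if not hasattr(module, "argv"):
        # An embedded interpreter may start without sys.argv at all,
        # so create one before importing anything that reads it.
        module.argv = list(default)
    return module.argv


# Run this at the top of the script, before `import tensorflow`.
ensure_argv()
```

Placing the call before the tensorflow import matters, because the traceback shows the failure happening at import time.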

@akrauchanka

akrauchanka commented Jun 9, 2017

I recently hit the same issue with the shared modules feature. In that case the workaround can't be applied, because the shared modules are added in the Jep constructor.

Could this be considered a bug in this case?

@ndjensen
Member

ndjensen commented Jun 9, 2017

@akrauchanka, can you explain your use case in more detail?

@akrauchanka

akrauchanka commented Jun 12, 2017

@ndjensen, sure. I added the tensorflow module to the shared modules list, but the Jep constructor call threw the same exception as in the issue title. The workaround you suggested did not work, because the exception is thrown during the constructor call. So I downloaded the sources and changed the C code to pass an empty string to the embedded interpreter as a parameter during creation. Then I rebuilt Jep from source and tried it, and it works well.
It looks like a hack, I agree, but there is no option to pass parameters to the embedded interpreter on creation, which modules that depend on sys.argv need in order to work properly when shared.

@ndjensen
Member

ndjensen commented Jun 12, 2017

It sounds like for flexibility we need PyConfig to have a default argv of empty string "" so it can be set globally. That code would be CPython mostly using

Then we also need to overload Jep.runScript() to take a list of arguments and we'd manipulate sys.argv with Python code.

Update: Went with Jep.setSharedModulesArgv() since it's not quite the same as PyConfig. Skipping Jep.runScript() for this ticket.

@ndjensen
Member

ndjensen commented Jun 13, 2017

Does anybody know why tensorflow requires sys.argv? If tensorflow would like to work well in an embedded environment, it shouldn't be so reliant on sys.argv. That said, the entire concept of shared modules was born from a lack of external libraries working well within embedded environments. Therefore, we will strive to make shared modules work as well as possible.

@ndjensen
Member

For this ticket we want to add to PyConfig a variable argv, probably a String[], and then in the CPython where PyConfig is used (search for pyembed_preinit) use PySys_SetArgvEx. We'll split off Jep.runScript() changes to a separate task.

Target branch dev_3.7. If anyone wants to submit a pull request it will get done faster, otherwise I will eventually get to it.

@eastcirclek

eastcirclek commented Jun 26, 2017

I really hope this issue gets solved.
I'm using Jep to run inference with Keras on top of Theano and TensorFlow inside Apache Flink.
Because I usually have multiple sub-interpreters running on different threads in a single process, I have to use dev_3.7 to avoid the race condition in shared modules reported in #69.

To use tensorflow 1.2 as a shared module, I created a file named 'tf_init.py' in my working directory with the following contents:
import sys
sys.argv = ['pdm']
import tensorflow.python

Then I initialize Jep using new JepConfig().addSharedModules("tf_init", "numpy", "scipy", "h5py", "tensorflow")

Before doing any Keras/TF-related work, I run jep.eval("import tf_init") so that sys.argv is set in the top interpreter.
I add import tensorflow.python to make sure that tensorflow is loaded by the top interpreter.
Before upgrading to tensorflow 1.2, I only needed the following two lines in tf_init.py:
import sys
sys.argv = ['pdm']

For the top interpreter to see "tf_init.py", I set PYTHONPATH to my working directory. I had to do this because I don't think Jep provides a way to set an include path for the top interpreter. That is unrelated to this issue, but I hope someone can sort it out as well.

ndjensen added 8 commits to ndjensen/jep that referenced this issue Jul 7, 2017
@ndjensen changed the title from "'module' object has no attribute 'argv'" to "tensorflow: 'module' object has no attribute 'argv'" on Jul 10, 2017
@ndjensen
Member

Ok, two things:

  1. @c-spencer, @akrauchanka, or @eastcirclek, can one of you open a tensorflow issue that basically says, "tensorflow doesn't work well in embedded environments due to reliance on sys.argv"? Also link to this Jep ticket, and on this Jep ticket add a link to the tensorflow ticket. I am not comfortable opening the tensorflow ticket due to my lack of familiarity with tensorflow. But once it is open, we can all add comments with more information to the ticket.

  2. I added a method setSharedModulesArgv(String...) on my fork of Jep dev_3.7 to attempt to fix the issue. I wrote a unit test for it but have not tested it with tensorflow. @akrauchanka and @eastcirclek, can you test it out?

A simple test could be something like:

Jep.setSharedModulesArgv("");
Jep jep = new Jep();
jep.eval("import tensorflow");

If it works I will merge it into the main repository's dev_3.7.

@eastcirclek

I tested your code on branch dev_3.7.
Thanks to Jep.setSharedModulesArgv(), I can safely remove the two lines from the file I described above (#81 (comment)):

#import sys
#sys.argv = ['pdm']
import tensorflow.python

@eastcirclek

eastcirclek commented Jul 11, 2017

@ndjensen

I don't think this is specific to tensorflow; it can be a problem for other programs which call the argparse module.

val jep = new Jep()
jep.eval("import argparse")
jep.eval("parser = argparse.ArgumentParser()")

Upon executing argparse.ArgumentParser(), I got the error:
Exception in thread "main" jep.JepException: <class 'AttributeError'>: module 'sys' has no attribute 'argv'

The simplest way to avoid this error seems to be to declare another class in tensorflow/python/platform/flags.py that doesn't depend on argparse.ArgumentParser.
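For code you control, the sys.argv lookup in argparse can also be sidestepped directly. ArgumentParser reads sys.argv[0] only when no prog is given, and parse_args reads sys.argv[1:] only when no argument list is passed. The sketch below is an illustration, not what tensorflow does: `without_argv`, `parse_embedded`, and the `--verbose` flag are hypothetical names, and the context manager only simulates an embedded interpreter that never set sys.argv.

```python
import argparse
import sys
from contextlib import contextmanager


@contextmanager
def without_argv():
    """Temporarily remove sys.argv to mimic an embedded interpreter (demo only)."""
    saved = sys.argv
    del sys.argv
    try:
        yield
    finally:
        sys.argv = saved  # always restore the real argv


def parse_embedded(args):
    """Build and run a parser without ever touching sys.argv."""
    # An explicit prog avoids the sys.argv[0] lookup in the constructor.
    parser = argparse.ArgumentParser(prog="embedded")
    parser.add_argument("--verbose", action="store_true")
    # An explicit list avoids the sys.argv[1:] lookup in parse_args.
    return parser.parse_args(args)


with without_argv():
    namespace = parse_embedded(["--verbose"])
```

This does not help with third-party modules that construct their own parser at import time, which is the tensorflow case in this ticket.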

@ndjensen
Member

@eastcirclek, thanks for testing. I've merged the code from my fork to the main jep dev_3.7.

Ok, I agree we don't need a tensorflow ticket based on your investigation. I have concerns about the complexity we're adding to Jep to support quirks of various CPython extensions (the entire shared modules concept was added to work around issues with numpy). But since it helps the Jep community we'll keep doing our best.

@eastcirclek

eastcirclek commented Jul 13, 2017

@ndjensen

I'm convinced of the importance of shared modules in Jep. Python libraries such as SciPy, h5py, Theano, and TensorFlow, to name a few, are not pure Python libraries, so without Jep they cannot be used inside JVM-based data processing engines, in which multiple sub-interpreters must be created and destroyed as user jobs start and finish. TensorFlow has a Java API, and I tested it. However, the TensorFlow Java API somehow performs worse than the TensorFlow Python API, so I'm sticking with Jep+TensorFlow Python API.

So to me there's no doubt about the value of shared modules 💯
