Fix load of models that depend on non thread-safe dependencies #27
At the moment I have only tested this with UTs; before landing it I want to make sure that it works in a real environment.
Travis tests have failed. Hey @paulojrp, TravisBuddy Request Identifier: b182d090-fa53-11e8-8526-c1031fffd6c8
I suppose this is related to #26?
openml-python-common/src/main/java/com/feedzai/openml/python/jep/instance/JepInstance.java
/**
 * Private constructor for singleton pattern.
 *
* @implNote {@link Jep} is initalized inside the current JVM. Due to the need to manage a consistent Python thread |
Kudos for the explanation! IMHO I would prefer to keep the explanation generic instead of basing it on the specific case we caught (and then add a reference to the specific GitHub issue), but it's a super minor nitpick, so please let others comment on this before changing anything.
A quick review looks good. I will do a deeper review on Monday. But there are tests failing.
A clear disadvantage of this approach is that it never "releases" models that are already loaded. E.g., consider a JVM where you load and close 10000 models sequentially; this will eventually go OOM. Quick note: the Docker image used in Travis is not set up with TensorFlow, but I see you added the regression test for it as a UT. Nevertheless, there are some failed UTs. I'll let you take a look at them.
Yes, this commit fixed that issue.
Another problem is the names of the methods responsible for classifying events. The provider implementations assume that these names are constants ("classify" and "getClassification"). This logic won't work if we use only one Python interpreter. I will explore other solutions.
01865e4 to e3c6a86
Codecov Report
@@ Coverage Diff @@
## master #27 +/- ##
=============================================
- Coverage 100% 49.03% -50.97%
- Complexity 22 37 +15
=============================================
Files 6 11 +5
Lines 42 208 +166
Branches 0 8 +8
=============================================
+ Hits 42 102 +60
- Misses 0 104 +104
- Partials 0 2 +2
This is pretty nice, @paulojrp. Of course it has the downside of requiring code changes for every new library with the same problem, but that is an acceptable trade-off. I trust that you ran the UT regression of #26 and it passed? I have just one suggestion: please make the list of packages added to the shared modules be read dynamically (an environment variable? a resource file inside the openml jar? please consider remote Spark executions when devising the solution). This will make users autonomous in overcoming this problem for a new library in the future. Please also add a note to the README about #26 and this solution.
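For illustration, the environment-variable option suggested above could look roughly like the sketch below. It is not code from this PR: the class name and the `OPENML_SHARED_MODULES` variable are hypothetical, and only the "numpy" and "tensorflow" defaults come from the README change under review.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper for this discussion; not part of openml-python. */
final class SharedModulesFromEnv {

    /** Modules that the README change says are shared by default. */
    static final Set<String> DEFAULT_MODULES =
            new LinkedHashSet<>(Arrays.asList("numpy", "tensorflow"));

    /**
     * Merges the defaults with a comma-separated list such as "scipy, pandas".
     * A null or blank value leaves the defaults untouched.
     */
    static Set<String> resolve(final String rawValue) {
        final Set<String> modules = new LinkedHashSet<>(DEFAULT_MODULES);
        if (rawValue == null) {
            return modules;
        }
        for (final String name : rawValue.split(",")) {
            final String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                modules.add(trimmed);
            }
        }
        return modules;
    }

    public static void main(final String[] args) {
        // OPENML_SHARED_MODULES is an assumed variable name, used only in this sketch.
        System.out.println(resolve(System.getenv("OPENML_SHARED_MODULES")));
    }
}
```

One caveat with this option: on remote Spark executions the variable would have to be present in each executor's environment, whereas a resource file can travel with the jar.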
e3c6a86 to 0214701
0214701 to a34f327
a34f327 to b70673c
Overall looks great! :) Just a few minor style issues. Feel free to disagree with them.
openml-python-common/README.md
@@ -40,3 +40,15 @@ export PATH=$ANACONDA_PATH/envs/myenv/bin:$PATH
export LD_LIBRARY_PATH=$ANACONDA_PATH/envs/myenv/lib/python3.6/site-packages/jep:$LD_LIBRARY_PATH
export LD_PRELOAD=$ANACONDA_PATH/envs/myenv/lib/libpython3.6m.so
``` | |||

7. If you need to share Python modules across sub-interpreters, you would need to create a "python-packages.xml" file where you define the modules to be shared. By default the provider is already sharing the "numpy" and "tensorflow" modules. This is a workaround for the issues with CPython extensions.
I think it should be 'you will need to create a (...)'. @krisztinaknagy to validate this one
is there any official documentation for this python-packages.xml?
Below you can find an example of the contents of this file.
openml-python-common/src/main/java/com/feedzai/openml/python/jep/instance/JepInstance.java
openml-python-common/src/main/java/com/feedzai/openml/python/xml/parser/XMLParser.java
openml-python-common/src/test/java/com/feedzai/openml/python/xml/parser/XMLParserTest.java
Actually, I would expect a UT like the one described in #26. Is it too hard to automate?
b70673c to 3c03a58
Hey @paulojrp, TravisCI finished with status TravisBuddy Request Identifier: d0ff2700-ff96-11e8-88a0-5d55312123ee
@pedrorijo91 for adding that UT we'd need tensorflow to be added to the Docker image used in Travis (and then to the base openml image used inside Feedzai). I'm not sure we want to keep adding all the stuff to the images, they'll grow endlessly.
@paulojrp Reviewed, mostly minor comments: please check the coverage and improve it, and tackle my comments. Thanks!
openml-python-common/src/main/java/com/feedzai/openml/python/jep/instance/JepInstance.java
openml-python-common/src/main/java/com/feedzai/openml/python/modules/SharedModulesParser.java
openml-python-common/src/main/java/com/feedzai/openml/python/modules/package-info.java
...l-python-common/src/test/java/com/feedzai/openml/python/modules/SharedModulesParserTest.java
Can't we simply create a custom model that loads Python stuff, @nmldiegues? Anyway, if it's too time/effort-consuming, let's be pragmatic and go without it :)
3c03a58 to 1337706
Hey @paulojrp, TravisBuddy Request Identifier: ade3d770-ffaf-11e8-88a0-5d55312123ee
@nmldiegues I created a new UT but the coverage is still decreasing a lot, and I don't see how this is related to my modifications. @pedrorijo91 To reproduce this problem we only had to load a Python file that imports TensorFlow. The problem with a unit test for this case is that we would have to modify the Docker image to also install TensorFlow. We could do it, but then what would happen when new packages with these problems are found (modify the image again to check those too)? Don't forget that we don't have an openml provider for TensorFlow.
all good on my side then :)
A problem was detected when loading TensorFlow models in different threads inside the same JVM. It occurred when, after loading one TensorFlow model, a second TensorFlow model was imported. The cause was a TensorFlow dependency (protobuf) being reloaded although it already existed in the JVM (through the first thread).
The fix is to share the problematic module (TensorFlow) across Python's sub-interpreters. This is a workaround for the issues with CPython extensions.
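As a rough illustration of the workaround, the sketch below reads the names of the modules to share from an XML document. The layout used here (a `python-packages` root with `module` children) is an assumption made for this example, since the actual schema of "python-packages.xml" is not shown in this conversation; the class name is hypothetical as well.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical sketch; the real parser in this PR may differ. */
final class SharedModulesXmlSketch {

    /** Returns the text of every <module> element (assumed element name). */
    static List<String> parseModules(final InputStream xml) {
        try {
            final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // The file only lists names, so refuse DOCTYPE declarations (guards against XXE).
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            final Document document = factory.newDocumentBuilder().parse(xml);

            final NodeList nodes = document.getElementsByTagName("module");
            final List<String> modules = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                final String name = nodes.item(i).getTextContent().trim();
                if (!name.isEmpty()) {
                    modules.add(name);
                }
            }
            return modules;
        } catch (final ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not read the shared modules file", e);
        }
    }

    public static void main(final String[] args) {
        // Assumed file contents, for illustration only.
        final String sample =
                "<python-packages><module>numpy</module><module>tensorflow</module></python-packages>";
        final List<String> modules =
                parseModules(new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8)));
        System.out.println(modules);
        // With Jep on the classpath, these names would then be registered as shared
        // modules on the interpreter configuration before any sub-interpreter is created.
    }
}
```

Jep is believed to expose shared modules through its `JepConfig` class (for example `addSharedModules`), but treat that call as an assumption and check it against the Jep version in use.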