-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
travis-ci: properly export PYTHON_VERSION and fix python 3 fallout #2124
Conversation
Not a no-op, good catch! Looks like the Ubuntu / Python 3.6 builder is failing here, which probably accounts for the subsequent failures in tests that need jobspec:
|
The centos7 py3.4 builder also fails, but it seems to get through most everything and lose it on one of the python test helpers I added to test the job-manager here:
(This one's probably on me - will have a look) |
b9cd5c5
to
8683464
Compare
Rebased this branch on current master. Also included some experimental fixes for fallout from the initial change, i.e. the As discussed in #2125, these fixes lead to other errors in the testsuite, including:
(I guess |
This fix was straightforward, so I went ahead and committed it here. |
For the diff --git a/t/python/t0002-wrapper.py b/t/python/t0002-wrapper.py
index e232cfea8..20e1d3f07 100755
--- a/t/python/t0002-wrapper.py
+++ b/t/python/t0002-wrapper.py
@@ -12,6 +12,7 @@
import unittest
+import six
import flux
from flux.core.inner import ffi, raw
import flux.wrapper
@@ -44,18 +45,14 @@ class TestWrapper(unittest.TestCase):
future = f.rpc("cmb.ping", payload)
resp = future.get()
future.pimpl.handle = None
- with self.assertRaises(ValueError) as cm:
+ with six.assertRaisesRegex(self, ValueError, r"Attempting to call a cached, bound method.*NULL handle"):
resp = future.get()
- self.assertRegexpMatches(
- cm.exception.message,
- "Attempting to call a cached, " "bound method.*NULL handle",
- )
def test_automatic_unwrapping(self):
flux.core.inner.raw.flux_log(flux.Flux("loop://"), 0, "stuff")
def test_masked_function(self):
- with self.assertRaisesRegexp(AttributeError, r".*masks function.*"):
+ with six.assertRaisesRegex(self, AttributeError, r".*masks function.*"):
flux.Flux("loop://").rpc("topic").pimpl.flux_request_encode("request", 15)
def test_set_pimpl_handle(self):
@@ -68,7 +65,7 @@ class TestWrapper(unittest.TestCase):
def test_set_pimpl_handle_invalid(self):
f = flux.Flux("loop://")
r = f.rpc("topic")
- with self.assertRaisesRegexp(TypeError, r".*expected a.*"):
+ with six.assertRaisesRegex(self, TypeError, r".*expected a.*"):
r.handle = f.rpc("other topic")
def test_read_basic_value(self): Docs: https://docs.python.org/3/library/unittest.html#unittest.TestCase.assertRaisesRegex EDIT: there was a mistake in the original diff ( |
Thanks, @SteVwonder! For the Strangely, with python3, the following exits with zero exit code and no output:
even though with some debug, the "except" path is taken:
If I comment out the
Something about trying to access |
😱 Agreed! Apparently the So that explains why things are breaking on usage of |
The problem was with the diff --git a/src/cmd/flux-jobspec b/src/cmd/flux-jobspec
index 206d14270..2c6760b3c 100755
--- a/src/cmd/flux-jobspec
+++ b/src/cmd/flux-jobspec
@@ -126,7 +126,7 @@ def slurm_jobspec(args):
try:
validate_slurm_args(args)
except ValueError as e:
- logger.error(e.message)
+ logger.error(str(e))
sys.exit(1)
t = slurm_walltime_to_duration(args.time)
return create_slurm_style_jobspec(
@@ -192,8 +192,8 @@ if __name__ == "__main__":
except SystemExit as e: # don't intercept sys.exit calls
exit_code = e
except Exception as e:
- logging.error(e.message)
exit_code = 1
+ logging.error(str(e))
finally:
logging.shutdown()
sys.exit(exit_code) One of the diff --git a/src/cmd/flux-jobspec b/src/cmd/flux-jobspec
index 206d14270..b5f324439 100755
--- a/src/cmd/flux-jobspec
+++ b/src/cmd/flux-jobspec
@@ -8,13 +8,18 @@ import math
import logging
import argparse
import json
-from collections import Sequence
+import collections
+
+try:
+ collectionsAbc = collections.abc
+except AttributeError:
+ collectionsAbc = collections
import yaml
def create_resource(res_type, count, with_child=[]):
- assert isinstance(with_child, Sequence), "child resource must be a sequence"
+ assert isinstance(with_child, collectionsAbc.Sequence), "child resource must be a sequence"
assert not isinstance(with_child, str), "child resource must not be a string"
assert count > 0, "resource count must be > 0"
I realize the irony of commenting these patches to you after making fun of the linux kernel for doing the exact same thing over email |
want to collect all your fixes into a branch and then I can cherry pick them here? |
Also, I think there are some other uses of |
👍 Here it is. You'll probably want to squash the first commit which is just running the black formatter on this PR as it is. https://github.com/SteVwonder/flux-core/tree/issue2123 For some reason, it works locally (with a simple EDIT: still investigating the potential double free in the drain/undrain |
Ah, yeah. I doubt Now that |
What if, for testing, we just put a link to the configured python in Then for test scripts we can still use The drawback is that all python scripts (run during testing) would use the configured python but perhaps that is ok. |
Ok, I've just pushed a revised branch (with @SteVwonder's fixes so far, thanks!) which uses the approach above. A The |
In case this is helpful, the segfaults in python from ```index d4d94e95d..4dc471064 100644
--- a/src/bindings/python/flux/future.py
+++ b/src/bindings/python/flux/future.py
@@ -42,7 +42,7 @@ class Future(WrapperPimpl):
match=ffi.typeof("flux_future_t *"),
filter_match=True,
prefixes=None,
- destructor=raw.flux_future_destroy,
+ #destructor=raw.flux_future_destroy,
):
# avoid using a static list as a default argument
# pylint error 'dangerous-default-value'
@@ -50,7 +50,7 @@ class Future(WrapperPimpl):
prefixes = ["flux_future_"]
super(Future.InnerWrapper, self).__init__(
- ffi, lib, handle, match, filter_match, prefixes, destructor
+ ffi, lib, handle, match, filter_match, prefixes,
)
def check_wrap(self, fun, name): The final error is from
Unfortunately, I'm not getting anything out of the corefile. I tried to run the broker under valgrind and reproduce, but once python is involved, there are many thousands of errors from valgrind. Here's the subset of valgrind output after the module is removed:
|
Sounds like the probability it's a GC'd double free is going up 😢 . I'll poke around at that between now and the Social Coding Hour linux bonaza 😄 .
It looks like for python 2.2-3.5, there are a bunch of hoops to jump through, including reconfiguring/recompiling python with various flags. With Python 3.6+, it looks like setting |
@SteVwonder, just fyi, diff --git a/src/modules/pymod/py_mod.c b/src/modules/pymod/py_mod.c
index 5a578d369..93fd2dfa1 100644
--- a/src/modules/pymod/py_mod.c
+++ b/src/modules/pymod/py_mod.c
@@ -174,6 +174,8 @@ int mod_main (flux_t *h, int argc, char **argv)
if (py_flux_handle == NULL)
return -1;
PyTuple_SetItem(py_args, 1, py_flux_handle);
+ // Take a reference for python
+ flux_incref (h);
//Convert zhash to native python dict, should preserve mods
//through switch to argc-style arguments
|
Another FYI - for python 3.4, I'm getting a different error from
making this change removes the error, but again, I'm not sure of the real fix: diff --git a/src/bindings/python/flux/core/trampoline.py b/src/bindings/python/f
index 8743491..0c2d8ec 100644
--- a/src/bindings/python/flux/core/trampoline.py
+++ b/src/bindings/python/flux/core/trampoline.py
@@ -18,7 +18,7 @@ def mod_main_trampoline(name, int_handle, args):
flux_instance = Flux(handle=lib.unpack_long(int_handle))
user_mod = None
try:
- user_mod = importlib.import_module("flux.modules." + name, "flux.module
+ user_mod = importlib.import_module("flux.modules." + name)
except ImportError: # check user paths for the module
user_mod = importlib.import_module(name)
|
@grondo: I just pushed a few changes to that same branch in my fork (after a rebase on your cherry-picked commits). It includes:
For the python3.4 importlib error, I believe that is the right fix. Per the docs, |
Great! 👍 With your latest changes this PR actually passes tests for Python 2.7, 3.4 and 3.6 AFAICT. I rebased and force-pushed the PR branch in order to put commits in a logical order. (I also added commit description to some of your commits @SteVwonder, hope you don't mind) The changes to O/w, this "small" fix is ready I think. |
Hm, pylint doesn't like the new handle
I'll try adding: # pylint: disable=no-self-use For now to get past the error. |
Some problems with |
Be sure to use `flux python` when obtaining user site path so that we use the flux confingured python not just the first one in PATH.
Add a "python" wrapper to src/cmd/python which execs the python interpreter specified at build time. This ensures that python scripts run during `make check` use the configured python and not the python interpreter first in the default PATH.
Support invoking the configured PYTHON_INTERPRETER on Python scripts in FLUX_EXEC_PATH directly by checking for `flux-<cmd>.py` in each directory before trying to execute `flux-<cmd>`. This allows flux(1) to run python scripts with the correct version of the interpreter without jumping through wrapper scripts, etc.
In order to support flux(1) invoking flux-jobspec under the python interpreter, rename the script with a .py extension.
Problem: cmp() was removed from Python 3, but is necessary for the bulk job state test. Define cmp() locally using the "official" Python 3 workaround: https://docs.python.org/3.0/whatsnew/3.0.html#ordering-comparisons Also apply black formatter. Co-authored-by: Stephen Herbein <herbein1@llnl.gov>
Use of collections.abc is deprecated and will not work in python 3.8+. Use plain collections where necessary to quiet deprecation warnings, and to support 3.8.
Exception errors no longer have `message` member. The suggested way to print the exception error string is with str(e).
if the user calls .close on a handle, then raise a RuntimeError. When this method is not explicitly provided by the bindings, the Wrapper __getattr__ method would search the auto-generated bindings for `flux_close`, call it with the `self.handle`, and result in a still existing `Flux` object instance with a closed/free'd inner flux handle
Remove all explicity handle close calls from Python, instead allowing the GC to auto-destruct and close these handles when they go out of scope.
The pymod module does not own the memory for the handle passed to it by the broker, and during module unload the hanlde was being closed twice: once by the Python GC and once by the broker. To fix, take an explicit reference in python to the handle before giving it to the loaded python module.
On Python 3.4, the following line in trampoline.py user_mod = importlib.import_module("flux.modules." + name, "flux.modules") was generating the error: SystemError: Parent module 'flux.modules' not loaded, cannot perform relative import Remove the package="flux.modules" argument to quiet the warning, since import_module() is not being used here for a relative import.
Be sure to export PYTHON_VERSION to the docker environment before running configure. This is how we select a different python version for the build and test. Fixes flux-framework#2123
Codecov Report
@@ Coverage Diff @@
## master #2124 +/- ##
==========================================
+ Coverage 80.41% 80.43% +0.01%
==========================================
Files 200 200
Lines 31756 31794 +38
==========================================
+ Hits 25537 25573 +36
- Misses 6219 6221 +2
|
This ready to go in? |
Yes, probably ready to go in. |
Thanks @garlick and @SteVwonder! |
Be sure to export PYTHON_VERSION to the docker environment as discussed in #2123.
I haven't yet tried to fix any of the potential fallout from adding python 3.x checks under travis yet, so the checks may fail here. Also, there is the potential I was just confused and this will be a no-op.