Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Move ctypes ImportError catching to appropriate place #8898

Merged
merged 15 commits into from
May 7, 2017

Conversation

davidjn
Copy link
Contributor

@davidjn davidjn commented Apr 5, 2017

The current implementation has a try/except ImportError block around a statement that has no imports. This is leftover from some previous changes that didn't fully clean up the code.

eee00f8 introduced functionality for this module to work when ctypes is not available.
3c1a13d Moved the import statement out of the try/except block and to top file, but did not account for behavior when ctypes is not importable.

The current implementation has a try/except ImportError block around a statement that has no imports. This is leftover from some previous changes that didn't fully clean up the code.

Commit [1] introduced functionality for this module to work when ctypes is not available.
Commit [2] Moved the import statement out of the try/except block and to top file, but did not account for behavior when ctypes is not importable.

[1] numpy@eee00f8
[2] numpy@3c1a13d
@eric-wieser
Copy link
Member

Commit message should start with MAINT: or perhaps BUG:.

I don't really follow here what we were trying to achieve in eee00f8 by substituting ctypes with _missing_ctypes - _getintp_ctype still fails, but now does so with a more cryptic message.

@davidjn davidjn changed the title Move ctypes ImportError catching to appropriate place BUG: Move ctypes ImportError catching to appropriate place Apr 6, 2017
@davidjn
Copy link
Contributor Author

davidjn commented Apr 6, 2017

Sorry about the commit message title- first time contributor here :).

I agree that eee00f8 left _getintp_ctype still failing. Prior to my proposal, the top of file "import _ctypes" will fail with an ImportError. With my proposal, _getintp_ctype (which is called by get_shape() and get_stride() in the _ctypes class) will fail with an AttributeError.

Do you have a recommendation here? It seems like options are to either (a) provide fakes for ctypes.c_int, ctypes.c_long, and ctypes.c_longlong or (b) catch the behavior in _getintp_ctype and raise a "ctypes not available" message there. There is a similar approach to option (b) here (introduced with 2b6af2c).

The context for this is that I'm trying to get matplotlib to run in an environment where the _ctypes module is not available. Things seem work if I either use an older version of numpy, or I make the proposed change to my local numpy. Perhaps there are also some mpl calls that run _getintp_ctype under the hood, and I just haven't hit those yet.

@eric-wieser
Copy link
Member

eric-wieser commented Apr 6, 2017

Sorry about the commit message title- first time contributor here :).

That's alright, I can edit the commit message the expense of squashing (which for one commit, is fine)

To be clear, the patch you have is clearly an improvement, and does the job of mostly fixing what 3c1a13d broke. I was hoping you

To me, the best fix would be:

class dummy_ctype(object):
    def __init__(self, cls):
        self._cls = cls
    def __mul__(self, other):
        return self
    def __call__(self, other):
        return cls(other)

And have _getintp_ctype just return dummy_ctype(np.intp).

While you're there, get_shape could be reduced to:

def get_shape(self):
    return self.shape_as(_getintp_ctype())

The dummy_ctype object is a naive fake used when ctypes is not importable. It does not intend to fake any ctypes behavior, but provides a duck-typed interface so that method calls do not fail.

The get_shape and get_strides refactors take advantage of other instance methods and reduce code duplication.
@davidjn
Copy link
Contributor Author

davidjn commented Apr 6, 2017

Thanks Eric, I've added your recommendations and also reduced get_strides accordingly.

self._cls = cls
def __mul__(self, other):
return self
def __call__(self, other):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this might actually need to be *other, since we call it with * further down. Is there any easy way to test a lack of ctypes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right. Updated with the next commit.

Adds tests for numpy/core/_internal.py to test the _ctypes class when the ctypes module is or isn't available.
@davidjn
Copy link
Contributor Author

davidjn commented Apr 7, 2017

Apologies again on the commit title not including "BUG".

I've updated the dummy_ctypes object per your comment, and added numpy/core/tests/test_internal.py to validate that the _ctypes object (and its get_shape method, which calls _getintp_ctype) work properly when the ctypes module is and is not available.

Copy link
Member

@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! Lets see whether those run...

from numpy.core import _internal

_ctypes = _internal._ctypes(self.test_arr)
shape = _ctypes.get_shape()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should just test the .shape property here - get_shape is an implementation detail

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the shape assertions altogether- the intent was to test getintp_ctype through the get_shape method. I've added a simple test for getintp_ctype instead so that now the shape test is redundant with the assert_array_equal assertion.

# sys.modules.
sys.modules['ctypes'] = None
del np.core._internal
del sys.modules['numpy.core._internal']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh geez, that's pretty scary...

Worried this will break other tests, but I guess we'll see

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to set _internal.ctypes to None at the beginning of the test, and resetting _internal.ctypes = ctypes in the tearDown method.

shape = _ctypes.get_shape()

self.assertEqual(ctypes, _ctypes._ctypes)
self.assertEqual(len(shape), 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert_equal(shape, [2, 3]) would be shorter here. Also, we seem to prefer assert_equal over self.assertEqual

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated self.assertEqual usage to assert_equal. I removed the shape assertions altogether- the intent was to test getintp_ctype through the get_shape method. I've added a simple test for getintp_ctype instead so that now the shape test is redundant with the assert_array_equal assertion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intent was to test getintp_ctype through the get_shape method

This was the correct thing to do. But even better would be to test through the .shape property

So that we can use np.intp.
@eric-wieser
Copy link
Member

@charris: And AxisError and TooHardError not being in _globals rear their head...

@davidjn
Copy link
Contributor Author

davidjn commented Apr 7, 2017

@charris: And AxisError and TooHardError not being in _globals rear their head...

Hmm, yea I didn't consider how this may affect other tests. I'm not sure what else to do about this... suggestions are welcomed!

@eric-wieser
Copy link
Member

As a stopgap, you could try adding a try/finally that attempts to put the modules back together again afterwards

@davidjn
Copy link
Contributor Author

davidjn commented Apr 7, 2017

I think that last commit may do the trick, I've removed the sys.modules munging. I also learned how to run the full test suite locally, and this makes things pass on my machine. Lets see if it builds...

@davidjn
Copy link
Contributor Author

davidjn commented Apr 7, 2017

Alright, that seems to have cleared the cross-test issues. I'll add another commit to take care of your few suggestions on the tests, and we'll hopefully be good to go.

Fwiw, I was calling get_shape to indirectly test the _getintp_ctype method. I'll update the test as you suggest, and add a simple standalone test class for the _getintp_ctype method.

@davidjn
Copy link
Contributor Author

davidjn commented Apr 7, 2017

@eric-wieser, I've addressed your comments and added a simple test class for _getintp_ctype. Thanks for the reviews, comments, and assistance!

@eric-wieser
Copy link
Member

Fwiw, I was calling get_shape to indirectly test the _getintp_ctype method.

Yes, I agree, we should test get_shape. But because shape = property(get_shape, ...), we should test that by just testing .shape instead

Copy link
Member

@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the persistenceSorry to keep complaining, but I think these tests should be structured in terms of the public interface.

This _ctypes class is actually visible through something like np.array([1]).ctypes, so really we should be testing that interface instead, rather than the implementation details.

A test of np.array([1]).ctypes.shape with and without ctypes present should do the job.

I think that all these tests probably belong in a TestCtypes class in test_multiarray.py

def _getintp_ctype():
if ctypes is None:
import numpy as np
return dummy_ctype(np.intp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should make use of the cache, that the normal code path takes


def tearDown(self):
# Ensure that ctypes is set on _internal after tests run.
_internal.ctypes = ctypes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be better handled as just

def test_ctypes_is_not_available(self):
    _internal.ctypes = None
    try:
        do_the_test()
    finally:
        _internal.ctypes = ctypes

Or if you want to avoid repetition:

def without_ctypes(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        _internal.ctypes = None
        try:
            return f(*args, **kwargs)
        finally:
            _internal.ctypes = ctypes

@without_ctypes
def test_ctypes_is_not_available(self):
    do_the_test

It's a bit confusing as you have it to replace it in contexts where it shouldn't have changed

_ctypes = np.array(self.test_arr).ctypes

self.assertIsInstance(_ctypes._ctypes, _internal._missing_ctypes)
assert_equal(_ctypes._arr.shape, (2, 3))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be _ctypes.shape, not _ctypes._arr.shape. Preferably, I'd spell it test_arr.ctypes.shape, and eliminate the _ctypes variable


class TestCTypes(TestCase):
def setUp(self):
self.test_arr = np.array([[1, 2, 3], [4, 5, 6]])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good practice to put cheap setup (like this) in the tests themselves, so that its self contained

@eric-wieser
Copy link
Member

Also looks like you forgot to commit the cache changes you mention in the commit message

@@ -21,6 +21,7 @@

import numpy as np
from numpy.compat import strchar, unicode
from numpy.core import _internal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be tempted to move this inside the test, so the hack is all in one place

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving in the next commit.

test_arr = np.array([[1, 2, 3], [4, 5, 6]])

self.assertIsInstance(
test_arr.ctypes._ctypes, _internal._missing_ctypes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debatable whether testing implementation details like this (and the one above) makes any sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you feel strongly, I'll remove it. I think its a good check to keep. For instance, if someone decides to add an "import ctypes" statement into one of these methods in the future, this assertion will catch that. Otherwise, the test will continue to pass.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my concernt is that that's somewhat artificial, and doesn't really catch us doing import ctypes anywhere else - these tests wouldn't catch the bug that caused you to patch this in the first place, for instance.

I don't feel strongly about it, but would appreciate input from someone else on a better way to test this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Will wait for additional input.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm almost tempted to spin off a new python interpreter without ctypes to run the test, to ensure it can't break anything else

You might also be able to get your sys.modules hackery to work if you try to put everything back together again, but also a little risky.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Eric,

I tried again with the sys.modules hackery, and trying to put things back together again with the try/finally block. Unfortunately, other tests continue to fail. I guess this is because tests are run in parallel. With the current setup (setting ctypes = None), we will have similar cross-test side effects, but I guess we are getting lucky in that it isn't causing any errant failures.

By "spin off a new python interpreter", are you suggesting to execute the test with python's subprocess?

Eg (?):
def some_test(self):
self.assertEqual(0, subprocess.call(sys.executable, "-c", test_code)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggesting to execute the test with python's subprocess?

Yes, that is what I mean. I'm not actually sure that's a good idea though, and I don't think you should risk wasting time trying it until someone else weighs in.


self.assertIsInstance(
test_arr.ctypes._ctypes, _internal._missing_ctypes)
assert_equal(test_arr.shape, (2, 3))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be test_arr.ctypes.shape. We specifically need to test the ctypes shape, not the normal one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion here, updating in the next commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks

@@ -6734,16 +6733,17 @@ def test_ctypes_is_available(self):
test_arr = np.array([[1, 2, 3], [4, 5, 6]])

self.assertEqual(ctypes, test_arr.ctypes._ctypes)
assert_equal(test_arr.shape, (2, 3))
assert_equal(list(test_arr.ctypes.shape), (2, 3))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert_equal(list(...), some_tuple) is a little jarring - maybe replace list with tuple?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing, commit incoming.

@eric-wieser
Copy link
Member

Perhaps we should merge this, and just create a separate issue for properly testing without ctypes?

Can you summarize how exactly you ended up with a ctypes-less python installation in the first place?

@davidjn
Copy link
Contributor Author

davidjn commented Apr 10, 2017

Can you summarize how exactly you ended up with a ctypes-less python installation in the first place?

Sure, I work on Google's App Engine, and matplotlib is enabled by default as a third party library [1]. For security sandboxing, we disable some of the standard library's C-based modules, including _ctypes. Our production numpy version is an older version from before commit 3c1a13d, so things work fine when _ctypes is not available in production.

I'm working now to enable matplotlib functionality out-of-the-box with App Engine's local development server. The development server uses import hooks to similarly inhibit standard library C-based modules. We do this so that the local development environment matches the production environment as close as possible. Without the proposed fix, users will have to install an older version of numpy for matplotlib to work out of the box.

[1] https://cloud.google.com/appengine/docs/standard/python/tools/built-in-libraries-27

@davidjn
Copy link
Contributor Author

davidjn commented Apr 17, 2017

Hey Eric, just checking in to see what you think the next steps here might be. Should we merge and open another issue for properly testing later? Or should we go ahead with the subprocess testing approach?

Are there other numpy contributors who may have a better recommendation for testing?

@eric-wieser
Copy link
Member

Should we merge and open another issue for properly testing later?

I think this is the best path.

Are there other numpy contributors who may have a better recommendation for testing?

I'm hoping that one will turn up, and confirm that what you have here is a pretty reasonable compromise.

@eric-wieser
Copy link
Member

eric-wieser commented May 6, 2017

@charris: It'd be nice to get this into 1.13, especially since this is a regression in 1.10.0.

@charris
Copy link
Member

charris commented May 6, 2017

@eric-wieser I'd be happy to move AxisError and TooHardError into _globals if that helps. Should be easy to avoid problems I think.

@charris
Copy link
Member

charris commented May 6, 2017

@eric-wieser If you feel it is in good shape to merge, go ahead. The milestone is for things that need to be in 1.13, not all things that can go into it.

@eric-wieser
Copy link
Member

Should be easy to avoid problems I think.

I'd worry that npy_cache_import is in general bad news for a module reload, so I think that it's best to stay away from reloading modules in a testsuite.

If you feel it is in good shape to merge, go ahead

I felt a little unqualified to make that judgement because parts of the patch are based on code I suggested - but if this looked ok to you after a quick once-over, then I'm content to pull the trigger.

The milestone is for things that need to be in 1.13, not all things that can go into it.

Is the milestone also (retrospectively) for things that are in 1.13? Seems like a convenient way to attach PRs to the releases that they appear in.

@charris
Copy link
Member

charris commented May 6, 2017

Is the milestone also (retrospectively) for things that are in 1.13?

No. I've seen that done in SciPy for generating the set of PRs for the release, but I think it is a bit too tricky to rely on. Plus I've been deleting old release milestones. There is a script that could be modified to add milestones if it seems helpful.

@eric-wieser
Copy link
Member

I'll stop doing that then!

@charris
Copy link
Member

charris commented May 6, 2017

I'd worry that npy_cache_import is in general bad news for a module reload

The C modules cannot be reloaded, so I expect the import would keep a reference to the original function/class and thus keep it in memory even if the defining python module is reloaded. That is what ISTR from the discussion of the reload problem anyway.

@charris
Copy link
Member

charris commented May 7, 2017

@eric-wieser Still looking to put this in 1.13? I'm thinking of branching tomorrow and pushing off everything still unmerged to 1.14.

@eric-wieser
Copy link
Member

eric-wieser commented May 7, 2017

Yeah, I think I'll just merge this. This definitely fixes the problem it set out to, and it doesn't look like it affects a normal setup with ctypes present.

Worst case scenario, the test does something screwy regarding threading, but the only way we'll spot that it through merging anyway, and it won't affect end-users.

@eric-wieser eric-wieser merged commit d7d1b2a into numpy:master May 7, 2017
mherkazandjian pushed a commit to mherkazandjian/numpy that referenced this pull request May 30, 2017
This fixes a regression in 3c1a13d, restoring the
behaviour intended by eee00f8.

This also extends the support on ctype-less systems for `arr.ctypes.shape` and `arr.ctypes.strides`.

The dummy_ctype object is a naive fake used when ctypes is not importable. It does not intend to fake any ctypes behavior, but provides a duck-typed interface so that method calls do not fail.

The get_shape and get_strides refactors take advantage of other instance methods and reduce code duplication.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants