Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: MAINT: Add deprecation warning to views of multi-field indexes #6054

Merged
merged 1 commit into from
Sep 24, 2016

Conversation

ahaldane
Copy link
Member

@ahaldane ahaldane commented Jul 8, 2015

As discussed in #6053, in preparation for making multi-field indices return a view instead of a copy, this adds a deprecation warning when attempting to view the result of a multi-field index, as in:

>>> a = ones(10, dtype='i8,i8,i8')
>>> b = a[['f0','f2']
>>> b.view('i4') # Deprecation warning

Such code will break any time we convert copies to views, because the byte arrangement of copies and views is necessarily different. I'd like to get it in for 1.10, so we can more easily make the copy->view change in 1.11.

My computer access is limited right now so this PR might be a bit rough, I'll tidy it later. Just posting for feedback now.

One thing I haven't figured out is how to make the unit tests show the deprecation warning: There are unit tests which I know show the warning (test_field_names) but the warnings don't show up when I run np.test('full') or even np.test('full', raise_warnings=[FutureWarning]). I'd like to use this to catch all the numpy code that will break when convertingmulti-field copies to view. Any ideas?

@seberg
Copy link
Member

seberg commented Jul 8, 2015

Did you run the test on a clean session? Otherwise they are probably suppressed. If you are running them on a clean session, there is a with warnings.catch_warnings missing somewhere in tests that are run earlier :(. Tracking that down is annoying, I am not sure there is a nice way to do it.

The thing to remember with testing warnings is that all warnings must be created with "always" filter active.

EDIT: To be more clear, there may be some other test, which gives the warning, but does not notice there is an extra warning being given. And since the warning is not set to "always", it is ignored afterwards. So there is likely a catch warning context, which does not set the filter for all warnings (the default "once" kills things quickly.

By the way, would it be possible to wrap the whole test suit in an "always" context?

@seberg
Copy link
Member

seberg commented Jul 8, 2015

Added a PR that might make that problem go away to some degree. (but you always need to use a warning clean session when testing, and that is difficult to provide! E.g. we had it that running numpy tests a second time would give errors because of this kind of issue)

@ahaldane
Copy link
Member Author

All right, part of the problem was that the test was being skipped in python2, so it was normal not to see the warning.

If I set raise_warnings in python3, that test does indeed raise, but for a different deprecation that happens before mine. If I run the test manually (by importing it and running it) it correctly shows all the deprecations.

In order to be able to see all the deprecation warnings (and not just the first one in a test) I found I need to also comment out this line, and then run np.test('full'). So that problem is solved too.

I added a test and moved unrelated changes to another PR.

@seberg
Copy link
Member

seberg commented Jul 19, 2015

I just got a bit worried about multifield subscription/getitem. You know the wonky "buffer" logic in setitem (and we can fix that), but in general I am a bit worried about losing the non-indexed information, i.e. silently overwriting the garbage part of the VOID dtype.
Now the setitem can be fixed, but in general, is it easy to be sure that no garbage will be written into the unused fields which are used in a different view? I do think multifield assignments should certainly work, but just wanted to caution about this before we do anything. It is quite possible that this is really a no issue (after fixing VOID_setitem).

@ahaldane
Copy link
Member Author

Just to be clear, this PR should only be merged if we decide to go ahead with multi-field indices returning views (#6053). So really I need to finish that, and we need to decide what the plan is in #6053 before deciding what to do here.

As for avoiding writing garbage to unviewed fields, #6053 is already safe with respect to this. The solution was to prevent use of the 'buffer' logic in VOID_setitem if the dtype has fields, and also in the dtype_transfer path to avoid the "simple copy" case if there are fields (here)

@@ -601,6 +601,22 @@ PyArray_View(PyArrayObject *self, PyArray_Descr *type, PyTypeObject *pytype)
subtype = Py_TYPE(self);
}

if (type != NULL && (PyArray_FLAGS(self) & NPY_ARRAY_WARN_ON_WRITE)) {
const char *msg =
"Numpy has detected that you (may be) taking a view of an array "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some line breaks would be good.

@charris
Copy link
Member

charris commented Sep 10, 2016

@seberg Will your recent work on warnings help make the tests work consistently?

@seberg
Copy link
Member

seberg commented Sep 11, 2016

Yes, rebase on master, and you can be sure to find all affected tests (unless they already explicitly filter the future warnings or so). That was the point of it mostly.

@seberg
Copy link
Member

seberg commented Sep 11, 2016

I tried rebasing on master, seems somewhere in a calling function an error check is missing or something, got segfaults when running the test suits (I expected tests failing, but not segfaulting)

@ahaldane
Copy link
Member Author

Cool ,the warning improvements sound promising! I'll fix this up tonight and look into the segfault.

@charris
Copy link
Member

charris commented Sep 19, 2016

@ahaldane What is the status of this?

@ahaldane ahaldane force-pushed the warnview branch 2 times, most recently from 4f8e70e to b1cf258 Compare September 20, 2016 00:15
@ahaldane
Copy link
Member Author

Sorry, I had an unexpected turn of events last week. Updated now.

The segfault is fixed. It was due to a missing check that ret is NULL in array_deepcopy, which is only getting triggered now since a FutureWarning is raised.

I added extra wrappers to all the tests the deprecation will affect. The new warning improvements make it very easy to see what tests need to change! Very nice.

Also, originally I thought we would be making both np.diagonal and multifield-indexing return a view, but it's not so clear for np.diagonal. I've removed reference to diagonal from the warnings for now. Previous discussions about it trailed off inconclusively, see #5407 #5409 and the mailing list discussion linked there.

@charris
Copy link
Member

charris commented Sep 20, 2016

Looks like some more tests need ah fixin'.

@ahaldane ahaldane force-pushed the warnview branch 2 times, most recently from 528dca7 to 2d62047 Compare September 20, 2016 03:14
@charris
Copy link
Member

charris commented Sep 20, 2016

Hmm, the remaining error may be fixed by #8066. @gfyoung Thoughts?

@gfyoung
Copy link
Contributor

gfyoung commented Sep 20, 2016

@charris : Not sure how my PR will address this failure. My patch is addressing the DEBUG=1 machine.

@charris
Copy link
Member

charris commented Sep 20, 2016

@gfyoung Well, the simple fix for debug introduced the problem with wheels, so I'm thinking the no-cache-dir may have an impact. Guess @seberg should take a look at this also.

@gfyoung
Copy link
Contributor

gfyoung commented Sep 20, 2016

@charris : Hmmm...I guess along the trend of unusual build errors I guess that could be relevant.

@seberg
Copy link
Member

seberg commented Sep 20, 2016

Oki, seems like you figured out the "ignore" thing. Could also use the new suppress_warnings context, but it does not matter much, unless you want to make sure you only filter the one warning type.

@ahaldane
Copy link
Member Author

I think I got the last error: I was ignoring a warning when I should have been recording it. (certain config flags make the numpy warning suite complain about ignored warnings).

assert_(np.can_cast(a.dtype, b.dtype, casting='equiv'))
c = a.astype(b.dtype, casting='equiv')
with warnings.catch_warnings(record=True) as log:
Copy link
Member

@seberg seberg Sep 20, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to show it off (I really don't care about it), this could be an alternative:

with suppress_warnings() as sup:
    sup.filter(FutureWarning)

without checking that there are only FutureWarnings there, or if you want to make sure there is one future warning:

with suppress_warnings() as sup:
    l = sup.record(FutureWarning)
    ...
    assert_(len(l) > 0)

also works.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, maybe assert_warns even was a context manager by now, I somewhat think it was....

@ahaldane
Copy link
Member Author

updated to use assert_warns, which is much cleaner looking, and added a release note.

@ahaldane ahaldane force-pushed the warnview branch 4 times, most recently from 5eec07a to d8b846f Compare September 21, 2016 17:44
@charris
Copy link
Member

charris commented Sep 23, 2016

Ready to squash yet?

@ahaldane ahaldane force-pushed the warnview branch 3 times, most recently from 6b87001 to 37dd276 Compare September 24, 2016 21:30
Behavior of multi-field indexes will change in 1.13: Multi-field indexes
will return a view (not a copy) and assignment between structures with
non-identical fieldnames occurs "by position" (not "by fieldname"):

    >>> a = zeros(10, dtype=[('x', 'i8'), ('y', 'i8'), ('z', 'i8')])
    >>> a[['x', 'z']].view('i4') # Deprecation warning for multifield view
    >>> b = ones(10, dtype=[('y', 'i4'), ('x', 'f4')])
    >>> a[:] = b  # Deprecation warning for multifield assignment
@ahaldane
Copy link
Member Author

Squashed.

@ahaldane
Copy link
Member Author

Also, tweaked the warning to remove the recommended workaround, to replace arr[['f1', 'f3']].view(newtype) by arr[['f1', 'f3']].copy().view(newtype). That actually wouldn't work.

The actual workaround is to define something like

def pack_fields(arr):
    dt = arr.dtype
    packed_dtype = np.dtype(zip((n,dt.fields[n][0]) for n in dt.names))
    return array(arr, dtype=packed_dtype, copy=True)

then do pack_fields(arr[['f1', 'f3']]).view(newtype), but that seemed too long to put in the warning.

@charris
Copy link
Member

charris commented Sep 24, 2016

Hmm, that is a pretty complicated workaround. Maybe we need a packed method or some such. Not sure what the semantics should be for alignment.

@charris charris merged commit a17e905 into numpy:master Sep 24, 2016
@charris
Copy link
Member

charris commented Sep 24, 2016

Let's see what complaints we can gin up. Thanks Allan.

@ahaldane
Copy link
Member Author

Thanks for the reviews, Chuck and Sebastian!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants