Skip to content

Cleanup subscript #436

Closed
wants to merge 23 commits into from

3 participants

@87
87 commented Sep 11, 2012

Hi, this is some older refactor work re-committed against the new master. It was my intention to make the subscript code more readable and a bit cleaner in the handling. I tried to make a distinction between the handling of Python objects, NumPy scalars and NumPy arrays. Mostly its just moving functionality around, but I've tried to take care of the edge-cases and quirks in the code by adding a comprehensive set of test cases.

One test case is disabled, however, because it fails with a segmentation fault. This is also the case for the current master.. something to fix later on. (Update: this has been fixed.)

87 added some commits Oct 3, 2011
@87 87 ENH: Cleanup and simplify array_item code.
Rename array_big_item to array_item_asarray and move special case
for array_item_nice into array_item_asscalar. Also remove redundant
PyArray_Return and rename array_item_nice to array_item.
741b1ba
@87 87 ENH: Move fancy indexing code and ellipse check code into own functions. 0d7467e
@87 87 ENH: Use array_subscript_asarray to ensure an array type return value 51c6e8d
@87 87 ENH: Move Python related subscript code into array_subscript a2bcaa3
@87 87 ENH: Rename array_subscript to array_subscipt_fromobject and refactor…
… array_subscript_nice to array_subscript
1919d37
@87 87 ENH: Change silent integer conversion failure into index error. 8d0870a
@87 87 BUG: Fix boolean cases. 28e8ec5
@87 87 ENH: Optimization in subscript_simple for arrays with size > 1 7cee679
@87 87 ENH: Rename array_ass_big_item to array_ass_item_object a6cbd61
@87 87 ENH: Cleanup subscript assignment code. fa0c0fd
@87 87 BLD: Declare array_ass_item_object before use. 31f59c4
@87 87 ENH: Small optimization for redundant PyArray_Check 7f1893a
@87 87 BUG: Tuple conversion may not propagate errors from int conversion. 90822e5
@87 87 ENH: Optimize array_subscript by removing redundant error check. 9e0ef60
@87 87 ENH: Change IndexError to ValueError for integer conversion errors. 59e8ea9
@87 87 TST: Add more indexing tests. f97dfdc
@87 87 BUG: Multiple fixes from merging. d46ea9a
@87 87 STY: Text changes. 9798558
@87 87 BUG: Fix old case for boolean subscripts with non-matching shape. 34f71ad
@87 87 DOC: Re-insert comment 96624f4
@87
87 commented Sep 11, 2012

Note: This will break a future merge with the NA-branch.. ;-(

@teoliphant teoliphant commented on the diff Sep 12, 2012
numpy/core/src/multiarray/mapping.c
#if 0 /*PY_VERSION_HEX >= 0x02050000*/
- PyObject *ind = PyNumber_Index(op);
- if (ind != NULL) {
- value = PyArray_PyIntAsIntp(ind);
- Py_DECREF(ind);
- }
- else {
- value = -1;
- }
+ PyObject *ind = PyNumber_Index(op);
@teoliphant
NumPy member
teoliphant added a note Sep 12, 2012

Whitespace only difference?

@87
87 added a note Sep 12, 2012

Not really, it is because of an additional "if" statement in the code..

@87
87 added a note Sep 14, 2012

It seems the diff code does not work properly on indentation when there are preprocessor statements involved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@teoliphant
NumPy member

In general, it is great to see these kinds of fixes and cleanup. As one-big pull request it is difficult to review. You could be patient with us as we try to review it all, or preferrably break up the pull request into little ones that can be more easily accepted and/or commented on as a whole.

Please watch your use of white-space that changes shown are really white-space. There is a git incantation that strips white-space changes, but I don't know what it is right now --- but that doesn't help the review process.

Thank you for your help and efforts!

@87
87 commented Sep 12, 2012

Thanks! I'd rather keep it as one pull request, but you could check the commits individually to see what has been changed in a more readable fashion.

@87
87 commented Sep 12, 2012

P.S. The failure for this build is also unrelated to the pull request..

87 added some commits Sep 14, 2012
@87 87 BUG: Fix failing subscript test cases
This fixes indexing with NumPy boolean scalar values, which caused
crashes before. Both Python and NumPy boolean scalars now behave the
same. This contains a few small changes to PyArrayMapIterNew, where
the PyArrayMapIterObject is memset to zero and a fix for the crash,
which was caused by _nonzero_indices, which did not initialize any
iterators because ndim is zero for scalars. *IterNew will now check
for scalars before running that part of the code.
f65ca97
@87 87 BUG: Repair tests
Oops..
78df9a6
@87 87 ENH: Add exception to _nonzero_indices for zero-dim arrays
This function causes a crash otherwise, because it loops over the number
of dimensions to construct sub-iterators. If the number of dimensions is
zero, the sub-iterators will not be initialized, causing problems later on.
5c716c8
@teoliphant
NumPy member

This one looks good.

@teoliphant
NumPy member

This one looks fine -- I like getting rid of the remnants of "bigarray"

@teoliphant
NumPy member

OK.

@teoliphant
NumPy member

What is the purpose of this patch? Why are you removing the ability to index a 0-d array with a boolean mask that is also 0-d? This does not seem right to me.

87 replied Oct 10, 2012

Hmm, probably because I thought it would be handled by array_boolean_subscript, but you're right, it changes the behaviour of an edge case:

>>> a = np.array(True)
>>> a[np.array(True)]
array(True, dtype=bool)
>>> a[np.array(False)]
array([], dtype=bool)

became (after f65ca97)

>>> a = np.array(True)
>>> a[np.array(True)]
array([ True], dtype=bool)  <--changed
>>> a[np.array(False)]
array([], dtype=bool)

I can fix it, but is it correct anyway to have a 0-size 1-dimensional array on False and a 1-size 0-dimensional array on True?

Also, this has not changed:

>>> a = np.array(True)
>>> a[True]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: 0-d arrays can't be indexed
>>> a[False]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: 0-d arrays can't be indexed

Should this case be handled the same as the zero-dimensional case?

NumPy member

Isn't the behaviour after the patch more logical though? I somewhat think it should also work for higher dimensional arrays for consistency, but thats unrelated. It seems really odd to ravel one result and not the other (and not ravelling the empty result would mean an error or a None returned, which seems no good). Unless its a regression, from a consistency point of view I prefer the changed behaviour personally.

NumPy member

First think, then write... sorry :(. Any ravelling is done on the selected dimensions, so no dimensions selected means no ravellin. And I guess the 1-d empty array is an OK workaround for a "nothing" array.

@teoliphant
NumPy member

I am O.K. with merging all of these except commit 28e6ec5 which I don't understand the motivation for. 87@28e8ec5

@seberg

Just to note, this is not a bug I think, 1D array + list -> fancy indexing and a[np.array([], dtype=np.intp)] correctly selects nothing along that axes. The error you see below is just because the array has the wrong dtype.

@seberg
NumPy member
seberg commented Nov 2, 2012

I just saw this seems like a lot of work :), just to avoid conflicts, I think this should be merged before anything from #2702 since that will be far easier to change. If it interests anyone, I ran the test from there for tuple indexes, and there is no change but Error types (well one buggy corner case of the zero sized boolean indexes corner case changed for the better, but that does not count).

@seberg
NumPy member
seberg commented Apr 12, 2013

This has been merged in gh-3225.

@seberg seberg closed this Apr 12, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.