Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reset flags when Axes are removed. Array might now be 1D, or removed axe... #420

Merged
merged 2 commits into from
Sep 21, 2012

Conversation

seberg
Copy link
Member

@seberg seberg commented Sep 3, 2012

This is related to Issue #387, and fixes half of it, the other half is that the strides are still 0 if keepdims=True is used, which means that the flags are unnecessarily False. For this, the strides have to be replaced if they are 0 (similar to what happens in shape.c -> PyArray_Newshape, with the code also related to Issue #380). I will probably add commits here related to this as well, or maybe someone who knows reduction.c well can see if those 0 strides cannot be set to something better easily?

One test for it would be (not sure where to put it, in test_regression?):

def test_squeeze_contiguous():
    """#Issue 387 related"""
    a = np.array([1,2,3]).reshape(3,1)
    b = a.copy('F')

    sa = a.squeeze()
    sb = b.squeeze()

    assert_(sa.flags.c_contiguous)
    assert_(sb.flags.f_contiguous)
    assert_(sa.flags.c_contiguous)
    assert_(sb.flags.f_contiguous)

@travisbot
Copy link

This pull request fails (merged e4bafb2 into cd9092a).

@seberg
Copy link
Member Author

seberg commented Sep 3, 2012

I guess the fail is unrelated...? Should I add a commit here in the same branch if I am unsure if it makes sense?

I would suggest maybe:

@@ -60,7 +60,7 @@ allocate_reduce_result(PyArrayObject *arr, npy_bool *axis_flags,
     for (idim = ndim-1; idim >= 0; --idim) {
         npy_intp i_perm = strideperm[idim].perm;
         if (axis_flags[i_perm]) {
-            strides[i_perm] = 0;
+            strides[i_perm] = stride;
             shape[i_perm] = 1;
         }
         else {

Which as far as I see, will retain contiguousity if arrays are C-contiguous for:

    a = np.arange(27).reshape(3,3,3)
    assert_(a.sum(1, keepdims=True).flags.c_contiguous)

It will work also for F-contiguous arrays, however reduce.c will always destroy F-contiguousity if they have a size one dimension (keepdims does not matter). This is because the reduce result users PyArray_CreateSortedStridePerm which special cases one sized dimensions. I am not sure if this creation of unnecessary uncontiguous (and 0 stride) arrays is an issue or not.

@travisbot
Copy link

This pull request fails (merged 380ce94 into cd9092a).

@seberg
Copy link
Member Author

seberg commented Sep 4, 2012

@certik, in case you wonder, I think the commits fix the invalid flags issue. The rest I wrote here has nothing to do with invalid flags, just wondering if the behaviour shouldn't be improved.

@certik
Copy link
Contributor

certik commented Sep 4, 2012

This looks good. The test failures are unrelated, see the #424.

@charris, if you have a minute, could you please review this patch? It looks simple enough, but I am not so familiar with the internals yet.

@njsmith
Copy link
Member

njsmith commented Sep 5, 2012

The patch as it stands seems pretty straightforward; hard to see how it could be harmful.

As far as the strides discussion goes, wouldn't it make more sense to just fix the contiguity-checking code so that it can actually recognize that these contiguous arrays are contiguous? There's nothing wrong with having a 0 stride (or any stride value for that matter) on a length 1 dimension.

@seberg
Copy link
Member Author

seberg commented Sep 5, 2012

@njsmith I tried that and it created errors and segfaults, though maybe I did something wrong, but it seemed like there maybe some code that expects the strides to add up correctly even when they are size 1-dims. Alternatively maybe it corrupts code if ndim > 1 arrays can be both C and F contiguous. So I think someone who knows this strides stuff would have to check if ignoring 1-sized dims for contiguousity is possible and/or where there would need to be changes to make it possible.

@seberg
Copy link
Member Author

seberg commented Sep 5, 2012

@njsmith and whoever might be interested in the approach of just allowing length 1 dimensions to have any stride. I have created a branch that tries this. I only hacked a bit that flags should usually be set to new logic (in ctors.c), but tests now run fine after adapting loops...

@njsmith
Copy link
Member

njsmith commented Sep 21, 2012

@seberg or someone else: I've managed to confuse myself about the status of this PR, and I guess I'm not the only one since it's been sitting idle for so long now :-).

Is it ready to commit?

Does it close any outstanding bugs?

And is something like the seberg/cflags branch necessary to close any outstanding bugs, or is it just a proof of concept for a possible cleanup?

@seberg
Copy link
Member Author

seberg commented Sep 21, 2012

@njsmith this is ready to commit IMO, it fixes issue #387 and also flags being wrong with np.squeeze (and likely a bit other functions too).

The cflags stuff is just a proof of concept of being able to allow such things as a 1x100x1 array to be both C and F contiguous (even though its not 1D) this has nothing to do with bugs, its more or less unrelated (and I think something that should likely wait for a more major release, since maybe someone like even numpy internals might make assumptios that would be broken by it).

@njsmith
Copy link
Member

njsmith commented Sep 21, 2012

Okay, great, I'll merge it.

On the cflags stuff: I'd encourage you to go ahead and propose it, it looks like a simple bug-fix to me. C and F contiguous are what they are, our flags should represent that accurately. It's possible it might break something somewhere, but every change risks that, and the end result would just be that we get a chance to discover and clean up some broken code. (If the tests pass, then that dramatically limits the scope of the possible breakage.) And realistically there is no such thing as a "more major release", numpy 2.0 is not going to happen any time soon, or possibly ever.

njsmith added a commit that referenced this pull request Sep 21, 2012
Reset flags when Axes are removed. Array might now be 1D, or removed axe...
@njsmith njsmith merged commit fd63e8f into numpy:master Sep 21, 2012
@seberg
Copy link
Member Author

seberg commented Sep 21, 2012

@njsmith sorry, but can you give a small pointer here? The cflags stuff needs a bit cleanup, and then looking through numpy code to fix that the more general flags are actually used everywhere. Otherwise it seems to work, just might take a bit to change things everywhere carefully. So, after cleaning up, I will just do a PR so everyone can comment and decide if this is a good idea or not?

I am also a bit confused about bug fixes vs. real changes. I would now just do PR for both and if they conflict each other simply fix it later, or is there better way?

@njsmith
Copy link
Member

njsmith commented Sep 22, 2012

@seberg: Yes, I'm suggesting that you do a PR with the flag calculation fixes (together with any other changes you discover are necessary).

I'm not sure what you're asking in your second paragraph. PR's are the right way to submit any kind of change, whether it's to fix a bug or to clean up some ugly code or to add a fancy new feature... whatever. The ideal PR is minimal and self-contained. "Minimal" means that if you have two unrelated changes, they should go in two PRs. "Self-contained" basically means that each PR should always leave master in a releasable state -- so we don't want one PR for the new code and a different one for the tests, or one PR that breaks something and a second PR that fixes it again. Maybe that helps?

If you have several conceptually unrelated changes that overlap and will create git conflicts, then there are a few options, you can submit them one at a time, or submit them all at once to let people start reviewing them and then fix up the conflicts later. Any of these can work, git makes it pretty easy to sort things out after the fact.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants