Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Solves #5407: Implement and test writable view for the diagonal function #5409

Closed
wants to merge 0 commits into from

Conversation

maniteja123
Copy link
Contributor

Implements task raised in issue #5407.
The ndarray.diagonal method currently returns a view, but it is not writeable. It is documented to return a writeable view in 1.10.
Correspondingly, the writable flag is set to True and the assert statement and the test function are changed accordingly.

@maniteja123 maniteja123 changed the title Solves #5407: Implement and test writable view for the diagonal .function Solves #5407: Implement and test writable view for the diagonal function Jan 1, 2015
@maniteja123
Copy link
Contributor Author

The travis build succeeded, but running numpy.test() on my local machine is failing on the writable flag assert statement. Does the travis build also check the testing suite ?

@rgommers
Copy link
Member

rgommers commented Jan 1, 2015

@maniteja123 it does. Did you do a full rebuild of numpy before running the test suite? That would explain it. Do rm -rf build (or even git clean -xdf) before rebuilding.

@rgommers rgommers added this to the 1.10 blockers milestone Jan 1, 2015
@maniteja123
Copy link
Contributor Author

@rgommers Thanks for reminding, my bad that I forgot you mentioned that the package needs to be rebuild to see the changes in the compiled code. I have rebuilt it and the test suite succeeded.

@charris
Copy link
Member

charris commented Jan 1, 2015

The commit message should probably begin ENH: and the first line shortened, maybe

ENH: Make result of ndarray.diagonal writeable.

Then expand on the details in the explanatory section after a blank line. Include Closes #5407 at the end to automatically close the task when this is merged.

@charris
Copy link
Member

charris commented Jan 1, 2015

Also, make a note in doc/release/1.10.0-notes.rst under New Features. Should probably note that writing to diagonal will affect the original array.

@maniteja123
Copy link
Contributor Author

Oh, sorry I unnecessarily committed before asking for any more changes to be done. I will add it now.

@charris
Copy link
Member

charris commented Jan 1, 2015

No problem.

@maniteja123
Copy link
Contributor Author

**ndarray.diagonal** result is now writable
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The result of **ndarray.diagonal** is now writable. Any changes made to
the diagonal will also affect the original array.

Is this fine ?

@charris
Copy link
Member

charris commented Jan 1, 2015

I think you mean now writable, not not writable ;) Also, the ~ needs to match up with the length of the heading. Otherwise, looks good.

@argriffing
Copy link
Contributor

I guess it's too late for this, but what about a copy=True keyword flag for ndarray.diagonal? The backwards incompatible change of this new numpy version is used as an example in this blog post. His perspective is probably worth listening to, and I'm sure that he would find "The list post didn't draw much response" regarding the backwards incompatible change of ravel after 4 days on the numpy-discussion mailing list quite amusing considering that he would probably consider 4 years a more appropriate notice for this kind of change, and that it should coincide with a numpy major-version update.

Given the importance of NumPy in the scientific Python ecosystem – the majority of scientific libraries and applications depends on it -, I consider its lack of stability alarming. I would much prefer the NumPy developers to adopt the attitude to change taken by the Python language itself: accumulate ideas for incompatible changes, and apply them in a new version that is clearly labelled and announced as incompatible.

@maniteja123
Copy link
Contributor Author

screenshot 38
The visual error is due to to putting the data in indented block. I actually had them matched. :)

@charris
Copy link
Member

charris commented Jan 1, 2015

@argriffing It was made not writable in 1.9 to smoke out such uses. Konrad has a point, but I'm not sure Python 3 is the best example of an alternative approach :) Numpy is, I think, settling in, radical changes are fewer than they have been. The question that concerns me at this point is how to make a transition to a better version, for once numpy settles, it will be outdated, yet its penetration will make transition to something better difficult. I've been toying with the idea of picking a 'stable' release and making larger changes thereafter. The question is, what should those changes be? I'm still keeping an eye on dynd and would like to help it into the larger world if that is possible. If you have any suggestions for moving numpy along I'd like to hear them. Speaking of which, would you like commit rights?

Adding a copy keyword would not be too difficult, although using .copy would be just as easy. I suppose the question is what is the best default?

@argriffing
Copy link
Contributor

The question that concerns me at this point is how to make a transition to a better version, for once numpy settles, it will be outdated, yet its penetration will make transition to something better difficult.

The way this will happen is that Continuum will use its $millions to develop and promote pandas as a middleware shim between numpy and its users, and to simultaneously develop dynd as you suggest. Soon Pandas will support dynd, numpy, and dynd-premium, if it does not do this already. Eventually numpy will be deprecated and only dynd and dynd-premium will be used instead, and no former user of numpy will even notice.

@charris
Copy link
Member

charris commented Jan 1, 2015

Well, I'm not sure if you are joking or not :) There is a ton of scientific code out there that doesn't use pandas. Konrad Hinson, for instance, has a lot of c code accessing the numpy api. I think we need something more straight forward here. Although changing imports is going to be a problem. It would be nice if something like

import super-numpy as np

would do the trick.

@maniteja123
Copy link
Contributor Author

just asking out of curiosity, if I compare my issue5407 branch with numpy master branch
maniteja123/numpy@numpy:master...issue5407 I am seeing this another commit maniteja123@0a83bf9. I couldn't understand how this is coming. Is it that both are waiting to be merged ?

@charris
Copy link
Member

charris commented Jan 1, 2015

@maniteja123 Not sure what you mean. Did you pull/merge anything into your branch or fork from something other than master? Note that if you make a new branch while in a branch other than master and don't explicitly base the new branch on master, it will be based on the branch you were in. You can fix that by git rebase master, but keep git reflog in mind in case of trouble.

@abalkin
Copy link
Contributor

abalkin commented Jan 1, 2015

Adding a copy keyword would not be too difficult, although using .copy would be just as easy. I suppose the question is what is the best default?

I think there are strong arguments favoring copy=True:

  1. This choice avoids a silent change in behavior when going from 1.8 to 1.10.
  2. Principle of least surprise. The strides trick used by diagonal view is not obvious and users who don't read documentation are likely to expect a copy unless they see copy=False.
  3. Code that relies on the new feature will be visually distinct. (One of Konrad's points. ) Once you see x.diagonal(copy=False), you know that numpy >= 1.10 is required.

@maniteja123
Copy link
Contributor Author

I had created from the master only and also rebased it in the start. I was referring to https://github.com/numpy/numpy/pull/5409/commits . I couldn't get how the first commit has come into this PR.

@abalkin
Copy link
Contributor

abalkin commented Jan 1, 2015

BTW, what is the future of numpy.diag?

It is currently documented to return a copy:

numpy.diag = diag(v, k=0)
    Extract a diagonal or construct a diagonal array.
...
    Parameters
    ----------
    v : array_like
        If `v` is a 2-D array, return a copy of its `k`-th diagonal.
...

but actually returns a read-only view

>>> x = numpy.eye(2)
>>> numpy.diag(x).flags['OWNDATA']
False
>>> numpy.diag(x)[0] = 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: assignment destination is read-only

@njsmith
Copy link
Member

njsmith commented Jan 1, 2015

We could easily put off this change to a later release as well, if we want
to give people more time to adjust. In fact we never really had this
discussion - the "1.10" schedule was just a straw man, written at a time
when IIRC numpy hadn't made a release in ~2 years.

I am very strongly in favor of making changes like this on a
conservative-but-finite time scale. Contrary to the blog post above, this
is how python itself is developed 99% of the time (pretty much every
release has incompatibilities - see future imports etc.), and you're not
going to find a lot of people holding up the py2->py3 transition as a great
model for other projects to aspire to copy. I'm very sympathetic to the
problems they point out with rolling deprecations, but all the other models
are worse.
On 1 Jan 2015 19:55, "Maniteja Nandana" notifications@github.com wrote:

I had created from the master only and also rebased it in the start. I was
referring to https://github.com/numpy/numpy/pull/5409/commits . I
couldn't get how the first commit has come into this PR.


Reply to this email directly or view it on GitHub
#5409 (comment).

@maniteja123
Copy link
Contributor Author

@abalkin The documentation suggests that the diag returns a copy, but as you pointed it is read-only view until this change, now if I am testing it, it is accepting the assignment and also the WRITABLE flag is now True.Moreover the original array is getting affected. Also, the tests for writability also worked accordingly, https://github.com/numpy/numpy/pull/5409/files#diff-bc1e8d40b8b7758facc5e143be016f99R1750

@charris
Copy link
Member

charris commented Jan 1, 2015

@maniteja123 Probably this means you have merged master into your branch.

@abalkin Hmm, maybe diag could return a copy, it is mostly for matlab compatibility. Although having various diagonal like functions with different properties is confusing and asking for trouble. Trouble is always available and very willing.

@maniteja123
Copy link
Contributor Author

@charris I did pick and squash by git rebase -i master. I don't remember merging the master, but anything I need to do now ?

@maniteja123
Copy link
Contributor Author

I've just encountered PyArray_NewFromDescr_int, so it does not look like numpy is very consistent.

I could just find only this function with a common prefix and moreover the function PyArray_NewFromDescr is just using PyArray_NewFromDescr_int https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/ctors.c#L1118
which looks like it is a enhanced version probably. Just pointing out, I am not even sure what the reason was.

@abalkin
Copy link
Contributor

abalkin commented Jan 2, 2015

@maniteja123, when you post links to specific lines, please use tags such as v1.9.1rc1 or commit hash values such as 261de3f in the URLs. Otherwise your links will become invalid soon. For example, the following two links to PyArray_NewFromDescr_int are stable:

@maniteja123
Copy link
Contributor Author

@abalkin Thanks for the heads up, I was just using the master branch to browse through the code. I just wanted to show the use of PyArray_NewFromDescr_int in PyArray_NewFromDescr. I will definitely follow the conventions.

@maniteja123
Copy link
Contributor Author

Coming to making a copy of the object, it looks like it is done through the PyArray_NewCopy https://github.com/numpy/numpy/blob/v1.9.1rc1/numpy/core/src/multiarray/shape.c#L232 or NPY_COPY_PYOBJECT_PTR https://github.com/numpy/numpy/blob/v1.9.1rc1/numpy/core/src/multiarray/shape.c#L332 to create a copy.
Please pardon my ignorance, but if you could correct me and suggest some place to look into, I would read up more on this.

@jaimefrio
Copy link
Member

I think this is the commit that changed the return from a copy to a view, fd6cfd6, you can see there how it used to be implemented, PyArray_NewCopy it was.

@maniteja123
Copy link
Contributor Author

@jaimefrio Oh, great. I get it now. So, it was initially returning a writable copy till 1.9, readable view in 1.9 and now a writable view in 1.10. But does adding a copy keyword to the constructor warrant change in the C API function implementation altogether ?

@maniteja123
Copy link
Contributor Author

@charris You were saying that the result should be writable if the input array is writable. But if the flag is changeable, isn't this a risky operation ?

@jaimefrio
Copy link
Member

You are currently only modifying the C API. Once you are finished with it, if a new argument is added, you also need to update the Python API. You will need to add the new copy argument to the array_diagonal method and make it call the modified/new C API function, in the methods.c file, see here. I cannot find right now where does this method gets its docstring set, but that needs updating too.

You will also have to modify the signature and docstring of the diagonal function in fromnumeric.py, see here.

And there are probably diagonal methods in both the masked array and matrix subtypes that could also be updated to take advantage of the new argument. That may be material for a different PR, though.

Lots of teeny details to take care of, and I may be forgetting something. That's what you get for messing with the API! :-)

@njsmith
Copy link
Member

njsmith commented Jan 2, 2015

Is it really worth bothering adding a new C API function that just does
PyArray_Copy(PyArray_Diagonal(x)) (+ standard refcounting/error-checking
annoyances)? If someone just showed up with that one day I think we'd look
at it pretty dubiously -- and in this case no user has even asked for it
AFAICT, it's just a hypothetical need.

(At the Python level, the argument is even more starkly against adding a
copy= argument: writing np.diagonal(x, copy=True) is actually longer than
writing np.diagonal(x).copy(), and less orthogonal. Compare to indexing:
it's always going to be x[0, ...].copy(), not x[0, ..., copy=True].)

On Fri, Jan 2, 2015 at 7:59 PM, Jaime notifications@github.com wrote:

You are currently only modifying the C API. Once you are finished with it,
if a new argument is added, you also need to update the Python API. You
will need to add the new copy argument to the array_diagonal method and
make it call the modified/new C API function, in the methods.c file, see
here
https://github.com/numpy/numpy/blob/maintenance/1.9.x/numpy/core/src/multiarray/methods.c#L2146.
I cannot find right now where does this method gets its docstring set, but
that needs updating too.

You will also have to modify the signature and docstring of the diagonal
function in fromnumeric.py, see here
https://github.com/numpy/numpy/blob/maintenance/1.9.x/numpy/core/fromnumeric.py#L1120
.

And there are probably diagonal methods in both the masked array and
matrix subtypes that could also be updated to take advantage of the new
argument. That may be material for a different PR, though.

Lots of teeny details to take care of, and I may be forgetting something.
That's what you get for messing with the API! :-)


Reply to this email directly or view it on GitHub
#5409 (comment).

Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org

@maniteja123
Copy link
Contributor Author

@jaimefrio Thanks.
The underlying C API needs to be changed to accommodate PyArray_NewCopy or something like that.

But if the signature is changed, updating the Python API means

  • changing static declarations like array_diagonal
  • corresponding wrapper functions like diagonal , which calls array_diagonal which inturn uses PyArray_Diagonal
  • The ma.diagonal
  • I think matrix.diagonal method uses the ndarray.diagonal method directly.
  • The docstrings for each of the functions

Hope I got it right now :)

@maniteja123 maniteja123 changed the title Solves #5407: Implement and test writable view for the diagonal function ENH: Solves #5407: Implement and test writable view for the diagonal function Jan 3, 2015
@maniteja123
Copy link
Contributor Author

@charris, I know there has been a lot of discussion on the mailing list, as well as here. I just wanted to inquire if there is a consensus reached.

@charris
Copy link
Member

charris commented Jan 13, 2015

My preference at this point would be to revert the previous changes so that PyArray_Diagonal returns a copy, add a new API function PyArray_Diagonal2 that has a copy argument so that either copies of views can be returned, implement PyArray_Diagonal in terms of the new function, and add a copy=1 argument to the diagonal method and function. It's kind of ugly, but conservative. @njsmith I think will go along with the new API function, although he doesn't like it, but I think we can still argue about the default value of copy, or maybe start with copy=1 and a FutureWarning.

@charris
Copy link
Member

charris commented Jan 13, 2015

I also think the view should be ro if the array is.

@maniteja123
Copy link
Contributor Author

Sorry, I was trying to make the branch on par with master, and did git rebase master, but I think it closed the PR. Please mark this as invalid, and do tell me how to prevent it next time. Until then, I will comment on this itself, instead of opening a new PR until you suggest so.

Just to clarify,

  • the situation would be reverted back to 1.7 version, where a copy of the array is returned.
  • For the PyArray_Diagonal2 function, it should call the PyArray_Diagonal function accordingly to copy argument.
  • The diagonal function would now have a copy argument in fromnumeric.py.
  • The array_diagonal function in methods.c needs to be updated accordingly.
  • If copy==1, then the PyArray_Diagonal itself would be called.
  • If copy==0, then depending on the WRITABLE flag of the input, a view needs to be returned.

@maniteja123
Copy link
Contributor Author

I think doing git reset deleted all the commits and closed the PR. I will be careful next time. Sorry again, for that this PR is important part of the milestone, as well as all the discussions. I will try to do it properly now.

@maniteja123 maniteja123 deleted the issue5407 branch January 16, 2015 19:17
@charris
Copy link
Member

charris commented Apr 7, 2015

@maniteja123 Could you open a new PR?

@charris charris modified the milestones: 1.10.0 release, 1.10 blockers Apr 7, 2015
@maniteja123 maniteja123 restored the issue5407 branch April 7, 2015 01:01
@maniteja123
Copy link
Contributor Author

@charris I will do it tonight, a little work now. Also, please do tell me if there is anything else to do than starting a new PR.

@maniteja123
Copy link
Contributor Author

Ah, sorry didn't see the needs decision label and 1.1.0 release milestone. Will start a new PR ASAP.

@maniteja123
Copy link
Contributor Author

Just a small doubt. If I need to open a new PR, there should be some changes committed in the branch. Right now, numpy/master and maniteja123/issue5407 are even and since this still needs a decision, what should I do ?

@ahaldane
Copy link
Member

This looks like it was marked as a 1.10 blocker, and was accidentally closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants