
ENH: linear interpolation of complex values in lib.interp #6872

Merged: 1 commit merged into numpy:master on May 13, 2016

Conversation

@pec27 (Contributor) commented Dec 21, 2015

I wanted to be able to do linear interpolation between complex values, i.e.

y = np.interp(mu, theta, m)

rather than

y = np.interp(mu, theta, m.real) + 1.0j*np.interp(mu, theta, m.imag)

This could be achieved with just a wrapper around the above, detecting the complex case, but I thought I would reduce the doubled cache misses by covering the complex case in compiled_base.c.
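
For concreteness, a small self-contained check of the equivalence described above; this is not part of the PR itself, it assumes a numpy version that includes this change (1.12 or later), and it uses made-up sample data:

import numpy as np

# Illustrative data: points on the unit circle, sampled at 50 angles.
theta = np.linspace(0.0, 2 * np.pi, 50)
m = np.cos(theta) + 1.0j * np.sin(theta)
mu = np.linspace(0.1, 6.0, 200)

# Single-call complex interpolation (what this PR enables)...
y_complex = np.interp(mu, theta, m)

# ...agrees with interpolating the real and imaginary parts separately.
y_split = np.interp(mu, theta, m.real) + 1.0j * np.interp(mu, theta, m.imag)

assert np.allclose(y_complex, y_split)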

This is my first pull request, please be gentle :-).

@pec27 (Contributor, Author) commented Dec 22, 2015

Choosing complex vs. real interpolation via cast attempts on fp (using PyArray_ContiguousFromAny) was not catching the cast error and was failing on some of the builds. I switched to passing a flag into compiled_interp to choose the behaviour.

@charris added this to the 1.12.0 release milestone (Apr 16, 2016)
@ahaldane (Member) commented May 7, 2016

Thanks @pec27 for the PR, and sorry it's taken a while to get to it.

It looks like this gives a speedup of about 2x over the simple version you pasted, which is pretty good for a function I suspect people call repeatedly. I tested the PR with:

N = 100000
mu = np.arange(N) + np.random.rand(N)
theta = np.arange(N)
m = np.random.rand(N) + 1.0j*np.random.rand(N)
%timeit np.interp(mu, theta, m)
100 loops, best of 3: 2.7 ms per loop
%timeit np.interp(mu, theta, m.real) + 1.0j*np.interp(mu, theta, m.imag)
100 loops, best of 3: 5.61 ms per loop

The code looks correct; however, arr_interp seems a bit too big, and it feels like there is duplicated code. Plus, this code only generalizes for DOUBLE and CDOUBLE, but misses FLOAT16, FLOAT32 and the CFLOAT equivalents.

What I think would be the "ideal" version of this PR is a little more complicated than what you have done, but I don't think it would be too hard to do if you are interested, and I am happy to give pointers. It involves using numpy's "repeat" preprocessing, which is essentially a big macro system: it makes repeated copies of a block of code while varying the data types.

To get an idea of what this means, look at PR #7464, which generalizes arr_bincount using repeat, or at the use of repeat in the file numpy/core/src/multiarray/arraytypes.c.src.

So essentially your work would be to take the two code paths you've already written, but instead of splitting them up using if statements you would split them up using some combination of switch and repeat (I think).

@pec27 (Contributor, Author) commented May 7, 2016

Thanks @ahaldane for looking over this! Suggestions on improving code re-use are always welcome; I imagine I may have missed some tricks with the numpy type system. Having said that, my intention with this PR was to extend interpolation from the reals to the complex field whilst introducing no regressions, i.e. the interpolation precision is still float64 if you are doing interpolation with real values (which is also the precision used for the components of the complex numbers).

I can see the use-case of allowing different precisions, e.g. float16, float32, etc. for interpolation, but users were potentially relying both on the accuracy and on the return type being float64. There is also potential ambiguity about what to do if, say, x.dtype is float32 and fp.dtype is float16 (or even one or both being an integer type: should integer interpolation of an integer be an integer? I think not, but others may disagree). Given the potential regressions, perhaps handling different precisions should go in a separate PR?

@ahaldane (Member) commented May 7, 2016

That's a good point; users might be expecting double for the result. I think the ambiguity problems you mentioned can be avoided, but I'll drop that idea for now.

In the case where we only support double and cdouble, I would still like to think about whether there is a tidier way of writing this up. I feel like arr_interp has become long and overly nested.

There are two alternate possibilities I am thinking about (besides leaving it as it is):

  1. Have two functions, arr_interp and arr_interp_complex which are very similar, but use different types and slope calculations. There would be a bunch of duplicate code, but since it's just one function maybe it's OK. Anyway the code is already mostly divided into two big if-statements.
  2. Use repeat to duplicate the code, but only repeating over DOUBLE and CDOUBLE. On the plus side we reduce code duplication and can easily extend to more types later, on the downside code using repeat can look a bit spaghetti-ish and there's more mental overhead to it.

Or maybe there is a better way of tidying it using appropriate helper functions?

I am leaning towards option 1 now; it is probably the easiest to do. Once it is written up in that form, it shouldn't be too much work if we ever want to adapt or generalize it in the future. Does that seem reasonable?

@pec27 (Contributor, Author) commented May 10, 2016

Option 1 is pretty straightforward; as you point out, the code is basically just switched in one big if statement, so two functions is probably a little easier to read. I'll update the PR and you can see what you think. If we had demand for a third case then I'd definitely prefer something like repeat, but as it stands, perfect is probably the enemy of good and all that.

@ahaldane (Member) commented:

Yeah, I'm happier with how this looks. Let's go with two functions.

Some things to fix:

  • Always use spaces instead of tabs for indentation. It looks to me like, apart from that, your code is in the correct style (discussed in the numpy C style guide and the links to the style-guide PEPs in HOWTO DOCUMENT).
  • At some point it would be good to clean up the commits by squashing/reordering as you see fit, and the commit messages should be structured according to the commit guidelines. That is, the first line should start with "ENH:" or similar, then a blank line, then a short description.
  • Just for ease of reading, it would be nice to leave arr_interp completely unchanged, unless you think the changes make it cleaner.
  • In the python interp function, I think we might get rid of some of the nested-if structure by first doing
if np.iscomplexobj(fp):
    interp_func = compiled_interp_complex
    input_dtype = np.complex128
else:
    interp_func = compiled_interp
    input_dtype = np.float64

and then using interp_func and input_dtype appropriately in the rest of the function (a rough sketch follows at the end of this comment). Incidentally, I am not sure why the call to fp = np.asarray(fp, dtype=np.complex128) is necessary; perhaps we can remove it and avoid needing input_dtype at all.

After that I think we are pretty much done.
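
A rough sketch of how that dispatch could be used in the rest of the Python-level wrapper; this is not the merged numpy source: the function name interp_sketch is made up, period handling is omitted, and the internal imports follow the diff discussed later in this thread.

import numpy as np
from numpy.core.multiarray import interp as compiled_interp
from numpy.core.multiarray import interp_complex as compiled_interp_complex

def interp_sketch(x, xp, fp, left=None, right=None):
    # Choose the compiled routine and the working precision up front.
    if np.iscomplexobj(fp):
        interp_func = compiled_interp_complex
        input_dtype = np.complex128
    else:
        interp_func = compiled_interp
        input_dtype = np.float64

    # Casting here keeps the precision decision in one place; whether the
    # explicit dtype is needed at all is the open question raised above.
    fp = np.asarray(fp, dtype=input_dtype)
    return interp_func(x, xp, fp, left, right)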

@mhvk (Contributor) commented May 10, 2016

The cast to an array is needed to ensure that the sorting with an index array a bit further down works. I must admit I also do not understand why the dtype needs to be given, though.

p.s. How I wish all these np.asarray calls were removed or at least turned into np.asanyarray; they make subclasses so hard to get right, whereas if the code had just relied on duck typing it would "just work". Anyway, that's outside the scope of this PR (which is great!).

@charris added the "56 - Needs Release Note" label (needs an entry in doc/release/upcoming_changes) on May 10, 2016
@charris (Member) commented May 10, 2016

This looks to be getting towards the end. Could you add a note to doc/release/1.12.0-notes.rst?

@ahaldane (Member) commented May 10, 2016

@mhvk @pec27 Maybe at the very top of the function we can just do fp = np.asarray(fp). That's also a slight optimization since it avoids the cost of the call to asarray inside iscomplexobj for non-array inputs.
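
A minimal sketch of just that suggestion, assuming the same internal compiled_interp / compiled_interp_complex names as in the diff further down and a made-up helper name; this is hypothetical, not the actual numpy source.

import numpy as np
from numpy.core.multiarray import interp as compiled_interp
from numpy.core.multiarray import interp_complex as compiled_interp_complex

def _choose_interp_func(fp):
    # Convert once up front; iscomplexobj then sees an ndarray, so it does
    # not call asarray again for non-array inputs, and the later index-array
    # sorting mentioned by @mhvk still works on the converted array.
    fp = np.asarray(fp)
    if np.iscomplexobj(fp):
        return fp, compiled_interp_complex
    return fp, compiled_interp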

Inline review comment (Member) on the code context:

    Py_XDECREF(ax);
    Py_XDECREF(af);
    return NULL;
}

Also, add a blank line here between functions.

@pec27 (Contributor, Author) commented May 11, 2016

Thanks @ahaldane @mhvk, I think I got all those changes in. @ahaldane: I'm not sure about squashing commits; is that still OK given they have already been pushed to GitHub? Commit d371eb01509a3e632554c89ebe9323619a122449 was supposed to update and squash them, but then of course I had to merge back in...

@charris is the release note request for me? Do I have access to that file?

@ahaldane (Member) commented May 11, 2016

@pec27 you're right that changing history (e.g. by squashing commits) is a bad idea if anyone has already pulled your code, but in this case nobody has. When making a PR it is acceptable to rewrite the history of your copy before it is merged into our copy.

Here's how I usually do it: After I've finished modifying the code, I do

git rebase -i HEAD~6

to squash/amend the commits I want. You need to change the "6" to the number of previous commits you want to rebase over. After you squash it will ask you to rewrite the commit messages (which you should do). Then I do

git push https://github.com/ahaldane/numpy --force

to push to my GitHub fork; --force is necessary to overwrite the old history. See about-git-rebase or look at the git docs for more info, and feel free to ask if you have trouble.

@ahaldane (Member) commented May 11, 2016

Also, @charris indeed did ask you to add a release note to that file (it's in the numpy file tree). I think it would go under "new features", and you can just copy the style of the other features already there.

Also, I don't think you've removed all the tabs yet, and it would be nice to undo any changes to arr_interp - right now it looks like you've moved some code around in that function.

@pec27 (Contributor, Author) commented May 11, 2016

Ah, I think I removed all the tabs and then merged them back in. Doh. I will fix and recommit.

@ahaldane (Member) commented:

Oh and one more comment: Choosing how to squash/amend your commits is up to your judgement in general. In this case the changes are self contained enough that I would be satisfied if you simply squashed everything into a single commit, for example.

@pec27 (Contributor, Author) commented May 11, 2016

Thanks for the tips on rebasing. About doc/release/1.12.0-notes.rst: I guess I checked out before that file was created; should I create a new file with just my release note?

@pv (Member) commented May 11, 2016 via email

@charris (Member) commented May 11, 2016

@pec27 To get the notes file, make sure your master branch is up to date, then check out your branch and run git rebase master. That should replay your commits on top of current master and bring in the file for editing.

@@ -24,7 +24,7 @@
 from numpy.lib.twodim_base import diag
 from .utils import deprecate
 from numpy.core.multiarray import _insert, add_docstring
-from numpy.core.multiarray import digitize, bincount, interp as compiled_interp
+from numpy.core.multiarray import digitize, bincount, interp as compiled_interp, interp_complex as compiled_interp_complex
Review comment from @ahaldane (Member):

This line is too long; it needs to be shortened like:

from numpy.core.multiarray import (
    digitize, bincount, interp as compiled_interp, 
    interp_complex as compiled_interp_complex
    )

@ahaldane (Member) commented May 11, 2016

@pv thanks, I didn't know that. @pec27, if you don't squash yourself, I can use that.

Besides the line note I just added and the release note, it looks good to me.

@pec27 (Contributor, Author) commented May 12, 2016

Hi @charris, @ahaldane, I rebased onto master but GitHub doesn't seem happy: it thinks I am now trying to PR 693 commits. I'm not quite sure how to tell my GitHub fork that I have rebased from upstream. Any advice from wiser heads is welcome! If it helps, I have so few changes that I should be able to re-apply everything manually...

@charris (Member) commented May 12, 2016

Hmm, I have no idea what you did, although it looks like you did a merge in there somewhere. The first thing to do is to recover to something sane. For that, run git reflog, which will give you a list of hashes associated with the actions you have taken. Find a good spot to return to, and then run git reset --hard <hash>. You can also use HEAD@{n} instead of the hash, whichever you prefer. That will get you back to an earlier starting point.

@pec27 (Contributor, Author) commented May 12, 2016

Thanks @charris, that gets me back to somewhere sane. I'll try the rebase again tomorrow with fresh eyes.

@pec27 force-pushed the complex-interp branch 2 times, most recently from ccfeb6d to eaaa7e8 (May 12, 2016 20:32)
@pec27 force-pushed the complex-interp branch 3 times, most recently from 333ee7c to 67295a0 (May 12, 2016 20:57)
Squashed commit message:

lib.interp function now allows interpolation of complex fp with
complex128 precision (i.e. equivalent to lib.interp on the real and
imaginary parts). Tests are added for the non-periodic and periodic
cases.
@pec27 (Contributor, Author) commented May 12, 2016

The rebase went smoothly today; I blame bit rot :-). I think I have covered everything (release note, squashed commit, numpy style conventions, etc.). @ahaldane, could you take a look to see if it is pull-able?

@ahaldane (Member) commented:

Looks nice and clean to me now!

Thanks a lot for the PR @pec27, I'm going to go ahead and merge.

@ahaldane merged commit f1a3631 into numpy:master on May 13, 2016