New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to return indexes in conditional abundance matching #913

Merged
merged 12 commits into from Jun 8, 2018

Conversation

Projects
None yet
2 participants
@Christopher-Bradshaw
Copy link
Contributor

Christopher-Bradshaw commented Jun 7, 2018

As discussed in person + slack the primary goal of this is to:

  • Add an option to return the index of y2 that was matched (rather than the value in y2) to the CAM code

I made one other minor change - previously, subgrid noise was not being added to the left and right edges. Now it should be.

I've put a number of comments on the PR where I have done something I am not sure about. Let me know what you think!

from .tests.naive_python_cam import sample2_window_indices


def conditional_abunmatch(x, y, x2, y2, nwin, add_subgrid_noise=True,
assume_x_is_sorted=False, assume_x2_is_sorted=False):
assume_x_is_sorted=False, assume_x2_is_sorted=False, return_indexes=False):

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

I'm not sure what the best api is - another option would be to have two functions:
conditional_abunmatch and conditional_abunmatch_indexes. I don't know how this is done elsewhere in halotools/how you would like to do it going forward. Your call :)

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

The only time I prefer creating an entirely new function is in cases where adding in some new feature considerably slows performance; if performance is mission-critical to the original function, I just write a new one. The additional algebraic operations added here are minor enough that I think what you've done here is preferable.

@@ -159,7 +183,9 @@ def cython_bin_free_cam_kernel(double[:] y1, double[:] y2, int[:] i2_match, int
cdef int idx_in1, idx_out1, idx_in2, idx_out2
cdef double value_in1, value_out1, value_in2, value_out2

cdef double[:] y1_new = np.zeros(npts1, dtype='f8') - 1
cdef int[:] y1_new_indexes = np.zeros(npts1, dtype='i4') - 1

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

I tried to define these in if statements:

if return_indexes:
    cdef int[:] y1_new_indexes = np.zeros(npts1, dtype='i4') - 1
else:
    cdef double[:] y1_new_values = np.zeros(npts1, dtype='f8') - 1

but that wasn't allowed?

Defining both is a bit unnecessary, but I couldn't find a nice way around it

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

Actually, I don't know a way around this either. This seems fine.


if iy2 > iy2_max:

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

I'm pretty sure that this shouldn't be possible and this set of if if else block confused me a bit at first. I think this reads a bit easier now but I'm happy to revert if you disagree.

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

Agreed this is a little easier to read, but min returns a python object, so it's better practice to use a c-imported min function instead:

from libc.math cimport fmin as c_min

After looking at the cython -a output, in this case I'm pretty sure the cython compiler figures it out anyway and there is no performance hit, so just pointing this out so you are aware of python-cython interactions.

Also watch out for this on lines 130 & 131 inside the definition of get_value_at_rank

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

👍 thanks.

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

Actually, as these are ints I don't think there is a min/max func in libc.math for them. (checked https://github.com/cython/cython/blob/master/Cython/Includes/libc/math.pxd)

http://cython.readthedocs.io/en/latest/src/userguide/numpy_tutorial.html suggests that cython has a fast min/max for ints (as you saw in the cython -a output).

high_rank = rank1 + 1
if low_rank < 0:
low_rank = 0
elif high_rank >= nwin:

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

I think this should just be if rather than elif? That is what I have in the function (or the same with min/max)

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

Ok, no problem at all to leave as is - mostly I was pointing this out for pedagogical reasons: since this is one of your first cython function contributions I thought it might be helpful to be explicitly aware of all python-layer interactions. But cython -a does reveal that the cython compiler has this very well cleaned.

@@ -384,11 +386,13 @@ def test_subgrid_noise1():
result2 = conditional_abunmatch(x, y, x2, y2, nwin1, add_subgrid_noise=True)
assert np.allclose(result, result2, atol=0.1)
assert not np.allclose(result, result2, atol=0.02)
assert np.all(result - result2 != 0)

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

This should be enough to assert that we are adding noise to the leftmost/rightmost edges.

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

Does this actually ensure that noise was added to the edges? Or just that noise was added somewhere?

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

I think it ensures that noise was added to every point

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

Yes, you're right, upon second look I see that now.

x = np.random.uniform(0, 10, n1)
y = np.random.uniform(0, 1, n1)

with NumpyRNGContext(fixed_seed2):

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

If I used the same seed here as above tests were passing when they shouldn't (before I had correctly reordered/reindexed the indexes). I think that was because we were in the special case of x == x2 and y == y2 + c.

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

Right, yes, good catch, sometimes I have to use different seeds for such purposes as well.

@cython.boundscheck(False)
@cython.nonecheck(False)
@cython.wraparound(False)
def get_value_at_rank(double[:] sorted_values, int rank1, int nwin, int add_subgrid_noise):

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

Ok, quick implementation comment about this get_value_at_rank function. This function is declared with def rather than cdef, which means get_value_at_rank returns a python object. If you run cython -a bin_free_cam_kernel.pyx and inspect the html output, you will see deep yellow on line 270. Click on it and see all the code cython generates to compile this line. To fix this, get_value_at_rank needs to be C-declared. See, e.g., the definition of _bisect_left_kernel. With this change, this means that get_value_at_rank will no longer be accessible by the outside world, but only accessible within cython, so you will not be permitted to add it to __all__.

In general, it's good to get in the habit of always running cython -a whenever reviewing cython code. Because cython is a superset of python, the compiler will allow you to write code with avoidable python-layer interactions without complaining, so you just have to be vigilant and always check this manually.

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

I defed it so that it would be accessible to the code that does the left and right edges.

But maybe a better way would be to have an internal cdef int _get_value_at_rank and then an external def get_value_at_rank that just wraps this?

Running cython -a is definitely a good idea

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

bin_free_cam_kernel.html looks clean as of this commit. I also just did a simple timing test, comparing your branch to the current version of master, and it looks like there is a nearly negligible performance increase from this feature, so I think this PR is all set once it passes CI.

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

Sweet, thanks for the comments!

@cython.boundscheck(False)
@cython.nonecheck(False)
@cython.wraparound(False)
cdef int _find_index(int[:] arr, int val):

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

The _find_index function implements a blind lookup for a value in the correspondence indices array. The cython is clean (though watch out for the undeclared index variable on line 265), but I think the algorithm can be improved by taking advantage of the binary search that is already implemented, no? While the correspondence_indx2 array is not sorted, the sorted_cdf_values2 maintains a sorted order. If an original array of indices were defined at the very beginning of the engine, simply with np.arange(npts2), couldn't that array be manipulated alongside sorted_cdf_values2 to avoid the blind lookup? Naively, it seems this would be faster; it could be that the performance hit from the additional manipulations wash out the improvement, I don't know, but it seems worth testing since the cythonized binary search machinery is already in place. Let me know whether this makes sense.

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

Just to make sure I understand what you are suggesting:
In the same way that we have the correspondence_indx2 array to keep track of how to unsort the sorted_cdf_values2, we can have a another array that keeps track of where things are in the correspondence_indx2? That would make this _find_index just a lookup in that map.

I think that this has roughly the same time complexity - when we update the correspondence indicies (to shift them along) we will need to update this array too and that would be roughly O(len(nwin)) which is I think what we have now.

But I could well have missed something... I definitely don't fully understand all the internals.

(index cdefed in f05be1c)

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

You may be right, this might not be worth the trouble. After cleaning the cython of the remaining python interactions as discussed elsewhere, could you just do a simple timing test comparing your implementation against what is currently in master? If runtimes are comparable, then it means that further optimization is not worth spending time on.

This comment has been minimized.

@aphearin

aphearin Jun 7, 2018

Contributor

In the timing test, just make sure to use at least ~1e6 points

This comment has been minimized.

@Christopher-Bradshaw

Christopher-Bradshaw Jun 7, 2018

Author Contributor

In the new code

Not returning indexes 143.03s
Returning indexes 171.74s

In master

Not returning indexes 150.37s
... TypeError: conditional_abunmatch() got an unexpected keyword argument 'return_indexes'                  

So returning indexes is ~20% slower, probably due to that extra linear search over the window. I used 1e7 points in both x and y and a 10001 sized window.

But no change in the returning values case.

Christopher-Bradshaw added some commits Jun 7, 2018

@aphearin aphearin merged commit 0b9d31c into astropy:master Jun 8, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@Christopher-Bradshaw Christopher-Bradshaw deleted the Christopher-Bradshaw:idx-cam-kernel branch Jun 12, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment