Skip to content

Conversation

@cdeil
Copy link
Member

@cdeil cdeil commented Dec 7, 2013

This is a first attempt to implement #19, i.e. to switch the existing wrapper to Cython.

@astrofrog I really don't know Cython and am probably doing a few things wrong. Could you please have a look and comment?

I'll go read some Cython tutorial now and give this another shot tomorrow.

@astrofrog
Copy link
Member

@cdeil - thanks for working on this! I can review tomorrow. In the mean time it seems like this needs rebasing on master though, as it says it can't be merged. Maybe you were working off an old master?

.travis.yml Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make this 0.19.1 since that is the version we have a wheel binary for?

@cdeil
Copy link
Member Author

cdeil commented Dec 8, 2013

Rebased and changed to Cython 0.19.1 in .travis.yml

@astrofrog
Copy link
Member

@cdeil - I just did some performance tests, and the results are very impressive! If I write a pure-C code as follows:

#include <stdio.h>
#include <overlapArea.h>

int main() {

    int i = 0;

    double eps = 1.e-1;
    double ilon[4], ilat[4], olon[4], olat[4];
    int energyMode = 0;
    double refArea = 1.;
    double areaRatio = 1;
    double overlap;

    ilon[0] = 0;
    ilon[1] = eps;
    ilon[2] = eps;
    ilon[3] = 0;

    ilat[0] = 0;
    ilat[1] = 0;
    ilat[2] = eps;
    ilat[3] = eps;

    olon[0] = 0.5 * eps;
    olon[1] = 1.5 * eps;
    olon[2] = 1.5 * eps;
    olon[3] = 0.5 * eps;

    olat[0] = 0;
    olat[1] = 0;
    olat[2] = eps;
    olat[3] = eps;

    for(i=0;i<1e7;i++) {
        overlap = computeOverlap(ilon, ilat, olon, olat, energyMode, refArea, &areaRatio);
    }

}

and call computeOverlap 1e7 times, I get:

real    0m20.378s
user    0m20.355s
sys 0m0.017s

(compiled with clang -O3 which is what Cython uses). If I write a Python code as follows:

import numpy as np
import reproject

N = 1e7
ilon = np.zeros((N,4))
ilat = np.zeros((N,4))
olon = np.zeros((N,4))
olat = np.zeros((N,4))

EPS = 1.e-1

ilon[:,1] = EPS
ilon[:,2] = EPS

ilat[:,2] = EPS
ilat[:,3] = EPS

olon[:,0] = 0.5 * EPS
olon[:,1] = 1.5 * EPS
olon[:,2] = 1.5 * EPS
olon[:,3] = 0.5 * EPS

olat[:,2] = EPS
olat[:,3] = EPS

overlap = reproject.compute_overlap(ilon, ilat, olon, olat)

I get:

real    0m21.342s
user    0m20.808s
sys 0m0.510s

In other words, you are within 5% of the theoretical maximum, which is about as good as I've ever seen Cython. This looks great to me!

Do we want to allow compute_overlap to also be called with 1-D arrays in which case it would return a scalar?

@astrofrog
Copy link
Member

By the way, I'll work on trying to add the performance benchmarks to the package.

@cdeil
Copy link
Member Author

cdeil commented Dec 8, 2013

I think compute_overlap is more of an expert function and end users will call higher-level functions, so I'm not sure the special case of 1D arrays is needed.
If it's easy to do though, why not.

There should probably be some call like np.asarray though, because at the moment compute_overlap doesn't accept lists?

@astrofrog
Copy link
Member

@cdeil - I guess you're right it can be considered an expert function. In fact, to keep the Cython as simple as possible, how about making the Cython function private (_compute_overlap), then having a public compute_overlap function in the Python code that does things like asarray, and checks the dimensionality (and in future, can also parallelize the call to _compute_overlap?

@cdeil
Copy link
Member Author

cdeil commented Dec 8, 2013

I first had this extra layer of a compute_overlap function in overlap.py that called a _compute_overlap function in _overlap.pyx. :-)
But then I thought that this extra layer has no advantages.
We could call np.asarray and multiprocessing directly in _overlap.pyx, no?

I'm not against introducing the extra layer, I just don't see the advantage of introducing it yet.
Maybe the advantage of not having to re-compile while working on compute_overlap is already advantage enough?

Let me know if you think making the change would be an improvement and I'll do it.

@astrofrog
Copy link
Member

Well I guess I don't know enough about Cython to know whether doing all this extra stuff in the Cython layer would cause it to not be as fast. And I don't think we can add the multiprocessing stuff to the Cython file but I may be wrong. I would personally prefer having an extra pure-Python layer.

@cdeil
Copy link
Member Author

cdeil commented Dec 8, 2013

OK, I've added the wrapper as discussed.
At the moment it returns 0 instead of 3.0461741978670866e-08 in the test ... any idea why?

@astrofrog
Copy link
Member

@cdeil - the issue is in the asarray calls which have the wrong variables inside (they are all ilon)

@cdeil
Copy link
Member Author

cdeil commented Dec 8, 2013

Thanks for the tip, fixed now.

Let me know if there's something else that should go in this PR.

@astrofrog
Copy link
Member

@cdeil - looks great, thanks!

astrofrog added a commit that referenced this pull request Dec 8, 2013
Use Cython to wrap C overlap area code
@astrofrog astrofrog merged commit c9d6735 into astropy:master Dec 8, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants