Original ticket http://projects.scipy.org/numpy/ticket/1603 on 2010-09-02 by trac user nhmc, assigned to unknown.
Robert Kern mentioned that in1d(ar1, ar2) can be slow when len(ar2) is very small:
I've attached a script that compares timings for the existing version of in1d and the kern_in function described in the thread above. I used it to work out which algorithm is faster for given lengths of ar1 and ar2 (see also the attached plot).
Also attached is a patch that changes in1d to use the kern_in algorithm when it results in a speedup. I think the speedup, which can be more than 10x for very large ar1 and very small ar2, is worth the minor increase in code complexity.
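For readers without the attachments, here is a rough sketch of the idea, not the patch itself. It assumes the kern_in approach is a loop over the elements of ar2 that ORs together elementwise comparisons against ar1. The names in1d_small_ar2 and in1d_sketch and the small_cutoff parameter are hypothetical, and the cutoff value is a placeholder rather than the crossover point measured by the timing script.

```python
import numpy as np


def in1d_small_ar2(ar1, ar2):
    """Membership test that loops over ar2; cheap when ar2 is tiny."""
    ar1 = np.asarray(ar1).ravel()
    ar2 = np.asarray(ar2).ravel()
    mask = np.zeros(len(ar1), dtype=bool)
    for a in ar2:
        # One vectorised comparison over all of ar1 per element of ar2.
        mask |= (ar1 == a)
    return mask


def in1d_sketch(ar1, ar2, small_cutoff=10):
    """Pick the loop for short ar2, else fall back to NumPy's own routine.

    small_cutoff is an assumed placeholder, not a measured threshold.
    """
    ar1 = np.asarray(ar1).ravel()
    ar2 = np.asarray(ar2).ravel()
    if len(ar2) < small_cutoff:
        return in1d_small_ar2(ar1, ar2)
    return np.isin(ar1, ar2)


ar1 = np.array([0, 1, 2, 5, 0])
print(in1d_sketch(ar1, [0, 2]))
```

The loop costs one pass over ar1 per element of ar2, so it only pays off while ar2 stays short; for longer ar2 a sort-based approach scales better.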
@rc wrote on 2010-09-03
the patch looks good to me (very nice timing plot!), thanks for writing it.
However, I cannot run or apply it right now, as I am just settling in for a stay abroad. I may be able to do it next week, but somebody else could probably commit it sooner :).
trac user nhmc wrote on 2010-11-06
OK, I've attached a second patch, patch2.diff, that adds new tests which check both paths through the new in1d(). All tests pass for me on Python 3.1 and 2.6 after applying the patch.
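As an illustration of what checking both paths can look like (this is not the content of patch2.diff), the sketch below runs the same membership check once with a very short ar2 and once with a long one, comparing against a plain Python reference. The helper name check_membership and the array sizes are assumptions; the sizes are only guessed to fall on either side of whatever crossover the patch uses. np.isin is called because np.in1d is deprecated in recent NumPy releases.

```python
import numpy as np


def check_membership(ar1, ar2):
    """Compare NumPy's result with a slow, obviously correct reference."""
    lookup = set(ar2.tolist())
    expected = np.array([x in lookup for x in ar1.tolist()], dtype=bool)
    result = np.isin(ar1, ar2)
    np.testing.assert_array_equal(result, expected)
    return result


rng = np.random.default_rng(0)
ar1 = rng.integers(0, 100, size=1000)

# Very short ar2: intended to hit the loop-based branch (assumed sizes).
check_membership(ar1, np.array([3, 7]))

# Long ar2: intended to hit the sort-based branch (assumed sizes).
check_membership(ar1, rng.integers(0, 100, size=500))
```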
I've left the masked array versions of in1d unchanged. Someone more familiar with the masked array code could tweak those if they'd like to spend time on it.