New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

in1d speed up (Trac #1603) #2199

Closed
numpy-gitbot opened this Issue Oct 19, 2012 · 9 comments

Comments

Projects
None yet
1 participant
@numpy-gitbot

numpy-gitbot commented Oct 19, 2012

Original ticket http://projects.scipy.org/numpy/ticket/1603 on 2010-09-02 by trac user nhmc, assigned to unknown.

Robert Kern mentioned that in1d(ar1, ar2) can be slow for the case when len(ar2) is very small:

http://article.gmane.org/gmane.comp.python.numeric.general/40077

I've attached a script that compares timings for the existing version of in1d and the kern_in function described in the thread above; I use this to work out which algorithm is fastest for given lengths of ar1 and ar2 (see also the attached plot).

Also attached is a patch that changes in1d to use the kern_in algorithm when it results in a speed up. I think the speedup, which can be > 10x for very large ar1 and very small ar2, is worth the minor increase in code complexity.

@numpy-gitbot

This comment has been minimized.

Show comment
Hide comment
@numpy-gitbot

numpy-gitbot Oct 19, 2012

Attachment added by trac user nhmc on 2010-09-02: timings.png

numpy-gitbot commented Oct 19, 2012

Attachment added by trac user nhmc on 2010-09-02: timings.png

@numpy-gitbot

This comment has been minimized.

Show comment
Hide comment
@numpy-gitbot

numpy-gitbot Oct 19, 2012

Attachment added by trac user nhmc on 2010-09-02: setmember.py

numpy-gitbot commented Oct 19, 2012

Attachment added by trac user nhmc on 2010-09-02: setmember.py

@numpy-gitbot

This comment has been minimized.

Show comment
Hide comment
@numpy-gitbot

numpy-gitbot Oct 19, 2012

Attachment added by trac user nhmc on 2010-09-02: patch.diff

numpy-gitbot commented Oct 19, 2012

Attachment added by trac user nhmc on 2010-09-02: patch.diff

@numpy-gitbot

This comment has been minimized.

Show comment
Hide comment
@numpy-gitbot

numpy-gitbot Oct 19, 2012

@rc wrote on 2010-09-03

Hi Neil,

the patch looks good to me (very nice timing plot!), thanks for writing it.

However I cannot try to run it or apply it right now as I am just installing myself on a stay abroad. I may be able to do it next week, but somebody else could commit it IMHO faster :).

numpy-gitbot commented Oct 19, 2012

@rc wrote on 2010-09-03

Hi Neil,

the patch looks good to me (very nice timing plot!), thanks for writing it.

However I cannot try to run it or apply it right now as I am just installing myself on a stay abroad. I may be able to do it next week, but somebody else could commit it IMHO faster :).

@numpy-gitbot

This comment has been minimized.

Show comment
Hide comment
@numpy-gitbot

numpy-gitbot Oct 19, 2012

@rc wrote on 2010-09-21

ok, so I have tested it now, all tests pass. I just wonder, whether the test_in1d() tests in numpy/lib/tests/test_arraysetops.py test both branches, i.e. the old branch and the new one for the case when len(ar2) is very small?

numpy-gitbot commented Oct 19, 2012

@rc wrote on 2010-09-21

ok, so I have tested it now, all tests pass. I just wonder, whether the test_in1d() tests in numpy/lib/tests/test_arraysetops.py test both branches, i.e. the old branch and the new one for the case when len(ar2) is very small?

@numpy-gitbot

This comment has been minimized.

Show comment
Hide comment
@numpy-gitbot

numpy-gitbot Oct 19, 2012

trac user nhmc wrote on 2010-09-22

Yes the tests should be updated. The masked array version of in1d could also be changed. I'll try to take a look at these over the weekend.

numpy-gitbot commented Oct 19, 2012

trac user nhmc wrote on 2010-09-22

Yes the tests should be updated. The masked array version of in1d could also be changed. I'll try to take a look at these over the weekend.

@numpy-gitbot

This comment has been minimized.

Show comment
Hide comment
@numpy-gitbot

numpy-gitbot Oct 19, 2012

Attachment added by trac user nhmc on 2010-11-06: patch2.diff

numpy-gitbot commented Oct 19, 2012

Attachment added by trac user nhmc on 2010-11-06: patch2.diff

@numpy-gitbot

This comment has been minimized.

Show comment
Hide comment
@numpy-gitbot

numpy-gitbot Oct 19, 2012

trac user nhmc wrote on 2010-11-06

Ok, I've attached a 2nd patch, patch2.diff, that adds new tests which check both paths through the new in1d(). All tests are ok for me on Python 3.1 and 2.6 after applying the patch.

I've left the masked array versions of in1d unchanged. Someone more familiar with the masked array code could tweak those if they'd like to spend time on it.

numpy-gitbot commented Oct 19, 2012

trac user nhmc wrote on 2010-11-06

Ok, I've attached a 2nd patch, patch2.diff, that adds new tests which check both paths through the new in1d(). All tests are ok for me on Python 3.1 and 2.6 after applying the patch.

I've left the masked array versions of in1d unchanged. Someone more familiar with the masked array code could tweak those if they'd like to spend time on it.

@numpy-gitbot

This comment has been minimized.

Show comment
Hide comment
@numpy-gitbot

numpy-gitbot Oct 19, 2012

@rgommers wrote on 2011-05-29

Applied in 6441c2a, thanks Neil.

numpy-gitbot commented Oct 19, 2012

@rgommers wrote on 2011-05-29

Applied in 6441c2a, thanks Neil.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment