kendalltau in Cython (Trac #893) #1420

Open
scipy-gitbot opened this Issue Apr 25, 2013 · 10 comments

@scipy-gitbot

Original ticket http://projects.scipy.org/scipy/ticket/893 on 2009-03-19 by @sturlamolden, assigned to unknown.

Kendall's tau can be very slow to compute with loops and/or very memory expensive to compute with vectorization.

Kendall's tau exists in several versions (tau-a, tau-b, tau-c, ...). SciPy currently has tau-b. This can be a source of confusion (e.g. MINITAB computes tau-b).

SciPy lacks Kendall's tau for contingency table data.

This is a reimplementation in Cython that fixes these issues.

@scipy-gitbot

Attachment added by @sturlamolden on 2009-03-19: tau.pyx

@scipy-gitbot

Attachment added by @sturlamolden on 2009-03-19: tau.c

@scipy-gitbot

@sturlamolden wrote on 2009-03-19

Kendall's tau can be very slow to compute with loops and/or very memory expensive to compute with vectorization.

Kendall's tau exists in several versions (tau-a, tau-b, tau-c, ...). SciPy currently has tau-b. This can be a source of confusion (e.g. MINITAB computes tau-a).

SciPy lacks Kendall's tau for contingency table data.

This is a reimplementation in Cython that fixes these issues.
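
To make the difference between the variants concrete, here is a minimal pure-Python sketch of the loop-based approach. It is not the attached tau.pyx; the function name `kendall_tau_naive` is made up for illustration, and the tau-c formula used is the usual Stuart definition, which is an assumption about what the ticket means by tau-c. It visits every pair once, so it is O(n^2) in time but needs only constant extra memory.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_naive(x, y):
    """Return (tau_a, tau_b, tau_c) using an O(n^2) loop over all pairs.

    Hypothetical helper for illustration only. Raises ZeroDivisionError
    if there are fewer than two points or if one variable is constant.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    concordant = discordant = ties_x = ties_y = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        dx = xi - xj
        dy = yi - yj
        if dx == 0 and dy == 0:
            continue  # tied in both variables: counted nowhere
        elif dx == 0:
            ties_x += 1  # tied in x only
        elif dy == 0:
            ties_y += 1  # tied in y only
        elif (dx > 0) == (dy > 0):
            concordant += 1
        else:
            discordant += 1

    s = concordant - discordant
    n_pairs = n * (n - 1) / 2

    # tau-a: no correction for ties.
    tau_a = s / n_pairs
    # tau-b: denominator shrinks with the number of tied pairs.
    tau_b = s / sqrt((concordant + discordant + ties_x)
                     * (concordant + discordant + ties_y))
    # tau-c (Stuart): m is the smaller number of distinct values.
    m = min(len(set(x)), len(set(y)))
    tau_c = 2 * s / (n * n * (m - 1) / m)
    return tau_a, tau_b, tau_c
```

Without ties, tau-a and tau-b coincide; with ties they differ, which is where comparing results across packages can go wrong. For example, `kendall_tau_naive([1, 2, 2, 3], [1, 2, 3, 3])` gives three different values for the three variants.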

@scipy-gitbot

trac user peridot wrote on 2009-10-07

Needs tests to be accepted.

@scipy-gitbot

trac user brentp wrote on 2010-07-02

also see: http://projects.scipy.org/scipy/ticket/999

@scipy-gitbot

@rgommers wrote on 2010-11-22

On gh-1526 there is some discussion, and so far it looks like the pure Python version will go in. Regarding speed, this Cython version seems to be O(n^2) while the Python one in gh-1526 is O(n log(n)). Of course that version could be Cythonized for even more speed, but no one has taken the time to do so.

Please review the patch and/or join the discussion there if you have time. After that discussion is done, I think this ticket can be closed as well.
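
As a rough illustration of why O(n log(n)) is possible, here is a sketch that counts discordant pairs as inversions during a merge sort. This is not the code from gh-1526: it assumes there are no ties in either variable (in which case tau-a and tau-b are equal), and the names `count_inversions` and `kendall_tau_no_ties` are hypothetical. Handling ties takes extra bookkeeping that is left out here.

```python
def count_inversions(seq):
    """Return (sorted copy of seq, number of inversions) via merge sort."""
    n = len(seq)
    if n < 2:
        return list(seq), 0
    mid = n // 2
    left, inv_left = count_inversions(seq[:mid])
    right, inv_right = count_inversions(seq[mid:])
    merged = []
    inversions = inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining element of left,
            # so it forms an inversion with each of them.
            merged.append(right[j])
            j += 1
            inversions += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions


def kendall_tau_no_ties(x, y):
    """Kendall's tau for data without ties, in O(n log n) time."""
    n = len(x)
    if len(y) != n or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    if len(set(x)) != n or len(set(y)) != n:
        raise ValueError("this sketch assumes no ties in x or y")
    # Sort the points by x, then count inversions in the y sequence:
    # each inversion is exactly one discordant pair.
    order = sorted(range(n), key=lambda i: x[i])
    y_by_x = [y[i] for i in order]
    _, discordant = count_inversions(y_by_x)
    n_pairs = n * (n - 1) // 2
    concordant = n_pairs - discordant
    return (concordant - discordant) / n_pairs
```

The sort and the merge each cost O(n log(n)), so the quadratic loop over pairs is avoided entirely.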

@scipy-gitbot

@rgommers wrote on 2010-11-28

The faster version of gh-1526 was committed in fdaee9f. It does not have tau-c, so leaving this ticket open.

@scipy-gitbot

Milestone changed to Unscheduled by @rgommers on 2011-06-12

@scipy-gitbot

@josef-pkt wrote on 2012-05-07

This also has a separate implementation of kendalltau for contingency tables, kendalltau_fromct, which might in that case be faster than the current (Enzo) implementation.
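
For context, a rough sketch of how tau-b can be computed directly from a contingency table of counts, without expanding it back into paired observations. This is not the attached kendalltau_fromct; the function name `tau_b_from_table` is hypothetical, and the table is assumed to be a list of rows with the levels of both variables in increasing order.

```python
from math import sqrt


def tau_b_from_table(table):
    """Kendall's tau-b from a contingency table (list of rows of counts).

    Hypothetical helper for illustration. Rows are ordered levels of x,
    columns are ordered levels of y. Raises ZeroDivisionError if all
    observations fall in a single row or a single column.
    """
    n_rows = len(table)
    n_cols = len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            count = table[i][j]
            if count == 0:
                continue
            for k in range(i + 1, n_rows):
                # Cells below and to the right are concordant with (i, j).
                concordant += count * sum(table[k][j + 1:])
                # Cells below and to the left are discordant with (i, j).
                discordant += count * sum(table[k][:j])

    n = sum(sum(row) for row in table)
    n0 = n * (n - 1) // 2  # total number of pairs
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n1 = sum(t * (t - 1) // 2 for t in row_totals)  # pairs tied in x
    n2 = sum(t * (t - 1) // 2 for t in col_totals)  # pairs tied in y
    return (concordant - discordant) / sqrt((n0 - n1) * (n0 - n2))
```

The cost depends on the table dimensions rather than on the number of observations, which is why a table-based routine can be faster when the data have only a few distinct levels.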
