scipy.cluster.hierarchy.ClusterNode.pre_order returns IndexError for non-root node (Trac #1652) #2177

Closed
scipy-gitbot opened this Issue Apr 25, 2013 · 1 comment

Projects

None yet

3 participants

@scipy-gitbot

Original ticket http://projects.scipy.org/scipy/ticket/1652 on 2012-04-30 by trac user MarkPundurs, assigned to unknown.

To reproduce the error, run the following script:

import random, numpy
from scipy.cluster.hierarchy import linkage, to_tree

datalist = []
for i in range(8000):
    datalist.append(random.random())
datalist = numpy.array(datalist)
datalist = numpy.reshape(datalist, (datalist.shape[0], 1))
Z = linkage(datalist)
root_node_ref = to_tree(Z)
left_root_node_ref = root_node_ref.left
left_root_node_ref.pre_order()

The result is:

Traceback (most recent call last):
  File "C:\ReproduceError-pre_order.py", line 12, in <module>
    left_root_node_ref.pre_order()
  File "C:\Python27\lib\site-packages\scipy\cluster\hierarchy.py", line 732, in pre_order
    if not lvisited[ndid]:
IndexError: index out of bounds

One possible solution (successfully tested with preceding script) is to change pre_order in hierarchy.py as follows:

        n = self.count

        curNode = [None] * (2 * n)
    #following two lines changed: dictionaries instead of lists
        lvisited = {}
        rvisited = {}
        curNode[0] = self
        k = 0
        preorder = []
        while k >= 0:
            nd = curNode[k]
            ndid = nd.id
            if nd.is_leaf():
                preorder.append(func(nd))
                k = k - 1
            else:
        #following line changed: check existence of dictionary key rather than value of list item
                if ndid not in lvisited.keys():
                    curNode[k + 1] = nd.left
                    lvisited[ndid] = True
                    k = k + 1
        #following line changed: check existence of dictionary key rather than value of list item
                elif ndid not in rvisited.keys():
                    curNode[k + 1] = nd.right
                    rvisited[ndid] = True
                    k = k + 1
                else:
                    k = k - 1

        return preorder
@jamestwebber jamestwebber added a commit to jamestwebber/scipy that referenced this issue Feb 26, 2014
@jamestwebber jamestwebber Modified ClusterNode.pre_order() for non-root
Fixes #2177 by replacing the boolean vectors with sets.
7dfd1bc
@jamestwebber
Contributor

I ran into this problem recently, so I submitted a pull request (w/ test) that fixes it. I used sets instead of dictionaries of bools, but otherwise it's just like the solution Mark proposed.

@rgommers rgommers added a commit that closed this issue Mar 9, 2014
@jamestwebber @rgommers jamestwebber + rgommers BUG: modified ClusterNode.pre_order() for non-root nodes.
Fixes gh-2177 by replacing the boolean vectors with sets.
3ece75e
@rgommers rgommers closed this in 3ece75e Mar 9, 2014
@rgommers rgommers modified the milestone: 0.15.0, 0.14.0 Mar 9, 2014
@rgommers rgommers added a commit that referenced this issue Mar 15, 2014
@jamestwebber @rgommers jamestwebber + rgommers BUG: modified ClusterNode.pre_order() for non-root nodes.
Fixes gh-2177 by replacing the boolean vectors with sets.
ba150de
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment