scipy.cluster.hierarchy.ClusterNode.pre_order returns IndexError for non-root node (Trac #1652) #2177

Closed
scipy-gitbot opened this issue Apr 25, 2013 · 1 comment
Labels: defect (A clear bug or issue that prevents SciPy from being installed or used as expected), Migrated from Trac, scipy.cluster
@scipy-gitbot

Original ticket http://projects.scipy.org/scipy/ticket/1652 on 2012-04-30 by trac user MarkPundurs, assigned to unknown.

To reproduce the error, run the following script:

import random, numpy
from scipy.cluster.hierarchy import linkage, to_tree

datalist = []
for i in range(8000):
    datalist.append(random.random())
datalist = numpy.array(datalist)
datalist = numpy.reshape(datalist, (datalist.shape[0], 1))
Z = linkage(datalist)
root_node_ref = to_tree(Z)
left_root_node_ref = root_node_ref.left
left_root_node_ref.pre_order()

The result is:

Traceback (most recent call last):
  File "C:\ReproduceError-pre_order.py", line 12, in <module>
    left_root_node_ref.pre_order()
  File "C:\Python27\lib\site-packages\scipy\cluster\hierarchy.py", line 732, in pre_order
    if not lvisited[ndid]:
IndexError: index out of bounds
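
A plausible reading of the traceback, sketched under the assumption that the visited flags are sequences of length 2 * self.count: to_tree numbers the original observations 0..N-1 and the merged clusters N..2N-2, so a node below the root can carry an id larger than twice its own leaf count. The Node class and flags_fit helper below are hypothetical stand-ins for illustration, not SciPy code.

```python
class Node:
    """Hypothetical stand-in for ClusterNode (not SciPy's class)."""

    def __init__(self, id, left=None, right=None):
        self.id = id
        self.left = left
        self.right = right
        # Leaf count of the subtree rooted at this node.
        self.count = 1 if left is None else left.count + right.count


# Four leaves get ids 0..3; the merges get ids 4, 5, 6,
# mirroring the numbering scheme used by to_tree.
leaves = [Node(i) for i in range(4)]
n4 = Node(4, leaves[0], leaves[1])
n5 = Node(5, leaves[2], leaves[3])
root = Node(6, n4, n5)


def flags_fit(node):
    """True if node.id is a valid index into a list of length 2 * node.count."""
    return node.id < 2 * node.count


print(flags_fit(root))  # root: id 6, count 4
print(flags_fit(n5))    # subtree: id 5, count 2
```

For the root the id always fits, which would explain why the error only shows up when pre_order is called on a non-root node.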

One possible solution (successfully tested with preceding script) is to change pre_order in hierarchy.py as follows:

        n = self.count

        curNode = [None] * (2 * n)
        # Changed (next two lines): dictionaries instead of lists
        lvisited = {}
        rvisited = {}
        curNode[0] = self
        k = 0
        preorder = []
        while k >= 0:
            nd = curNode[k]
            ndid = nd.id
            if nd.is_leaf():
                preorder.append(func(nd))
                k = k - 1
            else:
                # Changed: check for the dictionary key rather than the value of a list item
                if ndid not in lvisited:
                    curNode[k + 1] = nd.left
                    lvisited[ndid] = True
                    k = k + 1
                # Changed: check for the dictionary key rather than the value of a list item
                elif ndid not in rvisited:
                    curNode[k + 1] = nd.right
                    rvisited[ndid] = True
                    k = k + 1
                else:
                    k = k - 1

        return preorder
jamestwebber added a commit to jamestwebber/scipy that referenced this issue Feb 26, 2014
Fixes scipy#2177 by replacing the boolean vectors with sets.
@jamestwebber
Contributor

I ran into this problem recently, so I submitted a pull request (w/ test) that fixes it. I used sets instead of dictionaries of bools, but otherwise it's just like the solution Mark proposed.
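
As a rough illustration of the set-based approach (a minimal sketch, not the code from the pull request), the traversal can track visited node ids in sets, so no container has to be sized from the node's leaf count. The Node class and the free function pre_order here are hypothetical stand-ins; SciPy's actual method lives on ClusterNode.

```python
class Node:
    """Hypothetical stand-in for ClusterNode (not SciPy's class)."""

    def __init__(self, id, left=None, right=None):
        self.id = id
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None


def pre_order(node, func=lambda nd: nd.id):
    """Apply func to every leaf under node, visiting left before right."""
    stack = [node]
    lvisited = set()  # ids whose left child has been pushed
    rvisited = set()  # ids whose right child has been pushed
    result = []
    while stack:
        nd = stack[-1]
        if nd.is_leaf():
            result.append(func(nd))
            stack.pop()
        elif nd.id not in lvisited:
            lvisited.add(nd.id)
            stack.append(nd.left)
        elif nd.id not in rvisited:
            rvisited.add(nd.id)
            stack.append(nd.right)
        else:
            # Both children are done; return to the parent.
            stack.pop()
    return result


# Leaves 0..3, merged clusters 4, 5 and 6 (root).
leaves = [Node(i) for i in range(4)]
n4 = Node(4, leaves[0], leaves[1])
n5 = Node(5, leaves[2], leaves[3])
root = Node(6, n4, n5)

print(pre_order(root))  # leaf ids under the root
print(pre_order(n5))    # leaf ids under a non-root node
```

Because membership is tested by id rather than by position, the traversal behaves the same whether it starts at the root or at any subtree.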

@rgommers rgommers modified the milestones: 0.15.0, 0.14.0 Mar 9, 2014
rgommers pushed a commit that referenced this issue Mar 15, 2014
Fixes gh-2177 by replacing the boolean vectors with sets.