Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsynchronized tree: __repr__ recursion fix for #119, #121 #122

Merged
merged 7 commits into from Dec 21, 2016

Conversation

rojaster
Copy link
Contributor

@rojaster rojaster commented Oct 3, 2016

Possible fix #119 and fix #121 .
(Copied from my email version to @Psycojoker )
After debugging, I found some solutions or hacks:

Let me explain on excerpts from the code.

In master branch problem does not appear, but the bug is not solved. It's not appearing because last changes have double-negative(problem's not in lazy or unlazy logging function):

  • if RedBaron run under debug then in __repr__ method of Node class condition if in_a_shell() is True and dataflow is going down into __str__().

That's true for the python shell and ipython shell too. And that's why a bug is not occurring in the shell.

  • If RedBaron is run without debug mode and not in shells then bug doesn't occur because in log function redbaron.DEBUG parameter is False. And repr function is not called. String representation of the object wouldn't be called.

This right for the master branch. But for 0.6.1 it's not true.

Look carefully at this code in 0.6.1 and in master branch: the difference is in one comma in call of log function. And that's why in 0.6.1 python tries to get repr of the object before a call of a log function.
The cost of this bug is in one comma you think=) but it is actually still deeper, check.

This problem occurs in 0.6.1 because in https://github.com/PyCQA/redbaron/blob/0.6.1/redbaron/base_nodes.py#L1591
python tries to get a string before it calls a log function. It calls a __repr__ function implicitly. In __repr__ function in_a_shell() returns False for Pycharm or something else that's not a python shell or ipython shell.

Next, it's trying to get a path :
https://github.com/PyCQA/redbaron/blob/0.6.1/redbaron/base_nodes.py#L951,
and after a few intermediate calls stops here: https://github.com/PyCQA/redbaron/blob/master/redbaron/base_nodes.py#L121 , to get an index of the node in the parent, but a tree has not synchronized yet with last changes: https://github.com/PyCQA/redbaron/blob/0.6.1/redbaron/base_nodes.py#L1333.
Look, here we have already done insertion of the code. Node.data includes insertion, but node.node_list does not. Parent points to node_list, but node_list doesn't have a new insertion. Index method raises the exception - ValueError(https://docs.python.org/2/tutorial/datastructures.html), because there is no such item.

Excerpt from https://github.com/PyCQA/redbaron/blob/master/redbaron/base_nodes.py#L121:
python pos = parent.index(node.node_list if isinstance(node, ProxyList) else node) for parent.index(node) raises a ValueError, message of this error calls implicitly __repr__ again to get a representation of the object, I mean, this call is being called again and would try to get path and index again. Because the Tree is not being synchronized before a log call. Yeap, Pycharm(py.test) tells about it =)

    if os.getenv('PYCHARM_HOSTED'):
        if os.getenv('IPYTHONENABLE'):
            print("IPYTHONENABLE %s" % os.getenv('IPYTHONENABLE'))
        print("PYCHARM_HOSTED : {0}".format(os.getenv('PYCHARM_HOSTED')))
    if not os.getenv('PYCHARM_HOSTED'):
        if __IPYTHON__:
            print("IPYTHOSHELL %s" % __IPYTHON__)

I tested this code and that's why it looks how it looks to avoid exceptions and misunderstanding.
But it is a hack. And maybe is not useful.

if isinstance(parent, NodeList):
     try:
            pos = parent.index(node.node_list if isinstance(node, ProxyList) else node)
     except Exception:
            return 0
            return pos

In this part, we catch exception for ValueError in UserList or another unknown exception(UserList.index raises ValueError)
This is the hack too and code is not clean and clear(IMHO). And this ain't cool.

#  first test, test_insert(DEBUG=FALSE) failed, changes have not synchronized yet

def _synchronise(self):
    # TODO(alekum): log method calls __repr__ implicitly to get a self.data representation
    log("Before synchronise, self.data = '%s' + '%s'" % (self.first_blank_lines, self.data))
    #log("Before synchronise, self.data = '%s' + '%s'", self.first_blank_lines, self.data)
    super(LineProxyList, self)._synchronise()
    #log("After synchronise, self.data = '%s' + '%s'" % (self.first_blank_lines, self.data))
    log("After synchronise, self.data = '%s' + '%s'", self.first_blank_lines, self.data)
# second test, test_insert(DEBUG=FALSE) passed, changes have synchronized already

def _synchronise(self):
    # TODO(alekum): log method calls __repr__ implicitly to get a self.data representation
    #log("Before synchronise, self.data = '%s' + '%s'" % (self.first_blank_lines, self.data))
    log("Before synchronise, self.data = '%s' + '%s'", self.first_blank_lines, self.data)
    super(LineProxyList, self)._synchronise()
    log("After synchronise, self.data = '%s' + '%s'" % (self.first_blank_lines, self.data))
    #log("After synchronise, self.data = '%s' + '%s'", self.first_blank_lines, self.data)
# The solution(DEBUG=FALSE), that I've described: actual state of a node is in node_list.

def _synchronise(self):
    # TODO(alekum): log method calls __repr__ implicitly to get a self.data representation
    log("Before synchronise, self.data = '%s' + '%s'" % (self.first_blank_lines, self.node_list))
    #log("Before synchronise, self.data = '%s' + '%s'", self.first_blank_lines, self.data)
    super(LineProxyList, self)._synchronise()
    log("After synchronise, self.data = '%s' + '%s'" % (self.first_blank_lines, self.node_list))
    #log("After synchronise, self.data = '%s' + '%s'", self.first_blank_lines, self.data)

I've tested it with master and 0.6.1, all is good. but I am not sure that it is a right and idiomatic way.
It doesn't matter that in master I've changed this call.
Do you know any cases where this solution could raise an AttributeError because self doesn't have attribute node_list? I couldn't find.

Copy link
Collaborator

@ibizaman ibizaman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all thank you for such a detailled investigation. LGTM overall, just a few changes need to be made before merging.

Can you add a line in the CHANGELOG explaining the fix (one short line is enough).

Do you know any cases where this solution could raise an AttributeError because self doesn't have attribute node_list?

Me neither. We can always add test cases later on when the problem arises.

@rojaster
Copy link
Contributor Author

Hmm, Should I add my short explanation into the block for 0.6.2(unreleased)?

@Psycojoker
Copy link
Member

If you think that this is going to be too long you can simply refer to this PR :)

@rojaster
Copy link
Contributor Author

@Psycojoker I just wonder where I should write it ;)

@ibizaman
Copy link
Collaborator

ibizaman commented Dec 20, 2016

@rojaster it's where you sais, in the unreleased block, append a bullet. Thanks!

EDIT: In fact, can you update your branch and merge in the latest master? The 0.6.2 branch has been released already.

@rojaster
Copy link
Contributor Author

@ibizaman ok,

@rojaster
Copy link
Contributor Author

Hmm, I am confused a little, these changes would be included into 0.6.2 or I should add it into 0.6.3?

@Psycojoker
Copy link
Member

0.6.3 here :)

@rojaster
Copy link
Contributor Author

done, check it

@ibizaman ibizaman merged commit e422c99 into PyCQA:master Dec 21, 2016
@ibizaman
Copy link
Collaborator

@rojaster Thanks!

@Psycojoker
Copy link
Member

Fix released https://pypi.python.org/pypi/redbaron/0.6.3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants