Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix occasional crashes when (re)moving items #571

Merged
merged 1 commit into from May 19, 2019

Conversation

worstje
Copy link
Contributor

@worstje worstje commented May 2, 2019

Describing all the rabbitholes that I and kakaroto have gone through
while debugging this one until dawn can frankly not do enough justice to
the crazy amount of rubberducking that went on while trying to fix this.

This bug would be triggered whenever you had a document open in the
editor and then moved an ancestor object downwards (visually) in the tree.
Or when you simply deleted the ancestor. Depending on the exact method
that caused the opened item to be removed from the internal model, the
exact nature of the bug would vary, which means this commit fixes a few
different bits of code that lead to what appears to be the same bug.

In order of appearance, the bugs that ruined our sleep were:

  1. The editor widget was trying to handle the removed item at too late a
    stage.

  2. The editor widget tried to fix its view after a move by searching for
    the new item with the same ID, but in the case of moving an object down
    it came across its own old item, ruining the attempt.

  3. The editor widget did not properly account for the hierarchical
    nature of the model.

Upon fixing these the next day, it was revealed that:

  1. The outlineItem.updateWordCount(emit=False) flag is broken. This
    function would call setData() in several spots which would still cause
    emits to bubble through the system despite emit=False, and we simply got
    lucky that it stopped enough of them until now.

This last one was caused by a small mistake in the fixes for the first
three bugs, but it has led to a couple of extra changes to make any
future bug hunts slightly less arduous and frustrating:

a) When calling item.removeChild(c), it now resets the associated parent
and model to mirror item.insertChild(c). This has also led to an extra
check in model.parent() to check for its validity.

b) The outlineItem.updateWordCount(emit=) flag has been removed entirely
and it now emits away with reckless abandon. I have been unable to
reproduce the crashes the code warned about, so I consider this a code
quality fix to prevent mysterious future issues where things sometimes
do not properly update right.

Worthy of note is that the original code clearly showed the intention to
close tabs for items that were removed. Reworking the editor to support
closing a tab is unfortunately way out of scope, so this intention was
left in and the new fix was structured to make it trivial to implement
such a change when the time comes. An existing FIXME regarding unrelated
buggy editor behaviour was left in, too.

Many thanks to Kakaroto for burning the midnight oil with me to get to
the bottom of this. (I learned a lot that night!)

Issues #479 and #559 are fixed by this commit. And maybe some others,
too.

Copy link
Contributor

@kakaroto kakaroto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I went through a quick round of review, there's a few things to fix to make the code/comments cleaner, but nothing major. Thanks for implementing the fix! :)

manuskript/models/abstractItem.py Show resolved Hide resolved
manuskript/models/abstractModel.py Show resolved Hide resolved
manuskript/models/abstractModel.py Outdated Show resolved Hide resolved
manuskript/models/abstractModel.py Outdated Show resolved Hide resolved
manuskript/ui/editors/editorWidget.py Outdated Show resolved Hide resolved
manuskript/ui/editors/editorWidget.py Outdated Show resolved Hide resolved
manuskript/ui/editors/editorWidget.py Outdated Show resolved Hide resolved
manuskript/ui/editors/editorWidget.py Outdated Show resolved Hide resolved
manuskript/ui/editors/editorWidget.py Outdated Show resolved Hide resolved
manuskript/ui/editors/editorWidget.py Outdated Show resolved Hide resolved
@worstje
Copy link
Contributor Author

worstje commented May 11, 2019

@gedakc This PR is all ready for you now with all code review comments from @kakaroto taken care of.

Rebase & squash have been done in the assumption all is good now, but if it isn't, just yell and I'll adjust whatever needs adjusting.

Edit: Apparently we weren't fully done code reviewing. My bad! I'll incorporate the last point of contention (and probably yet another bugfix for an issue that was linked since I understand the fix) within the next 24 hours. Sorry about the false alarm. 😄

@worstje
Copy link
Contributor Author

worstje commented May 12, 2019

All good now (includes a fix for #516 too), unless @kakaroto has even more concerns... 😈

@kakaroto
Copy link
Contributor

Lol, yeah, I was worrying about that check not being there, re-reading while awake was a good idea :p I didn't re-review the latest patches but if you fixed all the stuff we discussed, then it should be all good from my side of the review process.

@gedakc
Copy link
Collaborator

gedakc commented May 15, 2019

Wow, lots of work done on this PR!

I will start reviewing, which is mostly testing the known crash cases to see if these are now fixed.

@worstje worstje mentioned this pull request May 16, 2019
@gedakc
Copy link
Collaborator

gedakc commented May 16, 2019

Unfortunately I have been unable to confirm that this PR improves the situation for either issue #479 or issue #559.

Any help you can provide regarding how to test this PR would be appreciated.

Issue #479 Testing

Using Kubuntu 16.04 and Windows 10 with this PR applied, the crash still occurs in my testing.

See issue 479 - steps to reproduce crash.

Issue #559 Testing

So far I am still unable to reproduce the crash mentioned in this issue.

@kakaroto
Copy link
Contributor

Just confirmed #479 isn't fixed, though the traceback is different this time.

Traceback (most recent call last):
  File "v:\Projects\book\manuskript-git\bin\..\manuskript\models\abstractModel.py", line 134, in parent
    return self.createIndex(parentItem.row(), 0, parentItem)
TypeError: createIndex(self, int, int, object: object = 0): argument 1 has unexpected type 'NoneType'

@gedakc
Copy link
Collaborator

gedakc commented May 16, 2019

Issue #559 Testing - Take 2

Using Kubuntu 16.04 and Windows 10 with this PR applied, the crash no longer occurs. :-)

See issue 559 - steps to reproduce the crash.

@worstje
Copy link
Contributor Author

worstje commented May 17, 2019

@gedakc Feel free to test the testcase that still remained broken. Hopefully for the last time, as it should now be fixed! 😄

(I haven't squashed the commit yet because I figure everyone with issues that I reference in my squashed commit must be quite sick of all the notification emails I keep generating for their issues, and I might as well wait for the all-good signal this time around.)

@gedakc
Copy link
Collaborator

gedakc commented May 18, 2019

Thanks @worstje for the code enhancement to fix issue 479.

On Kubuntu 16.04 I retested both issue 479 and issue 559 and neither crash occurred. :-)

I did also try moving various collections of folders up and down and into and outside of other folders and was able to get one of my CPU's to run 100% with the Manuskript GUI becoming unresponsive. Unfortunately I do not have a series of steps to reliably reproduce this situation.

I eventually used CTRL+C in the terminal window to halt execution. When I did this I received the following console log:

$ bin/manuskript  
Debug: Web rendering engine used: QWebView
Running manuskript version 0.9.0.
Found translation in settings: 
Note: No translator found or loaded from system locale for locale en_CA.
Last accessed directory "/home/gedakc/tmp" loaded.
Project d3.msk saved.
Project d3.msk saved.
Project d3.msk saved.
Project d3.msk saved.
Project d3.msk saved.
Project d3.msk saved.
Project d3.msk saved.
^CTraceback (most recent call last):
  File "/home/gedakc/workspace/manuskript.gedakc/bin/../manuskript/ui/editors/editorWidget.py", line 367, in rowsAboutToBeRemoved
    if ancestorCandidate == None:
KeyboardInterrupt
Project d3.msk saved.
Project d3.msk saved.
$

It is entirely possible that this issue exists in 0.9.0 as well. I will try to come up with a reproducible set of steps.

Either way I think that the code in this PR is an improvement. If you would like to squash the commits, or otherwise edit the commit messages as the first commit does not fix issue 479, then I will merge with the develop branch.

As an aside: On the topic of commits, my personal goal is to ensure that each individual commit compiles, runs, and implements an improvement in the code. This helps with troubleshooting because it makes using git bisect work well to identify regressions. Also if a set of commits adds some code and then later removes the same code, then it is much cleaner to not add the code in the first place. Simple fixes should only require one commit. More complex fixes might require multiple commits that make a logical sequence of changes to the code. These are just general guidelines I try to follow.

@gedakc
Copy link
Collaborator

gedakc commented May 18, 2019

Finally I was able to figure out a series of steps to reproduce the GUI hang.

Test 4 - Multi Chapter Drag and Drop beside and in other Chapters

Open an terminal window and start manuskript.

  1. Create a new Manuskript project using the Novel template
  2. Select Editor pane
  3. Select Chapter 7
  4. Select the Outline pane
  5. Select Chapter 7 through Chapter 9 with shift-click
  6. Drag and drop between Chapter 12 and Chapter 13
  7. Select Chapter 7 through Chapter 9 with shift-click
  8. Drag and drop into Chapter 3
  9. Expand Chapter 3
  10. Select Chapter 7 through Chapter 9 with shift-click
  11. Drag and drop between Chapter 12 and Chapter 13

At this time the Manuskript GUI becomes unresponsive.

If I use top in another terminal window I discover that one python3 process is using 100% of a single CPU.

Use Ctrl+C in the terminal window to regain Manuskript GUI responsiveness.

The console log is as follows:

$ bin/manuskript 
Debug: Web rendering engine used: QWebView
Running manuskript version 0.9.0.
Found translation in settings: 
Note: No translator found or loaded from system locale for locale en_CA.
Last accessed directory "/home/gedakc/tmp" loaded.
Project d3.msk saved.
Project d3.msk saved.
Project d3.msk saved.
^CTraceback (most recent call last):
  File "/home/gedakc/workspace/manuskript.gedakc/bin/../manuskript/ui/editors/editorWidget.py", line 365, in rowsAboutToBeRemoved
    ancestorCandidate = ancestorCandidate.parent()
KeyboardInterrupt
Project d3.msk saved.
Project d3.msk saved.

@worstje
Copy link
Contributor Author

worstje commented May 18, 2019

How do you keep managing to break this? I swear I spent ten minutes just dragging folders and texts around after that last commit... I may not be amused, but I am definitely in awe of your testing efforts! 😭

The loop you got stuck in is one I added and whose result I was paranoid about in the latest commit. Unfortunately that paranoia was based on the assumption it does, in fact, manage to exit.

I think I might know the problem. QModelIndex.parent() does not return null (None), but an invalid model index object. (Which is different from our own item.parent() function, which would just return None...) You must have somehow managed to cause an update to be processed on an item which was kind of removed from the tree already? I'm not sure. Qt breaks my brain.

I'll fix it paranoid-style and combine it with a squash. You'll have it shortly. 😄

@gedakc
Copy link
Collaborator

gedakc commented May 18, 2019

Thank you @worstje for continuing to investigate and fix these issues. In my opinion the code is relatively complex and with the asynchronous nature of GUI's and signals, it can be very difficult to debug.

I appreciate both you and @kakaroto delving deep into this area to fix it for all users. 👍

@worstje
Copy link
Contributor Author

worstje commented May 18, 2019

@gedakc No problem.

Your latest testcase has been reproduced and also fixed with the fix I created before I even saw your testcase. 👍

With that I assume issue 479 is indeed fixed, so I'll leave that mention in the commit message.

(I'm living on the razors edge here by assuming the rabbithole only goes so deep... but I like to live dangerously!)

Describing all the rabbitholes that I and kakaroto have gone through
while debugging this one until dawn can frankly not do enough justice to
the crazy amount of rubberducking that went on while trying to fix this.

This bug would be triggered whenever you had a document open in the
editor and then moved an ancestor object downwards (visually) in the tree.
Or when you simply deleted the ancestor. Depending on the exact method
that caused the opened item to be removed from the internal model, the
exact nature of the bug would vary, which means this commit fixes a few
different bits of code that lead to what appears to be the same bug.

In order of appearance, the bugs that ruined our sleep were:

1) The editor widget was trying to handle the removed item at too late a
stage.

2) The editor widget tried to fix its view after a move by searching for
the new item with the same ID, but in the case of moving an object down
it came across its own old item, ruining the attempt.

3) The editor widget did not properly account for the hierarchical
nature of the model.

Upon fixing these the next day, it was revealed that:

4) The outlineItem.updateWordCount(emit=False) flag is broken. This
function would call setData() in several spots which would still cause
emits to bubble through the system despite emit=False, and we simply got
lucky that it stopped enough of them until now.

This last one was caused by a small mistake in the fixes for the first
three bugs, but it has led to a couple of extra changes to make any
future bug hunts slightly less arduous and frustrating:

a) When calling item.removeChild(c), it now resets the associated parent
and model to mirror item.insertChild(c). This has also led to an extra
check in model.parent() to check for its validity.

b) The outlineItem.updateWordCount(emit=) flag has been removed entirely
and it now emits away with reckless abandon. I have been unable to
reproduce the crashes the code warned about, so I consider this a code
quality fix to prevent mysterious future issues where things sometimes
do not properly update right.

Worthy of note is that the original code clearly showed the intention to
close tabs for items that were removed. Reworking the editor to support
closing a tab is unfortunately way out of scope, so this intention was
left in and the new fix was structured to make it trivial to implement
such a change when the time comes. An existing FIXME regarding unrelated
buggy editor behaviour was left in, too.

Many thanks to Kakaroto for burning the midnight oil with me to get to
the bottom of this. (I learned a lot that night!)

Issues olivierkes#479, olivierkes#516 and olivierkes#559 are fixed by this commit. And maybe some others,
too.
@gedakc
Copy link
Collaborator

gedakc commented May 18, 2019

I've been on both sides of coding and testing so I know how frustrating it can be when the testers keep finding problems. ;-)

I will re-test using the known steps for producing errors. When that passes I will try some random folder/text moves to determine if we've finally arrived at a robust solution.

@worstje
Copy link
Contributor Author

worstje commented May 18, 2019

I usually try to do the same, although there is substantial evidence that I have failed in that regard where it comes to testing these changes.

Here is to hoping this is going to be my final comment in this PR. 😄

@gedakc
Copy link
Collaborator

gedakc commented May 18, 2019

All of my following tests with the most recent PR 571 applied have been successful. :-)

Issue 479 Testing

See issue 479 - steps to reproduce crash.

  • Kubuntu 16.04: SUCCESS. No crash.
  • Windows 10: SUCCESS. No crash.

Issue 516 Testing

See issue 516 - editor/cork options wrong after deleting a text for steps to reproduce problem.

  • Kubuntu 16.04: SUCCESS. Correct bottom of pane options now displayed (Go to parent item, Text, Index Cards, Outline).
  • Windows 10: SUCCESS. Correct bottom of pane options now displayed (Go to parent item, Text, Index Cards, Outline).

Issue 559 Testing

See issue 559 - steps to reproduce crash.

  • Kubuntu 16.04: SUCCESS. No crash.
  • Windows 10: SUCCESS. No crash.

PR 571 GUI Hang

See steps to reproduce GUI hang.

  • Kubuntu 16.04: SUCCESS. No GUI hang.
  • Windows 10: SUCCESS. No GUI hang.

Great work @worstje and @kararotu!

If no concerns are raised then I plan to merge this PR with the develop branch in the next day or so.

After that I will move on to reviewing the backlog of new PR's.

@gedakc gedakc added this to the 0.10.0 milestone May 18, 2019
@gedakc
Copy link
Collaborator

gedakc commented May 19, 2019

I will merge this PR with the develop branch for inclusion with the next release of Manuskript.

@gedakc gedakc merged commit 12390a9 into olivierkes:develop May 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants