Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

userbrowse core dump #25

Closed
ghost opened this issue Sep 1, 2016 · 13 comments
Closed

userbrowse core dump #25

ghost opened this issue Sep 1, 2016 · 13 comments
Labels

Comments

@ghost
Copy link

ghost commented Sep 1, 2016

Step to reproduce:

  1. Browse your own share or anyone else for that matter.
  2. Start scrolling throught the share by pressing the down arrow.
  3. Wait for the core dump :

**
ERROR:gailtreeview.c:2314:idle_expand_row: code should not be reached

[1] + 10743 abort (core dumped) python nicotine.py

Bisected to 12424d4 if i'm not wrong.

Can anyone confirm this ?

@ghost ghost added the question label Sep 1, 2016
@Mutnick
Copy link
Member

Mutnick commented Sep 5, 2016

I can not confirm this.

@ghost
Copy link
Author

ghost commented Sep 5, 2016

@Mutnick : I'm unable to trigger the bug on windows, but both on ubuntu and fedora it triggers.
Could you test on your debian box if you haven"t already ?

@Mutnick
Copy link
Member

Mutnick commented Sep 6, 2016

Yes tested on Debian also. What is the minimal way you can replicate this? Does a share of one file give the error?

@ghost
Copy link
Author

ghost commented Sep 6, 2016

With one file I can't reproduce it since the down arrow will have nothing to go through.
You should have at least a few subdir in the share.

One way I'm able to easily reproduce it on ubuntu 16.04 / fedora 24 is :

  1. Create a virtual share containing the n+ source tree.
  2. Open the first level "test" to see the source tree.

subdir_tree

  1. Start going through the share tree by keeping the down arrow pressed until it crash.

crash

It happens for me after less than 3-4s, it does even goes to the "files" subdir it crashes before.
It does not matter whar I share: with my regular music share (something like 600 dirs) it also crashes quite quickly.

@Mutnick
Copy link
Member

Mutnick commented Sep 7, 2016

I retested on Debian-based, sharing source tree (master branch from aug 29), works fine here. Smooth sailing down into Stubs subdirectory.

python 2.7.9, GTK+ 2.24.25, PyGTK+ 2.24.0

@ghost
Copy link
Author

ghost commented Sep 8, 2016

Thx for the tests. I've just spawned a debian VM and I agree it works under debian stable up-to-date.
As soon as I've upgraded debian to testing with GTK 2.24.30 n+ crashes.
So it might be a bug not triggering in GTK+ 2.24.25 and below (debian + windows) and triggering in GTK 2.24.30 (fedora 24, ubuntu 16.04, debian testing).

IMHO it's not a bug in GTK but more likely a bug in the uglytree.py custom tree model.

From what i gather uglytree.py was introduce to fix http://nicotine-plus.org/ticket/480 because loading a big directory structure was slow.
The first comment point to a function BrowseGetDirs which was causing the slowdown and freeze of the UI for a long time.
Anyway they went ahead and, instead of fixing the BrowseGetDirs function, introduced a custom tree view model.

So by curiosity I've checked out the version of userbrowse.py before the uglytree.py introduction and as expected it was slow but at least not crashing :).
I've been working on it for the past few days in a branch (https://github.com/gfarmerfr/nicotine-plus/tree/rework-userbrowse) and I've optimize the BrowseGetDirs function.
Before doing anything it was taking around 45s to generate a treeview with 30k directories and freezing the UI for this long.

After a rewrite and doing some optimization like:

  • Attaching the model to the treeview AFTER filling it up.
  • Sorting the model AFTER filling it up (so GTK do not sort everything each time you attach a new directory to the model...)
  • etc...

I managed to take it down to 5s.

This is still a little bit slower than the uglytree.py model but I think I might be able to push the load in a thread to avoid blocking the UI for 5s (need some code refactoring).
If I manage to do that it will be even better than the current model.

After that i've adapted some functions in userbrowse.py (search and download related) because the 'shares' data structure had changed over the years.
I've also fix the "send to player" and "open in browser" functions to handle virtual shares.
Btw these two functions were and still are broken in the current userbrowse.py since the introduction of virtual shares by Quinox.
The last few functions I need to adapt/test are the upload ones.

One bonus in this version: changing the encoding via the dropdown list change the fileview encoding AND the treeview encoding (when uglytree was introduced they seems to have forgotten that).
There's still some open bugs referring to that in the old bug-tracker.

Anyway when I'm done with this branch I might ask for your help for testing it out :)
If we can fix in one shot most of the userbrowse.py bugs it would be nice.

@Mutnick
Copy link
Member

Mutnick commented Sep 8, 2016

Best to not replace hand waving with yet more hand waving. What specifically in uglytree.py is causing coredump?

@ghost
Copy link
Author

ghost commented Sep 10, 2016

I'm not trying to hand wave anything I assure you.
If I could avoid checking out a 7 years old version of the userbrowse.py script and modify it I would do it gladly :)

I've tried to debug the uglytree.py module more than once now and I think I'm on something.
I've managed to find a simple share directory structure to keep it crashing consistently under GTK 2.24.30.

case1

You have to first expand everything down in 'test1' subdir and then try to expand 'test2' surdir.
As soon as you click on the 'test2' little expand button it crashes.
You don't even have to play keeping the down arrow pressed and wait to stumble around a structure similar to this one.
So if the previous open subdir of the tree (test111) is more than one level down relative to the wanted one (test2): crash.
Also you must have at least two subdir in 'test2' or this won't happen.

ERROR:gailtreeview.c:2314:idle_expand_row: code should not be reached.

Too play around and see want would happen I added yet another subdir test1111 in the first branch.

case2

Now it crashes too, but I even have a traceback with an out of range index;

Traceback (most recent call last):
  File "nicotine-plus/pynicotine/gtkgui/uglytree.py", line 228, in on_iter_next
    return self.GetNext(node)
  File "nicotine-plus/pynicotine/gtkgui/uglytree.py", line 169, in GetNext
    nchildren, (level, first_node) = self.GetChildren(self.GetParent(node))
  File "nicotine-plus/pynicotine/gtkgui/uglytree.py", line 149, in GetParent
    return (level-1, self.tree1[level][number][1])
IndexError: list index out of range
**
ERROR:gailtreeview.c:2314:idle_expand_row: code should not be reached

On Debian & Windows: no traceback at all and still no crash...
If I then remove one subdir from 'test2' I still have the traceback with GTK 2.24.30 but no crash.

So I decided to compare the data structures (self.tree1 and self.tree2) representing the same directory tree in Debian and Fedora.
They are the same:
debian8.txt
fedora24.txt

For the root cause we can exclude PyGTK (2.24.0) which is the same on every OS.
We can exclude python too: Windows has 2.7.12 and doesn't crash but both Fedora 24 and Ubuntu 16.04 with the same version crashes.
I think GTK+ is the culprit: Windows: 2.24.8 -> OK, Debian: 2.24.25 -> OK, Fedora 24 + Ubuntu 16.04: 2.24.30 -> crash.

So either GTK 2.24.30 is borked or uglytree.py was throwing crap at GTK this entire time but GTK was happily dismissing it.
Either way the traceback is pretty clear: finding the next iter in the treeview in some corner cases crash n+.
It might happen anytime when someone browse a share with has a similar directory structure somewhere down the tree.

I will look at it futher and test if I can fix it in uglytree.py.

@ghost ghost added bug and removed question labels Sep 10, 2016
@Mutnick
Copy link
Member

Mutnick commented Sep 10, 2016

OK great, just found it curious as to my knowledge this wasn't an issue for many years so good to know the actual cause and it looks like you are looking into that. I checked the userbrowse rework branch you have so far and didn't notice any issues. Not sad that uglytree dies if browse speed is maintained, and it looks fast to me.

@ghost
Copy link
Author

ghost commented Sep 10, 2016

Yes it's strange. The GTK2 changelog from 2.24.25 to 2.24.30 seems not that big but it's causing problems only for this specific part. Other parts of n+ using a standard model (searches, userlist, etc..) aren't affected from what I can see.

What would be nice is testing every point release between 2.24.25 and .30 to find out when things begins to go south, but since they're not packaged under fedora/ubuntu it would imply building GTK from source and testing manually which would take time. I'm a little bit short of it since the end of holidays :)

Things I still need todo on the branch based on the standard model used before uglytree:

  • Finish porting upload functions to the new 'shares' datastructure: easy but need testing.
  • Test userbrowse for browsing weird/old shares (ex: pre-virtual shares on both n+ and slskqt which begin with a slash on unix or a drive letter on windows): should be easy.
  • Test userbrowse download/upload with various remote shares with non ascii/utf-8 encoding.
  • Futher speed improvements when changes necessary are not overly complex and the codebase is still easy to understand.

I will try to finish most of it before the end of the next week-end.
I will also continue to investigate which exact part/codepath of uglytree is causing problems: we never know that might be useful someday.

@ghost
Copy link
Author

ghost commented Oct 3, 2016

I've just finished up the conversion of the upload part.
September was a totally crazy month for me between work and a s# load of personal stuff to take care of.
Never had any time to work on n+ :(.
But now I'm back since things have settle down :)

I will do most the the testing part Wednesday evening.

@ghost
Copy link
Author

ghost commented Oct 9, 2016

So far I've tested these combinations without any visible regressions for browse/download/upload.
Local => Remote

This branch N+ Linux => This branch N+ Linux with virtual shares
This branch N+ Linux => This branch N+ Windows with virtual shares
This branch N+ Windows => This branch N+ Linux with virtual shares
This branch N+ Windows => This branch N+ Windows with virtual shares

This branch N+ Linux => N+ 1.2.16 Linux old share format
This branch N+ Linux => N+ 1.2.16 Windows old share format
This branch N+ Windows => N+ 1.2.16 Linux old share format
This branch N+ Windows => N+ 1.2.16 Windows old share format

Last few tests I need to do before merging to this branch on master:

This branch N+ Linux => Official SLSK client Linux
This branch N+ Linux => Official SLSK client Windows
This branch N+ Windows => Official SLSK client Linux
This branch N+ Windows => Official SLSK client Windows

@ghost
Copy link
Author

ghost commented Oct 10, 2016

@Mutnick : I've just finished the last few tests and everything seems good to me.
I will go ahead a merge this branch Wednesday since I'm out tomorrow.

@ghost ghost closed this as completed in d20577f Oct 12, 2016
kiplingw pushed a commit that referenced this issue May 3, 2020
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

1 participant