
thread branch testing #107

Closed
IgnorantGuru opened this Issue June 22, 2012 · 42 comments

3 participants

IgnorantGuru arclance Jean-Philippe Fleury
IgnorantGuru
Owner

A temporary branch of spacefm named 'thread' is now available for early testing. This is a redesign and rewrite of spacefm's vfs threading. Now all task threads and the GUI thread should run completely independently and never block the others. This should increase responsiveness overall, and especially when using network filesystems like nfs. The new design also should reduce CPU usage while copying, etc, even on local filesystems.

IMPORTANT: As of now, this branch is highly unstable and unfinished, but see further comments below for progress over the next few days. Currently, please do not use this branch for anything but testing.

IMPORTANT: Currently command execution does not work properly - anytime spacefm has to run a bash command and get the output, etc., it may crash or fail. This part of the code is not yet done or tested. So testing that is not useful, but testing copy/move/link functions will be helpful, especially on nfs, etc.

TO INSTALL: Follow the BUILD NEXT instructions in the README, except change 'next' to 'thread' in those commands.

When you are done testing, be sure to build next or master again for normal use - DO NOT USE the 'thread' branch for anything real.

Please add comments for any problems with the 'thread' branch below. Thanks for testing.

arclance

I don't have any drives nfs mounted that have data that I would expose to potential harm from an untested program.
What could potentially go wrong if I test copying over nfs with this branch?
Could it corrupt the data already on the drive or not?

Is there a way to make a separate install for the thread branch so I don't affect the version I currently have installed?

IgnorantGuru
Owner

exec functions (when spacefm runs bash or spawns commands) should now be working reasonably well - so everything is up for testing now. However there are still some changes yet to be made to this code.

I found some thread blocking in some of the old GUI code. In particular this affected loading a new dir while a task is running. These should no longer block.

IgnorantGuru
Owner

I don't have any drives nfs mounted that have data that I would expose to potential harm from an untested program.
What could potentially go wrong if I test copying over nfs with this branch?

I would suggest creating a 'test' folder on your nfs share for testing purposes. If you stay in that folder you shouldn't have any issues. There are no guarantees, but the vfs functions that handle io weren't changed significantly, except where they update the progress bar and such. So I wouldn't expect any data corruption. I've been using it when it was far more unstable than now, and I had no corruption issues.

Could it corrupt the data already on the drive or not?

Highly unlikely. The main issue here will be occasional crashes and hangs (hopefully few as time goes on). Also some custom functions such as 'Popup Output' and such may be broken (let me know). But the core vfs i/o shouldn't be affected. Obviously if it crashes then the file it was copying will be incomplete.

Is there a way to make a separate install for the thread branch so I don't affect the version I currently have installed?

The way I'm doing it is I have the 'next' branch installed. The 'thread' branch I just build and run, but don't install (skip the 'make install' step and just run src/spacefm after make - be sure all instances are killed first). Because the /usr/share glade ui files, etc haven't changed from the next branch, this is okay in this case. Thanks for testing.

arclance

I just built the thread branch and tested a copy over nfs.
I copied a large file that had been broken up into rar files to test one of the worst situations (several files and a large amount of data).

It is actually worse than before, after the progress bar came up the window became unresponsive for 5 to 10 seconds.
After the window resumed responding it was very similar to the next branch, except the progress bar updated much more smoothly and each time the gui got sluggish it did not last as long.

There was a lot of terminal output, would you like me to post it here or somewhere else?

I did notice an error that also shows up in the other branches but that I had missed before.

(spacefm:6529): GdkPixbuf-CRITICAL **: gdk_pixbuf_scale_simple: assertion `dest_height > 0' failed

IgnorantGuru
Owner

I just built the thread branch and tested a copy over nfs.

Okay thanks for the feedback on that - it's still early, should be some improvement over the next few days. I'm going to try nfs after I finish up a few other things. I'll probably need to see the hangs in action so I can track down where they are - it's a challenge with multiple threads. I think the vfs is hang-free now, but some of the older gui code still blocks in a few places.

There was a lot of terminal output, would you like me to post it here or somewhere else?

Not necessary for lack of responsiveness - won't really tell anything. When you do need to post output, a gist or pastebin is probably best.

GdkPixbuf-CRITICAL **: gdk_pixbuf_scale_simple: assertion `dest_height > 0' failed

Haven't seen that - if you can figure out how to reproduce it that would help. Or you can use BUILD DEBUG instructions and in gdb, instead of 'run', type 'run --g-fatal-warnings' which will make it halt on that warning. Then do 'bt full'.

arclance

Haven't seen that - if you can figure out how to reproduce it that would help. Or you can use BUILD DEBUG instructions and in gdb, instead of 'run', type 'run --g-fatal-warnings' which will make it halt on that warning. Then do 'bt full'.

It happens once every time I start spacefm that I know of, I don't know if it happens more often than that but I will keep an eye out for it.

Here is a pastebin from a debug build of the next branch from 2012-06-15.
http://pastebin.com/jAa4Zewb

Do you want me to test the thread branch as well?

IgnorantGuru
Owner

Do you want me to test the thread branch as well?

No that code is the same in both, and the backtrace shows me exactly where it is, so I should be able to correct that. Thanks for the details.

arclance

Good, I did not think you would need both, and I already had the next branch built for debug, so I used that one.
I looked at the backtrace before, is it really trying to make a 0x16 thumbnail image?

IgnorantGuru
Owner

I looked at the backtrace before, is it really trying to make a 0x16 thumbnail image?

I didn't write that code, just bugfixed it in a few places, so I'll need to look it over to find out what's happening there. That 0 is probably the result of either a bad calculation or some failure in determining the size - could be a bug in gdk_pixbuf for example. But I can at least test for that condition. It's only a warning anyway - since it didn't crash it probably doesn't affect much. You might have a corrupt icon on your system, for example.

arclance

It looks like it's having trouble loading the thumbnail for this image.
http://www.imagebam.com/image/2f0659192877873

Here is the thumbnail for that image that spacefm was trying to load.
http://hcd-1.imgbox.com/aancbk55.png

The thumbnail is only 5px high, I think the problem is that it is trying to show a 16x16 icon for that image.
It seems it does not check if one of the icon dimensions becomes 0 after scaling the icon and then generates a warning because of the invalid height parameter.
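
A minimal sketch of how aspect-preserving integer scaling can produce a zero dimension, and of the kind of guard discussed above. This is illustrative Python, not spacefm's C code: `scaled_size` is a hypothetical helper, and the 200 px width is an assumption (only the 5 px height is stated in the thread).

```python
def scaled_size(width, height, target, clamp=True):
    """Scale (width, height) so the longer side equals `target`, keeping aspect ratio.

    Integer division is used, as a C implementation typically would. With
    clamp=False the shorter side can round down to 0, which is the value that
    gdk_pixbuf_scale_simple rejects with its `dest_height > 0` assertion.
    """
    if width <= 0 or height <= 0 or target <= 0:
        raise ValueError("dimensions must be positive")
    if width >= height:
        new_w, new_h = target, height * target // width
    else:
        new_w, new_h = width * target // height, target
    if clamp:
        # Guard: never hand a zero dimension to the scaler.
        new_w, new_h = max(new_w, 1), max(new_h, 1)
    return new_w, new_h


# Hypothetical very wide thumbnail: 200 px wide, 5 px high, scaled to fit 16 px.
print(scaled_size(200, 5, 16, clamp=False))  # (16, 0)
print(scaled_size(200, 5, 16))               # (16, 1)
```

Clamping to 1 px is only one option; skipping the thumbnail and falling back to a generic icon would avoid the warning as well.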

IgnorantGuru
Owner

For anyone interested in the details on the above commit...

One kind of task spacefm runs is an 'exec task', which basically means it runs another program (bash, a terminal, etc.). This is what runs custom command scripts and some built-ins. Unlike other tasks, this wasn't run in another thread. When I rewrote the vfs threading, I made it a separate thread. But then I discovered that gio always adds io watches to the main loop thread anyway. (The io watch is used to collect stdout and stderr and show them in the run dialog.) Since that's the only busy part, I have reverted to not using a separate thread for exec.

But I did find a way to set the priority on the watch very low - lower than GTK+ or GLib ever uses. Thus collection of the stdout/stderr channels won't preempt any GUI activity in the main loop (won't get sluggish).

This means that spacefm can now handle much more output in its run dialog. If you flood it, you won't get the old 'stdout has been closed - please run this command in a terminal'. To test a flood, create a custom command like this:

while (( 1 )); do echo test $x; (( x++ )); done

If you really push it by running several of those at once you might get a hang - I got one once with several running and changing directories rapidly at the same time. But that is an extreme test and even that hang may clear up with more polishing.

The exec tasks (custom commands) should now be working normally in the thread branch.

IgnorantGuru
Owner

Working with curlftpfs I found that it seemed to do nothing on a large copy. What in fact was happening was it was calculating the size of the files to copy, which with curlftpfs is apparently very slow. It doesn't freeze the GUI because it's in another thread.

The above commit fixes an unnecessary mutex lock so you can at least see a task running and the total size increasing.

I think next I will make it abort the size calculation if it takes more than a few seconds - it doesn't really need to know the total size except for progress display, and it doesn't make sense to wait several minutes for it.
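
A minimal sketch of a size calculation that gives up after a deadline, as proposed above. This is illustrative Python rather than spacefm's C code; `total_size` and the 3 second default are assumptions, not values from the source.

```python
import os
import time


def total_size(root, timeout_s=3.0):
    """Sum file sizes under `root`, giving up once `timeout_s` seconds have passed.

    Returns (bytes_counted, complete). When complete is False the caller can
    start the copy anyway and show progress without a known total.
    """
    deadline = time.monotonic() + timeout_s
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if time.monotonic() >= deadline:
                return total, False  # too slow (e.g. a slow network mount): abort
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total, True
```

The deadline is only checked between files, so a single directory read that blocks for a long time would still delay the result; that is acceptable here because the calculation runs outside the GUI thread.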

IgnorantGuru
Owner

curlftpfs is now working reasonably well. This fs is very slow in general, even just to read a directory. This can block the GUI thread at times because some preliminary things are done in the GUI thread when opening or changing directories (such as simply verifying the directory exists). So some lag there is expected.

Also tested sshfs with good results - this is pretty fast and I didn't observe any lag, even while running a flood process and doing some local copies at the same time.

About to test nfs.

IgnorantGuru
Owner

Some updates on the status of nfs support:

I was able to reproduce the unresponsiveness. The cause is very simple: a write() call on nfs not only blocks its own thread but also the main loop thread (and perhaps all threads), and also causes the entire program to be uninterruptible. Even ctrl-c in gdb won't interrupt it. In fact once it even temporarily locked up the entire desktop, taskbar, and all apps.

write() is supposed to block execution in its own thread (wait until the bytes are written), but I don't understand why it's affecting other threads. I don't know if this is normal for nfs and some kind of non-blocking I/O needs to be used, whether it is due to something peculiar to spacefm, or whether some mount options affect this behavior.

Nothing I tried, including changing the thread priority (which reportedly has no effect in Linux anyway), and using soft and intr mount options, had any effect on the issue.

I don't see any docs that explain or cover this, or any code example indicating that write() needs to be treated differently for nfs.

In the old spacefm threading, after each write() call the task thread would gain high priority (gdk_threads_enter), then update the progress display (GTK is not thread-safe so this is required, but causes other threads to stop). This was inefficient and caused GUI lag. In the new design, mutex (mutual exclusion) locks and precision timers are used so that the display updates periodically from the main loop thread without causing other threads to stop. In the unlikely event of an already locked mutex, the progress is not updated that time, so that the GUI thread is not blocked waiting for the mutex to be unlocked.
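
A minimal sketch of the "skip the update if the mutex is busy" idea described above. It is illustrative Python, not spacefm's C/GLib code; the `TaskProgress` class and its fields are hypothetical names.

```python
import threading


class TaskProgress:
    """Progress shared between a copy thread and a GUI timer callback."""

    def __init__(self):
        self._lock = threading.Lock()
        self._bytes_done = 0
        self.displayed = 0  # what the GUI last showed
        self.skipped = 0    # timer ticks that found the lock busy

    def add_bytes(self, n):
        # Called from the task thread after each write().
        with self._lock:
            self._bytes_done += n

    def on_timer(self):
        # Called from the GUI thread by a periodic timer.
        # Never wait for the lock: if the task thread holds it, skip this tick.
        if not self._lock.acquire(blocking=False):
            self.skipped += 1
            return False
        try:
            self.displayed = self._bytes_done
        finally:
            self._lock.release()
        return True


progress = TaskProgress()
progress.add_bytes(4096)
progress.on_timer()
print(progress.displayed)  # 4096
```

Because the timer callback uses a non-blocking acquire, the worst case for the GUI thread is a progress display that is one tick stale, never a stall.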

But with nfs, the GUI thread is simply stopped completely during the write() call, so obviously everything becomes unresponsive, even though the writes are proceeding and will eventually finish. On a large file this is a considerable delay. The only time the display is updated will be if the 50ms progress timer in the GUI thread happens to fire while the nfs task thread is between writes. Since the nfs task thread spends most of its time doing writes, the display will hardly be updated at all. Same for other GUI events - they have to wait until the GUI thread is given some CPU time between nfs writes.

I also observed the long delay when first mounting nfs. I didn't look into its cause but it looks to be a related issue.

Unless some solution is found for this, my suggestion is don't bother using spacefm with nfs, unless you are willing to wait. You could also create a custom command that copies files using cp, and since spacefm runs this as another process, it should run truly independently.

Or, try using sshfs instead, which seems to work very well with spacefm.

I don't like nfs - I think it's horrible in general - so I'm not inclined to spend huge amounts of time on this issue. But if a solution simpler than designing asynchronous I/O just for nfs exists, I'm willing to take a look.

IgnorantGuru
Owner

arclance, you mentioned the old threading used to cause some lag on local filesystems. Any difference with the new design? I can't reproduce any lag in either.

As mentioned above, the GUI thread does some preliminary access before starting a new thread, such as verifying a directory exists before changing and reading a directory (reading the file list is done in another thread), so if your system doesn't respond to this simple stat I/O quickly, you can notice some lag in the GUI. But I don't think this is unreasonable, and it would be hard to avoid. The only place I've seen it very noticeable is when using curlftpfs.

arclance

Nothing I tried, including changing the thread priority (which reportedly has no effect in Linux anyway), and using soft and intr mount options, had any effect on the issue.

The intr mount option should not have an effect on the problems with spacefm.
It just allows you to use umount -f to unmount the nfs drive if you lose the connection to the server while also using the hard mount option.
hard,intr is better than soft because it is less likely to cause data corruption.

I also observed the long delay when first mounting nfs. I didn't look into its cause but it looks to be a related issue.

I don't see this when I mount nfs in a terminal, I have not tried mounting them with spacefm so I can not say what happens if I do that.
Do you also see that mounting lag if you mount the nfs drives in a terminal?
Also what speed and type (router, switch) of network are you running nfs on?
A slow network could cause some of the initial lag you are seeing.

Or, try using sshfs instead, which seems to work very well with spacefm.

I don't like nfs - I think it's horrible in general - so I'm not inclined to spend huge amounts of time on this issue. But if a solution simpler than designing asynchronous I/O just for nfs exists, I'm willing to take a look.

I have a lot of drives nfs mounted, I don't think that sshfs is good for that but I could be wrong.
I also don't know if sshfs is more or less likely to cause data corruption and data integrity is more important to me than performance in spacefm.
I know that the way I have nfs setup on my system it would be very difficult for data corruption to occur.

arclance, you mentioned the old threading used to cause some lag on local filesystems. Any difference with the new design? I can't reproduce any lag in either.

I have not tested it in the thread branch but it got much better at some point in the stable/next branch.
I think it was around when most things got extracted in a terminal since the extraction progress bar was the leading cause of that before.

I have not tested large local copies in a while so I can't say how much they have improved.
I do see some lag with local copies on my slower dualcore system, but it is under decent load so it may not have anything to do with spacefm.

IgnorantGuru
Owner

The bottleneck on my last test was a 100Mbit switch, wired connections, with the switch on a gigabit high performance router. I think the long delay when first mounting is likely caused by spacefm doing a stat or something simple like that either before nfs is ready, or while it's warming up. I didn't look into it in detail because the other issue makes it so unusable anyway, but I could take a look. There was no delay mounting nfs in a terminal. There was no delay in spacefm either - until after it was mounted, afaict.

Starting spacefm when nfs was already mounted and entering the nfs mount point had no delay. Deleting even large files worked fine. But copying large files, the GUI was unresponsive.

I don't have much experience with sshfs - just know it was very responsive in my spacefm tests.

arclance

The bottleneck on my last test was a 100Mbit switch, with the switch on a gigabit high performance router.

That might cause some lag but I don't know, I routinely run at max speed over my gigabit switch using nfs.

IgnorantGuru
Owner

I don't believe network speed is the critical issue. If write() on nfs merely blocked its own thread it wouldn't interfere with the GUI. The copy could be as slow as it wants.

IgnorantGuru
Owner

Re nfs, I received this from someone I asked about it:

I don't think there should be any issues. I whipped up a short program
and tested this here, and it seems like the happily-running thread is
not affected by blocking write() in a writer thread:

http://pastebin.com/DTghKZ6d

It takes about 10 seconds per 64 MB write on my setup, and the other
thread ticks along at 5x per second, unaffected by the writing.

Which seems to at least confirm that it should work. So maybe it's a glib issue (the above example uses pthread while spacefm uses glib for threading). I'll see what else I can determine when I get some time to look at it.
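
The pastebin program itself is not reproduced in this thread. As a rough Python analogue of the test described (one thread doing blocking writes while another ticks about 5 times per second), a sketch might look like the following; `run_ticker_test` and its parameters are hypothetical, and only the 64 MB block size and the tick rate come from the quoted message. Python releases its interpreter lock around blocking system calls such as write(), so the ticker is free to run while the writer waits.

```python
import os
import threading


def run_ticker_test(path, chunk_bytes=64 * 1024 * 1024, chunks=1, interval_s=0.2):
    """Write `chunks` blocks to `path` in one thread while another thread ticks.

    Returns the number of ticks counted while the writer was busy. If blocking
    write() calls only stall the writer thread, the count should keep rising at
    roughly 1/interval_s per second however slow the filesystem is.
    """
    done = threading.Event()
    ticks = 0

    def ticker():
        nonlocal ticks
        while True:
            ticks += 1
            if done.wait(interval_s):  # returns True once the writer finishes
                break

    def writer():
        block = b"\0" * chunk_bytes
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            for _ in range(chunks):
                view = memoryview(block)
                while view:  # write() may be partial; loop until the block is out
                    view = view[os.write(fd, view):]
            os.fsync(fd)
        finally:
            os.close(fd)
            done.set()

    t = threading.Thread(target=ticker)
    w = threading.Thread(target=writer)
    t.start()
    w.start()
    w.join()
    t.join()
    return ticks
```

Pointing `path` at a file on the nfs mount and comparing the tick count with the elapsed time gives a quick check of whether a second thread is being starved during the writes.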

IgnorantGuru
Owner

A few more notes on testing nfs:

Copying large files from nfs to local works fine. It seems only writing to nfs has the problem.

Whatever is happening, spacefm becomes an uninterruptible process (kill -KILL won't interrupt it), even though the write loop keeps running! Uninterruptible normally means it's held in a kernel level system call. This could be a kernel bug (another developer suggested this to me too). Although unusual, several times again the whole desktop and window manager became frozen as well, so whatever nfs is doing it's not just affecting spacefm.

I tried creating the thread with pthread directly rather than glib - same problem.

What I find odd is that the loop containing the write() call keeps ticking along; it's not just blocked for a long time on one call. Yet the main thread seems to get CPU (and even userspace interrupts) very rarely nonetheless - thread scheduling issue?

arclance

I have never seen a nfs write affect anything other than spacefm.

IgnorantGuru
Owner

A bit of a breakthrough on nfs: I tracked down what is happening in the main thread. For one, the inotify file change monitor is detecting the change in the directory as the file is being copied. This triggers a stat on the file (high priority part of main thread). The stat blocks and holds the whole thread, causing the unresponsiveness in the GUI.

With the file change monitor disabled in the code, things improve greatly. But clicking on menus often calls stat on the currently selected file, which also causes a lag. So it's these stat calls in the nfs directory that are causing the majority of the problems.

This can be confirmed by closing the tab containing the nfs dir while the copy is running. It takes a few seconds to close while the GUI is unresponsive, but once that directory is no longer open in spacefm, the responsiveness returns to normal and the copy proceeds. So that is one workaround for the problem - close the nfs dir after starting the copy (or use Edit|Copy To|Location to never open it).

So that explains what is happening. Correcting this problem doesn't look to be trivial though.

IgnorantGuru referenced this issue from a commit June 25, 2012
IgnorantGuru fix gui unresponsive with nfs #107
inhibit file change monitor on non-block devices
in do_copy after initial create, emit create and flush
in do_copy after close, update file display (file size)
in chmod, update file display (file size)
e31f6ea

IgnorantGuru
Owner

The above commit should clear up most of the nfs issues:

1) spacefm now disables the file change monitor for directories which are not on a block device. This prevents the main thread from calling stat while nfs writes are in progress, which was a big source of the problem. This means you may need to refresh the directory to see updated file sizes or other changes.

2) file creations and deletions are still detected by the monitor. But after first creating a new file on copy, it will tell the monitor the file was created and flush the monitor cache before calling write(). This helps avoid a stat while writes are in progress.

3) After a file has been successfully copied, it will manually force the monitor to update info on the new file (so the size displayed will be correct). During copy the size in the file list will show as zero (due to changes not being detected in the dir).

4) Similarly, a chmod/chown task (setting permissions from the Properties dialog) will manually force the monitor to update info on the changed file(s).
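
A minimal sketch of one way to decide whether a directory is "on a block device", as in point 1 above. This is an assumption for illustration, not how spacefm does it (the thread only says the fix depends on udev builds): it is a Linux-specific heuristic that looks up the directory's device number in sysfs. `is_on_block_device` and `should_monitor` are hypothetical names.

```python
import os


def is_on_block_device(path, sys_block_dir="/sys/dev/block"):
    """Return True if `path` lives on a filesystem backed by a block device.

    Linux-specific heuristic: sysfs lists every block device as
    /sys/dev/block/MAJOR:MINOR. Network filesystems such as nfs and sshfs
    report an st_dev that has no entry there.
    """
    st = os.stat(path)
    entry = "%d:%d" % (os.major(st.st_dev), os.minor(st.st_dev))
    return os.path.exists(os.path.join(sys_block_dir, entry))


def should_monitor(path, sys_block_dir="/sys/dev/block"):
    # Policy from the list above: only watch directories on block devices.
    return is_on_block_device(path, sys_block_dir)
```

Note that some local filesystems (tmpfs, for example) also have no backing block device, so a check like this would switch off monitoring for them too.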

Note that doing anything in the nfs dir tab while a large copy is in progress may freeze the GUI for several seconds. This is because spacefm needs to call stat and sometimes read files to show a menu, etc. For example, if you right-click on a file, it needs to get the mime type of the file to know how to populate the Open menu. I'll take a look at any unnecessary file access that can be eliminated to minimize this, but to some extent it needs to be this way. For best results, don't do much in the nfs dir tab while a large copy is in progress. Other tabs should not be affected. (Other file managers may not have this limitation because they use vfs's that cache file properties and attributes more. SpaceFM doesn't use such vfs's and uses system calls with some internal caching.)

On sshfs, I just noticed that copying to sshfs makes the GUI unresponsive (same in 0.7.8). I'm going to see if anything can be done on that.

IgnorantGuru
Owner

Also, the nfs fixes will work for udev builds only. If you build with --enable-hal, the file change monitor won't currently be disabled on nfs.

Jean-Philippe Fleury

On sshfs, I just noticed that copying to sshfs makes the GUI unresponsive (same in 0.7.8). I'm going to see if anything can be done on that.

I tested with the last thread commit (e31f6ea), and steps described here:

#95 (comment)

still cause GUI lag. Lag occurs only when copying large files. For example, if I copy a lot of little files, there's no lag, but if I copy only one big file, there's a lag.

Jean-Philippe Fleury

Is there a way to make a separate install for the thread branch so I don't affect the version I currently have installed?

Personally, when I test a development version, I install SpaceFM in a custom folder using the --prefix option, and I run SpaceFM with a new config. Example:

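# Fetch and unpack the 'thread' branch into a scratch directory
mkdir -p /tmp/spacefm-build
cd /tmp/spacefm-build
wget -O spacefm.tar.gz https://github.com/IgnorantGuru/spacefm/tarball/thread
tar xzf spacefm.tar.gz
cd IgnorantGuru-spacefm-*
# Install into a private prefix so the system-wide install is untouched
mkdir -p /tmp/spacefm-build/build
./configure --prefix=/tmp/spacefm-build/build
make
make install
# Run with a separate config directory
mkdir -p /tmp/spacefm-build/config
/tmp/spacefm-build/build/bin/spacefm -c /tmp/spacefm-build/config

IgnorantGuru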
Owner

A few changes for sshfs:

On large file writes, it really didn't like the create file, update display, write file, update display sequence (for some undetermined reason). So now all copies to non-block (nfs, sshfs, etc) will first copy the file, then update the display. This means the folder won't show the new file until it's closed, unless you manually refresh (which has delay consequences during a large write).

sshfs is very touchy when writing a large file TO the sshfs share. Even an eaccess call will send it into a 10 second delay. The best thing to do is start the copy then click another tab, leaving the sshfs tab alone until the copy is done. Any activity in the tab, such as right-clicking on a file, may cause an unresponsive GUI for a few seconds at least. It does recover though.

While I don't know how other file managers handle this in detail, spacefm is fairly low level in its system calls compared to file managers that have dependencies like gvfs. Normally this makes it quick and accurate, similar to using a shell, but it does cause some issues when using net filesystems - especially when writing larger files TO them. Spacefm's menus also tend to be context sensitive, so it needs to check mime types, etc. Even clicking on a tab causes it to stat some things to update the status bar totals. All of this activity tends to get blocked while a large network write is in progress, causing the GUI to be temporarily frozen.

IgnorantGuru
Owner

The above commit corrects one of the sources of a long delay when first opening nfs. At least with my nfs server, when I restart the server and then mount and open the dir on another machine, opening a file involves a delay of about 30 seconds. There's nothing spacefm can do about this - it even occurs if I mount nfs in a terminal and then cat a file in the dir.

But what was happening on a new dir load in spacefm is it attempted to open the '.hidden' file. Even if the file didn't exist, this open() call would block for about 30 seconds due to nfs. So now spacefm first calls euidaccess() to make sure the file exists before calling open(). This got rid of the delay opening the dir in some cases.
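
A minimal sketch of the "check access before open" pattern described above, in illustrative Python rather than spacefm's C. `read_hidden_names` is a hypothetical helper, and it assumes the common one-name-per-line format for '.hidden'.

```python
import os


def read_hidden_names(directory):
    """Return the names listed in `directory`/.hidden, or an empty set.

    The access check runs first so that a missing file never reaches open().
    """
    path = os.path.join(directory, ".hidden")
    # Use effective ids (like euidaccess) where the platform supports it.
    use_eids = os.access in os.supports_effective_ids
    if not os.access(path, os.R_OK, effective_ids=use_eids):
        return set()
    try:
        with open(path, encoding="utf-8", errors="replace") as f:
            return {line.strip() for line in f if line.strip()}
    except OSError:
        return set()  # file disappeared or became unreadable after the check
```

The try/except stays in place because the file can still vanish between the check and the open; the access call is there only to avoid the slow open() on a file that is not there.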

However, note that the first thing spacefm does when opening a dir is it gets the mime type of all files, which in some cases involves opening them. So you may still get a delay before files are displayed. But the mime-type work is done in a background thread, so it shouldn't block the GUI (unless you click refresh).

Once the server is up and running, subsequent starts of spacefm, or even unmounting and mounting nfs, don't incur the delay. It only happens when I restart the server, then mount. Since it behaves the same way in a terminal I'm considering this an nfs issue (could just be my server too, but arclance mentioned a delay of about that length).

That's about it for the network filesystems, so if there are any other delays or issues not covered by the above explanations, please let me know. Next I'm going to be testing everything thoroughly and fixing some minor things before merging thread into next. In my uses thus far the thread branch is now working well - no crashes.

arclance

could just be my server too, but arclance mentioned a delay of about that length

That long delay was just at the start of a copy not when I opened a folder on the nfs drive.

I get at most a 2 second delay when first opening a nfs drive, that can also happen on a local directory if it has a lot (hundreds) of files in it.
Image folders are the worst because it has to get the mime types and load thumbnails.

Your longer delay might be due to your network having a bottleneck 10 times slower than mine.
I get reads at close to 1000 Mb/s or around 80 MiB/s through my gigabit switch most of the time.

Have you checked the latency between your nfs server and client?
You said you were going through a router that will introduce more latency than just using a switch.

I don't see any observable delay opening files over nfs except when a program wants to cache a large file for some reason.

arclance

I just tested a nfs copy using the latest version of the thread branch and while the gui is much smoother the copy runs at about half the speed as using the next branch.

The next branch gave a copy speed of about 33MiB/s and the thread branch gave a copy speed of about 15MiB/s.

The thread branch also started copying for about 1 second and then the copy froze for about 10 seconds before resuming.

A labeled screenshot from my network monitor conky illustrating this problem can be found below.

http://i.imgbox.com/aayyEoCK.png

IgnorantGuru
Owner

The next branch gave a copy speed of about 33MiB/s and the thread branch gave a copy speed of about 15MiB/s.
The thread branch also started copying for about 1 second and then the copy froze for about 10 seconds before resuming.

The delay likely lowered the average speed. You might set the 'Current Speed' column visible, or set View|Tasks|Popups|Detailed Stats (visible in a popup run dialog). The current speed is based on the last 2 second interval, whereas the average speed is merely the total data moved divided by the time (including delays).
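
A minimal sketch of the difference between the two figures, using made-up samples shaped like the stall reported above. The function names and the sample data are hypothetical; only the 2 second window and the definitions come from the explanation.

```python
def average_speed(samples):
    """Total bytes moved divided by total elapsed time, delays included.

    `samples` is a list of (timestamp_s, cumulative_bytes) pairs in time order.
    """
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    return (b1 - b0) / (t1 - t0) if t1 > t0 else 0.0


def current_speed(samples, window_s=2.0):
    """Bytes per second over roughly the last `window_s` seconds only."""
    t_end, b_end = samples[-1]
    # Oldest sample that still falls inside the window.
    t_start, b_start = next((t, b) for t, b in samples if t >= t_end - window_s)
    return (b_end - b_start) / (t_end - t_start) if t_end > t_start else 0.0


MIB = 1024 * 1024
# Hypothetical copy: 1 s of data, a 10 s stall, then 4 s at 30 MiB/s.
samples = [(0, 0), (1, 30 * MIB), (11, 30 * MIB), (13, 90 * MIB), (15, 150 * MIB)]
print(average_speed(samples) / MIB)  # 10.0
print(current_speed(samples) / MIB)  # 30.0
```

With these numbers a single 10 second stall cuts the average to a third of the actual transfer rate, which is why the current speed is the better figure for comparing branches.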

If a copy freezes for real (not just the GUI), it's almost certainly an nfs delay. Doing anything else with the nfs folder (even in another app) could also cause spacefm to call stat on a file, etc., which may cause an nfs delay too.

Thanks for testing - let me know if you can measure an appreciable difference in speed between branches over a few tests, noting the current speed as a more accurate measurement. I doubt the thread branch would be inherently slower, probably just nfs variations, but I didn't do any speed comparisons.

nfs is horrible for testing because it's very inconsistent and subject to many causes of delay.

arclance

I will do some more tests with that option but if you look at the image I posted you can clearly see that the upload speed on my ethernet connection (only runs nfs) is different between the two tests.

What would be the best way to measure copy speed when copying from a terminal to get another point of comparison?

I was not doing anything else with the nfs drive when the copy froze, moving to a different tab did not affect the copy at all.
The gui did not freeze just the copy, you can see in the image I posted that there is a gap in the network use which corresponds to the frozen copy.

I also just had a crash in the next branch while deleting files off the nfs drive so I could test copying them again.
A pastebin of the backtrace is below.

http://pastebin.com/305i8Cdm

Should I start a new bug report for the crash since it is not related to the thread branch?

IgnorantGuru
Owner

I will do some more tests with that option but if you look at the image I posted you can clearly see that the upload speed on my ethernet connection (only runs nfs) is different between the two tests.

One test is virtually meaningless. Spacefm twiddles its thumbs during most copies - the speed of a drive or network is very slow compared to the cpu, so mostly it just waits for the write() call to come back from the kernel.

Just ran some tests copying from one internal hard drive to another with 0.7.8 and thread (it's good to start with a local copy speed test that doesn't involve the network). All figures are average speed in M/s with a 2GB file:

thread  0.7.8   cp
37      38      38.7
38      38      38.9
38      39
39      38
39      38

No appreciable difference as expected. To measure cp's speed:

time cp <file> <dest>

Then use the 'real' time returned for the calculation. spacefm's M/s is megabytes per second (1024x1024 bytes).

For me time returned "real 0m52.810s", so:
2141135754 / 1024 / 1024 / 52.81 = 38.7 M/s
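
The same arithmetic as a small Python sketch, for repeating the calculation with other files. `real_seconds` and `mib_per_second` are hypothetical helper names; the file size and timing are the ones given above.

```python
import re


def real_seconds(time_output):
    """Parse a bash `time` line such as 'real 0m52.810s' into seconds."""
    m = re.search(r"real\s+(\d+)m([\d.]+)s", time_output)
    if not m:
        raise ValueError("no 'real' time found")
    return int(m.group(1)) * 60 + float(m.group(2))


def mib_per_second(size_bytes, seconds):
    # 1 M here means 1024 x 1024 bytes, matching spacefm's M/s.
    return size_bytes / (1024 * 1024) / seconds


secs = real_seconds("real 0m52.810s")
print(round(mib_per_second(2141135754, secs), 1))  # 38.7
```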

Also, spacefm uses the same function for copying nfs as for any other fs. This test shows no functional difference. So any other difference is spacefm waiting for the kernel (nfs).

Also be sure both branches are built with the same configure options (i.e., don't build one with gdb debugging symbols and the other without).

IgnorantGuru
Owner

Should I start a new bug report for the crash since it is not related to the thread branch?

The code involved has been reworked in the thread branch, soon moving to next, so the bug report is no longer useful. Hopefully what caused that has been corrected with the improved threading. Hard to say, but it looks like maybe it tried to open the task dialog just after the task finished. That kind of timing problem should now be corrected.

arclance
The code involved has been reworked in the thread branch, soon moving to next, so the bug report is no longer useful.
Hopefully what caused that has been corrected with the improved threading.

Ok. If that bug does crop up again, it seems to happen when the progress bar goes away at the end of the delete, if the last file is small in size, like a short text file.

Hard to say, but it looks like maybe it tried to open the task dialog just after the task finished. That kind of timing problem should now be corrected.

Given when it happens (it happened to me 3 more times while testing), that was my guess as to the cause of the crash.

Further testing does not show a significant difference in nfs copy speed between the next and thread branches.

The thread branch may be slightly slower (0.5 MB/s) but I would have to do a lot more tests to confirm such a small difference.
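
One rough way to judge whether a gap that small is real is to compare the difference of the averages with the run-to-run spread. A minimal Python sketch, using the local-copy figures IgnorantGuru posted above as sample input; the function name is illustrative, and this is a sanity check rather than a proper significance test.

```python
from statistics import mean, stdev

def compare_runs(a, b):
    """Return (difference of means, larger sample standard deviation)."""
    return mean(a) - mean(b), max(stdev(a), stdev(b))

# Local-copy speeds in M/s from the table earlier in this thread.
thread_runs = [37, 38, 38, 39, 39]
release_runs = [38, 38, 39, 38, 38]  # 0.7.8

diff, spread = compare_runs(thread_runs, release_runs)
print(f"difference of means: {diff:.2f} M/s, spread: {spread:.2f} M/s")
```

If the difference is smaller than the spread, more runs are needed before drawing any conclusion.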

IgnorantGuru
Owner

Ok. If that bug does crop up again, it seems to happen when the progress bar goes away at the end of the delete, if the last file is small in size, like a short text file.

I had it crash once a month or two ago under those conditions, but I was never able to reproduce it - probably a race condition in the old threading. If you catch anything like that in the new code I'd like to know.

Just ran some speed tests on nfs. Not sure what the 80 is about (some kind of low level caching most likely) but otherwise unremarkable (copying a 1.6GB file):

thread  0.7.8   cp
80      11      11.05
11      11
11      11

If anything, the thread branch should be more efficient - less CPU time. But the bottleneck is the network / harddrive. Also, 0.7.8 and next are equivalent for these tests.

arclance

This is what my test looks like with a 1.3 GB split rar archive.

thread   next   cp
55       51     54.604
54       54     51.073
51       54     54.600

This is what I get for a single 1.2GB file.

thread   next   cp
32       39     34.403
35       33     31.569
33       33     31.520

The single file transfer actually happened at about 100MiB/s and was done in about 15 to 20 seconds.
Then there was a pause with very little network activity before the transfer actually finished.
I saw this even with cp so it may be due to my nfs drives being mounted with "sync" which delays writing until the whole file has been received by the server.

IgnorantGuru
Owner

The above commit changes delete behavior: spacefm will no longer recursively calculate the total size of files to be deleted. I think this is unnecessary overhead as deletes are usually too fast to see the info anyway, and it's another delay on network fs's. But feedback is welcome - should this change be included or reverted? What's lost is a (questionable) time estimate for when the delete will be complete and the total size of all files being deleted (often gone too quick to read). otoh it doesn't take very long to calculate unless the dirs are deep or the fs is slow.

Then there was a pause with very little network activity before the transfer actually finished.
I saw this even with cp so it may be due to my nfs drives being mounted with "sync" which delays writing until the whole file has been received by the server.

Usually on the close() call (when writing to the file is done), actual writes must be received and written by the server, afaik. Probably is affected by sync.

Also, after the close, spacefm will trigger the file browser to add/update the info for the file (if the dir is open in a tab). This means a stat on the file, which nfs handles either gracefully or not depending on who-knows-what. So that can be another little delay, which can affect the GUI too. But for me it was mostly working smoothly as long as I left the nfs dir tab alone until it completed the copy.
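
To see where a copy spends its time (the write calls versus the final close), something like the following Python sketch can be run against an nfs mount. It is not spacefm's copy code; the function name and chunk size are assumptions, and whether close() actually blocks depends on the filesystem and mount options.

```python
import time

def timed_copy(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst; return (bytes copied, seconds in write, seconds in close)."""
    copied = 0
    write_secs = 0.0
    with open(src, "rb") as fin:
        fout = open(dst, "wb")
        try:
            while True:
                chunk = fin.read(chunk_size)
                if not chunk:
                    break
                start = time.monotonic()
                fout.write(chunk)
                write_secs += time.monotonic() - start
                copied += len(chunk)
        finally:
            # close() flushes Python's buffer and closes the descriptor;
            # on nfs this is where pending data may have to reach the server.
            start = time.monotonic()
            fout.close()
            close_secs = time.monotonic() - start
    return copied, write_secs, close_secs
```

A long close time relative to the write time would point at the flush on close rather than at the copy loop.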

Thanks again for the testing and feedback.

IgnorantGuru
Owner

The problem: What criteria should be used to determine whether or not to detect changes in a displayed directory?

Currently [in the thread branch], only directories on devices which are block devices, or which are type 'tmpfs', have change detection enabled. All others, such as nfs, do not, to prevent the delays caused by calling stat while writes are in progress.

I think this may cause problems for some fuse filesystems, though - they may not be networked and thus change detection would be desirable. One option is to whitelist these as tmpfs is whitelisted. But I would need to know what they are.

Comments?

IgnorantGuru
Owner

The 'thread' branch has been merged into the 'next' branch and deleted. This means the next branch now contains these updates. I have tested it quite a bit but it can use some routine use before release.

You can comment on additional issues here (even though this issue is closed) or open a new issue.

IgnorantGuru closed this June 30, 2012
IgnorantGuru
Owner

In 0e82cf5 (spacefm >0.7.10) file change detection has been changed to a blacklist. Block devices always detect changes. Non-block devices detect changes except for nfs, fuse (including curlftpfs, fuse.sshfs, fuse.*, excluding fuseblk), smbfs, ftpfs.

If you know another that should be blacklisted (a filesystem which hangs or performs poorly if change detection is enabled), please comment.
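
For reference when suggesting additions, the rule described above can be sketched in Python as follows. This illustrates the stated rule only and is not the actual code from 0e82cf5; the names are hypothetical, and exact string matching on the filesystem type (plus a "fuse." prefix check) is an assumption.

```python
# Filesystem types that do not get change detection, per the comment above.
BLACKLISTED_FSTYPES = {"nfs", "fuse", "curlftpfs", "smbfs", "ftpfs"}

def detects_changes(is_block_device, fstype):
    """Return True if a directory on this filesystem should detect changes."""
    if is_block_device:
        return True  # block devices always detect changes
    if fstype == "fuseblk":
        return True  # explicitly excluded from the fuse blacklist
    if fstype in BLACKLISTED_FSTYPES or fstype.startswith("fuse."):
        return False
    return True  # anything else (tmpfs, for example) detects changes

print(detects_changes(False, "fuse.sshfs"))  # False
print(detects_changes(False, "tmpfs"))       # True
```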
