
Copying files is slow and slows to a crawl over time for large numbers of files #1

Open
elementaryBot opened this issue Jun 18, 2017 · 24 comments

@elementaryBot
Contributor

elementaryBot commented Jun 18, 2017

Copying a lot of files via Pantheon Files becomes slower and slower over time.

I've created 250,000 100-byte files on tmpfs for testing, and kicked off a copy to another tmpfs. It started off at speeds over 100 kB/s, but halfway through it was down to just 4 kB/s (!) and dropping.

Profiling with sysprof shows that all this time is spent in g_list_last(), which probably means that we're abusing a linked list somewhere and that it has to walk the entire list of already copied files, one by one, for each next file copied.
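
For reference, a minimal sketch (generic GLib code, not the actual Files code path) of why tail-appending with GList degrades: g_list_append() walks to the end of the list on every call, so building a list of n items costs O(n²) in total, while prepending and reversing once stays linear.

    /* Illustrative sketch only -- not taken from the Files source. */
    #include <glib.h>

    GList *
    build_slow (guint n)
    {
        GList *list = NULL;
        for (guint i = 0; i < n; i++)
            list = g_list_append (list, GUINT_TO_POINTER (i));  /* walks to the tail: O(i) each call */
        return list;
    }

    GList *
    build_fast (guint n)
    {
        GList *list = NULL;
        for (guint i = 0; i < n; i++)
            list = g_list_prepend (list, GUINT_TO_POINTER (i)); /* O(1) */
        return g_list_reverse (list);                           /* restore order once, O(n) */
    }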

Testcase:
mkdir ~/created-files ~/copy-here
sudo mount -t tmpfs -o size=1G,mode=0777 tmpfs ~/created-files
sudo mount -t tmpfs -o size=1G,mode=0777 tmpfs ~/copy-here
cd ~/created-files
split -b 100 SOME-BIG-FILE

Open Pantheon Files and copy the "created-files" folder into "copy-here".

This is a synthetic test case, but I had over 250,000 files during my last backup for OS reinstallation, so this is a real-life scenario.

ProblemType: Bug
DistroRelease: elementary OS 0.3
Package: pantheon-files 0.1.5.1+r1680+pkg35~ubuntu0.3.1 [origin: LP-PPA-elementary-os-daily]
ProcVersionSignature: Ubuntu 3.13.0-43.72-generic 3.13.11.11
Uname: Linux 3.13.0-43-generic x86_64
ApportVersion: 2.14.1-0ubuntu3.6
Architecture: amd64
CrashDB: pantheon_files
CurrentDesktop: Pantheon
Date: Sun Dec 21 04:42:10 2014
ExecutablePath: /usr/bin/pantheon-files
GsettingsChanges:

InstallationDate: Installed on 2014-12-10 (10 days ago)
InstallationMedia: elementary OS 0.3 "Freya" - Daily amd64 (20141209)
SourcePackage: pantheon-files
UpgradeStatus: No upgrade log present (probably fresh install)

Launchpad Details: #LP1404588 Sergey "Shnatsel" Davidoff - 2014-12-21 01:53:22 +0000

@elementaryBot
Contributor Author

pantheon-files-daemon is also using a lot of memory (250 MB).

Launchpad Details: #LPC Sergey "Shnatsel" Davidoff - 2014-12-21 04:41:05 +0000

@elementaryBot elementaryBot added the Priority: High Significantly affecting majority of users' normal work label Jun 18, 2017
@elementaryBot
Contributor Author

In addition, this state makes any operation in Files very slow. Even starting pantheon-files while pantheon-files-daemon is in this state is very slow.

Launchpad Details: #LPC Sergey "Shnatsel" Davidoff - 2014-12-21 04:42:04 +0000

@elementaryBot
Contributor Author

Replacing GList with GSequence data structure might be a way to hotfix this without changing huge amounts of code.

Launchpad Details: #LPC Sergey "Shnatsel" Davidoff - 2014-12-24 21:55:20 +0000
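
A minimal sketch of the kind of swap suggested above, assuming the hot path is tail-appending to a list of copied files (generic GLib code, not a patch to Files): GSequence is backed by a balanced tree, so each append costs O(log n) rather than an O(n) tail walk, and items can still be iterated in insertion order.

    #include <glib.h>

    int main (void)
    {
        /* Sketch under the assumption above, not the Files implementation. */
        GSequence *seq = g_sequence_new (g_free);

        for (guint i = 0; i < 250000; i++)
            g_sequence_append (seq, g_strdup_printf ("file-%u", i)); /* O(log n) per append */

        /* Iterate in insertion order, as the existing GList code would. */
        for (GSequenceIter *it = g_sequence_get_begin_iter (seq);
             !g_sequence_iter_is_end (it);
             it = g_sequence_iter_next (it)) {
            const char *name = g_sequence_get (it);
            (void) name; /* a real caller would process the item here */
        }

        g_sequence_free (seq);
        return 0;
    }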

@elementaryBot
Contributor Author

A bounty of $100 has been placed on this bug.

Launchpad Details: #LPC Jeremy Wootten - 2015-02-18 14:55:15 +0000

@elementaryBot
Contributor Author

Might this be related to a similar issue concerning very slow file transfer to USB sticks?

Launchpad Details: #LPC Giulio Sant - 2015-02-23 16:46:20 +0000

@elementaryBot
Contributor Author

I have changed the bug description to clarify that the bounty relates to obtaining a significant improvement in file-copying performance in general, not just for numbers of files on the order of 100,000. Even with comparatively small numbers of files (100–1000), Files is very much slower than other well-known file managers. I have increased the bounty to reflect the widened scope.

Launchpad Details: #LPC Jeremy Wootten - 2015-03-01 08:04:25 +0000

@elementaryBot
Contributor Author

Not solved, but it was improved somewhat, so I'm bumping it from the milestone.

Launchpad Details: #LPC Cody Garver - 2015-03-25 11:33:10 +0000

@elementaryBot
Contributor Author

I assume both source and destination were open in a Files view during the copy? Different tabs or different windows? Icon View or other?

Launchpad Details: #LPC Jeremy Wootten - 2016-04-09 10:22:46 +0000

@elementaryBot
Contributor Author

So I ran a couple of benchmarks to see if I could figure out what the problems might be here. I wrote a simple program, basically a "cp" clone using g_file_copy, to benchmark copy speeds against "cp" itself. What I found is that g_file_copy has very similar performance to "cp" (it copied 10,000 small files at about 160 kB/s), so no problems there. It seems more likely that this has to do with all the queuing and locking going on in the file manager. I've been swapping out various data structures, benchmarking, and seeing some small performance increases. Removing some locking from the deep counter and switching out the Marlin file queue for a thread-safe GAsyncQueue improved things a bit: I've been getting between 40 kB/s and 60 kB/s with those changes. It might also be worth swapping out the GIOScheduler code, since that is deprecated, though I'm not sure whether that will bring any speed increase with it.

Launchpad Details: #LPC Matt Spaulding - 2016-10-12 20:53:15 +0000
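
For reference, a minimal sketch of how such a g_file_copy()-based "cp" clone might look (an assumption about the benchmark's shape, not Matt's actual program):

    #include <gio/gio.h>

    int main (int argc, char **argv)
    {
        if (argc != 3) {
            g_printerr ("usage: %s SOURCE DEST\n", argv[0]);
            return 1;
        }

        GFile  *src   = g_file_new_for_commandline_arg (argv[1]);
        GFile  *dst   = g_file_new_for_commandline_arg (argv[2]);
        GError *error = NULL;

        /* G_FILE_COPY_OVERWRITE matches cp's clobbering behaviour;
         * no progress callback, so raw GIO overhead is what gets timed. */
        gboolean ok = g_file_copy (src, dst, G_FILE_COPY_OVERWRITE,
                                   NULL, NULL, NULL, &error);
        if (!ok) {
            g_printerr ("copy failed: %s\n", error->message);
            g_clear_error (&error);
        }

        g_object_unref (src);
        g_object_unref (dst);
        return ok ? 0 : 1;
    }

Timing a loop of such copies over the tmpfs testcase above, against coreutils cp on the same files, would reproduce the comparison described.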

@elementaryBot
Contributor Author

Matt: Thanks for having a go at this. Just for clarity, the target is to get Files to perform at least comparably to other popular file managers in this respect, say within 75%? This assumes that other features that might affect speed, like "undo", are also comparable.

Launchpad Details: #LPC Jeremy Wootten - 2016-10-13 10:46:49 +0000

@elementaryBot
Contributor Author

Okay, thank you for the clarification. Which file managers should I run comparisons against? At least in my tests with Nautilus, its copy speed with large numbers of files is very poor, comparable to what we're seeing with Files.

Launchpad Details: #LPC Matt Spaulding - 2016-10-13 15:09:16 +0000

@elementaryBot
Contributor Author

Matt: I was thinking of Thunar and PCManFM primarily, although I admit I have not done a comparison recently. I assumed Nautilus was superior at the time the bug was filed, but perhaps things have changed. It is a fairly old bug now.

If Files is (now) already comparable to the best file managers under the conditions quoted in the bug, then I would be willing to change the target to a more modest improvement and/or fixing the associated memory leaks.

Launchpad Details: #LPC Jeremy Wootten - 2016-10-13 17:34:41 +0000

@vjr
Member

vjr commented Jun 20, 2017

Not sure if the Launchpad comment will get forwarded here, but I've got a patch if anyone would like to try it out and comment. See vjr@c972549

@danirabbit danirabbit changed the title Copying files is slow and slows to a crawl over time for large numbers of files [$200] Copying files is slow and slows to a crawl over time for large numbers of files Jul 29, 2018
@ghost

ghost commented Sep 6, 2018

Is this still needing to be resolved?

@elegaanz

elegaanz commented Sep 6, 2018

Yes, I can still reproduce this behavior sometimes with Juno.

@jeremypw
Collaborator

Need to ensure that files are not being swapped out as tmpfs fills up.

@jeremypw
Collaborator

Does this only happen for copying or also for moving or linking?

@jeremypw
Collaborator

Is the destination folder being displayed? If so, some of the delay may be caused by processing GFileMonitor signals and updating the associated async directory object and the display widget, especially for GtkIconView, which gets very sluggish with a large number of file items.
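
To illustrate the mechanism (a generic GIO sketch, not the Files code; "copy-here" is just the testcase directory from this report): every file created in a monitored destination emits a "changed" signal, so a handler that updates the view per event runs 250,000 times for the testcase above.

    #include <gio/gio.h>

    static void
    on_changed (GFileMonitor      *monitor,
                GFile             *file,
                GFile             *other_file,
                GFileMonitorEvent  event,
                gpointer           user_data)
    {
        if (event == G_FILE_MONITOR_EVENT_CREATED) {
            char *path = g_file_get_path (file);
            g_print ("created: %s\n", path); /* a view would add a row per event */
            g_free (path);
        }
    }

    int main (void)
    {
        GFile  *dir   = g_file_new_for_path ("copy-here");
        GError *error = NULL;
        GFileMonitor *monitor =
            g_file_monitor_directory (dir, G_FILE_MONITOR_NONE, NULL, &error);

        if (monitor == NULL) {
            g_printerr ("monitor failed: %s\n", error->message);
            return 1;
        }

        g_signal_connect (monitor, "changed", G_CALLBACK (on_changed), NULL);

        GMainLoop *loop = g_main_loop_new (NULL, FALSE);
        g_main_loop_run (loop);
        return 0;
    }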

@jeremypw jeremypw added the Status: Incomplete Requires more information to confirm or reproduce label Oct 31, 2018
danirabbit pushed a commit that referenced this issue Jun 19, 2019
@jeremypw
Collaborator

jeremypw commented Jul 30, 2019

I found recently (version 4.1.9 on Juno) that repeated use of <Ctrl>A, <Ctrl>C and <Ctrl>V to create large numbers of (empty) files (doubling the number each cycle) becomes very slow after a few cycles (about 10).

@ghost

ghost commented Jan 31, 2020

I am not a developer; however, has using the tar command ever been considered? I have used it to copy large numbers of files efficiently. The following articles elaborate on its utility:

Note that the last article demonstrates that mounting the file system with noatime in particular sped up the process, which may be worth considering.

@jeremypw
Collaborator

I agree that for very heavy file manipulation tasks, specialized tools are better than Files, which at the moment is more suited to general file browsing and operations on small numbers of files. Things like the color-tag plugin and the undo manager contribute an increasingly large overhead as the number of files increases.

@ghost

ghost commented Feb 2, 2020

@jeremypw Why stop at “general file browsing,” though? Would it not be wonderful if Files were better equipped for universal applications rather than being adequate for “small” file operations? To be able to use the first party applications even under heavy loads would lessen one's dependence on the Terminal or third parties; for example, using Files to perform large copy operations, using Photos to edit very large photographs, or using Music to manage a library of several thousands of songs.

@jeremypw
Collaborator

jeremypw commented Feb 2, 2020

Absolutely, but everything is limited by the developer time and ability available. All developers are free to contribute new features to Files and improve the existing ones, but none are paid to do so. Also, there is sometimes a trade-off between ease of use and efficiency: for example, unlimited undo and color tagging are useful, but they slow down file operations. An app that focuses purely on moving files might not implement these.

@ghost

ghost commented Feb 2, 2020

I understand. I did not mean to presume that the elementary OS team is swimming in cash, employees, and/or volunteers. I do hope that a solution will be found for this issue.
