apt-get/dpkg commands are really slow on a ZFS rootfs #3857
Comments
|
It's not Not much you can do, other than create a new filesystem with |
|
On Wednesday September 30 2015 07:15:13 Turbo Fredriksson wrote:
Yes, but that shouldn't affect the time required for reading the database, should it? R. |
|
It could be that prefetch is hurting us in apt's database. #3765 might help if that is the case. |
|
On Wednesday September 30 2015 07:59:12 Richard Yao wrote:
How safe is that patch, applied over the current stable version for Ubuntu (0.6.5.1-1)? |
|
another potentially relevant observation that I forgot to mention:
it's especially the 1st database traversal (or whatever the "reading database ... XXX files currently installed" operation does) that is really slow. First being "after not having re/installed/upgraded anything for a while".
Subsequent calls are much faster, possibly even reasonably fast given my hardware (hard to assess that).
Makes the issue tricky to debug ...
|
|
@RJVB it's not ready yet, since additional changes it depends on aren't in ZoL's code, so: not safe (the buildbot status gives a hint of how "safe" pull requests mostly are; WIP == work in progress) |
|
On Wednesday September 30 2015 09:36:04 kernelOfTruth aka. kOT, Gentoo user wrote:
Heh, I know what WIP stands for, and saw that the buildbots were running into issues ... but I also saw CentOS, and I tend to equate that with AncientOS ;) R. |
|
@RJVB I suspect this was caused by a recent performance regression in 0.6.5 and 0.6.5.1. It's been addressed in 0.6.5.2, which is now available. Can you install 0.6.5.2 and see if it addresses the issue? |
|
On Wednesday September 30 2015 11:39:22 Brian Behlendorf wrote:
I didn't repeat it here, but I've always had the slowness in this application. It was one reason I installed my gf's machine on a btrfs root. I'll install 0.6.5.2 as soon as it appears in apt, and update this ticket accordingly. R. |
|
@RJVB Did |
|
On Sunday October 04 2015 04:53:52 Turbo Fredriksson wrote:
It's very hard to determine whether the issue is really resolved; as I said before, caching means that apt/dpkg commands that do not involve (un)installing packages complete (much) faster when repeating them after the initial run. That said, subsequent You wouldn't happen to know if the apt database is accessed with a fixed record size, would you? R. |
|
I did a bit of testing recently with the apt/dpkg suite and there are 2 interesting things: first, all access to the database files (pkgcache.bin, etc.) are via |
|
On Sunday October 04 2015 04:53:52 Turbo Fredriksson wrote:
Actually, I don't think it made a difference. I just did an apt-get install of a simple -dev package, and the "Reading database" step after downloading the archive advanced at about 5 seconds per 5% (and told me afterwards I have 473814 files and directories installed). R. |
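For scale, the figures in the comment above can be turned into a rough throughput number. This is only a back-of-the-envelope sketch: it assumes the progress indicator advances linearly and that the 473814 entries are spread evenly over the whole "Reading database" step, neither of which is confirmed in the thread. The function name is made up for the illustration.

```python
# Figures quoted in the comment above.
FILES_INSTALLED = 473_814   # "473814 files and directories installed"
SECONDS_PER_STEP = 5.0      # about 5 seconds ...
PERCENT_PER_STEP = 5.0      # ... per 5% of progress


def reading_database_estimate(files, sec_per_step, pct_per_step):
    """Return (total_seconds, entries_per_second, ms_per_entry).

    Assumes linear progress from 0% to 100% (an assumption, not a measurement).
    """
    total_seconds = sec_per_step * (100.0 / pct_per_step)
    entries_per_second = files / total_seconds
    ms_per_entry = 1000.0 * total_seconds / files
    return total_seconds, entries_per_second, ms_per_entry


total, rate, per_entry = reading_database_estimate(
    FILES_INSTALLED, SECONDS_PER_STEP, PERCENT_PER_STEP
)
print(f"total: {total:.0f} s, {rate:.0f} entries/s, {per_entry:.3f} ms per entry")
```

Under those assumptions the step takes roughly 100 seconds, or a little over 0.2 ms per installed entry.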
|
just a heads up that this still is an issue. |
|
On Thursday February 25 2016 00:38:43 K1773R wrote:
For me it is indeed. Were you asking, or were you saying that it affects you too? R. |
|
yes, it affects me too. |
|
If the slowness is during the "Reading package lists..." phase, the issue is the creation of the /var/cache/apt/pkgcache.bin file. We've deployed a lot of Ubuntu Trusty systems and the apt-get program definitely runs very slow while the temp file is mmapped and being populated: The slowness only seems to occur when running the stock 3.13 kernel. When we update these systems to use the "Wily enablement stack", which has a 4.2 kernel, the problem goes away. I don't have any Debian-based testing systems on which to try to track down the problem so I've not pursued it any further. |
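To check whether populating a file through a shared memory mapping is slow on a given dataset and kernel, a small timing sketch like the following may help. It does not reproduce what apt actually does when it builds pkgcache.bin; it only creates a file, maps it, and fills it page by page. The function name, the 4096-byte chunk size, the 32 MiB file size and the file name are all assumptions chosen for the illustration.

```python
import mmap
import os
import sys
import tempfile
import time


def populate_via_mmap(path, size_bytes, chunk=4096):
    """Create `path`, size it, and fill it through a shared mmap.

    Returns the elapsed time in seconds. This is a stand-in for "a file
    populated through a mapping", not a copy of apt's cache generation.
    """
    start = time.perf_counter()
    with open(path, "w+b") as f:
        f.truncate(size_bytes)                 # give the file its final size
        with mmap.mmap(f.fileno(), size_bytes) as m:
            block = b"\xAA" * chunk
            for off in range(0, size_bytes, chunk):
                end = min(off + chunk, size_bytes)
                m[off:end] = block[: end - off]
            m.flush()                          # push dirty pages to the file
    return time.perf_counter() - start


if __name__ == "__main__":
    # Pass a directory on the filesystem under test; otherwise the
    # system temporary directory is used.
    base = sys.argv[1] if len(sys.argv) > 1 and os.path.isdir(sys.argv[1]) else None
    with tempfile.TemporaryDirectory(dir=base) as d:
        elapsed = populate_via_mmap(os.path.join(d, "mmap_test.bin"), 32 * 1024 * 1024)
        print(f"populated 32 MiB via mmap in {elapsed:.3f} s")
```

Running it once against a directory on the ZFS dataset and once against a directory on another filesystem, under both kernels, would give a first indication of whether the mapping itself is the slow part.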
|
On Thursday February 25 2016 05:03:27 Tim Chase wrote:
The slowness only seems to occur when running the stock 3.13 kernel. When we update these systems to use the "Wily enablement stack", which has a 4.2 kernel, the problem goes away.
I'm running one of the latest 3.14 kernels, so it's not just the 3.13 kernel. I'm currently not able to update beyond 3.14 (that is, I'm not willing to give up certain software that doesn't support later kernels yet).
R.
|
|
I have the same issue inside an LXC container on a ZFS pool. Reading the dpkg database and unpacking packages take ages. iotop shows z_null_int at 99% load all the time, but dd write is fast. It could be the same issue as #6171. host: host container |
|
On Wednesday January 31 2018 10:35:41 aTanCS wrote:
It could be the same issue as #6171.
That'd be good news, because that issue is apparently fixed. However, I have been seeing this slowness since well before the regression mentioned in the fix commit message.
|
|
This issue continues to plague 0.7.12 (and Linux 4.14). I have a growing hunch that the /var/lib/dpkg/info directory is a culprit, if not the culprit. That dir contains a very large number of tiny files, 2 or 3 per installed package, and apparently those files aren't always removed when packages are uninstalled. I've seen other situations where a key/value database-in-a-flat-directory led to dismal performance on ZFS (and on FAT16 O:^)).
There must be ways to improve this on the ZFS level (when you can add as many disks and/or as much RAM as you want), but when you're just (ab?)using ZFS as a safe filesystem on a kickabout Linux notebook those are just not an option. The best fix is also not really an option: getting the dpkg system to use a proper, fast key/value database (LMDB would probably do).
Here's a workaround that I'm testing, with initial encouraging results:
I'm not losing on disk space either:
We'll see how this evolves over time. For now I went with the easier and safer solution; the alternative would be to set up a sparse zvol with the same properties as the dataset holding the apt and dpkg directories, and create the XFS fs in there. I'm not used to working with ZVOLs, hence the easier and safer choice (safer because I'm not putting my pool at risk); it does mean I'll be making very regular syncs of /var/lib/dpkg and /var/lib/dpkg.bak . I also like the fact that it's trivial (provided you keep backups) to recreate the sparse disk image when the existing one starts to degrade.
Edit, some results from the tcsh time built-in: The big differences here are in the number of input operations ( |
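To put numbers on the "very large number of tiny files" hunch, a sketch along these lines can count the regular files directly inside a flat directory, sum their sizes, and time the traversal. The function name is hypothetical, and the script only measures directory listing plus one stat per entry; it says nothing about how dpkg itself reads these files.

```python
import os
import sys
import time


def scan_flat_dir(path):
    """Stat every regular file directly inside `path` (no recursion).

    Returns (file_count, total_bytes, elapsed_seconds).
    """
    start = time.perf_counter()
    count = 0
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                count += 1
                total += entry.stat(follow_symlinks=False).st_size
    return count, total, time.perf_counter() - start


if __name__ == "__main__":
    # Directory to inspect; /var/lib/dpkg/info is the one named in the thread.
    target = sys.argv[1] if len(sys.argv) > 1 else "/var/lib/dpkg/info"
    if os.path.isdir(target):
        n, size, secs = scan_flat_dir(target)
        mean = size / n if n else 0.0
        print(f"{n} files, {size} bytes, mean {mean:.0f} bytes/file, {secs:.3f} s")
```

Comparing the elapsed time for the same directory copied onto ZFS and onto another filesystem would show whether the flat-directory layout alone accounts for the difference.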
|
Just throwing it out there, but why wouldn't gentoo have the same problems, the portage tree is made up of 163498 files consuming roughly 1.5gb of space. |
|
Doesn't dpkg use fsync as well? Have you tried it with it turned off in ZFS?
|
|
why wouldn't gentoo have the same problems, the portage tree is made up of 163498 files consuming roughly 1.5gb of space.
No idea, but it must have something to do with how that tree is accessed. Maybe it does what MacPorts does with its port tree; indexing file(s).
Come to think of it, I sometimes use their port tree repo (github:macports/macports-ports) as a source for benchmarking filesystem operations.
Doesn't dpkg use fsync as well? Have you tried it with it turned off in ZFS?
The slowest part is where it says "Reading database", so I doubt that it can be due to fsync'ing. But as mentioned above the dataset used here already has sync=disabled. Is there another way to "turn fsync off in ZFS"?
|
|
sync=disabled should be OK. Does it take the same amount of time no matter how many times the directory is read (that is, once the cache is warm, assuming you've got metadata caching turned on)?
|
|
sync=disabled should be OK.
It has been for years :)
Does it take the same amount of time no matter how many times the directory is read (that is, once the cache is warm, assuming you've got metadata caching turned on)?
No, for me the command is really slow only when it hasn't been run for a while (since the previous day, for instance).
And that's typical of things like dpkg, you don't use it very often most of the time (but when you do there's a good chance dpkg installs for cache with the packages it installs if you see what I mean).
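The cold-versus-warm difference described here can be made measurable with a sketch like the one below, which reads every file under a directory several times in a row and reports the time for each pass. The function names and the default of three runs are assumptions made for the illustration. The first pass only counts as "cold" if nothing has touched those files recently; the script makes no attempt to evict anything from cache.

```python
import os
import sys
import time


def read_tree_once(root):
    """Read every regular file under `root`; return (files_read, bytes_read)."""
    files = 0
    size = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue                      # skip symlinks and special files
            try:
                with open(path, "rb") as f:
                    size += len(f.read())
                files += 1
            except OSError:
                continue                      # unreadable file: ignore it
    return files, size


def time_repeated_reads(root, runs=3):
    """Return a list with the elapsed seconds of each successive full read."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        read_tree_once(root)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "/var/lib/dpkg/info"
    for i, t in enumerate(time_repeated_reads(root), 1):
        print(f"run {i}: {t:.3f} s")
```

A large gap between the first and later passes would match the behaviour reported in this thread, where only the first apt/dpkg invocation after a long pause is really slow.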
|
|
Hmm, I think it's more up to (a) ZFS dev(s) to show how ZFS is NOT the cause of this issue - despite the fact that it occurs just about only with ZFS?
Mailing lists aren't really suitable for discussions that drag on for years....
|
This came up on IRC and I've been asked to file an issue about it. It will be mostly a set of observations, since I obviously cannot easily compare the same distribution on the same computer using various filesystems.
I'm running Kubuntu 14.04 LTS off a ZFS root. I use an EXT3 /boot partition. I originally installed the OS to a FS supported by the installer that comes with the Kubuntu live images, and then rsynced the result onto my ZFS rootfs.
Since then I have tried to tune the pool by restricting copies=2 to the directories that really deserve it, and by moving directories with more or less "volatile" content onto datasets that have sync disabled or even also checksums disabled.
One class of operations that has always been much slower since moving to ZFS is installing, removing or upgrading packages with apt-get or dpkg. I think this is related to the database(s) involved; the step where it says "reading database" takes much longer than I'd expect. (Expectation is based here on overall performance differences with another, faster laptop running the same distribution off a btrfs partition.)
But for comparison, upgrading or installing a new kernel with accompanying headers is noticeably faster when I disable sync; progress messages appear in much faster succession as soon as sync is disabled.
As a result I put /var/cache/apt on a dataset that has sync and compression disabled (but checksums enabled, and copies=2), which made database access somewhat faster, but still very slow.
My hunch is that dpkg uses database algorithms or settings that are poorly suited to a db file stored on ZFS.