reduce cache size needs (txn.active + files) #1766
Comments
Your "files" cache/index is huge at 2.1GB, most likely because you have a lot of files. The txn.active directory is used for transaction processing: it exists while a transaction is active and is removed once the transaction has completed. The files cache needs some space per file (see the formula in the internals docs) and also keeps some "generations" of memory reaching back into the past, covering whatever files it has seen in previous backups. Recently we added an env var to adjust the number of generations. So, to summarize: the files cache size is determined by the number of files seen in the last N backups, N being 10 or 20 or so (or whatever you set the env var to).
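To make the "generations" behaviour described above more concrete, here is a minimal Python sketch. It is a simplified model written for illustration, not borg's actual code: the function name, the `max_generations` parameter, and the dictionary layout are all assumptions.

```python
def age_files_cache(cache, seen_paths, max_generations):
    """Return the cache as it would look after one backup run (simplified model).

    cache maps path -> age, where age is the number of consecutive backups
    in which the path was NOT seen. Paths seen in this run get age 0; all
    other paths age by one generation and are dropped once their age
    reaches max_generations.
    """
    updated = {path: 0 for path in seen_paths}
    for path, age in cache.items():
        if path not in updated and age + 1 < max_generations:
            updated[path] = age + 1
    return updated


# Three runs with max_generations=2 (illustrative value only).
run1 = age_files_cache({}, {"a", "b", "c"}, max_generations=2)
run2 = age_files_cache(run1, {"a"}, max_generations=2)  # "b", "c" kept, age 1
run3 = age_files_cache(run2, {"a"}, max_generations=2)  # "b", "c" expire
```

In this model the cache keeps growing with every path seen in recent runs, which is why lowering the number of generations shrinks it.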
What you could do:
This can in fact be optimized a fair bit by not keeping a backup copy of the cache around (which is what txn.active is about) and writing these files directly instead. This halves the files+main cache space needs, but it also means that an aborted transaction or program run becomes far more expensive, even more so if the repository is remote / on the internet rather than local. The files cache is encoded relatively densely: about 80 bytes per file at minimum (for files with some contents, not empty files), made up of a 32 byte file ID and at least one 32 byte chunk ID, plus some integers. Because of the dense encoding these entries are practically incompressible. Other possible optimizations for the files cache include simply not caching small files (below a couple of kB). This would probably do a lot for you here.
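As a rough back-of-the-envelope check of the per-file figure above, the following Python sketch computes a lower bound for the files cache. The 32 byte IDs come from the comment; the 16 bytes for "some integers" is an assumption chosen so that a single-chunk entry totals 80 bytes, and the function name is made up for this example.

```python
FILE_ID_BYTES = 32   # from the comment above
CHUNK_ID_BYTES = 32  # from the comment above, one ID per chunk
INTEGER_BYTES = 16   # assumption: "some integers", so one-chunk entries total 80


def min_files_cache_bytes(n_files, chunks_per_file=1):
    """Lower bound on files cache size for n_files entries."""
    per_entry = FILE_ID_BYTES + chunks_per_file * CHUNK_ID_BYTES + INTEGER_BYTES
    return n_files * per_entry


n_files = 2_797_574  # regular files under /home, from the issue description
estimate = min_files_cache_bytes(n_files)
print(f"lower bound: {estimate / 2**20:.0f} MiB")
```

The observed `files` size of 2.1G is well above this floor, which is to be expected since the estimate ignores multi-chunk files and entries remembered from earlier backups.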
Thomas, I switched off the files cache and got: 2016-10-26T12:20:33+02:00 Sichere /home ("Sichere" is German for "backing up"). That's not too bad. The last backups with the cache enabled were: 2016-10-26T10:33:32+02:00 Sichere /home. 2016-10-17T12:36:56+02:00 Sichere /home. 2016-10-14T10:02:19+02:00 Sichere /home. Looking at the variations, I find it difficult to say what the performance impact will be. I am using borgbackup only from office / work related networks with a 1 GBit link to the backup VM and, today, ping times of about 1 to 1.5 ms. Backup is too slow with my current DSL uplink tomorrow.
So theoretically I could make a symlink from txn.active to /tmp/txn.active to store this directory in tmpfs. This would add roughly another 2 GiB to RAM usage, on top of the about 5.4 GiB RSIZE borgbackup used, but on a laptop with 16 GiB that would be bearable.
txn.active is dynamically created and deleted again. If you back up via a DSL uplink, that might heavily influence your backup times. Also, if these are maildirs with a lot of tiny files, maybe there isn't a big difference between stat() and stat+open?
Disabling the files cache doesn't require more bandwidth to the repository.
Well, I am open to other suggestions, such as using this new environment variable to reduce generations, but right now this works. I do not mind much whether it takes one hour or half an hour, or whether it needs to read in all local files. This is a dual-SSD BTRFS RAID 1; I barely notice the read I/O activity while working on the laptop.
@enkore oops, right. It has to query for chunk presence, but does so in the local chunks cache, in memory.
It doesn't look like the files cache makes a lot of difference for you at all... from the data you posted you can probably just always turn it off.
Would it make sense to not put "small" files in the files cache? That might save some space, and not take much extra time, since the small file could be quickly chunked and matched with the chunks cache. It might even save time in some situations. The definition of "small" could be tunable by an environment variable.
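The proposal above could look roughly like the following Python sketch. Nothing here is an existing borg feature: the environment variable name `EXAMPLE_FILES_CACHE_MIN_SIZE` and both helper functions are hypothetical and exist only to illustrate the idea.

```python
import os

# Hypothetical variable name, used only for this illustration.
ENV_VAR = "EXAMPLE_FILES_CACHE_MIN_SIZE"


def min_cached_size(environ=os.environ, default=0):
    """Return the size threshold in bytes; 0 means every file is cached."""
    return int(environ.get(ENV_VAR, default))


def should_cache(file_size, threshold):
    """Files below the threshold are skipped and would be re-chunked each run."""
    return file_size >= threshold


# Example with a 4 kiB threshold passed in explicitly.
threshold = min_cached_size({ENV_VAR: "4096"})
sizes = [120, 3_000, 4_096, 1_000_000]
cached = [size for size in sizes if should_cache(size, threshold)]
```

With a maildir-heavy tree, most entries would fall below such a threshold, so the space saving could be large; whether the extra reads are acceptable is what would need measuring.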
Maybe, maybe not. Access time might be ~10 ms for hard disks, limiting throughput to << 100 files per second (access times are much lower for SSDs). Just measure it?
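One way to "just measure it" is a small Python script that walks a tree twice, once doing only stat() and once also reading every regular file. This is a sketch under simple assumptions (single-threaded, reading in 1 MiB blocks, no chunking or hashing), so it approximates only the I/O side of the comparison; the function names are made up for this example.

```python
import os
import stat
import time


def time_tree(root, read_contents):
    """Stat every regular file under root and optionally read it fully.

    Returns (file_count, elapsed_seconds).
    """
    count = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
                if not stat.S_ISREG(st.st_mode):
                    continue  # skip symlinks, FIFOs, sockets, ...
                if read_contents:
                    with open(path, "rb") as f:
                        while f.read(1 << 20):
                            pass
            except OSError:
                continue  # unreadable or vanished file
            count += 1
    return count, time.perf_counter() - start


def compare(root):
    """Return timings for stat-only and stat+read passes over root."""
    return {
        "stat_only": time_tree(root, read_contents=False),
        "stat_and_read": time_tree(root, read_contents=True),
    }
```

Calling `compare("/path/to/maildir")` returns both timings. Results depend heavily on whether the page cache is warm, so the first pass over a tree and repeated passes can differ a lot.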
Related: #235.
Unclear whether any issue remains. Closing. Feel free to reopen if this assessment is not accurate.
@jdchristensen had an interesting idea of not putting small files into the files cache - it seems there is some benchmarking left to do.
see #3096 about the small files performance.
@jdchristensen see #3096.
Related to "significantly reduce cache space needs" #235. But here it isn't a large chunks.archive.d, but rather txn.active + files. This is a cache for /home, which has an excessive number of files but isn't that large:
merkaba:~> du -sh /home
164G /home
merkaba:~> find /home | wc -l
2865746
merkaba:~> find /home -type f | wc -l
2797574
merkaba:~> find /home -type d | wc -l
64319
merkaba:~> find /home/martin/.local/share/local-mail | wc -l
2128451
merkaba:~> find /home/martin/.local/share/local-mail -type f | wc -l
2127093
merkaba:~> find /home/martin/.local/share/local-mail -type d | wc -l
1358
Today a borgbackup run hit an exception due to insufficient free space (see issue "limit cache space needs to avoid out of space exceptions" #1765).
And really:
merkaba:~/.cache/borg> du -sch * | sort -rh | head -4
5,4G insgesamt
4,9G 673e17ea929[…]
420M 809d8850c0bd[…]
44M 723a580c358b[…]
merkaba:~/.cache/borg/673e17ea92925012e32db8d4d92c6ec18a2a08f490d431b068fe6bdaae073737> LANG=C du -sch * | sort -rh | head -5
4.9G total
2.3G txn.active
2.1G files
296M chunks.archive.d
253M chunks