Libtorrent 2.x memory-mapped files and RAM usage #6667

Open · HanabishiRecca opened this issue Jan 13, 2022 · 110 comments

@HanabishiRecca (Contributor)

libtorrent version (or branch): 2.0.5 (from Arch Linux official repo)

platform/architecture: Arch Linux x86-64, kernel ver 5.16

compiler and compiler version: gcc 11.1.0

Since qBittorrent started actively migrating to libtorrent 2.x, there have been a lot of concerns from users about excessive RAM usage.
I have not tested libtorrent with other frontends, but I think the result would be similar to qBt.

The thing is, the libtorrent 2.x memory-mapped files model may be a great improvement for I/O performance, but users are confused about RAM usage. And it seems this is not related to a particular platform; both Windows and Linux users are affected.
libtorrent 2.x causes strange memory monitoring behavior: for me, the process RAM usage is reported as very high, but in fact the memory is not consumed and the overall system RAM usage reported is low.

(screenshot: qb-lt2)

Not critical in my case, but kinda confusing.
This counter does not include the OS filesystem cache, just in case.
Also, this is not some write sync/flush issue, because it is also present when only seeding.

I'm not an expert in this topic, but maybe there are some flags that can be tweaked for mmap to avoid this?

@HanabishiRecca (Contributor, Author)

Seems like for Windows users it can also cause crashes: qbittorrent/qBittorrent#16048. On Windows a process can't allocate more than the virtual memory available (physical RAM + max pagefile size).
And it can also cause lags, because if the pagefile is dynamic, the system will expand it with empty space. Windows doesn't have an overcommit feature, so it must ensure that allocated virtual memory actually exists somewhere.

@arvidn (Owner) commented Jan 13, 2022

there's no way to avoid mmap allocating virtual address space. However, the relevant metric is resident memory, which is the actual amount of physical RAM used by a process. In htop these metrics are reported as VIRT and RES respectively. I don't know what PSS is, do you? It sounds like it may measure something similar to virtual address space.

Is the confusion among users similar to this?
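
For reference, a minimal sketch (assuming Linux with procfs mounted; an illustration, not libtorrent code) of reading the same two numbers htop labels VIRT and RES, straight from /proc:

```cpp
// Read the first two fields of /proc/self/statm: total program size (VIRT)
// and resident set size (RES), both in pages, then convert to kB.
#include <fstream>
#include <iostream>
#include <unistd.h>

int main() {
    long pages_virt = 0, pages_res = 0;
    std::ifstream statm("/proc/self/statm");
    statm >> pages_virt >> pages_res;
    const long page_kb = sysconf(_SC_PAGESIZE) / 1024;
    std::cout << "VIRT: " << pages_virt * page_kb << " kB\n"
              << "RES:  " << pages_res * page_kb << " kB\n";
}
```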

@arvidn (Owner) commented Jan 13, 2022

> And it can also cause lags, because if the pagefile is dynamic, the system will expand it with empty space.

pages backed by memory-mapped files are not also backed by the pagefile. The page file backs anonymous memory (i.e. normally allocated memory). The issue of Windows prioritizing pages backed by memory-mapped files very highly, and failing to flush them when it's running low on memory, is known, and there are some work-arounds for that, such as periodically flushing views of files and closing files (forcing a flush to disk).
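
For illustration, a minimal Win32 sketch of that work-around (the view/file parameters are hypothetical placeholders, not libtorrent's actual names):

```cpp
// Periodically flush a mapped view so Windows writes dirty pages back
// instead of letting them accumulate under memory pressure.
#include <windows.h>

void periodic_flush(void* view_base, size_t view_size, HANDLE file) {
    FlushViewOfFile(view_base, view_size); // queue dirty pages for write-back
    FlushFileBuffers(file);                // flush the file's buffers to disk
}
```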

> Windows doesn't have an overcommit feature, so it must ensure that allocated virtual memory actually exists somewhere.

I don't believe that's true. Do you have a citation for that?

@HanabishiRecca (Contributor, Author) commented Jan 13, 2022

> I don't know what PSS is, do you? It sounds like it may measure something similar to virtual address space.

Yes. It's like RES but more precise. It doesn't matter anyway; the RES values are effectively identical in this case. I can take a screenshot with all possible values enabled, if you don't believe me.

> Is the confusion among users similar to this?

No. As I mentioned, it's not cache.

> I don't believe that's true. Do you have a citation for that?

Well, just search for it.
https://superuser.com/questions/1194263/will-microsoft-windows-10-overcommit-memory
https://www.reddit.com/r/learnpython/comments/fqzb4h/how_to_allow_overcommit_on_windows_10/

Simply put, Windows never had an overcommit feature. And I personally, as a programmer, have run into this fact.

@arvidn (Owner) commented Jan 13, 2022

sorry, I accidentally clicked "edit" instead of "quote reply". And now I'm having a hard time finding the undo button.

@HanabishiRecca (Contributor, Author)

More columns. VIRT is way larger.

(screenshot: qb-st)

@arvidn (Owner) commented Jan 13, 2022

> > I don't know what PSS is, do you? It sounds like it may measure something similar to virtual address space.
>
> Yes. It's like RES but more precise. It doesn't matter anyway; the RES values are effectively identical in this case. I can take a screenshot with all possible values enabled, if you don't believe me.

The output from `pmap -x <pid>` would be more helpful.

> > I don't believe that's true. Do you have a citation for that?
>
> Well, just search for it.
> https://superuser.com/questions/1194263/will-microsoft-windows-10-overcommit-memory
> https://www.reddit.com/r/learnpython/comments/fqzb4h/how_to_allow_overcommit_on_windows_10/

None of these are Microsoft sources, just random people on the internet making claims. One of those claims is a program that (supposedly) demonstrates over-committing on Windows.

> Simply put, Windows never had an overcommit feature. And I personally, as a programmer, have run into this fact.

I think this may be drifting a bit away from your point. Let me ask you this: on a computer that has 16 GB of physical RAM, would you expect it to be possible to memory-map a file that's 32 GB?

According to one answer on the Super User link, it wouldn't be considered over-committing as long as there is enough space in the page file (and presumably in any other file backing the pages, for non-anonymous ones). So, mapping a file on disk (by that definition) wouldn't be over-committing. With that definition of over-committing, whether Windows does it or not isn't relevant, so long as it allows more virtual address space than there is physical memory.

@HanabishiRecca (Contributor, Author) commented Jan 13, 2022

> The output from `pmap -x <pid>` would be more helpful.

Of course. (File name is redacted.)

```text
10597:   /usr/bin/qbittorrent
Address           Kbytes     RSS   Dirty Mode  Mapping
000055763ecbd000     512       0       0 r---- qbittorrent
000055763ed3d000    3508    2316       0 r-x-- qbittorrent
000055763f0aa000    5984     368       0 r---- qbittorrent
000055763f682000     108     108     108 r---- qbittorrent
000055763f69d000      28      28      28 rw--- qbittorrent
000055763f6a4000       8       8       8 rw---   [ anon ]
000055764131e000   54236   54048   54048 rw---   [ anon ]
00007ef4cc899000 9402888 8062068   56592 rw-s- file-name-here
...
```

> None of these are Microsoft sources, just random people on the internet making claims. One of those claims is a program that (supposedly) demonstrates over-committing on Windows.
>
> I think this may be drifting a bit away from your point. Let me ask you this: on a computer that has 16 GB of physical RAM, would you expect it to be possible to memory-map a file that's 32 GB?
>
> According to one answer on the Super User link, it wouldn't be considered over-committing as long as there is enough space in the page file (and presumably in any other file backing the pages, for non-anonymous ones). So, mapping a file on disk (by that definition) wouldn't be over-committing. With that definition of over-committing, whether Windows does it or not isn't relevant, so long as it allows more virtual address space than there is physical memory.

Sorry. By overcommit I mean exceeding the virtual memory amount, which, as I said earlier, is physical RAM + pagefile. Windows doesn't allow that. If this is about to happen, Windows will expand the pagefile with empty space to ensure full commit capacity, and will raise OOM if the max pagefile size is exceeded. So if you have e.g. 16G RAM and a 16G max pagefile, the maximum possible virtual memory amount for the whole system will be 32G. This can be easily tested.
Linux allows overcommit; by that I mean allocating more VIRT than the RAM + swap size. Btw, in the screenshot I have 16G RAM and no swap at all. A VIRT size of 41G would not be possible on Windows in that case.
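
For what it's worth, a quick Win32 sketch of that test (an illustration of the commit-limit behavior described above, not libtorrent code):

```cpp
// MEM_RESERVE only claims address space; MEM_COMMIT charges against the
// system commit limit (RAM + max pagefile) and fails once it is exceeded.
#include <windows.h>
#include <cstdio>

int main() {
    const SIZE_T huge = SIZE_T(1) << 40; // 1 TiB, beyond RAM + pagefile
    void* reserved = VirtualAlloc(nullptr, huge, MEM_RESERVE, PAGE_NOACCESS);
    std::printf("reserve 1 TiB: %s\n", reserved ? "ok" : "failed");
    void* committed = VirtualAlloc(nullptr, huge, MEM_RESERVE | MEM_COMMIT,
                                   PAGE_READWRITE);
    std::printf("commit 1 TiB:  %s\n", committed ? "ok" : "failed");
}
```

On a typical x64 machine the reserve succeeds (address space is plentiful) while the commit fails, which is exactly the distinction at issue here.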

@arvidn (Owner) commented Jan 13, 2022

Ok. When memory-mapping a file, the file itself is backing the pages, so (presumably) they won't affect the pagefile, or be relevant for purposes of extending the pagefile.

That pmap output seems reasonable to me, and is what I would expect, except that if you're seeding, I would expect the "file-name-here" region to be mapped read-only.

I would also expect those pages to be among the first to be evicted (especially the 99.3% of pages that are non-dirty and therefore cheap to evict). Do you experience this not happening? Does it slow down the system as a whole?

@HanabishiRecca (Contributor, Author) commented Jan 13, 2022

I found a clearer analogy: Windows always behaves like Linux with the kernel option vm.overcommit_memory = 2. That's it.
Sorry again for the previous word spam.

> That pmap output seems reasonable to me, and is what I would expect, except that if you're seeding, I would expect the "file-name-here" region to be mapped read-only.
>
> I would also expect those pages to be among the first to be evicted (especially the 99.3% of pages that are non-dirty and therefore cheap to evict). Do you experience this not happening? Does it slow down the system as a whole?

This was a download; it was just easier to make a fast showcase at such a large memory scale, because when downloading, memory consumption seems to grow indefinitely.

When seeding, memory consumption depends on peer activity. Only certain file parts are loaded, as shown in this output (file names omitted):

```text
Address           Kbytes     RSS   Dirty Mode  Mapping
...
00007f0f3ca19000  949204   96008       0 r--s- 
00007f0f7690e000  220484    5440       0 r--s- 
00007f0f8405f000 1046808   15980       0 r--s- 
00007f1003853000 1048576    1472       0 r--s- 
00007f1043ea5000  995120   67236       0 r--s- 
00007f1080a71000 1048576    2604       0 r--s- 
00007f10c0a71000 1048576   53036       0 r--s- 
00007f1100a71000 1048576   27404       0 r--s- 
00007f1140a71000  888884   53708       0 r--s- 
00007f1176e7e000 1048576   59892       0 r--s- 
00007f11b6e7e000 1048576   41908       0 r--s- 
00007f11f6e7e000 1178632  153136       0 r--s- 
00007f123ed80000  797004    4668       0 r--s- 
00007f126f7d3000 1173584  142572       0 r--s- 
00007f12f6b50000    6748    4800       0 r--s- 
00007f12f71e7000 1045132  101892       0 r--s- 
00007f1336e8a000 1176632  140892       0 r--s- 
00007f137eb98000 1048576   55532       0 r--s- 
00007f13beb98000 1146904    5504       0 r--s- 
00007f1404b9e000 1178168  191456       0 r--s- 
00007f144ca2c000  886596   47408       0 r--s- 
00007f1482bfd000  836900   34880       0 r--s- 
00007f14b5d46000 1163336  149824       0 r--s- 
00007f14fcd58000 1165984  224952       0 r--s- 
...
```

When seeding, RSS is freed as files fall out of use. But qBt memory consumption is still very large; with intensive seeding tasks it easily grows to gigabytes.
With libtorrent 1.x, qBt consumes around 100M overall (with the in-app disk cache disabled), regardless of anything.

@HanabishiRecca (Contributor, Author) commented Jan 13, 2022

> Do you experience this not happening? Does it slow down the system as a whole?

I performed a better test, downloading a very large file, larger than my RAM.

`free` output:

```text
               total        used        free      shared     buffers       cache   available
Mem:            15Gi       1.7Gi       183Mi       104Mi       0.0Ki        13Gi        13Gi
Swap:             0B          0B          0B
```

`pmap` output:

```text
Address           Kbytes     RSS   Dirty Mode  Mapping
00007f085b17e000 34933680 13305724   37664 rw-s- filename
```

RSS caps at the amount of RAM available. No system slowdown, OOM events or the like. I don't have swap, though.
So, good news: it works just like the regular disk cache (it belongs to the cache column in `free` output), at least on Linux. The Dirty amount is small, as expected. The only scary thing is that it shows up as resident memory in per-process stats.

@arvidn (Owner) commented Jan 13, 2022

My impression is that Linux is one of the most sophisticated kernels when it comes to managing the page cache. Windows has more problems, where the system won't flush dirty pages early enough, requiring libtorrent to force flushing. But that leads to waves of disk-thread lock-ups while flushing. There's still some work to do on Windows to get this working well (unless persistent memory and the unification of RAM and SSD happen before then :) )

@HanabishiRecca (Contributor, Author)

Yeah. Linux, at least, can be tweaked in all aspects and debugged.

I did some more research into per-process stats.

```text
RssAnon:	  123820 kB
RssFile:	 5087496 kB
RssShmem:	    9276 kB
```

Mapped files are obviously represented as RssFile, and monitoring tools like htop seem to simply sum up all 3 values.
I can't say whether this is just a monitoring software issue (should RssFile even be included?) or whether the situation is more complicated.
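
For illustration, a sketch (assuming Linux; not qBt or htop code) of where those numbers come from, so the split is visible directly:

```cpp
// Print the kernel's breakdown of RSS into anonymous, file-backed and
// shmem parts; mmap'ed torrent data is accounted under RssFile.
#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream status("/proc/self/status");
    for (std::string line; std::getline(status, line);) {
        if (line.rfind("RssAnon:", 0) == 0 ||
            line.rfind("RssFile:", 0) == 0 ||
            line.rfind("RssShmem:", 0) == 0)
            std::cout << line << '\n';
    }
}
```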

@ValdikSS

This is an issue on Linux, as libtorrent's memory-mapped file implementation affects file eviction (swapping-out) behavior.

Please watch the video, where rechecking a torrent in qBittorrent forces the kernel to swap out 3 GB of RAM.

(video: qbittorrent-440-recheck2-2022-01-16_20.18.30.mp4)

Regular mmap()'ed files never trigger this behavior: they are never counted as RSS and do not force the memory of other processes to be swapped out.

Relevant issue in qBittorrent repo: qbittorrent/qBittorrent#16146

@HanabishiRecca (Contributor, Author) commented Jan 16, 2022

I also found this behavior strange. It shouldn't work this way, I think.

But diving into the source, I haven't found anything suspicious. mmap seems to be used in the usual way:

```cpp
, m_mapping(m_size > 0 ? mmap(nullptr, static_cast<std::size_t>(m_size)
```

And the flags are normal: `MAP_FILE | MAP_SHARED`.

I also tried to play around with the flags, but nothing changed.

But I'm not an expert in C++ and Linux programming, so I could definitely be missing something.

@ValdikSS

Ugh.

1. mmap()'ed files are indeed shown in RSS on Linux as you read them.
2. "When physical memory becomes low, the kernel will unmap sections of the file from physical memory based on its LRU (least recently used) algorithm. But the LRU is also global. The LRU may also force other processes to swap pages to disk, and reduce the disk cache. This can have a severely negative effect on the performance of other processes and the system as a whole." [1] — this is what I see on my system with vm.swappiness = 80.
3. It is possible to hint to the kernel that you don't need some parts of the memory mapping with madvise MADV_DONTNEED without unmapping the file, but you'd need to implement your own LRU algorithm for this to be efficient in libtorrent's case. You can't 'set' the mapped memory to reclaim itself more automatically; you have to manually call madvise on the regions you use less than others.
4. Mmapped files make memory-consumption monitoring problematic, at least on Linux, which was also spotted by the golang developers and the developers of the jemalloc library used in Firefox:

   "I looked at the task manager and Firefox sux!"

5. Unless torrenting is the machine's highest-priority task, using mmap (at least on Linux) for torrent data only decreases overall performance, as it affects other workloads by swapping more than it could/should and "spoiling" the LRU with lower-priority data.

@arvidn (Owner) commented Jan 16, 2022

MADV_DONTNEED will destroy the contents of dirty pages, so I don't think that's an option, but there's MADV_COLD in newer versions of Linux.
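
A minimal sketch of that idea (assuming Linux >= 5.4; illustrative only, not libtorrent code):

```cpp
// Unlike MADV_DONTNEED, MADV_COLD keeps page contents intact and merely
// marks the range as a preferred reclaim target under memory pressure.
#include <sys/mman.h>
#include <cstddef>

#ifndef MADV_COLD
#define MADV_COLD 20 // defined by Linux 5.4+; may be missing in old headers
#endif

// Returns 0 on success; fails with EINVAL on kernels older than 5.4.
int mark_cold(void* addr, std::size_t len) {
    return madvise(addr, len, MADV_COLD);
}
```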

@ValdikSS

It seems I was not entirely correct in my previous comment. For some reason, my Linux system also swaps out anonymous memory upon reading huge files with regular means (read/fread), and decreasing vm.swappiness does not help much. This might be a recent change, as I don't remember such behavior in the past. So take my comment with a grain of salt: everything might work fine; I need to do more tests.

@HanabishiRecca (Contributor, Author)

@ValdikSS, try to adjust vm.vfs_cache_pressure.

@karabaja4 commented Jan 26, 2022

I am very confused about what htop reports as memory usage for qbittorrent with libtorrent 2.x.

(screenshot: 2022-01-26_18-53)

Can someone explain these values?

@HanabishiRecca (Contributor, Author)

> Can someone explain these values?

Well, this is how the OS treats memory-mapped files. This question should be addressed to the Linux kernel devs, I suppose.

@mayli (Contributor) commented Feb 3, 2022

For the mmap issue, how about creating many smaller maps instead of one large map of the entire file, and making the size of the mapped-chunk pool configurable, or defaulting it to a reasonable value? It's similar to the original cache_size, but it would reduce some of the confusion from users who rely on tools that conflate mmap-ed pages with actual memory usage.

@arvidn (Owner) commented Feb 9, 2022

> For the mmap issue, how about creating many smaller maps instead of one large map of the entire file

I don't think it's obvious that it would make a difference. It would certainly make 32-bit systems supportable, since you could stay within the 3 GB virtual address space. But whether unmapping a range of a file causes those pages to be forcefully flushed or evicted would have to be demonstrated first.

If that worked, simply unmapping and remapping the file regularly might help, just like the periodic file-close functionality.
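
A sketch of the unmap-and-remap idea (POSIX; variable names are illustrative, not libtorrent's):

```cpp
// Drop the mapping and create a fresh one. The old pages stay in the page
// cache after munmap(); they are just no longer charged to this process.
#include <sys/mman.h>
#include <sys/types.h>
#include <cstddef>

void* remap_file(void* old_view, std::size_t len, int fd, off_t offset) {
    munmap(old_view, len);
    void* fresh = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                       fd, offset);
    return fresh == MAP_FAILED ? nullptr : fresh;
}
```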

@mayli (Contributor) commented Feb 9, 2022

> But whether unmapping a range of a file causes those pages to be forcefully flushed or evicted would have to be demonstrated first.

I believe the default behavior is to flush pages (or delay the flush until sync), unless special flags such as MADV_DONTNEED were used.

But you can always use msync(2) to force a flush.
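
A minimal msync(2) sketch (POSIX; illustrative only):

```cpp
// MS_ASYNC schedules write-back and returns immediately;
// MS_SYNC blocks until the dirty pages have reached the disk.
#include <sys/mman.h>
#include <cstddef>

int flush_view(void* addr, std::size_t len, bool blocking) {
    return msync(addr, len, blocking ? MS_SYNC : MS_ASYNC);
}
```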

@SL-Gundam commented Feb 15, 2022

Just my 2 cents.
I'm using qBittorrent, which uses libtorrent, on Windows.
In earlier versions of qBittorrent I set the cache size to 0 (disabled) and turned the OS cache on.
This worked perfectly: disk usage was the lowest it could ever be (5-20%), with very little memory usage (160-200 MiB) for the actual qBittorrent process. Modified cache stayed relatively low as well.

qBittorrent 4.4.0 started using libtorrent 2.0.
With qBittorrent 4.4.0 the cache size setting disappeared.
When I added a couple of torrents, disk usage was around 40-50% and memory shot up to 12-14 GiB for the qBittorrent process, for torrents similar to those described above.

Windows 10 x64
Hardware RAID 5 using 6 HDD drives
32 GiB of RAM

Based on https://libtorrent.org/upgrade_to_2.0-ref.html#cache-size, I feel something is not quite right, since it says that libtorrent 2.0 should exclusively use the OS cache.
If I understand correctly, libtorrent 2.0 wanted to make my previous setup standard and unchangeable. But it does not behave the same. Or am I misunderstanding something?

@escape0707 commented Feb 15, 2022

@SL-Gundam

libtorrent 2 features memory-mapped files. It requests that the OS map the file into the process's virtual memory, and lets the OS decide when to do the actual reads, writes, and releases through the CPU cache / physical memory / disk stack.

The OS will report high memory usage, but most of that usage should just be cached data that doesn't need to be freed at the moment. (Unless, in some scenarios, Windows doesn't flush its cache early enough, which is what #6522 and this issue are about.)

But I do think I observed disk usage and I/O getting higher than before when I first used a libtorrent 2 build. I'm not sure whether that's still the case on Windows, as I've since migrated to Linux.

@USBhost commented Sep 24, 2022

> the upcoming libtorrent-2.0.8 has some mitigations for this.
>
> • writes to files will be done using pwrite() (unless the target drive is a ramdisk or DAX, which is uncommon)
> • small files won't be memory mapped, but will use pread() and pwrite()
>
> I'm hoping to finish a patch that caches SHA-1 contexts for partial pieces, to hash pieces incrementally as well, which would decrease the pressure to read back blocks once a piece completes.

Finally, some updates to this problem. Idk if the last part will help my setup. We'll see if pwrite fixes my problems on my NFS share.
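
For illustration, a hedged sketch of the incremental-hashing idea quoted above (using the OpenSSL EVP API; this is not the actual libtorrent patch):

```cpp
// Keep one digest context per partial piece and feed blocks as they are
// written, so the piece never has to be read back once it completes.
#include <openssl/evp.h>
#include <array>
#include <cstddef>

struct piece_hasher {
    EVP_MD_CTX* ctx = EVP_MD_CTX_new();
    piece_hasher() { EVP_DigestInit_ex(ctx, EVP_sha1(), nullptr); }
    ~piece_hasher() { EVP_MD_CTX_free(ctx); }

    // Call once per block, in order, as blocks arrive from peers.
    void add_block(const char* data, std::size_t len) {
        EVP_DigestUpdate(ctx, data, len);
    }

    // Call when the last block of the piece has been added.
    std::array<unsigned char, 20> finish() {
        std::array<unsigned char, 20> digest{};
        unsigned int out_len = 0;
        EVP_DigestFinal_ex(ctx, digest.data(), &out_len);
        return digest;
    }
};
```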

@BrodyStone21 commented Oct 16, 2022

> the upcoming libtorrent-2.0.8 has some mitigations for this.

Any idea of a timeline as to when 2.0.8 will be released?

@arvidn (Owner) commented Oct 17, 2022

> Any idea of a timeline as to when 2.0.8 will be released?

Probably this week or this coming weekend.

@BrodyStone21

> > Any idea of a timeline as to when 2.0.8 will be released?
>
> Probably this week or this coming weekend.

Cool, thanks for the response and your hard work!

@corvus1 commented Oct 23, 2022

Tried the fresh 2.0.8, it didn't change anything for me. Staying on 1.2 for now. Sorry...

@simonbcn

Operating System: Arch Linux
KDE Plasma Version: 5.26.3
KDE Frameworks Version: 5.100.0
Qt Version: 5.15.7
Kernel Version: 6.0.10-native_amd-xanmod1-1 (64-bit)
Graphics Platform: Wayland
Processors: 64 × AMD Ryzen Threadripper 3970X 32-Core Processor
Memory: 62.7 GiB of RAM
Graphics Processor: AMD Radeon RX 580 Series

qBittorrent v4.4.5
libtorrent-rasterbar 2.0.8

Same problem.

@Sopel97 commented Dec 27, 2022

Workaround: set the default disk I/O constructor (https://www.libtorrent.org/single-page-ref.html#default-disk-io-constructor) to posix-compliant.

@mayli (Contributor) commented Mar 9, 2023

How about mapping smaller regions (1-16 MB) of the file instead of mapping the entire file, and penalizing/deprioritizing I/O requests that access non-mmapped regions? This method essentially converts a torrent with large files (one large mmap) into a torrent with many smaller/medium files (smaller mmaps).

The address space of all mappings could be restricted by the pool size. It's common practice to expose a limit on mmap usage, as with sqlite's mmap_size and lmdb's map size.

libtorrent-rakshasa uses chunked mmap instead of mmapping the entire file.

Although this would bring more mmap/munmap overhead, the smaller RSS could improve the page cache hit ratio, and the UX would help non-tech-savvy users accept that lt2.x isn't eating all the RAM.
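
A rough sketch of that chunked-mapping idea (illustrative only; none of these names exist in libtorrent):

```cpp
// Map fixed-size windows of a file on demand and munmap the least recently
// used window when the pool is full, bounding both RSS and address space.
#include <sys/mman.h>
#include <sys/types.h>
#include <cstddef>
#include <cstdint>
#include <list>
#include <map>

class mmap_window_pool {
    static constexpr std::size_t window_size = 16 * 1024 * 1024; // 16 MiB
    std::size_t max_windows_;
    std::list<std::uint64_t> lru_;            // front = most recently used
    std::map<std::uint64_t, void*> windows_;  // window index -> mapping

public:
    explicit mmap_window_pool(std::size_t max_windows)
        : max_windows_(max_windows) {}

    // Return a pointer to file_offset within a (possibly new) mapped window.
    void* view(int fd, std::uint64_t file_offset) {
        const std::uint64_t idx = file_offset / window_size;
        auto it = windows_.find(idx);
        if (it != windows_.end()) {
            lru_.remove(idx); // move to front below; O(n), fine for small pools
        } else {
            if (windows_.size() >= max_windows_) {
                const std::uint64_t victim = lru_.back(); // evict LRU window
                lru_.pop_back();
                munmap(windows_[victim], window_size);
                windows_.erase(victim);
            }
            void* m = mmap(nullptr, window_size, PROT_READ, MAP_SHARED,
                           fd, static_cast<off_t>(idx * window_size));
            if (m == MAP_FAILED) return nullptr;
            it = windows_.emplace(idx, m).first;
        }
        lru_.push_front(idx);
        return static_cast<char*>(it->second) + (file_offset % window_size);
    }
};
```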

@GabenGar commented Mar 10, 2023

> the UX would help non-tech-savvy users accept that lt2.x isn't eating all the RAM.

On Windows it literally causes freezes, so it's not just a "UX" issue for "non-tech-savvy users".

@ToasterUwU commented Apr 11, 2023

Having the same issue here. A 2 GB download slowed to speeds of around 50 KB/s and then stopped entirely, even though at least 20 seeders were available. I checked the RAM and saw that I had a wonderful 20 MB of free RAM out of a total of 12 GB.

This happened on Ubuntu Server 22.04 LTS, using qBittorrent-nox 4.5.2. I came here from the issue on their side that mentioned this one.

I changed the disk I/O type to "POSIX-compliant" in the advanced settings of qBittorrent and rebooted the whole thing. I will probably edit this comment in a few hours with the result, whether that fixed the issue or not.

This needs fixing, since no matter how big the download, after some time the whole thing just stops downloading and needs an actual reboot to work again for an hour or less.

EDIT: This seems to have fixed the issue. RAM usage has been stable and below 1 GB for over an hour now.

freebsd-git pushed a commit to freebsd/freebsd-ports that referenced this issue Apr 25, 2023
…sterbar 2.x

* Disable memory mapped file handling and use "POSIX-compliant"
* Backport commit 8bcac1bed28f93c0712e915f43294b1e5fd03659 which
  reduces FilePoolSize
* Change status to experimental

This change only applies to new installs; if you have an existing
configuration, you need to apply these changes by hand

References:
qbittorrent/qBittorrent@8bcac1b
arvidn/libtorrent#6667 (comment)

PR:		270765
Reviewed by:	yuri (previous revision)

@ValdikSS

The POSIX-compliant disk I/O type in the advanced configuration makes the qBittorrent interface very sluggish.
I tried increasing the RAM usage limit up to 4 GB and also increased the file pool size up to 50000, but nothing changed.

Adding a torrent results in up to a one-minute interface freeze, and the files are seeded noticeably slower than with libtorrent 1 or memory-mapped I/O.

Anyone else having the same issue?

@ToasterUwU commented May 25, 2023

@ValdikSS Yes. I have the exact same issue and tried the same things to fix it, also without success.

So I'm sitting here (just like you) having to choose whether I want my qBt instance to crash and not work at all, or have it work but be completely uncontrollable while downloading large files (because the Web UI just times out until it's done downloading), with very slow download and upload speeds as well.

@HanabishiRecca (Contributor, Author)

I have been using POSIX-compliant I/O on my seedbox for a while, and now tried memory-mapped I/O again.
But now, with the 6.4.3 kernel on that machine, memory-mapped mode actually causes OOM:

```text
kernel: Out of memory: Killed process 330 (qbittorrent-nox) total-vm:105198504kB, anon-rss:58228kB, file-rss:5618460kB, shmem-rss:0kB, UID:1000 pgtables:33240kB oom_score_adj:0
```

☹️

@siroccal

Workaround if you want to use memory-mapped I/O but still want to limit memory usage:

```sh
systemd-run --user --scope -p MemoryMax=2048M -p MemorySwapMax=0 qbittorrent
```

@ToasterUwU

I "solved" the issue for me, by using qbittorrent-nox with libtorrent V1, that way it's not as fast as V2 (when it's working properly), but at least it doesn't grind to a halt when doing anything besides looking pretty.

@Sopel97 commented Aug 24, 2023

> Workaround if you want to use memory-mapped I/O but still want to limit memory usage:
>
> systemd-run --user --scope -p MemoryMax=2048M -p MemorySwapMax=0 qbittorrent

This will still OOM. Not sure why you downvoted my workaround?

@siroccal

> This will still OOM. Not sure why you downvoted my workaround?

It doesn't seem to get OOM-killed for me. If I use memory-mapped file I/O together with the systemd-run command and recheck a 90 GB torrent on my NVMe SSD, qBittorrent will only use the MemoryMax amount of memory, without slowing down rechecking speed (5+ GB/s). With memory-mapped I/O but without the systemd-run command, rechecking speed is good but qBittorrent uses more than 20 GB of memory when rechecking a very large torrent.

If I set I/O to POSIX-compliant, memory usage is fine, but checking a very large torrent causes qBittorrent to become very laggy (almost freezing) until it's done checking, and rechecking speed is much lower (~1 GB/s). I think rechecking speed is so much lower because hashing seems to be limited to a single thread with POSIX-compliant I/O. There is no such limit with memory-mapped file I/O, and using 24 threads maximizes checking speed on my 6-core/12-thread processor.

@Zylsjsp commented Nov 9, 2023

I'm a Win10/Arch dual-boot user and share the same torrent folder on an NTFS partition on my HDD. Since I migrated to qBittorrent on Linux, I've suffered from qBit getting totally stuck every time I add any large torrent, and my system blocks for a long time with no response; ps and htop don't respond either, so I find it hard to trace the bug. I had suspected that ntfs-3g might not be stable under heavy r/w.

> Workaround: set the default disk I/O constructor (https://www.libtorrent.org/single-page-ref.html#default-disk-io-constructor) to posix-compliant.

After reading this issue I see where the problem really comes from. BTW, the frozen qBit process doesn't seem to be able to trigger the oom-killer now, on my latest kernel 6.6.1-arch1-1, while it sometimes did before.

Hoping for a native fix for the high RAM usage!

@HanabishiRecca (Contributor, Author)

Someone needs to actually deal with it. There is still a fair number of options that can be tried.

On the Linux side of things, MAP_NORESERVE and MADV_COLD seem worth trying. Judging by their descriptions, they may help against OOM.

> MAP_NORESERVE
> Do not reserve swap space for this mapping.

> MADV_COLD
> Deactivate a given range of pages. This will make the pages a more probable reclaim target should there be memory pressure.

Unless someone wants to ask the kernel devs directly.

Also, MAP_HUGETLB/MADV_HUGEPAGE can improve performance on distros where the kernel's TRANSPARENT_HUGEPAGE_ALWAYS config option is not enabled.
MADV_RANDOM, already addressed in #7405, may also be good.

At the end of the day, we could probably fall back to manually calling MADV_DONTNEED/MADV_PAGEOUT/MADV_FREE after every read.
Btw, it may also be interesting to see what happens when any of those options are applied to the whole mmapped region instead.

I would probably try all of this myself, but I'm too bad with C++, especially considering the complicated libtorrent codebase.
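
A sketch of what trying those flags might look like (Linux; illustrative only, not a patch):

```cpp
// Map with MAP_NORESERVE so no swap is reserved for the mapping, then layer
// madvise() hints on top of it.
#include <sys/mman.h>
#include <cstddef>

#ifndef MADV_COLD
#define MADV_COLD 20 // Linux 5.4+
#endif

void* map_torrent_file(int fd, std::size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_NORESERVE, fd, 0);
    if (p == MAP_FAILED) return nullptr;
    madvise(p, size, MADV_RANDOM); // torrent access is mostly random (#7405)
    madvise(p, size, MADV_COLD);   // prefer reclaiming these pages first
    return p;
}
```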

@Zylsjsp commented Nov 20, 2023

@PriitUring
True. But I'm now using posix-compliant as a workaround. The qBit UI is kinda sluggish in this mode, but far better than it getting totally stuck and terminating, losing status, as it did before.
