
Large file write on USB disk leads to freeze #3330

Closed
balexandrov opened this issue Jan 11, 2020 · 34 comments
@balexandrov

balexandrov commented Jan 11, 2020

Creating a bug report/issue

Required Information

  • DietPi version | cat /DietPi/dietpi/.version
    #!/bin/bash
    G_DIETPI_VERSION_CORE=6
    G_DIETPI_VERSION_SUB=28
    G_DIETPI_VERSION_RC=0
    G_GITBRANCH='master'
    G_GITOWNER='MichaIng'
  • Distro version | echo $G_DISTRO_NAME or cat /etc/debian_version
    buster, 10.2
  • Kernel version | uname -a
    Linux DietPi 4.19.57-v7+ #1244 SMP Thu Jul 4 18:45:25 BST 2019 armv7l GNU/Linux
  • SBC device | echo $G_HW_MODEL_DESCRIPTION or (EG: RPi3)
    RPi 3 Model B+ (armv7l)
  • Power supply used | (EG: 5V 1A RAVpower)
    Original Pi power supply
  • SDcard used | (EG: SanDisk ultra)
    Kingston
  • Can this issue be replicated on a fresh installation of DietPi?
    Yes, the values above are from a fresh minimal installation

Steps to reproduce

Fresh install. Mount an external USB hard drive.
I've tried 2 different drives in different enclosures, and formatted one of them with exFAT, NTFS and ext4 in turn. The behavior is almost the same: on large file writes the Pi becomes unresponsive, even the SSH console times out and everything stops. I first noticed it when downloading a file with Transmission. With top (when it manages to display anything), "wa" usage can be seen hitting 80-90%.
The same behavior can be observed with a simple dd command:

Writing 512MB file - no problems
dd if=/dev/zero of=/mnt/dec13470-f879-4ac3-8d4a-f74649b12c1b/test.img bs=512M count=1 oflag=dsync
536870912 bytes (537 MB, 512 MiB) copied, 18.2792 s, 29.4 MB/s

800MB sometimes is good too
dd if=/dev/zero of=/mnt/dec13470-f879-4ac3-8d4a-f74649b12c1b/test.img bs=800M count=1 oflag=dsync
838860800 bytes (839 MB, 800 MiB) copied, 30.7547 s, 27.3 MB/s

But 1GB blocks everything and eventually finishes after a while...
dd if=/dev/zero of=/mnt/dec13470-f879-4ac3-8d4a-f74649b12c1b/test.img bs=1G count=1 oflag=dsync 2> hdd1.txt
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 242.693 s, 4.4 MB/s

The problem occurs only when writing, never when reading files. The drive is USB3 but the Pi has USB2 ports only.

Here is the dmesg log for the drive:

[ 654.468723] usb 1-1.1.2: new high-speed USB device number 6 using dwc_otg
[ 654.680054] usb 1-1.1.2: New USB device found, idVendor=152d, idProduct=0578, bcdDevice=32.02
[ 654.680066] usb 1-1.1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 654.680076] usb 1-1.1.2: Product: JMS579
[ 654.680089] usb 1-1.1.2: Manufacturer: JMicron
[ 654.680098] usb 1-1.1.2: SerialNumber: 819A40B34
[ 654.705986] usb 1-1.1.2: The driver for the USB controller dwc_otg_hcd does not support scatter-gather which is
[ 654.706010] usb 1-1.1.2: required by the UAS driver. Please try an other USB controller if you wish to use UAS.
[ 654.706023] usb-storage 1-1.1.2:1.0: USB Mass Storage device detected
[ 654.706535] usb-storage 1-1.1.2:1.0: Quirks match for vid 152d pid 0578: 1000000
[ 654.707543] scsi host0: usb-storage 1-1.1.2:1.0
[ 655.759300] scsi 0:0:0:0: Direct-Access TOSHIBA MK6465GSXN 3202 PQ: 0 ANSI: 6
[ 655.761789] sd 0:0:0:0: [sda] 1250263728 512-byte logical blocks: (640 GB/596 GiB)
[ 655.762164] sd 0:0:0:0: [sda] Write Protect is off
[ 655.762171] sd 0:0:0:0: [sda] Mode Sense: 47 00 00 08
[ 655.762462] sd 0:0:0:0: [sda] Disabling FUA
[ 655.762469] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 655.786099] sda: sda1
[ 655.787942] sd 0:0:0:0: [sda] Attached SCSI disk

Tried changing the I/O scheduler from mq-deadline to kyber: no effect.

Please tell me what else I can provide or test.

@balexandrov balexandrov changed the title LArge file write on USB disk leads to freeze Large file write on USB disk leads to freeze Jan 11, 2020
@Joulinar
Collaborator

@balexandrov
Just a stupid question: did you try to write a large file to the local file system as well (not to a mounted device)? How does your system behave then?

@MichaIng
Owner

@balexandrov
Many thanks for your report.

Apart from the SD card test, did you try a different USB port as well? And what about CPU and RAM/swap usage by the dd process?

The drive has an external power supply, right? Did you also check the last dmesg output when the system starts to become less responsive, and Ctrl+C/kill dd before fully losing control?

@balexandrov
Author

Thanks all! I'm continuing to investigate, because this really bothers me. It is starting to look like a hardware problem though: something saturates and starts to slow down. The activity LED on the Pi stays lit constantly without blinking. The original drive is a WD Elements with an external power supply, the test drive is a 2.5-inch Toshiba in an enclosure without a power supply.
Just tried with the SD card, which seems to be rather slow, but the behavior is the same. I'm testing in one console over SSH and watching top in another.
100 MB file: normal, almost no freezing.
dd if=/dev/zero of=/mnt/test.img bs=100M count=1 oflag=dsync
104857600 bytes (105 MB, 100 MiB) copied, 10.818 s, 9.7 MB/s

512 MB froze once, and twice since then there was no problem. With a 1G file it always freezes so badly that the console times out.
Here are the top stats during a freeze, when I managed to capture them:

Tasks: 123 total,   1 running,  74 sleeping,   0 stopped,   1 zombie
%Cpu(s):  0.1 us,  1.6 sy,  0.1 ni,  0.2 id, 98.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :   700640 total,    31460 free,   564936 used,   104244 buff/cache
KiB Swap:  1098748 total,   803580 free,   295168 used.    81008 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2051 root      20   0  527800 524332      8 D   2.8 74.8   0:04.87 dd
   46 root      20   0       0      0      0 S   1.1  0.0   0:01.57 kswapd0
 2055 root      20   0       0      0      0 I   1.1  0.0   0:00.23 kworker/u8:3-ev
   28 root       0 -20       0      0      0 D   1.0  0.0   0:00.82 kworker/3:0H+kb
 2100 root      20   0    8008   1328    804 R   0.4  0.2   0:00.13 top

I'll try with another Pi

@balexandrov
Author

Can you please try this test on your device and tell me if it freezes? I don't have much experience with these mini computers and don't know what to expect...
dd if=/dev/zero of=/mnt/test.img bs=1G count=1 oflag=dsync

@Joulinar
Collaborator

@balexandrov
I did a test on my RPi4B without issues.

root@DietPi4:~# dd if=/dev/zero of=/mnt/test.img bs=1G count=1 oflag=dsync
1+0 records in
1+0 records out
1073741824 bytes (1,1 GB, 1,0 GiB) copied, 61,2632 s, 17,5 MB/s
root@DietPi4:~#

I guess you are running out of memory. As far as I can see in htop, dd allocates the full 1 GB in memory while it runs.

[Screenshot: htop]

@balexandrov
Author

Thanks @Joulinar !

Maybe this is a memory issue, but the original problem was with Transmission when downloading a torrent. Any ideas how to test and isolate the problem without filling up the memory?
I'm on an RPi 3B+, but I found that some drives have problems with the RPi 4B and tried these quirks, with no change in my case. Anyway, keep this in mind for your RPi 4B: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=245931
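
One way to exercise the drive without dd itself holding the whole file in RAM is to write the same amount of data in small blocks, since dd allocates one buffer of bs bytes. The following is only a sketch, assuming GNU dd: the helper names blocks_for_mib and write_test_file are made up for illustration, and the 4 MiB block size is an arbitrary choice. The kernel page cache can still fill up with dirty data, so this isolates dd's own memory use and nothing more.

```shell
# Hypothetical helper: how many blocks of block_mib MiB cover size_mib MiB (rounded up).
blocks_for_mib() {
    size_mib=$1
    block_mib=${2:-4}
    echo $(( (size_mib + block_mib - 1) / block_mib ))
}

# Hypothetical helper: write size_mib MiB of zeros to a target file in small blocks.
# dd's buffer is one block (4 MiB by default here) instead of the 1 GiB that bs=1G needs.
# conv=fsync flushes the file to the drive once at the end, so the reported
# rate includes the time the data takes to reach the disk.
write_test_file() {
    target=$1
    size_mib=$2
    block_mib=${3:-4}
    dd if=/dev/zero of="$target" bs="${block_mib}M" \
        count="$(blocks_for_mib "$size_mib" "$block_mib")" conv=fsync
}

# Example (not run here): 1 GiB as 256 blocks of 4 MiB
# write_test_file /mnt/test.img 1024
```

If the system stays responsive with small blocks but not with bs=1G, the freeze in the dd test most likely comes from dd's own buffer pushing the Pi into swap rather than from the drive.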

@balexandrov
Author

Here are the stats from the original problem with Transmission. It starts to download a torrent and looks fine, then after a few megabytes it suddenly freezes and the Transmission remote GUI times out. At the console I managed to get these top stats:

top - 16:40:19 up  1:04,  2 users,  load average: 9.57, 2.61, 1.32
Tasks: 123 total,   1 running,  72 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.7 us,  2.5 sy,  0.0 ni,  2.7 id, 92.7 wa,  0.0 hi,  0.4 si,  0.0 st
KiB Mem :   700640 total,    25528 free,   649404 used,    25708 buff/cache
KiB Swap:  1098748 total,   847100 free,   251648 used.    13148 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
   46 root      20   0       0      0      0 S   5.7  0.0   0:08.24 kswapd0
 1188 debian-+  20   0  675748 517784    528 S   4.5 73.9   0:18.83 transmission-da
   23 root       0 -20       0      0      0 D   1.9  0.0   0:00.90 kworker/2:0H+kb
 5141 root      20   0    8188   1120    760 R   1.3  0.2   0:02.11 top

@balexandrov
Author

It does indeed seem like a memory issue. Transmission starts to eat up memory and after a minute or so it freezes. This is with 1 torrent and about 10 peer connections, at about 5Mb/s download speed.
[Screenshot: htop]
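
To capture the memory growth as text rather than a screenshot, the resident memory of the daemon can be sampled at intervals. This is a rough sketch assuming the procps version of ps; rss_kib and watch_rss are hypothetical helper names, and the process name in the usage comment is an assumption taken from the top output above.

```shell
# Hypothetical helper: print the resident set size (RSS) of a PID in KiB.
rss_kib() {
    ps -o rss= -p "$1" | tr -d ' '
}

# Hypothetical helper: print a timestamped RSS sample every interval seconds
# for as long as the process exists.
watch_rss() {
    pid=$1
    interval=${2:-5}
    while ps -p "$pid" >/dev/null 2>&1; do
        printf '%s %s KiB\n' "$(date +%T)" "$(rss_kib "$pid")"
        sleep "$interval"
    done
}

# Example (not run here), assuming the daemon is named transmission-daemon:
# watch_rss "$(pidof transmission-daemon)" 5 | tee /root/transmission-rss.log
```

Logging to a file on the SD card means the samples survive even when the console times out.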

@Joulinar
Collaborator

Joulinar commented Jan 11, 2020

Also, very little physical memory is available. transmission-da is using close to 75% of your memory, which is already quite a lot. You are already using swap space, which will slow down your device due to increased I/O on your SD card. Can you try rebooting your system to clean up?

A little strange as well is the KiB Mem : 700640 total, which seems too low for an RPi 3B+. My 3B+ device shows KiB Mem : 999036 total. Probably you have some HW challenges 😃

top - 17:46:51 up 8 days,  4:47,  1 user,  load average: 0,00, 0,00, 0,00
Tasks:  85 total,   1 running,  41 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0,0 us,  0,4 sy,  0,0 ni, 99,6 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
KiB Mem :   999036 total,   616780 free,   156172 used,   226084 buff/cache
KiB Swap:  1097724 total,  1097724 free,        0 used.   762300 avail Mem

@balexandrov
Author

balexandrov commented Jan 11, 2020

It seems there is a problem with Transmission 2.92 that is fixed in 2.94.
transmission/transmission#313 (comment)
I've just compiled it from source. Please update the package in the store.
Just tried it, and it again eats up all the memory in the first 2 minutes with a single torrent file, but does not freeze. It becomes very sluggish however, and I'm fed up with it already and am abandoning this project of mine. I've spent a few days banging my head against simple tasks.
Will revert to a normal PC with BSD: much more stable.

@Joulinar
Collaborator

@balexandrov
I checked which package is pulled by the DietPi scripts and it's version 2.94-2. This is the latest available Debian package. So if you have installed 2.92, you might be running an outdated version.

 DietPi-Software
─────────────────────────────────────────────────────
 Mode: Installing Transmission: bittorrent server with web interface (c)

[ INFO ] DietPi-Software | APT installation for: transmission-daemon, please wait...
Selecting previously unselected package transmission-daemon.
(Reading database ... 30603 files and directories currently installed.)
Preparing to unpack .../transmission-daemon_2.94-2_armhf.deb ...
Unpacking transmission-daemon (2.94-2) ...
Setting up transmission-daemon (2.94-2) ...
Created symlink /etc/systemd/system/multi-user.target.wants/transmission-daemon.service → /lib/systemd/system/transmission-daemon.service.
Processing triggers for systemd (241-7~deb10u2+rpi1) ...
[  OK  ] DietPi-Software | G_AGI transmission-daemon

https://packages.debian.org/search?searchon=names&keywords=transmission

BTW: DietPi has nothing to do with Transmission Software. We just use official packages available.

@balexandrov
Author

I forgot about the tool and installed it with apt install, and it shows 2.92. Will try a clean installation again tomorrow.
apt list|grep transmission
libtransmission-client-perl/oldstable 0.0805-1 all
python-transmissionrpc/oldstable 0.11-3 all
python-transmissionrpc-doc/oldstable 0.11-3 all
python3-transmissionrpc/oldstable 0.11-3 all
transmission/oldstable 2.92-2+deb9u1 all
transmission-cli/oldstable 2.92-2+deb9u1 armhf
transmission-common/oldstable 2.92-2+deb9u1 all
transmission-daemon/oldstable,now 2.92-2+deb9u1 armhf [residual-config]
transmission-gtk/oldstable 2.92-2+deb9u1 armhf
transmission-qt/oldstable 2.92-2+deb9u1 armhf
transmission-remote-cli/oldstable 1.7.0-1 all
transmission-remote-gtk/oldstable 1.3.1-2 armhf

@MichaIng
Owner

MichaIng commented Jan 11, 2020

@Joulinar @balexandrov
Transmission version depends on Debian version, since it is pulled from regular Debian repo:

The issue with the version that is shipped by Stretch currently is known: #2413
I was thinking to ship self-compiled binaries there, but @balexandrov you mean your own compiled v2.94 has the same issue??

EDIT: Ah, the solution was not the version bump but to compile it against libcurl4-openssl-dev instead of libcurl4-gnutls-dev: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=865624#32

Not beautiful but somehow working workaround: #2413 (comment)

The bug is known to Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=865624
It has been fixed for Buster, and I asked for a backport to Stretch, but the answer was "do it yourself".

Another "solution" is of course to simply use a different downloader, such as Deluge or rTorrent; dietpi-software offers plenty of alternatives.

@Joulinar
Collaborator

Joulinar commented Jan 11, 2020

@MichaIng
the used Debian version is Buster

Distro version | echo $G_DISTRO_NAME or cat /etc/debian_version > buster, 10.2

so it should pull 2.94, shouldn't it 😉

@balexandrov
you could use dietpi-software to install Transmission. No need to run apt-get install on your own.

@MichaIng
Owner

@Joulinar
😄 you're right, then apt list showing Stretch packages but no Buster packages is a real issue.

@balexandrov
Did you dist-upgrade your system to Buster or manually change the sources.list entries? That apt list|grep transmission output is not possible on a fresh RPi image. Mixing both distros would explain issues in general.
To check the lists that are actually applied:

cat /etc/apt/sources.list{,.d/*}
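
To spot a Stretch/Buster mix at a glance, the suite names can be pulled out of those files. This is only a sketch for the classic one-line "deb" format; list_suites is a hypothetical helper, not part of DietPi or APT, and the repository lines in the usage comment are made-up examples.

```shell
# Hypothetical helper: print the distinct base suite names (e.g. buster, stretch)
# used by "deb"/"deb-src" lines in the given files, or on stdin if no file is given.
list_suites() {
    awk '
        $1 == "deb" || $1 == "deb-src" {
            i = 2
            # Skip an optional [option=value ...] group, which may span several fields
            if ($i ~ /^\[/) {
                while (i <= NF && $i !~ /\]$/) i++
                i++
            }
            # Field i is the repository URI, the following field is the suite
            suite = $(i + 1)
            # Reduce buster/updates or buster-updates to buster
            sub("[/-].*", "", suite)
            print suite
        }
    ' "$@" | sort -u
}

# Example (not run here):
# list_suites /etc/apt/sources.list /etc/apt/sources.list.d/*.list
# More than one codename in the output suggests mixed repositories.
```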

@balexandrov
Author

balexandrov commented Jan 13, 2020

@MichaIng After running into these issues, I set out to eliminate side factors and tried to reproduce the problem on a fresh image. So that was a fresh image, downloaded the same day from https://dietpi.com and left to update itself on first boot. I'm not very sure what are Buster and Stretch versions.
The original SD card had an image that my son installed some time ago, and I'm not sure what he did there, so I started from a clean installation and managed to reproduce the problem with dd. I'm not sure why it allocates so much memory, but that replicates the freeze.

I haven't had time to test further these days, but I will probably follow your suggestion and just use another torrent client. I only have experience with Transmission but will try Deluge as you suggested. I only need basic torrent use: 20-30 movie torrents that I want to watch on a TV... My first disappointment was that the RPi 3B+ has only USB 2, and now this glitch... It took me 2 days of head-banging.

@MichaIng
Owner

@balexandrov

I'm not very sure what are Buster and Stretch versions.

"Buster" and "Stretch" are the codenames for the current stable and the previous stable (oldstable) Debian versions: https://en.wikipedia.org/wiki/Debian#Code_names
They come with different software package repositories, so that each package that you install via apt-get (what DietPi does as well in many cases) will pull the correct binary package that fits e.g. to library versions shipped by the same Debian version/repository. Hence mixing repositories can lead to package dependency conflicts or simply binaries which do not run or run incorrectly due to wrong/incompatible library versions.

Before doing any further debugging, I hence would first assure a clean and consistent DietPi image, with a single main APT repository in use, as shipped by our images by default.

@balexandrov
Author

balexandrov commented Jan 13, 2020

Before doing any further debugging, I hence would first assure a clean and consistent DietPi image, with a single main APT repository in use, as shipped by our images by default.

Thanks a lot for your support. Indeed, the list with Transmission 2.92 was from the old image. On the clean one I only did the minimal installation and the dd tests. There the offered package is 2.94.
Will now try Transmission on the clean one.

@balexandrov
Author

balexandrov commented Jan 13, 2020

Just tried: on the fresh SD card install I installed Transmission. No problems, 230/976 MB RAM used and no freezing.
Thanks!

@balexandrov
Author

Thanks @MichaIng @Joulinar for your support.
Just one last comment here. It turned out that the main issue is the preallocation of large torrent files (i.e. noticeable at >5GB) on exFAT (and ext4) file systems, which overloads the drive until the space is allocated. The problem is the same even on Windows. The solution is to use NTFS and voilà...!
Here is a similar issue:
qbittorrent/qBittorrent#8186

@MichaIng
Owner

@balexandrov
Interesting, I never heard of ext4 having any issues with this (and the qBittorrent issue mentions exFAT only). With exFAT you have the additional issue that currently all Mono-based software (Sonarr, Radarr, Lidarr, ...) cannot create any files on file systems that do not support UNIX permissions (exFAT, vFAT (FAT32) and actually NTFS as well): #3179
NTFS works when you install the ntfs-3g package and add the permissions mount option, which emulates UNIX permissions; both are done automatically when adding the drive with dietpi-drive_manager.

I also remember some other issues with high CPU usage in combination with exFAT: #3027 #3025
Linux 5.4 added native support for exFAT, let's hope that this improves the situation compared to the userspace drivers: https://kernelnewbies.org/Linux_5.4#EROFS_and_exFAT

@balexandrov
Author

balexandrov commented Jan 20, 2020

I've tested with an ext4-formatted drive and the effect was the same (but I tested it only once... will check again). The problem occurs with every external drive. On the SD card there is no such problem...
The issue is not the performance itself (good enough for USB2) but the preallocation of a large file: it seems that it tries to allocate it and fill it with zeroes. The drive's speed is about 25-30 MB/s on USB2, which means that for about 3 minutes per 5GB torrent the drive is at 100% load, and every program that tries to touch it freezes, waiting for it. This does not happen with sequential copying of files or with the benchmark. I had to test with a torrent client, Transmission and Deluge, and the effect is the same even on Windows.
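
The "about 3 minutes" figure follows from the numbers above. A quick check, assuming 5GB is taken as 5000 MB and the drive sustains the stated 25-30 MB/s; write_seconds is a hypothetical helper name:

```shell
# Hypothetical helper: seconds needed to write size_mb megabytes at rate_mb_s MB/s
# (integer division, so the result is rounded down).
write_seconds() {
    echo $(( $1 / $2 ))
}

write_seconds 5000 25   # prints 200, about 3.3 minutes
write_seconds 5000 30   # prints 166, about 2.8 minutes
```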

@MichaIng
Owner

MichaIng commented Jan 20, 2020

@balexandrov
Maybe qBittorrent uses a bad method to pre-allocate the space then. ext4 has a function for this which does not rely on writing zeros to the space (which in fact doubles the data written), and especially does not transfer every zero byte through USB: https://en.wikipedia.org/wiki/Ext4#Features
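
The difference between the two preallocation methods can be tried from the shell. This is a sketch only: preallocate is a hypothetical helper, it assumes the fallocate tool from util-linux and GNU dd, and nothing in this thread establishes which method Transmission or qBittorrent actually use on a given file system.

```shell
# Hypothetical helper: reserve size_mib MiB for a file and report the method used.
# fallocate asks the file system to reserve the blocks without sending the
# zeros over the bus; where it is missing or unsupported, fall back to
# writing the zeros with dd, which transfers every byte to the drive.
preallocate() {
    file=$1
    size_mib=$2
    if command -v fallocate >/dev/null 2>&1 &&
        fallocate -l "$(( size_mib * 1024 * 1024 ))" "$file" 2>/dev/null; then
        echo "fallocate"
    else
        dd if=/dev/zero of="$file" bs=1M count="$size_mib" 2>/dev/null
        echo "zero-fill"
    fi
}

# Example (not run here): compare the time taken on the external drive
# time preallocate /mnt/test.img 5000
```

If the call reports "zero-fill" on the external drive, a 5GB reservation costs the full write time discussed above.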

@ghost

ghost commented May 15, 2020

Raspberry Pi 4B running Manjaro ARM. Having this issue as well :(

@ghost

ghost commented May 15, 2020

Edit: yep, @balexandrov was totally right. The preallocation of large files is definitely the problem here.

Somehow Transmission is filling up the system memory when downloading files to an external HDD. I could literally see it rocket up to 4GB until ~200MB was left, which then causes everything to just freeze. Anything that tries to access the mount point will then freeze as well. Ughhh. I thought it was a problem with the filesystem driver but obviously it isn't. I believe I've just witnessed the exact same issue as @balexandrov.

Also some background:
RPi 4B 4GB, Manjaro Arm
Latest Transmission: transmission-daemon 2.94 (d8e60ee44f)
Latest OS: Linux pi 4.19.118-1-MANJARO-ARM #1 SMP PREEMPT Mon Apr 27 15:17:51 CDT 2020 aarch64 GNU/Linux

@ghost

ghost commented May 15, 2020

@balexandrov @MichaIng

My sincere apologies for creating so many notifications, but may I ask how you guys managed to fix this issue in the end? I've spent hours troubleshooting this, during which I had to do countless forced shutdowns on the external HDD, which is obviously going to damage its longevity. I'd really appreciate it if a workaround could be given.

@Joulinar
Collaborator

@JQ555888
I guess the solution was to use the NTFS file system instead of ext4. That's what was described by @balexandrov.

@ghost

ghost commented May 15, 2020

Thank you so much @Joulinar. My bad for totally missing that comment.

I have just switched from NTFS to exFAT and finally ext4 last month for better compatibility with Linux, but I guess I really need to change it back to NTFS then. It's kind of ironic that what I thought was the best turns out to be creating even more hassle since I had to install ext4 drivers on my Mac and Windows machines. But thanks again.

@balexandrov
Author

Yes, a few months later it works stably and without issues on NTFS and an RPi 3B+, downloading and serving about a hundred torrents to my TV with MiniDLNA.

@ghost

ghost commented May 15, 2020

Yep. Switching to NTFS and using ntfs-3g seems to have solved the issue. Huge thanks to @balexandrov and @Joulinar!

@MichaIng
Owner

@JQ555888 @balexandrov @Joulinar
Many thanks for pointing out the underlying issue and that it is present on Buster as well (or rather that a different issue affecting RAM usage is...). I'll do some research to find out whether this is known and possible to solve, e.g. whether pre-allocation can be disabled altogether. It generally doesn't make sense unless the file is extremely important (e.g. a swap file) or you run things at the limit of drive space and traffic plays a major role. Neither should be the case when running a torrent downloader basically unauthenticated. And doubled disk writes are not nice for any SD card or USB stick, or even just because of the often slow USB bus on SBCs, and should basically be avoided wherever much data is written (large downloads).
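
For Transmission specifically, the configuration file has a "preallocation" key (0 = off, 1 = fast, 2 = full, according to the Transmission configuration documentation). Whether switching it off avoids the freeze described in this thread has not been tested here, so the following is only a sketch of how one might try it. set_preallocation is a hypothetical helper, and the settings path in the usage comment is the Debian package default, assumed rather than confirmed for every DietPi install.

```shell
# Hypothetical helper: set the "preallocation" value in a Transmission settings.json.
# Writes through a temporary copy and then back into the original file, so the
# original file's owner and permissions stay as they were.
set_preallocation() {
    settings=$1
    mode=$2
    sed "s/\"preallocation\": *[0-9][0-9]*/\"preallocation\": $mode/" "$settings" \
        > "$settings.tmp" &&
        cat "$settings.tmp" > "$settings" &&
        rm -f "$settings.tmp"
}

# Example (not run here). The daemon rewrites settings.json when it exits,
# so it is stopped before the edit:
# systemctl stop transmission-daemon
# set_preallocation /etc/transmission-daemon/settings.json 0
# systemctl start transmission-daemon
```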

@balexandrov
Author

I want to add one more issue that played a role in this story. When downloading torrents at, say, more than 4 Mb/sec, Transmission's interface frequently locks up for 5-10-20 sec and mount.ntfs (the disk is NTFS now) consumes about 80% CPU time.

This was successfully remedied by adding "big_writes" to the mount line. It's now like this:
UUID=6AFA42D2FA4299E9 /mnt/datastore ntfs noatime,lazytime,rw,permissions,nofail,big_writes

@MichaIng Consider whether this option should be used by default. I'm not sure about its reliability, but in my case I'm on a UPS. There are more tweaks for NTFS but this is the essential one.
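
Adding the option to an existing comma-separated option list without duplicating it could look like this. A small sketch only: add_mount_opt is a hypothetical helper and is not how dietpi-drive_manager implements it.

```shell
# Hypothetical helper: append a mount option to a comma-separated option
# string unless it is already present.
add_mount_opt() {
    opts=$1
    new=$2
    case ",$opts," in
        *",$new,"*) printf '%s\n' "$opts" ;;
        *) printf '%s\n' "$opts,$new" ;;
    esac
}

add_mount_opt "noatime,lazytime,rw,permissions,nofail" big_writes
# prints noatime,lazytime,rw,permissions,nofail,big_writes
```

The resulting string goes into the fourth field of the fstab line, and the drive needs to be unmounted and mounted again for the change to apply.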

@MichaIng
Owner

MichaIng commented Jul 6, 2020

Many thanks for the hint. Little research: https://unix.stackexchange.com/a/544864

  • No known downsides
  • libfuse3 has this enabled by default
  • libfuse3 support planned for ntfs-3g by contributor

We need to add it 👍.

MichaIng added a commit that referenced this issue Jul 6, 2020
+ CHANGELOG | DietPi-Drive_Manager: For NTFS mounts, the "big_writes" mount option is now added by default, which reduces CPU load and may thereby increase performance. Many thanks to @balexandrov for suggesting this enhancement: #3330 (comment)
@MichaIng
Owner

MichaIng commented Jul 6, 2020

Done: 78dc1bb
Changelog: 2712e90

@MichaIng MichaIng mentioned this issue Aug 27, 2020