
[dmesg] ZFS: Unable to set "noop" scheduler #169

Closed

user318 opened this issue Mar 23, 2011 · 8 comments

user318 commented Mar 23, 2011

During mount, ZFS for some reason tries to set the noop scheduler on the partition, not the whole drive. In this case the whole drive was given to ZFS, and it created the partition on it.
The question is: why is it trying to set noop on the partition? Also, I made a couple of simple tests (moving some files around) while setting different schedulers on the whole drive, and during these tests noop was slower than deadline and cfq. So if this is case-dependent, I think it is preferable to let the user set the scheduler on their own.

behlendorf (Contributor) commented:

Are you using the 0.6.0-rc2 release? There was a bug like this which was fixed post-rc1. ZFS should try to set the scheduler, but it should only be doing it for whole block devices, never for partitions.

As for performance testing and results, if you have some I'd love to see them. In theory the noop scheduler should be best for ZFS since it schedules its own I/O; setting noop should just get you front and back merging without any additional overhead. However, this default value was chosen on theory, not practice, so if you have some hard numbers showing which choice is currently best, I'd love to see them.

Also, you can have ZFS set the scheduler of your choice when it loads by setting the zfs_vdev_scheduler module option. By default it is noop, but you can set it to any valid scheduler, or to "none" if you don't want it to change anything.
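For example (a minimal sketch; the exact modprobe.d path depends on your distribution):

# /etc/modprobe.d/zfs.conf -- persist the option across reboots
options zfs zfs_vdev_scheduler=deadline

# or pass it once when loading the module by hand
modprobe zfs zfs_vdev_scheduler=deadline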


user318 commented Mar 23, 2011

> Are you using the 0.6.0-rc2 release?

I used the current git source. As I understand it, that is 0.6.0-rc2 plus the latest patches.

I can redo my performance test later. For now I can just say that I put several copies of the ZFS sources and several ISO images on a ZFS filesystem and measured a "tar c" of them all.


user318 commented Mar 25, 2011

The latest patches seem to introduce a bit more instability, or maybe I stressed ZFS less before them. With a lot of I/O it hangs at some point without any notes in dmesg. A deadlock, maybe?
On the main ZFS testing machine it hangs a lot, so I had no time to complete the tests.
I made the test at home on another machine (Arch Linux with a 2.6.32 kernel).
The files look like this; it is several project sources with compiled binaries:

[root@cherry ~]# cd /pool/
[root@cherry pool]# df -h .
Filesystem            Size  Used Avail Use% Mounted on
pool                  4.0G  3.2G  849M  79% /pool
[root@cherry pool]# find . | wc -l
20089

I ran something like "du -shc", and after it my first test deadlocked too, so I needed to reboot (reset). Here is the test after the reset, with the deadline scheduler:

[root@cherry pool]# cat /sys/block/sdc/queue/scheduler 
noop anticipatory [deadline] cfq 
[root@cherry pool]# tar c . | pv >/dev/null
tar: .: file changed as we read it                 <=>                         ]
3.06GB 0:04:37 [11.3MB/s] [                     <=>                            ]
[root@cherry pool]# tar c . | pv >/dev/null
3.06GB 0:04:33 [11.5MB/s] [                         <=>                        ]
[root@cherry pool]# 

Also, I want to note the "tar: .: file changed as we read it" message. There was no other I/O on this partition; I think it is related to the unclean umount, but I thought ZFS could handle that without problems.
After this I tried to reboot. The reboot hung at some point, I think on umount, and I had to reset. Next, the test with the noop scheduler:

[root@cherry pool]# echo noop >/sys/block/sdc/queue/scheduler
[root@cherry pool]# tar c . | pv >/dev/null
tar: .: file changed as we read it             <=>                             ]
3.06GB 0:04:39 [11.2MB/s] [                   <=>                              ]
[root@cherry pool]# tar c . | pv >/dev/null
3.06GB 0:04:35 [11.4MB/s] [                       <=>                          ]
[root@cherry pool]# 

The "file changed" notice appears again. The speed is the same, maybe a bit slower. Another reboot (with reset), and now the test with cfq:

[root@cherry pool]# echo cfq  >/sys/block/sdc/queue/scheduler
[root@cherry pool]# tar c . | pv >/dev/null
tar: .: file changed as we read it                                             ]
3.06GB 0:03:15 [16.1MB/s] [   <=>                                              ]
[root@cherry pool]# tar c . | pv >/dev/null
3.06GB 0:03:10 [16.5MB/s] [        <=>                                         ]
[root@cherry pool]# 

And this is faster than both noop and deadline. (I needed to reset after this too.)


chani commented Mar 30, 2011

There's another problem with CFQ, which is currently also noticeable with software RAID on some kernels and seems to be a known bug. As soon as you mix a desktop system with software RAID and create high I/O (just try bonnie++, for example), you deadlock your desktop. The solution is switching to another scheduler (noop, deadline) or using cgroups; IIRC, I've seen a kernel option in 2.6.38 which solves this trouble.

So I just wanted to point out that you/some users might run into trouble with CFQ + ZFS as soon as you're using a desktop environment on the same machine. Also, I remember Brian said they're currently only working on getting it implemented; performance work will be done later.


rlaager commented Apr 4, 2011

I get this at startup:
[ 22.370757] ZFS: Unable to set "noop" scheduler for /dev/sda1 (sda): -14
[ 22.370775] ZFS: Unable to set "noop" scheduler for /dev/sdb1 (sdb): -14
[ 22.564358] ZFS: Unable to set "noop" scheduler for /dev/sdb1 (sdb): -14
[ 22.564368] ZFS: Unable to set "noop" scheduler for /dev/sda1 (sda): -14

First off, I'm not sure why it tries twice. But second, it's definitely trying to set it for the partition rather than the whole drive. I'm running 0.6.0.5-0ubuntu3~natty1 from the PPA.

behlendorf (Contributor) commented:

Do you still see this at startup with the current code?

The error here is a little misleading. When you initially configure ZFS, if you give it whole devices, it will automatically partition the device and create a GPT partition table. The first partition will be aligned exactly to the 1 MiB boundary to ensure correct 512/4096 sector alignment and to leave some headroom. In addition, the whole-disk property will be set. This property is then used when opening the device to determine whether it should attempt to adjust the elevator for the whole device. If ZFS owns the entire device, it attempts to do this and uses the correct device name, not the partition name. See vdev_elevator_switch() for the details.
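You can verify that layout from userspace (a quick sketch; /dev/sdc stands in for whichever disk backs the vdev):

# The first partition should start at sector 2048, i.e. 2048 * 512 B = 1 MiB
parted /dev/sdc unit s print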

Unfortunately, I've seen spurious EFAULT errors in the past for some reason when attempting to change the elevator, so it now retries three times and only emits an error on the third failure.
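To confirm the switch landed on the whole device after a pool import (again assuming /dev/sdc backs the vdev; the bracketed entry is the active scheduler):

cat /sys/block/sdc/queue/scheduler
[noop] anticipatory deadline cfq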


user318 commented Apr 23, 2011

I confirm: it does not complain about "unable to set noop" with the current code, and it sets noop on the whole device.
Can you say where the "whole device" flag is set? Is it in some internal on-disk structure of ZFS? I see no flags in the GPT and no flags in zpool get.
I'm just curious where it can be seen, and whether I can change it for some reason.

behlendorf (Contributor) commented:

The wholedisk property gets set as a key/value pair in a per-vdev configuration nvlist. If you're interested in how all of this happens, I would start by looking at cmd/zpool/zpool_vdev.c:make_leaf_vdev(). This is where it is determined whether the passed device is a whole disk, a partition, or a file, and the ZPOOL_CONFIG_WHOLE_DISK key is then set accordingly in the nvlist for future consumption.
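As for where it can be seen: the per-vdev config, including this flag, is visible with zdb (a sketch, assuming your pool is named "pool" and its vdev sits on /dev/sdc1):

# Dump the cached pool configuration; look for "whole_disk: 1" under the leaf vdev
zdb -C pool

# Or read the on-disk vdev label directly from the partition
zdb -l /dev/sdc1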
