[dmesg] ZFS: Unable to set "noop" scheduler #169
Comments
Are you using the 0.6.0-rc2 release? There was a bug like this which was fixed post-rc1. ZFS should try to set the scheduler, but it should only do so for whole block devices, never for partitions. As for performance testing/results, if you have some I'd love to see them. In theory the noop scheduler should be best for ZFS, since ZFS schedules its own IO; setting noop should just get you front and back merging without any additional overhead. However, this default was chosen based on theory, not practice, so if you have some hard numbers showing which choice is currently best, I'd love to see them. Also, you can have ZFS set the scheduler of your choice when it loads by setting the zfs_vdev_scheduler module option. By default it is noop, but you can set it to any valid scheduler, or to none if you don't want it changing anything.
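For reference, the module option mentioned above can be made persistent with a modprobe configuration fragment (the file name is just a convention; `deadline` here is only an example value):

```
# /etc/modprobe.d/zfs.conf -- scheduler ZFS applies to whole-disk vdevs
# at module load; "noop" is the default, "none" disables the change
options zfs zfs_vdev_scheduler=deadline
```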
I used the current git source. As I understand it, that is 0.6.0-rc2 plus the latest patches. I can redo my performance test later. For now I can just say that I put several copies of the zfs sources and several iso images on the zfs filesystem and measured a "tar c" of them all.
The latest patches seem to introduce a bit more instability, or maybe I stressed zfs less before. With lots of IO it hangs at some point without any notes in dmesg. Deadlock maybe?

```
[root@cherry ~]# cd /pool/
[root@cherry pool]# df -h .
Filesystem            Size  Used Avail Use% Mounted on
pool                  4.0G  3.2G  849M  79% /pool
[root@cherry pool]# find . | wc -l
20089
```

I ran something like "du -shc" and after it my first test deadlocked too, so I needed to reboot (reset).

```
[root@cherry pool]# cat /sys/block/sdc/queue/scheduler
noop anticipatory [deadline] cfq
[root@cherry pool]# tar c . | pv >/dev/null
tar: .: file changed as we read it
3.06GB 0:04:37 [11.3MB/s]
[root@cherry pool]# tar c . | pv >/dev/null
3.06GB 0:04:33 [11.5MB/s]
```

Also I want to note "tar: .: file changed as we read it". There was no other IO on this partition. I think it is related to the unclean umount, but I thought zfs could handle this without problems.

```
[root@cherry pool]# echo noop >/sys/block/sdc/queue/scheduler
[root@cherry pool]# tar c . | pv >/dev/null
tar: .: file changed as we read it
3.06GB 0:04:39 [11.2MB/s]
[root@cherry pool]# tar c . | pv >/dev/null
3.06GB 0:04:35 [11.4MB/s]
```

The change notice again. Speed is the same, maybe a bit slower. Another reboot (with reset). And now the test with cfq:

```
[root@cherry pool]# echo cfq >/sys/block/sdc/queue/scheduler
[root@cherry pool]# tar c . | pv >/dev/null
tar: .: file changed as we read it
3.06GB 0:03:15 [16.1MB/s]
[root@cherry pool]# tar c . | pv >/dev/null
3.06GB 0:03:10 [16.5MB/s]
```

And this is faster than noop and deadline. (I need to reset after this too.)
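As an aside, the bracketed entry in those `/sys/block/<dev>/queue/scheduler` files marks the currently active scheduler; a tiny helper like this (purely illustrative, not part of ZFS) can extract it:

```shell
# Print the active I/O scheduler, i.e. the bracketed entry in a
# /sys/block/<dev>/queue/scheduler line fed on stdin.
current_sched() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Example with the line from the transcript above:
echo "noop anticipatory [deadline] cfq" | current_sched   # prints: deadline
```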
There's another problem with CFQ, which is currently also noticeable with software raid on some kernels and seems to be a known bug. As soon as you mix a desktop system with software raid and create high i/o (just try bonnie++, for example), you're deadlocking your desktop. The solution is switching to another scheduler (noop, deadline) or using cgroups; iirc I've seen a kernel option in 2.6.38 which solves this trouble. So, I just wanted to point out that you/some users might have trouble with CFQ + ZFS as soon as you're using a desktop environment on the same machine. Also I remember that Brian said they're currently only working on getting it implemented; performance work will be done later.
I get this at startup: First off, I'm not sure why it tries twice. But second, it's definitely trying to set it for the partition rather than the whole drive. I'm running 0.6.0.5-0ubuntu3~natty1 from the PPA. |
Do you still see this at start-up with the current code? The error here is a little misleading. When you initially configure zfs, if you give it whole devices, it will automatically partition the device and create a GPT partition table. The first partition will be aligned exactly to the 1 MiB boundary, to ensure correct 512/4096 sector alignment and to leave some headroom. In addition, the whole disk property will be set. This property is then used when opening the device to determine whether it should attempt to adjust the elevator for the whole device. If it owns the entire device it attempts to do this, and it will use the correct device name, not the partition name. See vdev_elevator_switch() for the details. Unfortunately, I've seen spurious EFAULT errors in the past when attempting to change the elevator, so it now retries three times and only emits an error on the third failure.
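The retry behaviour described above can be pictured with a rough shell analogue; the real logic is C inside vdev_elevator_switch(), and the function name here is made up for illustration:

```shell
# Try up to three times to set an I/O scheduler on a device, mirroring
# the retry-on-spurious-failure behaviour described above.
# Illustrative sketch only -- not the actual ZFS implementation.
set_elevator() {
    dev="$1"; sched="$2"
    attempt=1
    while [ "$attempt" -le 3 ]; do
        # Redirect stderr first so a failed redirection stays quiet.
        if echo "$sched" 2>/dev/null > "/sys/block/$dev/queue/scheduler"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "unable to set '$sched' scheduler for $dev after 3 attempts" >&2
    return 1
}
```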
I confirm. It does not complain about "unable to set noop" with the current code. And it sets noop on the whole device.
The wholedisk property gets set as a key/value pair in a per-vdev configuration nvlist. If you're interested in how all of this happens, I would start by looking at cmd/zpool/zpool_vdev.c:make_leaf_vdev(). It is here that the passed device is determined to be a whole disk, partition, or file. The ZPOOL_CONFIG_WHOLE_DISK key is then set accordingly in the nvlist for future consumption.
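To give a feel for that disk/partition/file classification, here is a hypothetical shell approximation based on Linux sysfs; the real check in make_leaf_vdev() is more careful and this helper name is invented for the example:

```shell
# Roughly classify a vdev path as a whole disk, a partition, or a
# file-backed vdev. Loosely mirrors the distinction drawn in
# make_leaf_vdev(); illustrative only, not the actual ZFS code.
vdev_kind() {
    name=$(basename "$1")
    if [ -e "/sys/block/$name" ]; then
        echo disk        # whole device, e.g. /dev/sdc
    elif [ -e "/sys/class/block/$name" ]; then
        echo partition   # e.g. /dev/sdc1
    elif [ -f "$1" ]; then
        echo file        # file-backed vdev
    else
        echo unknown
    fi
}
```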
During mount, zfs for some reason tries to set the noop scheduler on the partition, not the whole drive. In this case the whole drive was given to zfs, and it created a partition on it.
The question is: why is it trying to set noop on the partition? Also, I've run a couple of simple tests (moving some files around) while setting different schedulers on the whole drive, and during these tests noop was slower than deadline and cfq. So if this is case-dependent, I think it is preferable for users to set the scheduler on their own.