(I think this bug applies to all ZFS platforms, not just ZFS On Linux.)
There is a legitimate use case for having vdevs of unequal size in a pool, if the user only has a heterogeneous set of disks on hand. It sounds like ZFS should be able to handle this just fine, but in practice it doesn't: such a pool will become a ticking time bomb that will start behaving miserably as soon as a significant amount of data is poured into it.

Steps to reproduce:
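A minimal setup along these lines should do it (the backing-file locations and sizes below are my own illustrative guesses, chosen only so that d1 is about 5 times smaller than d2):

$ truncate -s 128M /var/tmp/d1      # small vdev (size is an illustrative guess)
$ truncate -s 640M /var/tmp/d2      # large vdev, 5x the size of d1
$ zpool create testspace /var/tmp/d1 /var/tmp/d2   # pool mounts at /testspace by default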
$ dd if=/dev/zero of=/testspace/d2 bs=1M count=384 # write 384M to pool

So far so good. Here's the ticking time bomb:
d1 is 5 times smaller than d2. So one might reasonably expect ZFS to do The Right Thing and simply allocate 76M on d1 and 308M on d2, keeping both vdevs at the same usage ratio (~60% in this case). Unfortunately, we are disappointed: ZFS instead skews the allocation heavily toward d1, filling it almost completely while d2 still has plenty of free space.
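The per-vdev split is easy to check; either of these gives a per-vdev breakdown (pool name as in the sketch above):

$ zpool list -v testspace     # per-vdev size, allocated and free space
$ zpool iostat -v testspace   # per-vdev allocation plus I/O counters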
It all goes downhill from there. Performance goes out the window, as most future writes end up on the single vdev that still has space left, d2. Adding insult to injury, ZFS desperately persists in trying to allocate blocks from d1 despite the fact that it's already 97% full, which is crazy and results in huge slowdowns because the allocator is struggling to find free space on d1.
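A simple way to watch both symptoms live, assuming the toy pool sketched above (the output file name is arbitrary): keep writing in the background while watching where the writes land:

$ dd if=/dev/zero of=/testspace/more bs=1M count=128 &
$ zpool iostat -v testspace 1   # refresh per-vdev write activity once a second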
This behavior is highly surprising, and I strongly believe it should be considered a bug. A quick Google search reveals no warnings about using heterogeneous vdevs in a zpool, so the user will probably only discover the problem when it's already too late, since the issues only start cropping up after the pool is populated with a large amount of data.
I suspect the problem lies in the metaslab allocator bias code. The comments on that code suggest it is supposed to equalize vdev utilization, but that only seems to work out for the "new empty vdev" use case, not for the "unequal vdev capacities" use case. My guess is that the math used in that code doesn't actually work out in the long term when vdev capacities differ.
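For reference, the code I mean lives in metaslab.c; the path and field name below are from the ZFS on Linux tree as far as I recall, so adjust for your platform:

$ grep -n -B5 -A5 'mg_bias' module/zfs/metaslab.c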