Henryk Paluch edited this page Dec 9, 2023 · 3 revisions

BTRFS

BTRFS is an open-source alternative to ZFS that was originally developed at Oracle. (ZFS was open-sourced by Sun under the CDDL license, but Oracle later closed its development, so there is friction and doubt about whether it is legally "safe" to use.)

Benefits:

  • open source, originally developed at Oracle
  • included in the standard Linux kernel (no license issues)
  • very easy management and mounting of sub-volumes (unlike ZFS, where you have to export/import and use complex commands when you just want to temporarily mount some ZFS volume elsewhere).

Problems:

  • MySQL performance is very slow (see "MySQL on BTRFS in depth" below)
  • free space accounting is problematic (you need to enable quota groups, "qgroups", but enabling them slows down the whole filesystem once you have more than a few snapshots...)
    • by contrast, ZFS always properly reports both sub-volume usage and fragmentation level
  • may corrupt the filesystem when the disk is full
  • reported free space may not be usable due to fragmentation, so the filesystem may become full even while free space is still reported.

MySQL on BTRFS in depth

Main article that shows that there is a problem:

I decided to repeat my trivial MySQL benchmarks with test-ATIS, this time under Ubuntu 22.04 LTS (a VM inside Proxmox VE 8.1.3, lvm-thin on a Seagate IronWolf 4TB, cache=unsafe, discard=on).

Here are brief results for test-ATIS with package mariadb-server version 10.6.12-0ubuntu0.22.04.1 and kernel 5.15.0-52-generic:

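| Filesystem | Options | Time (seconds, less is better) |
|------------|---------|--------------------------------|
| ext4 | defaults | 23 |
| btrfs | defaults | 32 |
| btrfs | nocow | 28 |
| btrfs | nobarrier | 30 |
| btrfs | nocow,nobarrier | 30 |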
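
To make the gap to ext4 easier to read, the timings can be turned into relative slowdowns. This is only a small arithmetic sketch: the numbers are copied from the table above, and the helper name `slowdown_percent` is made up for illustration.

```python
# Timings copied from the table above (seconds, lower is better).
timings = {
    "ext4 defaults": 23,
    "btrfs defaults": 32,
    "btrfs nocow": 28,
    "btrfs nobarrier": 30,
    "btrfs nocow,nobarrier": 30,
}


def slowdown_percent(seconds, baseline):
    """Return how much slower `seconds` is than `baseline`, in percent."""
    return (seconds / baseline - 1.0) * 100.0


baseline = timings["ext4 defaults"]
for name, seconds in timings.items():
    # ext4 itself comes out as +0.0 %, every btrfs variant as a positive value
    print(f"{name:24s} {seconds:3d} s  {slowdown_percent(seconds, baseline):+5.1f} %")
```

With these numbers, btrfs with default options is roughly 39 % slower than ext4, and the best btrfs variant (nocow) is still roughly 22 % slower.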

There are some things that really puzzle me (why is nobarrier slower?).

However, I found a thread that recommends these settings:

innodb_doublewrite = 0
innodb_flush_method = O_DSYNC

Here is my /etc/mysql/mariadb.conf.d/99-local.cnf:

[mysqld]
datadir             = /mnt/btrfs/data1
#datadir            = /mnt/btrfs/data2nocow
innodb_doublewrite  = 0
innodb_flush_method = O_DSYNC

Remember to always verify such settings with these SQL commands:

show variables like 'innodb_doublewrite';
show variables like 'innodb_flush_method';
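
The same check can be scripted so it is not forgotten before a benchmark run. The following is a minimal sketch, not a definitive tool: it assumes the `mariadb` command-line client is installed and can log in without further options, and the function names (`parse_variables`, `read_innodb_variables`, `check_settings`) are hypothetical. Note that a boolean setting such as `innodb_doublewrite = 0` is normally displayed by the server as `OFF`.

```python
import subprocess


def parse_variables(output):
    """Parse tab-separated `name<TAB>value` lines into a dict."""
    result = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition("\t")
        result[name.strip()] = value.strip()
    return result


def read_innodb_variables(client="mariadb"):
    """Ask the running server for its innodb_* variables.

    Assumption: the client can connect with its default credentials.
    -N skips the column header, -B gives tab-separated batch output,
    -e runs the given statement.
    """
    completed = subprocess.run(
        [client, "-N", "-B", "-e", "SHOW VARIABLES LIKE 'innodb_%'"],
        check=True, capture_output=True, text=True,
    )
    return parse_variables(completed.stdout)


def check_settings(actual, expected):
    """Return {name: (expected, actual)} for every variable that differs."""
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if (actual.get(name) or "").lower() != want.lower()
    }


# Values the config file above should produce on the server.
expected = {"innodb_doublewrite": "OFF", "innodb_flush_method": "O_DSYNC"}

# Hypothetical sample of client output, used here instead of a live server.
sample = "innodb_doublewrite\tOFF\ninnodb_flush_method\tO_DSYNC\n"
print(check_settings(parse_variables(sample), expected))
```

Against a live server, `check_settings(read_innodb_variables(), expected)` should return an empty dict when both settings took effect.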

Here are the results:

| Filesystem | Options | Time (seconds, less is better) |
|------------|---------|--------------------------------|
| ext4 | defaults | 23 |
| btrfs | defaults | 32 |
| btrfs | defaults + innodb tuning | 34 |

Hmm, worse.

But here is something else I tried. Keep in mind that with this setting you will lose data after an unclean shutdown!

Here is my /etc/mysql/mariadb.conf.d/99-local.cnf:

[mysqld]
datadir             = /mnt/btrfs/data1
innodb_doublewrite  = 0
# NEVER USE nosync IN PRODUCTION!
innodb_flush_method = nosync

And now it beats ext4 (comparing against ext4 with sync is unfair, of course):

| Filesystem | Options | Time (seconds, less is better) |
|------------|---------|--------------------------------|
| ext4 | defaults | 23 |
| ext4 | defaults + innodb nosync | 15 |
| btrfs | defaults | 32 |
| btrfs | defaults + innodb tuning | 34 |
| btrfs | defaults + innodb nosync | 14 |

The difference is so striking that I also tried the same settings under ext4. Of course: NEVER USE THIS IN PRODUCTION!
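
A rough way to read the table above is to treat the difference between the default run and the nosync run as the time spent on syncing. This is only an estimate under that assumption: the nosync configuration shown here also disables the doublewrite buffer, so part of the difference may come from that. The helper name `sync_cost` is made up for this sketch.

```python
# Timings from the table above (seconds): (defaults, defaults + innodb nosync)
runs = {
    "ext4": (23, 15),
    "btrfs": (32, 14),
}


def sync_cost(with_sync, without_sync):
    """Return (seconds attributed to syncing, share of the synced run in percent)."""
    cost = with_sync - without_sync
    return cost, cost / with_sync * 100.0


for fs, (with_sync, without_sync) in runs.items():
    cost, share = sync_cost(with_sync, without_sync)
    print(f"{fs:5s} sync cost: {cost:2d} s ({share:.0f} % of the run)")
```

Under this reading, syncing accounts for about 8 seconds on ext4 but about 18 seconds on btrfs, which is more than half of the btrfs run.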

So the summary is actually optimistic: btrfs is not slow by design, but its fsync(2) and related calls are suboptimal so far.
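
To look at the fsync(2) cost in isolation, without MariaDB in the picture, a tiny micro-benchmark can help. This is a sketch only, not the benchmark used above: the function name `timed_writes`, the block size, and the write count are arbitrary assumptions, and the numbers it prints depend heavily on the VM cache settings and the underlying disk.

```python
import os
import tempfile
import time


def timed_writes(directory, count=200, size=4096, sync=True):
    """Write `count` blocks of `size` bytes into a temporary file in `directory`.

    When `sync` is true, fsync is called after every write.
    Returns the elapsed time in seconds.
    """
    block = b"\0" * size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            if sync:
                os.fsync(fd)  # force the data to stable storage
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Replace with a directory on the filesystem under test, e.g. a btrfs mount.
    target = tempfile.gettempdir()
    print("with fsync:   ", timed_writes(target, sync=True))
    print("without fsync:", timed_writes(target, sync=False))
```

Running it once against a directory on ext4 and once against a directory on btrfs gives a crude picture of how much each filesystem pays per fsync call.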
