btrfs stats collector #1100

Closed
alevchuk opened this issue Oct 6, 2018 · 17 comments · Fixed by #1512

Comments

@alevchuk

alevchuk commented Oct 6, 2018

Enhancement

Please develop a collector of BTRFS stats. They look like this:

$ sudo btrfs dev stats /mnt/btrfs/
[/dev/mmcblk0p3].write_io_errs   0
[/dev/mmcblk0p3].read_io_errs    0
[/dev/mmcblk0p3].flush_io_errs   0
[/dev/mmcblk0p3].corruption_errs 0
[/dev/mmcblk0p3].generation_errs 0
[/dev/sda].write_io_errs   72430
[/dev/sda].read_io_errs    76151
[/dev/sda].flush_io_errs   61
[/dev/sda].corruption_errs 0
[/dev/sda].generation_errs 0

Documentation: https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-device#DEVICE_STATS
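For illustration only, the output format shown above can be parsed into per-device counters with a few lines of Python. This is a minimal sketch that assumes the text has already been captured somewhere; the function name `parse_dev_stats` and the `SAMPLE` constant are hypothetical, and a collector would want to read the counters from the kernel directly rather than scrape CLI output.

```python
import re
from collections import defaultdict

# Matches lines such as "[/dev/sda].write_io_errs   72430".
_STAT_LINE = re.compile(
    r"^\[(?P<device>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$"
)


def parse_dev_stats(text):
    """Parse `btrfs dev stats` style output into {device: {counter: value}}."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        match = _STAT_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything unrecognised
        stats[match["device"]][match["counter"]] = int(match["value"])
    return dict(stats)


# Sample taken from the /dev/sda lines in the report above.
SAMPLE = """\
[/dev/sda].write_io_errs   72430
[/dev/sda].read_io_errs    76151
[/dev/sda].flush_io_errs   61
[/dev/sda].corruption_errs 0
[/dev/sda].generation_errs 0
"""

if __name__ == "__main__":
    for device, counters in parse_dev_stats(SAMPLE).items():
        print(device, counters)
```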

@SuperQ
Member

SuperQ commented Oct 7, 2018

Do you know where in /proc or /sys this information is available?

@hhoffstaette
Contributor

hhoffstaette commented Oct 7, 2018

I previously looked into writing an exporter for btrfs but got sidetracked. The information exposed in /sys/fs/btrfs/ is quite exhaustive, but AFAICT the device stats/errors as shown above are currently (as of 4.18.12) not exposed in sysfs and would require ioctls.
I vaguely remember seeing some patches to expose them in sysfs but just checked again and it seems they never made it upstream.
The easiest way to accomplish this - and everything else - will probably be to use python-btrfs to make a standalone btrfs_exporter in python instead of recreating all the kernel/userspace/ioctl mappings in golang.
I will raise the issue on the btrfs list and see if I can scrape together willing participants.

@SuperQ
Member

SuperQ commented Oct 7, 2018

I see a few Go libraries for handling btrfs. They seem focused around sub volume management.

Just a reminder about implementation requirements here:

  • No subprocesses.
  • No extra privileges.

@hhoffstaette
Contributor

> Just a reminder about implementation requirements here:
>
>   • No subprocesses.
>   • No extra privileges.

At least the dev stats/error information currently requires privileged access.

@kdave

kdave commented Oct 8, 2018

Reading device stats does not require root, the only part that does is the reset. https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git/tree/fs/btrfs/ioctl.c?h=v4.18#n4692

@hhoffstaette
Contributor

hhoffstaette commented Oct 8, 2018

Hi David,

> Reading device stats does not require root, the only part that does is the reset.

...and yet it doesn't work for me on 4.18.12 with btrfs-progs v4.17.1. End of the strace output:

openat(AT_FDCWD, "/mnt/backup", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 4
fstat(4, {st_mode=S_IFDIR|0755, st_size=276, ...}) = 0
ioctl(4, BTRFS_IOC_FS_INFO, {max_id=1, num_devices=1, fsid=d163af2f-6e03-4972-bfd6-30c68b6ed312, nodesize=16384, sectorsize=4096, clone_alignment=4096}) = 0
ioctl(4, BTRFS_IOC_TREE_SEARCH, {key={tree_id=BTRFS_CHUNK_TREE_OBJECTID, min_objectid=BTRFS_ROOT_TREE_OBJECTID, max_objectid=BTRFS_ROOT_TREE_OBJECTID, min_offset=1, max_offset=UINT64_MAX, min_transid=0, max_transid=UINT64_MAX, min_type=BTRFS_DEV_ITEM_KEY, max_type=BTRFS_DEV_ITEM_KEY, nr_items=30}}) = -1 EPERM (Operation not permitted)
close(4)                                = 0
write(2, "ERROR: ", 7ERROR: )                  = 7
write(2, "getting device info for /mnt/bac"..., 67getting device info for /mnt/backup failed: Operation not permitted) = 67
write(2, "\n", 1

Both btrfs_ioctl_tree_search{_v2} unconditionally check for CAP_SYS_ADMIN.
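To see whether a process would pass such a check, its effective capability mask can be inspected. The sketch below parses the `CapEff` field of a `/proc/<pid>/status` dump and tests bit 21, which is `CAP_SYS_ADMIN` in `<linux/capability.h>`. The function names are hypothetical, and this only reports the capability state; it does not reproduce the ioctl call itself.

```python
CAP_SYS_ADMIN = 21  # bit index of CAP_SYS_ADMIN in <linux/capability.h>


def has_cap_sys_admin(status_text):
    """Return True if the CapEff mask in a /proc/<pid>/status dump
    has the CAP_SYS_ADMIN bit set."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)  # the mask is printed as hex
            return bool(mask & (1 << CAP_SYS_ADMIN))
    raise ValueError("no CapEff line found")


def current_process_has_cap_sys_admin():
    """Convenience wrapper for the calling process (Linux only)."""
    with open("/proc/self/status") as handle:
        return has_cap_sys_admin(handle.read())
```

Calling `current_process_has_cap_sys_admin()` as an ordinary user would normally return `False`, which matches the situation in which the tree search ioctl above fails with EPERM.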

@kdave

kdave commented Oct 8, 2018

Something is calling the search tree ioctl that's not accessible. btrfs dev stats /mnt/path works for me here.

@hhoffstaette
Contributor

hhoffstaette commented Oct 8, 2018

For the peanut gallery:

I tracked the privilege issue down to a difference in behaviour of btrfs-progs when querying a mountpoint vs. querying a device (and can now reproduce it reliably).
The mailing list thread has it all, but the tl;dr is that getting stats directly from a single device works without CAP_SYS_ADMIN, whereas querying the mount point of a filesystem - even if it has only a single device - will fail without privileges.

However, invoking the ioctl programmatically on a mount point will work (here /tmp/test is loop0):

$ python3.6
Python 3.6.6 (default, Oct  1 2018, 11:15:11) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import btrfs
>>> fs = btrfs.FileSystem("/tmp/test")
>>> btrfs.utils.pretty_print(fs.dev_stats(1)) 
devid 1 write_errs 0 read_errs 0 flush_errs 0 generation_errs 0 corruption_errs 0

So at least for accessing the device stats/errors programmatically no privilege is necessary
after all. Mystery solved!

@Andysimcoe

It would be useful if it could expose metrics relating to the usage too:

data_ratio: 1.00
device_allocated: 21.02GiB
device_missing: 0.00B
device_size: 50.00GiB
device_unallocated: 28.98GiB
free_estimated: 33.76GiB
free_estimated_min: 19.27GiB
global_reserve: 88.55MiB
global_reserve_used: 0.00B
metadata_ratio: 2.00
used: 12.02GiB
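
Prometheus metrics are usually exposed in base units, so human-readable values like these would need converting to bytes. The following is a minimal sketch; `parse_size` is a hypothetical helper, and because the displayed values are already rounded to two decimals, the result is only approximate. A real collector would be better off reading raw byte counts.

```python
import re

# Binary (IEC) unit multipliers as used in the listing above.
_UNITS = {
    "B": 1,
    "KiB": 2**10,
    "MiB": 2**20,
    "GiB": 2**30,
    "TiB": 2**40,
    "PiB": 2**50,
}

_SIZE = re.compile(r"(\d+(?:\.\d+)?)\s*([KMGTP]iB|B)")


def parse_size(text):
    """Convert a string such as '21.02GiB' to an integer byte count."""
    match = _SIZE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return round(float(number) * _UNITS[unit])


if __name__ == "__main__":
    for raw in ("50.00GiB", "88.55MiB", "0.00B"):
        print(raw, "->", parse_size(raw))
```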

@vmorris

vmorris commented Nov 19, 2018

+1 for @Andysimcoe btrfs metrics list.

Any idea when this might start to emerge?

@SuperQ
Member

SuperQ commented Nov 19, 2018

Any additional procfs/sysfs parsing should go into https://github.com/prometheus/procfs. Once we have parsing in the library, we can add it to the exporter.

@antonagestam

antonagestam commented Dec 3, 2018

After having a btrfs system crash because the metadata allocation ran full, I'd suggest also exposing the data from btrfs fi df /mnt/. Is that available through python-btrfs?

Update: this looks like what I'm looking for: https://github.com/knorrie/python-btrfs/blob/master/examples/btrfs-fi-df.py

I might look into creating a separate exporter for this.
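
A rough sketch of how metadata fill level could be watched without any ioctl, assuming the per-filesystem sysfs directory (for example `/sys/fs/btrfs/<fsid>`) provides `allocation/{data,metadata,system}/total_bytes` and `bytes_used` files. That layout is an assumption here and should be checked against the running kernel; the function names are hypothetical.

```python
from pathlib import Path


def read_allocation(fs_dir):
    """Read total/used bytes per block group type below fs_dir.

    fs_dir is expected to look like /sys/fs/btrfs/<fsid> (assumed layout).
    Types whose files are missing or unreadable are skipped.
    """
    result = {}
    for kind in ("data", "metadata", "system"):
        base = Path(fs_dir) / "allocation" / kind
        try:
            result[kind] = {
                name: int((base / name).read_text().strip())
                for name in ("total_bytes", "bytes_used")
            }
        except (OSError, ValueError):
            continue
    return result


def usage_ratio(entry):
    """Fraction of the allocated space that is in use (0.0 if empty)."""
    if entry["total_bytes"] == 0:
        return 0.0
    return entry["bytes_used"] / entry["total_bytes"]
```

A metadata ratio close to 1.0 would be the condition worth alerting on in the crash scenario described above.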

@hhoffstaette
Contributor

hhoffstaette commented Dec 3, 2018

Update on this:

> The easiest way to accomplish this - and everything else - will probably be to use python-btrfs (https://github.com/knorrie/python-btrfs) to make a standalone btrfs_exporter in python instead of recreating all the kernel/userspace/ioctl mappings in golang.

I started a standalone btrfs_exporter based on the official python client and python-btrfs, but stopped because it was horrible to develop. I don't really know python and the official client library is not only super weird and fragile, it also leaks memory. It took me less than three days to get somewhere in C++ (using prometheus-cpp) and I already have more working than before, plus it is faster, doesn't leak memory and uses 1/10th the disk space/memory.
No timeline yet since I just got dragged into a surprise work contract.

@antonagestam

@hhoffstaette Which is the official python client?

@hhoffstaette
Contributor

> @hhoffstaette Which is the official python client?

https://github.com/prometheus/client_python

@discordianfish
Member

See #1200; apparently some stats are available in procfs. Ideally somebody would add support for that to procfs and we could have a 'real' collector.

silkeh added a commit to silkeh/node_exporter that referenced this issue Feb 19, 2020
Resolves prometheus#1100

Signed-off-by: Silke Hofstra <silke@slxh.eu>
SuperQ pushed a commit that referenced this issue Feb 19, 2020
* Add procfs/btrfs to vendor folder
* Add Btrfs collector

Resolves #1100

Signed-off-by: Silke Hofstra <silke@slxh.eu>
@leth
Contributor

leth commented Oct 7, 2022

I'm happy to report that this was added in #2193 and was released in 1.4.0 / 2022-09-24 😄
Please let me know if you find any issues!

oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this issue Apr 9, 2024
* Add procfs/btrfs to vendor folder
* Add Btrfs collector

Resolves prometheus#1100

Signed-off-by: Silke Hofstra <silke@slxh.eu>

9 participants