zsysctl service gc doesn't clean up bpool #155
Thanks for reporting this bug and for the logs. As a side note, I see you are using docker. We are in discussion with the docker team to change the default layout so that we are not using snapshots. I strongly suggest you follow the instructions on https://github.com/ubuntu/zsys/wiki/Performance-issue-with-docker-on-ubuntu-20.04-LTS with the docker daemon stopped. You can then also remove some stopped containers, as docker is filling up many datasets.
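For reference, pruning stopped containers with the standard docker CLI looks like this (generic docker commands, not taken from the wiki page above):

```bash
# Remove all stopped containers (asks for confirmation first).
docker container prune
# Optionally drop dangling images as well.
docker image prune
```
|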
I'll take a look at the docker change. Here's the /boot usage:
|
Thank you for the docker script. Looks like I can't run it out of the box (ifs/greps/etc don't match up to my system) but I can scavenge some of the purging logic from it. |
I have experienced this too. In my case, during system setup an ansible playbook makes several calls to apt. As the initrd gets rebuilt multiple times, with each version getting a new snapshot, I quickly ran out of space on my 1 GB bpool even with only two kernels. (I can't remember if I manually set that size or if it was the automatic partitioning from the 20.04 installer.) |
Yes, the following is the output of update-grub for just 4 kernels (one of which has already been purged) and usage of just a couple of weeks. It looks really huge to me:
|
I'm having this same problem.
|
I'm having the same issue.
When I run
but all those kernels have already been removed. |
Try removing states with zsysctl; it worked for me.
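The exact commands were lost in this copy of the thread; the pattern, matching the `zsysctl state remove` invocations quoted later in this thread (the state ID is a placeholder), is:

```bash
zsysctl show                        # list system states and their autozsys IDs
zsysctl state remove -s STATE_ID    # remove one system state and its user states
```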
|
It didn't work for me.
|
Sorry, the right part of the
You can list the boot snapshots anytime with
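One standard way to list them with plain zfs (not zsys-specific) is:

```bash
zfs list -t snapshot -r bpool
```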
|
@sdelrio Thanks! Now it worked. I had to remove more than 10 states, but now I have plenty of free space with only a few kernels installed. |
Here are the log lines that will probably bring people here to fix this issue:
If you update your zfs-on-root ubuntu system often (which it looks like all of us do), you'll probably run into this issue. The solution is what @sdelrio recommends:
|
if it helps, I modified the /etc/zsys.conf
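The modified file did not survive this copy; a minimal sketch of the relevant key, matching the `history`/`keeplast` snippet quoted below in this thread (the value is illustrative):

```yaml
history:
  keeplast: 5   # keep only the 5 most recent saved states
```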
|
I'm still somewhat confused. I did the @sdelrio + @stephen-mw recipe to remove unnecessary intermediate "states" and ended up with:

```
# du -sh /boot
241M    /boot
# zfs list -t all -r bpool -o space
NAME                                       AVAIL  USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
bpool                                       955M  837M        0B     96K             0B       837M
bpool@mksnapshot-focal                         -    0B         -       -              -          -
bpool/BOOT                                  955M  836M        0B     96K             0B       836M
bpool/BOOT@focal                               -    0B         -       -              -          -
bpool/BOOT@mksnapshot-focal                    -    0B         -       -              -          -
bpool/BOOT/ubuntu_4wysfp                    955M  836M      618M    217M             0B         0B
bpool/BOOT/ubuntu_4wysfp@focal                 -   64K         -       -              -          -
bpool/BOOT/ubuntu_4wysfp@mksnapshot-focal      -   64K         -       -              -          -
bpool/BOOT/ubuntu_4wysfp@autozsys_ewktga       -   72K         -       -              -          -
bpool/BOOT/ubuntu_4wysfp@autozsys_sp4q3t       -   72K         -       -              -          -
```

These seem close to each other (zfs has some kind of bookkeeping overhead). Did this "work"? How do I know? Will I have to do this exercise again in a few months? Is /etc/zsys.conf "incremental"? Meaning, if my only change is history.keeplast: 5, can I state just that, i.e.:

```yaml
history:
  keeplast: 5
```

I understand you're not responsible for my lack of zfs + zsys knowledge, but I get a little squeamish when I get "boot"-related error messages because I don't understand grub + initrd + kernels all that much better. |
Exactly: zfs will only keep the changes. If you only want to change one value, you should take the default config and modify that value. Take a look at https://didrocks.fr/2020/06/04/zfs-focus-on-ubuntu-20.04-lts-zsys-state-collection/ for the default policy. So if your /boot has small changes, you could keep a lot of
Depending on the case you may need to update these values so you can get more snapshots, or fewer. |
In case it is useful, here are a few more details about how it occurred in my case. I am using a new install of Ubuntu 20.04. This is my first experience with zfs and I hadn't made any changes to the default zfs configuration, or issued any zfs commands at all, for that matter.
When I noticed the "ERROR couldn't save system state: Minimum free space to take a snapshot..." errors, my system had 3 kernels installed in
I've attached the output from

I was able to recover disk space using the commands that @sdelrio posted. I just kept deleting the oldest snapshot. After deleting 18 snapshots, the size of the filesystem, as reported by |
The default minfreepoolspace is 20% and you have 81% usage on /boot; perhaps you need to remove old kernels, or your 364 MB size for that partition is simply really low. I have 3 kernels and manually compiled vga drivers, and I have 353 MB used.
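Working those numbers through (the 364 MB and 20% figures are from this comment):

```
20% threshold: 0.20 x 364 MB = ~73 MB must stay free to take a snapshot
81% used:      0.19 x 364 MB = ~69 MB actually free, which is below the
               threshold, so zsys refuses to save a new state
```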
|
I may be just re-stating what @sdelrio already said (thanks!), but putting all of this together, it looks to me like there is an incompatibility between the default size of the

Let's consider the worst case for bpool in terms of state storage. I think the worst case would be that every time zsys stores a new state, there is a new kernel in
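A rough back-of-the-envelope version of that worst case (all numbers are assumptions, not measurements):

```bash
# Every kept state pins a different kernel/initrd set, so nothing is shared.
kernel_set_mb=110   # assumed size of one vmlinuz + initrd + System.map + config
states_kept=20      # assumed retention count
echo "$(( kernel_set_mb * states_kept )) MB"   # 2200 MB, far beyond a 1-2 GB bpool
```
|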
Just came across this problem on 20.10. |
I am also getting this message. I installed using the default ZFS settings in the Ubuntu installer on the entire disk, and it created a small partition to store the bpool. It should either make a bigger partition or limit the max number of snapshots to a lower number, like 5 or 10. Where can I configure the max autosnapshots (saved states) to keep? |
Source: https://didrocks.fr/2020/06/04/zfs-focus-on-ubuntu-20.04-lts-zsys-state-collection/ |
I've made a script to do manual garbage collection. Right now it's interactive; if I'm feeling less lazy I might make it so you can pass CLI variables to it. Unless anyone else wants to tackle that. Cheers! DISCLAIMER: Use at your own risk, I accept no responsibility or liability whatsoever. You have been warned. |
Here's a simple script to remove the first 5 entries:
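The script itself was lost in this copy; a hypothetical sketch of the idea, reusing the grep pattern that appears later in this thread (here taking the 5 oldest states, since `zsysctl show` lists newest first; keep `--dry-run` until the selection looks right):

```bash
zsysctl show |
  grep -oP '(?<=rpool/ROOT/ubuntu_.{6}@autozsys_)\w+' |  # extract system state IDs
  tail -n 5 |                                            # the 5 oldest (newest are listed first)
  xargs -L1 zsysctl state remove -s --dry-run            # drop --dry-run to actually remove
```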
|
I'm about to upgrade to hirsute using |
Yes (if your system didn't hit the threshold as explained in this bug); please see https://didrocks.fr/2020/05/28/zfs-focus-on-ubuntu-20.04-lts-zsys-general-principle-on-state-management/ on the rollback principle. |
@didrocks, slick! I upgraded two "zfs rooted" machines without a lot of backup ceremony. Here are my commands for posterity. I have a few questions below.

```
zsysctl state save --system before-21.04  # manual state save in case upgrade gets confused
do-release-upgrade --allow-third-party
# answer questions, install stuff, reboot
# after reboot
lsb_release -sc
hirsute
zsysctl show
Name: rpool/ROOT/ubuntu_jwoh13
ZSys: true
Last Used: current
History:
- Name: rpool/ROOT/ubuntu_jwoh13@autozsys_ugazw0
Created on: 2021-06-07 11:20:12
- Name: rpool/ROOT/ubuntu_jwoh13@autozsys_zr26da
Created on: 2021-06-07 11:20:11
- Name: rpool/ROOT/ubuntu_jwoh13@autozsys_8jcolo
Created on: 2021-06-07 11:20:11
- Name: rpool/ROOT/ubuntu_jwoh13@autozsys_c8tzhz
Created on: 2021-06-07 10:55:04
- Name: rpool/ROOT/ubuntu_jwoh13@autozsys_4t2hsk
Created on: 2021-06-07 10:55:04
- Name: rpool/ROOT/ubuntu_jwoh13@autozsys_tj0b24
Created on: 2021-06-07 10:55:03
- Name: rpool/ROOT/ubuntu_jwoh13@autozsys_lwy3w8
Created on: 2021-06-07 10:53:32
- Name: rpool/ROOT/ubuntu_jwoh13@autozsys_ovtmvf
Created on: 2021-06-07 10:53:31
- Name: rpool/ROOT/ubuntu_jwoh13@autozsys_e4w0vr
Created on: 2021-06-07 10:53:31
- Name: rpool/ROOT/ubuntu_jwoh13@before-21.04
Created on: 2021-06-07 10:37:01
Users:
- Name: mcarifio
History:
- rpool/USERDATA/mcarifio_f8h849@autozsys_tb5ooj (2021-06-07 11:25:50)
- rpool/USERDATA/mcarifio_f8h849@autozsys_zr26da (2021-06-07 11:20:12)
- rpool/USERDATA/mcarifio_f8h849@autozsys_ugazw0 (2021-06-07 11:20:12)
- rpool/USERDATA/mcarifio_f8h849@autozsys_8jcolo (2021-06-07 11:20:11)
- rpool/USERDATA/mcarifio_f8h849@autozsys_1m5lby (2021-06-07 11:19:28)
- rpool/USERDATA/mcarifio_f8h849@autozsys_c8tzhz (2021-06-07 10:55:04)
- rpool/USERDATA/mcarifio_f8h849@autozsys_4t2hsk (2021-06-07 10:55:04)
- rpool/USERDATA/mcarifio_f8h849@autozsys_tj0b24 (2021-06-07 10:55:03)
- rpool/USERDATA/mcarifio_f8h849@autozsys_lwy3w8 (2021-06-07 10:53:32)
- rpool/USERDATA/mcarifio_f8h849@autozsys_ovtmvf (2021-06-07 10:53:31)
- rpool/USERDATA/mcarifio_f8h849@autozsys_e4w0vr (2021-06-07 10:53:31)
- rpool/USERDATA/mcarifio_f8h849@before-21.04 (2021-06-07 10:37:01)
- rpool/USERDATA/mcarifio_f8h849@autozsys_7v2ng6 (2021-06-07 10:18:28)
... many lines removed ...
- Name: root
History:
- rpool/USERDATA/root_f8h849@autozsys_zr26da (2021-06-07 11:20:12)
- rpool/USERDATA/root_f8h849@autozsys_ugazw0 (2021-06-07 11:20:12)
- rpool/USERDATA/root_f8h849@autozsys_8jcolo (2021-06-07 11:20:11)
- rpool/USERDATA/root_f8h849@autozsys_c8tzhz (2021-06-07 10:55:04)
- rpool/USERDATA/root_f8h849@autozsys_4t2hsk (2021-06-07 10:55:04)
- rpool/USERDATA/root_f8h849@autozsys_tj0b24 (2021-06-07 10:55:03)
- rpool/USERDATA/root_f8h849@autozsys_lwy3w8 (2021-06-07 10:53:32)
- rpool/USERDATA/root_f8h849@autozsys_ovtmvf (2021-06-07 10:53:31)
- rpool/USERDATA/root_f8h849@autozsys_e4w0vr (2021-06-07 10:53:31)
- rpool/USERDATA/root_f8h849@before-21.04 (2021-06-07 10:37:01)
- rpool/USERDATA/root_f8h849@autozsys_5ddk02 (2021-06-06 16:07:37)
- rpool/USERDATA/root_f8h849@autozsys_z63w7x (2021-06-06 16:00:07)
- rpool/USERDATA/root_f8h849@autozsys_qr1gda (2021-06-05 06:09:13)
- rpool/USERDATA/root_f8h849@autozsys_mtgsww (2021-06-04 06:26:02)
- rpool/USERDATA/root_f8h849@autozsys_jcv2s0 (2021-06-03 06:07:08)
- rpool/USERDATA/root_f8h849@autozsys_4t9it7 (2021-06-02 06:03:46)
- rpool/USERDATA/root_f8h849@autozsys_dil5hr (2021-05-31 18:40:02)
- rpool/USERDATA/root_f8h849@autozsys_7hl69w (2021-05-26 06:30:29)
```

As you can see, I "reaped" all the older system snapshots and then cut one of my own:

If I look at the system snapshot, I assume I can take both snapshots and clones of any zfs file system on my machine. It looks like I should stay clear of any name with an

Finally, looking at the |
My understanding of this logic is that you're using the uuid of rpool/ROOT/ubuntu_jwoh13 to set grub2's root:

```
$ zfs list rpool/ROOT/ubuntu_jwoh13 -o guid
GUID
15224702881960400867
```

which is not
Where is it? |
I suggest reading the whole blog post suite at https://didrocks.fr/tags/zfs/, which will answer most of your questions (it is referenced in the repo README), but bug reports are not really the place to ask for user support outside of bugs; let's try to keep the bugs focused please. |
Yes, of course. Got carried away in my enthusiasm. @Venotek, ty for that "reaping script". Saved me some time. |
When I try to remove older snapshots using |
Here is the debug output. I can't tell if it's actually making progress or stuck looping. In my previous attempt I let it run for 5 minutes. This output is just the first 20 seconds of another attempt to run it. |
I have attempted to remedy this problem on an Ubuntu 21.04 system by using the /etc/zsys.conf from @sdelrio #155 (comment) and a few invocations of the bash one-liner by @berenddeboer #155 (comment). Performing these and applying an update left my system unable to boot without manually entering the linux and initrd commands into grub. What is the best venue for me to seek help with this? EDIT: This seems to have fixed my problem: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1867007/comments/6 |
My script works great. |
Looking at this from the distance of a running 22.04 installation whose

Here the chosen way to clean up system and associated user states, leaving only the two most recent ones, is:

```
zsysctl show | grep -P '(?<=rpool/ROOT/ubuntu_......@autozsys_)(.*)' -o | tail -n +3 | tac | xargs -L 1 zsysctl state remove -s --dry-run
```

Caution: destructive. The invisibility of the

```
$ zsysctl show | grep -P '(?<=rpool/ROOT/ubuntu_......@autozsys_)(.*)' -o | tac | xargs -L 1 zsysctl state remove -v -s --dry-run
INFO Requesting removal of system state "q0ofwa"
Deleting state rpool/USERDATA/yala_......@autozsys_q0ofwa
Deleting state rpool/USERDATA/root_......@autozsys_q0ofwa
Deleting state rpool/ROOT/ubuntu_......@autozsys_q0ofwa
INFO Requesting removal of system state "xthef3"
Deleting state rpool/USERDATA/root_......@autozsys_xthef3
Deleting state rpool/USERDATA/yala_......@autozsys_xthef3
Deleting state rpool/ROOT/ubuntu_......@autozsys_xthef3
```
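What the `tail -n +3 | tac` stages contribute, shown on hypothetical state IDs (newest first, as `zsysctl show` prints them):

```bash
printf '%s\n' new2 new1 old3 old2 old1 | tail -n +3 | tac
# old1
# old2
# old3   <- the two newest states survive; the rest are removed, oldest first
```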
The below demonstrates how we go from a cluttered `bpool`:

```
$ zfs list -r -t all bpool
NAME USED AVAIL REFER MOUNTPOINT
bpool 1.48G 278M 96K /boot
bpool/BOOT 1.47G 278M 96K none
bpool/BOOT/ubuntu_...... 1.47G 278M 311M /boot
bpool/BOOT/ubuntu_......@autozsys_bn5c0r 72K - 307M -
bpool/BOOT/ubuntu_......@autozsys_yxnexc 0B - 307M -
bpool/BOOT/ubuntu_......@autozsys_ftf950 0B - 307M -
bpool/BOOT/ubuntu_......@autozsys_tmihy8 0B - 308M -
bpool/BOOT/ubuntu_......@autozsys_fq1qaz 0B - 308M -
bpool/BOOT/ubuntu_......@autozsys_w9gf5c 0B - 308M -
bpool/BOOT/ubuntu_......@autozsys_8k4fif 0B - 308M -
bpool/BOOT/ubuntu_......@autozsys_ranw95 0B - 308M -
bpool/BOOT/ubuntu_......@autozsys_5f9i8r 72K - 308M -
bpool/BOOT/ubuntu_......@autozsys_jkwu4v 113M - 433M -
bpool/BOOT/ubuntu_......@autozsys_thqqx6 72K - 433M -
bpool/BOOT/ubuntu_......@autozsys_uj7p3w 64K - 308M -
bpool/BOOT/ubuntu_......@autozsys_nyom7g 72K - 308M -
bpool/BOOT/ubuntu_......@autozsys_u9qosw 64K - 308M -
bpool/BOOT/ubuntu_......@autozsys_ujtu0w 0B - 308M -
bpool/BOOT/ubuntu_......@autozsys_u3tku7 0B - 308M -
bpool/BOOT/ubuntu_......@autozsys_6b6vae 72K - 308M -
bpool/BOOT/ubuntu_......@autozsys_uz70ms 0B - 435M -
bpool/BOOT/ubuntu_......@autozsys_05yezg 0B - 435M -
bpool/BOOT/ubuntu_......@autozsys_hw6cvn 0B - 310M -
bpool/BOOT/ubuntu_......@autozsys_v8iv4c 0B - 310M -
bpool/BOOT/ubuntu_......@autozsys_aakwy2 0B - 310M -
bpool/BOOT/ubuntu_......@autozsys_y4ntwf 0B - 310M -
bpool/BOOT/ubuntu_......@autozsys_q0ofwa 64K - 310M -
bpool/BOOT/ubuntu_......@autozsys_xthef3 72K - 310M -
```

Via an already more balanced, but also more recent, set of captured and bootable kernel and system history, as seen midway through the execution of the sweep command:

```
$ zfs list -r -t all bpool
NAME USED AVAIL REFER MOUNTPOINT
bpool 1006M 785M 96K /boot
bpool/BOOT 1003M 785M 96K none
bpool/BOOT/ubuntu_...... 1003M 785M 311M /boot
bpool/BOOT/ubuntu_......@autozsys_uj7p3w 72K - 308M -
bpool/BOOT/ubuntu_......@autozsys_nyom7g 72K - 308M -
bpool/BOOT/ubuntu_......@autozsys_u9qosw 64K - 308M -
bpool/BOOT/ubuntu_......@autozsys_ujtu0w 0B - 308M -
bpool/BOOT/ubuntu_......@autozsys_u3tku7 0B - 308M -
bpool/BOOT/ubuntu_......@autozsys_6b6vae 72K - 308M -
bpool/BOOT/ubuntu_......@autozsys_uz70ms 0B - 435M -
bpool/BOOT/ubuntu_......@autozsys_05yezg 0B - 435M -
bpool/BOOT/ubuntu_......@autozsys_hw6cvn 0B - 310M -
bpool/BOOT/ubuntu_......@autozsys_v8iv4c 0B - 310M -
bpool/BOOT/ubuntu_......@autozsys_aakwy2 0B - 310M -
bpool/BOOT/ubuntu_......@autozsys_y4ntwf 0B - 310M -
bpool/BOOT/ubuntu_......@autozsys_q0ofwa 64K - 310M -
bpool/BOOT/ubuntu_......@autozsys_xthef3 72K - 310M -
```

Into a state that keeps only the most recent history, yet ultimately is also prepared again for snapshotting of system states (

```
$ zfs list -r -t all bpool
NAME USED AVAIL REFER MOUNTPOINT
bpool 611M 1.15G 96K /boot
bpool/BOOT 608M 1.15G 96K none
bpool/BOOT/ubuntu_...... 608M 1.15G 311M /boot
bpool/BOOT/ubuntu_......@autozsys_q0ofwa 72K - 310M -
bpool/BOOT/ubuntu_......@autozsys_xthef3 72K - 310M -
```

It is now possible to create system states with

This adds another level of indirection, potentially solving the regressions described earlier, which may be an upstream issue (in another package than zsys).
Ultimately we could declare an interval of minimum (

Eventually a sparse deletion strategy could keep older point-in-time recoveries, while newer ones are selected for deletion on the basis of a "thinning" factor. This "thinning" factor appears to be directly proportional to the free space of the

Or we could keep only the system snapshots that still reference at least one of the last three kernel versions, and destroy all others. How do you cope with this regression in the long run? Did you find other procedures that are well established?
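A hypothetical sketch of such a thinning pass (untested; reusing the grep pattern from above, and keeping `--dry-run` until the selection looks right):

```bash
# Protect the two newest states, keep every 4th of the older ones,
# and hand the remainder to removal, oldest first.
zsysctl show |
  grep -oP '(?<=rpool/ROOT/ubuntu_.{6}@autozsys_)\w+' |
  tail -n +3 |          # drop (protect) the two newest states
  awk 'NR % 4 != 0' |   # every 4th older state is kept; the others are emitted
  tac |                 # oldest first
  xargs -L1 zsysctl state remove -s --dry-run
```
|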
Describe the bug
During my daily apt update, I got this message:
Checking bpool, I can see that `sudo du -h /boot` shows 330M, while `sudo zfs list -t all -r bpool` shows the dataset taking up 1.56G of the available 1.88G. Some of the auto snapshots are from 3 months ago. I noticed there was no gc timer, so I tried running it manually instead.

I ran `sudo zsysctl service gc --all -vv`, which cleaned up rpool a bit but didn't touch bpool. I thought about manually deleting the snapshots, but saw that others reported having update-grub problems after doing so.
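For reference, one way to confirm that no gc timer is scheduled (standard systemd tooling, not zsys-specific):

```bash
systemctl list-timers --all | grep -i zsys   # no output means no gc timer is scheduled
```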
If it is intentional that gc shouldn't clean up the bpool as well, then maybe some documentation on the correct way to clean it up would be ideal.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Being able to gc the bpool, either via a timer or manually.
For ubuntu users, please run and copy the following:
1. `ubuntu-bug zsys --save=/tmp/report`
2. Copy paste below `/tmp/report` content:

Screenshots
If applicable, add screenshots to help explain your problem.
Installed versions:
- OS (`/etc/os-release`)
- Zsys (`zsysctl version` output)

Additional context
Add any other context about the problem here.