Straw Man 1-prep/templates/iiab-expand-rootfs.service based on 2020's PR #2522 + /usr/sbin/iiab-expand-rootfs "bash -xe" exit-on-error (to defer deleting /.expand-rootfs) #3337

Merged
merged 11 commits into from Sep 26, 2022


holta
Member

@holta holta commented Aug 8, 2022

  1. To move the state-of-the-art forward, this is just a Straw Man PR to build on and evolve @jvonau and @tim-moody's strict ordering of resize service #2522 discussion 2 years ago and similar work:

  2. The race condition between iiab-expand-rootfs.service and fsck would appear to be the most serious problem (rootfs can fail to fully expand: race condition betw iiab-expand-rootfs & systemd-fsck ? (journalctl confusingly/regularly portrays 2 simultaneous boots due to lack of RTC?) #3325).

  3. Other resiliency improvements can certainly also be added where they prove effective.

  4. Personally...I'm not at all in favor of systemctl disable iiab-expand-rootfs.service after rootfs has expanded (e.g. near the bottom of /usr/sbin/iiab-expand-rootfs).

    The reason is that grassroots communities should retain the full freedom to touch /.expand-rootfs at absolutely any time, to expand their own regional/spontaneous/generative/remix IIAB disk images, even when (especially when) their community tooling (and skills!) remain comparatively primitive.

@holta holta added this to the 8.0 milestone Aug 8, 2022
@holta
Member Author

holta commented Aug 8, 2022

See also @jvonau and @tim-moody's #3325 suggestions from 9 days ago:

[JV] Think there is a race condition between systemd-fsck and iiab-expand-rootfs

[JV] Might want to look at using 'After=' in the systemd unit file, for better staging of events.

[TM] what is the fsck target to go after?

[TM] After=local-fs.target for example. seems the fsck service can return two error targets, including a reboot, so should stay away from those.

@holta
Member Author

holta commented Aug 8, 2022

FWIW the RasPiOS version of systemd-fsckd.service is:

[Unit]
Description=File System Check Daemon to report status
Documentation=man:systemd-fsckd.service(8)
DefaultDependencies=no
Requires=systemd-fsckd.socket
Before=shutdown.target

[Service]
ExecStart=/lib/systemd/systemd-fsckd
StandardOutput=journal+console

And its Ubuntu man page is:
https://manpages.ubuntu.com/manpages/jammy/man8/systemd-fsckd.service.8.html

@holta
Member Author

holta commented Aug 8, 2022

Just FYI systemd-fsckd.service is identical on RasPiOS, Ubuntu 22.04, Mint 20.3, Mint 21 and Debian 11 (as pasted in just above), which can't hurt.

@holta
Member Author

holta commented Aug 8, 2022

An oversimplified (but plausible) concern is that rootfs expansion race conditions (similar to #3325) might occur during 1 out of every 20 boots:

By default, fsck runs after 20 system reboots but should be run manually if your system runs for weeks or months without rebooting.

I don't know if {Ubuntu, Mint, Debian, RasPiOS} abide by this default? And what are default maximums for allowed days/weeks/months (between each fsck) which also affect this?

One very hypothetical collision avoidance tactic could be to force fsck to take a holiday on next boot — eliminating both race condition risk and also the annoying "double delay" (completing both fsck and rootfs expansion, during the same boot).

e.g. with crude hacks like tune2fs -C mount-count and tune2fs -i interval-between-checks[d|m|w] : (or possibly using some much more elegant "please don't fsck on next boot" configuration file/directive, if this approach is genuinely desirable/practical...)

The frequency of file system checks is changed by using the tune2fs command.

https://www.thegeekdiary.com/maintaining-linux-filesystems-using-fsck-and-tune2fs/

@holta
Member Author

holta commented Aug 8, 2022

Just FYI others skip fsck, or (temporarily!?) disable fsck, as follows:

  1. https://askubuntu.com/questions/1250119/how-to-skip-filesystem-checks-during-boot/1250141#1250141

    ...command line option fsck.mode=skip can be used to skip the disk check...

  2. https://unix.stackexchange.com/questions/239709/how-to-stop-filesystem-check-fsck-on-boot/239731#239731

    tune2fs -c 0 -i 0 /dev/[sdX]     # But this would need to be turned back on later.
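For reference, a minimal shell sketch of how a script might detect that the `fsck.mode=skip` option quoted above is active. The function name `has_fsck_skip` is hypothetical; it only inspects a kernel command-line string passed as its argument and changes nothing on the system.

```shell
#!/bin/bash
# has_fsck_skip CMDLINE: succeed if CMDLINE contains the word fsck.mode=skip.
# (Hypothetical helper for illustration; not part of this PR.)
has_fsck_skip() {
    # Pad with spaces so the option only matches as a whole word.
    case " $1 " in
        *" fsck.mode=skip "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Typical use would be something like `has_fsck_skip "$(cat /proc/cmdline)" && echo "fsck is skipped on this boot"`.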

@holta
Member Author

holta commented Aug 8, 2022

Not an endorsement of this kind of approach, but FYI/recap:

tune2fs -c max-mount-counts Adjust the number of mounts after which the filesystem will be checked...

tune2fs -C mount-count Set the number of times the filesystem has been mounted...

tune2fs -i interval-between-checks[d|m|w] Adjust the maximal time between two filesystem checks. No suffix or d will interpret the number interval-between-checks as days, m as months, and w as weeks. A value of zero will disable the time-dependent checking...
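A rough sketch of how the two mount-count settings recapped above interact, assuming the field labels printed by `tune2fs -l` (`Mount count:` and `Maximum mount count:`). The helper name `mount_count_check_due` is hypothetical, it ignores the time-based `-i` interval entirely, and it only reads text on stdin, so it touches no device.

```shell
#!/bin/bash
# mount_count_check_due: read "tune2fs -l" output on stdin and print "due"
# when the mount count has reached a positive maximum, else "not due".
# (Hypothetical helper; the time-based interval check is NOT considered.)
mount_count_check_due() {
    awk -F: '
        /^Mount count:/         { count = $2 + 0 }
        /^Maximum mount count:/ { max   = $2 + 0 }
        END {
            # A maximum of 0 or -1 means count-based checking is disabled.
            if (max > 0 && count >= max) print "due"; else print "not due"
        }'
}
```

On a live system this could be fed with `sudo tune2fs -l /dev/mmcblk0p2 | mount_count_check_due` (the device name is only an example).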

@holta
Member Author

holta commented Aug 8, 2022

[TM] what is the fsck target to go after?

@jvonau's approach 2 years ago (looks reasonable) was to defer rootfs expansion until After=systemd-remount-fs.service which in turn means that systemd-fsck-root.service has already completed.

As confirmed by After=systemd-fsck-root.service within this systemd-remount-fs.service below:

[Unit]
Description=Remount Root and Kernel File Systems
Documentation=man:systemd-remount-fs.service(8)
Documentation=https://www.freedesktop.org/wiki/Software/systemd/APIFileSystems
DefaultDependencies=no
Conflicts=shutdown.target
After=systemd-fsck-root.service
Before=local-fs-pre.target local-fs.target shutdown.target
Wants=local-fs-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/lib/systemd/systemd-remount-fs

CLARIF: the above systemd-remount-fs.service exists (with identical contents as above) across Ubuntu 22.04, Mint 21, RasPiOS and Debian 11.
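To make the ordering idea concrete, here is a hedged sketch that writes a candidate unit file into a scratch directory (never into /etc). It is not the file from this PR: `After=`, `Before=`, `TimeoutSec=`, the description, the script path and the `local-fs.target` install target are taken from this conversation, while `Type=oneshot` and the `UNIT_DIR` variable are assumptions for illustration.

```shell
#!/bin/bash
# Write an illustrative unit file to a scratch directory for inspection only.
# UNIT_DIR is a hypothetical override; by default a temp directory is used.
unit_dir="${UNIT_DIR:-$(mktemp -d)}"

cat > "$unit_dir/iiab-expand-rootfs.service" <<'EOF'
[Unit]
Description=Root Filesystem Auto-Expander
# Run only after the root fsck and the read-write remount have finished.
After=systemd-remount-fs.service
# Ignored by systemd on distros where this unit does not exist.
Before=dphys-swapfile.service

[Service]
# Type=oneshot is an assumption, not confirmed by this thread.
Type=oneshot
ExecStart=/usr/sbin/iiab-expand-rootfs
# Large disks can take a long time to resize.
TimeoutSec=infinity

[Install]
WantedBy=local-fs.target
EOF

echo "Sketch written to $unit_dir/iiab-expand-rootfs.service"
```

Because `systemd-remount-fs.service` itself declares `After=systemd-fsck-root.service`, ordering after the remount also orders the expander after the root fsck.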

@holta
Member Author

holta commented Aug 8, 2022

Across our 4 mainline OS's, here is systemd-fsck-root.service — which likewise is identical on each distro/OS:

[Unit]
Description=File System Check on Root Device
Documentation=man:systemd-fsck-root.service(8)
DefaultDependencies=no
Conflicts=shutdown.target
Before=local-fs.target shutdown.target
Wants=systemd-fsckd.socket
After=systemd-fsckd.socket
ConditionPathIsReadWrite=!/
ConditionPathExists=!/run/initramfs/fsck-root

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/lib/systemd/systemd-fsck
TimeoutSec=0

@holta
Member Author

holta commented Aug 8, 2022

FYI this PR's Before=dphys-swapfile.service refers to a file (dphys-swapfile.service pasted in below) that exists on RasPiOS — but does not exist on the other 3 mainline OS's:

[Unit]
Description=dphys-swapfile - set up, mount/unmount, and delete a swap file
Documentation=man:dphys-swapfile(8)

[Service]
Type=oneshot
ExecStart=/sbin/dphys-swapfile setup
ExecStart=/sbin/dphys-swapfile swapon
ExecStop=/sbin/dphys-swapfile swapoff
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

@jvonau
Contributor

jvonau commented Aug 8, 2022

Further history #723

@jvonau
Contributor

jvonau commented Aug 8, 2022

TimeoutSec=0 would be a good addition for the revised unit file.

@holta
Member Author

holta commented Aug 8, 2022

FYI this PR's Before=dphys-swapfile.service refers to a file (dphys-swapfile.service pasted in below) that exists on RasPiOS — but does not exist on the other 3 mainline OS's:

The above might appear dangerous. But in the end this appears harmless — as systemd apparently ignores Before=X and After=Y requirements that refer to non-existent services: (e.g. X, Y)

At least as of systemd v219:

Adding an After=A to B is harmless when A does not exist; B will still start.

According to: https://serverfault.com/questions/1062205/does-systemd-fail-if-a-dependency-in-after-requires-doesnt-exist/1096424#1096424

@holta
Member Author

holta commented Aug 8, 2022

TimeoutSec=0 would be a good addition for the revised unit file.

Done. I used the more modern syntax: TimeoutSec=infinity

(FWIW systemd is trying to transition towards 0 meaning 0, and infinity meaning infinity, ever since systemd 229 was released 6.5 years ago.)

Documentation:

@holta
Member Author

holta commented Aug 8, 2022

@deldesir a great introduction to systemd unit files would appear to be:

https://www.digitalocean.com/community/tutorials/understanding-systemd-units-and-unit-files

@holta
Member Author

holta commented Aug 8, 2022

TimeoutSec=0 would be a good addition for the revised unit file.

More than a good addition! This looks like a real lifesaver!

That was overlooked in the era of smaller drives / microSD cards / etc: 3798685

@holta
Member Author

holta commented Aug 9, 2022

Here's a person whose 256GB IIAB microSD cards only show 15GB (in one case) and 60GB (in the other case) :

It would appear he/she is facing the same set of #3325-like issues.

@holta holta changed the title Straw Man 1-prep/templates/iiab-expand-rootfs.service based on 2020's PR #2522 Straw Man 1-prep/templates/iiab-expand-rootfs.service based on 2020's PR #2522 + /usr/sbin/iiab-expand-rootfs "bash -xe" exit-on-error (to defer deleting /.expand-rootfs) Aug 12, 2022
@holta
Member Author

holta commented Aug 12, 2022

Stretch Goal:

Merge this PR this month (August 2022) if we find enough testing talent to validate this step forward!

@holta
Member Author

holta commented Aug 29, 2022

@pickypet please if you can help us test this in early September!

A simple working smoke-test with Mint 21 or Ubuntu 22.04 will hopefully be sufficient, demonstrating that the rootfs is indeed expanding properly when using a larger/conventional HDD.

(Call me along the way where you need help, so this hopefully happens soon in early Sept, Thanks!)

@holta
Member Author

holta commented Sep 14, 2022

Merge this PR this month (August 2022) if we find enough testing talent to validate this step forward!

It didn't happen last month so let's make it happen this month (September 2022).

And possibly evolve/improve on this PR if @jvonau discovers further refinements that are necessary — based on various "reproducers" he's now generating on different Raspberry Pi OS's (and different faster/slower Raspberry Pi hardware) here:

@jvonau
Contributor

jvonau commented Sep 19, 2022

https://linuxconfig.org/how-to-force-fsck-to-check-filesystem-after-system-reboot-on-linux
sudo tune2fs -c 1 /dev/mmcblk0p2 forces an fsck on every boot, which is useful for debugging

jerry@NM-64-desktop-RasPiOS:~ $ sudo systemctl status systemd-fsck-root.service 
* systemd-fsck-root.service - File System Check on Root Device
     Loaded: loaded (/lib/systemd/system/systemd-fsck-root.service; enabled-runtime; vendor preset: enabled)
     Active: active (exited) since Mon 2022-09-19 16:29:21 CDT; 18min ago
       Docs: man:systemd-fsck-root.service(8)
   Main PID: 146 (code=exited, status=0/SUCCESS)
      Tasks: 0 (limit: 4164)
        CPU: 0
     CGroup: /system.slice/systemd-fsck-root.service
Sep 19 16:29:21 NM-64-desktop-RasPiOS systemd-fsck[160]: e2fsck 1.46.2 (28-Feb-2021)
Sep 19 16:29:21 NM-64-desktop-RasPiOS systemd-fsck[160]: rootfs: clean, 154220/7744944 files, 17413572/31150592 blocks (check after next mount)

jerry@NM-64-desktop-RasPiOS:~ $ sudo systemctl status systemd-remount-fs.service
* systemd-remount-fs.service - Remount Root and Kernel File Systems
     Loaded: loaded (/lib/systemd/system/systemd-remount-fs.service; enabled-runtime; vendor preset: enabled)
     Active: active (exited) since Mon 2022-09-19 16:29:21 CDT; 19min ago
       Docs: man:systemd-remount-fs.service(8)
             https://www.freedesktop.org/wiki/Software/systemd/APIFileSystems
   Main PID: 166 (code=exited, status=0/SUCCESS)
      Tasks: 0 (limit: 4164)
        CPU: 0
     CGroup: /system.slice/systemd-remount-fs.service
Sep 19 16:29:21 NM-64-desktop-RasPiOS systemd[1]: Finished Remount Root and Kernel File Systems.

jerry@NM-64-desktop-RasPiOS:~ $ sudo systemctl status iiab-expand-rootfs.service 
* iiab-expand-rootfs.service - Root Filesystem Auto-Expander
     Loaded: loaded (/etc/systemd/system/iiab-expand-rootfs.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Mon 2022-09-19 16:29:24 CDT; 20min ago
   Main PID: 254 (code=exited, status=1/FAILURE)
        CPU: 321ms
Sep 19 16:29:24 NM-64-desktop-RasPiOS iiab-expand-rootfs[286]: ++ parted /dev/mmcblk0 -ms unit s p
Sep 19 16:29:24 NM-64-desktop-RasPiOS iiab-expand-rootfs[287]: ++ tail -n 1
Sep 19 16:29:24 NM-64-desktop-RasPiOS iiab-expand-rootfs[288]: ++ cut -f 1 -d:
Sep 19 16:29:24 NM-64-desktop-RasPiOS iiab-expand-rootfs[254]: + LAST_PART_NUM=2
Sep 19 16:29:24 NM-64-desktop-RasPiOS iiab-expand-rootfs[254]: + '[' 2 -ne 2 ']'
Sep 19 16:29:24 NM-64-desktop-RasPiOS iiab-expand-rootfs[254]: + growpart /dev/mmcblk0 2
Sep 19 16:29:24 NM-64-desktop-RasPiOS iiab-expand-rootfs[338]: NOCHANGE: partition 2 could only be grown by -33 [fudge=2048]
Sep 19 16:29:24 NM-64-desktop-RasPiOS systemd[1]: iiab-expand-rootfs.service: Main process exited, code=exited, status=1/FAILURE
Sep 19 16:29:24 NM-64-desktop-RasPiOS systemd[1]: iiab-expand-rootfs.service: Failed with result 'exit-code'.
Sep 19 16:29:24 NM-64-desktop-RasPiOS systemd[1]: Failed to start Root Filesystem Auto-Expander.

jerry@NM-64-desktop-RasPiOS:~ $ sudo systemctl status dphys-swapfile             
* dphys-swapfile.service - dphys-swapfile - set up, mount/unmount, and delete a swap file
     Loaded: loaded (/lib/systemd/system/dphys-swapfile.service; enabled; vendor preset: enabled)
     Active: active (exited) since Mon 2022-09-19 16:29:27 CDT; 1h 4min ago
       Docs: man:dphys-swapfile(8)
   Main PID: 641 (code=exited, status=0/SUCCESS)
      Tasks: 0 (limit: 4164)
        CPU: 0
     CGroup: /system.slice/dphys-swapfile.service

Sep 19 16:29:26 NM-64-desktop-RasPiOS systemd[1]: Starting dphys-swapfile - set up, mount/unmount, and delete a swap file...
Sep 19 16:29:26 NM-64-desktop-RasPiOS dphys-swapfile[507]: want /var/swap=2048MByte, checking existing: keeping it
Sep 19 16:29:27 NM-64-desktop-RasPiOS systemd[1]: Finished dphys-swapfile - set up, mount/unmount, and delete a swap file.

pid ordering looks sane.

@holta
Member Author

holta commented Sep 19, 2022

      Active: failed (Result: exit-code) since Mon 2022-09-19 16:29:24 CDT; 20min ago
    Main PID: 254 (code=exited, status=1/FAILURE)

@jvonau the failure of iiab-expand-rootfs.service above may well be expected (if the disk had no further room to expand).

Can you however test an actual rootfs expansion — on RasPiOS and any other OS?
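As a small illustration of why the `status=1/FAILURE` above need not be alarming: the journal shows growpart printing NOCHANGE and exiting with status 1. A sketch of how a script could tell these cases apart follows; the function name `classify_growpart_rc` is hypothetical, and treating every other status as a hard failure is an assumption.

```shell
#!/bin/bash
# classify_growpart_rc RC: map a growpart exit status to a short word.
# 0 and 1 match the CHANGED / NOCHANGE cases seen in this thread;
# lumping everything else into "failed" is an assumption.
classify_growpart_rc() {
    case "$1" in
        0) echo "changed" ;;
        1) echo "nochange" ;;
        *) echo "failed" ;;
    esac
}
```

A caller could then continue to the resize2fs step on `changed` or `nochange`, and stop only on `failed`.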

@jvonau
Contributor

jvonau commented Sep 20, 2022

      Active: failed (Result: exit-code) since Mon 2022-09-19 16:29:24 CDT; 20min ago
    Main PID: 254 (code=exited, status=1/FAILURE)

@jvonau the failure of iiab-expand-rootfs.service above may well be expected (if the disk had no further room to expand).

Can you however test an actual rootfs expansion — on RasPiOS and any other OS?

It's expected, and that is why the service is conditional. No, what would that further prove? The question of the ordering is already answered; recreating a 'dirty filesystem' and prolonging the fsck process would be the real acid test.

@jvonau
Contributor

jvonau commented Sep 20, 2022

Methodology:
jvonau@kickass:/mnt/scratch/git/iiab$ sudo nano /media/jvonau/rootfs/etc/systemd/system/iiab-expand-rootfs.service   # alter to match PR
[sudo] password for jvonau:
jvonau@kickass:/mnt/scratch/git/iiab$ sudo nano /media/jvonau/rootfs/usr/sbin/iiab-expand-rootfs   # alter: add echo

jvonau@kickass:/mnt/scratch/git/iiab$ sudo chroot /media/jvonau/rootfs
root@kickass:/# systemctl disable iiab-expand-rootfs
Removed /etc/systemd/system/multi-user.target.wants/iiab-expand-rootfs.service.
root@kickass:/# systemctl enable iiab-expand-rootfs
Created symlink /etc/systemd/system/local-fs.target.wants/iiab-expand-rootfs.service → /etc/systemd/system/iiab-expand-rootfs.service.
root@kickass:/# sync
root@kickass:/# exit

sudo nano /media/jvonau/rootfs/etc/fake-hwclock.data   # move time forward 2 months
sudo nano /media/jvonau/boot/cmdline.txt   # remove splash quiet

jvonau@kickass:/mnt/scratch/git/iiab$ sudo tune2fs -c 1 /dev/sdb2
tune2fs 1.46.5 (30-Dec-2021)
Setting maximal mount count to 1

Results:

jvonau@kickass:/mnt/scratch/git/iiab$ cat ~/install-sum.txt 
-- Journal begins at Mon 2022-04-04 14:52:30 UTC, ends at Tue 2022-09-20 15:40:02 UTC. --
Apr 04 14:52:30 raspberrypi systemd-fsck[152]: e2fsck 1.46.2 (28-Feb-2021)
Apr 04 14:52:30 raspberrypi systemd-fsck[152]: rootfs: clean, 104077/242880 files, 802150/970752 blocks
-- Boot 051858ef96664f8b8fdfff4325e993af --
Apr 04 14:52:46 raspberrypi systemd-fsck[151]: e2fsck 1.46.2 (28-Feb-2021)
Apr 04 14:52:46 raspberrypi systemd-fsck[151]: rootfs: clean, 104103/242880 files, 804772/970752 blocks
-- Boot fd13670672ed4eb1ad9711a25c622cae --
Apr 04 14:52:46 raspberrypi systemd[1]: systemd-fsck-root.service: Succeeded.
Apr 04 14:52:46 raspberrypi systemd[1]: Stopped File System Check on Root Device.
-- Boot 549ffca864984728b6a2a83b4da065cd --
Jun 16 19:37:20 raspberrypi systemd-fsck[154]: e2fsck 1.46.2 (28-Feb-2021)
Jun 16 19:37:20 raspberrypi systemd-fsck[154]: rootfs: clean, 104357/7699296 files, 1333540/31150592 blocks
-- Boot 051858ef96664f8b8fdfff4325e993af --
Jun 16 19:37:20 raspberrypi systemd[1]: systemd-fsck-root.service: Succeeded.
Jun 16 19:37:20 raspberrypi systemd[1]: Stopped File System Check on Root Device.
-- Boot 549ffca864984728b6a2a83b4da065cd --
Jun 16 20:50:59 box systemd[1]: systemd-fsck-root.service: Succeeded.
Jun 16 20:50:59 box systemd[1]: Stopped File System Check on Root Device.
-- Boot a06560a4507447f094afd6ff382231be --
Jun 16 20:50:59 box systemd[1]: Finished File System Check on Root Device.
Jun 16 20:50:59 box systemd-fsck[157]: e2fsck 1.46.2 (28-Feb-2021)
Jun 16 20:50:59 box systemd-fsck[157]: rootfs: clean, 320400/7699296 files, 3186767/31150592 blocks
Jun 16 20:52:13 box systemd[1]: systemd-fsck-root.service: Succeeded.
Jun 16 20:52:13 box systemd[1]: Stopped File System Check on Root Device.
-- Boot 6caf1b057f1946d2b18523e57a9caab2 --
Aug 16 20:52:13 box systemd-fsck[153]: e2fsck 1.46.2 (28-Feb-2021)
Aug 16 20:52:13 box systemd-fsck[153]: rootfs has been mounted 2 times without being checked, check forced.
Aug 16 20:52:13 box systemd-fsck[153]: Pass 1: Checking inodes, blocks, and sizes
Aug 16 20:52:18 box systemd-fsck[153]: Pass 2: Checking directory structure
Aug 16 20:52:23 box systemd-fsck[153]: Pass 3: Checking directory connectivity
Aug 16 20:52:24 box systemd-fsck[153]: Pass 4: Checking reference counts
Aug 16 20:52:24 box systemd-fsck[153]: Pass 5: Checking group summary information
Aug 16 20:52:24 box systemd-fsck[153]: rootfs: 319022/720544 files (0.6% non-contiguous), 2504809/2914696 blocks
Aug 16 20:52:24 box systemd[1]: Finished File System Check on Root Device.
-- Journal begins at Mon 2022-04-04 14:52:30 UTC, ends at Tue 2022-09-20 15:41:16 UTC. --
Jun 16 20:51:03 box systemd[1]: Starting Root Filesystem Auto-Expander...
Jun 16 20:51:03 box iiab-expand-rootfs[434]: + '[' -f /.expand-rootfs ']'
Jun 16 20:51:03 box iiab-expand-rootfs[434]: + '[' -f /.resize-rootfs ']'
Jun 16 20:51:03 box systemd[1]: iiab-expand-rootfs.service: Succeeded.
Jun 16 20:51:03 box systemd[1]: Finished Root Filesystem Auto-Expander.
-- Boot 6caf1b057f1946d2b18523e57a9caab2 --
Aug 16 20:52:24 box systemd[1]: Starting Root Filesystem Auto-Expander...
Aug 16 20:52:26 box iiab-expand-rootfs[171]: + '[' -f /.expand-rootfs ']'
Aug 16 20:52:26 box iiab-expand-rootfs[171]: + echo '/usr/sbin/iiab-expand-rootfs: Expanding rootfs partition'
Aug 16 20:52:26 box iiab-expand-rootfs[171]: /usr/sbin/iiab-expand-rootfs: Expanding rootfs partition
Aug 16 20:52:26 box iiab-expand-rootfs[175]: ++ findmnt / -o SOURCE -n
Aug 16 20:52:26 box iiab-expand-rootfs[171]: + ROOT_PART=/dev/mmcblk0p2
Aug 16 20:52:26 box iiab-expand-rootfs[177]: ++ lsblk -no pkname /dev/mmcblk0p2
Aug 16 20:52:26 box iiab-expand-rootfs[171]: + ROOT_DEV=/dev/mmcblk0
Aug 16 20:52:26 box iiab-expand-rootfs[179]: ++ echo /dev/mmcblk0p2
Aug 16 20:52:26 box iiab-expand-rootfs[180]: ++ grep -o '[[:digit:]]*$'
Aug 16 20:52:26 box iiab-expand-rootfs[171]: + ROOT_PART_NUM=2
Aug 16 20:52:26 box iiab-expand-rootfs[182]: ++ parted /dev/mmcblk0 -ms unit s p
Aug 16 20:52:26 box iiab-expand-rootfs[183]: ++ tail -n 1
Aug 16 20:52:26 box iiab-expand-rootfs[184]: ++ cut -f 1 -d:
Aug 16 20:52:26 box iiab-expand-rootfs[171]: + LAST_PART_NUM=2
Aug 16 20:52:26 box iiab-expand-rootfs[171]: + '[' 2 -ne 2 ']'
Aug 16 20:52:26 box iiab-expand-rootfs[171]: + growpart /dev/mmcblk0 2
Aug 16 20:52:27 box iiab-expand-rootfs[189]: CHANGED: partition=2 start=532480 old: size=23317568 end=23850048 new: size=124293087 end=124825567
Aug 16 20:52:27 box iiab-expand-rootfs[171]: + sleep 2
Aug 16 20:52:29 box iiab-expand-rootfs[171]: + echo '/usr/sbin/iiab-expand-rootfs: Resizing Root Partition'
Aug 16 20:52:29 box iiab-expand-rootfs[171]: /usr/sbin/iiab-expand-rootfs: Resizing Root Partition
Aug 16 20:52:29 box iiab-expand-rootfs[171]: + resize2fs /dev/mmcblk0p2
Aug 16 20:52:29 box iiab-expand-rootfs[435]: resize2fs 1.46.2 (28-Feb-2021)
Sep 20 15:31:16 box iiab-expand-rootfs[435]: Filesystem at /dev/mmcblk0p2 is mounted on /; on-line resizing required
Sep 20 15:31:16 box iiab-expand-rootfs[435]: old_desc_blocks = 1, new_desc_blocks = 4
Sep 20 15:31:16 box iiab-expand-rootfs[435]: The filesystem on /dev/mmcblk0p2 is now 15536635 (4k) blocks long.
Sep 20 15:31:16 box iiab-expand-rootfs[171]: + rc=0
Sep 20 15:31:16 box iiab-expand-rootfs[171]: + '[' 0 -eq 0 ']'
Sep 20 15:31:16 box iiab-expand-rootfs[171]: + rm -f /.expand-rootfs /.resize-rootfs
Sep 20 15:31:16 box systemd[1]: Finished Root Filesystem Auto-Expander.

Order looks perfect... Next to recreate the dirty filesystem by yanking the power in the middle of the filesystem resizing.
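For readers following the `bash -x` trace above, the device-discovery steps can be sketched roughly as below. This is a reconstruction from the trace (the `grep -o '[[:digit:]]*$'` and `tail -n 1 | cut -f 1 -d:` pipelines), not the actual /usr/sbin/iiab-expand-rootfs; the function names are hypothetical and nothing here touches a disk.

```shell
#!/bin/bash
# part_num_of PARTITION: print the trailing digits of a partition path,
# e.g. /dev/mmcblk0p2 gives 2 (same grep as in the trace).
part_num_of() {
    printf '%s\n' "$1" | grep -o '[[:digit:]]*$'
}

# last_part_num: read "parted DEV -ms unit s p" output on stdin and print
# the number of the last partition listed (first colon-separated field
# of the final line).
last_part_num() {
    tail -n 1 | cut -f 1 -d:
}

# is_last_partition PARTITION: read parted output on stdin; succeed only if
# PARTITION is the last one on the disk, since only that one can be grown
# into the free space at the end.
is_last_partition() {
    [ "$(part_num_of "$1")" = "$(last_part_num)" ]
}
```

In the real flow the parted output would come from something like `parted /dev/mmcblk0 -ms unit s p | is_last_partition /dev/mmcblk0p2`, which corresponds to the `'[' 2 -ne 2 ']'` test in the trace.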

@holta
Member Author

holta commented Sep 20, 2022

Next to recreate the dirty filesystem by yanking the power in the middle of the filesystem resizing.

This will be one of the most important tests, Great 🏗️

@jvonau
Contributor

jvonau commented Sep 20, 2022

-- Journal begins at Mon 2022-04-04 14:41:41 UTC, ends at Tue 2022-09-20 19:20:26 UTC. --
-- Boot 78449a0e58c8451eac3518bcd2da2d9d --
Aug 16 20:52:40 box systemd-fsck[152]: e2fsck 1.46.2 (28-Feb-2021)
Aug 16 20:52:40 box systemd-fsck[152]: rootfs: clean, 158902/447040 files, 1272244/1783171 blocks

-- Boot 32eff329181c422eb8907161d6b77328 --
Aug 16 20:52:40 box systemd-fsck[153]: e2fsck 1.46.2 (28-Feb-2021)
Aug 16 20:52:40 box systemd-fsck[153]: Setting free inodes count to 361289 (was 288137)
Aug 16 20:52:40 box systemd-fsck[153]: Setting free blocks count to 818270 (was 527948)
Aug 16 20:52:40 box systemd-fsck[153]: rootfs: clean, 158903/520192 files, 1278882/2097152 blocks

-- Boot bd5e13fd2a86499ea0594268c5fab443 --
Sep 20 19:10:10 box systemd-fsck[154]: e2fsck 1.46.2 (28-Feb-2021)
Sep 20 19:10:10 box systemd-fsck[154]: rootfs: clean, 158940/520192 files, 1288102/2097152 blocks
Sep 20 19:10:10 box systemd[1]: systemd-fsck-root.service: Succeeded.
Sep 20 19:10:10 box systemd[1]: Stopped File System Check on Root Device.
Sep 20 19:10:10 box systemd[1]: systemd-fsck-root.service: Succeeded.
Sep 20 19:10:10 box systemd[1]: Stopped File System Check on Root Device.

-- Journal begins at Mon 2022-04-04 14:41:41 UTC, ends at Tue 2022-09-20 19:25:45 UTC. --
-- Boot 78449a0e58c8451eac3518bcd2da2d9d --
Aug 16 20:52:40 box systemd[1]: Starting Root Filesystem Auto-Expander...
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + '[' -f /.expand-rootfs ']'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'EXPANDING rootfs partition'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: EXPANDING rootfs partition
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'EXPANDING rootfs partition'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: EXPANDING rootfs partition
Aug 16 20:52:41 box iiab-expand-rootfs[166]: ++ findmnt / -o SOURCE -n
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + ROOT_PART=/dev/mmcblk0p2
Aug 16 20:52:41 box iiab-expand-rootfs[168]: ++ lsblk -no pkname /dev/mmcblk0p2
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + ROOT_DEV=/dev/mmcblk0
Aug 16 20:52:41 box iiab-expand-rootfs[170]: ++ echo /dev/mmcblk0p2
Aug 16 20:52:41 box iiab-expand-rootfs[171]: ++ grep -o '[[:digit:]]*$'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + ROOT_PART_NUM=2
Aug 16 20:52:41 box iiab-expand-rootfs[173]: ++ parted /dev/mmcblk0 -ms unit s p
Aug 16 20:52:41 box iiab-expand-rootfs[175]: ++ cut -f 1 -d:
Aug 16 20:52:41 box iiab-expand-rootfs[174]: ++ tail -n 1
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + LAST_PART_NUM=2
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + '[' 2 -ne 2 ']'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + growpart /dev/mmcblk0 2

-- Boot 32eff329181c422eb8907161d6b77328 --
Aug 16 20:52:40 box systemd[1]: Starting Root Filesystem Auto-Expander...
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + '[' -f /.expand-rootfs ']'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'EXPANDING rootfs partition'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: EXPANDING rootfs partition
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'EXPANDING rootfs partition'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: EXPANDING rootfs partition
Aug 16 20:52:41 box iiab-expand-rootfs[166]: ++ findmnt / -o SOURCE -n
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + ROOT_PART=/dev/mmcblk0p2
Aug 16 20:52:41 box iiab-expand-rootfs[168]: ++ lsblk -no pkname /dev/mmcblk0p2
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + ROOT_DEV=/dev/mmcblk0
Aug 16 20:52:41 box iiab-expand-rootfs[170]: ++ echo /dev/mmcblk0p2
Aug 16 20:52:41 box iiab-expand-rootfs[171]: ++ grep -o '[[:digit:]]*$'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + ROOT_PART_NUM=2
Aug 16 20:52:41 box iiab-expand-rootfs[173]: ++ parted /dev/mmcblk0 -ms unit s p
Aug 16 20:52:41 box iiab-expand-rootfs[174]: ++ tail -n 1
Aug 16 20:52:41 box iiab-expand-rootfs[175]: ++ cut -f 1 -d:
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + LAST_PART_NUM=2
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + '[' 2 -ne 2 ']'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + growpart /dev/mmcblk0 2
Aug 16 20:52:41 box iiab-expand-rootfs[180]: NOCHANGE: partition 2 is size 124293087. it cannot be grown
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'RESIZING root filesystem'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: RESIZING root filesystem
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'RESIZING root filesystem'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: RESIZING root filesystem
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'RESIZING root filesystem'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: RESIZING root filesystem
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'RESIZING root filesystem'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: RESIZING root filesystem
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'RESIZING root filesystem'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: RESIZING root filesystem
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'RESIZING root filesystem'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: RESIZING root filesystem
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'RESIZING root filesystem'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: RESIZING root filesystem
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'RESIZING root filesystem'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: RESIZING root filesystem
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + echo 'RESIZING root filesystem'
Aug 16 20:52:41 box iiab-expand-rootfs[162]: RESIZING root filesystem
Aug 16 20:52:41 box iiab-expand-rootfs[162]: + resize2fs /dev/mmcblk0p2
Aug 16 20:52:41 box iiab-expand-rootfs[221]: resize2fs 1.46.2 (28-Feb-2021)
Aug 16 20:52:42 box iiab-expand-rootfs[162]: /usr/sbin/iiab-expand-rootfs: line 70: 221 Segmentation fault resize2fs $ROOT_PART
Aug 16 20:52:42 box iiab-expand-rootfs[162]: + rc=139
Aug 16 20:52:42 box iiab-expand-rootfs[162]: + '[' 139 -eq 0 ']'
Aug 16 20:52:42 box systemd[1]: Finished Root Filesystem Auto-Expander.

-- Boot bd5e13fd2a86499ea0594268c5fab443 --
Sep 20 19:10:10 box systemd[1]: Starting Root Filesystem Auto-Expander...
Sep 20 19:10:10 box iiab-expand-rootfs[163]: + '[' -f /.expand-rootfs ']'
Sep 20 19:10:10 box iiab-expand-rootfs[163]: + echo 'EXPANDING rootfs partition'
Sep 20 19:10:10 box iiab-expand-rootfs[163]: EXPANDING rootfs partition
Sep 20 19:10:10 box iiab-expand-rootfs[163]: + echo 'EXPANDING rootfs partition'
Sep 20 19:10:10 box iiab-expand-rootfs[163]: EXPANDING rootfs partition
Sep 20 19:10:10 box iiab-expand-rootfs[167]: ++ findmnt / -o SOURCE -n
Sep 20 19:10:10 box iiab-expand-rootfs[163]: + ROOT_PART=/dev/mmcblk0p2
Sep 20 19:10:10 box iiab-expand-rootfs[168]: ++ lsblk -no pkname /dev/mmcblk0p2
Sep 20 19:10:10 box iiab-expand-rootfs[163]: + ROOT_DEV=/dev/mmcblk0
Sep 20 19:10:10 box iiab-expand-rootfs[171]: ++ echo /dev/mmcblk0p2
Sep 20 19:10:10 box iiab-expand-rootfs[172]: ++ grep -o '[[:digit:]]*$'
Sep 20 19:10:10 box iiab-expand-rootfs[163]: + ROOT_PART_NUM=2
Sep 20 19:10:10 box iiab-expand-rootfs[175]: ++ parted /dev/mmcblk0 -ms unit s p
Sep 20 19:10:10 box iiab-expand-rootfs[176]: ++ tail -n 1
Sep 20 19:10:10 box iiab-expand-rootfs[177]: ++ cut -f 1 -d:
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + LAST_PART_NUM=2
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + '[' 2 -ne 2 ']'
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + growpart /dev/mmcblk0 2
Sep 20 19:10:11 box iiab-expand-rootfs[181]: NOCHANGE: partition 2 is size 124293087. it cannot be grown
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + echo 'RESIZING root filesystem'
Sep 20 19:10:11 box iiab-expand-rootfs[163]: RESIZING root filesystem
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + echo 'RESIZING root filesystem'
Sep 20 19:10:11 box iiab-expand-rootfs[163]: RESIZING root filesystem
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + echo 'RESIZING root filesystem'
Sep 20 19:10:11 box iiab-expand-rootfs[163]: RESIZING root filesystem
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + echo 'RESIZING root filesystem'
Sep 20 19:10:11 box iiab-expand-rootfs[163]: RESIZING root filesystem
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + echo 'RESIZING root filesystem'
Sep 20 19:10:11 box iiab-expand-rootfs[163]: RESIZING root filesystem
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + echo 'RESIZING root filesystem'
Sep 20 19:10:11 box iiab-expand-rootfs[163]: RESIZING root filesystem
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + echo 'RESIZING root filesystem'
Sep 20 19:10:11 box iiab-expand-rootfs[163]: RESIZING root filesystem
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + echo 'RESIZING root filesystem'
Sep 20 19:10:11 box iiab-expand-rootfs[163]: RESIZING root filesystem
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + echo 'RESIZING root filesystem'
Sep 20 19:10:11 box iiab-expand-rootfs[163]: RESIZING root filesystem
Sep 20 19:10:11 box iiab-expand-rootfs[163]: + resize2fs /dev/mmcblk0p2
Sep 20 19:10:11 box iiab-expand-rootfs[222]: resize2fs 1.46.2 (28-Feb-2021)
Sep 20 19:11:44 box iiab-expand-rootfs[222]: Filesystem at /dev/mmcblk0p2 is mounted on /; on-line resizing required
Sep 20 19:11:44 box iiab-expand-rootfs[222]: old_desc_blocks = 1, new_desc_blocks = 4
Sep 20 19:11:44 box iiab-expand-rootfs[222]: The filesystem on /dev/mmcblk0p2 is now 15536635 (4k) blocks long.
Sep 20 19:11:44 box iiab-expand-rootfs[163]: + rc=0
Sep 20 19:11:44 box iiab-expand-rootfs[163]: + '[' 0 -eq 0 ']'
Sep 20 19:11:44 box iiab-expand-rootfs[163]: + rm -f /.expand-rootfs /.resize-rootfs
Sep 20 19:11:44 box systemd[1]: Finished Root Filesystem Auto-Expander.
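The tail of the trace above (rc=0, the test, then rm -f) reflects the idea in this PR's title: only delete the flag file once the resize has actually succeeded. Here is a minimal sketch of that pattern, not the actual /usr/sbin/iiab-expand-rootfs. The temp-dir flag path and the resize_step function are hypothetical stand-ins so the snippet is harmless to run anywhere:

```shell
#!/bin/bash
# Hypothetical stand-in for /.expand-rootfs, kept in a temp dir for safety.
workdir=$(mktemp -d)
flag="$workdir/.expand-rootfs"
touch "$flag"

# Hypothetical stand-in for "resize2fs /dev/mmcblk0p2".
# Change "true" to "false" to simulate a failed resize.
resize_step() { true; }

rc=0
resize_step || rc=$?       # record the exit code without aborting the script

if [ "$rc" -eq 0 ]; then
    rm -f "$flag"          # success: nothing left to do on the next boot
else
    echo "resize failed (rc=$rc); keeping $flag so the next boot retries" >&2
fi
```

With the stand-in succeeding, the flag file is removed. If it fails, the flag survives and a later boot can try again.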

@holta
Member Author

holta commented Sep 22, 2022

Thanks @tim-moody for the excellent suggestion to reboot immediately. Thankfully the reboot is almost instantaneous (i.e. invisible) to the operator:

@holta
Member Author

holta commented Sep 22, 2022

Good progress during today's call (http://minutes.iiab.io).

@jvonau recommends further improving the After= and Before= lines in roles/1-prep/templates/iiab-expand-rootfs.service so that fsck race conditions are genuinely under control.

Sounds great. Hopefully that converges in coming days and this PR can be merged at that point.

@jvonau
Contributor

jvonau commented Sep 22, 2022

To gather as much info as needed, could the exit codes from growpart and resize2fs be echoed back for better data capture going forward? Just having rc=$? after each would be enough to view the return code when using 'bash -x':

+ growpart /dev/mmcblk0 2
NOCHANGE: partition 2 is size 124293087. it cannot be grown
+ rc=1

The filesystem on /dev/mmcblk0p2 is now 15536635 (4k) blocks long.
+ rc=0
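One caveat when combining this with the exit-on-error mode from this PR's title: under "bash -xe", a bare rc=$? on the line after a failing command is never reached, because the script exits first. A rough sketch of one way around that follows. The run_rc helper is hypothetical (not in the PR), and true/false stand in for growpart and resize2fs so the snippet runs anywhere:

```shell
#!/bin/bash
set -e   # same exit-on-error behaviour as "bash -e"; add -x for tracing

# Hypothetical helper: run a command, store its exit code in the global
# variable rc, and echo it. A command on the left of "||" does not trigger
# "set -e", so a non-zero status is recorded instead of ending the script.
run_rc() {
    rc=0
    "$@" || rc=$?
    echo "rc=$rc"
}

run_rc false      # stand-in for a command that exits 1, like the NOCHANGE case
first_rc=$rc

run_rc true       # stand-in for a command that exits 0
second_rc=$rc
```

The script can then decide per step whether a given non-zero code should abort or merely be logged.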

@jvonau
Contributor

jvonau commented Sep 23, 2022

raspi-config method failure on current image noted at #3375 (comment). Think your call to raspi-config might be incomplete, I didn't see the correct init= line get written to /boot/cmdline.txt when I used the same syntax from the command line.

@jvonau
Contributor

jvonau commented Sep 24, 2022

Given that upstream has updated the routine used during firstboot to support raspi-imager's seeding of system values (written to /boot/firstrun.sh, i.e. country code, ssid/pw, ssh), perhaps it would be wise to have the prefabbed images behave the same way? You gain the very valuable visual user facing feedback at the monitor that the filesystem is being tinkered with and should wait until complete while the wanted customization is being performed, with the auto reboot at the end. You lose the ability to ssh in until after the auto reboot takes place if it is called as init= in /boot/cmdline.txt, but I don't see a reason why calling firstboot in place of raspi-config from iiab-expand-rootfs would not work. I'm generally in favor of doing things the same way as upstream, so that the user experience is consistent whether it's a stock image or a modified one, as people come to expect.

@holta
Member Author

holta commented Sep 24, 2022

You gain the very valuable visual user facing feedback at the monitor that the filesystem is being tinkered with and should wait

  1. Preserving modularity thanks to a single command (raspi-config --expand-rootfs or similar) should allow us to contain (entirely hand off!) creeping complexity year-by-year — delegating that to the upstream maintainer.

  2. As such, should we perhaps advocate for upstream (maintainers of raspi-config --expand-rootfs) to use their own organization's firstboot /boot/firstrun.sh mechanism to beef this up?

@holta
Member Author

holta commented Sep 24, 2022

raspi-config method failure on current image noted at #3375 (comment). Think your call to raspi-config might be incomplete, I didn't see the correct init= line get written to /boot/cmdline.txt when I used the same syntax from the command line.

Is there evidence of a bug in raspi-config --expand-rootfs?

Or enough to try to find reproducer pattern(s)?!

@holta
Member Author

holta commented Sep 25, 2022

@jvonau recommends further improving the After= and Before= lines in roles/1-prep/templates/iiab-expand-rootfs.service so that fsck race conditions are genuinely under control.

@jvonau should this PR be merged now — or do the After= and Before= lines need to be refined in your opinion?

@avni
Member

avni commented Sep 26, 2022

@jvonau should this PR be merged now — or do the After= and Before= lines need to be refined in your opinion?

From 9/26: @jvonau confirms that no additional changes are required to the After= and Before= lines. He also confirmed that the After= line is more important than the Before= line regarding avoiding the race condition. /cc @holta
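For anyone wanting to double-check how the After= and Before= lines were resolved on a running system, a small sketch like the following may help. It assumes the unit is installed under the name iiab-expand-rootfs.service; the show_ordering function name is just illustrative:

```shell
#!/bin/bash
# Print the ordering dependencies systemd has resolved for a unit.
unit="iiab-expand-rootfs.service"

show_ordering() {
    # "systemctl show -p" prints Property=value lines for the named properties.
    systemctl show --no-pager -p After -p Before "$1"
}

if command -v systemctl >/dev/null 2>&1; then
    show_ordering "$unit" || echo "could not query $unit"
else
    echo "systemctl not found; run this on the IIAB machine itself"
fi
```

The After= value is the one to read first here, since it was confirmed above as the more important of the two for avoiding the race.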

@holta holta merged commit 7bb5e1b into iiab:master Sep 26, 2022
@holta
Member Author

holta commented Sep 26, 2022

Future Work, if/as Raspberry Pi OS evolves, e.g. to allow visual indication of ongoing progress during rootfs expansion — and better blocking during boot process (to fully protect rootfs):
