@danzatt danzatt commented Mar 17, 2025

There is a bug in coreos baselayout which wipes
/etc/{group,gshadow,passwd,shadow} when
reinstalling/removing/upgrading the baselayout package.

The deleted files are touched in the staging area, so the package ships empty configuration files, which overwrite the original configs on the system. Instead, we move the touch to postinst, which only touches the existing files (or creates them if they don't exist).

This is blocking the NVIDIA sysext work, as the nvidia-drivers ebuild assumes the video group exists in /etc/group.

Note: since upgrading the baselayout package breaks the SDK container because of this bug, we likely also need to rebuild the SDK.
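
The postinst behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual ebuild code: the function name touch_account_files and its root argument are hypothetical stand-ins for pkg_postinst and Portage's ${ROOT}.

```shell
# Hypothetical stand-in for the ebuild's pkg_postinst step. The function name
# and the "root" argument are assumptions made for this sketch.
# The body runs in a subshell "( ... )" so its variables stay local.
touch_account_files() (
    root="${1:-/}"
    mkdir -p "${root%/}/etc" || exit 1
    for f in group gshadow passwd shadow; do
        # touch creates a missing file and only updates the timestamp of an
        # existing one, so existing contents are never truncated.
        touch "${root%/}/etc/${f}" || exit 1
    done
)
```

Because nothing is shipped in the package image for these paths, removing or upgrading the package has no packaged copies of them to unmerge or overwrite.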


@danzatt danzatt requested review from chewi and krnowak March 17, 2025 16:04

@chewi chewi left a comment


I'm not sure how this broke in the SDK, but I think I can guess. If building a new SDK works, that should be fine.


danzatt commented Mar 18, 2025

The reason is that when build_packages is called, it invokes update_chroot, which in turn runs emerge --update --newuse --verbose --with-bdeps=y --select coreos-devel/sdk-depends world. This causes the baselayout package to be upgraded, which first removes the old baselayout; because config protection is turned off, the removal also wipes the /etc config files. That removes the portage group, after which emerge seems to stop working and aborts. If you try reinstalling the baselayout package (even without this PR), the same thing happens.

If I re-enable config protection for /etc, the baselayout upgrade finishes just fine.

index f64005c965..dbb2d60899 100755
--- a/update_chroot
+++ b/update_chroot
@@ -263,7 +263,7 @@ remove_hard_blocks \
 # Second pass, update everything else.
 EMERGE_FLAGS+=( --deep )
 info "Updating all SDK packages"
-sudo -E ${EMERGE_CMD} "${EMERGE_FLAGS[@]}" \
+CONFIG_PROTECT="$CONFIG_PROTECT /etc" sudo -E ${EMERGE_CMD} "${EMERGE_FLAGS[@]}" \
     coreos-devel/sdk-depends world
 
 info "Removing obsolete packages"
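
A quick way to tell whether a chroot has been hit by the wipe described above is to look for missing or empty account files. The helper below is only a sketch: the name find_wiped_account_files and its root argument are hypothetical, and it assumes that none of the four files is legitimately empty on the system being checked.

```shell
# Hypothetical helper: prints each account file under the given root that is
# missing or empty, and returns non-zero if any such file was found.
# The body runs in a subshell "( ... )" so its variables stay local.
find_wiped_account_files() (
    root="${1:-/}"
    status=0
    for f in group gshadow passwd shadow; do
        path="${root%/}/etc/${f}"
        # -s is true only for an existing file with a size greater than zero.
        if [ ! -s "${path}" ]; then
            echo "${path}"
            status=1
        fi
    done
    exit "${status}"
)
```

Calling find_wiped_account_files with no argument checks the running system; passing a directory checks a chroot rooted there.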

So I am not sure what the right approach would be: add this patch to this PR, or rebuild the SDK container with the new baselayout.


chewi commented Mar 18, 2025

Nightly builds always rebuild the SDK. This will only break development branches where someone has rebased on this change before the next nightly is published. This temporary inconvenience can be avoided by giving everyone a heads up. I'd prefer that over enabling config protection, which has consequences.

@github-actions
Build action triggered: https://github.com/flatcar/scripts/actions/runs/13920689370


@krnowak krnowak left a comment


Thanks. Updating packages in place is barely tested in Flatcar (read: not at all), as it normally affects only the SDK. I think creating the empty files should have been in pkg_postinst from the beginning anyway.
