Since e2fsprogs-1.47.0-1.fc39 landed in Rawhide, cannot rebase from a fresh Rawhide Silverblue install to any earlier release #470

Closed
AdamWill opened this issue May 25, 2023 · 15 comments
Labels: bug, rawhide

Comments

@AdamWill

This issue tracker is intended only for Silverblue specific issues. We would like to ask you to try to reproduce the issue on a relevant Fedora Workstation release. If you will be able to reproduce there, then please report it in Red Hat Bugzilla (see How to file a bug) or in upstream (preferred for GNOME projects) and not in this issue tracker.

Describe the bug
e2fsprogs-1.47.0-1.fc39 seems to create ext4 filesystems with some new feature. This means earlier versions of e2fsprogs can't fsck them. Because an fsck on the /boot partition is a required element during system startup, if we rebase from a Rawhide Silverblue install to any earlier release, boot fails because fsck of /boot fails.

To Reproduce
Please describe the steps needed to reproduce the bug:

  1. Install a recent Fedora Rawhide Silverblue (this is actually a bit tricky as official compose installer image build is currently failing; I'm seeing this in openQA, which builds its own ostrees and installer images). You need to use an installer with e2fsprogs-1.47.0-1.fc39 to see the issue.
  2. Rebase to an earlier release
  3. Try and boot

Expected behavior
The system should boot successfully

Screenshots
[Screenshot: 38fail]

@sandeen is there anything we can do here? can we somehow have e2fsprogs not create non-backward-compatible filesystems, or are we stuck with it?

AdamWill added the bug label May 25, 2023
@AdamWill
Author

Per https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1031622 , this is due to the 'orphan_file' feature that is enabled by default in 1.47.0. It seems any earlier version of e2fsprogs cannot fsck a filesystem which was created with the orphan_file feature turned on.

Could we perhaps disable it by default for a release or two, @sandeen , to avoid problems like this? Maybe only make it default once F39 is the oldest supported release?
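To make the failure mode above concrete, here is a minimal sketch of how one might check whether a filesystem carries the feature before rebasing. It assumes you have already captured the superblock summary that `dumpe2fs -h <device>` prints, and that the summary contains a `Filesystem features:` line with space-separated feature names. The function names are hypothetical and not part of e2fsprogs.

```python
def filesystem_features(dumpe2fs_header: str) -> set[str]:
    """Return the feature names from the 'Filesystem features:' line.

    `dumpe2fs_header` is assumed to be the text printed by `dumpe2fs -h`.
    An empty set is returned when no such line is found.
    """
    for line in dumpe2fs_header.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Filesystem features":
            return set(value.split())
    return set()


def has_orphan_file(dumpe2fs_header: str) -> bool:
    """True if the filesystem was created with orphan_file enabled.

    Per the Debian bug linked above, e2fsprogs releases older than
    1.47.0 cannot fsck such a filesystem.
    """
    return "orphan_file" in filesystem_features(dumpe2fs_header)
```

If `has_orphan_file` returns true for the /boot partition, a rebase to a release shipping an older e2fsprogs would be expected to hit the boot-time fsck failure described in this issue.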

@travier
Member

travier commented May 26, 2023

This is a "pure" downgrade scenario as this will not impact users that upgrade from an older release to rawhide and then go back to a stable release.

I don't think we should care about that. We should probably change the test to do something else.

@AdamWill
Author

well, I could see someone installing Rawhide to try it out, encountering a bug, and wanting to just rebase down to 38 to avoid the problem, but OK. There is nothing else the test can do, that I can think of. The requirement is "check that rebasing from Rawhide works". I'm not aware of anything else the test can rebase to that would work.

@travier
Member

travier commented May 26, 2023

We should support the reverse: Installing F38 and then updating to Rawhide and then rollback to F38.

I don't think we can ever support installing from a newer Fedora release and then downgrading to an older release.

@AdamWill
Author

The point isn't to test any specific rebase operation, but to test that the rebase mechanism itself works. If we don't test a rebase from Rawhide to something, the rebase code in Rawhide could be broken for months and we would never know until it branches.

@travier
Member

travier commented May 26, 2023

Yes, for that test to mean something, we need to build a new image or a layered one and rebase to it.

@AdamWill
Author

For update tests at least it might be possible to test rebasing to the 'stock' Rawhide (not the custom ostree we build for testing the update). That wouldn't work for compose tests, though, and I don't really want to have to stuff another image creation step in there (it takes a long time).

Anyhow, I've filed an upstream issue to see if the sudden cut-off here was intended - tytso/e2fsprogs#147

@jlebon
Member

jlebon commented May 26, 2023

For testing the rebase mechanism, we sometimes create a branch locally and rebase to it. E.g. https://github.com/coreos/rpm-ostree/blob/5cc3c84b97199abbd72268dd4a4f963a6215cc0d/tests/vmcheck/test-layering-unified.sh#L75-L91

It's synthetic and not super realistic (e.g. it doesn't test remote bits), but it's still better than nothing.

@AdamWill
Author

well, while we're talking workarounds, is it possible to overlay after rebasing? or if I overlay, then rebase, what happens to the overlay?

I'm thinking, of course, about overlaying a newer e2fsprogs build onto the rebased system. For now I can work around this by having the 'create the installer' step of the test use a scratch build of e2fsprogs 1.47.0 with the feature disabled, but that's somewhat fragile (any time e2fsprogs is updated, I have to redo the scratch build).

@AdamWill
Author

FWIW, I figured out a better generic workaround for this for the update tests at least. The update tests build their own ostree (with the update packages included), so I've made that test give it a unique ref, not just name it the same as the official ref it's based off. So now we can verify that we installed our custom ref, and test rebasing to the official ref (we don't need to rebase to a different version).

This doesn't help the compose tests, which by design are testing the official images that deploy the official refs. But at least it solves the problem for update tests...

@ziswiler

I am a long-time Silverblue user and find this bug very, very inconvenient. I just got myself a new notebook and installed 39 on it. Unfortunately, there are some other issues with 39, so I thought, well, it's Silverblue, I'll just rebase to whatever and be done with it. Nope, rebasing to 38 doesn't even boot! WTF! BTW: rebasing to Rawhide works, but that has the same other issue, so it won't help me any...

@sandeen

sandeen commented Nov 26, 2023

Uff, I had missed these notifications. Too many bug/issue trackers.

Sure, you can turn on or off any on-disk features you want, if you want to override mkfs defaults, something anaconda normally doesn't want to do. But as others have said, rebasing down from rawhide doesn't seem like a normal (or even supported?) workflow. I don't know if mkfs parameters can be passed via kickstart; if so, that might be one option.

It was a bit antisocial for upstream to introduce the orphan_file feature and make it default all in the same release. That said, rawhide is supposed to be the place for the latest & greatest package versions and features. It's not a given that those packages will always be downgradeable to prior versions ...

In XFS land, for example, we'd add new features as available but non-default for several releases before making them default, to avoid this sort of problem and at least allow backwards compatibility across a few older versions.
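As a rough illustration of overriding the mkfs defaults mentioned above, the sketch below builds (but does not run) an `mke2fs` command line that turns the feature off via `-O ^orphan_file`. The helper name and the example device path are hypothetical, and whether anaconda or kickstart can pass such options through is not established in this thread.

```python
def mkfs_ext4_command(device: str,
                      disabled_features: tuple[str, ...] = ("orphan_file",)) -> list[str]:
    """Build an mke2fs argument list with the given features disabled.

    mke2fs accepts `-O ^name` to clear a feature that would otherwise be
    enabled by default; several features are joined with commas.
    The command is only constructed here, never executed.
    """
    command = ["mke2fs", "-t", "ext4"]
    if disabled_features:
        command += ["-O", ",".join(f"^{name}" for name in disabled_features)]
    command.append(device)
    return command


# Hypothetical device path, for illustration only.
print(" ".join(mkfs_ext4_command("/dev/vda2")))
```

Returning a list rather than a shell string keeps the `^` from being touched by a shell and lets a caller hand the result straight to `subprocess.run`.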

@ziswiler

Note that this has nothing to do with Rawhide. This is a serious production issue as 39 can NOT be downgraded to 38.

@travier
Member

travier commented Nov 27, 2023

Rebasing to an older version is not supported: #470 (comment)

@travier
Member

travier commented Nov 27, 2023

Going to close this one as this is not something we can support. The e2fsprogs issue is tracked in tytso/e2fsprogs#147.

travier closed this as not planned on Nov 27, 2023