dkms fails on kernel 5.18.0+ #13562

Closed
mcondarelli opened this issue Jun 15, 2022 · 21 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@mcondarelli

System information

Type Version/Name
Distribution Name Debian
Distribution Version Sid
Kernel Version 5.17.0-3-amd64
Architecture x86_64
OpenZFS Version zfs-2.1.4-1+b1

Describe the problem you're observing

dkms installation fails.

System was trying to create kernel modules for new kernel (5.18.0)

DKMS make.log for zfs-2.1.4 for kernel 5.18.0-1-amd64 (x86_64)

Rebooting with new kernel leads to non-working system (I have /home in zpool).

Going back to the previous kernel (the one in "Kernel Version") "resuscitates" my workstation.

I stopped upgrading, but that is only a temporary measure, of course.
Any hint about how to proceed would be most welcome.

Describe how to reproduce the problem

sudo apt update && sudo apt upgrade -y

Include any warning/errors/backtraces from the system logs

I attach the full make.log.

The relevant issue is apparently that the kernel function bio_alloc() changed its signature
and now takes 4 parameters instead of the previous 2.

There are other errors, they seem to be "just warnings" (but could be relevant
nonetheless)
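
For anyone triaging a similar build failure, a small sketch like the following may help separate hard errors from warnings in the DKMS log. The function names `dkms_log_errors` and `dkms_log_warning_count` are hypothetical, and the path in the usage comment assumes the usual DKMS layout, which may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: list compiler errors (not warnings) from a DKMS make.log.
# Usage (path is an assumption based on the usual DKMS layout):
#   dkms_log_errors /var/lib/dkms/zfs/2.1.4/build/make.log
dkms_log_errors() {
    log=$1
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # GCC reports hard failures as "error:"; -n prefixes each match with
    # its line number in the log so it is easy to find the context.
    grep -n 'error:' "$log"
}

# Hypothetical helper: count the warning lines in the same log.
dkms_log_warning_count() {
    grep -c 'warning:' "$1"
}
```

If the first function prints a line mentioning bio_alloc(), that would be consistent with the signature change described above; the remaining matches can then be checked one by one.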

@mcondarelli mcondarelli added the Type: Defect Incorrect behavior (e.g. crash, hang) label Jun 15, 2022
@rhalualani

I think the easiest way to get dkms to build is:
1: In module/icp/algs/edonr/edonr.c, at the first "case 512:" statement,
hashState224(state)->DoublePipe should instead be hashState512(state)->DoublePipe.
2: For the other errors, the easiest way to get it to build is:
In /usr/src/kernels/5.18...../include/linux/compiler_attributes.h around line 346
Change: define __compiletime_warning(msg) __attribute__((__warning__(msg)))
to: define __compiletime_warning(msg) (leave off the attribute stuff)
It will then hopefully build.

@rhalualani

Of course, my kernels are Fedora 5.18.xxx Oracle8 and CentOS8.

@rincebrain
Contributor

Yes, 2.1.4 is marked as supporting up through 5.17, and the upcoming 2.1.5 has 5.18 and 5.19 build fixes. So if you absolutely must run 5.18 right now, I'd cherrypick them from the upcoming 2.1.5.
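
A check along these lines could be run before rebooting into a new kernel. It is only a sketch: `kernel_supported` is a hypothetical helper, the maximum version (5.17 for zfs-2.1.4) is passed in by hand based on the comment above, and `sort -V` assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel's major.minor version is not
# newer than the maximum version the ZFS release is marked as supporting.
kernel_supported() {
    kernel=$1   # e.g. the output of `uname -r`
    max=$2      # e.g. 5.17 for zfs-2.1.4, as stated in the comment above
    # Reduce "5.18.0-1-amd64" to "5.18".
    kmm=$(printf '%s\n' "$kernel" | cut -d. -f1,2)
    # Version-sort both values; the kernel is supported only if the
    # maximum is still the newest of the two.
    newest=$(printf '%s\n%s\n' "$kmm" "$max" | sort -V | tail -n 1)
    [ "$newest" = "$max" ]
}

if kernel_supported "$(uname -r)" 5.17; then
    echo "running kernel is within the range zfs-2.1.4 is marked as supporting"
else
    echo "running kernel is newer than 5.17; a zfs-2.1.4 DKMS build may fail"
fi
```

The comparison deliberately ignores the patch level and distribution suffix, since the support statement in this thread is given only as a major.minor version.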

@rhalualani

In addition I also ran
git clone https://github.com/zfsonlinux/zfs.git
to get latest zfs-2.1.99 which should eventually become 2.1.5

@rincebrain
Contributor

No, 2.1.99 is neither 2.1.5 nor latest git. It's a tag to work around a flaw in how the packaging calculation works. Nor is latest git going to become 2.1.5; that's not how point releases of branches work here.

@extralight

I am having the same problem on a debian 12 (testing) on kernel 5.18.

Since this is critical, could you please tell me if there is an eta on when 2.1.5 would be available on debian testing?

@rincebrain
Contributor

I would assume sometime after all the CI is passing on the zfs-2.1.5-staging branch.

If it's critical I would suggest you either not use 5.18 or cherrypick the fixes for 5.18 compatibility into 2.1.4.

@extralight

I would assume sometime after all the CI is passing on the zfs-2.1.5-staging branch.

What I meant was, will the release of 2.1.5 into debian testing happen in days or would it be weeks or even months?

Also what is CI?

@rincebrain
Contributor

That's a question for the Debian package maintainers, OpenZFS upstream doesn't maintain the packages in Debian.

CI, or Continuous Integration, was being used here to describe the set of builds and tests that are automatically run regularly on PRs. Commonly, all the builds succeeding and the tests passing would be expected before integrating the PR.

@gertvdijk

Work is being done on this; please use the search, several issues have been raised already.

2.1.5 release with Linux 5.18 & 5.19 compatibility is in review: ➡ #13532 ⬅️ you want to subscribe to this one. Or even better; review the PR and test it. 😉

(The release 2.1.4 for which this report is addressed clearly indicates it was only tested up to Linux 5.17.)

@HankB

HankB commented Jun 19, 2022

I'm not sure if this is of any help, but I see the same issue on Debian Bookworm (testing) with 2.1.4 and 5.18. Build log is at https://paste.debian.net/1244546. Command to reproduce this is

sudo dpkg-reconfigure linux-image-5.18.0-1-amd64 linux-headers-5.18.0-1-amd64

Severity is minimal for me - I can just boot the previous kernel until this is resolved.

Thanks!

@RAMChYLD

I'm affected on more than one front.

On OpenSuSE Tumbleweed, ZFS is officially broken as it has moved on to using kernel 5.18. I assume ZFS on Tumbleweed is automated somewhat because the kernel packages stopped coming as soon as the rollover to 5.18 occurred; I assume this means the zfs modules have stopped building and no one was alerted to it. Even worse, OpenSuSE Tumbleweed no longer offers DKMS modules; the last DKMS zfs package was for 0.7.0.

On Ubuntu, Liquorix kernel users are affected, since Liquorix rolled over to 5.18 over the past two days.

At the moment, the best I could do was clone the git and try to build the DKMS package. On OpenSUSE there is an error in regards to where the dkms keeps its postinst scripts (/usr/libexec/dkms instead of /usr/lib/dkms) but it is otherwise not picky.

However, on Ubuntu, it's a nightmare. dpkg would flag the deb file as in conflict with its own and will need to be forced. And even after that, apt stops working until the offending package is removed.

Ironically, the one that gives me exactly zero issues is Arch. Switching to dkms-git immediately solves all my problems.

So yeah, hoping that 2.1.5 comes out real soon, or if the OpenZFS team can work something out with the two distros. I depend on ZFS heavily, my home directories on both OpenSuSE and Ubuntu machines are on ZFS volumes.

@dreamcat4

@RAMChYLD yes, this problem isn't actually unique to zfs; it also affects the nvidia drivers from time to time. The best thing to do here really is to ask both SUSE Tumbleweed and liquorix to maintain an n-1 minor version strategy on the kernel, so that it remains both possible and convenient to roll back by one minor version at any time. Actually, come to think of it, for liquorix you could probably just manually reinstall the previous version by specifying it to apt on the command line. That might work (so long as the previous packages are still kept available in the PPA). Sorry, I forgot the command line and am too busy to dig it up right now.

@dvogel

dvogel commented Jun 22, 2022

Anyone else coming here with a broken Debian install, this bug has also been reported against the zfs-dkms contrib package. It is frustrating to run into these bugs but Richard Laager has a good explanation for why this occurred despite the Debian maintainers trying to avoid the issue:

I think this is because it Depends: a kernel << 5.18 and not
Conflicts/Breaks a kernel >= 5.18. Since you can install multiple kernel
packages, your existing kernel package is satisfying the dependency.

Therefore you likely only got here because you also have a <=5.17 kernel installed. If that is indeed the case, you can remove the 5.18 kernel:

sudo apt purge linux-image-5.18.*

@mcondarelli
Author

Severity is not "minimal" because Debian routinely removes old kernels on "normal updates" and on dkms systems kernel modules tend to "vanish".

This should be addressed with the utmost urgency.

I will try to mitigate the problem by installing 2.1.5 locally, but that is not going to be "a stroll in the park".

@mcondarelli
Author

@dvogel There is no longer a linux-image-5.17.* on Debian Sid; anyone with the problem had better mark their current kernel as "manually installed" ASAP!
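
One way to do that marking could look like the sketch below. `protect_kernel_cmds` is a hypothetical helper that only prints the commands rather than running them, and it assumes Debian's linux-image-<release> package naming, where the release matches the output of `uname -r`.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the apt-mark commands that would keep
# apt from automatically removing or replacing a given kernel image.
# Assumes Debian's "linux-image-<uname -r>" package naming.
protect_kernel_cmds() {
    release=${1:-$(uname -r)}   # default to the running kernel
    pkg="linux-image-$release"
    # "manual" stops the package from being treated as auto-installed.
    printf 'apt-mark manual %s\n' "$pkg"
    # "hold" keeps apt from automatically upgrading or removing it.
    printf 'apt-mark hold %s\n' "$pkg"
}

# Example with the working kernel from the original report.
protect_kernel_cmds 5.17.0-3-amd64
```

After reviewing the printed commands, they can be run with sudo. Once a fixed zfs-dkms package is installed, `apt-mark unhold` releases the hold again.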

@rincebrain
Contributor

rincebrain commented Jun 22, 2022

(This isn't really the place for Debian-specific advice, but you can always go grab the old kernel packages from snapshot.debian.org if you need them - it's inconvenient, but not "no option if the old version is not installed and removed from the live archive".)

I generally advise not running OpenZFS for production use cases on distros without a fixed kernel release for reasons like this. There's going to be a time lag between new Linux kernel versions and OpenZFS fixes for breakage making it into stable releases, so you're either going to end up manually holding your kernel version back, or explicitly breaking like this, or sometimes in more insidious ways (a few times it's broken so that your performance went horribly but kept running).

(Disclaimer I neglected to post: that's my personal opinion, I have no official position other than being allowed to mark bugs with tags on the project, I couldn't speak for them even if I wanted to.)

@gertvdijk

gertvdijk commented Jun 22, 2022

Severity is not "minimal" because Debian routinely removes old kernels on "normal updates" and on dkms systems kernel modules tend to "vanish".

This should be addressed with the utmost urgency.

You're not supposed to run Debian unstable with rolling kernel updates if you require a stable system. (Or any other rolling release distribution for that matter really.)
Just install Debian stable/Bullseye that currently ships with kernel 5.10.x for such requirements and you'll be fine with ZFS.

Also, this is not a Debian support desk. OpenZFS releases are independent of Debian. Debian is downstream here and includes ZFS for stable releases and they make sure it's working for stable. Running unstable means you're on your own, for testing purposes.

Anyway, in the meantime 2.1.5 is released and you can contact Debian developers to pull in the new upstream version. https://github.com/openzfs/zfs/releases/tag/zfs-2.1.5

@mcondarelli
Author

Thanks @gertvdijk

I am happy to say other developers do not seem to share your attitude.
Problem was completely solved by synergy between OpenZFS developers releasing zfs-2.1.5 at topmost speed (kudos!) and Debian maintainers releasing an update package less than 24 hours after.
Many thanks to everybody, including people commenting here and helping push for a solution.

MANY Thanks!

@bootc

bootc commented Jun 23, 2022

zfs-linux 2.1.5-1 has now been uploaded to Debian so it should appear in sid some time today:
https://tracker.debian.org/news/1339391/accepted-zfs-linux-215-1-source-into-unstable/

Speaking as a Debian Developer (but not involved in the kernel or ZFS), the fact that ZFS broke on Debian with a new kernel is neither Debian's problem nor ZFS upstream's. If you choose to run unstable/sid or even testing/bookworm at the moment you are signing up to test what might go into a future Debian stable release. It's not supported and the only real help you'll get is by sending bug reports and being patient.

I'm sure the OpenZFS folks try their best to release versions of ZFS with support for new kernels as soon as possible. The zfs-linux Debian developers similarly, I'm sure, try their best to upload packages with new upstream releases as soon as they can. The Linux kernel folks in Debian don't give a hoot about out-of-tree modules such as ZFS not being ready for their new kernels, and nor should they.

If you run sid or testing and find ZFS doesn't build against a new kernel, be patient and stick with the old kernel until things fall into place. That's all there is to it.
