modprobe fails with "Exec format error" and "Skipping invalid relocation target, existing value is nonzero for type 1" #14398
Comments
Maybe a compiler version mismatch between the version the kernel was compiled with and the one used to compile the modules?
@AllKind don't think so...
I see a couple of people posting that reinstalling their kernel headers packages and rebuilding fixed this for them. You could try that.
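One quick thing to rule out after reinstalling headers and rebuilding is a module built for a different kernel release than the one running. The following is a minimal sketch, assuming `modinfo` is available and the module is installed under the name `zfs`; the helper name `vermagic_matches` is hypothetical and not part of any tool. It is only a sanity check on the headers used for the build and may well pass in a case like the one reported here.

```shell
# Hypothetical helper: compare the kernel release a module was built for
# (the first field of its vermagic string) with a given kernel release.
vermagic_matches() {
    # $1: vermagic string, e.g. "<release> SMP mod_unload modversions"
    # $2: kernel release, e.g. the output of `uname -r`
    built_for=${1%% *}
    [ "$built_for" = "$2" ]
}

# Only run the live check when modinfo exists on this machine.
if command -v modinfo >/dev/null 2>&1; then
    if vermagic_matches "$(modinfo -F vermagic zfs 2>/dev/null)" "$(uname -r)"; then
        echo "zfs module was built for the running kernel"
    else
        echo "zfs module vermagic does not match $(uname -r) (or module not found)"
    fi
fi
```

If the two releases differ, the headers used for the build were probably not the ones belonging to the running kernel.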
@rincebrain Ok thanks... I wiped out everything including the zfs-2.1.7 source tree. Since this kernel is so old, I had to define Debian snapshot repositories (very cool I might add and thank you Debian for keeping old code). Then, I reinstalled the headers, followed by the OP commands. The results are the same, though, unfortunately. Commands and output for: cleanup, header reinstall, zfs rebuild+reinstall
mod commands and output
Any particular reason you're running an ancient snapshot kernel? I don't have any particular reason to think a newer one would help here, just wondering since you mentioned reaching into snapshot.debian.org to get it, and buster is up to 4.19.0-23...
@rincebrain Not really, that was just the last time I did a full upgrade. I could even go to 5.10 from buster-backports, too. If a new kernel or reboot wouldn't help then I can keep enjoying my uptime. Frankly, I don't even understand what the error means in the first place. I'll keep doing research or try going back to the old ZFS version. Maybe I should post about it in the kernel community to see if anyone there has insights?
I can't say it would or wouldn't. I've personally come to dislike the notion of long monolithic server uptimes as a feature, for similar reasons to why people dislike having special snowflake servers with no reproducibility in their setup - the longer it's been since you checked that booting works, the more likely it is something broke, and if it breaks, you're going to be hard-pressed to reproduce every magical thing about how it worked before. But YMMV.
Can't say I've seen that problem ever. If it booted fine 738 days ago it'll boot fine again, but that may be because I practically haven't changed anything. I could maybe see how installing all kinds of updates upon updates and never rebooting might mess something up, but that seems unlikely as well since the last working kernel is always saved as a fallback in such cases... Either way, it's not like I force these sorts of situations, but it is nice to be reminded of how impeccable our electricity, hardware, and software are that provide our valuable services.
Yes, if you don't update, it's unlikely to fail on reboot. My remark was on installing updates without rebooting.
Welp, nobody seems to have a clue what causes these modprobe errors. I started to get the same errors with the QAT modules too, so I also asked Intel about it, as well as Linux-Modules mailing list (you'd think someone there would know). So that's pretty funny. It's probably for the better. This hardware deserved an upgrade after over 2 years of constantly running stale stanky code. So I gave it a new BIOS firmware, new BMC firmware, new bootloaders, new partition tables, a new kernel, upgraded the operating system, upgraded all the software packages, and reinstalled the QAT drivers. Now it's humming along with its QuickAssist engines, waiting for work! but ZFS is still a PITA! 😂 Where do I begin? Well, with the latest 2.1.8 release, I'm running into this nearly 3 year old issue where Finally, I thought I'd give my other QAT+Debian bud @ioguix's idea a try and
Somehow they figured out a way to force insert it... but that was a bad idea because now the QAT is complaining when I tried It just can't handle it (+1350 lines)
In the end, zpool is just hung now. Who knows what's happening to my filesystem.
Oh yes, I'm on to something. It all depends on which freaking message one decides to pursue... I decided to look into After learning more than I probably ever need to know about IOMMU, I found that it was related to Intel Virtualization Technology, which I swear I disabled in the BIOS since we don't use any virtualization on this computer. However, after double-checking, it was enabled! After disabling that and making sure system still booted, I also found the kernel command line option Then I reconfigured and rebuilt the QAT drivers and once again tried
Just going to wait until the task finds some new data to back up and write to the pool, and make sure the compression counters go up, too... Hmm, still waiting, it's a lot of data to read through! Got to go AFK now, but so far everything looks good! Update 1/26: Yes, looks like it is all running finally!
@AGI-chandler In the future, when you stumble on errors like "FATAL: modpost: GPL-incompatible module zfs.ko uses GPL-only symbol 'perf_trace_buf_alloc'" or similar issues having to do with Linux developers changing more and more of the EXPORT_SYMBOL instances to EXPORT_SYMBOL_GPL, do yourself a big favor and simply change CDDL to GPL before building. For more info see: You have every right to do that, also Linux developers shouldn't be upset/bothered with it since only closed source proprietary drivers would have an issue with slapping GPL on their module. CDDL is COPYLEFT just like GPL, and since you are not "distributing", there are zero legal issues.
Thanks @jittygitty I'll definitely remember that because you're right I couldn't care less what the license is, I will make the code work for us. I'm pretty sure our educational use of the code is further protected by "fair use" provisions as well. In the end though, in regards to the original error this issue was opened for: no one seems to actually know what the cause is nor what the solution is, other than shutting down the system and rebooting. I even asked the linux kernel modules mailing list, I thought for sure someone there might know but nope 😂 I think the computer was just tired of running stinky old code so I gave it a bunch of new code everywhere, BIOS firmware, Linux kernel, OS, all the software including ZFS, and didn't need to manually compile ZFS anymore since the Debian zfs-dkms package can automatically pick up the QAT drivers and add that functionality to the modules. Yes it's been humming along for a short 23 days now and already processed nearly 42 TB of written data on behalf of ZFS! Anyway I'm pretty sure most of the errors in this issue have been addressed so guess I can close this.
System information
Describe the problem you're observing
Thought I'd try my luck with upgrading, and it's not in my favor. I've been at it for several hours now and I can't find anyone else out there with these same error messages and environment. After exporting the pool, unloading the old modules, uninstalling the old zfs, downloading the new zfs, configuring, building and installing the debs, I now cannot load the modules. For example:
and the syslog shows:
<datetime> <hostname> kernel: [<uptime>] module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 1, loc 000000002cf3eefb, val ffffffffc093a1f0
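For readers wondering what this log line encodes: the number after "type" is an ELF relocation type, and on x86-64 type 1 is `R_X86_64_64` (a plain 64-bit absolute address). As far as I understand the x86 module loader, it refuses to apply a relocation when the target location already holds a nonzero value, and the load then fails with the "Exec format error" that modprobe reports. The sketch below only decodes the message; the helper names are made up for illustration and the table covers just a few common types. It does not explain why the target was nonzero on this system.

```shell
# Map an x86-64 ELF relocation type number to its name (small subset only).
reloc_name() {
    case "$1" in
        0)  echo R_X86_64_NONE ;;
        1)  echo R_X86_64_64 ;;
        2)  echo R_X86_64_PC32 ;;
        4)  echo R_X86_64_PLT32 ;;
        10) echo R_X86_64_32 ;;
        11) echo R_X86_64_32S ;;
        *)  echo "unknown($1)" ;;
    esac
}

# Pull the relocation type number out of a kernel log line of the form
# "... existing value is nonzero for type N, loc ..., val ...".
reloc_type_from_log() {
    printf '%s\n' "$1" | sed -n 's/.*nonzero for type \([0-9][0-9]*\),.*/\1/p'
}

# The log line from this report, placeholders left as posted.
line='<datetime> <hostname> kernel: [<uptime>] module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 1, loc 000000002cf3eefb, val ffffffffc093a1f0'

reloc_name "$(reloc_type_from_log "$line")"   # prints R_X86_64_64
```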
Describe how to reproduce the problem
# ./autogen.sh
# ./configure --enable-systemd
# make -j16 deb-utils deb-kmod
# dpkg -i *.deb
# modprobe zfs
modprobe: ERROR: could not insert 'zfs': Exec format error
#
Include any warning/errors/backtraces from the system logs
# tail /var/log/syslog
[...]
<datetime> <hostname> kernel: [<uptime>] module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 1, loc 000000002cf3eefb, val ffffffffc093a1f0
#