Unable to start swap-create@ service with zstd compressor #53
Comments
Just curious, which commit? |
Apparently there is no fix in the master branch; it's just that the configurator works from my console, but the service refuses to start. Could you check whether this problem reproduces on your system, @ignatenkobrain? |
I assume you checked the closed issue on zstd with the same error message, but if not, you might want to look at this. |
Thanks for the reply! |
This is a very interesting problem, because if you manually run |
Do you have selinux enabled? Any avcs? |
Hmm, in my tests, the zstd module is automatically loaded when the device with zstd compression is initialized. We could execute |
Of course. In Fedora, SELinux is enabled by default. |
Apparently the kernel sometimes doesn't load the appropriate compression algorithm (see systemd#50, systemd#53). Let's do an explicit modprobe call, similarly to what we do for the zram module itself. The operation is minimalistic, and we only load the modules for algs that were explicitly configured (on the assumption that the default alg has to be loaded already), and not known to the kernel. Hopefully fixes systemd#53.
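The selection step described in this commit message (load only the explicitly configured algorithms that the kernel does not already report) can be sketched roughly as follows. This is a minimal illustration, not the generator's actual code: `algorithms_to_load` and `modprobe_command` are hypothetical names, and the inputs are plain string slices rather than whatever types the project uses. Later comments in this thread question whether the "known to the kernel" list is the right thing to filter against.

```rust
use std::collections::BTreeSet;
use std::process::Command;

/// Returns the configured algorithms that are not in the `known` list,
/// dropping duplicates and keeping the first occurrence of each.
/// Both parameters are hypothetical inputs for this sketch.
fn algorithms_to_load<'a>(configured: &[&'a str], known: &[&str]) -> Vec<&'a str> {
    let mut seen = BTreeSet::new();
    configured
        .iter()
        .copied()
        .filter(|alg| !known.contains(alg))
        .filter(|alg| seen.insert(*alg))
        .collect()
}

/// Builds (but does not run) a `modprobe <module>` invocation.
fn modprobe_command(module: &str) -> Command {
    let mut cmd = Command::new("modprobe");
    cmd.arg(module);
    cmd
}
```

A caller would run `modprobe_command(alg).status()` for each entry returned by `algorithms_to_load` and decide how to treat a non-zero exit status.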
Not an issue on new Ubuntu 20.10 VPS instance:
On a new Fedora 33 instance, I add zstd to the default config:
I noticed that with the Ubuntu setup which is installing from git, that if I initialized with the example config and didn't change to When stopping the |
I raised selinux, but actually I don't think it can be related, since there's no explicit modprobe involved. (And even if an explicit modprobe is done, the generator is running in I think a race condition, i.e. a kernel bug, is the most likely explanation. The PR would resolve the issue in that case. |
I don't get how it would be a race condition? Results are consistent and reproducible between Fedora and Ubuntu as described above. On Fedora zstd is never loaded unless done so explicitly, while Ubuntu was loading it when zstd was not loaded but later configured to. |
At the moment when |
@keszybz I cloned your branch onto a fresh Fedora 33 instance, installed it (and just to make sure I was running your build, added a
I've verified the branch I cloned does have your commits, so I'm not sure that it fixes that issue with Fedora. |
OK, so the issue is not because

    comp = zcomp_create(zram->compressor);
    if (IS_ERR(comp)) {
        pr_err("Cannot initialise %s compressing backend\n",
               zram->compressor);
        err = PTR_ERR(comp);
        goto out_free_meta;
    }

which calls

    struct zcomp *zcomp_create(const char *compress)
    {
        struct zcomp *comp;
        int error;

        if (!zcomp_available_algorithm(compress))
            return ERR_PTR(-EINVAL);

        comp = kzalloc(sizeof(struct zcomp), GFP_KERNEL);
        if (!comp)
            return ERR_PTR(-ENOMEM);

        comp->name = compress;
        error = zcomp_init(comp);
        if (error) {
            kfree(comp);
            return ERR_PTR(error);
        }
        return comp;
    }

If the compression wasn't available, EINVAL would be returned. We get ENOMEM either from |
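The distinction drawn above between EINVAL (algorithm not available to zram) and ENOMEM (an allocation failed during initialisation) could be surfaced on the userspace side along these lines. This is only a sketch: `ZramInitFailure` and `classify` are hypothetical names, and the numeric values 22 and 12 are the Linux errno values for EINVAL and ENOMEM.

```rust
use std::io;

/// Hypothetical diagnosis categories for a failed write to `disksize`.
#[derive(Debug, PartialEq)]
enum ZramInitFailure {
    /// EINVAL: zram does not know the requested algorithm.
    UnknownAlgorithm,
    /// ENOMEM: an allocation failed while setting up the compressor.
    AllocationFailed,
    /// Anything else, including errors without an OS error code.
    Other,
}

/// Maps the raw OS error code of an I/O error to a diagnosis category.
fn classify(err: &io::Error) -> ZramInitFailure {
    match err.raw_os_error() {
        Some(22) => ZramInitFailure::UnknownAlgorithm, // EINVAL on Linux
        Some(12) => ZramInitFailure::AllocationFailed, // ENOMEM on Linux
        _ => ZramInitFailure::Other,
    }
}
```

With the "os error 12" reported in this issue, `classify` would return `AllocationFailed`, which matches the reasoning that the algorithm itself was recognised.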
So Fedora has a slightly more updated 5.8 kernel? Presumably if it was kernel related it'd be something to do with Fedora's kernel config more than the kernel version compared to Ubuntu?
Not exactly my forte :) I don't personally use Fedora, I only created an instance on Vultr to more easily test zram-generator, and then verify reproducibility of the issue here and test your PR to provide feedback. It does seem like it'd be a bug for reporting to Fedora rather than the kernel bugzilla no? My understanding is that zram considers |
It doesn't really matter if it's Fedora or Ubuntu: everybody is using the same upstream kernel. |
My point is I can't reproduce it on Ubuntu. I'm not sure if Fedora is configured the same or what differences it has such as SELinux which I don't believe the Ubuntu system has enabled. I'll create an account and submit the reproduction info if it's helpful, it's 1am here, so I'll take care of that after I've rested :) I'm still skeptical that it's a kernel issue, since If your code was loading the module in the same manner as |
It's called a "heisenbug": seemingly unrelated changes cause the bug to go away. In this particular case, I assume the kernel is trying to allocate a few pages of contiguous memory, and depending on what else is happening on the machine, sometimes this fails. So the fact that it happens when you do an additional modprobe call one way, but not the other, doesn't really mean anything, because tiny differences in timing can affect reproducibility. |
TL;DR

As suspected, your code was faulty. It made incorrect assumptions, which is why it didn't work when I tested it against Fedora 33. It's not a kernel bug, your code just never loaded the module in the first place, which I tried to raise concern about as a possibility. The correct solution, is to check
You should relay this information to your bug report and close it.

@keszybz Morning from NZ :) I tested out my thoughts from last night, that the problem was related to your code and not the kernel (Ubuntu and Fedora behaviour differences to autoload zstd aside). I removed
An issue with documentation for me is I have no idea where
I avoided your logic (which seems to depend on getting the available compressors from an existing initialized zram device in the first place? Explicitly zram0, I haven't checked if that can cause any issues if that were the device you wanted to initialize with zstd, you probably have a better idea there), instead I just called the command to
So I suspected there was an issue in your implementation.

    for compression in compressions.into_iter().filter(|c| !known.contains(c)) {}

Here you're filtering out any compression a device has been configured to use against whatever compressors
Correct me if I'm wrong, but
If you want to check for loaded kernel modules, we can check

    let path = Path::new("/proc/modules");
    let content = fs::read_to_string(path).with_context(|| format!("Failed to read {:?}", path))?;
    let loaded_modules = content.lines().into_iter().flat_map(|m|{
        m.split_whitespace().next()
    }).map(String::from).collect();
    Ok(loaded_modules)

This works in a sense, we can know if

    let path = Path::new("/proc/crypto");
    let content = fs::read_to_string(path).with_context(|| format!("Failed to read {:?}", path))?;
    let available_crypto: Vec<String> = content.lines().into_iter()
        .filter(|line| line.starts_with("driver"))
        .flat_map(|m|{
            let s = m.rsplit(':').next().expect("should have `:` delimiter");
            s.trim().strip_suffix("-generic")
        })
        .map(String::from)
        .collect();
    Ok(available_crypto)

This works well! Quick breakdown,
You can see there are multiple entries for different types, so you could filter on
I'm not much of a Rust dev, so feel free to improve on the code snippet. Here's what the returned vec may look like:
This approach will also work for |
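As a rough, self-contained variant of the `/proc/crypto` idea above, the following sketch parses the file's text one entry at a time and keeps only entries whose `type` field is `compression`. Several things here are assumptions rather than anything stated in the thread: the function name `compression_drivers`, the choice of `compression` as the type to filter on, and the decision to keep driver names that lack the `-generic` suffix (the snippet above drops them). It takes the file content as a string so it can be exercised without reading `/proc`.

```rust
/// Parses `/proc/crypto`-style text and returns the driver name of every
/// entry whose `type` is `compression`, with a trailing "-generic" removed.
/// Entries are assumed to be separated by blank lines, with one
/// `key : value` pair per line.
fn compression_drivers(content: &str) -> Vec<String> {
    let mut drivers = Vec::new();
    let mut driver: Option<&str> = None;
    let mut is_compression = false;

    // The extra empty line at the end flushes the final entry.
    for line in content.lines().chain(std::iter::once("")) {
        if line.trim().is_empty() {
            if let (true, Some(name)) = (is_compression, driver) {
                let name = name.strip_suffix("-generic").unwrap_or(name);
                drivers.push(name.to_string());
            }
            driver = None;
            is_compression = false;
            continue;
        }
        if let Some((key, value)) = line.split_once(':') {
            match key.trim() {
                "driver" => driver = Some(value.trim()),
                "type" => is_compression = value.trim() == "compression",
                _ => {}
            }
        }
    }
    drivers
}
```

A caller could pass `std::fs::read_to_string("/proc/crypto")?` to this function and check whether the configured algorithm appears in the result before deciding to run `modprobe`.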
Thank you for the extensive investigation. This doesn't mean that there is no kernel bug — the kernel does load the crypt module when necessary, so none of this should be necessary. But if it helps, then that's good. Let's continue the discussion in the pull request. |
Creation of a zram device (with zstd as the compression alg) occasionally fails with ENOMEM:

    Jan 13 09:59:39 vultr.guest systemd[1]: Starting Create swap on /dev/zram3...
    Jan 13 09:59:39 vultr.guest kernel: Can't allocate a compression stream
    Jan 13 09:59:39 vultr.guest kernel: zram: Cannot initialise zstd compressing backend
    Jan 13 09:59:39 vultr.guest zram-generator[997]: Error: Failed to configure disk size into /sys/block/zram3/disksize
    Jan 13 09:59:39 vultr.guest zram-generator[997]: Caused by:
    Jan 13 09:59:39 vultr.guest zram-generator[997]: Cannot allocate memory (os error 12)

https://bugzilla.kernel.org/show_bug.cgi?id=211173

The kernel normally loads the compressor module when a device with a given compression module is loaded, but this doesn't always work. Apparently, loading the compression module up front avoids the issue.

Fixes systemd#50, systemd#53.

Note: /sys/block/zram0/comp_algorithm lists compressors known to zram, not the ones currently loaded. /proc/crypto lists currently loaded crypto & compression modules.
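To make the note about `/sys/block/zram0/comp_algorithm` concrete, here is a small sketch that parses that file's content. It relies on the format described in the kernel's zram documentation, where the algorithms are whitespace-separated and the currently selected one is wrapped in square brackets; the function name `parse_comp_algorithm` is hypothetical. As the note says, the resulting list reflects what zram knows about, not which modules are loaded.

```rust
/// Parses the content of a zram `comp_algorithm` file.
/// Returns every algorithm listed, plus the one marked as selected
/// (the token wrapped in square brackets), if any.
fn parse_comp_algorithm(content: &str) -> (Vec<String>, Option<String>) {
    let mut known = Vec::new();
    let mut selected = None;
    for token in content.split_whitespace() {
        match token.strip_prefix('[').and_then(|t| t.strip_suffix(']')) {
            Some(name) => {
                selected = Some(name.to_string());
                known.push(name.to_string());
            }
            None => known.push(token.to_string()),
        }
    }
    (known, selected)
}
```

Checking a configured algorithm against the first element of the returned tuple only tells you zram recognises the name; whether the module is loaded would have to come from a different source such as `/proc/crypto`.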
Could you create a new release? There is a commit in the master branch that fixed the zram creation with the zstd compressor.
In version 0.2.0 you cannot create zram with zstd, the error is as follows:
Additional info:
System: Fedora 33 (KDE)
SystemD version: 246.7-2.fc33
zram-generator version: 0.2.0-4.fc33