Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use SRSO spec_rstack_overflow kernel setting? #177

Closed
adrelanos opened this issue Dec 5, 2023 · 16 comments
Closed

use SRSO spec_rstack_overflow kernel setting? #177

adrelanos opened this issue Dec 5, 2023 · 16 comments

Comments

@adrelanos
Copy link
Member

To test:

cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow

https://kernel.org/doc/html/latest/admin-guide/hw-vuln/srso.html

	spec_rstack_overflow=
			[X86] Control RAS overflow mitigation on AMD Zen CPUs

			off		- Disable mitigation
			microcode	- Enable microcode mitigation only
			safe-ret	- Enable sw-only safe RET mitigation (default)
			ibpb		- Enable mitigation by issuing IBPB on
					  kernel entry
			ibpb-vmexit	- Issue IBPB only on VMEXIT
					  (cloud-specific mitigation)

Best to use spec_rstack_overflow=ibpb?

Or not needed because we're already using spec_store_bypass_disable=on?

@adrelanos
Copy link
Member Author

sudo dmesg | grep -i speculative

@monsieuremre
Copy link
Contributor

monsieuremre commented Dec 8, 2023

Whenever there is no appropriate microcode applied, setting this parameter will result in:

 * 'Vulnerable: No microcode':

   The processor is vulnerable, no microcode extending IBPB
   functionality to address the vulnerability has been applied.

There might be a better way to achieve the protections when needed and when avaiable. We already set spectre_v2=on, which means all mitigations available are applied at all times. This also enforces spectre_v2_user=on, which, as far as I understand, should be equal to spectre_v2_user=seccomp,ibpb given the hardware. So I think we should be covered already.

https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html

But even if my understanding is wrong and we are better off setting the parameter given the hardware, I would still say that this option is of very little advantage if at all, on debian. Because this is a new thing apparently. The latest microcode updates are needed to enable these mitigations. Also most of the time the latest Kernel. You see, quoting from the link you provided:
This is a mitigation for the speculative return stack overflow (SRSO) vulnerability found on AMD processors and The issue is tracked under CVE-2023-20569. Debian is always very, very late with microcode updates. And by late, I mean you get no updates until the next interim point release. So, I doubt if one could even use these mitigations with the firmware available on debian repos or with the default debian kernel, in the first place anyway. I take this opportunity to once again underline the importance of using a rolling release as the base, as I have been suggesting one for some time now. Note that there are also point release distros that only roll their kernel, microcode and firmware at all times, like Fedora.

On a seperate but related topic, I suggest we do the following:

Set mitigations=auto,nosmt. Mitigate all CPU vulnerabilities, disabling SMT if needed. This is for users who always want to be fully mitigated, even if it means losing SMT.

Default is mitigations=auto. Mitigate all CPU vulnerabilities, but leave SMT enabled, even if it's vulnerable.

Now we already get this protection because we set nosmt=force. I don't think disabling smt at all times even when there is no risk of exploit is as optimal as setting mitigations=auto,nosmt. So maybe we set this one instead.

@adrelanos
Copy link
Member Author

spectre_v2_user=seccomp,ibpb

                        seccomp,ibpb
                                - Like "seccomp" above, but only STIBP is
                                  controlled per thread. IBPB is issued
                                  always when switching between different
                                  user space processes.

The word "only" is concerning?

This seems a weaker setting than:

                        on      - Unconditionally enable mitigations. Is
                                  enforced by spectre_v2=on

mitigations=auto,nosmt sounds good.

                        auto,nosmt
                                Mitigate all CPU vulnerabilities, disabling SMT
                                if needed.  This is for users who always want to
                                be fully mitigated, even if it means losing SMT.
                                Equivalent to: l1tf=flush,nosmt [X86]
                                               mds=full,nosmt [X86]
                                               tsx_async_abort=full,nosmt [X86]
                                               mmio_stale_data=full,nosmt [X86]
                                               retbleed=auto,nosmt [X86]

It might obsolete a few of our other settings.

@monsieuremre
Copy link
Contributor

This seems a weaker setting than:

I think because it is. I'm suggesting this change. What I'm telling here is, when it is on, available mitigations are chosen. It is better than hardcoding.

@adrelanos
Copy link
Member Author

adrelanos commented Dec 11, 2023

Not sure what you're suggesting? Setting spectre_v2_user=seccomp,ibpb? Why would we weaken the settings?

To make sure something is enabled rather than nothing? That would be the wrong approach.

verification: Better to ask the kernel from within the running system by using the usual API (/proc or /sys) if mitigation are enabled. This kind of testing, confirmation would be a job for systemcheck.

We could drop most and use mitigations=auto,nosmt instead. Everything that is covered by mitigations= does not need to be duplicated. Since we don't have verification it's still non-ideal but acceptable. Still flying blind.

@monsieuremre
Copy link
Contributor

Not sure what you're suggesting? Setting spectre_v2_user=seccomp,ibpb? Why would we weaken the settings?

No, the opposite. I am just making a point. A point that we don't need to set the thing you suggested manually, because we already have spectre_v2=on. What comes after is my reasoning.

Only suggestion I'm making is the other thing, which is mitigations=auto,nosmt, and it is separate.

@adrelanos
Copy link
Member Author

adrelanos commented Dec 12, 2023

mitigations=auto,nosmt is nice and we can enable it. Quote:

                        auto,nosmt
                                Mitigate all CPU vulnerabilities, disabling SMT
                                if needed.  This is for users who always want to
                                be fully mitigated, even if it means losing SMT.
                                Equivalent to: l1tf=flush,nosmt [X86]
                                               mds=full,nosmt [X86]
                                               tsx_async_abort=full,nosmt [X86]
                                               mmio_stale_data=full,nosmt [X86]
                                               retbleed=auto,nosmt [X86]

but mitigations=auto,nosmt kernel manual does not mention spec_rstack_overflow. Explicitly setting spec_rstack_overflow might be more secure.

Note,

It's not the same. Both have their own dedicated kernel manual page.

@raja-grewal
Copy link
Contributor

Comparing to we already have in our configuration, we seem to be explicitly missing:

  1. retbleed=auto,nosmt: This should be added.

  2. spec_rstack_overflow=ibpb or spec_rstack_overflow=safe-ret: I am not certain which one is superior for our explicit purposes. I think the first is probably better.

As you have stated, using mitigations=auto,nosmt is probably a good idea. While it would obsolete existing settings, I don’t think any harm will come since we use the same equivalent settings.

Perhaps we should place this kernel parameter at the top of the configuration so it is applied first and then reinforced by our already more fine-grained settings?

This way we will inherit any future mitigation from this kernel parameter that are not explicitly specified while also keeping our strict fine-grained settings fixed.

Finally , since spec_rstack_overflow is not explicitly mentioned in mitigations, I believe we should put it in directly just to be safe.

@adrelanos
Copy link
Member Author

  1. retbleed=auto,nosmt: This should be added.

Ok.

  1. spec_rstack_overflow=ibpb or spec_rstack_overflow=safe-ret: I am not certain which one is superior for our explicit purposes. I think the first is probably better.

If we don't know it, we should ask somewhere. Short of an obvious answer or expert opinion, without being sure, we better keep it as is.

The kernel built-in default setting might actually already be the most secure setting.

As you have stated, using mitigations=auto,nosmt is probably a good idea.

Ok.

Perhaps we should place this kernel parameter at the top of the configuration so it is applied first and then reinforced by our already more fine-grained settings?

Ok.

@adrelanos
Copy link
Member Author

The spec_rstack_overflow change in #197 I'd comment out after merge due to my above comment.

@raja-grewal
Copy link
Contributor

I have left that parameter as the default kernel mitigation: spec_rstack_overflow=safe-ret?

@adrelanos
Copy link
Member Author

If we don't know a more safe setting than the default, we could as well as leave the default?
Not sure that will ever happen here but generally if the kernel gets a better, more secure default, we wouldn't want to have hardcoded it.
So as a general rule I guess brevity as in not configuring things which have no effect should be preferred.

@raja-grewal
Copy link
Contributor

raja-grewal commented Jan 29, 2024

I'll leave it in for clarity but remove the hard-coding while providing an explanation.

Thoughts?

@monsieuremre
Copy link
Contributor

I would like to once again recommend otherwise, because I am not clear on a matter.

Steps to reproduce:

  • Run sudo dmesg | grep -i speculative
  • Returns Mitigation: safe RET, no microcode
  • Now set spec_rstack_overflow=ibpb
  • Again run the command
  • Returns 'Vulnerable: No microcode'

What this mean now? What I understand here is, when we set spec_rstack_overflow=ibpb, and the system does not support that, then no mitigation is applied, so there is no fallback to spec_rstack_overflow=safe-ret. And again, far as I understand from the documentation, this setting will automatically be applied if it is supported anyway, if mitigations are set to auto, which they are in our case. So not only does this not bring any benefit, but it does more harm.

Feel free to correct me if I'm missing something or not understanding correctly.

@adrelanos
Copy link
Member Author

The updated, linked PR (which no longer touched spec_rstack_overflow) looks good now.

Keeping that PR a few days longer open to see if there's any other feedback.

@adrelanos
Copy link
Member Author

Was merged and seems reflecting consensus.

Please re-open or create a new issue should there still be something to cover.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants