Sometimes qubes don't start the first time - If Whonix/Kicksecure Hardening Features for Testers Enabled #7959
The qube seems to start very slowly, I see a few services that take a long time to start (over 10s), including |
Actually, enabling these services does increase boot time, but there are
2 issues:
- Why does it not start in the first place? If I press start again,
it starts normally (taking too long to start is OK, but not
starting at all is, I believe, a bug).
- If this is a RAM issue, then is the dynamic 400-4000 range not really that
dynamic, especially at boot time? (400 is certainly low, but I don't rely on the
400; I rely on the dynamic increase to raise it up to the
max when needed, and 4000 is more than enough.)
Marek Marczykowski-Górecki:
… The qube seems to start very slowly; I see a few services that take a long time to start (over 10s), including `SUID, SGID, Capabd File Permission Hardening` and `Boot Clock Randomization` (names seem to be mangled). @adrelanos, do you see anything in those that could potentially take a long time?
Some idea might be to increase initial memory for such qube.
|
There is a start timeout of 60s. You can increase it if you want (
Dynamic memory management kicks in when the relevant service starts within the qube; before that, the qube stays at its initial amount. |
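The exact setting referred to in the comment above is cut off in this copy of the thread. As a rough sketch only, assuming the start timeout corresponds to the `qrexec_timeout` qube preference and the initial amount to the `memory` preference (both set with `qvm-prefs` in dom0), the two suggestions could look like the commands printed below. The helper name `print_tuning_commands`, the qube name `my-whonix-ws`, and the values 120 and 800 are placeholders, not taken from the thread.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) dom0 commands that would
# raise the start timeout and the initial memory of a qube.
# Assumption: qrexec_timeout (seconds) and memory (MB) are the relevant
# qvm-prefs properties; the thread itself does not name them.
print_tuning_commands() {
    vm="$1"          # qube name (placeholder)
    timeout_s="$2"   # new start timeout in seconds
    initial_mb="$3"  # new initial memory in MB
    printf 'qvm-prefs %s qrexec_timeout %s\n' "$vm" "$timeout_s"
    printf 'qvm-prefs %s memory %s\n' "$vm" "$initial_mb"
}

# Example: review the output, then run the printed lines in dom0 if they fit.
print_tuning_commands my-whonix-ws 120 800
```

Printing instead of executing keeps the sketch harmless outside dom0 and lets the values be checked first.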
Looking at the log and ignoring everything before I didn't see any slowness during the boot process.
Is this an issue?
This seems to be a major issue? |
Since when does this issue happen?
SUID Disabler and Permission Hardener: At the time of writing, an opt-in, testers-only feature. It might be tested here. In theory there could be an issue that makes it run forever (infinite loop) or for a very long time, but it could also be a follow-up issue caused by the mount-related issue above?
Boot Clock Randomization: A stable feature that hasn't ever caused boot issues. This feature is much less complex than the script above. The systemd unit runs bash code that is probably less than 200 lines long. I don't think there's potential for it to run very slowly unless there are major underlying system issues with running simple Linux utilities. For example it runs
On handling corner cases: I am happy to make the source code more resilient in case there are such system issues. For example, perhaps all systemd units could have a timeout instead of no timeout. It's expected to finish within a second, but if it takes, let's say, longer than 20 seconds (20 times longer than expected), systemd could kill it. That wouldn't fix the underlying system issues, but might prevent the Qubes VM boot process from being broken. The user would end up with a broken system that isn't functional according to vendor specification. In this case, the user's system would lack Boot Clock Randomization. A fail-safe implementation could intentionally break networking if Boot Clock Randomization fails. (A systemd reverse dependency.) |
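To illustrate the "kill it if it runs far longer than expected" idea outside of systemd, here is a minimal shell sketch using the coreutils `timeout` command. It is not the Whonix implementation; `run_with_budget` is a hypothetical helper, and within a unit file the corresponding knob would presumably be systemd's `TimeoutStartSec=` setting.

```shell
#!/bin/sh
# Hypothetical helper, not part of security-misc or Whonix:
# run a command with a time budget and report when it overran.
run_with_budget() {
    budget="$1"   # allowed runtime in seconds
    shift
    timeout "$budget" "$@"
    status=$?
    # GNU timeout exits with 124 when it had to kill the command.
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${budget}s: $*" >&2
    fi
    return "$status"
}

# Example: a step expected to finish within a second gets a 20 second budget.
run_with_budget 20 true
```

As the comment above points out, this only contains the damage: a caller still has to decide what a timed-out step means, for example refusing to bring up networking.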
I think |
Maybe adding |
Could you please comment-in
in file
|
done
Patrick Schleizer:
…
Could you please comment-in
set -x
in file
`/usr/libexec/security-misc/permission-hardening`
? @TNTBOMBOM
|
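For readers unfamiliar with `set -x`, the sketch below shows what enabling it does: each command is echoed to stderr, prefixed with `+`, before it runs, which is why the log should become more verbose. The function `demo_step` is a made-up stand-in and is not taken from `/usr/libexec/security-misc/permission-hardening`.

```shell
#!/bin/sh
# Hypothetical stand-in for one step of a hardening script.
demo_step() {
    set -x    # start tracing: commands are echoed to stderr before running
    mode=0755
    echo "would apply mode $mode"
    set +x    # stop tracing again
}

# Normal output goes to stdout, the trace lines go to stderr.
demo_step
```

Because the trace goes to stderr, it ends up in the journal or console log of the unit running the script, not in the script's regular output.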
The log should now be more verbose after reboot. If so, could you share please? |
|
I don't see any more details there. You need to make the change in the template, otherwise it gets reverted on reboot. And also, you may want to add |
Check this log: If nothing new shows up, then it has not been caught by this logging method, and we need to use something else. |
I don't see anything wrong in that log. The instructions for enabling debugging of SUID Disabler and Permission Hardener were probably wrong. More detailed instructions: In Template:
Change from
to
These instructions won't survive updates of security-misc. For instructions that do survive reboot (but log a bit less, though probably still enough):
Add.
Save. Then please add a new log here. Boot Clock Randomization changes the clock as per log:
(- 64 seconds and 049155879 nanoseconds.) But that has been happening for all Whonix users for years, so I doubt that would be causing this issue. It might, however, confuse the logging of the timing. The journal timestamps seem to be unchanged, though, so either the journal has a special way of ensuring timestamps (separate clock) or Boot Clock Randomization isn't functional. The latter I will investigate, but that's unrelated to this ticket. You could disable Boot Clock Randomization / sdwdate for testing. Instructions: |
Does this also happen for VMs based on Templates other than Kicksecure or Whonix? Does this also happen when not using SUID Disabler and Permission Hardener? |
On Mon, Jan 30, 2023 at 03:25:39AM -0800, Patrick Schleizer wrote:
> Sometimes qube/app-vm doesnt start from the first time and telling me to check /var/log/xen/console
Does this also happen for VMs based on Templates other than Kicksecure or Whonix?
Does this also happen when not using SUID Disabler and Permission Hardener?
I haven't seen this using other templates.
|
Ok check now:
No, not yet.
yes. |
Thank you! There's nothing specifically wrong with SUID Disabler and Permission Hardener on your system that would require further debugging. I recommend not using this testers-only feature for now. Documentation: https://www.kicksecure.com/w/index.php#Disable_SUID_Disabler_and_Permission_Hardener
Undoing its modifications shouldn't be required. Just the systemd unit needs to be disabled so it won't add boot delay.
SUID Disabler and Permission Hardener performance and Qubes integration issues will be tracked here: I'll post an update there once a solution has been implemented.
Therefore there is no need to concentrate too much on SUID Disabler and Permission Hardener in this ticket. I recommend not using this testers-only feature for now, as it might contribute to further boot delay until the root cause is found. Are you using an SSD or HDD? Does this happen with "stock" Whonix after a Template re-installation? Any other "unusual" modifications? Please mention any testers-only features or other "rare" modifications in bug reports, as these might have a huge effect on the ability to reproduce issues. To debug further, should this issue still be happening, a new log taken after disabling SUID Disabler and Permission Hardener, or from a "stock" Whonix, is required. |
SSD
I don't remember it happening to me.
Nope
Ok will add. |
The problem doesn't happen with a freshly installed Whonix. |
Closing, as it has been determined which components reproduce this. |
How to file a helpful issue
Qubes OS release
4.1
Brief summary
Sometimes a qube/app-vm doesn't start the first time and tells me to check /var/log/xen/console
Steps to reproduce
Just start the qube/appvm
Expected behavior
Should start normally
Actual behavior
/var/log/xen/console: whonix-ws.txt
Workaround
Just start it again.