Can't upgrade to HAOS 12.3 on NUC Intel 5 #3363

Open
OZ1SEJ opened this issue May 12, 2024 · 5 comments
Labels
board/generic-x86-64: Generic x86-64 Boards (like Intel NUC)
bug
stable-kernel-regression: Issue which appears to be an upstream stable kernel regression

Comments

@OZ1SEJ
OZ1SEJ commented May 12, 2024

Describe the issue you are experiencing

When I try to install the upgrade from v. 12.2 to 12.3, it attempts to boot from slot A three times, after which it reverts to slot B.
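The behaviour described above (three failed attempts on slot A, then a fallback to slot B) can be pictured with a toy model. This is only a sketch: the function name `pick_slot`, the `boots_ok` callback, and the retry loop are assumptions made for illustration and are not the actual HAOS bootloader logic. The attempt count of three is taken from the report.

```python
from typing import Callable, Tuple


def pick_slot(boots_ok: Callable[[str], bool],
              new_slot: str = "A",
              old_slot: str = "B",
              max_attempts: int = 3) -> Tuple[str, int]:
    """Try the freshly updated slot up to max_attempts times, then fall back.

    Returns (slot that ends up running, number of failed attempts on new_slot).
    """
    for attempt in range(1, max_attempts + 1):
        if boots_ok(new_slot):
            # The new slot came up; earlier attempts (if any) had failed.
            return new_slot, attempt - 1
    # Every attempt on the new slot failed: revert to the known-good slot.
    return old_slot, max_attempts


if __name__ == "__main__":
    # Hypothetical case matching this report: slot A never boots.
    print(pick_slot(lambda slot: False))  # ('B', 3)
```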

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

12.2

Did you upgrade the Operating System?

Yes

Steps to reproduce the issue

  1. Settings
  2. Home Assistant Operating System 12.3
  3. Install

Anything in the Supervisor logs that might be useful for us?

No.

Anything in the Host logs that might be useful for us?

No.

System information

version core-2024.5.3
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.12.2
os_name Linux
os_version 6.6.25-haos
arch x86_64
timezone Europe/Copenhagen
config_dir /config
Home Assistant Community Store
GitHub API ok
GitHub Content ok
GitHub Web ok
GitHub API Calls Remaining 5000
Installed Version 1.34.0
Stage running
Available Repositories 1395
Downloaded Repositories 9
Home Assistant Cloud
logged_in false
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok
Home Assistant Supervisor
host_os Home Assistant OS 12.2
update_channel stable
supervisor_version supervisor-2024.05.1
agent_version 1.6.0
docker_version 25.0.5
disk_total 116.7 GB
disk_used 26.4 GB
healthy true
supported true
board generic-x86-64
supervisor_api ok
version_api ok
installed_addons Terminal & SSH (9.14.0), CEC Scanner (3.0), Mosquitto broker (6.4.0), Zigbee2MQTT (1.37.1-1), Node-RED (17.0.12), Duck DNS (1.17.0), InfluxDB (5.0.0), Grafana (9.2.2), ZeroTier One (0.18.0), File editor (5.8.0), rtl_433 MQTT Auto Discovery (0.8.1)
Dashboards
dashboards 3
resources 5
views 21
mode storage
Recorder
oldest_recorder_run May 2, 2024 at 8:36 AM
current_recorder_run May 11, 2024 at 5:13 PM
estimated_db_size 3354.56 MiB
database_engine sqlite
database_version 3.44.2

Additional information

When it's trying to boot on slot A, this text is displayed on screen just before it reboots:

[  OK  ] Started containerd container runtime.
[    4.731515] BUG: scheduling while atomic: kworker/1:2/77/0x00000002
[ ***  ] A start job is running for Network Manager Wait Online (24s / no limit)
[   25.733573] rcu: INFO: rcu_preempt self-detected stall on CPU
[   25.734009] rcu: o1-....: (20999 ticks this GP) idle=86d4/1/0x4000000000000000 softirq=6601/6601 fqs=5250
[***   ] A start job is running for Network Manager Wait Online (52s / no limit)
[   54.313600] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 1-.... } 21265 jiffies s: 277 root: 0x2/.
[FAILED] Failed to start Network Manager Wait Online.
See 'systemctl status NetworkManager-wait-online.service' for details.
[  OK  ] Reached target Network is Online.
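When console output like the above is copied from a screen or a photo, it can help to separate the kernel messages from the systemd status lines. The following is a minimal sketch, assuming kernel lines always start with a bracketed `seconds.microseconds` timestamp as they do here; the names `KERNEL_LINE`, `MARKERS`, and `kernel_errors` are made up for this example.

```python
import re
from typing import List, Tuple

# Matches kernel messages of the form "[   25.733573] text".
# systemd status lines such as "[  OK  ] ..." or "[FAILED] ..." do not match.
KERNEL_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# Substrings taken from the console output in this report.
MARKERS = ("BUG:", "rcu:")


def kernel_errors(console_text: str) -> List[Tuple[float, str]]:
    """Return (timestamp, message) for kernel lines containing a marker."""
    hits = []
    for line in console_text.splitlines():
        match = KERNEL_LINE.match(line.strip())
        if match and any(marker in match.group(2) for marker in MARKERS):
            hits.append((float(match.group(1)), match.group(2)))
    return hits
```

Feeding it the text above would return only the `BUG: scheduling while atomic` line and the `rcu:` stall lines, each with its timestamp.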
@OZ1SEJ OZ1SEJ added the bug label May 12, 2024
@sairon
Member

sairon commented May 13, 2024

That looks very much like an upstream kernel regression. Which Intel NUC model is that exactly? Can you get the boot log from the failed boot, ideally with some kernel stack traces? After booting back into the previous version, run ha host logs -b-1 -n1000 (or replace -1 with a lower value to go further back).
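For anyone who wants to collect several earlier boots in one go, the command quoted above can be wrapped in a small script. This is a sketch under assumptions: it only uses the `-b` and `-n` flags exactly as written in the comment, it must run somewhere the `ha` CLI is available (for example the Terminal & SSH add-on), and the helper names `host_logs_cmd` and `save_host_logs` are hypothetical.

```python
import subprocess
from typing import List


def host_logs_cmd(boot_offset: int = -1, lines: int = 1000) -> List[str]:
    """Build the argument list for the command quoted above.

    boot_offset=-1 selects the previous boot; lower values go further back.
    """
    if lines <= 0:
        raise ValueError("lines must be positive")
    return ["ha", "host", "logs", f"-b{boot_offset}", f"-n{lines}"]


def save_host_logs(path: str, boot_offset: int = -1, lines: int = 1000) -> None:
    """Run the command and write its output to a file (needs the ha CLI)."""
    result = subprocess.run(host_logs_cmd(boot_offset, lines),
                            capture_output=True, text=True, check=True)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(result.stdout)


if __name__ == "__main__":
    # Save the three boots before the current one, one file per boot.
    for offset in (-1, -2, -3):
        save_host_logs(f"host_boot{offset}.log", boot_offset=offset)
```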

@sairon sairon added the board/generic-x86-64 Generic x86-64 Boards (like Intel NUC) label May 13, 2024
@OZ1SEJ
Author

OZ1SEJ commented May 13, 2024

It's a NUC5 i5 RYK with a Core i5-5250U @ 2.7 GHz, 4 GB RAM, and a 120 GB SSD. I ran this exact command, and you can find the output at https://pastebin.com/n6eujWm1.

@OZ1SEJ
Author

OZ1SEJ commented May 13, 2024

I've also uploaded the log file here, if that makes more sense:
host.log
Please let me know if there's anything more I can provide!

@sairon sairon added the stable-kernel-regression Issue which appears to be an upstream stable kernel regression label May 27, 2024
@sairon
Member

sairon commented May 27, 2024

Unfortunately, there was nothing helpful in the logs you provided (not your fault; it just wasn't persisted because of the kernel error). Now that we have more information about similar issues, I believe this is another manifestation of #3368, as your NUC also uses the Intel e1000e driver. So the same goes for this issue: it should be fixed in the next release or in the latest dev build. In the meantime, you can only revert to 12.2 or switch to the dev channel. I recommend doing the latter only to test whether it's fixed there and then switching back to beta/stable; while dev is currently "stable", it may break from day to day in the future.
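Other readers who want to check whether their machine also uses the e1000e driver can look at the standard Linux sysfs layout, where `/sys/class/net/<iface>/device/driver` is a symlink to the bound driver. The following is a minimal sketch, assuming shell access to the host and that layout; the function names and the `sysfs_root` parameter are introduced here for illustration only.

```python
import os
from typing import List, Optional


def nic_driver(iface: str, sysfs_root: str = "/sys/class/net") -> Optional[str]:
    """Return the kernel driver name bound to a network interface, if any."""
    link = os.path.join(sysfs_root, iface, "device", "driver")
    if not os.path.islink(link):
        # Virtual interfaces (lo, veth, bridges) have no device/driver link.
        return None
    return os.path.basename(os.readlink(link))


def interfaces_using(driver: str, sysfs_root: str = "/sys/class/net") -> List[str]:
    """List the interfaces whose bound driver matches the given name."""
    if not os.path.isdir(sysfs_root):
        return []
    return sorted(name for name in os.listdir(sysfs_root)
                  if nic_driver(name, sysfs_root) == driver)


if __name__ == "__main__":
    print(interfaces_using("e1000e"))
```

A non-empty list suggests the machine has a network interface bound to the driver mentioned above; an empty list means no interface is currently bound to it.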

@OZ1SEJ
Author

OZ1SEJ commented May 27, 2024

Thank you for your kind reply. It certainly looks like these problems share the same underlying cause. I'll stick with 12.2 for the time being and see if it's fixed in the next release. Thanks again!
