Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Occasionally pins CPU to 100% #17

Closed
gidoBOSSftw5731 opened this issue Sep 30, 2023 · 3 comments
Closed

Occasionally pins CPU to 100% #17

gidoBOSSftw5731 opened this issue Sep 30, 2023 · 3 comments
Labels
bug Something isn't working needing further clarification Waiting for additional details or clarification

Comments

@gidoBOSSftw5731
Copy link
Contributor

Framework 13th gen, seems like sometimes after it suspends/hibernates or runs for a few days something goes haywire and it goes up to 100% cpu usage (as reported by top, so only one thread.) Restarting the service solves this. Next time it happens I'll try to collect some logs.

@gidoBOSSftw5731
Copy link
Contributor Author

Relevant journalctl except:

Sep 30 16:24:08 gframe systemd[1]: Started fw-fanctrl.service - FrameWork Fan Controller.
░░ Subject: A start job for unit fw-fanctrl.service has finished successfully
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░ 
░░ A start job for unit fw-fanctrl.service has finished successfully.
░░ 
░░ The job identifier is 147846.
Sep 30 16:25:53 gframe python3[1147900]: EC returned error result code 7
Sep 30 16:46:55 gframe python3[1154504]: EC response reserved != 0
Sep 30 17:02:09 gframe python3[1162689]: Missing Chromium EC memory map.
Sep 30 17:02:09 gframe python3[1162689]: Cannot find I2C adapter
Sep 30 17:02:09 gframe python3[1162689]: Unable to establish host communication
Sep 30 17:02:09 gframe python3[1162689]: Couldn't find EC
Sep 30 17:23:09 gframe python3[1173005]: EC returned error result code 7
Sep 30 18:14:11 gframe python3[1193186]: Missing Chromium EC memory map.
Sep 30 18:14:11 gframe python3[1193186]: Cannot find I2C adapter
Sep 30 18:14:11 gframe python3[1193186]: Unable to establish host communication
Sep 30 18:14:11 gframe python3[1193186]: Couldn't find EC
Sep 30 19:41:33 gframe python3[1224615]: EC response reserved != 0
Sep 30 19:41:33 gframe python3[1224615]: EC returned error result code 7
Sep 30 19:45:59 gframe python3[1225688]: EC response version mismatch
Sep 30 19:45:59 gframe python3[1225688]: EC returned error result code 12
Sep 30 19:45:59 gframe python3[1225688]: EC returned error result code 1
Oct 01 16:58:59 gframe python3[1385517]: EC response reserved != 0
Oct 01 17:40:48 gframe python3[1409681]: EC response reserved != 0
Oct 01 18:12:20 gframe python3[1417193]: EC returned error result code 7
Oct 01 18:12:20 gframe python3[1417193]: EC returned error result code 1
Oct 01 18:18:08 gframe python3[1418430]: EC response reserved != 0
Oct 01 18:18:08 gframe python3[1418430]: EC returned error result code 7
Oct 01 18:23:10 gframe python3[1419717]: EC response reserved != 0
Oct 01 18:23:10 gframe python3[1419717]: EC returned error result code 7
Oct 01 19:51:12 gframe python3[1434358]: EC returned error result code 12
Oct 01 20:34:12 gframe python3[1441772]: EC returned error result code 7
Oct 01 21:06:12 gframe python3[1447207]: EC response reserved != 0
Oct 01 22:14:00 gframe python3[1462609]: EC returned error result code 7
Oct 01 22:14:00 gframe python3[1462609]: EC returned error result code 12
Oct 01 23:25:24 gframe python3[1474541]: EC response has invalid checksum
Oct 02 00:30:50 gframe python3[1621628]: EC response has invalid checksum
Oct 02 00:30:50 gframe python3[1621628]: EC returned error result code 7
Oct 02 02:50:58 gframe python3[2534277]: EC response reserved != 0
Oct 02 05:38:14 gframe python3[2576837]: EC response reserved != 0
Oct 02 05:57:24 gframe python3[2579945]: EC response has invalid checksum
Oct 02 06:49:24 gframe python3[2589184]: EC response version mismatch
Oct 02 07:00:28 gframe python3[2591025]: EC response reserved != 0
Oct 02 07:17:50 gframe python3[2593827]: EC response has invalid checksum
Oct 02 07:47:48 gframe python3[2598590]: Missing Chromium EC memory map.
Oct 02 07:47:48 gframe python3[2598590]: Cannot find I2C adapter
Oct 02 07:47:48 gframe python3[2598590]: Unable to establish host communication
Oct 02 07:47:48 gframe python3[2598590]: Couldn't find EC

dmesg output starting at the suspension:

[312001.582152] PM: suspend entry (deep)
[312001.592483] Filesystems sync: 0.010 seconds
[312001.592670] (NULL device *): firmware: direct-loading firmware iwlwifi-ty-a0-gf-a0.pnvm
[312001.592806] (NULL device *): firmware: direct-loading firmware regulatory.db
[312001.592827] (NULL device *): firmware: direct-loading firmware regulatory.db.p7s
[312001.592965] (NULL device *): firmware: direct-loading firmware intel/ibt-0041-0041.sfi
[312001.593157] (NULL device *): firmware: direct-loading firmware iwlwifi-ty-a0-gf-a0-74.ucode
[312001.594923] (NULL device *): firmware: direct-loading firmware intel/ibt-0041-0041.ddc
[312001.594946] (NULL device *): firmware: direct-loading firmware i915/adlp_dmc.bin
[312001.595209] (NULL device *): firmware: direct-loading firmware i915/adlp_guc_70.bin
[312001.596124] (NULL device *): firmware: direct-loading firmware i915/tgl_huc.bin
[312001.599300] Freezing user space processes
[312001.601389] Freezing user space processes completed (elapsed 0.002 seconds)
[312001.601392] OOM killer disabled.
[312001.601393] Freezing remaining freezable tasks
[312001.602719] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[312001.602759] printk: Suspending console(s) (use no_console_suspend to debug)
[312001.985107] ACPI: EC: interrupt blocked
[312002.013286] ACPI: PM: Preparing to enter system sleep state S3
[312002.027711] ACPI: EC: event blocked
[312002.027712] ACPI: EC: EC stopped
[312002.027712] ACPI: PM: Saving platform NVS memory
[312002.027966] Disabling non-boot CPUs ...
[312002.029237] smpboot: CPU 19 is now offline
[312002.061798] ACPI: PM: Low-level resume complete
[312002.061969] ACPI: EC: EC started
[312002.061971] ACPI: PM: Restoring platform NVS memory
[312002.064740] Enabling non-boot CPUs ...
[312002.064815] smpboot: Booting Node 0 Processor 19 APIC 0x3e
[312002.085876] core: cpu_atom PMU driver: PEBS-via-PT 
[312002.085889] ... version:                5
[312002.085892] ... bit width:              48
[312002.085894] ... generic registers:      6
[312002.085896] ... value mask:             0000ffffffffffff
[312002.085898] ... max period:             00007fffffffffff
[312002.085900] ... fixed-purpose events:   3
[312002.085901] ... event mask:             000000070000003f
[312002.086766] CPU19 is up
[312002.087649] ACPI: PM: Waking up from system sleep state S3
[312002.092042] ACPI: EC: interrupt unblocked
[312002.235453] ACPI: EC: event unblocked
[312002.243947] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/adlp_guc_70.bin version 70.5.1
[312002.243951] i915 0000:00:02.0: [drm] GT0: HuC firmware i915/tgl_huc.bin version 7.9.3
[312002.254246] iwlwifi 0000:aa:00.0: WRT: Invalid buffer destination
[312002.257423] nvme nvme0: Shutdown timeout set to 10 seconds
[312002.258356] i915 0000:00:02.0: [drm] GT0: HuC: authenticated for all workloads
[312002.258887] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
[312002.258888] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
[312002.259516] i915 0000:00:02.0: [drm] GT0: GUC: RC enabled
[312002.263766] nvme nvme0: 20/0/0 default/read/poll queues
[312002.417759] iwlwifi 0000:aa:00.0: WFPM_UMAC_PD_NOTIFICATION: 0x1f
[312002.417779] iwlwifi 0000:aa:00.0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
[312002.417792] iwlwifi 0000:aa:00.0: WFPM_AUTH_KEY_0: 0x80
[312002.417803] iwlwifi 0000:aa:00.0: CNVI_SCU_SEQ_DATA_DW9: 0x0
[312002.499587] usb 1-9: reset full-speed USB device number 3 using xhci_hcd
[312005.197185] OOM killer enabled.
[312005.197186] Restarting tasks ... 
[312005.197203] pcieport 0000:00:1d.0: PCI bridge to [bus aa]
[312005.197238] pcieport 0000:00:1d.0:   bridge window [mem 0x7a200000-0x7a2fffff]
[312005.207472] mei_hdcp 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_ops [i915])
[312005.208828] mei_pxp 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:00:02.0 (ops i915_pxp_tee_component_ops [i915])
[312005.213964] done.
[312005.213970] random: crng reseeded on system resumption
[312005.215338] usb 1-2: USB disconnect, device number 76
[312005.215342] usb 1-2.2: USB disconnect, device number 79
[312005.215403] usb 4-1: USB disconnect, device number 37
[312005.215405] usb 4-1.4: USB disconnect, device number 38
[312005.215430] cdc_ncm 4-1.4:2.0 eth0: unregister 'cdc_ncm' usb-0000:00:0d.0-1.4, CDC NCM (NO ZLP)
[312005.261320] PM: suspend exit
[312005.275451] smpboot: Booting Node 0 Processor 1 APIC 0x1
[312005.411897] usb 1-2.3: USB disconnect, device number 78
[312005.494873] smpboot: Booting Node 0 Processor 2 APIC 0x8
[312006.051414] usb 1-4: USB disconnect, device number 59
[312006.195305] usb 1-5: USB disconnect, device number 60
[312006.703678] smpboot: Booting Node 0 Processor 3 APIC 0x9
[312006.999304] smpboot: Booting Node 0 Processor 4 APIC 0x10
[312007.068159] iwlwifi 0000:aa:00.0: WRT: Invalid buffer destination
[312007.155824] smpboot: Booting Node 0 Processor 5 APIC 0x11
[312007.231141] iwlwifi 0000:aa:00.0: WFPM_UMAC_PD_NOTIFICATION: 0x1f
[312007.231188] iwlwifi 0000:aa:00.0: WFPM_LMAC2_PD_NOTIFICATION: 0x1f
[312007.231205] iwlwifi 0000:aa:00.0: WFPM_AUTH_KEY_0: 0x80
[312007.231218] iwlwifi 0000:aa:00.0: CNVI_SCU_SEQ_DATA_DW9: 0x0
[312007.411422] smpboot: Booting Node 0 Processor 6 APIC 0x18
[312007.639698] smpboot: Booting Node 0 Processor 7 APIC 0x19
[312007.819266] smpboot: Booting Node 0 Processor 8 APIC 0x20
[312008.019275] smpboot: Booting Node 0 Processor 9 APIC 0x21
[312008.191317] smpboot: Booting Node 0 Processor 10 APIC 0x28
[312008.379282] smpboot: Booting Node 0 Processor 11 APIC 0x29
[312008.515284] smpboot: Booting Node 0 Processor 12 APIC 0x30
[312008.827300] smpboot: Booting Node 0 Processor 13 APIC 0x32
[312009.055303] smpboot: Booting Node 0 Processor 14 APIC 0x34
[312009.215606] usb 1-9: reset full-speed USB device number 3 using xhci_hcd
[312009.283353] smpboot: Booting Node 0 Processor 15 APIC 0x36
[312009.523301] smpboot: Booting Node 0 Processor 16 APIC 0x38
[312009.811377] smpboot: Booting Node 0 Processor 17 APIC 0x3a
[312010.099419] smpboot: Booting Node 0 Processor 18 APIC 0x3c
[312010.171593] usb 1-9: reset full-speed USB device number 3 using xhci_hcd
[312010.703728] wlan0: authenticate with 34:fc:b9:06:ae:d2
[312010.714879] wlan0: send auth to 34:fc:b9:06:ae:d2 (try 1/3)
[312010.769746] wlan0: authenticated
[312010.771348] wlan0: associate with 34:fc:b9:06:ae:d2 (try 1/3)
[312010.773079] wlan0: RX AssocResp from 34:fc:b9:06:ae:d2 (capab=0x1411 status=0 aid=47)
[312010.807944] wlan0: associated
[312010.807999] wlan0: Limiting TX power to 30 (30 - 0) dBm as advertised by 34:fc:b9:06:ae:d2
[312026.698034] filter_write: 35 callbacks suppressed
[312026.698040] msr: Write to unrecognized MSR 0x38f by i7z (pid: 1216462).
[312026.698048] msr: See https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/about for details.
[312060.200578] filter_write: 83 callbacks suppressed
[312060.200580] msr: Write to unrecognized MSR 0x38f by i7z (pid: 1216462).
[312060.200583] msr: See https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/about for details.
[312093.978058] filter_write: 83 callbacks suppressed
[312093.978060] msr: Write to unrecognized MSR 0x38f by i7z (pid: 1216462).
[312093.978062] msr: See https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/about for details.
[312127.779405] filter_write: 83 callbacks suppressed
[312127.779408] msr: Write to unrecognized MSR 0x38f by i7z (pid: 1216462).
[312127.779412] msr: See https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/about for details.

Also attached is the file output for perf record on the process. I'm not totally sure if this would be helpful but it might be a good datapoint.
perf.data.tar.gz

@leopoldhub
Copy link
Collaborator

leopoldhub commented May 16, 2024

Hi @gidoBOSSftw5731 ,
We recently did a consequent refactoring of our codebase, is this issue still present in the latest version?

@leopoldhub leopoldhub added bug Something isn't working needing further clarification Waiting for additional details or clarification labels May 21, 2024
@leopoldhub
Copy link
Collaborator

Hi @gidoBOSSftw5731 ,
Since you did not reply, I assume that the buggy behaviour has been fixed with the latest version.
I will, close this issue, if the bug is still there, feel free to reopen it.
Have a nice day

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needing further clarification Waiting for additional details or clarification
Projects
None yet
Development

No branches or pull requests

2 participants