This repository has been archived by the owner on Sep 2, 2021. It is now read-only.

Warning at kernel/sched/core.c:4637 after booting #34

Closed
terencode opened this issue Jun 18, 2021 · 7 comments

Comments

@terencode

I see the following warning shortly after booting Linux 5.12.11 with CacULE v5.12-r2 (NO_HZ_FULL):

WARNING: CPU: 0 PID: 227 at kernel/sched/core.c:4637 sched_tick_remote+0x123/0x180
Modules linked in: uas usb_storage iwlmvm nls_iso8859_1 vfat fat mac80211 intel_rapl_msr intel_rapl_common kvm_amd kvm libarc4 snd_hda_codec_realtek iwlwifi snd_hda_codec_generic ledtrig_audio snd_usb_audio crct10dif_pclmul crc32_pclmul snd_hda_intel ghash_clmulni_intel aesni_intel cfg80211 snd_intel_dspcfg snd_hda_codec crypto_simd cryptd igb rapl snd_usbmidi_lib snd_rawmidi snd_seq_device i2c_piix4 snd_hwdep snd_hda_core pcspkr ccp dca zenpower(OE) rfkill rng_core gpio_amdpt gpio_generic pinctrl_amd acpi_cpufreq mac_hid wmi_bmof mxm_wmi vendor_reset(OE) pkcs8_key_parser i2c_dev ledtrig_timer it87(OE) hwmon_vid ee1004 snd_aloop snd_pcm snd_timer snd soundcore v4l2loopback_dc(OE) videodev mc sg crypto_user asus_wmi_sensors(OE) wmi fuse ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 amdgpu crc32c_intel xhci_pci xhci_pci_renesas drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm agpgart vfio_pci irqbypass
 vfio_virqfd vfio_iommu_type1 vfio
CPU: 0 PID: 227 Comm: kworker/u64:12 Tainted: G           OE     5.12.11-168-tkg-cacule #1
Hardware name: System manufacturer System Product Name/CROSSHAIR VI HERO, BIOS 7901 07/31/2020
Workqueue: events_unbound sched_tick_remote
RIP: 0010:sched_tick_remote+0x123/0x180
Code: 00 00 41 5d e9 4e b1 fe ff 83 bd 18 0a 00 00 01 76 46 48 8b 85 40 0a 00 00 49 2b 85 20 01 00 00 ba 00 5e d0 b2 48 39 d0 76 90 <0f> 0b eb 8c 0f 0b 5b 5d 41 5c 41 5d c3 89 c2 e9 fd fe ff ff 48 c7
RSP: 0018:ffffac6b80947e68 EFLAGS: 00010082
RAX: fffffffffd9fac54 RBX: 0000000000000008 RCX: ffff8fc28ea00000
RDX: 00000000b2d05e00 RSI: 00000000239f5376 RDI: 0000000000000008
RBP: ffff8fc28ea2c640 R08: 00000000c127f591 R09: 000000000000000f
R10: 000000000000000f R11: fefefefefefefeff R12: ffff8fc28ea31688
R13: ffff8fbf8fc68000 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8fc28e800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb1a6874000 CR3: 00000001061b2000 CR4: 0000000000350ef0
Call Trace:
 process_one_work+0x217/0x3e0
 worker_thread+0x4d/0x3d0
 ? process_one_work+0x3e0/0x3e0
 kthread+0x134/0x160
 ? kthread_associate_blkcg+0xc0/0xc0
 ret_from_fork+0x22/0x30
@ptr1337

ptr1337 commented Jun 19, 2021

Maybe an error from the linux-tkg project.

https://seafile.ptr1337.dev/d/4046e0b9a4bf41d2abb0/

Here are some precompiled Genv3 kernels; check whether the problem also happens with them.

@hamadmarri
Owner

Hi @terencode

http://lkml.iu.edu/hypermail/linux/kernel/1804.0/01103.html

If it is the same problem, I think it is related to the mainline kernel. CacULE makes no changes to no_hz_*

Thank you

@terencode
Author

It's not coming from linux-tkg. I tried your build linux-cacule-5.12.12-1-x86_64.pkg.tar.zst and can still reproduce the warning.
There is also no warning when using linux-tkg with NO_HZ_FULL and PDS instead of CacULE.

@ptr1337

ptr1337 commented Jun 19, 2021

Which bootloader are you using, and which config options are you using there?

Which distro are you using?

Your system specs would also be interesting.

@raykzhao

Hi @terencode @ptr1337 @hamadmarri,

I believe this is the same issue as I previously mentioned here: #23 (reply in thread). Since this warning comes from a check and the computed delta is not actually used anywhere, currently I just comment out this check in order to suppress the warning:

--- a/kernel/sched/core.c	2021-05-07 20:53:26.000000000 +1000
+++ b/kernel/sched/core.c	2021-05-09 20:44:03.476618640 +1000
@@ -4619,14 +4619,14 @@
 
 	update_rq_clock(rq);
 
-	if (!is_idle_task(curr)) {
-		/*
-		 * Make sure the next tick runs within a reasonable
-		 * amount of time.
-		 */
-		delta = rq_clock_task(rq) - curr->se.exec_start;
-		WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
-	}
+	//if (!is_idle_task(curr)) {
+		///*
+		 //* Make sure the next tick runs within a reasonable
+		 //* amount of time.
+		 //*/
+		//delta = rq_clock_task(rq) - curr->se.exec_start;
+		//WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
+	//}
 	curr->sched_class->task_tick(rq, curr, 0);
 
 	calc_load_nohz_remote(rq);
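
For what it's worth, the register dump in the original report can be read against this check. RDX holds 0xb2d05e00, which is exactly 3 * NSEC_PER_SEC, and RAX holds 0xfffffffffd9fac54, which appears to be the computed delta (this mapping is an assumption based on the `sub`/`cmp` sequence in the Code: line). If that reading is right, the delta is not a long gap at all: it is a small negative difference (about -39.9 ms) that wrapped around because the subtraction is unsigned. A minimal sketch of that arithmetic, with hypothetical helper names that do not exist in the kernel:

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper mirroring the check in the diff above:
 * an unsigned 64-bit subtraction followed by a comparison against 3 s.
 * Returns 1 when the WARN_ON_ONCE condition would be true.
 */
static int tick_check_warns(uint64_t clock_task, uint64_t exec_start)
{
	uint64_t delta = clock_task - exec_start; /* wraps if exec_start is ahead */

	return delta > NSEC_PER_SEC * 3;
}

/*
 * Hypothetical helper: reinterpret the unsigned delta as signed nanoseconds
 * to see how far exec_start is ahead of the clock. This relies on two's
 * complement conversion, which holds on x86-64 with GCC and Clang.
 */
static int64_t delta_as_signed_ns(uint64_t delta)
{
	return (int64_t)delta;
}
```

With the value from the report, `delta_as_signed_ns(0xfffffffffd9fac54ULL)` gives -39867308, so `exec_start` would be roughly 40 ms ahead of `rq_clock_task(rq)`, and any such negative difference trips the check regardless of how recently the tick ran.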

@terencode
Author

> Which bootloader are you using, and which config options are you using there?

Not sure why the bootloader would matter here, but I'm using rEFInd.
When you say "config options", I assume you mean the kernel parameters passed by the bootloader?
Here they are: audit=0 root=UUID=a30ceffc-8940-46b1-96b0-952e9642799a rw initrd=amd-ucode.img initrd=initramfs-linux-tkg-cacule.img quiet splash loglevel=3 vt.global_cursor_default=0 rd.systemd.show_status=auto rd.udev.log-priority=3 libahci.ignore_sss=1 scsi_mod.use_blk_mq=1 threadirqs amdgpu.ppfeaturemask=0xffffffff nowatchdog vfio-pci.ids=1002:6610 xhci_hcd.quirks=270336 nohz_full=1-5,7-11
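
To make the `nohz_full=1-5,7-11` part concrete: the CPUs in that list are the ones allowed to stop their tick, and the remaining CPUs keep ticking and pick up housekeeping work. The sketch below expands the list into a bitmask and derives the leftover CPUs. It is only an illustration: `parse_cpulist` and `housekeeping_mask` are made-up names, it handles plain `a` and `a-b` entries only (not the full kernel cpulist syntax), and the CPU count of 12 is an assumption based on a Ryzen 5 3600 with SMT enabled.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a simple CPU list such as "1-5,7-11" into a
 * bitmask (bit N set means CPU N is in the list). Supports CPUs 0..63.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_cpulist(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	while (*s) {
		char *end;
		unsigned long lo, hi;

		if (*s < '0' || *s > '9')
			return -1;
		lo = strtoul(s, &end, 10);
		hi = lo;
		s = end;

		if (*s == '-') {		/* range entry "lo-hi" */
			s++;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtoul(s, &end, 10);
			s = end;
		}
		if (lo > hi || hi > 63)
			return -1;

		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			m |= (uint64_t)1 << cpu;

		if (*s == ',') {
			s++;
			if (!*s)		/* trailing comma */
				return -1;
		} else if (*s) {
			return -1;		/* unexpected character */
		}
	}

	*mask = m;
	return 0;
}

/* Hypothetical helper: CPUs among the first ncpus that are NOT in nohz_full. */
static uint64_t housekeeping_mask(uint64_t nohz_full, unsigned int ncpus)
{
	uint64_t all = ncpus >= 64 ? ~(uint64_t)0 : (((uint64_t)1 << ncpus) - 1);

	return all & ~nohz_full;
}
```

For `1-5,7-11` on 12 CPUs this leaves CPUs 0 and 6, which seems consistent with the trace, where `sched_tick_remote` runs from an unbound workqueue on CPU 0.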

> Which distro are you using?

Arch Linux

> Your system specs would also be interesting.

Here are some:

  • CPU: Ryzen 5 3600
  • GPU: Vega 56
  • Motherboard: Crosshair VI Hero
  • RAM: 2x8GB

> Hi @terencode @ptr1337 @hamadmarri,

Hey!

> I believe this is the same issue as I previously mentioned here: #23 (reply in thread). Since this warning comes from a check and the computed delta is not actually used anywhere, currently I just comment out this check in order to suppress the warning:

Ah dang, I searched the thread with Ctrl+F for someone having a similar issue, but forgot that once a thread reaches a certain length, messages are hidden and you need to hit the "view more" button...

I see hamadmarri asked some questions about your input at the end of that thread; maybe this is a good occasion to continue the discussion here?

@hamadmarri
Owner

Hi @raykzhao
cc: @terencode @ptr1337

I will add your patch to CacULE soon. I was planning to add it at the time, but I got busy and forgot about it.

Thank you so much

hamadmarri pushed a commit that referenced this issue Jun 28, 2021
hamadmarri pushed a commit that referenced this issue Jun 28, 2021
@ghost ghost mentioned this issue Aug 1, 2021
4 participants