mythbackend hangs on startup with kernel 6.3.7 and above, and is not killable. #761

Closed
Jpilk opened this issue Jun 16, 2023 · 27 comments
Labels
upstream Upstream Issue

Comments

@Jpilk

Jpilk commented Jun 16, 2023

  • Platform: el7 with elrepo mainline kernel; Fedora 37; also reported with Debian/Testing

  • MythTV version: 32 fixes as of Jan 2023; current master

  • Package version: rpms built from gtb script

  • Component: mythbackend

What steps will reproduce the bug?

Start mythbackend from a KDE Konsole, as I normally do.

How often does it reproduce? Is there a required condition?

Every time, with kernel 6.3.7 and above. Works as expected with 6.3.6.

What is the expected behaviour?

What do you see instead?

[root@HP_Box john]# journalctl -S -8h | grep -v dracut | grep -A 10 "task mythbackend:"
Jun 15 09:34:44 HP_Box kernel: INFO: task mythbackend:2889 blocked for more than 122 seconds.
Jun 15 09:34:44 HP_Box kernel: Tainted: G E 6.3.8-1.el7.elrepo.x86_64 #1
Jun 15 09:34:44 HP_Box kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 15 09:34:44 HP_Box kernel: task:mythbackend state:D stack:0 pid:2889 ppid:1 flags:0x00004004
Jun 15 09:34:44 HP_Box kernel: Call Trace:
Jun 15 09:34:44 HP_Box kernel:
Jun 15 09:34:44 HP_Box kernel: __schedule+0x357/0x9f0
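
The hung-task lines above have a fixed shape, so the task name, PID and blocked time can be pulled out of the journal with a small filter. A minimal sketch, assuming the message format shown in this report; `hung_tasks` is a hypothetical helper name, not part of MythTV or systemd:

```shell
#!/bin/sh
# hung_tasks (hypothetical helper): read kernel log lines on stdin and print
# "name pid seconds" for every hung-task warning found.
hung_tasks() {
    sed -n 's/.*INFO: task \([^:]*\):\([0-9]*\) blocked for more than \([0-9]*\) seconds.*/\1 \2 \3/p'
}
```

It could be fed from the journal with `journalctl -k -S -8h | hung_tasks`, where `-k` restricts the output to kernel messages.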

Reported on the users list. See this thread. http://lists.mythtv.org/pipermail/mythtv-users/2023-June/412025.html

Additional information

@rungitringit

rungitringit commented Jun 19, 2023

After my Fedora 37 machine upgraded to kernel 6.3.7, the backend service started failing on start. I had been investigating MariaDB with help from the MythTV forums, as I noticed messages like this in its log:

[Warning] Aborted connection 3 to db: 'unconnected' user: 'unauthenticated' host: 'mediapc' (This connection closed normally without authentication)

However, I cannot fault the database at all. Also, once the mythbackend service starts it cannot be stopped: it gets stuck in a 'D' (uninterruptible sleep) state, which is another indicator that the problem may be kernel related.

FWIW I don't use KDE, I'm using LXDE.

Jun 18 21:35:12 mediapc kernel: sysrq: Show Blocked State
Jun 18 21:35:12 mediapc kernel: task:mythbackend     state:D stack:0     pid:4666  ppid:1      flags:0x00004002
Jun 18 21:35:12 mediapc kernel: Call Trace:
Jun 18 21:35:12 mediapc kernel:  <TASK>
Jun 18 21:35:12 mediapc kernel:  __schedule+0x449/0x1480
Jun 18 21:35:12 mediapc kernel:  ? sysvec_apic_timer_interrupt+0xe/0x90
Jun 18 21:35:12 mediapc kernel:  schedule+0x5e/0xd0
Jun 18 21:35:12 mediapc kernel:  schedule_preempt_disabled+0x15/0x30
Jun 18 21:35:12 mediapc kernel:  __mutex_lock.constprop.0+0x399/0x700
Jun 18 21:35:12 mediapc kernel:  dvb_frontend_stop+0x3b/0x1e0 [dvb_core]
Jun 18 21:35:12 mediapc kernel:  dvb_frontend_open+0x1ac/0x5c0 [dvb_core]
Jun 18 21:35:12 mediapc kernel:  ? avc_has_perm+0x65/0xf0
Jun 18 21:35:12 mediapc kernel:  dvb_device_open+0xba/0x120 [dvb_core]
Jun 18 21:35:12 mediapc kernel:  chrdev_open+0xc5/0x250
Jun 18 21:35:12 mediapc kernel:  ? __pfx_chrdev_open+0x10/0x10
Jun 18 21:35:12 mediapc kernel:  do_dentry_open+0x1e2/0x410
Jun 18 21:35:12 mediapc kernel:  path_openat+0xae4/0x1110
Jun 18 21:35:12 mediapc kernel:  ? __wake_up_common+0x73/0x180
Jun 18 21:35:12 mediapc kernel:  ? avc_has_extended_perms+0x250/0x560
Jun 18 21:35:12 mediapc kernel:  do_filp_open+0xb3/0x160
Jun 18 21:35:12 mediapc kernel:  do_sys_openat2+0xaf/0x170
Jun 18 21:35:12 mediapc kernel:  __x64_sys_openat+0x6e/0xa0
Jun 18 21:35:12 mediapc kernel:  do_syscall_64+0x5c/0x90
Jun 18 21:35:12 mediapc kernel:  ? dvb_frontend_ioctl+0x1f/0x40 [dvb_core]
Jun 18 21:35:12 mediapc kernel:  ? __x64_sys_ioctl+0xac/0xd0
Jun 18 21:35:12 mediapc kernel:  ? syscall_exit_to_user_mode+0x1b/0x40
Jun 18 21:35:12 mediapc kernel:  ? do_syscall_64+0x6b/0x90
Jun 18 21:35:12 mediapc kernel:  ? exc_page_fault+0x74/0x170
Jun 18 21:35:12 mediapc kernel:  entry_SYSCALL_64_after_hwframe+0x72/0xdc
Jun 18 21:35:12 mediapc kernel: RIP: 0033:0x7ff59671dfb0
Jun 18 21:35:12 mediapc kernel: RSP: 002b:00007ffda0090880 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
Jun 18 21:35:12 mediapc kernel: RAX: ffffffffffffffda RBX: 0000000000000802 RCX: 00007ff59671dfb0
Jun 18 21:35:12 mediapc kernel: RDX: 0000000000000802 RSI: 000055d05c33a9e8 RDI: 00000000ffffff9c
Jun 18 21:35:12 mediapc kernel: RBP: 000055d05c33a9e8 R08: 0000000000000000 R09: 0000000000000010
Jun 18 21:35:12 mediapc kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 000055d05c317a51
Jun 18 21:35:12 mediapc kernel: R13: 000055d05c317910 R14: 000055d05c32ce08 R15: 0000000000000001
Jun 18 21:35:12 mediapc kernel:  </TASK>
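
Tasks in the 'D' state can also be listed from userspace, without sysrq. A minimal sketch, assuming a procps-style `ps`; `filter_dstate` is a hypothetical helper name, and only the first word of the command name is printed:

```shell
#!/bin/sh
# filter_dstate (hypothetical helper): read "pid stat comm" lines on stdin and
# print "pid comm" for tasks whose state starts with D (uninterruptible sleep).
filter_dstate() {
    awk '$2 ~ /^D/ { print $1, $3 }'
}
```

A possible invocation is `ps -eo pid=,stat=,comm= | filter_dstate`. A task in plain uninterruptible sleep generally does not act on signals, SIGKILL included, until the blocking kernel call returns, which matches the unkillable mythbackend described here.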

@rungitringit

Hopefully this bug is redundant and a kernel fix is coming, as commented by John in the mailing list: https://lists.mythtv.org/pipermail/mythtv-users/2023-June/412038.html

@Jpilk
Author

Jpilk commented Jun 19, 2023

And I received this link via the elrepo team:

https://bugzilla.kernel.org/show_bug.cgi?id=217566

with an expectation that Fedora packages will be available later this week.

@Jpilk
Author

Jpilk commented Jun 19, 2023

After restoring the rpmfusion nvidia akmod packages, which I had erased while trying to prevent unwanted kernel updates, mythbackend appears to be happy with the 6.3.9 release-candidate kernel:

uname -r
6.3.9-0.rc1.20230619gtc4f2a2d8.250.vanilla.fc37.x86_64

The mythbackend is a recent master with PR 752. It calls itself v34-Pre-287-78171a7dcf

Note that Gary's comment on the mythtv users list suggests that this is probably not the final dvb-related kernel fix.

@rungitringit

I realize this problem is caused by the kernel but would it be possible for mythbackend to handle situations like this gracefully? It's a pain that you have to reboot in order to kill it.

@Jpilk
Author

Jpilk commented Jun 22, 2023

This issue appears to have been fixed with the release of kernel 6.3.9, on my Fedora 37 and 'el7' boxes.

https://mirrors.edge.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.3.9

Fix applied: Revert "media: dvb-core: Fix use-after-free on race condition at dvb_frontend"
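
Going by the reports in this thread (6.3.6 good, 6.3.7 and 6.3.8 hang, 6.3.9 fixed), a startup script could warn before launching mythbackend on an affected kernel. A minimal sketch; `kernel_in_reported_range` is a hypothetical helper, and the range covers only the 6.3.x versions mentioned here, not other stable series:

```shell
#!/bin/sh
# kernel_in_reported_range (hypothetical helper): succeed (exit status 0) if the
# given kernel release string is one of the versions reported as hanging here.
kernel_in_reported_range() {
    # Drop the packaging suffix, e.g. "6.3.8-1.el7.elrepo.x86_64" -> "6.3.8".
    ver=${1%%-*}
    case "$ver" in
        6.3.7|6.3.8) return 0 ;;  # assumption: range taken from this thread only
        *) return 1 ;;
    esac
}
```

For example: `kernel_in_reported_range "$(uname -r)" && echo "this kernel may hang mythbackend on DVB open" >&2`.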

@Jpilk Jpilk closed this as completed Jun 22, 2023
@rungitringit

@Jpilk how did you get kernel 6.3.9 on Fedora 37? Are you running beta packages for Fedora?

My system updated to 6.3.8, which also has the problem, and it is now having issues (possibly NVIDIA driver related) booting back into 6.3.6, which leaves me with no working kernel.

Thanks!

@Jpilk
Author

Jpilk commented Jun 24, 2023

I don't usually run test kernels. The one I have now came as an update of the one from here.

https://bugzilla.kernel.org/show_bug.cgi?id=217566#c13

I haven't yet seen 6.3.9 in the fedora updates repo.

[john@HPFed ~]$ uname -rsvp
Linux 6.3.9-250.vanilla.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jun 21 14:48:09 UTC 2023 x86_64

@garybuhrmaster
Contributor

> I don't usually run test kernels. The one I have now came as an update of the one from here.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=217566#c13
>
> I haven't yet seen 6.3.9 in the fedora updates repo.

Kernel 6.3.9 is currently in the updates-testing repo for Fedora (it has not yet been pushed to updates because it has not completed all of the testing criteria).

You can use something of the form

   dnf --enablerepo=updates-testing upgrade kernel

to get the kernel from the updates-testing repo now.

You can also pull the rpms directly from the fedora koji build system, or use the command suggested on the bodhi kernel update gating pages for your release (or just do your own mock build from the fedora sources, as usual).

@rungitringit

rungitringit commented Jun 25, 2023

Thanks @garybuhrmaster

I was eventually able to get kernel 6.3.6 working again, however it seems there's flickering in new recordings. Specifically the picture flickers a few pixels up and down (on the vertical axis only) rapidly during all recordings.

It could be something specific to my setup but could you please keep an eye out for recording quality losses or glitches when testing new kernels?

EDIT: A cold boot of my system again seems to have resolved it. I guess it was a playback problem and probably unrelated to this issue.

@Jpilk
Author

Jpilk commented Jun 25, 2023

It looks to me as if the current holdup is with ryzen in F38.

https://bodhi.fedoraproject.org/updates/?search=kernel

@Jpilk Jpilk reopened this Jun 25, 2023
@Jpilk
Author

Jpilk commented Jun 26, 2023

The package-manager icon just offered a new test kernel for Fedora 37. mythbackend looks fine with it.

The repo list was from https://copr.fedorainfracloud.org/coprs/g/kernel-vanilla/stable-rc/
and I would expect that the packages will get into the 'updates' repo 'soon'.

[john@HPFed ~]$ uname -a
Linux HPFed 6.3.10-0.rc1.20230626gt3d494887.250.vanilla.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Jun 26 20:08:09 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

@Jpilk
Author

Jpilk commented Jun 27, 2023

And 6.4.0 seems ok too, in this build from elrepo :-)

[john@HP_Box ~]$ uname -a
Linux HP_Box 6.4.0-1.el7.elrepo.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Jun 26 18:48:41 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

2023-06-27 10:09:56.844823 C mythbackend version: HEAD [v32.0-749472ba33] www.mythtv.org
2023-06-27 10:09:56.844834 C Qt version: compile: 5.9.7, runtime: 5.9.7
2023-06-27 10:09:56.844888 I Scientific Linux 7.9 (Nitrogen) (x86_64)

@Jpilk
Author

Jpilk commented Jun 29, 2023

I'm confused about the Fedora release process, but again packagekit has installed an update which is working fine for me. And that included a successful DVB-T/T2 rescan.

[john@HPFed ~]$ uname -a
Linux HPFed 6.3.10-250.vanilla.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jun 28 10:12:45 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

2023-06-29 10:49:40.938983 C mythbackend version: HEAD [v34-Pre-45de13278b] www.mythtv.org
2023-06-29 10:49:40.938990 C Qt version: compile: 5.15.9, runtime: 5.15.9
2023-06-29 10:49:40.939018 I Fedora Linux 37 (Thirty Seven) (x86_64)

@Jpilk
Author

Jpilk commented Jun 30, 2023

A dnf upgrade today has installed kernel-6.4.1-0.rc3.20230630gt94976aa9.758.vanilla.fc37.x86_64

This has failed to complete booting. Booting pauses at 'Terminate Plymouth Boot Screen', then fails to start abrtd.service and hangs citing Dependency failures for ABRT kernel log watcher, kernel panic detection, and others.

I waited for what seemed a long time without any change, then resorted to a long press on the power button. REISUB reported that all except S is disabled.

Rebooting 6.3.10-250.vanilla got MythTV working again, but only after going through the same steps, including the Dependency failure warnings. Eventually it moved on.

@Jpilk
Author

Jpilk commented Jun 30, 2023

This system has an nvidia card using the rpmfusion 470xx drivers. The build by akmods failed after kernel installation and before the attempted reboot. 'sudo akmods --force' under 6.3.10 also fails to build for 6.4.1 now.

@rungitringit

Give this repo a try. I've been using it for years instead of rpmfusion: https://negativo17.org/nvidia-driver/

@Jpilk
Author

Jpilk commented Jul 1, 2023

Thanks for the suggestion :-)

I suspect that rpmfusion is ok on 6.3.x and I'll try staying with that for now. They had a hiccup on 6.2 -> 6.3 and maybe 6.3 -> 6.4 is similar.

@Jpilk
Author

Jpilk commented Jul 1, 2023

I have installed kernel-6.3.10-100.fc37 as shown at https://bodhi.fedoraproject.org/updates/?search=kernel using the one-time command line quoted in its link there. It's working fine for me with the rpmfusion 470xx nvidia driver in another 4-core x86_64 HP box.

Problems are still being reported for the fc38 version, mostly Ryzen-related, and both kernels are still marked as 'testing', but this seems to me likely to be the current best near-mainstream option.

@Jpilk
Author

Jpilk commented Jul 2, 2023

That was yesterday, on a system untouched for several weeks. On trying to do it with the HPFed system above, dnf is showing some strange akmods/kernel related dependencies, and the intended updates-testing kernel doesn't boot. Still on 6.3.10 vanilla...

@Jpilk
Author

Jpilk commented Jul 3, 2023

https://bodhi.fedoraproject.org/updates/?packages=kernel

[john@HPFed ~]$ uname -rsvp
Linux 6.3.11-100.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Jul 2 13:18:29 UTC 2023 x86_64

has now reached the Fedora updates-testing repo and I have MythTV running as normal under it. Initially the nvidia driver was not built, but the build was successful after installing (again) the appropriate packages as described in the rpmfusion nvidia howto, then running 'sudo akmods --force'. I usually watch build progress with atop, and this time it looked different: the CMD showed as akmods rather than the individual stages (make, cc1, depmod etc.) seen when it runs automatically. But it works, and reboots seem fine.

sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-83d5a4c7ea ## gets this kernel
sudo dnf --enablerepo=updates-testing install xorg-x11-drv-nvidia-470xx akmod-nvidia-470xx ## from rpmfusion
sudo akmods --force ## builds driver

@Jpilk
Author

Jpilk commented Jul 6, 2023

The fc38 version of 6.3.11 has been pushed to stable because it includes a security fix (StackRot, CVE-2023-3269), although there are still problems with AMD Ryzen hardware. 6.3.11 in fc37 still looks good to me on AMD-free systems.

@Jpilk
Author

Jpilk commented Jul 7, 2023

Apparently 6.3.11 for fc37 didn't include the StackRot fix, but 6.3.12 does. MythTV seems fine with it.

@Jpilk
Author

Jpilk commented Jul 8, 2023

This thread looks related, too. OpenSuSE Leap 15.4.

http://lists.mythtv.org/pipermail/mythtv-users/2023-July/412183.html

and a firmware download failure (not in MythTV) in 6.3.11-200.fc38.x86_64

https://bugzilla.kernel.org/show_bug.cgi?id=217566#c27

@Jpilk
Author

Jpilk commented Jul 23, 2023

Kernel 6.4.4 is now in the Fedora 37 stable updates repo, and rpmfusion has an updated nvidia 470xx driver.

MythTV, recent master, seems happy with these.

Linux 6.4.4-100.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jul 19 17:06:05 UTC 2023 x86_64
xorg-x11-drv-nvidia-470xx-470.199.02-1.fc37.x86_64
akmod-nvidia-470xx-470.199.02-1.fc37.x86_64
xorg-x11-drv-nvidia-470xx-cuda-470.199.02-1.fc37.x86_64

@garybuhrmaster
Contributor

I do not have the power to close this issue, but given the upstream kernels have been fixed, I recommend this issue be closed.

@billmeek
Contributor

Thanks Gary.
