Seg fault on fresh 18.04.5 install #59

Closed
Rid opened this issue Sep 10, 2020 · 11 comments
Labels
bug Something isn't working

Comments


Rid commented Sep 10, 2020

Thanks for taking on the project!

I'm getting a seg fault when running --check or --install.

Here's the gdb output with strace:

write(1, "Running kernel: 4.15.0-117-gener"..., 35Running kernel: 4.15.0-117-generic
) = 35
futex(0x7ffff72f4f38, FUTEX_WAKE_PRIVATE, 2147483647) = 0
pipe2([3, 4], 0)                        = 0
pipe2([5, 6], 0)                        = 0
pipe2([7, 8], O_CLOEXEC)                = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ffff7fe5750) = 214546
close(4)                                = 0
close(6)                                = 0
close(8)                                = 0
read(3, "", 8)                          = 0
close(3)                                = 0
select(8, [5 7], NULL, NULL, NULL)      = 2 (in [5 7])
read(5, "", 4096)                       = 0
close(5)                                = 0
read(7, "", 4096)                       = 0
close(7)                                = 0
wait4(214546, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 214546
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=214546, si_uid=0, si_status=0, si_utime=2, si_stime=0} ---
mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7ffff3b7f000
mprotect(0x7ffff3b80000, 8388608, PROT_READ|PROT_WRITE) = 0
clone(child_stack=0x7ffff437ee70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7ffff437f9d0, tls=0x7ffff437f700, child_tidptr=0x7ffff437f9d0) = 214547
futex(0x7ffff72f4f18, FUTEX_WAKE_PRIVATE, 1D: query_thread() App.show_prev_majors: 0
D: query_thread() App.hide_unstable: true
) = 1
nanosleep({tv_sec=0, tv_nsec=500000000}, D: highest_maj = 5
D: check_installed()
----------------------------------------------------------------------
D: query_installed_packages()
D: dir_create(/tmp/.mainline_6lZDkZDJ)
D: Created directory: /tmp/.mainline_6lZDkZDJ
0x7fffffffe450) = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, 0x7fffffffe450) = 0
nanosleep({tv_sec=0, tv_nsec=500000000}, D: file_write(/tmp/.mainline_6lZDkZDJ/15997310454283254965)
D: file_parent(/tmp/.mainline_6lZDkZDJ/15997310454283254965)
D: dir_create(/tmp/.mainline_6lZDkZDJ)
D: File saved:/tmp/.mainline_6lZDkZDJ/15997310454283254965
D: File deleted: /tmp/.mainline_6lZDkZDJ/15997310454283254965
D: dir_delete():Deleted: /tmp/.mainline_6lZDkZDJ
Found installed: 4.15.0.117.104
D: Package: linux-headers-generic
D: Package: linux-image-generic
D: Package: linux-generic
Found installed: 4.15.0-117.118
D: Package: linux-modules-extra-4.15.0-117-generic
D: Package: linux-image-4.15.0-117-generic
D: Package: linux-libc-dev
D: Package: linux-headers-4.15.0-117
D: Package: linux-modules-4.15.0-117-generic
D: Package: linux-headers-4.15.0-117-generic
----------------------------------------------------------------------
----------------------------------------------------------------------
D: check_updates()
 <unfinished ...>) = ?
+++ killed by SIGSEGV +++

Thread 2 "mainline" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff437f700 (LWP 222817)]
0x0000555555562d25 in linux_kernel_check_updates ()

Here's the stack:

#0  0x0000555555562d25 in linux_kernel_check_updates ()
#1  0x0000555555560465 in linux_kernel_query_thread ()
#2  0x000055555555eb17 in _linux_kernel_query_thread_gthread_func ()
#3  0x00007ffff7053175 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#4  0x00007ffff6a296db in start_thread (arg=0x7ffff437f700) at pthread_create.c:463
#5  0x00007ffff6752a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Author

Rid commented Sep 10, 2020

Strangely, if I update from 4.15.0-117.118 to 5.5.6 using ukuu, I am then able to update to 5.8.7 using mainline.

Can somebody confirm if this is reproducible on 4.15.0-117.118?

@WallyZambotti

Yes, I'm getting the same on 18.04.5 running on Arm64.

$ gdb ./mainline
GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
---Type <return> to continue, or q <return> to quit---
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./mainline...done.
(gdb) run --list
Starting program: /home/odroid/SourcePackages/mainline/mainline --list
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
mainline 1.0.12
[New Thread 0x7fb76a40a0 (LWP 20378)]
[New Thread 0x7fb6ea30a0 (LWP 20379)]
Distribution: Ubuntu 18.04.5 LTS
Architecture: arm64
Running kernel: 4.9.230-76
[New Thread 0x7fb66a20a0 (LWP 20385)]
----------------------------------------------------------------------
----------------------------------------------------------------------
----------------------------------------------------------------------

Thread 4 "mainline" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fb66a20a0 (LWP 20385)]
0x0000005555561e3c in linux_kernel_check_updates () at /home/odroid/SourcePackages/mainline/src/Common/LinuxKernel.vala:530
530                             if (k.version_maj > kernel_latest_installed.version_maj) {
(gdb) bt
#0  0x0000005555561e3c in linux_kernel_check_updates () at /home/odroid/SourcePackages/mainline/src/Common/LinuxKernel.vala:530
#1  0x000000555555ffb4 in linux_kernel_query_thread () at /home/odroid/SourcePackages/mainline/src/Common/LinuxKernel.vala:321
#2  0x000000555555ed64 in _linux_kernel_query_thread_gthread_func (self=0x0) at /home/odroid/SourcePackages/mainline/src/Common/LinuxKernel.vala:197
#3  0x0000007fb7c5a94c in  () at /usr/lib/aarch64-linux-gnu/libglib-2.0.so.0
#4  0x0000007fffffe758 in  ()
(gdb) 


WallyZambotti commented Sep 27, 2020

Confirmed: seg fault on 18.04.5 on both Arm64 (4.9.230-76) and AMD64 (4.15.0-117.118).

@WallyZambotti

I think the logic bug is in check_installed.

It finds what is installed, prints it, and creates a temporary pkern with is_installed set to true.
If the pkern is not found in the kernel list, it is added to the list with is_installed correctly set. If the pkern is already in the kernel list, it is not added again. That is fine, but is_installed on the existing entry is never set to true. Setting it fixes the problem:

					log_msg("Found installed" + ": %s".printf(pkg.version_installed));

					string pkern_name = "%s".printf(pkg.version_installed);
					var pkern = new LinuxKernel(pkern_name, false);
					pkern.is_installed = true;
					pkern.set_apt_pkg_list();

					bool found = false;
					foreach(var k in kernel_list){
						if (k.version_main == pkern.version_main){
							found = true;
							k.apt_pkg_list = pkern.apt_pkg_list;
							// WZ - either the pkern is added with is_installed set to true or
							// WZ - it isn't added and then we must set the is_installed of the already existing k
							k.is_installed = true; // WZ - fix
							break;
						}
					}

					if (!found) kernel_list.add(pkern);
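
To illustrate why the missing flag can end in a crash: if no entry in kernel_list is flagged as installed, a later search for the latest installed kernel presumably finds nothing, and the comparison against kernel_latest_installed at LinuxKernel.vala:530 would then dereference null. The following is a minimal standalone sketch of that reading, not the project's code. Kern, merge_installed and latest_installed are hypothetical stand-ins, and GenericArray replaces whatever list type the project actually uses.

```vala
// Hypothetical, simplified stand-in for the project's LinuxKernel class.
class Kern : Object {
    public string version_main;
    public int version_maj;
    public bool is_installed = false;

    public Kern (string version_main, int version_maj) {
        this.version_main = version_main;
        this.version_maj = version_maj;
    }
}

// Flag an existing entry as installed, or append the new installed entry.
void merge_installed (GenericArray<Kern> list, Kern pkern) {
    pkern.is_installed = true;
    for (int i = 0; i < list.length; i++) {
        var k = list[i];
        if (k.version_main == pkern.version_main) {
            k.is_installed = true; // the step the original loop skipped
            return;
        }
    }
    list.add (pkern);
}

// Return the installed entry with the highest major version, or null if none.
Kern? latest_installed (GenericArray<Kern> list) {
    Kern? best = null;
    for (int i = 0; i < list.length; i++) {
        var k = list[i];
        if (k.is_installed && (best == null || k.version_maj > best.version_maj)) {
            best = k;
        }
    }
    return best;
}

void main () {
    var list = new GenericArray<Kern> ();
    list.add (new Kern ("4.15.0", 4)); // already known from the available-kernel list
    list.add (new Kern ("5.8.7", 5));

    merge_installed (list, new Kern ("4.15.0", 4));

    // Guard against "nothing recognised as installed" instead of dereferencing null.
    var latest = latest_installed (list);
    if (latest == null) {
        print ("no installed kernel recognised, skipping update check\n");
    } else {
        print ("latest installed: %s\n", latest.version_main);
    }
}
```

The null check in main is a second, independent safeguard: even with the flag fix, a system whose kernel packages are not recognised at all would still leave the "latest installed" lookup empty.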


WallyZambotti commented Sep 27, 2020

The next problem is also in check_installed, which does not provide a vendor-neutral way to determine the installed Linux kernels via the package manager. check_installed makes the following call:

pkg_list_installed = Package.query_installed_packages();

A subsequent match on 'linux-image' then picks out the kernel packages:

if (pkg.pname.contains("linux-image")){

This can be tested from a shell with this command:

$ aptitude search --disable-columns -F '%p|%v|%M|%d' '?installed' | grep 'linux-image'
$

On Odroid N2 Ubuntu this command returns nothing because the package naming convention is different.

On Ubuntu 18.04.5 for the Arm64 Odroid N2, the Linux kernel packages are actually named linux-odroid-n2:

		// if (pkg.pname.contains("linux-image")){
		if (pkg.pname.contains("linux-odroid-n2")){  // cheap fix for Odroid N2 Ubuntu

So a vendor-independent method will need a completely different approach!

Owner

bkw777 commented Sep 28, 2020

There really is no good reliable vendor-neutral way to tell what kernel packages or versions are installed. There is a function that tries (poorly) to parse a few different kinds of kernel and package version strings which come from a few different sources and which adhere to a few different conventions or naming rules.

It's very fragile and only works by luck and only as long as nothing changes. There are really no actual rules that you can count on about what might appear in a version string from apt/dpkg, or a deb filename, or a directory name from the mainline-ppa website.

I have a rewrite of that which I started some time ago and never finished. It splits the function into a few separate parsers that are simpler and more reliable by being more limited in scope. That is, one parses only package names from dpkg/apt: not filenames, not mainline website directory names, not the output of uname. Another function parses only uname, and so on. Whenever a kernel is added to the list of packages and versions, its source is recorded with it (did this kernel entry come from the output of apt? or uname? or wget? etc.). And the parsers aren't simple regexes but actual logic that breaks the string into sections and decides what they mean based on where the string came from and what the other sections contain.
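
A rough sketch of what "one small parser per source, with the source recorded on each entry" could look like is below. It is an illustration under assumptions, not the unfinished rewrite itself: KernelSource, KernelEntry, build and the from_* functions are hypothetical names, and the input formats assumed are uname strings like 4.15.0-117-generic, dpkg versions like 4.15.0-117.118, and mainline directory names like v5.8.7/.

```vala
// Where a version string came from (illustrative names, not the project's).
enum KernelSource { DPKG, UNAME, MAINLINE_DIR }

// A parsed kernel version that remembers its source.
class KernelEntry : Object {
    public KernelSource source;
    public string raw;
    public int major;
    public int minor;
    public int micro;
}

// True if the string is non-empty and consists only of ASCII digits.
bool all_digits (string s) {
    if (s.length == 0) return false;
    for (int i = 0; i < s.length; i++) {
        if (!s[i].isdigit ()) return false;
    }
    return true;
}

// Shared helper: read "MAJ.MIN[.MICRO]" from the text before the first '-'.
KernelEntry? build (string s, KernelSource source, string raw) {
    if (s == "") return null;
    string[] dash = s.split ("-");
    string[] parts = dash[0].split (".");
    if (parts.length < 2) return null;
    if (!all_digits (parts[0]) || !all_digits (parts[1])) return null;
    var e = new KernelEntry ();
    e.source = source;
    e.raw = raw;
    e.major = int.parse (parts[0]);
    e.minor = int.parse (parts[1]);
    e.micro = (parts.length > 2 && all_digits (parts[2])) ? int.parse (parts[2]) : 0;
    return e;
}

// Each parser accepts only the shape its own source is assumed to produce.
KernelEntry? from_uname (string raw) {
    return build (raw.strip (), KernelSource.UNAME, raw);
}

KernelEntry? from_dpkg (string raw) {
    return build (raw.strip (), KernelSource.DPKG, raw);
}

KernelEntry? from_mainline_dir (string raw) {
    string s = raw.strip ();
    if (!s.has_prefix ("v")) return null; // directory names are assumed to start with 'v'
    s = s.substring (1);
    if (s.has_suffix ("/")) s = s.substring (0, s.length - 1);
    return build (s, KernelSource.MAINLINE_DIR, raw);
}

void show (KernelEntry? e) {
    if (e == null) {
        print ("not recognised\n");
        return;
    }
    print ("%s -> %d.%d.%d (%s)\n", e.raw, e.major, e.minor, e.micro, e.source.to_string ());
}

void main () {
    show (from_uname ("4.15.0-117-generic"));
    show (from_dpkg ("4.15.0-117.118"));
    show (from_mainline_dir ("v5.8.7/"));
}
```

Keeping the source on the entry means later code can decide how much to trust the remaining, source-specific parts of the string, which this sketch deliberately ignores.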

If we are going to get weird new packages that are still "ubuntu", then we should probably move the regexes, or some other way of expressing recognition rules, into the config file. There it can be a larger list of possible patterns to match, which is easier to maintain and which users can adjust themselves if needed without compiling.
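
As a sketch of that idea, the hard-coded contains("linux-image") test could become a loop over patterns read from configuration. Everything here is an assumption for illustration: the one-pattern-per-line format, the '#' comment convention, and the names DEFAULT_PATTERNS, load_patterns and is_kernel_package are invented, and the project's real config file is not read. Only GLib's Regex.match_simple is an existing call.

```vala
// Hypothetical config content: one regex per line, '#' starts a comment line.
const string DEFAULT_PATTERNS = "# kernel package name patterns\nlinux-image\nlinux-odroid-n2\n";

// Split the config text into non-empty, non-comment patterns.
string[] load_patterns (string config_text) {
    string[] patterns = {};
    foreach (string line in config_text.split ("\n")) {
        string p = line.strip ();
        if (p != "" && !p.has_prefix ("#")) {
            patterns += p;
        }
    }
    return patterns;
}

// True if the package name matches any configured pattern.
bool is_kernel_package (string pname, string[] patterns) {
    foreach (string p in patterns) {
        if (Regex.match_simple (p, pname)) {
            return true;
        }
    }
    return false;
}

void main () {
    string[] patterns = load_patterns (DEFAULT_PATTERNS);
    string[] names = { "linux-image-4.15.0-117-generic", "linux-odroid-n2", "vim" };
    foreach (string n in names) {
        print ("%s: %s\n", n, is_kernel_package (n, patterns).to_string ());
    }
}
```

With the patterns kept as data, supporting a new naming scheme such as the Odroid one would be a one-line config change rather than a recompile.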

A large part of this will simply always be dumb manual maintenance to keep updating a dumb list of known strings. For instance, there was never any way to predict the odroid kernels, nor any generic pattern or rule that would have recognized them properly when they appeared. There is no designated spot in the string where "distribution name goes here", which we could parse by some rule like "if there's a dash after 3 dots, then between the dash and the next dot is the distribution name". With such a rule we could recognize any distro without matching the literal string, and a new distro would already work as long as it adhered to the rule. There is no nice rule like that. There isn't even a rule that there must always be 3 numbers, or that the version number parts won't have letters mixed in with them, let alone all the free-form junk that comes before and after the main numbers, which is totally unpredictable except within narrow contexts.

By narrow context I mean: if you know a given string came from, say, the mainline-ppa website, and you know it was from after some year and before some other year, THEN you can tell how to interpret the string reliably. If you just get the string with no context, you can say almost nothing about it for sure. You can't even get the most basic main version numbers for sure, because people put all kinds of dates, git hashes, build numbers, revision tags and other stuff in there which are also numbers, and by itself, a 5 is just a 5. A program has no way to know whether it is supposed to mean kernel version 5.x.x, because version strings are not database fields with defined meanings.

That doesn't mean we can't still have a nice convenient kernel package installer. It just means you can't be surprised and outraged that it isn't magic and that it breaks at the slightest new deviation until the code is updated to handle that specific new string or pattern. That's why I said maybe the best way to deal with it is to make the pattern matching a config setting, so that the user can adjust it at run time when they need to.

It's been a while since I last looked at it, so I am handwaving a bit, sorry.

bkw777 added the bug label Sep 28, 2020
@WallyZambotti

@bkw777 I had a look as well and spoke to a few other groups, and I have to agree with you: there doesn't seem to be a reliable way to determine what kernel packages are installed.


leopck commented Dec 22, 2020

I can confirm this bug as well: I am able to reproduce it on a fresh install of 18.04.5. Is there any resolution for this?


xxxajk commented Feb 3, 2021

Same bug here too. I must stay on a stable 4.15.0 kernel, but want to test newer ones every now and again until the video card works. For now I can do this manually, but that's a touch painful.

@KenSharp

Is this the same bug I tried reporting?

Owner

bkw777 commented Aug 13, 2021

I believe this is fixed with d3b91b0

bkw777 closed this as completed Aug 13, 2021