Crash reported in 64-bit pvm-0.1.17 #5

Closed
peabee opened this issue Oct 29, 2018 · 41 comments

@peabee

peabee commented Oct 29, 2018

See:
http://murga-linux.com/puppy/viewtopic.php?p=1008421#1008421

64-bit pvm-0.1.17 crashes on insertion of external drive with:
pool[7353]: segfault at 40 ip 00007f2127bb54f0 sp 00007f21263b6dd8 error 4 in libpupvm.so.0.0.0[7f2127ba9000+13000]
Code: 40 00 48 8b 05 11 6b 20 00 48 8b 00 48 85 c0 74 e4 ff d0 48 89 df 5b e9 de aa ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 <8b> 47 40 85 c0 74 09 48 8b 07 ff a0 a0 00 00 00 c3 0f 1f 44 00 00

@peabee
Author

peabee commented Jul 29, 2020

This is still a problem in July 2020....
Here are the steps to reproduce using a pristine frugal install of FossaPup64-9.0.4-rc2:

  1. Download the 0.1.17 pvm .pet built on FossaPup64:
     https://u.pcloud.link/publink/show?code=XZL4iakZ15tj5jAoIkXVHSpPIcfsEJz2NqUX
  2. Install the .pet
  3. System -> Puppy Event Manager -> ROX Icons -> tick 'Auto launch handler...'
  4. In a terminal (do not close):
     pup-volume-monitor-admin -s
  5. Plug in a USB drive; when the pmount window appears, mount the USB drive

dmesg shows crash:
pool-pup-volume[13330]: segfault at 40 ip 00007f745fce9e34 sp 00007f745ed9fe08 error 4 in libpupvm.so.0.0.0[7f745fce3000+a000]
[ 340.923433] Code: 48 85 c0 74 ec 48 83 ec 18 48 89 7c 24 08 ff d0 48 8b 7c 24 08 48 83 c4 18 e9 e8 a3 ff ff 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <8b> 47 40 85 c0 74 0d 48 8b 07 ff a0 a0 00 00 00 0f 1f 40 00 c3 0f
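
Both kernel log lines can be decoded by hand. A minimal sketch, assuming a POSIX shell: subtracting the mapping base shown in brackets from `ip` gives the offset of the faulting instruction inside libpupvm.so, and the low bits of the `error` value describe the failed access. The helper names `fault_offset` and `decode_pf_error` are made up for illustration. The marked bytes `<8b> 47 40` decode to a 32-bit load from `[rdi+0x40]`, which fits a fault address of 0x40 if the first argument in `rdi` was NULL.

```shell
# Offset of the faulting instruction inside the mapped library:
# ip minus the base address printed in brackets in the kernel log line.
fault_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# Decode the low three bits of the x86 page-fault error code.
decode_pf_error() {
    code=$1
    if [ $(( code & 1 )) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    if [ $(( code & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $(( code & 4 )) -ne 0 ]; then mode="user"; else mode="kernel"; fi
    echo "$mode-mode $access, $cause"
}

fault_offset 0x00007f2127bb54f0 0x7f2127ba9000   # 2018 report -> 0xc4f0
fault_offset 0x00007f745fce9e34 0x7f745fce3000   # 2020 report -> 0x6e34
decode_pf_error 4                                # -> user-mode read, page not present
```

With an unstripped library, `addr2line -f -e libpupvm.so.0.0.0 0x6e34` should map the offset back to a function and line; whether the packaged library carries symbols is not stated here.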

@peabee
Author

peabee commented Jul 29, 2020

Maybe @wdlkmpx can help as he forked pvm - but his version has same crash??? @01micko

@wdlkmpx
Contributor

wdlkmpx commented Jul 30, 2020

The p-v-m implements classes and stuff and I probably deleted a few more lines than I should have.

Does 0.1.15 work ok?

I was planning to rewrite the pvm using the same logic as ... I don't even recall the name of some scripts. Without the gobject stuff.

Although now I understand gtk and glib a bit more and I guess I'm in the position to fix issues.

@wdlkmpx
Contributor

wdlkmpx commented Jul 31, 2020

What I'll do is rename the project and change the whole codebase, but not now.

Meanwhile I can try to see what's wrong, but I need a link to an ISO file + a working devx.

I must compile pvm with debug symbols and perform a git bisect to trace the origin of the bug.

@peabee
Author

peabee commented Jul 31, 2020

> Does 0.1.15 work ok?

I don't seem to have either a compiled 64-bit compatible package or the 0.1.15 sources to compile a new version so unable to confirm.....

@01micko
Owner

01micko commented Jul 31, 2020

@peabee

git clone https://github.com/01micko/pup-volume-monitor.git
cd pup-volume-monitor
git checkout d6a3bfd00e36e4615d0c82a69123d9fac5fbefad

You can checkout any other sha hash the same way. That commit is the last one before the day that pvm was bumped to 0.1.16

@wdlkmpx
Contributor

wdlkmpx commented Jul 31, 2020

Ok I performed a git bisect and found that apparently the bug is present in all git revisions. That doesn't look good.
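
Because the crash only shows up after a drive is plugged in, each bisect step here has to be tested by hand. A rough sketch of how such a session might be driven, assuming the commit 01micko pointed to above is a usable "good" starting point (not confirmed at this point in the thread); `bisect_verdict` is a hypothetical helper, not part of pvm.

```shell
# Hypothetical helper: translate the result of a manual hotplug test
# into the git bisect command to run next.
bisect_verdict() {
    case "$1" in
        crash) echo "git bisect bad" ;;    # segfault seen in dmesg
        ok)    echo "git bisect good" ;;   # drive mounted, no segfault
        *)     echo "git bisect skip" ;;   # revision could not be built or tested
    esac
}

# Session outline (run inside the pup-volume-monitor checkout):
#   git bisect start
#   git bisect bad HEAD
#   git bisect good d6a3bfd00e36e4615d0c82a69123d9fac5fbefad   # assumed good
#   ...build and install, replug the drive, check dmesg, then run the
#   command printed by bisect_verdict and repeat until git names a commit.
bisect_verdict crash
bisect_verdict ok
```

A result of "bad in every revision" from such a session usually means either the good endpoint was wrong or the tested binary was not the one just built, so it is worth verifying both before concluding.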

The segfault happens in libpupvm/pupvm-classes.c, I fixed that and another segfault happened in another place. So this requires a proper understanding of the codebase, something I don't have at the moment.

The app is unusually big with duplicate code and convoluted glib - gobject stuff that requires reading a 600-page book first.

I guess it could be like 10 times simpler.. I guess that's why I was just thinking of rewriting it.

The PVM doesn't work with 'native' glib2 .deb packages, which is really disturbing. I have some questions:

  • Does the bug occur in fossapup32?
  • Does the bug occur in slacko64 current?
  • Does the bug occur in lxpup? I need another iso with stuff I'm more comfortable with

I say this because it might not be worth fixing the bug.

@peabee
Author

peabee commented Aug 1, 2020

As far as I know, the bug only affects 64-bit systems so the answer is:

  • fossapup32: No (there is no fossapup32, but there is UPupEF, which has both Eoan and Fossa components)
  • slacko64 current: Yes
  • lxpup: No for 32-bit, yes for 64-bit

But I will check and report back.

@peabee
Author

peabee commented Aug 1, 2020

@peabee
Author

peabee commented Aug 1, 2020

slacko64-6.9.9.9 has the problem:
http://distro.ibiblio.org/puppylinux/puppy-slacko-7.0/testing/64/

@peabee
Author

peabee commented Aug 1, 2020

BionicPup64 with pvm-0.1.15 - ok, no segfault

BionicPup64 with pvm-0.1.17 - segfaults

@wdlkmpx
Contributor

wdlkmpx commented Aug 1, 2020

pup-volume-monitor-0.1.15.tar.gz

I compiled, installed and tested pvm several times and found the same segfault. It was probably executing the same binary that was hidden behind the aufs layer, I didn't reboot.
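
One way to rule out a stale binary before trusting a test run is to compare the freshly built file with the one the running process was started from. A small sketch, assuming Linux `/proc` and a POSIX shell; the function names and the example paths in the comments are illustrative only.

```shell
# Succeeds (exit status 0) when the two files have identical contents.
same_binary() {
    cmp -s "$1" "$2"
}

# Print the on-disk path of the executable behind a running PID.
running_exe() {
    readlink "/proc/$1/exe"
}

# Example use (paths and process name are assumptions):
#   pid=$(pidof -s pup-volume-monitor)
#   same_binary /root/out/usr/bin/pup-volume-monitor "$(running_exe "$pid")" \
#       || echo "running binary differs from the fresh build"
```

On a layered filesystem the path reported under `/proc` may resolve to a lower layer than expected, which is exactly the situation this check is meant to expose.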

It probably has to do with the "updates" (GTask vs GSimpleAsyncResult), but in my tests everything failed. But why does it only happen in 64 bit builds?

Learning GIO, GObject and so on will take a month at the very least. It took me half a year to understand things to be able to edit gftp and this presents a similar challenge. But I think it's easier to rewrite it without udev using basic glib stuff, and busybox as the library to get info about partitions.

There are scripts and sample c files (hotplug2stdout.c) that provide some insight on how to do things without udev.

@peabee
Author

peabee commented Aug 2, 2020

Thanks for 0.1.15
Tried to build in fossapup64 and got:

root# ./setup-build-system.sh
OK
root# ./configure --prefix=/usr --sysconfdir=/etc
OK
root# make DESTDIR=/root/out install
Making install in libpupvm
make[1]: Entering directory '/mnt/sda3/lxde64/pup-volume-monitor64/pvm-0.1.15/pup-volume-monitor-0.1.15/libpupvm'
make[1]: *** No rule to make target 'install'.  Stop.
make[1]: Leaving directory '/mnt/sda3/lxde64/pup-volume-monitor64/pvm-0.1.15/pup-volume-monitor-0.1.15/libpupvm'
make: *** [Makefile:404: install-recursive] Error 1

@wdlkmpx
Contributor

wdlkmpx commented Aug 2, 2020

Hmmm it was not the correct revision, the old sources fail to pass the autoreconf stuff.

I did test everything, assuming a different binary was being executed. This is what I'll do:

  • run git bisect again and perform installs and tests
  • undo some updates or changes to see if anything happens
  • remove duplicated code and update stuff
  • try to fix the bug

That will result in pvm 0.2.0 or 0.2.0b

It may take a week or a month or maybe 2 months, patience is the key as I'm updating other projects as well. Currently the main problem is the weather, it's too cold, even my thoughts are freezing

@peabee
Author

peabee commented Aug 2, 2020

No worry - your efforts are appreciated - we've had the bug since 2018 at least so a few more months will soon pass ;-)
When it's cold in one part of the world it's hot in the other part - well not exactly hot here but nicely warm in the usual mixed British summer :-))

@wdlkmpx
Contributor

wdlkmpx commented Aug 15, 2020

No matter how many changes I undo, I always get the same crash at the same place

Thread 3 (Thread 0x7ffff6967700 (LWP 6844)):
#0  pup_device_clear_data (dev=0x0) at pupvm-classes.c:465
#1  0x000000000040596a in pup_drive_process_event (monitor=0x418590, dev=0x7ffff0005190, process_change=1) at monitor.c:275
#2  0x0000000000405b98 in pup_server_monitor_probe_thread_func (monitor=0x418590, dev=0x7ffff0005190) at monitor.c:318
#3  pup_server_monitor_probe_thread_func (dev=0x7ffff0005190, monitor=0x418590) at monitor.c:292

Compiling and testing with ScPup64_20.06, I see more gcc warnings, invalid casts and stuff. To be able to compile pvm, the devx and kernel headers are required.

gdb requires libpython 3.8, but it's not available on slackware, salix provides python 3.5. I installed python3.8 from fedora 30
https://download-ib01.fedoraproject.org/pub/fedora/linux/updates/testing/30/Everything/x86_64/Packages/p/python38-3.8.3-1.fc30.x86_64.rpm

So it will take some time.

@peabee
Author

peabee commented Aug 16, 2020

Thank you for your efforts - it sounds like it must be a very obscure problem given that it is specific to 64-bit systems. I have reverted to 0.1.15 for 64-bit while still using 0.1.17 for 32-bit but I doubt that end-users will be able to detect any difference.....

@wdlkmpx
Contributor

wdlkmpx commented Aug 16, 2020

I edited the pvm because I was seeing a weird behavior sometimes, and it crashed once.

I know there were some glib critical errors in xerrs.log. But other than that it was working fine.

The pvm requires a deep understanding of the gobject system, it creates classes, signals, events, threads, hash tables, etc. The whole glib package.

Probably a race condition happens or maybe something else, 'cause I added debug strings that don't seem to be printed for some reason.

I compiled pvm in lucid pup and it doesn't show the drives, the fact that it doesn't support the debian patched glib makes it less appealing to me.

But I'll be investigating how to fix this, it's the only volume monitor I use and maybe I'll find a way to fix the bug in the coming weeks or months.

@wdlkmpx
Contributor

wdlkmpx commented Aug 17, 2020

I see the app was created with an Integrated Development Environment, which makes it twice as hard to understand. Some gtk apps create GTK_TYPE_* derived classes, I mostly don't understand those apps.

I guess the IDE created the classes and stuff and automatically generated headers or something, I only use geany.

So I need to reorganize the code a bit to be able to understand it, and more importantly I have to read the GObject reference manual thoroughly.

The thing is, this is 100% gnome DE code, it needs to be simplified a bit
wdlkmpx@2487c74

The only required part is the GIO module, the pvm could be a very simple app that is triggered by a udev rule and sends the info to the GIO module.

When an app is called by udev it also receives info about the partition through environment variables, so there's no need to use libblkid.
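
As a rough illustration of that idea, a handler started by a udev rule can read what it needs straight from its environment. The sketch below assumes the usual udev property names (`ACTION`, `DEVNAME`, `ID_FS_TYPE`, `ID_FS_LABEL`), that the rule runs after udev's blkid import has populated the `ID_FS_*` values, and a made-up handler path and function name; how the result would be passed on to the GIO module is left out.

```shell
# Hypothetical rule that would start the handler (path is an assumption):
#   ACTION=="add|remove", SUBSYSTEM=="block", RUN+="/usr/local/bin/pvm-notify"

# Summarise a block-device event from the variables udev exports.
# Missing properties fall back to placeholder values.
describe_block_event() {
    echo "${ACTION:-unknown} ${DEVNAME:-?} fstype=${ID_FS_TYPE:-none} label=${ID_FS_LABEL:-none}"
}

# Simulate what udev would export for a freshly plugged FAT partition.
ACTION=add DEVNAME=/dev/sdc1 ID_FS_TYPE=vfat ID_FS_LABEL=USB describe_block_event
```

Programs started through `RUN+=` are expected to finish quickly, so a real handler would likely just forward the event and exit.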

Overall I know how to reimplement everything in a different way, in a very simple way, I've already done that with pup_event / probedisk / probepart. But this is a technical challenge I hope to overcome one day.

@peabee
Author

peabee commented Nov 9, 2020

@peabee
Author

peabee commented Nov 10, 2020

Clarification:
pup-volume-monitor-0.1.17-amd64.pet.gz from https://skamilinux.hu/phpBB3/viewtopic.php?p=9107#p9107
segfaults
but
http://distro.ibiblio.org/puppylinux/pet_packages-fossa64/pup-volume-monitor-0.1.17-fossapup64.pet
doesn't......
Differences noted between the two are:

  1. the skamilinux executables are much smaller than the fossa64 (they are stripped, fossa64 are not stripped) and they are pie executables whereas the fossa64 are just plain executables
  2. the fossa64 pet has a udev rule that the skamilinux doesn't have

@wdlkmpx
Contributor

wdlkmpx commented Jan 14, 2021 via email

@peabee
Author

peabee commented Jan 14, 2021

Hi @wdlkmpx
Thanks for the update - don't worry about time, the workarounds seem to be doing the job pro-tem.
Bit worried about your "issues" with LxPupSc.....

  • don't even know what gmrun does! LXDE has quite a nice run program in the menu.....
  • geany - happy to use other versions in the build if they have improvements and are available.
  • no Qt5 either nor Python or lots of other "large" dependencies - not sure I understand what lack of gtk3 has to do with downloading though?? I have no problems with downloads.....
  • will have a look at "7z" when I have a moment.

Thanks
PeeBee

@peabee
Author

peabee commented Jan 17, 2021

Hi @wdlkmpx

p7zip is a .pet from common64 uploaded 20-Aug-2019:
:p7zip:|pet|Packages-puppy-common64-official|p7zip-16.02-x86_64_rev2|p7zip|16.02-x86_64_rev2||BuildingBlock|4764||p7zip-16.02-x86_64_rev2.pet||BuildingBlock|slackware64|14.0||
If a change is needed it needs to be done to that .pet....
The links are all to /usr/libexec/7z so should work unless something expects them not to be links....

geany is also a .pet from Slacko64-14.2 - version 1.35 uploaded 25-Nov-2020 so pretty recent....
:geany:|pet|Packages-puppy-slacko6414.2-official|geany-1.35-x86_64_s702|geany|1.35-x86_64_s702||Document|6484||geany-1.35-x86_64_s702.pet|+gtk+2|Light weight powerful gtk text editor and IDE|slackware64|14.2||

@wdlkmpx
Contributor

wdlkmpx commented Jan 17, 2021 via email

@01micko
Owner

01micko commented Jan 18, 2021

IIRC @ninaholic reported that p7zip was broken in slackoes so I recompiled it. Worked a treat. Some 'common' stuff is not suitable for all. It has to be really generic and probably built statically, which brings in a bloat problem if using gcc. @peabee I mentioned this to you before.

@wdlkmpx
Contributor

wdlkmpx commented Jan 18, 2021 via email

@wdlkmpx
Contributor

wdlkmpx commented Jan 31, 2021

Some further tests revealed something that was obvious from the beginning, a NULL check leads to a wrong code path, and it doesn't make sense at all... how that is even possible. I'm going crazy, this should not be happening.

pup_server_monitor_udev_event, subsystem: block
pup_server_monitor_probe_thread_func: sdc
pup_drive_process_event, process: 1
dev: 0x7ffff0004fc0
drive: (nil)
NULL CHECK: drive is not NULL, nil, 0
ERROR: drive: (nil)
ERROR: drive: 0
ULTRA ERROR: drive: (nil)

Thread 3 "pool-/usr/bin/p" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff6967700 (LWP 13033)]
pup_device_clear_data (dev=0x0) at pupvm-classes.c:465
465	pupvm-classes.c: No such file or directory.

Both if (drive) and if (!drive) evaluate to TRUE. This is disconcerting, I've never seen something like this before. I'm questioning my sanity.

I'm still using lxpup sc 20.06. Something is wrong, but I must go out somewhere to download a new iso, and it better have gtk3 (locales and docs are ignored), and probably python3 in the devx.

I should test with an older kernel, but it's somewhere in the world wide web.

I guess gcc is miscompiling stuff or something, this is crazy.

@peabee
Author

peabee commented Jan 31, 2021

@wdlkmpx
Contributor

wdlkmpx commented Jan 31, 2021 via email

@wdlkmpx
Contributor

wdlkmpx commented Jan 31, 2021 via email

@wdlkmpx
Contributor

wdlkmpx commented Feb 11, 2021 via email

@peabee
Author

peabee commented Feb 11, 2021

Excellent @wdlkmpx - many thanks - looking good....
Compiled and .pet made (tweaks before making .pet = changing /usr/lib to /usr/lib64; linking libpupvm.so to /usr/lib; removing .a and .la)
http://smokey01.com/peebee/downloads/test/pup-volume-monitor-0.1.17-110221-x86_64.pet
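
The packaging tweaks listed above could be scripted roughly as follows. This is only a sketch under assumptions: `tweak_pet_tree` is a hypothetical name, the argument is the staging directory produced by a `make DESTDIR=... install`, and "linking libpupvm.so to /usr/lib" is read here as leaving relative symlinks in `usr/lib` that point at the moved files, which may not be exactly what was done.

```shell
# Apply the 64-bit packaging tweaks to a staging tree before building the .pet.
tweak_pet_tree() {
    root=$1
    # Move the libraries from usr/lib to usr/lib64.
    mv "$root/usr/lib" "$root/usr/lib64" || return 1
    # Remove static archives and libtool files, which are not needed at runtime.
    find "$root/usr/lib64" -type f \( -name '*.a' -o -name '*.la' \) -exec rm -f {} +
    # Assumed interpretation: keep libpupvm reachable under usr/lib via symlinks.
    mkdir -p "$root/usr/lib"
    for lib in "$root"/usr/lib64/libpupvm.so*; do
        if [ -e "$lib" ]; then
            ln -s "../lib64/$(basename "$lib")" "$root/usr/lib/$(basename "$lib")"
        fi
    done
    return 0
}

# Example (staging path is an assumption):
#   tweak_pet_tree /root/out
```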

Tested on: ScPup64-21.01, LxPupSc64-21.01; BionicPup64 and FossaPup64 - all seem OK.

Should it remain as a 0.1.17 variant or be 0.1.18?
Can you push the changes to the main tree?

@peabee
Author

peabee commented Feb 11, 2021

@peabee
Author

peabee commented Feb 13, 2021

0.2w pets (32 & 64-bit) available via forum. Many thanks @wdlkmpx

peabee closed this as completed Feb 13, 2021

@wdlkmpx
Contributor

wdlkmpx commented Feb 13, 2021 via email

@wdlkmpx
Contributor

wdlkmpx commented Feb 13, 2021 via email

@peabee
Author

peabee commented Feb 14, 2021

:-))
0.2w pets now Tested AOK on:
ScPup32, Slacko32-7.0, BionicPup32, UPupEF, UPupFF+D, UPupGG+D, UPupHH+D
ScPup64, Slacko64-7.0, BionicPup64, FossaPup64

@wdlkmpx
Contributor

wdlkmpx commented Feb 15, 2021 via email

@peabee
Author

peabee commented Feb 15, 2021

Well as an example, UPupHH+D uses:

:glib:|compat|Packages-ubuntu-hirsute-main|libglib2.0-0_2.66.4-1|libglib2.0-0|2.66.4-1||BuildingBlock|4500K|pool/main/g/glib2.0|libglib2.0-0_2.66.4-1_i386.deb|+libc6&ge2.32,+libffi8ubuntu1&ge3.4,+libmount1&ge2.35.2-7,+libpcre3,+libselinux1&ge3.1,+zlib1g&ge1.2.2|GLib library of C routines|ubuntu|hirsute||
:glib:|compat|Packages-ubuntu-hirsute-main|libglib2.0-bin_2.66.4-1|libglib2.0-bin|2.66.4-1||BuildingBlock|326K|pool/main/g/glib2.0|libglib2.0-bin_2.66.4-1_i386.deb|+libglib2.0-data,+libc6&ge2.4,+libelf1&ge0.142,+libglib2.0-0&eq2.66.4-1|Programs for the GLib library|ubuntu|hirsute||
