Calamares crashes when cycling through partition modes #1061

Closed
highvoltage opened this issue Dec 18, 2018 · 20 comments

@highvoltage
Contributor

commented Dec 18, 2018

I tested this on both Debian weekly live images and on the latest Manjaro Xfce image, to rule out a distribution-specific issue.

On 64-bit Calamares (untested on 32-bit variants), Calamares crashes after clicking through each partitioning option for 3-5 cycles.

To Reproduce
Steps to reproduce the behavior:

  1. Install Debian or Manjaro
  2. Reboot back to Calamares
  3. On the partitioning screen, click on each partition type. Usually after around 3 cycles of this, Calamares quits, printing "Segmentation Fault" to the terminal.

Expected behavior
No Segfaults ever! ♥

Screenshots and Logs
Attached is a gif where it happens after just one cycle in Manjaro.
[Attached GIF: peek 2018-12-18 14-18]

@abucodonosor

Contributor

commented Jan 2, 2019

@stikonas @adriaandegroot @cjlcarvalho

The crash seems to be caused by LVM* code ..

It crashes when at least one partition has an FS on it.
It does not crash when you have an unpartitioned disk, or a partitioned one without an FS.

From the backtrace I see the following:


Thread 1 "calamares" received signal SIGSEGV, Segmentation fault.
0x00007fffcc027640 in ?? ()
(gdb) bt full
#0  0x00007fffcc027640 in ?? ()
No symbol table info available.
#1  0x00007fffd1de63bc in PartitionPage::updateButtons (this=0x7fffcc065aa0) at /var/tmp/fst/src/3.2.2-187/src/modules/partition/gui/PartitionPage.cpp:169
        deviceIndex = {r = 0, c = 0, i = 0, m = 0x555555ad5820}
        device = 0x7fffcc062da0
        create = false
        createTable = false
        edit = false
        del = false
        currentDeviceIsVG = false
        isDeactivable = false
        isRemovable = false
        isVGdeactivated = false
        index = {r = -1, c = -1, i = 0, m = 0x0}
...

L169 is here:

if ( device->type() != Device::Type::LVM_Device )
..

So for != LVM we set createTable = true, but the backtrace indicates it is crashing with createTable = false,
which as far as I can see is only set for == LVM or RAID?

As a quick test I reverted the code in that function() to what it was before the LVM* merge, and it does not
crash any more here.

@cjlcarvalho

Contributor

commented Jan 3, 2019

These boolean variables are used only for the activation of the buttons.

The problem in this case probably is related to the device reference, which needs an assertion to check if it is null. The use of Q_ASSERT(device) just before the device->type() != Device::Type::LVM_Device condition can help us to see if this is the cause of this problem.
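
To illustrate the point above, here is a minimal sketch of the shape of such a function. `Device`, `DiskDevice`, `LvmDevice`, `ButtonState` and `computeButtons` are hypothetical stand-ins, not the real kpmcore or Calamares types, and plain `assert` stands in for `Q_ASSERT`. It only shows that the flags are still `false` when the first (virtual) call through `device` is made, so seeing `createTable = false` in the backtrace is what one would expect if that call itself is what crashes.

```cpp
#include <cassert>

// Hypothetical stand-in for a device class with a virtual type() query.
struct Device {
    enum class Type { Disk_Device, LVM_Device };
    virtual ~Device() = default;
    virtual Type type() const = 0;
};

struct DiskDevice : Device {
    Type type() const override { return Type::Disk_Device; }
};

struct LvmDevice : Device {
    Type type() const override { return Type::LVM_Device; }
};

// Hypothetical holder for the flag that drives button activation.
struct ButtonState {
    bool createTable = false;
};

ButtonState computeButtons(const Device* device)
{
    ButtonState state;  // the flag starts out false

    // Catches a null pointer only; it cannot detect a pointer to a freed object.
    assert(device && "device must not be null");

    // First use of the pointer: a virtual call. If the object behind the
    // pointer is gone, this is where things go wrong, before any flag is set.
    if (device->type() != Device::Type::LVM_Device)
        state.createTable = true;

    return state;
}
```

With a valid `DiskDevice` the flag becomes `true`; with an `LvmDevice` it stays `false`.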

I couldn't reproduce the error with kpmcore version 3.3.0, nor with kpmcore built directly from the master branch, using a USB stick as my device.

I'll try to test it on the Calamares version from Manjaro XFCE to see if I can get it.

UPDATE: I just tested the Calamares version from Manjaro XFCE and the error occurred.

@philmmanjaro

Member

commented Jan 3, 2019

@cjlcarvalho: let us know if this issue is present on v18.0.2 ISOs of Manjaro.

@cjlcarvalho

Contributor

commented Jan 3, 2019

@philmmanjaro I tested with Manjaro XFCE v18.0.2.

@abucodonosor

Contributor

commented Jan 4, 2019

@cjlcarvalho

Q_ASSERT(device); does not trigger here.

maybe device->type() is confused at some point
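
A non-firing `Q_ASSERT(device)` does not by itself rule out the pointer as the culprit, because a pointer to an already-freed object is still non-null. The sketch below shows this without touching freed memory. `Device`, `Probe` and `probeAfterReset` are hypothetical names, and `std::shared_ptr`/`std::weak_ptr` are used purely for illustration; this is an assumption-laden demonstration, not how Calamares tracks its devices.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in; only its lifetime matters here.
struct Device {
    int type = 0;
};

struct Probe {
    bool addressIsNonNull;   // what a null check would look at
    bool objectStillAlive;   // what actually matters
};

Probe probeAfterReset()
{
    auto owner = std::make_shared<Device>();

    // The address a view might have cached, stored as an integer so that
    // inspecting it after the free is well-defined.
    const auto cachedAddress = reinterpret_cast<std::uintptr_t>(owner.get());
    std::weak_ptr<Device> watcher = owner;

    owner.reset();  // the owner frees the Device

    // The cached address is unchanged (non-zero), yet the object is gone.
    return { cachedAddress != 0, !watcher.expired() };
}
```

Here the null-style check passes while the object is already destroyed, which is the kind of situation a tool like valgrind or AddressSanitizer is better suited to catch than an assertion.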

@adriaandegroot

Contributor

commented Jan 4, 2019

So it needs more chasing. @abucodonosor @cjlcarvalho can you two continue debugging this? Please paste findings into the issue tracker, not into IRC, so they are preserved for others to follow along. I'm working on other bits and pieces right now.

@cjlcarvalho

Contributor

commented Jan 4, 2019

Yes, I will. I'm building the Calamares version from the Manjaro repositories and debugging it.

@philmmanjaro

Member

commented Jan 4, 2019

@cjlcarvalho: On Manjaro we have two branches, and therefore also our own PKGBUILDs to compile the packages:

You find our git-repo here.

@philmmanjaro

Member

commented Jan 13, 2019

@abucodonosor: what is your hackish workaround for this issue?

@philmmanjaro

Member

commented Jan 13, 2019

@cjlcarvalho: these are the differences between the upstream 3.2.3 partition module and my 3.2.2.6 version. I've now merged some more branches into calamares-git to see how the code looks there.

@abucodonosor

Contributor

commented Jan 15, 2019

@philmmanjaro

I put the LVM code in that function() under #if 0. device->type() seems to become NULL at some point, and then all these LVM/RAID checks break.

I need to retest against kpmcore master without that hack soon.

@philmmanjaro

Member

commented Jan 19, 2019

The issues I had with #1072 seem to be fixed with current master code; however, this issue is now active when cycling through the swap options.

@adriaandegroot

Contributor

commented Feb 11, 2019

Running in valgrind, after some clicking around I hit this:

17:34:17 [6]: Updating partitioning preview widgets. 
==29901== Invalid read of size 8
==29901==    at 0x1CEBED25: PartitionPage::updateButtons() (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CEC1435: PartitionPage::onPartitionModelReset() (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CEC779B: QtPrivate::FunctorCall<QtPrivate::IndexesList<>, QtPrivate::List<>, void, void (PartitionPage::*)()>::call(void (PartitionPage::*)(), PartitionPage*, void**) (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CEC763C: void QtPrivate::FunctionPointer<void (PartitionPage::*)()>::call<QtPrivate::List<>, void>(void (PartitionPage::*)(), PartitionPage*, void**) (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CEC7533: QtPrivate::QSlotObject<void (PartitionPage::*)(), QtPrivate::List<>, void>::impl(int, QtPrivate::QSlotObjectBase*, QObject*, void**, bool*) (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x7DE9111: QObject::event(QEvent*) (in /usr/lib/libQt5Core.so.5.10.1)
==29901==    by 0x59BE8BB: QWidget::event(QEvent*) (in /usr/lib/libQt5Widgets.so.5.10.1)
==29901==    by 0x597DFEB: QApplicationPrivate::notify_helper(QObject*, QEvent*) (in /usr/lib/libQt5Widgets.so.5.10.1)
==29901==    by 0x59859C5: QApplication::notify(QObject*, QEvent*) (in /usr/lib/libQt5Widgets.so.5.10.1)
==29901==    by 0x7DB7D9F: QCoreApplication::notifyInternal2(QObject*, QEvent*) (in /usr/lib/libQt5Core.so.5.10.1)
==29901==    by 0x7DBAA05: QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) (in /usr/lib/libQt5Core.so.5.10.1)
==29901==    by 0x7E14D03: ??? (in /usr/lib/libQt5Core.so.5.10.1)
==29901==  Address 0x1af15b20 is 0 bytes inside a block of size 104 free'd
==29901==    at 0x403208B: operator delete(void*, unsigned long) (vg_replace_malloc.c:585)
==29901==    by 0x1CE8E87D: QScopedPointerDeleter<Device>::cleanup(Device*) (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CE8E4EC: QScopedPointer<Device, QScopedPointerDeleter<Device> >::reset(Device*) (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CE8A89C: PartitionCoreModule::revertDevice(Device*) (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CEA0C5A: ChoicePage::applyActionChoice(ChoicePage::InstallChoice)::{lambda()#3}::operator()() const (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CEAB5E5: QtConcurrent::StoredFunctorCall0<void, ChoicePage::applyActionChoice(ChoicePage::InstallChoice)::{lambda()#3}>::runFunctor() (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CE8C274: QtConcurrent::RunFunctionTask<void>::run() (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x7BC7B21: ??? (in /usr/lib/libQt5Core.so.5.10.1)
==29901==    by 0x7BCAB4C: ??? (in /usr/lib/libQt5Core.so.5.10.1)
==29901==    by 0x987708B: start_thread (in /usr/lib/libpthread-2.26.so)
==29901==    by 0x8C02E7E: clone (in /usr/lib/libc-2.26.so)
==29901==  Block was alloc'd at
==29901==    at 0x4030DEF: operator new(unsigned long) (vg_replace_malloc.c:334)
==29901==    by 0x1DC587E9: ??? (in /usr/lib/qt/plugins/libpmlibpartedbackendplugin.so)
==29901==    by 0x1CE8A885: PartitionCoreModule::revertDevice(Device*) (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CEA0C5A: ChoicePage::applyActionChoice(ChoicePage::InstallChoice)::{lambda()#3}::operator()() const (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CEAB5E5: QtConcurrent::StoredFunctorCall0<void, ChoicePage::applyActionChoice(ChoicePage::InstallChoice)::{lambda()#3}>::runFunctor() (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x1CE8C274: QtConcurrent::RunFunctionTask<void>::run() (in /home/adridg/src/calamares/build/src/modules/partition/libcalamares_viewmodule_partition.so)
==29901==    by 0x7BC7B21: ??? (in /usr/lib/libQt5Core.so.5.10.1)
==29901==    by 0x7BCAB4C: ??? (in /usr/lib/libQt5Core.so.5.10.1)
==29901==    by 0x987708B: start_thread (in /usr/lib/libpthread-2.26.so)
==29901==    by 0x8C02E7E: clone (in /usr/lib/libc-2.26.so)
==29901== 
17:34:18 [6]: PCM::setBootLoaderInstallPath "/dev/sdb"

@adriaandegroot

Contributor

commented Feb 12, 2019

Closing this now that smooth-partition-crash has been merged. The race looks like it is gone.
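
For readers following along: the valgrind trace earlier in this thread shows `PartitionCoreModule::revertDevice()` freeing a `Device` from a `QtConcurrent` worker while `PartitionPage::updateButtons()` reads it from the event loop. The sketch below shows one generic way to avoid that class of bug, by handing out shared ownership under a lock so a reader's object cannot be freed underneath it. This is not the change merged in smooth-partition-crash; `Device`, `DeviceSlot`, `revert`, `snapshot` and `readerSurvivesRevert` are hypothetical names, and standard-library types stand in for their Qt counterparts.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in; the int plays the role of the device's state.
struct Device {
    int type = 0;
};

// Holds the current Device; readers get a shared_ptr that keeps it alive.
class DeviceSlot {
public:
    // Called by the worker: swap in a freshly created Device.
    void revert(int newType)
    {
        auto fresh = std::make_shared<Device>();
        fresh->type = newType;
        std::lock_guard<std::mutex> lock(m_mutex);
        m_device = std::move(fresh);  // old Device lives on until its last reader lets go
    }

    // Called by readers: take a reference that outlives any later revert().
    std::shared_ptr<Device> snapshot() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_device;
    }

private:
    mutable std::mutex m_mutex;
    std::shared_ptr<Device> m_device = std::make_shared<Device>();
};

// Deterministic illustration: a reader holds a snapshot across a revert.
bool readerSurvivesRevert()
{
    DeviceSlot slot;
    auto held = slot.snapshot();  // reader grabs the current device
    slot.revert(7);               // the device is replaced in the meantime

    // The held object is still valid and unchanged; the slot serves the new one.
    return held->type == 0 && slot.snapshot()->type == 7 && held != slot.snapshot();
}
```

The trade-off is that a reader may briefly work with a stale device, so a real implementation would still need to refresh the views after a revert completes.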

@highvoltage

Contributor Author

commented Feb 13, 2019

Yep, I can confirm that it's gone for me now on 3.2.4 on Debian.

@philmmanjaro

Member

commented Mar 26, 2019

It still happens on 3.2.4 in some cases if you cycle through the swap options when erasing the disk. Therefore we should take another look at the matter.

@philmmanjaro reopened this Mar 26, 2019

@abucodonosor

Contributor

commented Mar 26, 2019

I was about to request reopening it, since it is back in master anyway.

@adriaandegroot

Contributor

commented Mar 29, 2019

It's definitely in 3.2.4 (slightly less common than it was). It should have been fixed in master in fdb4311, and I can't reproduce it -- @abucodonosor I need more information.

@adriaandegroot modified the milestones: v3.2.4, v3.2.5 Mar 29, 2019

@adriaandegroot

Contributor

commented Mar 30, 2019

It doesn't trigger if there are no partitions defined on the target disk; after creating some outside of Calamares and then running through this, I do get crashes. There are several places where invalid data from the models is still being used.

@adriaandegroot modified the milestones: v3.2.5, v3.2.6 Apr 4, 2019

@adriaandegroot

Contributor

commented Apr 25, 2019

Went through another round of tightening-up-the-models. Now with a test-VM with three disks and multiple partitions on each, I couldn't trigger the problem. So closed again.
