Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial workaround for Qt5.10 instability #6301

Merged
merged 1 commit into from
Feb 19, 2018
Merged

Initial workaround for Qt5.10 instability #6301

merged 1 commit into from
Feb 19, 2018

Conversation

manisandro
Copy link
Member

After staring at valgrind a bit I noticed that QgsCollapsibleGroupBox was always involved in all those crashes when closing dialogs when compiled against Qt5.10. Gradually expanding from a valilla QGroupBox to the QgsCollapsibleGroupBox revealed that

setFocusProxy( mCollapseButton );

is somehow responsible for all those crashes when closing dialogs. I'd say for a start we can just comment it out.

@nyalldawson There is a second source of crashes which I believe affects Qt5 in general:

All QgsLayoutViewTool are constructed with QgsLayoutView as parent, and in their destructor have

QgsLayoutViewTool::~QgsLayoutViewTool()
{ 
  if ( mView )
    mView->unsetTool( this );
}

with mView being a QPointer<QgsLayoutView>. When QgsLayoutView is destroyed, all the child QgsLayoutViewTool get destroyed as well, as per standard QObject behaviour. The problem is that at the time ~QgsLayoutViewTool is called, the mView isn't yet reset to null becase, according to the QPointer documentation, starting with Qt5+:

When using QPointer on a QWidget (or a subclass of QWidget), previously the QPointer would be cleared by the QWidget destructor. Now, the QPointer is cleared by the QObject destructor (since this is when QWeakPointer objects are cleared). Any QPointers tracking a widget will NOT be cleared before the QWidget destructor destroys the children for the widget being tracked.

So this means that in ~QgsLayoutViewTool, mView->unsetTool is called on a half-destroyed mView.

@nyalldawson
Copy link
Collaborator

Wow, nice catch @manisandro!

I'm wondering - does calling

setFocusProxy( nullptr );

In the destructor avoid the hangs?

I'll rework the layout tool to avoid the second issue

@manisandro
Copy link
Member Author

Yep clearing the focus proxy in the destructor also seems to work, commit updated.

@slarosa
Copy link
Member

slarosa commented Feb 9, 2018

@manisandro not for me under macos :-(, previous workaround was working pretty fine

@manisandro
Copy link
Member Author

Oh, ok... So back to the first attempt I suppose ;)

@nyalldawson
Copy link
Collaborator

@slarosa what about deleting the focusProxy() in the destructor?

@slarosa
Copy link
Member

slarosa commented Feb 9, 2018

@nyalldawson the following does not work for me.

QgsCollapsibleGroupBoxBasic::~QgsCollapsibleGroupBoxBasic()
{
  delete focusProxy();
}

@nyalldawson
Copy link
Collaborator

Ok, next check

Remove the parent from mCollapseButton, and in the destructor call

mCollapseButton->deleteLater()

Also try with just

delete mCollapseButton;

@slarosa
Copy link
Member

slarosa commented Feb 9, 2018

I had already tried that :-)

@slarosa
Copy link
Member

slarosa commented Feb 9, 2018

wait...I tried the delete mCollapseButton not the deleteLater()....testing now

@nyalldawson
Copy link
Collaborator

And having no parent changes nothing?

@slarosa
Copy link
Member

slarosa commented Feb 10, 2018

I'm testing with a bottle of coffee, it's one at night :-)

@slarosa
Copy link
Member

slarosa commented Feb 10, 2018

And having no parent changes nothing?

Yeah, this has worked!

@nirvn
Copy link
Contributor

nirvn commented Feb 10, 2018

This is such a relief to see this being addressed. Keep up the great work.

@nyalldawson
Copy link
Collaborator

@slarosa

So no parent, and the deleteLater?

@slarosa
Copy link
Member

slarosa commented Feb 10, 2018

@nyalldawson just using no parent it works fine, it seems like the deleting is not necessary.

There is only a problem: the QToolButton schermata 2018-02-10 alle 08 42 32 is not visible:

schermata 2018-02-10 alle 08 22 15

So, under macOS:

Tested patch Crash?
delete mCollapseButton; on destructor Yes
mCollapseButton->deleteLater(); on destructor Yes
setFocusProxy( nullptr )on destructor Yes
reimplementing the closeEvent #6301 (comment) Yes
setting no parent to QToolButton No, but the button is not visible
@manisandro's hack No

@nyalldawson
Copy link
Collaborator

Damn. No parent no delete would be a leak, so it's not an option anyway.

New test:

  • Leave the parent
  • Implement a closeEvent override, and inside it call setFocusProxy( nullptr )
  • no delete

@slarosa
Copy link
Member

slarosa commented Feb 10, 2018

New test:
Leave the parent
Implement a closeEvent override, and inside it call setFocusProxy( nullptr )
no delete

the crash is there yet :-(

@manisandro
Copy link
Member Author

@slarosa Can you get a valgrind output of setFocusProxy( nullptr )? Just to check that the crash is the same?

@slarosa
Copy link
Member

slarosa commented Feb 10, 2018

@manisandro I will do, not matter if on destructor or closeEvent, right?

@slarosa
Copy link
Member

slarosa commented Feb 10, 2018

at the moment I am struggle with this valgrind issue https://bugs.kde.org/show_bug.cgi?id=382998
which should to be was fixed in 3.14 version (not available on homebrew formula yet).

@3nids
Copy link
Member

3nids commented Feb 12, 2018

also....why would main options and vector properties are freezing while raster properties not???

@manisandro
Copy link
Member Author

Judging from valgrind there are lots of reads to freed memory going on (without the workaround of disabling the focus proxy). That is undefined behaviour, what happens depends on what memory you have close to the blocks being read etc. If the memory segment is in a block not assigned to the application, the OS will terminate the application with a SIGSEGV, but if the access is still in a block assigned to the application, then the application can very well continue working, with potentially random side effects.

I'd still be interested in a valgrind trace of "setFocusProxy( nullptr )" in the destructor, since on my machine this makes valgrind happy, so I wonder if people still experiencing crashes with that fix are hitting a different issue.

In general, I'd recomment fixes for this issue to be tested with valgrind.

@3nids
Copy link
Member

3nids commented Feb 12, 2018

thanks for your explanations. I will give a try at valgrind on mac but I have no experience there.

@3nids
Copy link
Member

3nids commented Feb 12, 2018

I have it in place, but I would need some guidelines how to use. Or I could share my screen if one ants to do it directly.

@manisandro
Copy link
Member Author

Just

valgrind --log-file=valgrind.log /path/to/qgis

then trigger what you want to investigate, and if memory errors occur, the log file will contain corresponding warnings. You'll need a qgis build with debug infos to have a readable output. Note that valgrind measurably slows down the application - this is normal.

Another memory debugging tool is Address Sanitizer, which is built into recentish gcc and clang compilers. To use it, recompile the entire code with -fsanitize=address and you'll get diagnostic messages printed out to stderr when running the application.

@3nids
Copy link
Member

3nids commented Feb 12, 2018

Thanks, I have no luck with build type.

valgrind: m_debuginfo/debuginfo.c:452 (void discard_or_archive_DebugInfo(DebugInfo *)): Assertion 'is_DebugInfo_active(di)' failed.

It's a RelWithDebugInfo, I just compiled with Debug and it did not change anything.
Here is what I have in CMakeCache.txt:

//No help, variable specified on the command line.
CMAKE_BUILD_TYPE:STRING=Debug

//No help, variable specified on the command line.
CMAKE_CXX_COMPILER:STRING=/usr/local/opt/ccache/libexec/clang++

//Flags used by the compiler during all build types.
CMAKE_CXX_FLAGS:STRING=

//Flags used by the compiler during debug builds.
CMAKE_CXX_FLAGS_DEBUG:STRING=-g

//Flags used by the compiler during release builds for minimum
// size.
CMAKE_CXX_FLAGS_MINSIZEREL:STRING=-Os -DNDEBUG

//Flags used by the compiler during release builds.
CMAKE_CXX_FLAGS_RELEASE:STRING=-O3 -DNDEBUG

//Flags used by the compiler during release builds with debug info.
CMAKE_CXX_FLAGS_RELWITHDEBINFO:STRING=-O2 -g -DNDEBUG

@manisandro
Copy link
Member Author

Hmm any luck with -fsanitize=address instead of valgrind?

@nyalldawson
Copy link
Collaborator

There was an ancient attempt to get asan on the Travis builds by @m-kuhn.

I usually cherry-pick some of the commits from

https://github.com/nyalldawson/QGIS/commits/asan2

When testing with it (ignore the failed Travis clang commits)

@nirvn
Copy link
Contributor

nirvn commented Feb 17, 2018

Our release is fast approaching (Feb 23), if we can't come with a perfect solution by then, could we push something temporary targeting qt 5.10, #if QT_VERSION..., to prevent crash until we have a better solution?

@elpaso
Copy link
Contributor

elpaso commented Feb 17, 2018

+1 : I guess we don't loose much leaving out setFocusProxy

@nirvn
Copy link
Contributor

nirvn commented Feb 17, 2018

Since QGIS OSX is served through homebrew and it runs Qt 5.10, let's temporarily loose this bit to keep mac users happy.

@manisandro
Copy link
Member Author

Added corresponding #if guards. Qt5.10.1 should be ready any moment in Fedora rawhide, I'll give it a try later to see whether it changes anything.

@nirvn
Copy link
Contributor

nirvn commented Feb 19, 2018

Travis failure unrelated.

@nyalldawson , @3nids , should we merge this now?

@3nids
Copy link
Member

3nids commented Feb 19, 2018

would be great to add a link to this discussion too in the comment.
otherwise +1 to merge

@manisandro
Copy link
Member Author

FWIW, issue is also present with 5.10.1.

I've added a reference to this ticket as a comment.

@manisandro manisandro merged commit b47eb87 into qgis:master Feb 19, 2018
@manisandro manisandro deleted the qt5.10 branch February 27, 2018 14:36
@slarosa
Copy link
Member

slarosa commented May 24, 2018

Issue is also present in 5.11.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants