Upgrade to OGRE 1.10.12 to get OSX fix for Sierra #380

Merged
merged 1 commit into from
May 10, 2019


emersonknapp
Contributor

@emersonknapp emersonknapp commented Mar 1, 2019

Upgrade OGRE from 1.10.11 to 1.10.12 rather than going to 1.11.x, to avoid the risk of breaking any APIs.

From 1.10.12 release notes: "Fix crash when embedding OgreOSXCocoaWindow in external window"

@tfoote tfoote added the in review Waiting for review (Kanban column) label Mar 1, 2019
@emersonknapp emersonknapp changed the title Upgrade to OGRE 1.10.12 to get OSX fix but not break any APIs by upgr… Upgrade to OGRE 1.10.12 to get OSX fix Mar 1, 2019
@wjwwood
Member

wjwwood commented Mar 1, 2019

Which issue is this? (it currently runs on high sierra)

That's definitely been an issue in the past, but I thought it was already fixed.

Separately, it would be good to see if a newer version fixes building on Mojave (there are some lingering issues).

@emersonknapp
Contributor Author

The draft PR feature is new to me, didn't realize it would go to you! I haven't tested this on any platform but OSX yet.

I am running on Sierra (High Sierra support on our internal machines here is incomplete). All 232 packages built successfully, including rviz2, but I was seeing a consistent segfault on startup in the CocoaWindow layer of OGRE:

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libobjc.A.dylib               	0x00007fff952aa75d objc_msgSend_stret + 29
1   RenderSystem_GL.dylib         	0x000000011b19ab00 Ogre::CocoaWindow::windowMovedOrResized() + 208 (OgreOSXCocoaWindow.mm:603)
2   librviz_rendering.dylib       	0x00000001107a7333 rviz_rendering::RenderWindowImpl::resize(unsigned long, unsigned long) + 115
3   librviz_rendering.dylib       	0x00000001107bcef1 rviz_rendering::RenderWindow::exposeEvent(QExposeEvent*) + 97
4   org.qt-project.QtGui          	0x000000011117f11b QWindow::event(QEvent*) + 699
5   librviz_rendering.dylib       	0x00000001107bcd12 rviz_rendering::RenderWindow::event(QEvent*) + 434
6   org.qt-project.QtWidgets      	0x0000000110abb87d QApplicationPrivate::notify_helper(QObject*, QEvent*) + 269
7   org.qt-project.QtWidgets      	0x0000000110abcc8b QApplication::notify(QObject*, QEvent*) + 555
8   org.qt-project.QtCore         	0x0000000111852eb8 QCoreApplication::notifyInternal2(QObject*, QEvent*) + 168
9   org.qt-project.QtGui          	0x0000000111173748 QGuiApplicationPrivate::processExposeEvent(QWindowSystemInterfacePrivate::ExposeEvent*) + 296
10  org.qt-project.QtGui          	0x00000001111532f3 bool QWindowSystemInterfacePrivate::handleWindowSystemEvent<QWindowSystemInterface::SynchronousDelivery>(QWindowSystemInterfacePrivate::WindowSystemEvent*) + 99
11  org.qt-project.QtGui          	0x00000001111596e0 void QWindowSystemInterface::handleExposeEvent<QWindowSystemInterface::SynchronousDelivery>(QWindow*, QRegion const&) + 192
12  libqcocoa.dylib               	0x0000000115cd37ac 0x115cb9000 + 108460
13  libqcocoa.dylib               	0x0000000115cdaa5c 0x115cb9000 + 137820
14  com.apple.AppKit              	0x00007fff7e04df99 -[NSView _drawRect:clip:] + 2276
15  com.apple.AppKit              	0x00007fff7e09df2f -[NSView _recursiveDisplayAllDirtyWithLockFocus:visRect:] + 1753
16  com.apple.AppKit              	0x00007fff7e09e39a -[NSView _recursiveDisplayAllDirtyWithLockFocus:visRect:] + 2884
17  com.apple.AppKit              	0x00007fff7e09e39a -[NSView _recursiveDisplayAllDirtyWithLockFocus:visRect:] + 2884
18  com.apple.AppKit              	0x00007fff7e04bad2 -[NSView _recursiveDisplayRectIfNeededIgnoringOpacity:isVisibleRect:rectIsVisibleRectForView:topView:] + 837
19  com.apple.AppKit              	0x00007fff7e04b2af -[NSThemeFrame _recursiveDisplayRectIfNeededIgnoringOpacity:isVisibleRect:rectIsVisibleRectForView:topView:] + 334
20  com.apple.AppKit              	0x00007fff7e0496d8 -[NSView _displayRectIgnoringOpacity:isVisibleRect:rectIsVisibleRectForView:] + 2452
21  com.apple.AppKit              	0x00007fff7e044fca -[NSView displayIfNeeded] + 1748
22  com.apple.AppKit              	0x00007fff7e0448db -[NSWindow displayIfNeeded] + 230
23  com.apple.AppKit              	0x00007fff7e7a4cb4 ___NSWindowGetDisplayCycleObserver_block_invoke.6228 + 277
24  com.apple.AppKit              	0x00007fff7e0443b9 __37+[NSDisplayCycle currentDisplayCycle]_block_invoke + 454
25  com.apple.QuartzCore          	0x00007fff85f2fc26 CA::Transaction::run_commit_handlers(CATransactionPhase) + 46
26  com.apple.QuartzCore          	0x00007fff860398a0 CA::Context::commit_transaction(CA::Transaction*) + 160
27  com.apple.QuartzCore          	0x00007fff85f2e701 CA::Transaction::commit() + 475
28  com.apple.AppKit              	0x00007fff7e3278b1 __37+[NSDisplayCycle currentDisplayCycle]_block_invoke.31 + 323
29  com.apple.CoreFoundation      	0x00007fff8041fc57 __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__ + 23
30  com.apple.CoreFoundation      	0x00007fff8041fbc7 __CFRunLoopDoObservers + 391
31  com.apple.CoreFoundation      	0x00007fff804005f9 __CFRunLoopRun + 873
32  com.apple.CoreFoundation      	0x00007fff80400034 CFRunLoopRunSpecific + 420
33  com.apple.HIToolbox           	0x00007fff7f960ebc RunCurrentEventLoopInMode + 240
34  com.apple.HIToolbox           	0x00007fff7f960bf9 ReceiveNextEventCommon + 184
35  com.apple.HIToolbox           	0x00007fff7f960b26 _BlockUntilNextEventMatchingListInModeWithFilter + 71
36  com.apple.AppKit              	0x00007fff7def5a54 _DPSNextEvent + 1120
37  com.apple.AppKit              	0x00007fff7e6717ee -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 2796
38  com.apple.AppKit              	0x00007fff7deea3db -[NSApplication run] + 926
39  libqcocoa.dylib               	0x0000000115ce71fd 0x115cb9000 + 188925
40  org.qt-project.QtCore         	0x000000011184ea2e QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) + 398
41  org.qt-project.QtCore         	0x00000001118535b1 QCoreApplication::exec() + 369
42  rviz2                         	0x000000010faf9c3d main + 2061

With this update, rviz2 happily runs here now. I'll need to test it on Linux and Windows though before we call this good.

@wjwwood
Member

wjwwood commented Mar 1, 2019

No worries, I think draft pr only avoids notifications when using "code owners", but I'm not sure. I wasn't going to review until you changed it to a non-draft, I was just curious.

@emersonknapp emersonknapp changed the title Upgrade to OGRE 1.10.12 to get OSX fix Upgrade to OGRE 1.10.12 to get OSX fix for Sierra Mar 4, 2019
@thomas-moulard thomas-moulard added this to In progress in AWS Robotics Mar 5, 2019
@Karsten1987
Contributor

Karsten1987 commented Apr 3, 2019

I am running into the same problem with the segfault on startup. I am running Sierra as well. However, when trying to compile this patch I get a compiler error:

/Users/karsten/workspace/osrf/ros2_visualization/build/rviz_ogre_vendor/ogre-master-ca665a6-prefix/src/ogre-master-ca665a6-build/Components/Python/CMakeFiles/_OgreOverlay.dir/OgreOverlayPYTHON_wrap.cxx:8925:61: error: too few arguments to function call, expected 2, have 1
      result = (arg1)->getByName((Ogre::String const &)*arg2);
               ~~~~~~~~~~~~~~~~~                            ^
/Users/karsten/workspace/osrf/ros2_visualization/build/rviz_ogre_vendor/ogre-master-ca665a6-prefix/src/ogre-master-ca665a6/Components/Overlay/include/OgreFontManager.h:59:9: note: 'getByName' declared here
        FontPtr getByName(const String& name, const String& groupName OGRE_RESOURCE_GROUP_INIT);
        ^
1 error generated.

Does that tell you anything?


EDIT:
I somehow managed to get this to work with DOGRE_RESOURCEMANAGER_STRICT:BOOL=FALSE
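
For anyone reading along, here is a minimal sketch of why the generated wrapper fails to compile. It assumes OGRE_RESOURCE_GROUP_INIT expands to nothing in the strict configuration, which is what the "expected 2, have 1" error above suggests. DEMO_STRICT, DEMO_GROUP_INIT, wrapperStyleLookup and the group names are hypothetical stand-ins, not OGRE's actual definitions.

```cpp
#include <string>

// Hypothetical stand-in for OGRE_RESOURCEMANAGER_STRICT.
// Build with -DDEMO_STRICT=1 to see the strict behaviour.
#ifndef DEMO_STRICT
#define DEMO_STRICT 0
#endif

// Hypothetical stand-in for OGRE_RESOURCE_GROUP_INIT.
#if DEMO_STRICT
// Strict: the macro is empty, so the group argument has no default.
#define DEMO_GROUP_INIT
#else
// Non-strict: the macro supplies a default, so the group is optional.
#define DEMO_GROUP_INIT = "Autodetect"
#endif

// Same shape as the FontManager::getByName declaration quoted above.
std::string getByName(const std::string& name,
                      const std::string& groupName DEMO_GROUP_INIT) {
    return groupName + "/" + name;
}

// Mimics the one-argument call made by the generated Python wrapper.
std::string wrapperStyleLookup(const std::string& name) {
#if DEMO_STRICT
    // In strict mode the group has to be spelled out by the caller.
    return getByName(name, "General");
#else
    // This one-argument call only compiles because of the default above.
    return getByName(name);
#endif
}
```

With the default DEMO_STRICT of 0, the one-argument call compiles; with the macro empty, the same call would be rejected with a "too few arguments" error, which would match turning strict mode off making the build pass.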

@emersonknapp
Contributor Author

I remember when I did this that DOGRE_RESOURCEMANAGER_STRICT:BOOL=ON was an unhappy value, because it was used directly in a CPP source file: https://gist.githubusercontent.com/emersonknapp/75df97702d35ef941742db8bd0f84174/raw/2391ac7887abd36240fe8a9e0e564f2b43f0b65c/ros2.repos

I did some digging and found that the type of the OGRE_RESOURCEMANAGER_STRICT variable changed between 1.10.11 and 1.10.12 (in this commit: OGRECave/ogre@f13749c). I'm going to update this PR to reflect that.

@emersonknapp emersonknapp marked this pull request as ready for review April 4, 2019 00:53
@emersonknapp
Contributor Author

@thomas-moulard - please run the following CI job:

@Karsten1987
Contributor

I went ahead and triggered a build:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@Karsten1987
Contributor

@emersonknapp do you have an idea of what exactly changed within Sierra to cause this error? I was told the current CI build machines are running Sierra as well and the builds didn't break on their end; all the nightlies look good so far. Does the error occur due to an update to one of Ogre's dependencies, such as Cocoa?

Also, I don't have a Mojave machine around, but do you think you could give this a shot on Mojave to see if this works there as well?

@emersonknapp
Contributor Author

Well - the build wasn't failing on Sierra, it was only the runtime error. I'm not familiar with the automated testing being done on rviz, does the frontend get launched on those Sierra build machines?

I'm not sure if the problem has always existed on Sierra or if it's new - it was my first attempt to use rviz2 on OSX. My MacBook at home may be running Mojave; I can check tonight and give it a try if so.

@wjwwood
Member

wjwwood commented Apr 4, 2019

There are fairly extensive tests for rviz in ROS 2, but I don't know if the gui is launched by default. That's why I asked @Karsten1987 in our triage meeting if he had tried running it on one of the CI machines to see if the problem is there too.

I know for a fact that only a few months ago my old laptop running High Sierra worked, though maybe that's not helpful.

@emersonknapp
Contributor Author

Yeah - from my understanding it has worked consistently on High Sierra.

I'm not familiar with Cocoa at all - but for even more context, this is where the crash was fixed: OGRECave/ogre@8b6fe05#diff-67b8d7c5b917628f037cdff0a91fa136R607. Maybe it has something to do with subtle differences in the Cocoa APIs between the platforms?

@emersonknapp emersonknapp moved this from In progress to Needs review in AWS Robotics Apr 4, 2019
@Karsten1987
Contributor

I just tried to launch RViz2 on one of the Sierra CI machines and indeed it's segfaulting as well. It looks like the tests do not cover fully opening RViz.

When opening RViz, it also complains about not finding the rviz_rendering_tests folder. So maybe the tests are not set up correctly to run on CI?

@wjwwood How do you recommend we proceed?

@wjwwood
Member

wjwwood commented Apr 13, 2019

It seems necessary then. So I guess move forward with this pr but manually test that it fixes the issue on the CI machine and on your personal machines.

@Karsten1987
Contributor

@emersonknapp any chance you could test this on either a High Sierra or Mojave machine? I tried to install this patch on a fresh High Sierra VM but it still segfaults. I am not sure whether I just missed something on this VM, though, and would like to have a second opinion on it.

@emersonknapp
Contributor Author

Will try - I have found a potential machine to use. Hope to have results by the end of the week.

…ading to 1.11

Signed-off-by: Emerson Knapp <eknapp@amazon.com>
@emersonknapp
Contributor Author

emersonknapp commented May 9, 2019

I have built rviz2 on a High Sierra MacBook Pro with the latest master plus this patch and it runs successfully. @wjwwood Is this sufficient data to decide to merge?

My force-push below is just a rebase to latest.

@wjwwood
Member

wjwwood commented May 9, 2019

This pr is in @Karsten1987's hands, I don't have time to follow up on it.

@Karsten1987
Contributor

Karsten1987 commented May 10, 2019

I've triggered a new CI for it.

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Hopefully, I can log into one of the OSX CI machines and verify that it works.

@Karsten1987 Karsten1987 merged commit ea1a390 into ros2:ros2 May 10, 2019
AWS Robotics automation moved this from Needs review to Done May 10, 2019
@Karsten1987 Karsten1987 removed the in review Waiting for review (Kanban column) label May 10, 2019
@emersonknapp
Contributor Author

Woohoo, glad we finally got this in!

@emersonknapp emersonknapp deleted the ogre-1.10.12 branch May 10, 2019 17:54
@j-rivero

On Mojave I found startup problems (segfault) when launching rviz2 using the latest Crystal p4 ros2.repos. I can confirm that using the latest ros2.repos (which includes this fix) has fixed the problem for me and now rviz2 appears without a problem.

@emersonknapp
Copy link
Contributor Author

@j-rivero Thanks for confirming on Mojave! Good to know this improved things across the board.


5 participants