Exiting Cadence software "RTL Compiler" causes vncserver/vncviewer to crash #368

Open
jolmichr opened this Issue Oct 12, 2016 · 22 comments

@jolmichr
jolmichr commented Oct 12, 2016 edited

Originally saw this on TigerVNC 1.6. We now have 1.7 installed and the problem remains. Other Cadence tools do not cause this crash, only RTL Compiler. If I run the tool in non-GUI mode, there is no issue. But if I do not specify "-nogui" when running rc, the crash occurs every time. I will post the vnc log -- it may as well be in a different language to me. Hoping someone can shed some light on the log content and/or propose settings I might try to resolve the crash.
vnc_log.txt (https://github.com/TigerVNC/tigervnc/files/525639/vnc_log.txt)

@bphinz
Contributor
bphinz commented Oct 12, 2016

Can you tell me what version of RTL Compiler you are using? I have access
to older versions and might be able to get temp keys to debug.

Thanks,
-brian


@jolmichr

Hi Brian,
I've tested across many versions. With our setup the problem persists across these RTL Compiler versions:
v12.10-s012_1
v14.10-p008_1
v14.20-s064_1
v14.20-s016_1

I also have an open ticket with Cadence on this. According to them, the problem can be replicated in TigerVNC 1.6, but they say I should not see it in versions 1.5 or 1.7. I've tested on 1.7 too, but I see no difference. If you like, I can share the same testcase I passed to them -- it's pretty small.

Thanks -joel

@CendioOssman
Member

It might have nothing to do with TigerVNC code, but rather with which version of Xorg was used. Check with Cadence which Xorg they used compared to your builds.

@bphinz bphinz was assigned by CendioOssman Oct 17, 2016
@jolmichr

Thanks for the suggestion. Unfortunately, they are not using TigerVNC and have not reproduced my issue. The Cadence AE I am working with was able to find previous service requests where their customers complained about the same issue. Nonetheless, this is their setup:

X Window System Version 7.1.1
Release Date: 12 May 2006
X Protocol Version 11, Revision 0, Release 7.1.1
Build Operating System: Linux 2.6.18-348.4.1.el5 x86_64 Red Hat, Inc.
Current Operating System: Linux vlno-ankurgup 2.6.18-371.el5 #1 SMP Thu Sep 5 21:21:44 EDT 2013 x86_64
Build Date: 29 May 2013

Build ID: xorg-x11-server 1.1.1-48.101.el5

We are using

X.Org X Server 1.17.4
Release Date: 2015-10-28

@bphinz
Contributor
bphinz commented Oct 19, 2016

Joel - Can you go ahead and send me the testcase? I'll see if I can get some temp keys.

@jolmichr

Attaching the testcase. There is a README in there with instructions but this should do it:
testcase_for_TigerVNC.tar.gz

gtar xfvz testcase_for_TigerVNC.tar.gz
cd LAB1
rc -vdi -f run.tcl

Thanks -joel

@ercanal
ercanal commented Nov 16, 2016

FYI
I posted this problem to the user forum on 27 June 2016, along with a backtrace taken with the debug packages installed.

Copying the backtrace from the user forum:

Program received signal SIGSEGV, Segmentation fault.
DamageUnregister (pDamage=0x0) at damage.c:1762

(gdb) bt
#0 DamageUnregister (pDamage=0x0) at damage.c:1762
#1 0x000000000054c7a1 in compSetParentPixmap (pWin=0xe6cf20) at compalloc.c:641
#2 0x000000000054d0d1 in compFreeClientWindow (pWin=0xe6cf20, id=) at compalloc.c:286
#3 0x0000000000548039 in FreeCompositeClientWindow (value=, ccwid=) at compext.c:85
#4 0x00000000006b3863 in doFreeResource (res=0xe6d0c0, skip=0) at resource.c:895
#5 0x00000000006b43a0 in FreeResource (id=246, skipDeleteFuncType=0) at resource.c:925
#6 0x000000000054cb73 in compUnredirectWindow (pClient=0x9f08c0, pWin=, update=0) at compalloc.c:331
#7 0x00000000005492e4 in compCheckBackingStore (pWin=0xe6cf20, mask=) at compinit.c:123
#8 compChangeWindowAttributes (pWin=0xe6cf20, mask=) at compinit.c:144
#9 0x000000000054b013 in compDestroyWindow (pWin=0xe6cf20) at compwindow.c:660
#10 0x00000000004fc9ad in damageDestroyWindow (pWindow=0xe6cf20) at damage.c:1559
#11 0x00000000004acfba in DbeDestroyWindow (pWin=0xe6cf20) at dbe.c:1325
#12 0x00000000004f7e06 in present_destroy_window (window=0xe6cf20) at present_screen.c:122
#13 0x00000000006bfaa4 in FreeWindowResources (pWin=0xe6cf20) at window.c:910
#14 0x00000000006bfb77 in CrushTree (value=0xe69630, wid=4194347) at window.c:943
#15 DeleteWindow (value=0xe69630, wid=4194347) at window.c:970
#16 0x00000000006b3863 in doFreeResource (res=0xe69610, skip=0) at resource.c:895
#17 0x00000000006b43a0 in FreeResource (id=4194347, skipDeleteFuncType=0) at resource.c:925
#18 0x000000000068b40f in ProcDestroyWindow (client=0xe637d0) at dispatch.c:716
#19 0x000000000068f76e in Dispatch () at dispatch.c:430
#20 0x000000000069315a in dix_main (argc=, argv=0x7fffffffd6a8, envp=) at main.c:298
#21 0x000000344601ed5d in __libc_start_main () from /lib64/libc.so.6
#22 0x000000000046a529 in _start ()

@CendioOssman
Member

At first glance this looks like a bug in Xorg rather than TigerVNC. Most likely in the Damage code.

@bphinz
Contributor
bphinz commented Nov 16, 2016
@jolmichr

Sorry for the delay. I have new findings on this issue. All of the machines I typically use run Red Hat. We do have one running Ubuntu, and that one does not show the problem. We have not traced the problem to its source, but I can provide this info:

Ubuntu:

(trusty-vnc-server)acook@nhblade12:~$ lsb_release -a
LSB Version: core-2.0-amd64:core-2.0-noarch:core-3.0-amd64:core-3.0-noarch:core-3.1-amd64:core-3.1-noarch:core-3.2-amd64:core-3.2-noarch:core-4.0-amd64:core-4.0-noarch:core-4.1-amd64:core-4.1-noarch:cxx-3.0-amd64:cxx-3.0-noarch:cxx-3.1-amd64:cxx-3.1-noarch:cxx-3.2-amd64:cxx-3.2-noarch:cxx-4.0-amd64:cxx-4.0-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-3.1-amd64:desktop-3.1-noarch:desktop-3.2-amd64:desktop-3.2-noarch:desktop-4.0-amd64:desktop-4.0-noarch:desktop-4.1-amd64:desktop-4.1-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.0-amd64:graphics-3.0-noarch:graphics-3.1-amd64:graphics-3.1-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch:graphics-4.1-amd64:graphics-4.1-noarch:languages-3.2-amd64:languages-3.2-noarch:languages-4.0-amd64:languages-4.0-noarch:languages-4.1-amd64:languages-4.1-noarch:multimedia-3.2-amd64:multimedia-3.2-noarch:multimedia-4.0-amd64:multimedia-4.0-noarch:multimedia-4.1-amd64:multimedia-4.1-noarch:printing-3.2-amd64:printing-3.2-noarch:printing-4.0-amd64:printing-4.0-noarch:printing-4.1-amd64:printing-4.1-noarch:qt4-3.1-amd64:qt4-3.1-noarch:security-4.0-amd64:security-4.0-noarch:security-4.1-amd64:security-4.1-noarch
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty

RHEL:

LSB Version: :base-4.0-amd64:base-4.0-ia32:base-4.0-noarch:core-4.0-amd64:core-4.0-ia32:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-ia32:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-ia32:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 6.8 (Santiago)
Release: 6.8
Codename: Santiago

The problem also occurs in GENUS -- Cadence should be aware of that, as it is documented in SR 46025268.

Thanks -joel

@bphinz
Contributor
bphinz commented Nov 17, 2016

Confirmed that I can reproduce the same thing on a fully patched CentOS 6.8 box with TigerVNC 1.6.80, and that the issue does NOT occur on the same box with TigerVNC 1.4.1. All of the relevant linked libraries are the same between the two versions of Xvnc, so I think Pierre's suggestion is probably the right place to start looking. Tomorrow I'll take a look at what has changed between the two versions in both xorg-x11-server-source and our RPM.
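
One way to make the linked-library comparison mentioned above is to diff the ldd output of the two Xvnc binaries after stripping the load addresses, which can differ between runs. This is only a sketch, not necessarily the procedure used in this thread; the helper name strip_addrs and both binary paths are hypothetical.

```shell
#!/bin/sh
# Sketch: normalise ldd output so two Xvnc builds can be compared.
# strip_addrs is a hypothetical helper, not part of TigerVNC.

strip_addrs() {
    # Reads ldd output on stdin, drops leading whitespace and the
    # trailing "(0x...)" load address, then sorts the lines.
    sed -e 's/^[[:space:]]*//' -e 's/ (0x[0-9a-f]*)$//' | sort
}

# Example usage (both paths are placeholders for the two installs):
#   ldd /path/to/tigervnc-1.4.1/Xvnc  | strip_addrs > old.txt
#   ldd /path/to/tigervnc-1.6.80/Xvnc | strip_addrs > new.txt
#   diff old.txt new.txt && echo "same libraries"
```

An empty diff only shows that both binaries resolve to the same library files; it says nothing about code compiled into Xvnc itself, such as the bundled Xorg server sources.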

@bphinz
Contributor
bphinz commented Nov 18, 2016

I think this may be related to a recent Xorg bug. Joel, can you please confirm that starting Xvnc with the -bs option prevents the crash (it does for me)? The patched version of xorg-x11-server-source was released in May, so the 1.7.0 releases should have included the fix but the 1.6.0 would not. The build servers should be set to run yum update before every build, but I'll confirm tonight that the xorg-x11-server-source package was updated prior to the 1.7.0 builds.

@jolmichr

Awesome! That option works on my end as well. Thanks for tracking this down for us.

@bphinz
Contributor
bphinz commented Nov 18, 2016

Great. I'll keep digging and see if we can resolve the issue. Can you please update the Cadence SR with the -bs workaround in case they need to help other customers that are affected?

@jolmichr

will do.

@jolmichr

What does the -gs switch do? I don't see it in the documentation.

@jolmichr

sorry, meant -bs

@bphinz
Contributor
bphinz commented Nov 18, 2016

Disables support for backing store.
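
For anyone applying the workaround, a minimal sketch of the invocation follows. It assumes the session is started through the vncserver wrapper, which hands options it does not recognise on to Xvnc, and it assumes display :1 is free; the helper name vnc_cmd is hypothetical. -bs itself is the standard X server option for disabling backing store.

```shell
#!/bin/sh
# Sketch: start a TigerVNC session with backing store disabled.
# The display number is an assumption; use any free display.

vnc_cmd() {
    # $1 = display, e.g. ":1". Prints the command rather than running it.
    echo "vncserver $1 -bs"
}

# Print the command so it can be checked before running it by hand.
vnc_cmd ":1"
```

With -bs in effect the X server will not keep backing store for windows, so applications that request it simply redraw on expose events instead.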

@CendioOssman
Member

Nice work @bphinz . :)

@bphinz
Contributor
bphinz commented Nov 21, 2016

Thanks. I checked and the el6 builds had been using the latest xorg-x11-server-source for several months prior to the 1.7.0 release. I haven't looked at the details of what was changed to address the bug that I referenced, but obviously there is still something affecting Xvnc.

@bphinz
Contributor
bphinz commented Nov 21, 2016

This is the best source of info that I can find regarding these changes. Additionally, as far as I can tell, this is completely specific to RHEL/CentOS 6 - I can't find any trace of these changes in EL7 (despite both using xorg 1.17), nor can I find any evidence that any of the changes ever made their way into the xorg source tree. So I'm not sure what the best way to proceed here is (maybe disabling backing store at the build level?).
