eglGetDisplay crashes the node.js #112

Closed
cheery opened this Issue · 18 comments

4 participants

@cheery

Reproduce the bug here: https://gist.github.com/3968372

Some earlier discussion about it on node.js: https://groups.google.com/forum/?fromgroups=#!topic/nodejs/MGCj_y4VL6w

@popcornmix
Owner

You will need to call bcm_host_init() before using EGL.

@cheery

I found out bcm_host_init() won't fix this particular thing.

I used the same workaround in https://github.com/cheery/node-video that the other author did: I preload a library that calls eglGetDisplay and eglInitialize at load time, before node.js gets to run. This is just plain ugly, though.
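
The preload library itself is not shown in this thread. As a rough sketch of the mechanism only, assuming GCC or Clang, a shared library can run code at load time through a constructor function. The names below are hypothetical, and the EGL calls are left as a comment so the sketch builds without the EGL headers.

```c
#include <assert.h>

/* Hypothetical flag recording that the load-time hook has run. */
static int early_init_ran = 0;

/* GCC/Clang extension: functions marked as constructors run when the
   object is loaded, before the host program's main(). */
__attribute__((constructor))
static void early_init(void)
{
    assert(early_init_ran == 0); /* the hook is expected to run only once */
    /* In the workaround described above, eglGetDisplay and eglInitialize
       would be called here (assumption: the real library is not shown). */
    early_init_ran = 1;
}

/* Lets other code check whether the load-time hook has already run. */
int early_init_done(void)
{
    return early_init_ran;
}
```

Built as a shared object (for example with `gcc -shared -fPIC`) and listed in LD_PRELOAD, such a constructor runs before the preloading program's own code starts.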

@cheery

I've written a few demos on top of node-video so far using that lousy hack. The programs run okay for seconds or minutes, then they disappear from the screen but keep running. I could try putting a white dispmanx resource on the screen to see whether dispmanx or EGL goes down, but I'm pretty sure it's EGL that is going down.

Are there other programs that have had trouble with EGL?

@cheery

It seems to be something related to my program. node-openvg isn't doing black screens on my system.

@cheery

node-video killed its video context because I misunderstood a few things in the node.js API. So it was unrelated to this issue.

@cheery

Perhaps it's something in libvcos, as this works too:

LD_PRELOAD="libvcos.so" node tutorial/tutorial-2.js

I also checked the effect in gdb: with the preload, vcos_platform_init () gets to run before node.js starts.

@popcornmix
Owner

I wonder if there is a function defined in both libvcos and another library, and when it's crashing, you are getting the wrong one. The LD_PRELOAD is forcing the right one to be used.

@cheery

In that case, here's the ldd output from node and libvcos.so:

pi@raspberrypi ~/ $ ldd `which node`
    /usr/lib/arm-linux-gnueabihf/libcofi_rpi.so (0x40331000)
    librt.so.1 => /lib/arm-linux-gnueabihf/librt.so.1 (0x40076000)
    libdl.so.2 => /lib/arm-linux-gnueabihf/libdl.so.2 (0x400e8000)
    libstdc++.so.6 => /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 (0x4011d000)
    libm.so.6 => /lib/arm-linux-gnueabihf/libm.so.6 (0x401f6000)
    libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0x40267000)
    libpthread.so.0 => /lib/arm-linux-gnueabihf/libpthread.so.0 (0x40086000)
    libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x4033a000)
    /lib/ld-linux-armhf.so.3 (0x400f6000)
pi@raspberrypi ~/ $ ldd /opt/vc/lib/libvcos.so
    /usr/lib/arm-linux-gnueabihf/libcofi_rpi.so (0x40217000)
    libpthread.so.0 => /lib/arm-linux-gnueabihf/libpthread.so.0 (0x40108000)
    libdl.so.2 => /lib/arm-linux-gnueabihf/libdl.so.2 (0x40056000)
    librt.so.1 => /lib/arm-linux-gnueabihf/librt.so.1 (0x40061000)
    libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x40220000)
    /lib/ld-linux-armhf.so.3 (0x40018000)

node-egl-bagu might be worth a peek as well:

pi@raspberrypi ~/node-egl-bagu $ ldd build/Release/bagu.node
    /usr/lib/arm-linux-gnueabihf/libcofi_rpi.so (0x401dc000)
    libEGL.so => /opt/vc/lib/libEGL.so (0x40139000)
    libGLESv2.so => /opt/vc/lib/libGLESv2.so (0x40109000)
    libstdc++.so.6 => /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 (0x401e5000)
    libm.so.6 => /lib/arm-linux-gnueabihf/libm.so.6 (0x4016b000)
    libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0x402b2000)
    libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x402da000)
    libvchiq_arm.so => /opt/vc/lib/libvchiq_arm.so (0x400aa000)
    libvcos.so => /opt/vc/lib/libvcos.so (0x40126000)
    libbcm_host.so => /opt/vc/lib/libbcm_host.so (0x40407000)
    libpthread.so.0 => /lib/arm-linux-gnueabihf/libpthread.so.0 (0x40045000)
    libdl.so.2 => /lib/arm-linux-gnueabihf/libdl.so.2 (0x40420000)
    librt.so.1 => /lib/arm-linux-gnueabihf/librt.so.1 (0x4007f000)
    /lib/ld-linux-armhf.so.3 (0x400e2000)

I'll write a script that checks whether the symbol tables of these libraries collide. Before that, though, I'll see whether I can narrow this down any further.
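
A minimal sketch of the comparison step of such a check, assuming the defined symbol names of each library have already been extracted into string arrays (for example from the output of `nm -D --defined-only`). The function name and its parameters are hypothetical, and the quadratic scan is chosen for brevity rather than speed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the names in list a that also appear in list b. If first_shared is
   non-NULL, it receives the first colliding name, or NULL if there is none. */
size_t count_shared_symbols(const char *const *a, size_t na,
                            const char *const *b, size_t nb,
                            const char **first_shared)
{
    size_t shared = 0;

    assert(a != NULL || na == 0);
    assert(b != NULL || nb == 0);
    if (first_shared)
        *first_shared = NULL;

    for (size_t i = 0; i < na; i++) {
        for (size_t j = 0; j < nb; j++) {
            if (strcmp(a[i], b[j]) == 0) {
                if (first_shared && shared == 0)
                    *first_shared = a[i];
                shared++;
                break; /* count each name from a at most once */
            }
        }
    }
    return shared;
}
```

A non-zero result for two of the /opt/vc/lib libraries would point at candidate names to inspect by hand.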

@cheery

Not much success on this front. I also tried the LD_PRELOAD trick on the libraries that libvcos loads. It had no effect.

I'll try compiling a debug build of node.js and v8 and look at it with gdb. Some of my example programs don't just hang but crash because of this error.

@popcornmix
Owner

I was thinking of a collision of globals in "objdump -t /opt/vc/lib/libvcos.so" and one of the other /opt/vc/lib libraries.

@cheery

If vcos_platform_init () runs after node.js has started executing a script, calling eglGetDisplay causes node.js to hang. Otherwise everything works.

It feels oddly specific. Things break only in this one case.

Some debug output:

pi@raspberrypi ~/node-egl-bagu $ LD_PRELOAD="libvcos.so" gdb node
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/local/bin/node...done.
(gdb) break vcos_platform_init
Function "vcos_platform_init" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (vcos_platform_init) pending.
(gdb) run demo.js
Starting program: /usr/local/bin/node demo.js
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".

Breakpoint 1, vcos_platform_init ()
    at /home/dc4/raspberry/arm_linux_revert/vc4/interface/vcos/pthreads/vcos_pthreads.c:326
326     /home/dc4/raspberry/arm_linux_revert/vc4/interface/vcos/pthreads/vcos_pthreads.c: No such file or directory.
(gdb) continue
Continuing.
loading bagu
Init()
Init().exit
bagu loaded
{ demonstrate: [Function] }
attempt to initialize egl
Demonstrate()
[New Thread 0x40cff460 (LWP 2185)]
[New Thread 0x414ff460 (LWP 2186)]
[New Thread 0x41cff460 (LWP 2187)]
[New Thread 0x424ff460 (LWP 2188)]
pthread_self == 1073874208
pthread_self == 1073874208
Demonstrate().exit
egl returned, result
1
[Thread 0x424ff460 (LWP 2188) exited]
[Thread 0x41cff460 (LWP 2187) exited]
[Thread 0x414ff460 (LWP 2186) exited]
[Thread 0x40cff460 (LWP 2185) exited]
[Inferior 1 (process 2182) exited normally]
(gdb) 

pi@raspberrypi ~/node-egl-bagu $ gdb node
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/local/bin/node...done.
(gdb) break vcos_platform_init
Function "vcos_platform_init" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (vcos_platform_init) pending.
(gdb) run demo.js
Starting program: /usr/local/bin/node demo.js
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
loading bagu

Breakpoint 1, vcos_platform_init ()
    at /home/dc4/raspberry/arm_linux_revert/vc4/interface/vcos/pthreads/vcos_pthreads.c:326
326     /home/dc4/raspberry/arm_linux_revert/vc4/interface/vcos/pthreads/vcos_pthreads.c: No such file or directory.
(gdb) continue
Continuing.
Init()
Init().exit
bagu loaded
{ demonstrate: [Function] }
attempt to initialize egl
Demonstrate()
[New Thread 0x40bce460 (LWP 2201)]
[New Thread 0x413ce460 (LWP 2202)]
[New Thread 0x41bce460 (LWP 2203)]
[New Thread 0x423ce460 (LWP 2204)]
pthread_self == 1073873392
pthread_self == 1073873392
Demonstrate().exit
^C
Program received signal SIGINT, Interrupt.
0x00393204 in v8::internal::JSReceiver::Lookup(v8::internal::String*, v8::internal::LookupResult*) ()
(gdb) 

And here's demo.js once more:

console.log("loading bagu");
var bagu = require("./build/Release/bagu");
console.log("bagu loaded");
console.log(bagu);
console.log("attempt to initialize egl");
var egl = bagu.demonstrate();
console.log("egl returned, result");
console.log(egl);

@cheery

Maybe this could be figured out by debugging eglGetDisplay at a finer granularity.

@cheery

On both the working and the failing runs, eglGetDisplay executes:

eglGetDisplay(display_id=0x0) at khronos/egl/egl_client_cr.c
172        CLIENT_THREAD_STATE_T *thread = CLIENT_GET_CHECK_THREAD_STATE();
CLIENT_GET_CHECK_THREAD_STATE() at khronos/common/khrn_client.h
170        return (CLIENT_THREAD_STATE_T *)platform_tls_get_check(client_tls);
platform_tls_get_check(tls=0) at khronos/common/linux/khrn_client_platform_linux.c
118        return vcos_tls_get(tls);
vcos_tls_get(tls=0) at vcos/vcos_platform.h
625        return pthread_getspecific(tls);
eglGetDisplay(display_id=0x0) at khronos/egl/egl_client_cr.c
173        if (thread)
174           thread->error = EGL_SUCCESS;
176        return khrn_platform_set_display_id(display_id);
khrn_client_platform_linux.c
465        if (display_id == EGL_DEFAULT_DISPLAY)
466           return (EGLDisplay)1;

It doesn't appear to do anything interesting except this:
thread = (CLIENT_THREAD_STATE_T *)pthread_getspecific(0);
thread->error = EGL_SUCCESS;

Let's see who calls pthread_setspecific, shall we? :)

@cheery

This is the output when everything works fine:

vcos_platform_init () at vcos_pthreads.c
381        pst = pthread_setspecific(_vcos_thread_current_key, &vcos_thread_main);
(gdb) print _vcos_thread_current_key
$1 = 0

The output when things break down:

vcos_platform_init () at vcos_pthreads.c
381        pst = pthread_setspecific(_vcos_thread_current_key, &vcos_thread_main);
(gdb) print _vcos_thread_current_key
$1 = 4

And snap! Now we see that eglGetDisplay gets an entirely different piece of thread-specific data than it is supposed to get.

@popcornmix Apparently the 'client_tls' variable in libEGL isn't what it ought to be.
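
To illustrate why reading the wrong TLS slot is harmful, here is a small self-contained sketch. It does not use the real libEGL or libvcos code: the two keys, the variables, and the function names are hypothetical stand-ins, and `DEMO_EGL_SUCCESS` only mirrors the numeric value of `EGL_SUCCESS`. It shows that a store through a pointer fetched with the wrong key lands in data owned by someone else.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Same numeric value as EGL_SUCCESS in <EGL/egl.h>; redefined here so the
   sketch builds without the EGL headers. */
#define DEMO_EGL_SUCCESS 0x3000

/* Hypothetical stand-ins: one slot owned by the host program, one owned by
   the EGL client library. */
static pthread_key_t host_key;
static pthread_key_t client_key;
static int host_word = 1234;
static int client_error = -1;
static int keys_ready = 0;

/* Creates both keys and binds each to its own data on the calling thread.
   Returns 0 on success, -1 on failure. */
int tls_demo_setup(void)
{
    if (pthread_key_create(&host_key, NULL) != 0) return -1;
    if (pthread_key_create(&client_key, NULL) != 0) return -1;
    if (pthread_setspecific(host_key, &host_word) != 0) return -1;
    if (pthread_setspecific(client_key, &client_error) != 0) return -1;
    keys_ready = 1;
    return 0;
}

/* Mimics "thread->error = EGL_SUCCESS": fetch a pointer from a TLS slot and
   store through it. A non-zero use_host_key simulates using the wrong key. */
void store_success(int use_host_key)
{
    assert(keys_ready);
    int *error = pthread_getspecific(use_host_key ? host_key : client_key);
    if (error)
        *error = DEMO_EGL_SUCCESS;
}

int host_value(void) { return host_word; }
int client_value(void) { return client_error; }
```

After `tls_demo_setup()`, calling `store_success(0)` updates only the client's error field, while `store_success(1)` overwrites the host's word instead. Compile with `-pthread`.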

@cheery

This makes it work, but I'm not sure if it's the desired behavior: raspberrypi/userland#6
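
The linked patch is not reproduced here and may work differently. As a general illustration of one way to make sure a TLS key is never read before it has been created, the following sketch uses `pthread_once`; all names are hypothetical and none of this is taken from the userland sources.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical key standing in for a library's per-thread client state. */
static pthread_key_t demo_client_key;
static pthread_once_t demo_client_once = PTHREAD_ONCE_INIT;
static int demo_key_ok = 0;

/* Runs at most once per process, no matter how many threads race to it. */
static void demo_create_key(void)
{
    demo_key_ok = (pthread_key_create(&demo_client_key, NULL) == 0);
}

/* Returns the calling thread's state, or NULL if none has been set.
   The key is created on first use, so slot 0 is never read by accident. */
void *demo_tls_get(void)
{
    int rc = pthread_once(&demo_client_once, demo_create_key);
    assert(rc == 0);
    (void)rc;
    return demo_key_ok ? pthread_getspecific(demo_client_key) : NULL;
}

/* Binds state to the calling thread. Returns 0 on success. */
int demo_tls_set(void *state)
{
    int rc = pthread_once(&demo_client_once, demo_create_key);
    assert(rc == 0);
    (void)rc;
    return demo_key_ok ? pthread_setspecific(demo_client_key, state) : -1;
}
```

With this shape, a lookup made before any state was set returns NULL instead of whatever another component stored under a different key.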

@sdroege

I think this is the same as #99

@popcornmix popcornmix closed this