Fatal signal 11 received on Mac OS #840

Closed
bitstein opened this issue Jan 29, 2018 · 22 comments · Fixed by #1258

Comments

@bitstein

Issue and Steps to Reproduce

I am running Mac OS X 10.13.2. With the new libwally update, I was able to build c-lightning just fine this morning. However, after I got my bitcoind testnet daemon synced and started lightningd, issuing a command produced this output in the lightningd log:

lightningd(68725): Connected json input
lightningd(68725): FATAL SIGNAL 11 RECEIVED

getinfo output

getinfo hangs, no output.

@cdecker
Member

cdecker commented Jan 29, 2018

Can you run with a debugger? gdb lightningd might work, and once it breaks, type in bt to see a backtrace. Then copy-paste it here :-)

cdecker changed the title from "Fatal signal 11 received" to "Fatal signal 11 received on Mac OS" on Jan 29, 2018
@yashbhutwala
Contributor

@cdecker I'm having the same issue as @bitstein on a Mac. I tried running lightningd under a debugger, but for me both the daemon and the client hang upon any request, so I'm not getting a backtrace.

@adam-hudspith

adam-hudspith commented Jan 31, 2018

@bitstein, @cdecker, @yashbhutwala I'm getting the same problem. I managed to get this with lightning-cli help:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fff73cf041a libsystem_kernel.dylib`read + 10
  * frame #1: 0x0000000104730b1a lightningd`check_with_child(batch=0x00007ffeeb540608, p=0x00000001046be000, size=8, is_write=false) at ptr_valid.c:258
    frame #2: 0x000000010473087b lightningd`ptr_valid_batch(batch=0x00007ffeeb540608, p=0x00000001046be000, alignment=1, size=8, write=false) at ptr_valid.c:289
    frame #3: 0x000000010471cf8f lightningd`autodata_make_table(example=0x00000001047d0d70, name="json_command", nump=0x00000001047d1b20) at autodata.c:33
    frame #4: 0x00000001046caedf lightningd`get_cmdlist at jsonrpc.c:281
    frame #5: 0x00000001046cc335 lightningd`find_cmd(buffer="{ \"method\" : \"help\", \"id\" : \"lightning-cli-68421\", \"params\" : [ ] }", tok=0x00007fc51d009978) at jsonrpc.c:312
    frame #6: 0x00000001046cc0de lightningd`parse_request(jcon=0x00007fc51d000760, tok=0x00007fc51d009950) at jsonrpc.c:574
    frame #7: 0x00000001046cbb3c lightningd`read_json(conn=0x00007fc51d0084c0, jcon=0x00007fc51d000760) at jsonrpc.c:667
    frame #8: 0x0000000104726e83 lightningd`next_plan(conn=0x00007fc51d0084c0, plan=0x00007fc51d0084f0) at io.c:59
    frame #9: 0x0000000104727b6c lightningd`do_plan(conn=0x00007fc51d0084c0, plan=0x00007fc51d0084f0, idle_on_epipe=false) at io.c:387
    frame #10: 0x00000001047279c4 lightningd`io_ready(conn=0x00007fc51d0084c0, pollflags=1) at io.c:397
    frame #11: 0x0000000104729397 lightningd`io_loop(timers=0x00007fc51bc028d0, expired=0x00007ffeeb540a00) at poll.c:305
    frame #12: 0x00000001046ccc56 lightningd`main(argc=4, argv=0x00007ffeeb540a68) at lightningd.c:347
    frame #13: 0x00007fff73b9f115 libdyld.dylib`start + 1
    frame #14: 0x00007fff73b9f115 libdyld.dylib`start + 1

@yashbhutwala
Contributor

Is it possible to run a lightning node in a container but bitcoind on the host? I've tried but to no avail.

@cdecker
Member

cdecker commented Feb 1, 2018

@yashbhutwala yes, you just need to make sure that bitcoin-cli from the docker container can talk to bitcoind on the host. Something like the following should do it:

docker run -v $HOME/.bitcoin:/root/.bitcoin --net=host docker/image

That'll mount the host's .bitcoin directory in the docker container so it can read the RPC config, and it'll allow bitcoin-cli to connect to bitcoind over the loopback interface.

@yashbhutwala
Contributor

@cdecker I've already tried that, but the container is still not able to talk to the bitcoind server on my machine. With --net=host, the Docker host seems to have a different IP than my machine.


@yashbhutwala
Contributor

After some research, it turns out that with Docker on Mac the Docker host is not the same as localhost: there is no docker0 bridge on macOS.


@Sjors
Contributor

Sjors commented Feb 3, 2018

I get the same FATAL SIGNAL 11 RECEIVED error on macOS 10.13.3:

gdb --args lightningd/lightningd --network=testnet --port=9000 --log-level=debug
start
During startup program terminated with signal ?, Unknown signal.

I'm not familiar with gdb. How do I make it spit out more useful information?

@Sjors
Contributor

Sjors commented Feb 3, 2018

Note that lightningd/lightningd --help doesn't crash.

Also, if bitcoind isn't running, I get FATAL SIGNAL 6 RECEIVED and the lightningd process stays active. I then have to kill it with killall -9 lightningd, because cli/lightning-cli stop complains: Connecting to 'lightning-rpc': No such file or directory.

@Sjors
Contributor

Sjors commented Feb 3, 2018

@benharold have you encountered this error during your work on Voltage? Or do you always run c-lightning on a remote linux server?

@benharold

@Sjors I have always run on a remote linux server.

@laanwj
Contributor

laanwj commented Feb 6, 2018

I get similar backtraces, from autodata_make_table down to run_child, on FreeBSD, though surprisingly the process seems to keep running and working. It does create core files every time, though (maybe no more after #922).

While running getinfo, in lightningd:

#5  <signal handler called>
#6  run_child (infd=17, outfd=20) at ccan/ccan/ptr_valid/ptr_valid.c:190
#7  0x00000000004760fc in create_child (batch=0x7fffffffe618) at ccan/ccan/ptr_valid/ptr_valid.c:216
#8  0x0000000000475ce5 in check_with_child (batch=0x7fffffffe618, p=0x733000, size=8, is_write=false) at ccan/ccan/ptr_valid/ptr_valid.c:245
#9  0x0000000000475afb in ptr_valid_batch (batch=0x7fffffffe618, p=0x733000, alignment=1, size=8, write=false) at ccan/ccan/ptr_valid/ptr_valid.c:289
#10 0x000000000046283b in autodata_make_table (example=0x733950, name=0x4f6a6a "json_command", nump=0x735830) at ccan/ccan/autodata/autodata.c:38
#11 0x000000000040e949 in get_cmdlist () at lightningd/jsonrpc.c:286
#12 0x000000000040fdd5 in find_cmd (buffer=0x801a367a0 "{ \"method\" : \"getinfo\", \"id\" : \"lightning-cli-82041\", \"params\" : [ ] }\034", tok=0x801d408c8) at lightningd/jsonrpc.c:346
#13 0x000000000040fbbb in parse_request (jcon=0x801b7d060, tok=0x801d408a0) at lightningd/jsonrpc.c:608
#14 0x000000000040f5ef in read_json (conn=0x801d40600, jcon=0x801b7d060) at lightningd/jsonrpc.c:794
#15 0x000000000046c576 in next_plan (conn=0x801d40600, plan=0x801d40630) at ccan/ccan/io/io.c:59
#16 0x000000000046d219 in do_plan (conn=0x801d40600, plan=0x801d40630, idle_on_epipe=false) at ccan/ccan/io/io.c:387
#17 0x000000000046d084 in io_ready (conn=0x801d40600, pollflags=1) at ccan/ccan/io/io.c:397
#18 0x000000000046ea3b in io_loop (timers=0x801a20108, expired=0x7fffffffe9d0) at ccan/ccan/io/poll.c:305
#19 0x0000000000410717 in main (argc=5, argv=0x7fffffffea78) at lightningd/lightningd.c:35

While running connect, in gossipd:

#5  <signal handler called>
#6  run_child (infd=7, outfd=10) at ccan/ccan/ptr_valid/ptr_valid.c:190
#7  0x000000000043da0c in create_child (batch=0x7fffffffe7a8) at ccan/ccan/ptr_valid/ptr_valid.c:216
#8  0x000000000043d5f5 in check_with_child (batch=0x7fffffffe7a8, p=0x6dd000, size=8, is_write=false) at ccan/ccan/ptr_valid/ptr_valid.c:245
#9  0x000000000043d40b in ptr_valid_batch (batch=0x7fffffffe7a8, p=0x6dd000, alignment=1, size=8, write=false) at ccan/ccan/ptr_valid/ptr_valid.c:289
#10 0x000000000042a14b in autodata_make_table (example=0x6dd6c0, name=0x4b8e01 "type_to_string", nump=0x6decc8) at ccan/ccan/autodata/autodata.c:38
#11 0x0000000000416c01 in type_to_string_ (ctx=0x801a15120, typename=0x4b56ad "struct pubkey", u={}) at common/type_to_string.c:24
#12 0x00000000004069da in connection_out (conn=0x801a1c1e0, reach=0x801be6020) at gossipd/gossip.c:1565
#13 0x0000000000433e86 in next_plan (conn=0x801a1c1e0, plan=0x801a1c240) at ccan/ccan/io/io.c:59
#14 0x0000000000434b29 in do_plan (conn=0x801a1c1e0, plan=0x801a1c240, idle_on_epipe=false) at ccan/ccan/io/io.c:387
#15 0x0000000000434a03 in io_ready (conn=0x801a1c1e0, pollflags=4) at ccan/ccan/io/io.c:403
#16 0x000000000043634b in io_loop (timers=0x801bc60d0, expired=0x7fffffffea08) at ccan/ccan/io/poll.c:305
#17 0x0000000000402c21 in main (argc=1, argv=0x7fffffffeaa8) at gossipd/gossip.c:2029

I'll try to debug, but I couldn't figure out quickly what this code was supposed to do in the first place.

@Sjors
Contributor

Sjors commented Feb 6, 2018

Possibly thanks to #922 I'm now getting a more useful crash log:

lightningd/lightningd --network=testnet --port=10000
2018-02-06T13:20:45.309Z lightningd(71192): Creating database
2018-02-06T13:20:45.449Z lightningd(71192): Server started with public key 027085167b840886f5239e10e8082c944d7b611f329a42b375ee472afe5938f0eb, alias YELLOWWHISPER (color #027085) and lightningd v0.5.2-2016-11-21-1859-gb3534462-dirty
2018-02-06T13:21:15.098Z lightningd(71192): FATAL SIGNAL 11 RECEIVED
2018-02-06T13:21:15.099Z lightningd(71192): error getting backtrace: lightningd/lightningd (2)
2018-02-06T13:21:15.099Z lightningd(71192): error getting backtrace: failed to read executable information (-1)
...
Log dumped in crash.log
2018-02-06T13:21:15.105Z lightningd(71192): FATAL SIGNAL 10 RECEIVED
2018-02-06T13:21:15.106Z lightningd(71192): error getting backtrace: lightningd/lightningd (2)
...
Log dumped in crash.log

That FATAL SIGNAL 11 error happened when I did:

cli/lightning-cli help

It did however return the help text. Not sure why it threw that error twice.

crash.log

@Sjors
Contributor

Sjors commented Feb 6, 2018

In fact, it just keeps going. I was able to generate an address, and it detects funds sent to it.

I did get a Fatal signal 11 when I tried to connect to another node; unfortunately there was nothing in crash.log this time.

2018-02-06T13:31:47.985Z lightning_gossipd(72181): TRACE: req: type WIRE_GOSSIPCTL_PEER_ADDRHINT len 42
2018-02-06T13:31:47.985Z lightning_gossipd(72181): TRACE: req: type WIRE_GOSSIPCTL_REACH_PEER len 35
lightning_gossipd: Fatal signal 11
0x10888a8f9 ???
	???:0
0x7fff79b39f59 ???
	???:0
0x1088b20e1 ???
	???:0
0x1088b1f8b ???
	???:0
0x1088b1b64 ???
	???:0
0x1088b197a ???
	???:0
0x10889e11e ???
	???:0
0x10888ab46 ???
	???:0
0x10887a598 ???
	???:0
0x1088a8012 ???
	???:0
0x1088a8cfb ???
	???:0
0x1088a8bc2 ???
	???:0
0x1088aa526 ???
	???:0
0x108876524 ???
	???:0
2018-02-06T13:31:48.205Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x10888a91c
2018-02-06T13:31:48.205Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x7fff79b39f59
2018-02-06T13:31:48.205Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x1088b20e1
2018-02-06T13:31:48.205Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x1088b1f8b
2018-02-06T13:31:48.206Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x1088b1b64
2018-02-06T13:31:48.206Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x1088b197a
2018-02-06T13:31:48.206Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x10889e11e
2018-02-06T13:31:48.206Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x10888ab46
2018-02-06T13:31:48.206Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x10887a598
2018-02-06T13:31:48.206Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x1088a8012
2018-02-06T13:31:48.206Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x1088a8cfb
2018-02-06T13:31:48.206Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x1088a8bc2
2018-02-06T13:31:48.206Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x1088aa526
2018-02-06T13:31:48.206Z lightning_gossipd(72181): TRACE: backtrace: (null):0 ((null)) 0x108876524
2018-02-06T13:31:48.206Z lightning_gossipd(72181): STATUS_FAIL_INTERNAL_ERROR: FATAL SIGNAL 11
lightning_gossipd: Fatal signal 10
0x10888a8f9 ???
	???:0
0x7fff79b39f59 ???
	???:0
0x1088b20e1 ???
	???:0
0x1088b1f8b ???
	???:0
0x1088b1b64 ???
	???:0
0x1088b197a ???
	???:0
0x10889e197 ???
	???:0
0x10888ab46 ???
	???:0
0x10887a598 ???
	???:0
0x1088a8012 ???
	???:0
0x1088a8cfb ???
	???:0
0x1088a8bc2 ???
	???:0
0x1088aa526 ???
	???:0
0x108876524 ???
	???:0
lightningd: lightning_gossipd failed (exit status 2), exiting.

@isghe
Contributor

isghe commented Feb 8, 2018

Just to let you know that I finally succeeded in running c-lightning on macOS, creating a node "Folgore ⚡️macOS" with a channel connected to "Eternity Wall" on mainnet:

isghe (compile_macOS_isghe_debug)*$ ./lightning-cli listpeers | jq
{
  "peers": [
    {
      "id": "0271d98f7dfe66198f045690f552a9126abb0aa1585f0061854e780b7e08e6dccd",
      "connected": true,
      "netaddr": [
        "163.172.139.73:9735"
      ],
      "channels": [
        {
          "state": "CHANNELD_NORMAL",
          "owner": "lightning_channeld",
          "short_channel_id": "508273:180:1",
          "funding_txid": "55a5982344ccb0934989ffd39ec5dd3df529d4a714bf9c0023cabb9b2f4fe903",
          "msatoshi_to_us": 50000000,
          "msatoshi_total": 50000000,
          "dust_limit_satoshis": 546,
          "max_htlc_value_in_flight_msat": 18446744073709552000,
          "channel_reserve_satoshis": 0,
          "htlc_minimum_msat": 0,
          "to_self_delay": 144,
          "max_accepted_htlcs": 483
        }
      ]
    }
  ]
}

In a few days I'll run some tests, clean up the code, and create a PR :-)

@laanwj
Contributor

laanwj commented Feb 9, 2018

I've looked somewhat closer at the ptr_valid code (which I found to cause crashes in my post above) and I found the reason. Apparently, it is using a child process to probe memory addresses for validity (whether they can be read/written). A crash in the child process is thus not an error, but an expected outcome. I am confused as to why it needs to do this, though.
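The child-process probing described above can be sketched roughly as follows. This is a minimal illustration, not the actual ccan/ptr_valid code: the helper name probe_readable is made up for this example, and it forks one child per query, whereas the backtraces suggest the real code talks to a single child over a pair of file descriptors.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not the ccan/ptr_valid API: returns true if `size`
 * bytes starting at `p` can be read without faulting. */
static bool probe_readable(const void *p, size_t size)
{
	assert(size > 0);

	pid_t pid = fork();
	if (pid < 0)
		return false;	/* could not fork; report "not readable" */

	if (pid == 0) {
		/* Child: touch every byte. A bad address kills this process
		 * with SIGSEGV or SIGBUS, which is the expected "no" answer. */
		const volatile unsigned char *c = p;
		unsigned char sink = 0;
		for (size_t i = 0; i < size; i++)
			sink ^= c[i];
		(void)sink;
		_exit(0);
	}

	/* Parent: a clean exit means every byte was readable. */
	int status = 0;
	if (waitpid(pid, &status, 0) != pid)
		return false;
	return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

With this shape, a fault in the child is the signal the parent is waiting for, which would match crashes being reported while the parent keeps running.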

On Linux it will parse /proc/self/maps instead to achieve this, but that file is not available on most UNIXes, including FreeBSD and Mac OS X.
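For comparison, here is a rough sketch of the /proc/self/maps approach. It assumes the usual procfs line format ("start-end perms ..."); the name maps_readable is hypothetical, and this is not the ccan implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (Linux only): returns true if [p, p + size) lies
 * inside a single readable mapping listed in /proc/self/maps. */
static bool maps_readable(const void *p, size_t size)
{
	assert(size > 0);

	FILE *f = fopen("/proc/self/maps", "r");
	if (!f)
		return false;	/* no procfs maps file, e.g. on macOS */

	uintptr_t addr = (uintptr_t)p;
	char line[256];
	bool ok = false;

	while (fgets(line, sizeof line, f)) {
		unsigned long start = 0, end = 0;
		char perms[5];
		/* Lines look like "7f00a000-7f00b000 r--p ...". */
		int fields = sscanf(line, "%lx-%lx %4s", &start, &end, perms);

		/* Discard the tail of a line longer than the buffer. */
		if (!strchr(line, '\n')) {
			int ch;
			while ((ch = fgetc(f)) != EOF && ch != '\n')
				;
		}
		if (fields != 3 || addr < start || addr >= end)
			continue;

		/* Found the mapping containing p: check permission and length. */
		ok = perms[0] == 'r' && size <= end - addr;
		break;
	}
	fclose(f);
	return ok;
}
```

No process ever faults here, so nothing resembling a crash is produced; on a system without that file the sketch simply returns false.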

Edit: autodata_make_table makes use of this to search (part of) the virtual memory space to find table record tags. This is used to build the table of JSON commands, among other things. It seems a somewhat heavy-handed way to define tables, if you ask me, as if the program is reverse-engineering itself.

@cdecker
Member

cdecker commented Feb 9, 2018

Yep, autodata has been causing a few issues as of late, so maybe we'll just rip it out completely. Unless you have a fix that we can upstream @laanwj

@Sjors
Contributor

Sjors commented Feb 24, 2018

@isghe any progress on your PR?

@isghe
Contributor

isghe commented Feb 26, 2018

Hi @Sjors
My workaround is embarrassing, but from what I can see it seems to work: ignore the crash by commenting out this line in the function crashdump:

	// status_failed(STATUS_FAIL_INTERNAL_ERROR, "FATAL SIGNAL %u", sig);

Obviously that is not a solution, but it lets you run and debug on macOS.

@Sjors
Contributor

Sjors commented Feb 26, 2018

I'm not a fan of ignoring crashes on mainnet, but I suppose I can spin up a testnet node on macOS.

@cdecker
Member

cdecker commented Mar 4, 2018

Yeah, definitely don't just ignore crashes, you might end up corrupting some persistent state and then you'll definitely lose your funds.

@conanoc
Contributor

conanoc commented Mar 22, 2018

SIGSEGV and SIGBUS signals are raised whenever jsonrpc is called on Mac. The strange thing is that those signals do not appear if we do not register handlers with sigaction(). Any idea?

I think we can work around this issue by deactivating the crash log. backtrace does not support OS X, which means the crash log is useless on Mac anyway.
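One possible explanation, offered as an assumption rather than a confirmed diagnosis: signal dispositions are inherited across fork(), so a child forked to probe memory would run the parent's crash handler when it faults, while with the default disposition it would simply die quietly. The sketch below demonstrates only that inheritance; crash_handler and handler_ran_in_child are made-up names, and raise() stands in for a real fault.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a crash handler; it only exits with a recognisable
 * status (_exit is async-signal-safe). */
static void crash_handler(int sig)
{
	(void)sig;
	_exit(42);
}

/* Hypothetical demo: installs a SIGSEGV handler, forks a child that
 * raises SIGSEGV, and returns true if the handler ran in the child. */
static bool handler_ran_in_child(bool reset_in_child)
{
	struct sigaction sa, old;
	memset(&sa, 0, sizeof sa);
	sa.sa_handler = crash_handler;
	sigemptyset(&sa.sa_mask);
	int rc = sigaction(SIGSEGV, &sa, &old);
	assert(rc == 0);
	(void)rc;

	pid_t pid = fork();
	if (pid == 0) {
		/* Child: optionally drop the inherited handler first. */
		if (reset_in_child)
			signal(SIGSEGV, SIG_DFL);
		raise(SIGSEGV);	/* simulated fault */
		_exit(0);	/* not reached */
	}

	/* Parent: restore the previous disposition, then inspect the child. */
	sigaction(SIGSEGV, &old, NULL);
	int status = 0;
	if (pid < 0 || waitpid(pid, &status, 0) != pid)
		return false;
	return WIFEXITED(status) && WEXITSTATUS(status) == 42;
}
```

Calling handler_ran_in_child(false) shows the handler firing in the child, and handler_ran_in_child(true) shows the child dying from the signal without it. If that is what happens here, resetting the handlers in the probing child would be a narrower change than disabling the crash log altogether.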
