zcashd exits with "Segmentation fault" shortly after starting while syncing #2787

Closed
lucasvo opened this issue Dec 2, 2017 · 7 comments
Labels
database corruption, I-fail-to-run (The zcashd binary fails to start, or crashes shortly after starting.)

Comments

@lucasvo

lucasvo commented Dec 2, 2017

Describe the issue

I set up a node inside a Docker container based on the Dockerfile pasted below. I aborted the process a few times by hitting Ctrl+C in the terminal, and each time it continued the sync from where it left off, until it got stuck in this state.

I backed up the datadir and started with a blank one and it runs as expected.

Can you reliably reproduce the issue?

With my corrupted data dir it segfaults reliably on startup.

Expected behaviour

It shouldn't crash.

Actual behaviour + errors

The application ends with the text "Segmentation fault" and no other error.
The debug.log excerpt from the last normal shutdown doesn't indicate anything bad happened.

The version of Zcash you were using:

Zcash Daemon version v1.0.13 from https://apt.z.cash/ (jessie main)

Machine specs:

Any extra information that might be useful in the debugging process.

[...]
2017-12-02 13:20:11 UpdateTip: new best=00097d6c62a6434b7abcf5c0664c14a4aaf4d68a11ef619f1241955406b20eac  height=72982  log2_work=30.087289  tx=103453  date=2017-05-04 12:09:08 progress=0.302398  cache=0.4MiB(688tx)
2017-12-02 13:20:11 UpdateTip: new best=000ee0c96eff5fadaf81b419cde77dd71192017b9e8ea85de3306c7153455396  height=72983  log2_work=30.087294  tx=103455  date=2017-05-04 12:14:42 progress=0.302409  cache=0.4MiB(689tx)
2017-12-02 13:20:11 UpdateTip: new best=0008a5a5188f73aec29626604fae14c0346b555dd89d9aede4020aa7b09b1b6a  height=72984  log2_work=30.087299  tx=103456  date=2017-05-04 12:15:57 progress=0.302413  cache=0.4MiB(690tx)
2017-12-02 13:20:11 UpdateTip: new best=0010b4d02969b466d49c995f418cc9da7c7022ae1715051ca1d2f72ea0c4e057  height=72985  log2_work=30.087304  tx=103457  date=2017-05-04 12:16:00 progress=0.302416  cache=0.4MiB(691tx)
2017-12-02 13:20:11 UpdateTip: new best=000989d75b94cace04511b199fd4bb2ef8b1da144c4282591b34c72bd1fda419  height=72986  log2_work=30.087309  tx=103460  date=2017-05-04 12:17:13 progress=0.302426  cache=0.4MiB(694tx)
2017-12-02 13:20:11 tor: Thread interrupt
2017-12-02 13:20:11 torcontrol thread exit
2017-12-02 13:20:11 opencon thread interrupt
2017-12-02 13:20:11 addcon thread interrupt
2017-12-02 13:20:11 scheduler thread interrupt
2017-12-02 13:20:11 UpdateTip: new best=0007908ca4ba9c3654d5c6d362a18f2b56a3c83553074844a7508552c7c01039  height=72987  log2_work=30.087314  tx=103464  date=2017-05-04 12:20:55 progress=0.302442  cache=0.4MiB(697tx)
2017-12-02 13:20:11 msghand thread interrupt
2017-12-02 13:20:11 net thread interrupt
2017-12-02 13:20:11 Shutdown: In progress...
2017-12-02 13:20:11 StopRPC: waiting for async rpc workers to stop
2017-12-02 13:20:11 StopNode()
2017-12-02 13:20:11 Shutdown: done
2017-12-02 13:20:15 



















2017-12-02 13:20:15 Zcash version v1.0.13 (2017-11-20 13:48:14 -0800)
2017-12-02 13:20:15 Using OpenSSL version OpenSSL 1.1.0d  26 Jan 2017
2017-12-02 13:20:15 Using BerkeleyDB version Berkeley DB 6.2.23: (March 28, 2016)
2017-12-02 13:20:15 Default data directory /root/.zcash
2017-12-02 13:20:15 Using data directory /root/.zcash/testnet3
2017-12-02 13:20:15 Using config file /root/.zcash/zcash.conf
2017-12-02 13:20:15 Using at most 125 connections (1048576 file descriptors available)
2017-12-02 13:20:15 Using 4 threads for script verification
2017-12-02 13:20:15 scheduler thread start
2017-12-02 13:20:15 Loading verifying key from /root/.zcash-params/sprout-verifying.key
2017-12-02 13:20:15 Loaded verifying key in 0.004999s seconds.
2017-12-02 13:20:15 libevent: getaddrinfo: address family for nodename not supported
2017-12-02 13:20:15 Binding RPC on address :: port 18232 failed.
2017-12-02 13:20:15 HTTP: creating work queue of depth 16
2017-12-02 13:20:15 HTTP: starting 4 worker threads
2017-12-02 13:20:15 Using wallet wallet.dat
2017-12-02 13:20:15 CDBEnv::Open: LogDir=/root/.zcash/testnet3/database ErrorFile=/root/.zcash/testnet3/db.log
2017-12-02 13:20:15 Bound to [::]:18233
2017-12-02 13:20:15 Bound to 0.0.0.0:18233
2017-12-02 13:20:15 Cache configuration:
2017-12-02 13:20:15 * Using 2.0MiB for block index database
2017-12-02 13:20:15 * Using 32.5MiB for chain state database
2017-12-02 13:20:15 * Using 65.5MiB for in-memory UTXO set
2017-12-02 13:20:15 Opening LevelDB in /root/.zcash/testnet3/blocks/index
2017-12-02 13:20:16 Opened LevelDB successfully
2017-12-02 13:20:16 Opening LevelDB in /root/.zcash/testnet3/chainstate
2017-12-02 13:20:16 Opened LevelDB successfully
2017-12-02 13:20:25 
[output really ends here]

Do you have a back up of ~/.zcash directory and/or take a VM snapshot?

FROM ubuntu:16.04

RUN apt-get update && apt-get install -y --no-install-recommends apt-transport-https ca-certificates wget
RUN wget -O - https://apt.z.cash/zcash.asc | apt-key add -
RUN echo "deb https://apt.z.cash/ jessie main" | tee /etc/apt/sources.list.d/zcash.list
RUN apt-get update && apt-get install -y zcash
RUN zcash-fetch-params
EXPOSE 8232
CMD zcashd
@daira added the I-fail-to-run label Dec 16, 2017
@daira added this to Work Queue in Security and Stability via automation Dec 16, 2017
@ioptio
Contributor

ioptio commented Dec 18, 2017

It seems the datadir may have been corrupted by the aborted runs under Docker, since the node runs as expected with a fresh datadir.

Using the stop command should ensure a proper shutdown, as opposed to Ctrl+C or another kill method.
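
As a rough sketch of the stop-command approach in a Docker setup, the helper below asks zcashd to shut down over RPC and then waits for the container to exit. The function name `stop_zcashd` and the container name are hypothetical, and it assumes `zcash-cli` inside the container reads the same zcash.conf as the daemon.

```shell
# Hypothetical helper: request a clean zcashd shutdown inside a container.
# $1 is the container name or ID (an assumption; use whatever `docker ps` shows).
stop_zcashd() {
  container="$1"
  # Ask the daemon to stop via its RPC interface.
  docker exec "$container" zcash-cli stop &&
    # Block until the container's main process has actually exited.
    docker wait "$container"
}
```

It would be called as `stop_zcashd my-zcash-node`, where `my-zcash-node` is a placeholder name.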

@daira any additional insight?

@ioptio added this to Collect Information in User Support Dec 18, 2017
@lucasvo
Author

lucasvo commented Jan 7, 2018

@ioptio The line "2017-12-02 13:20:11 Shutdown: done" leads me to believe that zcashd shut down properly before Docker killed the container, which would mean the datadir got corrupted during a normal shutdown. But perhaps I am wrong and it was indeed Docker that caused the corruption, in which case I wouldn't expect zcash to recover from it.
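
To check this kind of claim across a whole debug.log rather than one excerpt, a small sketch like the following counts startups that were not preceded by a "Shutdown: done" line since the previous startup. The function name `unclean_restarts` is hypothetical, and the two patterns are taken from the log lines quoted in this issue. A "Shutdown: done" line only shows that the daemon reached the end of its shutdown routine; it says nothing about the state of the files on disk.

```shell
# Hypothetical helper: count restarts in a debug.log that did not follow
# a logged "Shutdown: done". $1 is the path to debug.log.
unclean_restarts() {
  awk '
    /Shutdown: done/  { clean = 1; next }          # daemon finished its shutdown routine
    / Zcash version / {                             # startup banner line
      if (seen && !clean) n++                       # previous run ended without "Shutdown: done"
      seen = 1; clean = 0
    }
    END { print n + 0 }
  ' "$1"
}
```

The very first startup in the file is never counted, since there is no earlier run to compare against.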

@ianamunoz
Contributor

@lucasvo How are you running your container? Are you mounting your datadir into the container so that changes persist outside of it? Just trying to get a little more information on how you are running your container/daemon.

@lucasvo
Author

lucasvo commented Mar 4, 2018

@ianamunoz Yes, I was mounting the volume with docker run -v /src:/dst ...

@ioptio
Contributor

ioptio commented Mar 19, 2018

@ianamunoz any additional input/questions?

@ianamunoz
Contributor

A couple of thoughts here, @lucasvo, in case you are still trying to run with a Docker container. First, your container logs indicate you are running a testnet node: I see it binding RPC on port 18232 and using the default data directory /root/.zcash. Your Dockerfile exposes the mainnet port (8232) rather than the testnet port.
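
For illustration, a Dockerfile line matching the ports in the startup log (RPC on 18232, P2P on 18233) might look like the fragment below. This is a sketch based only on the log lines quoted in this issue, not a tested configuration.

```dockerfile
# Testnet ports as they appear in the debug.log above:
# 18232 for RPC, 18233 for peer-to-peer connections.
EXPOSE 18232 18233
```

Note that EXPOSE only records metadata on the image; ports are actually published at run time with `docker run -p`.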

If you are trying to talk to the daemon from your host, outside of the container, your ~/.zcash/zcash.conf needs a parameter that allows RPC requests from outside of localhost (e.g. #rpcallowip=1.2.3.4/24). Beware of the security implications.

It is hard to debug the Docker side without your exact run command. If you don't mount the directory at the proper location, you would see a new data directory being created inside the container. I have also seen permissions affect the runtime, depending on the uid:gid of the running container. Running with defaults runs the container as root, which on OSX systems can do weird things inside of your home directory.
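
To make the mount-location point concrete, here is a minimal sketch of a run command that mounts a host directory at the data directory reported in the startup log (/root/.zcash) and publishes the testnet P2P port. The wrapper name `run_zcashd`, the host path, and the image name are all hypothetical; the original run command was only given as `docker run -v /src:/dst ...`.

```shell
# Hypothetical wrapper around `docker run` for the image built from the Dockerfile.
# $1: host directory that holds zcash.conf and the testnet3/ subdirectory (assumed)
# $2: image name (assumed)
run_zcashd() {
  host_datadir="$1"
  image="$2"
  # Mount the host directory where zcashd looks for its data inside the
  # container, and publish the testnet P2P port seen in the log.
  docker run -d \
    -v "$host_datadir:/root/.zcash" \
    -p 18233:18233 \
    "$image"
}
```

If the container path on the right-hand side of `-v` is anything other than the directory zcashd uses, the daemon would start with a fresh data directory inside the container instead.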

I can't pull your datadir from the link above to see your .conf file. Does it still exist somewhere?

@lindanlee

We're closing this ticket. Feel free to reopen if the above comment didn't address your concerns. Hope it works out!

User Support automation moved this from Collect Information to Complete Apr 9, 2018
@daira removed this from Work Queue in Security and Stability Jul 24, 2018
5 participants