Troubleshooting
This page describes how to identify and address common issues running the Aeternity node.
Sections at the bottom of the page deal with common issues in building the node or running development tests.
This section applies to version v0.4.1 or greater. It may also apply to version v1.3.0 or greater, allowing for the deviations noted below.
(Note: the names of the miner executables were updated in v1.0.0-rc2.)
(Note: the names of the log files changed in v2.0.0-rc.1.)
(Note: the operational script bin/epoch was renamed to bin/aeternity in v1.3.0.)
The following assumes that the node is deployed in directory /tmp/node.
If the node attempts to mine but fails to do so, check for error entries in /tmp/node/log/epoch_mining.log.
You may see entries like the following:
2018-01-03 10:18:23.812 [info] <0.903.0>@aec_conductor:create_block_candidate:728 Creating block candidate
2018-01-03 10:18:23.815 [info] <0.903.0>@aec_conductor:handle_block_candidate_reply:744 Created block candidate and nonce (max 13078180597498667020, current 13078180597498667021).
2018-01-03 10:18:23.815 [info] <0.903.0>@aec_conductor:start_mining:643 Starting mining
2018-01-03 10:18:25.871 [error] <0.903.0>@aec_conductor:handle_mining_reply:670 Failed to mine block, runtime error; retrying with different nonce (was 13078180597498667021). Error: {execution_failed,{signal,sigkill,false}}
2018-01-03 10:18:25.872 [info] <0.903.0>@aec_conductor:start_mining:643 Starting mining
2018-01-03 10:18:26.230 [error] <0.903.0>@aec_conductor:handle_mining_reply:670 Failed to mine block, runtime error; retrying with different nonce (was 13078180597498667022). Error: {execution_failed,{signal,sigabrt,true}}
2018-01-03 10:18:26.230 [info] <0.903.0>@aec_conductor:start_mining:643 Starting mining
2018-01-03 10:18:26.371 [error] <0.903.0>@aec_conductor:handle_mining_reply:670 Failed to mine block, runtime error; retrying with different nonce (was 13078180597498667023). Error: {execution_failed,{signal,sigabrt,true}}
Notice "signal,sigabrt" in the entries above. You may also see corresponding entries in /tmp/node/log/epoch_pow_cuckoo.log like the following:
2018-01-03 10:18:23.816 [info] <0.913.0>@aec_pow_cuckoo:generate_int:156 Executing cmd: "env LD_LIBRARY_PATH=../lib:$LD_LIBRARY_PATH ./mean30s-generic -h uXkXZrU2tPmyYThehkTmZf6fqOuc6pvxCc87gv/BV8U=DWBQVvYHf7U= -t 5"
2018-01-03 10:18:25.859 [error] <0.913.0>@aec_pow_cuckoo:wait_for_result:362 OS process died: {signal,sigkill,false}
2018-01-03 10:18:25.880 [info] <0.1209.0>@aec_pow_cuckoo:generate_int:156 Executing cmd: "env LD_LIBRARY_PATH=../lib:$LD_LIBRARY_PATH ./mean30s-generic -h uXkXZrU2tPmyYThehkTmZf6fqOuc6pvxCc87gv/BV8U=DmBQVvYHf7U= -t 5"
2018-01-03 10:18:25.935 [error] <0.1209.0>@aec_pow_cuckoo:wait_for_result:347 ERROR: terminate called after throwing an instance of '
2018-01-03 10:18:25.938 [error] <0.1209.0>@aec_pow_cuckoo:wait_for_result:347 ERROR: std::bad_alloc
2018-01-03 10:18:25.939 [error] <0.1209.0>@aec_pow_cuckoo:wait_for_result:347 ERROR: '
2018-01-03 10:18:25.940 [error] <0.1209.0>@aec_pow_cuckoo:wait_for_result:347 ERROR: what():
2018-01-03 10:18:25.941 [error] <0.1209.0>@aec_pow_cuckoo:wait_for_result:347 ERROR: std::bad_alloc
2018-01-03 10:18:25.942 [error] <0.1209.0>@aec_pow_cuckoo:wait_for_result:347 ERROR:
2018-01-03 10:18:25.942 [debug] <0.1209.0>@aec_pow_cuckoo:parse_generation_result:420 Looking for 42-cycle on cuckoo30("uXkXZrU2tPmyYThehkTmZf6fqOuc6pvxCc87gv/BV8U=DmBQVvYHf7U=",0) with 50% edges
2018-01-03 10:18:26.229 [error] <0.1209.0>@aec_pow_cuckoo:wait_for_result:362 OS process died: {signal,sigabrt,true}
2018-01-03 10:18:26.230 [info] <0.1211.0>@aec_pow_cuckoo:generate_int:156 Executing cmd: "env LD_LIBRARY_PATH=../lib:$LD_LIBRARY_PATH ./mean30s-generic -h uXkXZrU2tPmyYThehkTmZf6fqOuc6pvxCc87gv/BV8U=D2BQVvYHf7U= -t 5"
2018-01-03 10:18:26.233 [error] <0.1211.0>@aec_pow_cuckoo:wait_for_result:347 ERROR: terminate called after throwing an instance of '
2018-01-03 10:18:26.234 [error] <0.1211.0>@aec_pow_cuckoo:wait_for_result:347 ERROR: std::bad_alloc
2018-01-03 10:18:26.235 [error] <0.1211.0>@aec_pow_cuckoo:wait_for_result:347 ERROR: '
2018-01-03 10:18:26.235 [error] <0.1211.0>@aec_pow_cuckoo:wait_for_result:347 ERROR: what():
2018-01-03 10:18:26.235 [error] <0.1211.0>@aec_pow_cuckoo:wait_for_result:347 ERROR: std::bad_alloc
2018-01-03 10:18:26.236 [error] <0.1211.0>@aec_pow_cuckoo:wait_for_result:347 ERROR:
2018-01-03 10:18:26.236 [debug] <0.1211.0>@aec_pow_cuckoo:parse_generation_result:420 Looking for 42-cycle on cuckoo30("uXkXZrU2tPmyYThehkTmZf6fqOuc6pvxCc87gv/BV8U=D2BQVvYHf7U=",0) with 50% edges
2018-01-03 10:18:26.371 [error] <0.1211.0>@aec_pow_cuckoo:wait_for_result:362 OS process died: {signal,sigabrt,true}
Notice "bad_alloc" in the entries above. These are symptoms of memory allocation issues.
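The symptom check described above can be sketched as a small log scanner. This is a minimal sketch, not part of the node: the function name and the returned dictionary are illustrative, and the patterns are copied from the log excerpts in this section (sigkill is included only because it also appears in those excerpts).

```python
import re

# Patterns taken from the log excerpts in this section.
# The names below are illustrative assumptions, not node APIs.
SYMPTOMS = {
    "bad_alloc": re.compile(r"std::bad_alloc"),
    "sigabrt": re.compile(r"\{signal,sigabrt,"),
    "sigkill": re.compile(r"\{signal,sigkill,"),
}


def count_memory_symptoms(log_text):
    """Count how many log lines match each symptom pattern."""
    counts = {name: 0 for name in SYMPTOMS}
    for line in log_text.splitlines():
        for name, pattern in SYMPTOMS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

To use it, read /tmp/node/log/epoch_pow_cuckoo.log (or epoch_mining.log) into a string and pass it to count_memory_symptoms; non-zero counts for bad_alloc or sigabrt suggest the memory issue described here.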
If memory is constrained, you can configure a less memory-intensive (though usually slower) algorithm than the default one.
Amend the mining section in the user configuration file from:
mining:
    autostart: true
... to ...
mining:
    autostart: true
    cuckoo:
        miner:
            executable: lean29-generic
            extra_args: ""
            edge_bits: 29
... then stop and start the node: ( cd /tmp/node; bin/epoch stop; bin/epoch start; )
Note: until v1.0.0-rc2 the arguments would be:
            ...
            executable: lean30
            extra_args: ""
            node_bits: 30
This section applies to all versions. For version v1.3.0 or greater, it may apply with deviations.
The following assumes that the node is deployed in directory /tmp/node.
If the node won't start, check /tmp/node/log/epoch.log for errors similar to the following:
2018-03-05 10:15:16.973 [error] <0.950.0>@gen_server:init_it:357 CRASH REPORT Process exec with 0 neighbours exited with reason: bad return value: "Port program /tmp/node/lib/erlexec-1.7.1/priv/x86_64-unknown-linux-gnu/exec-port with SUID bit set is not allowed to run without setting effective user!" in gen_server:init_it/6 line 357
2018-03-05 10:15:16.974 [error] <0.949.0> Supervisor exec_app had child exec started with exec:start_link([]) at undefined exit with reason bad return value: "Port program /tmp/node/lib/erlexec-1.7.1/priv/x86_64-unknown-linux-gnu/exec-port with SUID bit set is not allowed to run without setting effective user!" in context start_error
2018-03-05 10:15:16.974 [error] <0.947.0> CRASH REPORT Process <0.947.0> with 0 neighbours exited with reason: {{shutdown,{failed_to_start_child,exec,{bad_return_value,"Port program /tmp/node/lib/erlexec-1.7.1/priv/x86_64-unknown-linux-gnu/exec-port with SUID bit set is not allowed to run without setting effective user!"}}},{exec_app,start,[normal,[]]}} in application_master:init/4 line 134
2018-03-05 10:15:16.975 [info] <0.855.0> Application erlexec exited with reason: {{shutdown,{failed_to_start_child,exec,{bad_return_value,"Port program /tmp/node/lib/erlexec-1.7.1/priv/x86_64-unknown-linux-gnu/exec-port with SUID bit set is not allowed to run without setting effective user!"}}},{exec_app,start,[normal,[]]}}
Notice "SUID bit set is not allowed to run without setting effective user!" in the entries above.
This is a symptom of running the node as a privileged user.
You can confirm this by running the id command:
root@localhost:~# id
uid=0(root) gid=0(root) groups=0(root)
- Notice the uid=0
- Notice the # at the command prompt
You must run your node as a non-privileged user for security reasons. Create a new user (or use an existing one) and make sure the node files are owned by it.
For example:
useradd -m epoch
chown -R epoch:epoch /tmp/node
Switch to the newly created user, either by logging in again as that user or, if you are still logged in as the privileged user, with the su command:
su epoch
Verify that the user is non-privileged with the id command:
epoch@localhost:~$ id
uid=1001(epoch) gid=1001(epoch) groups=1001(epoch)
- Notice the uid=1001 (should not be 0)
- Notice the $ at the command prompt
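The same check that the id command performs can be sketched programmatically, for instance in a wrapper script that refuses to start the node as root. This is a minimal sketch under that assumption; the function name is hypothetical and os.geteuid is available on POSIX systems only.

```python
import os


def is_privileged(uid=None):
    """Return True when the given uid (default: current effective uid) is 0, i.e. root."""
    if uid is None:
        uid = os.geteuid()  # POSIX only
    return uid == 0
```

A wrapper could call is_privileged() and exit with an error before launching the node when it returns True.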
This section applies to v0.10 or later. For version v1.3.0 or greater, it may apply with deviations.
The following assumes that the node is deployed in directory /tmp/node.
If the node won't start, check /tmp/node/log/epoch.log for errors similar to the following:
18:32:06.799 [warning] Expected genesis block hash <<25,77,27,140,45,8,239,74,202,109,47,84,197,77,26,105,247,45,34,75,2,251,250,254,2,212,58,196,52,160,198,153>>, persisted genesis block hash <<173,144,161,143,82,71,247,64,21,169,150,78,58,247,121,152,208,163,240,137,125,185,215,180,252,4,43,172,40,52,188,92>>
18:32:06.799 [error] Persisted chain has a different genesis block than the one being expected. Aborting
{"Kernel pid terminated",application_controller,"{application_start_failure,aecore,{inconsistent_database,{aecore_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,aecore,{inconsistent_database,{aecore_app,start,[normal,[]]}}})
This is a symptom of running the node with an old persisted DB after a non-backwards-compatible protocol change that affected the genesis block.
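To confirm the mismatch, the two hashes in the warning line can be extracted and compared. The following is a minimal sketch that assumes the warning has exactly the format shown in the excerpt above; the function name is hypothetical and the hashes are rendered as hex only for readability.

```python
import re

# Matches the warning format shown in the log excerpt above (an assumption:
# other node versions may word this line differently).
GENESIS_RE = re.compile(
    r"Expected genesis block hash <<([\d,]+)>>, "
    r"persisted genesis block hash <<([\d,]+)>>"
)


def parse_genesis_warning(log_text):
    """Return (expected_hex, persisted_hex) from the first warning found, else None."""
    match = GENESIS_RE.search(log_text)
    if match is None:
        return None
    expected, persisted = (
        bytes(int(b) for b in group.split(",")).hex() for group in match.groups()
    )
    return expected, persisted
```

If the function returns a pair whose two values differ, the persisted DB was created with a different genesis block than the one the running node expects.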
You must start the node with a clean DB directory. This can be achieved by either:
- Deleting the contents of your current DB directory
- Setting a new DB directory in your user configuration file
This section is applicable to all versions.
The node gets stuck at a different top height (most commonly close to the top of the chain) than the rest of the network and fails to synchronize.
In rare cases the node ends up in a steady state where it thinks it is fully synchronized, but it is not. Normally it gets out of this state by pinging another node, but depending on network topology and timing this might fail.
Restarting the node restarts the synchronization.
This section is applicable to v0.21.
In the WS API, getting a block by hash doesn't return the list of transactions for micro blocks.
Use the HTTP API instead: the /micro-blocks/hash/{hash}/transactions endpoint returns the list of transactions.
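A request to this endpoint can be sketched with the Python standard library. Only the endpoint path comes from this page; the base URL (host, port and path prefix) is an assumption that you must adjust to your deployment, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the node's external HTTP API is reachable at this base URL.
# Adjust host, port and path prefix to match your deployment.
BASE_URL = "http://localhost:3013/v2"


def micro_block_transactions_url(block_hash, base_url=BASE_URL):
    """Build the URL of the transactions endpoint for a micro block hash."""
    quoted = urllib.parse.quote(block_hash, safe="")
    return f"{base_url.rstrip('/')}/micro-blocks/hash/{quoted}/transactions"


def fetch_micro_block_transactions(block_hash, base_url=BASE_URL, timeout=10):
    """GET the endpoint and return the decoded JSON body."""
    url = micro_block_transactions_url(block_hash, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_micro_block_transactions with the hash of a micro block requires a running node; the URL builder can be used on its own to check what would be requested.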
This section is applicable to v0.25.0.
Compilation of a contract fails with:
{undefined_function,{builtin,{map_get,{map,word,word}}}}
Work around it by adding some unused code that does a map lookup (m[0] or similar).
The cuda29 miner was compiled against a newer NVIDIA CUDA driver than the one installed on the system running the miner. This can happen with prebuilt cuckoo miners.
Install newer CUDA drivers
OR
Compile the aecuckoo miner locally
This section is applicable to v3.*
The state channel FSM does not yet support generalized accounts. When either of the two participants of a state channel has upgraded their account to a generalized account, their attempts to sign transactions fail with a message like:
{
  "jsonrpc": "2.0",
  "method": "channels.error",
  "params": {
    "channel_id": null,
    "data": {
      "message": "not_create_tx"
    }
  },
  "version": 1
}
Note that this is an example message; a different message could contain a channel_id. The message could also be one of:
- not_create_tx when trying to sign a channel create transaction by an upgraded account
- not_deposit_tx when trying to sign a deposit transaction by an upgraded account
- not_withdraw_tx when trying to sign a withdrawal transaction by an upgraded account
- not_offchain_tx when trying to sign an off-chain transaction by an upgraded account
- not_close_mutual_tx when trying to sign a mutual closing transaction by an upgraded account
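A client that receives such messages could map them to a readable hint. The following is a minimal sketch: the table is copied from the list above, while the function name and the wording of the hint are illustrative assumptions.

```python
import json

# Message-to-transaction table copied from the list above.
GA_SIGN_ERRORS = {
    "not_create_tx": "channel create transaction",
    "not_deposit_tx": "deposit transaction",
    "not_withdraw_tx": "withdrawal transaction",
    "not_offchain_tx": "off-chain transaction",
    "not_close_mutual_tx": "mutual closing transaction",
}


def explain_channel_error(raw_message):
    """Return a hint for a known signing failure in a channels.error message, else None."""
    message = json.loads(raw_message)
    if message.get("method") != "channels.error":
        return None
    data = (message.get("params") or {}).get("data") or {}
    code = data.get("message")
    kind = GA_SIGN_ERRORS.get(code)
    if kind is None:
        return None
    return f"{code}: signing a {kind} failed; the signer may be a generalized account"
```

Messages with other methods or unknown codes return None, so the helper can sit in front of existing error handling without hiding unrelated errors.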
On-chain protocol support is already present, but support for GAs in the FSM will be added in v4.0.0. Until then, participants can still use the new authentication methods, but only outside the scope of the FSM.
This section is applicable to v3.*
The state channel FSM does not yet support generalized accounts. When either of the two participants of a state channel has upgraded their account to a generalized account, the node rejects incoming WebSocket open requests. This protects users from relying on a feature that is not yet supported.
Do not use the state channels' WebSocket API and FSM with generalized accounts. Support will be provided in v4.0.0.
This section applies to versions v3.* for (advanced) users who build the node and have installed the libsodium build dependency in a non-default path.
Such users export the environment variables CFLAGS/LDFLAGS before attempting to build the node, for example:
export CFLAGS="-I $(brew --prefix libsodium)/include"
export LDFLAGS="-L$(brew --prefix libsodium)/lib" # No space - [ref](https://gcc.gnu.org/onlinedocs/gcc/Directory-Options.html#Directory-Options).
From a clean clone of the node repo, you ran make prod-build and got output like:
...
===> Compiling rocksdb
...
checking whether we are cross compiling... configure: error: in `/Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/deps/snappy':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
make[2]: *** [/Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/deps/snappy/.libs/libsnappy.a] Error 1
make[2]: *** Waiting for unfinished jobs....
...
make[1]: [deps] Error 2 (ignored)
...
cc /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/refobjects.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/erocksdb_iter.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/erocksdb.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/erocksdb_db.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/erlang_merge.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/batch.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/erocksdb_snapshot.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/util.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/transactions.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/cache.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/rate_limiter.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/bitset_merge_operator.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/backup.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/erocksdb_column_family.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/counter_merge_operator.o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/c_src/env.o -L /Users/ae/dev/kerl/installations/OTP-20.3.8.20/lib/erl_interface-3.10.2.1/lib -lerl_interface -lei -lstdc++ /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/deps/rocksdb/librocksdb.a /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/deps/snappy/.libs/libsnappy.a /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/deps/lz4/lib/liblz4.a -L/Users/ae/homebrew/opt/libsodium/lib -shared -o /Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/priv/rocksdb.so
clang: error: no such file or directory: '/Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/deps/rocksdb/librocksdb.a'
clang: error: no such file or directory: '/Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/deps/snappy/.libs/libsnappy.a'
make[1]: *** [/Users/ae/dev/ae/aeternity/_build/default/lib/rocksdb/priv/rocksdb.so] Error 1
===> Hook for compile failed!
make: *** [internal-compile-deps] Error 1
Unset CFLAGS and LDFLAGS and run make prod-build so that _build/default/lib/rocksdb gets built. If the build does not complete (e.g. because of a compilation error on enacl: 'sodium.h' file not found), set CFLAGS and LDFLAGS again and rerun make prod-build.
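The two-pass build described above can be sketched as a small driver script. This is a minimal sketch, assuming make is on the PATH, that repo_dir points at a clone of the node repo, and that CFLAGS/LDFLAGS are already exported in the calling environment; the function names are hypothetical.

```python
import os
import subprocess


def env_without(names, env=None):
    """Return a copy of the environment with the given variables removed."""
    base = os.environ if env is None else env
    return {key: value for key, value in base.items() if key not in names}


def build_in_two_passes(repo_dir):
    """Run make prod-build without CFLAGS/LDFLAGS, then with them if needed."""
    stripped = env_without({"CFLAGS", "LDFLAGS"})
    first = subprocess.run(["make", "prod-build"], cwd=repo_dir, env=stripped)
    if first.returncode == 0:
        return 0
    # The first pass is expected to get rocksdb built even if it fails later
    # (e.g. on enacl); the second pass uses the caller's original environment.
    second = subprocess.run(["make", "prod-build"], cwd=repo_dir)
    return second.returncode
```

The second pass inherits the caller's environment unchanged, which is why CFLAGS and LDFLAGS need to be exported before the script is started.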
This section describes how to identify and address common issues running development tests for the Aeternity node.
When running make test, the test fails in the Sync Suite because localhost and the HOSTNAME resolve to different IPs. This problem has been seen on Ubuntu 18.04.
The test fails with:
%%% aecore_sync_SUITE ==> all_nodes.two_nodes.start_second_node: FAILED
%%% aecore_sync_SUITE ==> {test_case_failed,
{retry_exhausted1,
The file /etc/hosts contains two lines like:
127.0.0.1 localhost
127.0.1.1 myhostname
Make sure you use the same IP for localhost and the HOSTNAME by editing /etc/hosts to:
127.0.0.1 localhost myhostname
Note that this might not be what you want if your machine has a public IP address and hostname. In that case, edit the files ...epoch/config/dev*/sys.config so that the peers entries point to your machine's hostname instead of localhost:
{peers, [<<"aenode://pp$23YdvfRPQ1b1AMWmkKZUGk2cQLqygQp55FzDWZSEUicPjhxtp5@myhostname:3025">>,
<<"aenode://pp$2M9oPohzsWgJrBBCFeYi3PVT4YF7F2botBtq6J1EGcVkiutx3R@myhostname:3035">>]},
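Whether localhost and the machine's hostname share an IP in /etc/hosts can be checked before running the tests. The following is a minimal sketch that only parses hosts-style text; it does not consult DNS or other resolvers, and the function names are hypothetical.

```python
def hosts_mapping(hosts_text):
    """Map each hostname in /etc/hosts-style text to the first IP listed for it."""
    mapping = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)
    return mapping


def same_ip(hosts_text, hostname, other="localhost"):
    """Return True when both names are mapped to the same IP."""
    mapping = hosts_mapping(hosts_text)
    return hostname in mapping and mapping.get(hostname) == mapping.get(other)
```

Reading /etc/hosts into a string and calling same_ip with the output of the hostname command would flag the situation described in this section when it returns False.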