monerod throws a lot of exceptions and slows down synchronization till almost complete halt. #8132
Comments
The debug build doesn't show names in the stack trace either, and it behaves exactly like the release build, with the same problems. |
Seems like something related to FreeBSD. Function names definitely show up on Linux for both release and debug builds. |
Yes, maybe. Though my problem is not incomplete stack traces, but a lot of exceptions. |
@yekm What behavior exactly? Just exceptions? Can you post one of them? Sync slowing down at the end is normal. |
No symbols in default package.
Well, I guess my situation is normal then. |
It's possible that the arch package is compiled without libunwind. |
Exactly the same issue for me with the Docker image (https://github.com/sethforprivacy/simple-monerod-docker) on x64 Debian 10.
At least in my case, I can say that the Dockerfile of the image above includes libunwind as part of the build. |
I'm having the same issue. The first few I have tried the
|
I've set the log level to 1 (0 = low, 4 = high). After a few minutes of running, I notice that the exceptions occur next to this: Level 1 log
Which looks to me like something to do with the levin protocol timeout. I've located all log entries with that IP address, and it seems that the peer connected, timed out, and then either my node or the peer closed the connection. It could also have been dropped, I don't know:
$ rg -N "193.142.4.248" /var/log/monero/monero_l4.log
2023-10-10 19:06:54.659 [P2P0] INFO net.p2p src/p2p/net_node.inl:2678 [193.142.4.248:18180 4ab669ef-5059-4991-bb3e-416abda9be2d OUT] NEW CONNECTION
2023-10-10 19:06:54.659 [P2P0] INFO net.p2p.traffic contrib/epee/include/net/levin_protocol_handler_async.h:56 [193.142.4.248:18180 OUT] 262 bytes sent for category command-1001 initiated by us
2023-10-10 19:07:00.032 [P2P8] INFO net contrib/epee/include/net/levin_protocol_handler_async.h:214 [193.142.4.248:18180 OUT] Timeout on invoke operation happened, command: 1001 timeout: 5000
2023-10-10 19:07:00.032 [P2P8] WARNING net.p2p src/p2p/net_node.inl:1163 [193.142.4.248:18180 OUT] COMMAND_HANDSHAKE invoke failed. (-4, LEVIN_ERROR_CONNECTION_TIMEDOUT)
2023-10-10 19:07:00.032 [P2P0] WARNING net.p2p src/p2p/net_node.inl:1222 [193.142.4.248:18180 OUT] COMMAND_HANDSHAKE Failed
2023-10-10 19:07:00.032 [P2P0] INFO net.p2p src/p2p/net_node.inl:1413 [193.142.4.248:18180 OUT] Failed to HANDSHAKE with peer 193.142.4.248:18180
2023-10-10 19:07:00.033 [P2P0] ERROR net contrib/epee/include/net/levin_protocol_handler_async.h:351 [193.142.4.248:18180 OUT] [levin_protocol] -->> start_outer_call failed
2023-10-10 19:07:00.034 [P2P0] ERROR net contrib/epee/include/net/levin_protocol_handler_async.h:351 [193.142.4.248:18180 OUT] [levin_protocol] -->> start_outer_call failed
2023-10-10 19:07:00.034 [P2P0] ERROR net contrib/epee/include/net/levin_protocol_handler_async.h:351 [193.142.4.248:18180 OUT] [levin_protocol] -->> start_outer_call failed
2023-10-10 19:07:00.038 [P2P2] INFO net.cn src/cryptonote_protocol/cryptonote_protocol_handler.inl:2945 [193.142.4.248:18180 OUT] [0] state: closed in state before_handshake
2023-10-10 19:07:00.038 [P2P2] INFO net.p2p src/p2p/net_node.inl:2697 [193.142.4.248:18180 4ab669ef-5059-4991-bb3e-416abda9be2d OUT] CLOSE CONNECTION
So my best guess right now (without the debug build) is that the closed connection somehow caused the exception. In fact as I'm grepping the log for This seems to be the error log line:
And this seems to be the definition that catches the bad_weak_ptr exception that you can see in the if statement above (although I'm not sure)
Based on this search query I assume it's the only place this exception is caught in |
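The per-peer trace quoted in this comment can also be summarized programmatically instead of with a one-off `rg` search. The following is a minimal Python sketch, assuming every relevant line follows the same layout as the log lines quoted above (timestamp, thread, level, category, source location, bracketed peer, message). The names `LINE_RE`, `SAMPLE_LINES`, and `summarize_peers` are hypothetical helpers for illustration and are not part of monerod.

```python
import re
from collections import defaultdict

# Layout assumed from the quoted lines:
# date time [thread] LEVEL category file:line [ip:port ...] message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<category>\S+)\s+"
    r"(?P<source>\S+:\d+)\s+"
    r"\[(?P<peer>\d{1,3}(?:\.\d{1,3}){3}:\d+)[^\]]*\]\s+"
    r"(?P<msg>.*)$"
)

# Four of the lines quoted in the comment above, copied verbatim.
SAMPLE_LINES = [
    "2023-10-10 19:06:54.659 [P2P0] INFO net.p2p src/p2p/net_node.inl:2678 "
    "[193.142.4.248:18180 4ab669ef-5059-4991-bb3e-416abda9be2d OUT] NEW CONNECTION",
    "2023-10-10 19:07:00.032 [P2P8] WARNING net.p2p src/p2p/net_node.inl:1163 "
    "[193.142.4.248:18180 OUT] COMMAND_HANDSHAKE invoke failed. "
    "(-4, LEVIN_ERROR_CONNECTION_TIMEDOUT)",
    "2023-10-10 19:07:00.033 [P2P0] ERROR net "
    "contrib/epee/include/net/levin_protocol_handler_async.h:351 "
    "[193.142.4.248:18180 OUT] [levin_protocol] -->> start_outer_call failed",
    "2023-10-10 19:07:00.038 [P2P2] INFO net.p2p src/p2p/net_node.inl:2697 "
    "[193.142.4.248:18180 4ab669ef-5059-4991-bb3e-416abda9be2d OUT] CLOSE CONNECTION",
]


def summarize_peers(lines):
    """Group log lines by peer; count events and ERROR lines, flag levin timeouts."""
    peers = defaultdict(lambda: {"events": 0, "errors": 0, "timed_out": False})
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines that do not carry a bracketed peer address
        info = peers[m["peer"]]
        info["events"] += 1
        if m["level"] == "ERROR":
            info["errors"] += 1
        if "LEVIN_ERROR_CONNECTION_TIMEDOUT" in m["msg"]:
            info["timed_out"] = True
    return dict(peers)


if __name__ == "__main__":
    for peer, info in summarize_peers(SAMPLE_LINES).items():
        print(peer, info)
```

Running this over a full log (for example by passing the lines of an opened file) would show whether the peers that time out during the handshake are the same ones that appear right before the exceptions, which is the correlation guessed at above. It does not prove causation.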
@max-ishere Yes, currently Try running my #7345 patch. This changes the raw pointer usage to |
Actually, I'm not certain the code I was referencing could be responsible. It's almost certainly coming from the TCP server, just not certain where at the moment. However, if you could run #7345 and provide feedback, it would narrow things down somewhat. |
I'm trying to use monerod version 0.17.3.0 on FreeBSD 12 (amd64/x86_64). I need to synchronize 5+ months' worth of blockchain.
monerod starts to synchronize and shows good progress right after start, but then these exceptions start to occur:
Exceptions become more and more frequent, and after ~2 hours (sometimes less, sometimes more) monerod consumes 100% (!) of one core, printing out exceptions one after another, and stops making any synchronization progress. At the beginning monerod consumes only 1-2% of one core.
Please note that monerod logs the stack trace at one frame per second (!).
I've seen issue #6473 and tried to enable 1.25GB (1280 large pages) of locked memory for monerod, but it doesn't help much; it only defers the exceptions by 30-45 minutes from the daemon's start.
A log of sync progress and exceptions (all other lines are stripped) is here: monerod.flt.log.
NB: I've tried to build a debug build, but it fails to install all shared libraries and doesn't work at all, so the stack traces don't have function names.
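The claim that exceptions become more and more frequent can be checked against a log by counting matching lines per minute. The sketch below is in Python and rests on two assumptions: each log line starts with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp (as in the log lines quoted elsewhere in this thread), and exception lines can be recognized by a substring passed in by the caller. The function name `matches_per_minute` and the choice of substring are hypothetical, since the exact exception lines are not reproduced here.

```python
from collections import Counter
from datetime import datetime


def matches_per_minute(lines, needle):
    """Count lines containing `needle`, bucketed by the minute of their timestamp."""
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        try:
            # Assumed prefix: "YYYY-MM-DD HH:MM:SS.mmm" (23 characters).
            ts = datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")
        except ValueError:
            continue  # line has no parseable timestamp prefix
        counts[ts.strftime("%Y-%m-%d %H:%M")] += 1
    # Return buckets in chronological order.
    return dict(sorted(counts.items()))
```

Calling it as `matches_per_minute(open(path), needle)` with whatever substring marks an exception in the actual log would give a per-minute series. A steadily rising count alongside stalled sync progress would match the behavior described above.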