Node v.11.2 died after 3h #780
Comments
@stefonarch Could you chime in on the discussion in #743? What's your bandwidth data like after updating to v11.1 and v11.2? |
Weird. My node has been running smoothly after the 11.2 upgrade. No crashes or problems whatsoever. |
I guess I've jinxed it - my node (11.2) crashed earlier today as well. No particularly relevant info in the logs :( |
Anything in dmesg logs? |
Maybe:
This is on a 1GB RAM VPS that used to run all previous versions perfectly well. Memory usage doesn't look that much different with v11.2 compared to v11.1 and previous versions. Also, @stefonarch runs the node on machines with a bit more RAM. |
@stefonarch did the OOM killer take down yours as well? 1GB is awfully tight, and so is 4GB if running with a ramdisk. Does it fare better if not using tmpfs? |
I've just upgraded to a 2GB VPS - will continue to monitor --> http://138.197.179.164/ Here's a CPU / memory log. Node was running OK for several days, then CPU usage and memory started to spike which led to the OOM crash (~6pm). |
On http://wehavethetechnology.io there is the following comment: "Nodes seem to be getting behind/crashing when unchecked blocks are flushed." I did indeed flush unchecked blocks yesterday via RPC. Maybe that gives a hint towards the problem. |
@dbachm123 That one is my node, and it has happened to me twice. Basically, I would see an "Unchecked blocks flushed." message but would not get the normal confirmation that it was completed. The node continues to be able to connect with other reps, but it does not check any blocks. The node then shuts down after a broken pipe message not too long after. I haven't had any problems since rebooting the system and restarting the node. I've seen one other person have the same log error. I think it was Prometheus on Discord. For everyone else, it seems like the node's resources get used up and then it crashes. I have not seen their logs, but it seems similar because the node will still be responsive to RPC commands but will not vote or check blocks. |
Also, the list of trusted reps at https://nanode21.cloud/representatives.php shows many offline nodes. And I haven’t seen any of those nodes offline before. |
Ehm, there was a bug until 2 days ago that always showed green dots... but yes, nodes need at least some crontab watching them. Personally, I haven't seen any more crashes on any of my 3 nodes. EDIT: oops... 1gb 1core 11.2:
|
@cryptocode OOM (out of memory?) killer? Never. |
@stefonarch ok, dbachm123 had an oom-killer entry in his dmesg logs, figured I'd ask. |
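For anyone who wants to check for the same thing, here is a minimal Python sketch that scans kernel log output for OOM-killer entries. It assumes a Linux host where `dmesg` is readable by the current user (some systems require root), and the two marker strings are the ones the kernel commonly prints; the function names are hypothetical.

```python
import subprocess

# Substrings the Linux kernel commonly logs when the OOM killer fires.
OOM_MARKERS = ("oom-killer", "out of memory")


def find_oom_lines(log_text):
    """Return the log lines that mention the OOM killer (case-insensitive)."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line.lower() for marker in OOM_MARKERS)
    ]


def read_dmesg():
    """Return the kernel ring buffer as text (may need root on some systems)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `find_oom_lines(read_dmesg())` returns an empty list if the kernel has not logged an OOM kill since boot (or since the ring buffer wrapped).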
Thanks @stefonarch |
Problem started after I flushed/cleared 20000+ unchecked blocks I had since Jan. Could that be the reason? |
I’m having the same issue. I upgraded to 11.2 and periodically my node will crash and die. I issue a docker restart and everything is OK for an indeterminate amount of time. I’m hosted on a DigitalOcean VPS on Ubuntu with 1GB of RAM. Releases prior to this ran fine, including 11.0 |
I'm trying my best to figure out how to run a node. Where do I click lol. Code is completely new to me but I have a vision and drive to make, create, run a node, blockchain biz. Or help the best out there, please help me out, and I am doing everything on Google Pixel XL 2.
|
@BeeChains Wrong place to post that question. You cannot run a node on a mobile device. You need to host it on a server, like a Digital Ocean VPS: https://medium.com/@seanomlor/how-to-run-your-own-raiblocks-node-on-digitalocean-6a5a2492c29b |
I've seen several mentions of flushing unchecked blocks, but can't find out how to do this. I have 20k unchecked blocks for some reason, and the count consistently stays at that value. Looking at other nodes, they have far fewer than this, so I'm assuming something is wrong. |
@tmchow you send an "unchecked_clear" RPC command. AFAIK that is not the same as when the node does its own flushing of unchecked blocks. I would restart the node and see if that fixes your issue. |
@oFLIPSTARo I hit this again... looks like my node crashed this morning despite restarting the node. It was offline for about 8 hours. I just restarted the container now. I don't know how to issue an RPC command to the container. Are there some simple steps you could outline? |
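As a rough illustration of issuing an RPC command, here is a minimal Python sketch. It assumes the node's RPC server is enabled and reachable at `http://localhost:7076` (with Docker, the RPC port would have to be published to the host); the URL and the helper names are assumptions, while `unchecked_clear` and `block_count` are the RPC actions.

```python
import json
import urllib.request

# Assumption: RPC is enabled and reachable here; adjust to your port mapping.
RPC_URL = "http://localhost:7076"


def build_rpc_request(action, url=RPC_URL, **params):
    """Build a POST request with a JSON body such as {"action": "block_count"}."""
    body = json.dumps({"action": action, **params}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )


def rpc_call(action, url=RPC_URL, timeout=10, **params):
    """Send the RPC request and return the node's reply decoded from JSON."""
    request = build_rpc_request(action, url, **params)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires a running node):
#   rpc_call("block_count")      # reports checked and unchecked block counts
#   rpc_call("unchecked_clear")  # clears the unchecked blocks; use with care
```

Given the reports in this thread of nodes stalling after a flush, it may be worth checking `block_count` first and only clearing unchecked blocks deliberately.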
Yeah same problem for me. I have a cronjob restarting the node every 30min because every 60min was too long! |
Here's a very rough version of my watchdog scripts, which run as a cron job. They will not work out of the box due to some hardcoded paths, but they might give you a starting point to better watch the node. |
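For readers without access to those scripts, the general watchdog idea can be sketched in Python as follows. This is not the script mentioned above, only an illustration under stated assumptions: the node runs in a Docker container with the hypothetical name `rai_node`, and its RPC server answers at `http://localhost:7076`. Note that it only catches a dead or unresponsive node; a node that still answers RPC but has stopped checking blocks, as described earlier in this thread, would need an extra check such as comparing block counts between runs.

```python
import json
import subprocess
import urllib.request

RPC_URL = "http://localhost:7076"  # assumption: RPC enabled and published
CONTAINER = "rai_node"             # hypothetical container name


def node_is_responsive(url=RPC_URL, timeout=10):
    """Return True if the node answers a block_count RPC call with a count."""
    body = json.dumps({"action": "block_count"}).encode("utf-8")
    request = urllib.request.Request(url, data=body)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            reply = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, timeout, or a reply that is not valid JSON.
        return False
    return isinstance(reply, dict) and "count" in reply


def restart_node(container=CONTAINER):
    """Restart the node's container via the Docker CLI."""
    subprocess.run(["docker", "restart", container], check=True)


def watchdog(probe=node_is_responsive, restart=restart_node):
    """Restart the node if the probe fails; return True when a restart ran."""
    if probe():
        return False
    restart()
    return True
```

Run from cron every few minutes, a single `watchdog()` call per invocation is enough; passing the probe and restart functions as arguments keeps the logic easy to try out without touching a real node.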
@dbachm123 Thanks, I'm running your code now. Works great! |
Just chiming in here with some strange activity. My node crashed and restarted last night, and when it did, it looks like it lost about 200k blocks from the reported block count. It's no longer catching up either. |
Commit f749697 is running very smoothly on my rep node. All previously observed issues are gone 👍 |
No issue for some time now. Close this? |
Overcome by newer versions... |
Description of bug:
Node just died without messages in the log after running for 3 hours. Compiled from git master; commit 84a6b51
Additional information you deem important (e.g. issue happens only occasionally):
System load seems significantly higher than with 11.1.1, sometimes 90%-180% in top. I will attach a graph in a few hours, when it is more useful.
Environment:
Arch Linux Server, 4gb ram, SSD
CPU(s): 2 Single core Intel Xeon E5-2680 v2s (-MT-SMP-) cache: 32768 KB
clock speeds: max: 2799 MHz 1: 2799 MHz 2: 2799 MHz
Raiblocks folder stored in ramdisk.
Node Monitor: https://nanode21.cloud/stats.php
logs
https://nanode21.cloud/11.2.log