Reports of Windows users losing peers #13431 #13936
Comments
I have had this same problem of losing peers on Prysm and Geth for months. I have a Windows 10 PC and a Windows 11 PC, and both ran Prysm and Geth smoothly until January, when they began losing peers. This bug does not seem to happen on every Windows machine: I have a Windows server in NYC with the same configs and even the same database files, and everything works fine there. For those who are suffering from this issue, I have a temporary workaround until the bug is fixed (using Lighthouse and static peers):
Thanks so much for the workaround! I've been suffering for months lol. At least now I know I'm not crazy and this is definitely a real issue. Lighthouse, here we come.
Having the EXACT same issue. I have been validating on Windows without any issues since testnet and genesis. Now, all of a sudden over the last few days, peers drop to zero and I must restart the Prysm validator to fix it. Even after a restart, it fails again within a day or so and needs another restart. Tagging myself in here to see if a real solution comes along; I'm not a big fan of setting static nodes.
Edit: I read another thread which hinted that this could be time related. Since genesis I have been syncing time with Nettime without a single issue, always staying within milliseconds of actual time, so the time drift is never larger than 20-30ms. So time drift doesn't seem to be the cause here? I'm trying a different time server (I had Google, switched to 0.nettime.pool.ntp.org) to see if that resolves it. But right now I must restart every day to keep it attesting efficiently.
#14025 is a new report of the same issue.
Describe the bug
I spoke too soon. I'm still having the same problem, and even changing ISP did not help. I've worked with Nishant on this for a long time and we tried almost every trick in the book. At one point, after changing some settings on net-time (time drift), it worked perfectly for 2 days, which was the longest stretch. The only thing that halfway works now is to literally shut down the node every 4 hours and restart it using Task Scheduler, which is not a very healthy way to run a validator.
Background
Opening this because the problem is nuanced and has not been widespread; however, it does happen, seemingly at random, to some Windows users, where the OS appears to drop all packets from Prysm for some reason. This results in a reduced peer count, and eventually all peers are lost. The root cause is currently unknown, so this is opened as a tracking issue.
A non-trivial number of Windows users have complained that their number of peers slowly drops to zero. v1.05 had some important changes to stream management and subnet search. However, none of these components should have had any material impact on discovery. Windows machines, which are more susceptible to clock drift, suddenly stop being able to find new peers for some reason. Being minutes ahead of or behind network time should by rights have no effect on finding new peers.
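Since clock drift keeps coming up in these reports, here is a minimal sketch of how a local clock offset is usually estimated from a single NTP-style exchange, using the standard four-timestamp formula. This is not Prysm code; the function names and the synthetic timestamps are assumptions made up for illustration, and it only shows the arithmetic, not an actual network query.

```python
from datetime import datetime, timedelta, timezone


def ntp_clock_offset(t0, t1, t2, t3):
    """Estimate how far the server clock is ahead of the local clock.

    t0: request sent (local clock)
    t1: request received (server clock)
    t2: reply sent (server clock)
    t3: reply received (local clock)
    A positive result means the local clock is behind the server.
    """
    return ((t1 - t0) + (t2 - t3)) / 2


def round_trip_delay(t0, t1, t2, t3):
    """Network round-trip time, excluding the server's processing time."""
    return (t3 - t0) - (t2 - t1)


# Synthetic exchange (all values are assumptions for illustration):
# the local clock is 25 ms behind, each network leg takes 10 ms,
# and the server spends 1 ms handling the request.
ms = timedelta(milliseconds=1)
true_skew = 25 * ms
one_way = 10 * ms
server_processing = 1 * ms

t0 = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
t1 = t0 + true_skew + one_way
t2 = t1 + server_processing
t3 = t0 + one_way + server_processing + one_way

offset = ntp_clock_offset(t0, t1, t2, t3)
delay = round_trip_delay(t0, t1, t2, t3)
print(f"offset={offset.total_seconds() * 1000:.0f} ms, "
      f"delay={delay.total_seconds() * 1000:.0f} ms")
```

Logging an offset like this alongside the peer count over time could help confirm or rule out drift as a factor on an affected machine.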
related prysmaticlabs/documentation#891
related #8144
Has this worked before in a previous version?
🔬 Minimal Reproduction
Error
Platform(s)
Windows 10 (x86)
What version of Prysm are you running? (Which release)
latest version
Anything else relevant (validator index / public key)?
https://beaconcha.in/validator/26606#attestations