Retry time fixes #4867
Update the big ledger peers state in case of an exception. Previously the publicRoot state was updated instead. This meant that after an error the node would never again attempt to increase the number of big ledger peers.
The bootstrap peers and public root peers should be different. Therefore the failure of one kind shouldn't automatically carry over to the other kind.
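A minimal sketch of the first change above, to make the intent concrete. The big ledger field names are taken from the diff in this PR; the record itself, the public root fields, the `bigLedgerRequestFailed` function and the way the backoff counter is incremented are simplifying assumptions, not the actual ouroboros-network code.

```haskell
-- Hypothetical, simplified state; times are seconds on an arbitrary clock.
data PeerSelectionState = PeerSelectionState
  { publicRootBackoffs          :: Int
  , publicRootRetryTime         :: Int
  , inProgressBigLedgerPeersReq :: Bool
  , bigLedgerPeerBackoffs       :: Int
  , bigLedgerPeerRetryTime      :: Int
  } deriving (Eq, Show)

-- | Handle a failed big ledger peers request.
-- Only the big ledger fields change; the public root fields are untouched,
-- so a failure of one kind does not carry over to the other.
bigLedgerRequestFailed :: Int -> Int -> PeerSelectionState -> PeerSelectionState
bigLedgerRequestFailed now delay st =
  st { inProgressBigLedgerPeersReq = False
     , bigLedgerPeerBackoffs       = bigLedgerPeerBackoffs'
     , bigLedgerPeerRetryTime      = bigLedgerPeerRetryTime'
     }
  where
    -- Assumed scheme: count consecutive failures upwards.
    bigLedgerPeerBackoffs'  = bigLedgerPeerBackoffs st + 1
    bigLedgerPeerRetryTime' = now + delay
```

Clearing `inProgressBigLedgerPeersReq` is what allows a later request to be issued; if the public root fields were updated instead, the flag would stay set and no further big ledger request would be made.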
Lower the maximum retry timer from over 1h to 4 minutes. Normally public roots have a TTL of 5s when taken from the ledger, but for bootstrap peers the TTL from DNS is returned. Since this can be hours or days, we cap it to 60s and depend on the backoff counter to increase it further if needed.
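To illustrate how the 60s cap and the 4 minute maximum interact, here is a small sketch. The 60s and 4 minute values come from the description above; the function and constant names are hypothetical, and doubling the delay per consecutive failure is an assumed backoff scheme rather than the one used in the code base.

```haskell
-- All names here are hypothetical; durations are in whole seconds.

-- | Upper bound applied to a TTL returned by DNS (60s).
ttlCap :: Int
ttlCap = 60

-- | Upper bound on the final retry delay (4 minutes).
maxRetryDelay :: Int
maxRetryDelay = 240

-- | Retry delay for a given TTL and number of consecutive failures.
-- Assumption: the capped TTL is doubled for each consecutive failure.
retryDelay :: Int -> Int -> Int
retryDelay ttl failures =
  -- The exponent is clamped to [0, 8] so the multiplication cannot overflow.
  min maxRetryDelay (min ttlCap ttl * 2 ^ min 8 (max 0 failures))
```

Under these assumptions a DNS TTL of one day gives delays of 60s, 120s and 240s for zero, one and two consecutive failures, and stays at 240s after that. A ledger TTL of 5s is below the cap and is only affected by the backoff.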
When calculating publicRoot lookup results, having only configured peers with IP addresses is not a DNS failure. Instead of a 3h TTL, use 60s for IP addresses only (it will be increased by the backoff logic).
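The distinction drawn in the last change can be sketched as follows. All types and names below are hypothetical and only model the decision described above: a configuration with no domain names has nothing to resolve, so it is classified separately from a DNS failure and gets a 60s base TTL. What TTL a real DNS failure receives is not stated here, so the sketch leaves it to the backoff logic.

```haskell
-- Hypothetical types and names; TTLs are in seconds.
data RootPeer
  = RootAddress String   -- configured with a literal IP address
  | RootDomain  String   -- configured with a DNS name
  deriving (Eq, Show)

data LookupOutcome
  = IpAddressesOnly      -- nothing needed resolving
  | Resolved Int         -- smallest TTL among the successful DNS answers
  | DnsFailure           -- domains were configured but none resolved
  deriving (Eq, Show)

-- | Classify a lookup from the configured peers and the TTLs of the
-- DNS answers that succeeded.
classifyLookup :: [RootPeer] -> [Int] -> LookupOutcome
classifyLookup peers dnsTtls
  | null [d | RootDomain d <- peers] = IpAddressesOnly
  | null dnsTtls                     = DnsFailure
  | otherwise                        = Resolved (minimum dnsTtls)

-- | Base TTL before backoff; Nothing leaves a failure to the backoff logic.
baseTtl :: LookupOutcome -> Maybe Int
baseTtl IpAddressesOnly = Just 60
baseTtl (Resolved ttl)  = Just ttl
baseTtl DnsFailure      = Nothing
```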
d32f6a1 to 0ef9e22
These all look sensible to me :)
inProgressBigLedgerPeersReq = False,
bigLedgerPeerBackoffs = bigLedgerPeerBackoffs',
bigLedgerPeerRetryTime = bigLedgerPeerRetryTime'
Nice catch!
Very nice.
If we are above the credits, instead of failing instantly, check whether any of the events provide us with more credits. This makes the test suite more robust in case one kind of event happens close to another kind of event, for example a burst of peer sharing activity just before the node starts to look for new public roots.
One of the tests failed on Hydra
This breakage seems to have been introduced in a38e5a1; that is, it exists in master.
I have looked into it, and it seems to happen in an edge case where a trusted local root peer is in progress to promote to cold and the targets then change to be all 0 (due to the QuickCheck arbitrary script). This will clamp the local roots to the
Description
Fix problems related to big ledger and public root peers retry time.