"Peer is already bound to another channel" #3979
Comments
Once a backoff completely fails, it is not restarted automatically. It will be reset upon the next successful interaction with the [...] I think this works as expected. We have to give up talking to the relay at some point if all we are getting is timeouts.
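For illustration only, the behaviour described in this comment can be sketched as a backoff that gives up once a time budget is spent and only starts over after an explicit reset. The `Backoff` type, its fields and the `max_elapsed` budget are hypothetical names chosen for this sketch, not the project's actual implementation.

```rust
use std::time::Duration;

/// Hypothetical exponential backoff with a fixed total time budget.
#[derive(Debug)]
pub struct Backoff {
    initial: Duration,
    current: Duration,
    elapsed: Duration,
    max_elapsed: Duration,
}

impl Backoff {
    pub fn new(initial: Duration, max_elapsed: Duration) -> Self {
        Self {
            initial,
            current: initial,
            elapsed: Duration::ZERO,
            max_elapsed,
        }
    }

    /// Returns the next wait time, or `None` once the budget is exhausted.
    /// After the first `None`, every further call also returns `None`:
    /// the backoff does not restart by itself.
    pub fn next_backoff(&mut self) -> Option<Duration> {
        if self.elapsed + self.current > self.max_elapsed {
            return None;
        }
        let wait = self.current;
        self.elapsed += wait;
        self.current *= 2; // double the wait for the next attempt
        Some(wait)
    }

    /// Called after a successful interaction; only this makes retries possible again.
    pub fn reset(&mut self) {
        self.current = self.initial;
        self.elapsed = Duration::ZERO;
    }
}
```

With a 1 second initial wait and a 7 second budget, this sketch yields waits of 1, 2 and 4 seconds and then gives up until `reset` is called.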
Closing as can't reproduce.
The more interesting logs are the following:
It would be interesting to see the related logs on the relay for these two transaction IDs.
I can only see IPv4 traffic in these logs, so all the timeouts are likely for the IPv6 relays. I am improving the logs to make that more obvious.
@thomaseizinger Here are the matching logs -- looks like this particular case went to europe-west-1d:
Interesting. I'll look into that. It appears that we are not advancing the channel numbers correctly somewhere.
@thomaseizinger Yeah it looks like it's happening a lot:
This is still an issue. More recent logs:
The result of this is that the gateway becomes unable to establish connections. The other error seen during this scenario is "Channel is already bound to a different peer". The two cases are checked here and here. It seems like for this to happen, either:
Previously, the relay neither scheduled a `Wake` command nor did it register a `TimedAction` to expire a channel binding. Such an action was only scheduled after the first refresh. This PR fixes this and adds a test that asserts we can re-bind the same channel to a different peer after 15 minutes. Resolves: #3979.
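As a rough sketch of the two error cases and the expiry behaviour discussed in this thread, the following models channel bindings with a timestamp taken at bind time, so a binding expires even if it is never refreshed. All names (`Channels`, `Binding`, `BindError`) are hypothetical, and the split into a 10 minute lifetime plus a 5 minute cool-down is an assumption based on the TURN RFCs that adds up to the 15 minutes mentioned above. Unlike the relay, which schedules a `TimedAction`, this sketch simply checks timestamps lazily.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::time::{Duration, Instant};

/// Assumed binding lifetime (10 minutes).
pub const BINDING_LIFETIME: Duration = Duration::from_secs(10 * 60);
/// Assumed cool-down after expiry before the channel may be reused (5 minutes).
pub const REBIND_COOLDOWN: Duration = Duration::from_secs(5 * 60);

#[derive(Debug, PartialEq, Eq)]
pub enum BindError {
    /// "Peer is already bound to another channel"
    PeerAlreadyBound,
    /// "Channel is already bound to a different peer"
    ChannelAlreadyBound,
}

struct Binding {
    peer: SocketAddr,
    bound_at: Instant, // recorded on the first bind, not only on refresh
}

#[derive(Default)]
pub struct Channels {
    by_number: HashMap<u16, Binding>,
}

impl Channels {
    /// Drops every binding whose lifetime and cool-down have both passed.
    pub fn expire(&mut self, now: Instant) {
        self.by_number.retain(|_, b| {
            now.saturating_duration_since(b.bound_at) < BINDING_LIFETIME + REBIND_COOLDOWN
        });
    }

    /// Binds `number` to `peer`, or refreshes an identical existing binding.
    pub fn bind(&mut self, number: u16, peer: SocketAddr, now: Instant) -> Result<(), BindError> {
        self.expire(now);

        if let Some(existing) = self.by_number.get(&number) {
            if existing.peer != peer {
                return Err(BindError::ChannelAlreadyBound);
            }
        }
        if self.by_number.iter().any(|(n, b)| *n != number && b.peer == peer) {
            return Err(BindError::PeerAlreadyBound);
        }

        self.by_number.insert(number, Binding { peer, bound_at: now });
        Ok(())
    }
}
```

If the expiry step never ran for bindings that were not refreshed, stale entries would stay in the map and both error cases would keep firing for channel numbers and peers that should be free again, which matches the symptom reported in this issue.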
Describe the bug
Gateway stops allowing connections
To Reproduce
Run a Gateway for a few days with heavy usage
Expected behavior
Connections resume once the backoffs subside
Screenshots / Logs
https://firezonehq.slack.com/archives/C0691K7382G/p1709611053234289
Platform (please complete the following information)
Additional context