New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rehandshaking #838
Rehandshaking #838
Conversation
119b60c
to
a4fcdfe
Compare
a4fcdfe
to
d4bfad6
Compare
…send a CreateRelayResponse to the relayed peer.
7eacf6b
to
819fc71
Compare
how is age of the cert determined? |
Thanks for the contribution! Before we can merge this, we need @brad-defined to sign the Salesforce.com Contributor License Agreement. |
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM and working live
|
||
switch decision { | ||
case deleteTunnel: | ||
n.hostMap.DeleteHostInfo(hostinfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use n.intf.closeTunnel(hostinfo)
here which also does f.lightHouse.DeleteVpnIp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't want to use that one specifically but I could see a good argument for dropping it from the lighthouse cache. The only time we hit this case today is when we aren't receiving traffic even after sending a test packet and waiting some time.
I can spin another PR pretty quick to take care of that after merging this.
f.SendMessageToVpnIp(header.Control, 0, target, msg, make([]byte, 12), make([]byte, mtu)) | ||
f.SendMessageToHostInfo(header.Control, 0, peer, msg, make([]byte, 12), make([]byte, mtu)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this a bug that it was target
before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was mostly not a bug? idk. But I think it's better this way.
'peer' is a HostInfo object, and 'target' is the VPN IP address of that HostInfo, so conceptually it's sending the message to the same target. Note the function call also changed from SendMessageToVpnIp to SendMessageToHostInfo. The difference is that SendMessageToVpnIp will always send a message to the top HostInfo in its list, while SendMessageToHostInfo will send using a specific HostInfo that may or may not be the top one in the list.
We now have multiple HostInfos for the same VPN IP in the main hostmap, and we want to be 100% sure that we're sending the CreateRelayRequest over the tunnel associated with the HostInfo on which we're storing relay state (in the AddRelay() call up in line 264).
If a new HostInfo were to successfully handshake and move to the front of the list after AddRelay() but before SendMessageToVpnIP(), then using SendMessageToVpnIp (which always sends to the top HostInfo in the list) we'd send the CreateRelayRequest to the peer's HostInfo for which our corresponding HostInfo lacked the Relay state. Oof, hope that makes sense.
Co-authored-by: Brad Higgins <brad@defined.net> Co-authored-by: Wade Simmons <wadey@slack-corp.com>
Nebula can soft-reload its host certificate on a HUP signal. When it reloads a certificate, any new peer tunnels created will use the updated certificate. However, any previously established tunnels will not be updated with this new certificate information.
With this change, Nebula will detect when an older certificate is in use on an existing tunnel, and rebuild the tunnel with the newly updated host certificate. Once hosts have established a new tunnel, packet traversal is migrated to use the new tunnel, and the old tunnel is torn down when its lack of use is detected.
This PR also includes migrating relay information from the previously established tunnel to the new tunnel. A relay established over an older tunnel will be migrated over to the new tunnel, ensuring that the old tunnel can be torn down from lack of use.