-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exit if interface loses IP #10
Conversation
pkg/publisher/publisher.go
Outdated
// This shouldn't be possible, but we need to handle it anyway | ||
log.Printf("[ERR] mdns-publish: Failed to find own process") | ||
} | ||
err = process.Signal(syscall.SIGTERM) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it better to pass shutdown channel here and close is internally? Instead of looking for your own PID and commit suicide. Would save some lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe closing the shutdown channel would only stop the listening goroutine. The main thread blocks waiting for a signal (<-sig in https://github.com/openshift/mdns-publisher/blob/master/cmd/publish.go#L93). If we only close shutdown then we'd stop listening but keep running until a signal is delivered.
I also considered just calling Exit() here, but apparently that's the nuclear option and doesn't allow any cleanup to run. I don't know whether that would cause trouble, but it seemed like a graceful shutdown was less likely to cause problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, fair enough.
func IfaceCheck(ip net.IP, iface net.Interface) { | ||
for { | ||
i, err := FindIface(ip) | ||
if err != nil || i.Name != iface.Name { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's say we fail to find the interface since the IP is being moved and does not sit anywhere at the moment. What would happen with mdns on the next start?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confirmed that if the IP is not found mdns-publisher just exits immediately, so it would crash loop in that case until the IP gets configured.
A. Meaning that both IGMP messages and MDNS multicast packets will be sent on the specified interface ( instead of using routing table for that purpose). As per the Rx side, we use a UDP L4 socket with dest IP == MDNS multicast address, so we shouldn't care about the interface that the 'baremetal' IP is assigned to and neither the Rx network interface. B. So this PR makes sure that mdns and IGMP packets sent on the interface that baremetal IP is attached to and trigger the 'cacheFlush' process (as a result of mdns-publisher restart) C. I think that the comments in code should be updated ( e.g: 'The interface we were listening on changed'). [1] https://github.com/celebdor/zeroconf/blob/master/connection.go#L86 |
pkg/publisher/publisher.go
Outdated
@@ -73,6 +76,30 @@ func FindIface(ip net.IP) (iface net.Interface, err error) { | |||
return iface, fmt.Errorf("Couldn't find interface with IP address %s", ip) | |||
} | |||
|
|||
// If an interface changes after we've started listening on it (say because it | |||
// got bridged), we need to be restarted to listen on the new interface that | |||
// contains the specified IP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that comments should be updated here(something like - verify that we join to mdns multicast on the interface holds the IP)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update this.
cmd/publish.go
Outdated
@@ -89,6 +89,7 @@ var publishCmd = &cobra.Command{ | |||
}).Info("Publishing service") | |||
go publisher.Publish(ip, iface, service, shutdownChannel, waitGroup) | |||
} | |||
go publisher.IfaceCheck(ip, iface) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm new to GOLANG, so I'm not sure if it's a good idea.
Do you think that changing [1] to something similar to what we have in [2] , could be useful?
I mean something like that:
for {
select {
case <-shutdown:
'what we have do incase of shutdown'
default:
//call to any periodic function ( IfaceCheck, and we can add other functions in future)
IfaceCheck()
time.sleep(X)
}
[1]
mdns-publisher/pkg/publisher/publisher.go
Line 35 in cb9c32c
select { |
[2] https://github.com/openshift/baremetal-runtimecfg/blob/master/pkg/monitor/monitor.go#L49
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I wasn't familiar with that pattern in Go, but I think we can use it to eliminate the pid stuff in the check function. I'm not sure we want to get rid of the goroutine, but we can use another channel to do the communication instead of signaling. I'll include that change in the next version of this patch.
6eaa4ef
to
ad33ada
Compare
On startup, mdns-publisher looks up which interface has the configured IP. However, it is possible for this IP to be subsequently moved, such as when the interface gets added to a bridge. In that case, we need to be restarted and re-detect the bridge or our records become inaccessible. This change adds a goroutine that loops every five seconds checking if the detected interface still has the configured IP. If it does not, it closes a channel to indicate to the main thread that it should exit. This should allow kubernetes to restart it on the correct interface.
ad33ada
to
da14107
Compare
Okay, I think this version addresses the comments so far. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked it on dev-scripts env + CNV - looks good
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: celebdor, cybertron, phoracek, yboaron The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
On startup, mdns-publisher looks up which interface has the configured
IP. However, it is possible for this IP to be subsequently moved,
such as when the interface gets added to a bridge. In that case, we
need to be restarted and listen on the bridge or our records become
inaccessible.
This change adds a goroutine that loops every five seconds checking
if the detected interface still has the configured IP. If it does not,
the function sends a SIGTERM to the process to exit gracefully.
This should allow kubernetes to restart it on the correct interface.