Actual behavior
Trace failed with "bundle reply is timeout"
Versions:
Please provide the following information:
Linux kernel version on the Kubernetes Nodes (uname -r). 4.15.0-88-generic
If you chose to compile the Open vSwitch kernel module manually instead of using the kernel module built into the Linux kernel, which version of the OVS kernel module are you using? Include the output of modinfo openvswitch for the Kubernetes Nodes.
@abhiraut Could you share environment info with @wenyingd and me? We have hit this issue before, but cannot analyze the root cause without access to OVS.
@abhiraut Wenying and I found the root cause.
I'll improve the reliability of the code snippet below in pkg/agent/openflow/packetin.go:
    wait.PollUntil(time.Second, func() (done bool, err error) {
        pktIn := <-ch
        for name, handler := range c.packetInHandlers {
            err = handler.HandlePacketIn(pktIn)
            if err != nil {
                klog.Errorf("PacketIn handler %s failed to process packet: %+v", name, err)
            }
        }
        return false, err
    }, stopCh)
You won't get this error if you have addressed the comment in #918 (review).
If you do get this error again, please work around it by changing return false, err to return false, nil.
The root cause is that an error occurred in a PacketInHandler, which caused the thread to exit the polling loop. There is a channel between the PacketInHandler and ofnet, and ofnet blocks while sending a new "PacketIn" message into that channel because no consumer is left on the other side. As a result, ofnet cannot handle the next inbound message. ofnet's outbound channel still works, so we can keep sending Bundle control messages to OVS, but ofnet never receives the replies to them, and Antrea reports the timeout error.
Describe the bug
Start a trace between two Pods using the Traceflow CRD. The status fails with the following error
To Reproduce
Following yaml used for Traceflow
Expected
Expected trace to succeed