Wait for OVS bridge datapath ID to be available after creating br-int #6472
We wait (for a maximum of 5s) for the datapath_id of the br-int OVS bridge to be reported in OVSDB, after creating the bridge and before checking supported datapath features. This prevents errors when querying the supported features before the ofproto-dpif provider has been initialized.
Fixes antrea-io#6471
Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
@@ -317,6 +317,45 @@ func (br *OVSBridge) GetDatapathID() (string, Error) {
	}
}

func (br *OVSBridge) WaitForDatapathID(timeout time.Duration) (string, Error) {
There is a `GetDatapathID()` method and a `GetOFPort(ifName string, waitUntilValid bool)` method. Does it make sense to add a `waitUntilValid` to the former to unify them and probably to reduce some code?
I am not a big fan of the current `GetOFPort` method, but maybe I can come up with a solution that I like that avoids the code duplication.
I couldn't find a simple solution I really liked. `GetOFPort` is a bit different from `GetDatapathID` / `WaitForDatapathID`. In `GetOFPort` we always use a `Wait` in the transaction, and `waitUntilValid` only determines the wait condition. For Linux, we wait for the value to be non-empty; for Windows, we wait for the value to be different from -1 (which indicates that the interface doesn't exist).
Here for the datapath ID, if we want one version with wait and one version without wait, then we have significant differences in the implementations, because one version has a single operation in its OVSDB transaction and one version has two operations. I feel like it is not worth unifying in this case, but let me know what you think.
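For readers less familiar with OVSDB, the "two operations" version can be pictured as a `wait` operation followed by a `select` in the same transaction, using the operation format from the OVSDB management protocol (RFC 7047). The sketch below only builds the JSON parameters; the `buildWaitTransaction` helper is hypothetical, and the exact operations and conditions used in this PR are an assumption on my part.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildWaitTransaction returns the params of an OVSDB "transact" request with
// two operations: wait until datapath_id is no longer the empty set, then
// select it. Hypothetical helper; the PR's real transaction may differ.
func buildWaitTransaction(bridge string, timeoutMs int) []interface{} {
	// An unset optional column is encoded as the empty set in OVSDB JSON.
	emptySet := []interface{}{"set", []interface{}{}}
	where := []interface{}{[]interface{}{"name", "==", bridge}}

	wait := map[string]interface{}{
		"op":      "wait",
		"timeout": timeoutMs, // milliseconds
		"table":   "Bridge",
		"where":   where,
		"columns": []string{"datapath_id"},
		"until":   "!=",
		"rows": []interface{}{
			map[string]interface{}{"datapath_id": emptySet},
		},
	}
	sel := map[string]interface{}{
		"op":      "select",
		"table":   "Bridge",
		"where":   where,
		"columns": []string{"datapath_id"},
	}
	// The first element of the params array is the database name.
	return []interface{}{"Open_vSwitch", wait, sel}
}

func main() {
	params := buildWaitTransaction("br-int", 5000)
	out, err := json.MarshalIndent(params, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The version without a wait would contain only the `select` operation, which is the structural difference that makes a shared implementation awkward.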
Got it, I was just not sure if you missed it. No problem from me if it looks worse by unifying them.
LGTM
/test-all
I am not backporting this based on the risk-return tradeoff. The issue is quite rare and even when it does happen, older versions of Antrea can recover automatically and very quickly. I have also only observed the issue on "fresh" installations so far, even though it may (should?) be possible for the issue to also happen during restarts / upgrades.