Reject duplicate P4Runtime election IDs from multiple controllers #857
Codecov Report
@@ Coverage Diff @@
## main #857 +/- ##
=======================================
Coverage 78.55% 78.56%
=======================================
Files 334 334
Lines 30054 30057 +3
=======================================
+ Hits 23610 23613 +3
Misses 6444 6444
@pierventre FYI
@pudelkoM regarding your additional comments, I agree on using a well tested library. I am not fully sure about your final remarks - in theory an ONOS cluster should not need this and each instance can potentially guess the election id of the other instances. Happy to discuss today. Thanks a lot for this hot fix.
tested - works great
LGTM
P4Service currently does not reject duplicate election IDs from controllers. This leads to data corruption when a slave controller sends a MastershipUpdate with the same election ID as the current master:
AddOrModifyController 1, 2222, ipv4:127.0.0.1:56994.
Controller (2, 2222, ipv4:127.0.0.1:56994), Controller (3, 1212, ipv4:127.0.0.1:56994), Controller (1, 1111, ipv4:127.0.0.1:56994), .
Controller (2, 2222, ipv4:127.0.0.1:56994), Controller (3, 1212, ipv4:127.0.0.1:56994), .
Controller (2, 2222, ipv4:127.0.0.1:56994), Controller (3, 1212, ipv4:127.0.0.1:56994), .
Controller (connection_id: 1, election_id: 2222, uri: ipv4:127.0.0.1:56994) is connected as MASTER for node (aka device) with ID 123123123.
Controller #2 mastership update: arbitration { device_id: 123123123 election_id { low: 2222 } status { } }
Controller #3 mastership update: arbitration { device_id: 123123123 election_id { low: 2222 } status { code: 6 message: "You are not my master!" } }
ERR_PERMISSION_DENIED errors

This happens because we store Controllers in a `set` with a custom comparison function. Controllers are not hashed, but only compared by their election ID. Thus, the set treats two controllers with the same election ID as the same element and cannot hold both.

stratum/stratum/hal/lib/common/p4_service.h
Lines 235 to 240 in 34a75cd
stratum/stratum/hal/lib/common/p4_service.h
Lines 81 to 87 in 34a75cd
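The set behaviour described above can be reproduced in isolation. The following is a minimal sketch, not the Stratum code: `Controller`, `ByElectionId`, `ControllerSet`, `TryAdd` and `Master` are hypothetical names, and it assumes an ordered `std::set` whose comparator looks only at the election ID (highest first).

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>

// Hypothetical stand-in for the controller record; not Stratum's class.
struct Controller {
  std::uint64_t connection_id;
  std::uint64_t election_id;
  std::string uri;
};

// Orders controllers by election ID only, highest first. Two controllers
// with equal election IDs are equivalent as far as the set is concerned.
struct ByElectionId {
  bool operator()(const Controller& a, const Controller& b) const {
    return a.election_id > b.election_id;
  }
};

using ControllerSet = std::set<Controller, ByElectionId>;

// Returns true if the controller was stored, false if an entry with an
// equivalent election ID was already present (the set is left unchanged).
bool TryAdd(ControllerSet& controllers, const Controller& c) {
  return controllers.insert(c).second;
}

// The first element has the highest election ID, i.e. the master.
const Controller& Master(const ControllerSet& controllers) {
  assert(!controllers.empty());
  return *controllers.begin();
}
```

Replaying the log with this sketch: adding controllers (2, 2222) and (3, 1212) succeeds, while adding (1, 2222) returns `false` and leaves the set with two entries. If the return value is ignored, the caller believes controller 1 was stored although it was not.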
This PR fixes this by checking whether the insertion succeeded. A failed insertion indicates a duplicate election ID, in which case we close the stream, all according to spec:
P4RT Spec - Section 5.3:
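A rough sketch of the insertion check, under stated assumptions: `ControllerRegistry`, `StreamAction` and the `fix_sketch` namespace are hypothetical and only illustrate the idea; the actual change lives in `p4_service`. The erase-then-insert order is inferred from the log above, and my reading of the spec is that the stream should be terminated with `INVALID_ARGUMENT`, so treat the exact status code as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

namespace fix_sketch {

// Hypothetical outcome type; real code would return a gRPC status instead.
enum class StreamAction { kKeepOpen, kCloseInvalidArgument };

struct Controller {
  std::uint64_t connection_id;
  std::uint64_t election_id;
};

// Compares by election ID only, highest first.
struct ByElectionId {
  bool operator()(const Controller& a, const Controller& b) const {
    return a.election_id > b.election_id;
  }
};

class ControllerRegistry {
 public:
  StreamAction AddOrModify(const Controller& c) {
    // Remove any older entry that belongs to the same connection.
    for (auto it = controllers_.begin(); it != controllers_.end(); ++it) {
      if (it->connection_id == c.connection_id) {
        controllers_.erase(it);
        break;
      }
    }
    // insert() fails when another controller already holds this election
    // ID. Report that to the caller instead of silently dropping it.
    const bool inserted = controllers_.insert(c).second;
    return inserted ? StreamAction::kKeepOpen
                    : StreamAction::kCloseInvalidArgument;
  }

  std::size_t size() const { return controllers_.size(); }

  // Connection ID of the controller with the highest election ID.
  std::uint64_t MasterConnectionId() const {
    assert(!controllers_.empty());
    return controllers_.begin()->connection_id;
  }

 private:
  std::set<Controller, ByElectionId> controllers_;
};

}  // namespace fix_sketch
```

With the three controllers from the log registered, a call such as `AddOrModify({1, 2222})` returns `kCloseInvalidArgument`, and controller 2 remains the master.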
Additional considerations
Stratum is not fully P4RT spec compliant, with or without this fix. While this change brings us closer, a gap remains. At some point we should consider delegating P4RT related code to an external, well tested library.
Outright closing the stream without telling the offending controller the currently highest election ID puts the controller in an interesting situation: