-
Notifications
You must be signed in to change notification settings - Fork 260
feat: [NPM] Reset ipsets & update a Prometheus function ResetIPSetEntries (previously just used for UTs) #1108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| metrics.ResetIPSetEntries() | ||
| return nil | ||
| } | ||
| creator, originalNumSets, destroyFailureCount := iMgr.fileCreatorForReset(searchResults) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
creator naming will be specific to context of this function (ipset), for example, ipSetRestorer.
But, it is fine as is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps we can rename all the functions and variables with (file)creator in them, but this should be a separate cleanup PR
| hashedSetName := string(line) | ||
| names = append(names, hashedSetName) | ||
| // error handlers specific to resetting ipsets | ||
| errorHandlers := []*ioutil.LineErrorHandler{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work to classify all errors and corresponding actions.
I understood the high-level ideas about LineErrorHandler which holds necessary error handling information to the corresponding executed line when there is error.
While I am fine as is, but just curious.
Do we consolidate all error handlings code without this errorHandlers here - https://github.com/Azure/azure-container-networking/blob/master/npm/pkg/dataplane/ioutil/restore_linux.go#L199?
For maintenance, it would be nice to consolidate error handling in one places if possible.
I thought Callback has only error message, but it changed destroyFailureCount in some places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some callbacks modify destroyFailureCount because the error is a kind of failure to delete (as opposed to the set not existing, which is technically an error but not a failure in our eyes). The line you reference is code which recognizes which line an error happened on and is somewhat unrelated to the error handlers. Did this answer your question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. I understood the intention of codes.
My fundamental comments are whether we can consolidate all error definitions and handlers in one place per ipset and iptables instead of sparsely declaring them in each function and using Callback.
It will be easier to understand and maintain codes, but I may miss somethings and it will need big changes.
wdyt?
BTW, it is fine as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still might not fully understand your question. To try to help consolidate, for creatorForApply, I group all of the error handling code at the bottom of the file. I can't move the error handling code out of creatorForReset because of the dependency on destroyFailureCount.
We could consolidate the definition and method (e.g. ContinueAndAbortSection) but the callback will always be unique and will eventually do more than logging for most (e.g. mark a set as needing reconcile).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for instance though, we could make this error handler a function of the set name and destroyFailureCount pointer though
| // create restore file that flushes all sets, then deletes all sets | ||
| // technically don't need to flush a hashset | ||
| // this needs to be a separate function because we need to check creator contents in UTs | ||
| func (iMgr *IPSetManager) fileCreatorForReset(ipsetListOutput []byte) (creator *ioutil.FileCreator, originalNumAzureSets int, destroyFailureCount *int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as discussed offline avoid usage of named returns, and potentially just return creator, originalNumAzureSets and destroyFailureCounts are purely used for for metrics tracking and don't particularly related to the usage of creator in clients
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we fail the lint check in favor of unnamed returns? Also, I'm not sure how we could decouple the numbers for metrics from the function output because we set metrics based on the success/failure of creator.RunCommand
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
| zero := 0 | ||
| destroyFailureCount = &zero | ||
| creator = ioutil.NewFileCreator(iMgr.ioShim, maxTryCount, ipsetRestoreLineFailurePattern) | ||
| func (iMgr *IPSetManager) fileCreatorForReset(ipsetListOutput []byte) (*ioutil.FileCreator, int, *int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we do not return pointer.
(*ioutil.FileCreator, int, *int) -> (*ioutil.FileCreator, int, int)and can use non-pointer operation in caller as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might need the pointer because the destroyFailureCount is updated after we get the creator and run creator.Run(). I'll add a clarifying comment next PR
JungukCho
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm! You can resolve my last comment in next PR if it makes sense.
However, please check @matmerr comments.
Logic for resetting IPSets in Linux:
ipset list --name(which returns just the set names per line) and grep for "azure-npm-"Update for Prometheus
ResetIPSetEntries: