-
Notifications
You must be signed in to change notification settings - Fork 260
test: [WIN-NPM] dataplane test framework #1652
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| } | ||
| } | ||
|
|
||
| func (fEndpoint *FakeHostComputeEndpoint) PrettyString() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that existing methods do this, but the generally accepted style in the wider Go community is for a single letter matching the first letter of the type the method is being implemented on for "this":
func (f *FakeHostComputeEndpoint) PrettyString() stringThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would you suggest keeping the same style throughout the file or adding new methods in the generally accepted way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linters complain if all method receivers are not named the same, so all should be changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, ordinarily I'd say just do it for this one... but if linters are going to be a bother, than it should be done everywhere. Really, I just want to avoid perpetuating bad patterns in the name of consistency with the existing codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the reviewer's sake, can address this in a followup PR since it will create a diff for most of the file
| foundACL := false | ||
| for _, cacheACL := range cachedACLs { | ||
| if expectedACL.ID == cacheACL.ID { | ||
| if reflect.DeepEqual(expectedACL, cacheACL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmp.Equal from google.com/go-cmp/cmp is categorically better than reflect.DeepEqual. The talk that lays out the argument for it is here: https://www.youtube.com/watch?v=OuT8YYAOOVI. There's also a cmp.Diff that will make it easier to see why it's not equal.
| } | ||
|
|
||
| // verifyACLs is true if HNS strictly has the expected Endpoints and ACLs. | ||
| func verifyACLs(t *testing.T, hns *hnswrapper.Hnsv2wrapperFake, expectedEndpointACLs map[string][]*hnswrapper.FakeEndpointPolicy) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these are being used as true assertions (e.g. there's no running off to the network or anything other than asserting received data), then you should make this a helper using t.Helper(). That removes this function from the call stack so you can actually see which test failed. It avoids the issues we have with orchestrator_test.go in DNC.
| event(t, dp, hns) | ||
| } | ||
|
|
||
| verifyHNSCache(t, hns, tt.expectedSetPolicies, tt.expectedEnpdointACLs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, now that I see this other one in use, this should definitely be a t.Helper().
| func policyUpdateEvent(policy *networkingv1.NetworkPolicy) dpEvent { | ||
| return func(t *testing.T, dp *DataPlane, _ *hnswrapper.Hnsv2wrapperFake) { | ||
| npmNetPol, err := translation.TranslatePolicy(policy) | ||
| require.Nil(t, err, "failed to translate policy") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super crazy about doing assertions in functions that do things on your behalf, since I think it can obscure where assertions are happening. I'd rather see the tests do:
err := event(dp, fake)
require.Nil(t, err, "executing event")... and then, of course, use error wrapping mechanisms to pack in the detail that you have here:
return errors.Wrap(err, "failed to translate policy")It also makes them a bit easier to move around if these functions become more generally useful, since there's no hard dependency on testing.
| wg := new(sync.WaitGroup) | ||
| wg.Add(len(session)) | ||
| threadErrors := make(chan error, len(session)) | ||
| for j, th := range session { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since it was named as concurrent sessions, shouldn't each session start in separate thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
poor wording. thanks for pointing out
|
|
||
| type Action interface { | ||
| Do() error | ||
| SetHNS(hns *hnswrapper.Hnsv2wrapperFake) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo the interface shouldn't contain implementation specific method names just for them to be ignored, can we have the child action just contain Do() and call SetHNS/SetDP with respective dependency injection at test runtime?
| if s.InBackground { | ||
| wg := new(sync.WaitGroup) | ||
| wg.Add(1) | ||
| tt.AddStepWaitGroup(s.ID, wg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if goal is to kick and move on, I think it would be simpler to define a wait group outside of the step loop and wait on that instead of maintaining a slice of waitgroups that only contain a single task
| } | ||
|
|
||
| tt.WaitForAll() | ||
| close(backgroundErrors) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could potentially use errgroup here which may be used for a failfast
https://go.dev/play/p/pBY1lNK3EWB
thoughts @ramiro-gamarra?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errgroup can be useful, yes, but it seems like the goal here is to wait for multiple things to complete and verify no errors at the end. errgroup will return the first error, which may not paint the entire picture at the end of processing.
ramiro-gamarra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments
| } | ||
| } | ||
|
|
||
| tt.WaitForAllStepsToComplete() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this waits for goroutines added with MarkStepRunningInBackground (and completed with MarkStepComplete)? Are more steps added in other scopes? Otherwise, would a regular waitgroup accomplish the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simplified things with a new structure. all go routines and wait groups would now be handled in the actual UT in test.Run()
| tt.WaitForAllStepsToComplete() | ||
| close(backgroundErrors) | ||
| for err := range backgroundErrors { | ||
| assert.Nil(t, err, "failed during concurrency") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems easier to just collect any errors and assert that there were none as opposed to handling nils from an error channel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated to require.Empty() for better semantics
| } | ||
|
|
||
| tt.WaitForAll() | ||
| close(backgroundErrors) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errgroup can be useful, yes, but it seems like the goal here is to wait for multiple things to complete and verify no errors at the end. errgroup will return the first error, which may not paint the entire picture at the end of processing.
| err = s.HNSAction.Do(hns) | ||
| } else if s.DPAction != nil { | ||
| err = s.DPAction.Do(dp) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a test step valid if both are nil or both are set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an Action is valid only if exactly one of HNSAction or DPAction is non-nil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this design emulates cyclonus (upstream in netpol-api now). The Action here is similar to the Action in Cyclonus, although we use interfaces instead of having a separate "interpreter" for each case
https://github.com/mattfenwick/cyclonus/blob/ec73330a5c88654ed5545ccdbcfad33843722ebd/pkg/generator/action.go#L5
| } | ||
|
|
||
| tt.WaitForAllStepsToComplete() | ||
| close(backgroundErrors) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there's a deadlock here in case of background errors: since its an unbuffered channel, any goroutine that encounters an error will be blocked on the send to the channel until something can consume it, but the only thing that consumes it starts AFTER all goroutines have completed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for catching this
a62d53c to
79464ed
Compare
| require.NoError(t, err, "failed to initialize dp") | ||
|
|
||
| wg := new(sync.WaitGroup) | ||
| wg.Add(len(tt.Threads)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't mean to be pedantic, but is the "thread" terminology necessary? A thread usually carries the connotation of being an OS level construct, which Go abstracts away in the form of goroutines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps GoRoutines then? or Workstreams?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe Jobs? GoRoutines isn't bad... Workstreams is corp-speak though that I would avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now using Jobs
| errStrings := make([]string, len(backgroundErrors)) | ||
| for err := range backgroundErrors { | ||
| errStrings = append(errStrings, fmt.Sprintf("[%s]", err.Error())) | ||
| } | ||
| assert.Empty(t, backgroundErrors, "encountered errors in threaded test: %s", strings.Join(errStrings, ",")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- the
errStringsinitialization is odd: if there are any errors, the len ofbackgroundErrorswill be > 0, but then this appends AFTER the last index. - asserting on channel length is also unorthodox. im not sure buffered channels are what you need here. my recommendation would be another type for
multierrthat's concurrent safe: you can just append to it from any goroutine. after tests, you just assert that the type contains no errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can use "go.uber.org/multierr"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd agree an error type for multiple errors is a good idea, but do we really need another dependency for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to avoid the new dependency, went back to a channel approach, except errString is initialized after checking length of backgroundErrors now
| } | ||
|
|
||
| func getAllSerialTests() []*SerialTestCase { | ||
| return []*SerialTestCase{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this pattern, how do you run a single test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could target the test's name like go test . --run TestAllSerialCases/pod_x/a_created,_then_relevant_network_policy_created
Plan to add a function to filter by Tag as well (for local testing)
| } | ||
| } | ||
|
|
||
| func getAllSerialTests() []*SerialTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any way of injecting expected errors to model nonhappy paths? Suppose we wanted to inject dp/hns returning errors and getting coverage in the controllers around those scenarios
matmerr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sticking to serial tests for the moment, need to add methods of injecting errors into mock hns/dp for testing nonhappy paths
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
…d of asserting on channel length
a0316f9 to
39894d2
Compare
* wip with StrictlyHasSetPolicies approach * better approaching of getting all set policies * wip for rigorous win dp UTs * marshal setpolicies in hns mock and dont short circuit in UTs * policy stuff and update test cases * marshal ACLs in hns mock * more UTs and minor refinements * option to apply dp or not * address cmp.Equal and t.Helper comments * dpEvent returns error and better defined concurrency * remove unnecessary logic in concurrent test code * approach Azure#3 emulating cyclonus * namespace method for podmetadata * refactor Action structure and TestCase wait group behavior * hnsactions and renaming a file * refactor to Serial and ThreadedTestCase structs, and move files to dp pkg * hns latency hard coded to be the same for all threaded test cases * fix build error after rebasing * export fake hns network id * address comments on multierr and terminology * add comment about pod metadata in controller * pod update and delete actions * move ApplyDPAction to top * namespace actions and rename some fields of UpdatePod * adding code comments * reconcile action * fix bug in key-val ipsets * implement all previous test cases * fix incorrect error wrapping in dataplane.go * multi-job tests are working. updated terminology from routine to job * MultiErrManager instead of dependency for multierr * return to the channel approach for multierr, now using FailNow instead of asserting on channel length * fix some lints * fix more lints
Framework to specify/test any arbitrary sequence of DP events.