Skip to content

Conversation

@huntergregory
Copy link
Contributor

Framework to specify/test any arbitrary sequence of DP events.

@huntergregory huntergregory added npm Related to NPM. windows labels Oct 13, 2022
@huntergregory huntergregory requested a review from a team as a code owner October 13, 2022 23:04
@huntergregory huntergregory requested review from vakalapa and removed request for a team October 13, 2022 23:04
@huntergregory huntergregory changed the title test: [WIN-NPM] dataplane UT framework test: [WIN-NPM] dataplane test framework Oct 17, 2022
}
}

func (fEndpoint *FakeHostComputeEndpoint) PrettyString() string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize that existing methods do this, but the generally accepted style in the wider Go community is for a single letter matching the first letter of the type the method is being implemented on for "this":

func (f *FakeHostComputeEndpoint) PrettyString() string

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you suggest keeping the same style throughout the file or adding new methods in the generally accepted way?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

linters complain if all method receivers are not named the same, so all should be changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ordinarily I'd say just do it for this one... but if linters are going to be a bother, than it should be done everywhere. Really, I just want to avoid perpetuating bad patterns in the name of consistency with the existing codebase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the reviewer's sake, can address this in a followup PR since it will create a diff for most of the file

foundACL := false
for _, cacheACL := range cachedACLs {
if expectedACL.ID == cacheACL.ID {
if reflect.DeepEqual(expectedACL, cacheACL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmp.Equal from google.com/go-cmp/cmp is categorically better than reflect.DeepEqual. The talk that lays out the argument for it is here: https://www.youtube.com/watch?v=OuT8YYAOOVI. There's also a cmp.Diff that will make it easier to see why it's not equal.

}

// verifyACLs is true if HNS strictly has the expected Endpoints and ACLs.
func verifyACLs(t *testing.T, hns *hnswrapper.Hnsv2wrapperFake, expectedEndpointACLs map[string][]*hnswrapper.FakeEndpointPolicy) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these are being used as true assertions (e.g. there's no running off to the network or anything other than asserting received data), then you should make this a helper using t.Helper(). That removes this function from the call stack so you can actually see which test failed. It avoids the issues we have with orchestrator_test.go in DNC.

event(t, dp, hns)
}

verifyHNSCache(t, hns, tt.expectedSetPolicies, tt.expectedEnpdointACLs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, now that I see this other one in use, this should definitely be a t.Helper().

func policyUpdateEvent(policy *networkingv1.NetworkPolicy) dpEvent {
return func(t *testing.T, dp *DataPlane, _ *hnswrapper.Hnsv2wrapperFake) {
npmNetPol, err := translation.TranslatePolicy(policy)
require.Nil(t, err, "failed to translate policy")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super crazy about doing assertions in functions that do things on your behalf, since I think it can obscure where assertions are happening. I'd rather see the tests do:

err := event(dp, fake)
require.Nil(t, err, "executing event")

... and then, of course, use error wrapping mechanisms to pack in the detail that you have here:

return errors.Wrap(err, "failed to translate policy")

It also makes them a bit easier to move around if these functions become more generally useful, since there's no hard dependency on testing.

@huntergregory huntergregory mentioned this pull request Oct 17, 2022
3 tasks
wg := new(sync.WaitGroup)
wg.Add(len(session))
threadErrors := make(chan error, len(session))
for j, th := range session {
Copy link
Member

@tamilmani1989 tamilmani1989 Oct 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it was named as concurrent sessions, shouldn't each session start in separate thread?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

poor wording. thanks for pointing out

@huntergregory huntergregory marked this pull request as draft October 20, 2022 19:48

type Action interface {
Do() error
SetHNS(hns *hnswrapper.Hnsv2wrapperFake)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo the interface shouldn't contain implementation specific method names just for them to be ignored, can we have the child action just contain Do() and call SetHNS/SetDP with respective dependency injection at test runtime?

if s.InBackground {
wg := new(sync.WaitGroup)
wg.Add(1)
tt.AddStepWaitGroup(s.ID, wg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if goal is to kick and move on, I think it would be simpler to define a wait group outside of the step loop and wait on that instead of maintaining a slice of waitgroups that only contain a single task

}

tt.WaitForAll()
close(backgroundErrors)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could potentially use errgroup here which may be used for a failfast
https://go.dev/play/p/pBY1lNK3EWB

thoughts @ramiro-gamarra?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errgroup can be useful, yes, but it seems like the goal here is to wait for multiple things to complete and verify no errors at the end. errgroup will return the first error, which may not paint the entire picture at the end of processing.

Copy link
Contributor

@ramiro-gamarra ramiro-gamarra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments

}
}

tt.WaitForAllStepsToComplete()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this waits for goroutines added with MarkStepRunningInBackground (and completed with MarkStepComplete)? Are more steps added in other scopes? Otherwise, would a regular waitgroup accomplish the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplified things with a new structure. all go routines and wait groups would now be handled in the actual UT in test.Run()

tt.WaitForAllStepsToComplete()
close(backgroundErrors)
for err := range backgroundErrors {
assert.Nil(t, err, "failed during concurrency")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems easier to just collect any errors and assert that there were none as opposed to handling nils from an error channel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to require.Empty() for better semantics

}

tt.WaitForAll()
close(backgroundErrors)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errgroup can be useful, yes, but it seems like the goal here is to wait for multiple things to complete and verify no errors at the end. errgroup will return the first error, which may not paint the entire picture at the end of processing.

err = s.HNSAction.Do(hns)
} else if s.DPAction != nil {
err = s.DPAction.Do(dp)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a test step valid if both are nil or both are set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an Action is valid only if exactly one of HNSAction or DPAction is non-nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this design emulates cyclonus (upstream in netpol-api now). The Action here is similar to the Action in Cyclonus, although we use interfaces instead of having a separate "interpreter" for each case
https://github.com/mattfenwick/cyclonus/blob/ec73330a5c88654ed5545ccdbcfad33843722ebd/pkg/generator/action.go#L5

}

tt.WaitForAllStepsToComplete()
close(backgroundErrors)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there's a deadlock here in case of background errors: since its an unbuffered channel, any goroutine that encounters an error will be blocked on the send to the channel until something can consume it, but the only thing that consumes it starts AFTER all goroutines have completed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for catching this

require.NoError(t, err, "failed to initialize dp")

wg := new(sync.WaitGroup)
wg.Add(len(tt.Threads))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't mean to be pedantic, but is the "thread" terminology necessary? A thread usually carries the connotation of being an OS level construct, which Go abstracts away in the form of goroutines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps GoRoutines then? or Workstreams?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe Jobs? GoRoutines isn't bad... Workstreams is corp-speak though that I would avoid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now using Jobs

Comment on lines 105 to 109
errStrings := make([]string, len(backgroundErrors))
for err := range backgroundErrors {
errStrings = append(errStrings, fmt.Sprintf("[%s]", err.Error()))
}
assert.Empty(t, backgroundErrors, "encountered errors in threaded test: %s", strings.Join(errStrings, ","))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • the errStrings initialization is odd: if there are any errors, the len of backgroundErrors will be > 0, but then this appends AFTER the last index.
  • asserting on channel length is also unorthodox. im not sure buffered channels are what you need here. my recommendation would be another type for multierr that's concurrent safe: you can just append to it from any goroutine. after tests, you just assert that the type contains no errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can use "go.uber.org/multierr"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd agree an error type for multiple errors is a good idea, but do we really need another dependency for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to avoid the new dependency, went back to a channel approach, except errString is initialized after checking length of backgroundErrors now

}

func getAllSerialTests() []*SerialTestCase {
return []*SerialTestCase{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this pattern, how do you run a single test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could target the test's name like go test . --run TestAllSerialCases/pod_x/a_created,_then_relevant_network_policy_created

Plan to add a function to filter by Tag as well (for local testing)

@huntergregory huntergregory marked this pull request as ready for review October 24, 2022 20:25
}
}

func getAllSerialTests() []*SerialTestCase {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any way of injecting expected errors to model nonhappy paths? Suppose we wanted to inject dp/hns returning errors and getting coverage in the controllers around those scenarios

matmerr
matmerr previously approved these changes Oct 24, 2022
Copy link
Member

@matmerr matmerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sticking to serial tests for the moment, need to add methods of injecting errors into mock hns/dp for testing nonhappy paths

ck319
ck319 previously approved these changes Oct 25, 2022
@huntergregory
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@huntergregory huntergregory dismissed stale reviews from ck319 and matmerr via 39894d2 October 25, 2022 22:26
@vakalapa vakalapa merged commit e3ffab8 into master Oct 31, 2022
@vakalapa vakalapa deleted the hgregory/win-dp-uts branch October 31, 2022 16:20
rjdenney pushed a commit to rjdenney/azure-container-networking that referenced this pull request Jan 19, 2023
* wip with StrictlyHasSetPolicies approach

* better approaching of getting all set policies

* wip for rigorous win dp UTs

* marshal setpolicies in hns mock and dont short circuit in UTs

* policy stuff and update test cases

* marshal ACLs in hns mock

* more UTs and minor refinements

* option to apply dp or not

* address cmp.Equal and t.Helper comments

* dpEvent returns error and better defined concurrency

* remove unnecessary logic in concurrent test code

* approach Azure#3 emulating cyclonus

* namespace method for podmetadata

* refactor Action structure and TestCase wait group behavior

* hnsactions and renaming a file

* refactor to Serial and ThreadedTestCase structs, and move files to dp pkg

* hns latency hard coded to be the same for all threaded test cases

* fix build error after rebasing

* export fake hns network id

* address comments on multierr and terminology

* add comment about pod metadata in controller

* pod update and delete actions

* move ApplyDPAction to top

* namespace actions and rename some fields of UpdatePod

* adding code comments

* reconcile action

* fix bug in key-val ipsets

* implement all previous test cases

* fix incorrect error wrapping in dataplane.go

* multi-job tests are working. updated terminology from routine to job

* MultiErrManager instead of dependency for multierr

* return to the channel approach for multierr, now using FailNow instead of asserting on channel length

* fix some lints

* fix more lints
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

npm Related to NPM. windows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants