test: [WIN-NPM] dataplane test framework #1652

huntergregory · 2022-10-13T23:04:26Z

Framework to specify/test any arbitrary sequence of DP events.

network/hnswrapper/hnsv2wrapperfake.go

timraymond · 2022-10-17T18:39:32Z

network/hnswrapper/hnsv2wrapperfake.go

 	}
 }

+func (fEndpoint *FakeHostComputeEndpoint) PrettyString() string {


I realize that existing methods do this, but the generally accepted style in the wider Go community is for a single letter matching the first letter of the type the method is being implemented on for "this":

func (f *FakeHostComputeEndpoint) PrettyString() string

would you suggest keeping the same style throughout the file or adding new methods in the generally accepted way?

linters complain if all method receivers are not named the same, so all should be changed.

Yeah, ordinarily I'd say just do it for this one... but if linters are going to be a bother, then it should be done everywhere. Really, I just want to avoid perpetuating bad patterns in the name of consistency with the existing codebase.

for the reviewer's sake, can address this in a followup PR since it will create a diff for most of the file

timraymond · 2022-10-17T18:41:53Z

npm/pkg/dataplane/dataplane_windows_test.go

+			foundACL := false
+			for _, cacheACL := range cachedACLs {
+				if expectedACL.ID == cacheACL.ID {
+					if reflect.DeepEqual(expectedACL, cacheACL) {


cmp.Equal from github.com/google/go-cmp/cmp is categorically better than reflect.DeepEqual. The talk that lays out the argument for it is here: https://www.youtube.com/watch?v=OuT8YYAOOVI. There's also a cmp.Diff that will make it easier to see why it's not equal.
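To make one pitfall concrete: reflect.DeepEqual treats a nil slice and an empty non-nil slice as different, which can make an otherwise matching ACL comparison fail. The sketch below uses only the standard library; the `acl` type and `sameACL` function are hypothetical stand-ins, and this is just one commonly cited behaviour, not necessarily the argument made in the linked talk. With go-cmp, an option such as cmpopts.EquateEmpty can be passed to treat the two as equal.

```go
package main

import (
	"fmt"
	"reflect"
)

// acl is a hypothetical stand-in for a cached ACL entry.
type acl struct {
	ID    string
	Ports []string
}

// sameACL reports whether two ACLs are equal according to reflect.DeepEqual.
func sameACL(a, b acl) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	expected := acl{ID: "policy-1", Ports: nil}
	cached := acl{ID: "policy-1", Ports: []string{}}

	// A nil slice and an empty non-nil slice are not deeply equal.
	fmt.Println(sameACL(expected, cached)) // false

	// Identical values compare equal as expected.
	fmt.Println(sameACL(expected, expected)) // true
}
```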

timraymond · 2022-10-17T18:43:56Z

npm/pkg/dataplane/dataplane_windows_test.go

+}
+
+// verifyACLs is true if HNS strictly has the expected Endpoints and ACLs.
+func verifyACLs(t *testing.T, hns *hnswrapper.Hnsv2wrapperFake, expectedEndpointACLs map[string][]*hnswrapper.FakeEndpointPolicy) bool {


If these are being used as true assertions (e.g. there's no running off to the network or anything other than asserting received data), then you should make this a helper using t.Helper(). That removes this function from the call stack so you can actually see which test failed. It avoids the issues we have with orchestrator_test.go in DNC.
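A rough sketch of what such a helper could look like. The names `testingT`, `requireSameIDs`, and `recorder` are made up for illustration; the helper accepts a small interface that *testing.T satisfies, so it can be exercised here without `go test`. In a real test, t.Helper() makes the failure line point at the caller rather than at the helper body.

```go
package main

import "fmt"

// testingT is the small subset of *testing.T that the helper needs.
// *testing.T satisfies it, so real tests can pass t directly.
type testingT interface {
	Helper()
	Errorf(format string, args ...any)
}

// requireSameIDs is a hypothetical assertion helper. Calling t.Helper() first
// tells the testing package to attribute failures to the caller's line.
func requireSameIDs(t testingT, expected, actual []string) bool {
	t.Helper()
	if len(expected) != len(actual) {
		t.Errorf("expected %d IDs, got %d", len(expected), len(actual))
		return false
	}
	for i := range expected {
		if expected[i] != actual[i] {
			t.Errorf("ID %d: expected %q, got %q", i, expected[i], actual[i])
			return false
		}
	}
	return true
}

// recorder is a fake testingT used to exercise the helper outside "go test".
type recorder struct {
	helperCalls int
	failures    []string
}

func (r *recorder) Helper() { r.helperCalls++ }

func (r *recorder) Errorf(format string, args ...any) {
	r.failures = append(r.failures, fmt.Sprintf(format, args...))
}

func main() {
	r := &recorder{}
	ok := requireSameIDs(r, []string{"acl-1"}, []string{"acl-2"})
	fmt.Println(ok, r.helperCalls, r.failures)
}
```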

timraymond · 2022-10-17T18:45:20Z

npm/pkg/dataplane/dataplane_windows_test.go

+				event(t, dp, hns)
+			}
+
+			verifyHNSCache(t, hns, tt.expectedSetPolicies, tt.expectedEnpdointACLs)


Yeah, now that I see this other one in use, this should definitely be a t.Helper().

timraymond · 2022-10-17T18:54:35Z

npm/pkg/dataplane/dataplane_windows_test.go

+func policyUpdateEvent(policy *networkingv1.NetworkPolicy) dpEvent {
+	return func(t *testing.T, dp *DataPlane, _ *hnswrapper.Hnsv2wrapperFake) {
+		npmNetPol, err := translation.TranslatePolicy(policy)
+		require.Nil(t, err, "failed to translate policy")


I'm not super crazy about doing assertions in functions that do things on your behalf, since I think it can obscure where assertions are happening. I'd rather see the tests do:

err := event(dp, fake)
require.Nil(t, err, "executing event")

... and then, of course, use error wrapping mechanisms to pack in the detail that you have here:

return errors.Wrap(err, "failed to translate policy")

It also makes them a bit easier to move around if these functions become more generally useful, since there's no hard dependency on testing.

tamilmani1989 · 2022-10-19T06:43:16Z

npm/pkg/dataplane/dataplane_windows_test.go

+				wg := new(sync.WaitGroup)
+				wg.Add(len(session))
+				threadErrors := make(chan error, len(session))
+				for j, th := range session {


since it was named as concurrent sessions, shouldn't each session start in a separate thread?

poor wording. thanks for pointing out

matmerr · 2022-10-20T20:39:47Z

npm/pkg/dataplane/e2es/types_windows_test.go

+
+type Action interface {
+	Do() error
+	SetHNS(hns *hnswrapper.Hnsv2wrapperFake)


imo the interface shouldn't contain implementation specific method names just for them to be ignored, can we have the child action just contain Do() and call SetHNS/SetDP with respective dependency injection at test runtime?
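Something along these lines, as a sketch only. The fake types and the `createEndpoint` / `createPod` actions are hypothetical; the point is that the interface carries just Do(), and each concrete action receives the dependency it needs when the test is assembled.

```go
package main

import "fmt"

// fakeHNS and fakeDataPlane are hypothetical stand-ins for the real dependencies.
type fakeHNS struct{ endpoints []string }
type fakeDataPlane struct{ pods []string }

// Action exposes only what the test runner needs.
type Action interface {
	Do() error
}

// createEndpoint is an HNS-side action; its dependency is a field,
// not a setter on the interface.
type createEndpoint struct {
	hns *fakeHNS
	id  string
}

func (a *createEndpoint) Do() error {
	if a.hns == nil {
		return fmt.Errorf("createEndpoint %q: hns not injected", a.id)
	}
	a.hns.endpoints = append(a.hns.endpoints, a.id)
	return nil
}

// createPod is a dataplane-side action.
type createPod struct {
	dp  *fakeDataPlane
	key string
}

func (a *createPod) Do() error {
	if a.dp == nil {
		return fmt.Errorf("createPod %q: dataplane not injected", a.key)
	}
	a.dp.pods = append(a.dp.pods, a.key)
	return nil
}

func main() {
	hns, dp := &fakeHNS{}, &fakeDataPlane{}
	// Dependencies are injected where the test is put together.
	steps := []Action{
		&createEndpoint{hns: hns, id: "ep-1"},
		&createPod{dp: dp, key: "x/a"},
	}
	for _, s := range steps {
		if err := s.Do(); err != nil {
			fmt.Println("step failed:", err)
		}
	}
	fmt.Println(hns.endpoints, dp.pods)
}
```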

matmerr · 2022-10-20T20:53:54Z

npm/pkg/dataplane/e2es/e2e_windows_test.go

+				if s.InBackground {
+					wg := new(sync.WaitGroup)
+					wg.Add(1)
+					tt.AddStepWaitGroup(s.ID, wg)


if the goal is to kick off and move on, I think it would be simpler to define a wait group outside of the step loop and wait on that instead of maintaining a slice of waitgroups that only contain a single task
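Roughly what that could look like, assuming a simplified `step` type (hypothetical apart from the InBackground field seen in the diff). One WaitGroup is declared outside the loop, each background step adds itself to it, and a single Wait covers all of them.

```go
package main

import (
	"fmt"
	"sync"
)

// step is a hypothetical test step; InBackground mirrors the field in the diff.
type step struct {
	ID           string
	InBackground bool
}

// runSteps runs foreground steps inline and background steps in goroutines,
// then waits on one shared WaitGroup. It returns the IDs of the steps that
// ran, in completion order.
func runSteps(steps []step) []string {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		done []string
	)
	record := func(id string) {
		mu.Lock()
		defer mu.Unlock()
		done = append(done, id)
	}
	for _, s := range steps {
		if !s.InBackground {
			record(s.ID)
			continue
		}
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			record(id)
		}(s.ID)
	}
	wg.Wait() // a single wait covers every background step
	return done
}

func main() {
	done := runSteps([]step{
		{ID: "create-pod"},
		{ID: "apply-policy", InBackground: true},
		{ID: "delete-pod", InBackground: true},
	})
	fmt.Println(len(done))
}
```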

matmerr · 2022-10-20T21:07:43Z

npm/pkg/dataplane/e2es/e2e_windows_test.go

+			}
+
+			tt.WaitForAll()
+			close(backgroundErrors)


could potentially use errgroup here which may be used for a failfast
https://go.dev/play/p/pBY1lNK3EWB

thoughts @ramiro-gamarra?

errgroup can be useful, yes, but it seems like the goal here is to wait for multiple things to complete and verify no errors at the end. errgroup will return the first error, which may not paint the entire picture at the end of processing.

ramiro-gamarra

Some comments

ramiro-gamarra · 2022-10-21T20:00:53Z

npm/pkg/dataplane/e2es/e2e_windows_test.go

+				}
+			}
+
+			tt.WaitForAllStepsToComplete()


Seems like this waits for goroutines added with MarkStepRunningInBackground (and completed with MarkStepComplete)? Are more steps added in other scopes? Otherwise, would a regular waitgroup accomplish the same?

simplified things with a new structure. all go routines and wait groups would now be handled in the actual UT in test.Run()

ramiro-gamarra · 2022-10-21T20:02:07Z

npm/pkg/dataplane/e2es/e2e_windows_test.go

+			tt.WaitForAllStepsToComplete()
+			close(backgroundErrors)
+			for err := range backgroundErrors {
+				assert.Nil(t, err, "failed during concurrency")


Seems easier to just collect any errors and assert that there were none as opposed to handling nils from an error channel?

updated to require.Empty() for better semantics

ramiro-gamarra · 2022-10-21T20:08:00Z

npm/pkg/dataplane/e2es/e2e_windows_test.go

+							err = s.HNSAction.Do(hns)
+						} else if s.DPAction != nil {
+							err = s.DPAction.Do(dp)
+						}


Is a test step valid if both are nil or both are set?

an Action is valid only if exactly one of HNSAction or DPAction is non-nil

this design emulates cyclonus (upstream in netpol-api now). The Action here is similar to the Action in Cyclonus, although we use interfaces instead of having a separate "interpreter" for each case
https://github.com/mattfenwick/cyclonus/blob/ec73330a5c88654ed5545ccdbcfad33843722ebd/pkg/generator/action.go#L5

ramiro-gamarra · 2022-10-21T20:12:23Z

npm/pkg/dataplane/e2es/e2e_windows_test.go

+			}
+
+			tt.WaitForAllStepsToComplete()
+			close(backgroundErrors)


I believe there's a deadlock here in case of background errors: since it's an unbuffered channel, any goroutine that encounters an error will be blocked on the send to the channel until something can consume it, but the only thing that consumes it starts AFTER all goroutines have completed.

thanks for catching this

ramiro-gamarra · 2022-10-24T16:51:16Z

npm/pkg/dataplane/dataplane_windows_test.go

+			require.NoError(t, err, "failed to initialize dp")
+
+			wg := new(sync.WaitGroup)
+			wg.Add(len(tt.Threads))


Don't mean to be pedantic, but is the "thread" terminology necessary? A thread usually carries the connotation of being an OS level construct, which Go abstracts away in the form of goroutines.

Perhaps GoRoutines then? or Workstreams?

Maybe Jobs? GoRoutines isn't bad... Workstreams is corp-speak though that I would avoid.

now using Jobs

ramiro-gamarra · 2022-10-24T17:08:54Z

npm/pkg/dataplane/dataplane_windows_test.go

+			errStrings := make([]string, len(backgroundErrors))
+			for err := range backgroundErrors {
+				errStrings = append(errStrings, fmt.Sprintf("[%s]", err.Error()))
+			}
+			assert.Empty(t, backgroundErrors, "encountered errors in threaded test: %s", strings.Join(errStrings, ","))


the errStrings initialization is odd: if there are any errors, the len of backgroundErrors will be > 0, but then this appends AFTER the last index.

asserting on channel length is also unorthodox. im not sure buffered channels are what you need here. my recommendation would be another type for multierr that's concurrent safe: you can just append to it from any goroutine. after tests, you just assert that the type contains no errors.

can use "go.uber.org/multierr"

I'd agree an error type for multiple errors is a good idea, but do we really need another dependency for it?

to avoid the new dependency, went back to a channel approach, except errString is initialized after checking length of backgroundErrors now

ramiro-gamarra · 2022-10-24T17:14:30Z

npm/pkg/dataplane/dataplane-test-cases_windows_test.go

+}
+
+func getAllSerialTests() []*SerialTestCase {
+	return []*SerialTestCase{


With this pattern, how do you run a single test?

You could target the test's name like go test . --run TestAllSerialCases/pod_x/a_created,_then_relevant_network_policy_created

Plan to add a function to filter by Tag as well (for local testing)
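A tag filter along those lines might be as small as the sketch below. The `Tags` field and the `filterByTag` function are assumptions for illustration; the PR does not show how tags are stored on a test case.

```go
package main

import "fmt"

// serialTestCase is a hypothetical, trimmed-down test case with tags.
type serialTestCase struct {
	Description string
	Tags        []string
}

// filterByTag returns the test cases that carry the given tag,
// preserving their original order.
func filterByTag(tests []*serialTestCase, tag string) []*serialTestCase {
	var matched []*serialTestCase
	for _, tc := range tests {
		for _, t := range tc.Tags {
			if t == tag {
				matched = append(matched, tc)
				break
			}
		}
	}
	return matched
}

func main() {
	all := []*serialTestCase{
		{Description: "pod created, then policy created", Tags: []string{"pod", "netpol"}},
		{Description: "namespace label updated", Tags: []string{"namespace"}},
	}
	for _, tc := range filterByTag(all, "netpol") {
		fmt.Println(tc.Description)
	}
}
```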

matmerr · 2022-10-24T23:33:38Z

npm/pkg/dataplane/dataplane-test-cases_windows_test.go

+	}
+}
+
+func getAllSerialTests() []*SerialTestCase {


Do we have any way of injecting expected errors to model nonhappy paths? Suppose we wanted to inject dp/hns returning errors and getting coverage in the controllers around those scenarios

matmerr

sticking to serial tests for the moment, need to add methods of injecting errors into mock hns/dp for testing nonhappy paths

huntergregory · 2022-10-25T00:16:08Z

/azp run

azure-pipelines · 2022-10-25T00:16:27Z

Azure Pipelines successfully started running 2 pipeline(s).

* wip with StrictlyHasSetPolicies approach
* better approaching of getting all set policies
* wip for rigorous win dp UTs
* marshal setpolicies in hns mock and dont short circuit in UTs
* policy stuff and update test cases
* marshal ACLs in hns mock
* more UTs and minor refinements
* option to apply dp or not
* address cmp.Equal and t.Helper comments
* dpEvent returns error and better defined concurrency
* remove unnecessary logic in concurrent test code
* approach #3 emulating cyclonus
* namespace method for podmetadata
* refactor Action structure and TestCase wait group behavior
* hnsactions and renaming a file
* refactor to Serial and ThreadedTestCase structs, and move files to dp pkg
* hns latency hard coded to be the same for all threaded test cases
* fix build error after rebasing
* export fake hns network id
* address comments on multierr and terminology
* add comment about pod metadata in controller
* pod update and delete actions
* move ApplyDPAction to top
* namespace actions and rename some fields of UpdatePod
* adding code comments
* reconcile action
* fix bug in key-val ipsets
* implement all previous test cases
* fix incorrect error wrapping in dataplane.go
* multi-job tests are working. updated terminology from routine to job
* MultiErrManager instead of dependency for multierr
* return to the channel approach for multierr, now using FailNow instead of asserting on channel length
* fix some lints
* fix more lints

huntergregory added npm Related to NPM. windows labels Oct 13, 2022

huntergregory requested a review from a team as a code owner October 13, 2022 23:04

huntergregory requested review from vakalapa and removed request for a team October 13, 2022 23:04

huntergregory changed the title ~~test: [WIN-NPM] dataplane UT framework~~ test: [WIN-NPM] dataplane test framework Oct 17, 2022

timraymond reviewed Oct 17, 2022

huntergregory mentioned this pull request Oct 17, 2022

run Windows UT's #1554

Merged

3 tasks

tamilmani1989 reviewed Oct 19, 2022

huntergregory marked this pull request as draft October 20, 2022 19:48

matmerr reviewed Oct 20, 2022

ramiro-gamarra reviewed Oct 21, 2022

huntergregory force-pushed the hgregory/win-dp-uts branch from a62d53c to 79464ed Compare October 21, 2022 23:49

ramiro-gamarra reviewed Oct 24, 2022

huntergregory marked this pull request as ready for review October 24, 2022 20:25

matmerr reviewed Oct 24, 2022

matmerr previously approved these changes Oct 24, 2022

ck319 previously approved these changes Oct 25, 2022

huntergregory added 8 commits October 25, 2022 15:26

wip with StrictlyHasSetPolicies approach

ccb9c70

better approaching of getting all set policies

a092f05

wip for rigorous win dp UTs

dea282f

marshal setpolicies in hns mock and dont short circuit in UTs

bf03f80

policy stuff and update test cases

8700de4

marshal ACLs in hns mock

f69c7a8

more UTs and minor refinements

2d0de59

option to apply dp or not

ce0b406

huntergregory added 24 commits October 25, 2022 15:26

remove unnecessary logic in concurrent test code

ee91dfd

approach #3 emulating cyclonus

eecb47d

namespace method for podmetadata

03ee497

refactor Action structure and TestCase wait group behavior

c0cdbc8

hnsactions and renaming a file

c27f3b4

refactor to Serial and ThreadedTestCase structs, and move files to dp pkg

477d66c

hns latency hard coded to be the same for all threaded test cases

ba0cf5b

fix build error after rebasing

60928b3

export fake hns network id

32cd5ab

address comments on multierr and terminology

cfc0099

add comment about pod metadata in controller

c1bad9d

pod update and delete actions

e99fba5

move ApplyDPAction to top

67ba42c

namespace actions and rename some fields of UpdatePod

12d0dcf

adding code comments

76f040c

reconcile action

1f304f6

fix bug in key-val ipsets

9740ccd

implement all previous test cases

de6baaf

fix incorrect error wrapping in dataplane.go

9bde4b2

multi-job tests are working. updated terminology from routine to job

0aaba05

MultiErrManager instead of dependency for multierr

67d9518

return to the channel approach for multierr, now using FailNow instead of asserting on channel length

036a6c8

fix some lints

8cf6b59

fix more lints

39894d2

huntergregory dismissed stale reviews from ck319 and matmerr via 39894d2 October 25, 2022 22:26

huntergregory force-pushed the hgregory/win-dp-uts branch from a0316f9 to 39894d2 Compare October 25, 2022 22:26

vakalapa merged commit e3ffab8 into master Oct 31, 2022

vakalapa deleted the hgregory/win-dp-uts branch October 31, 2022 16:20
