E2E tests: add basic disruption tests #221

martinkennelly · 2022-10-24T12:09:24Z

Signed-off-by: Martin Kennelly mkennell@redhat.com

Expecting test named IngressNodeFirewall policy is configurable after daemon deletion to fail - seen with KinD and bare metal server.
INF policy configuration isn't working after daemon deletion.

Executed go mod tidy + go mod vendor
Executed make lint and its clean

msherif1234 · 2022-10-24T12:51:17Z

/assign @msherif1234

martinkennelly · 2022-10-24T13:01:18Z

P.S if you want me to add reboot test scenario LMK. I know you wanted to do it previously.

martinkennelly · 2022-10-24T13:02:33Z

Also, I didnt hide these tests behind a flag because the tests arent adding that much latency to the overall test time. If they get too long later, we can do that.

to be to reuse unix socket we have to unlink them when the events container is shuting down Signed-off-by: Mohamed Mahmoud <mmahmoud@redhat.com>

Signed-off-by: Martin Kennelly <mkennell@redhat.com>

Daemon may miss deletion of this object and therefore not handle any cleanup operations that needs to be performed. Wait for daemon to consume deletion update and only then allow this object to be deleted. Signed-off-by: Martin Kennelly <mkennell@redhat.com>

If the daemon is restarted, we start not knowing about any of the managed interfaces. If we perform activities like reseting everything due to IngressNodeFirewallNodeState deletion for example, we will not cleanup all XDP programs. IngNodeFwController knows all pinned interfaces and ensure we know about them in ebpfSingleton. Future work is needed to refactor this to not have two seperate sources of information about the same thing. Signed-off-by: Martin Kennelly <mkennell@redhat.com>

martinkennelly · 2022-10-30T15:59:59Z

/retest
Infra failure

msherif1234 · 2022-10-31T11:56:03Z

/retest

msherif1234

I will take closer look to the new e2e test cases

test/e2e/functional/tests/e2e.go

msherif1234 · 2022-10-31T12:18:30Z

controllers/ingressnodefirewallnodestate_controller.go

+
+func isNodeStateDeletionInProgress(nodeState *infv1alpha1.IngressNodeFirewallNodeState) bool {
+	return !nodeState.ObjectMeta.DeletionTimestamp.IsZero()
+}


Hmm I did that in the past attempting to fix ut race but it really didn't help, I think this the right thing because fw and node state are different scope and there is an issue with garbage collection hence its recommended to use finalizers , the only thing here when I used the same there were issues I hit during testing which I don't remember them atm but given we have much more advanced e2e now and if this works I am fine with adding it

Can u explain what is the exact issue(s) the finalizers helped with currently the nodeState controller taking care of deleting objects.

If daemon is not listening for events when node state is deleted, it will miss cleaning up the node. Finalizer ensures the daemon consumes the delete event for node state, performs the cleanup ops, and then removes the finalizer.

msherif1234 · 2022-10-31T12:24:28Z

pkg/ebpfsyncer/ebpfsyncer.go

@@ -87,6 +87,12 @@ func (e *ebpfSingleton) SyncInterfaceIngressRules(
 		return err
 	}

+	// Ensure IngNodeFwController's pinned links and our managed interfaces align
+	// TODO: refactor to not have managed interfaces names from two sources
+	for _, linkName := range e.c.GetPinnedLinkNames() {


why we need this createNewManager will load pinned map ?

pls make sure with any changes to this package u run this manual ut from ebpfsyncer folder
ebpfsyncer]$ go test -exec sudo ./...

msherif1234 · 2022-10-31T12:27:53Z

@andreaskaris PTAL

openshift-ci · 2022-10-31T12:27:56Z

@msherif1234: GitHub didn't allow me to assign the following users: PTAL.

Note that only openshift members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @andreaskaris PTAL

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

msherif1234 · 2022-10-31T14:39:00Z

/test ingress-node-firewall-e2e-metal-ipi

martinkennelly · 2022-11-01T14:56:53Z

/hold

openshift-merge-robot · 2022-11-01T14:57:02Z

@martinkennelly: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-bot · 2023-01-31T01:00:40Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot · 2023-03-02T08:30:52Z

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot · 2023-04-02T00:01:00Z

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci · 2023-04-02T00:01:58Z

@openshift-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci · 2023-05-18T19:35:20Z

@martinkennelly: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/unit-test	`957b7a9`	link	true	`/test unit-test`
ci/prow/ingress-node-firewall-e2e-metal-ipi	`957b7a9`	link	true	`/test ingress-node-firewall-e2e-metal-ipi`
ci/prow/images	`957b7a9`	link	true	`/test images`
ci/prow/ci-index	`957b7a9`	link	true	`/test ci-index`
ci/prow/lint	`957b7a9`	link	true	`/test lint`
ci/prow/test-fmt	`957b7a9`	link	true	`/test test-fmt`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-ci · 2023-05-18T19:35:44Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: martinkennelly
Once this PR has been reviewed and has the lgtm label, please ask for approval from msherif1234. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 · 2023-05-27T13:28:26Z

PR #354 will continue this work

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com> (cherry picked from commit 601efda)

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

openshift-ci bot requested review from danwinship and dcbw October 24, 2022 12:10

martinkennelly mentioned this pull request Oct 24, 2022

Daemon doesnt remove INF policy #222

Closed

openshift-ci bot assigned msherif1234 Oct 24, 2022

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 27, 2022

msherif1234 and others added 4 commits October 28, 2022 17:23

OCPBUGS-2934: unlink events unix socket during shutdown

19a6ce3

to be to reuse unix socket we have to unlink them when the events container is shuting down Signed-off-by: Mohamed Mahmoud <mmahmoud@redhat.com>

E2E tests: add basic disruption tests

24c6846

Signed-off-by: Martin Kennelly <mkennell@redhat.com>

martinkennelly force-pushed the disruption branch from 953ee96 to 957b7a9 Compare October 30, 2022 15:31

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 30, 2022

msherif1234 reviewed Oct 31, 2022

View reviewed changes

openshift-ci bot assigned andreaskaris Oct 31, 2022

openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 1, 2022

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 1, 2022

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 31, 2023

openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 2, 2023

openshift-ci bot closed this Apr 2, 2023

msherif1234 mentioned this pull request May 18, 2023

OCPBUGS-11888: handle daemonSet pods restart #347

Merged

msherif1234 reopened this May 18, 2023

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request May 27, 2023

SDN-3979: adding disruption e2e coverage

f4f6de2

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 mentioned this pull request May 27, 2023

SDN-3979: E2E extend testing by including disruption test cases #354

Merged

msherif1234 closed this May 27, 2023

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request May 27, 2023

SDN-3979: adding disruption e2e coverage

e343a94

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request May 27, 2023

SDN-3979: adding disruption e2e coverage

37ed3c0

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request May 27, 2023

SDN-3979: adding disruption e2e coverage

604e568

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request May 28, 2023

SDN-3979: adding disruption e2e coverage

32e6c84

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request May 28, 2023

SDN-3979: adding disruption e2e coverage

1f973a4

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request May 28, 2023

SDN-3979: adding disruption e2e coverage

8afbbe0

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request May 28, 2023

SDN-3979: adding disruption e2e coverage

93f7ae6

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request May 29, 2023

SDN-3979: adding disruption e2e coverage

531b7d2

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request May 29, 2023

SDN-3979: adding disruption e2e coverage

0ed6340

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request May 29, 2023

SDN-3979: adding disruption e2e coverage

601efda

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

msherif1234 added a commit to msherif1234/ingress-node-firewall that referenced this pull request Mar 8, 2024

SDN-3979: adding disruption e2e coverage

83feba4

This is continuation of github.com/openshift/pull/221 Signed-off-by: msherif1234 <mmahmoud@redhat.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

E2E tests: add basic disruption tests #221

E2E tests: add basic disruption tests #221

martinkennelly commented Oct 24, 2022

msherif1234 commented Oct 24, 2022

martinkennelly commented Oct 24, 2022

martinkennelly commented Oct 24, 2022

martinkennelly commented Oct 30, 2022

msherif1234 commented Oct 31, 2022

msherif1234 left a comment

msherif1234 Oct 31, 2022 •

edited

Loading

martinkennelly Nov 1, 2022

msherif1234 Oct 31, 2022

msherif1234 Oct 31, 2022

msherif1234 commented Oct 31, 2022 •

edited

Loading

openshift-ci bot commented Oct 31, 2022

msherif1234 commented Oct 31, 2022

martinkennelly commented Nov 1, 2022

openshift-merge-robot commented Nov 1, 2022

openshift-bot commented Jan 31, 2023

openshift-bot commented Mar 2, 2023

openshift-bot commented Apr 2, 2023

openshift-ci bot commented Apr 2, 2023

openshift-ci bot commented May 18, 2023

openshift-ci bot commented May 18, 2023

msherif1234 commented May 27, 2023

E2E tests: add basic disruption tests #221

E2E tests: add basic disruption tests #221

Conversation

martinkennelly commented Oct 24, 2022

msherif1234 commented Oct 24, 2022

martinkennelly commented Oct 24, 2022

martinkennelly commented Oct 24, 2022

martinkennelly commented Oct 30, 2022

msherif1234 commented Oct 31, 2022

msherif1234 left a comment

Choose a reason for hiding this comment

msherif1234 Oct 31, 2022 • edited Loading

Choose a reason for hiding this comment

martinkennelly Nov 1, 2022

Choose a reason for hiding this comment

msherif1234 Oct 31, 2022

Choose a reason for hiding this comment

msherif1234 Oct 31, 2022

Choose a reason for hiding this comment

msherif1234 commented Oct 31, 2022 • edited Loading

openshift-ci bot commented Oct 31, 2022

msherif1234 commented Oct 31, 2022

martinkennelly commented Nov 1, 2022

openshift-merge-robot commented Nov 1, 2022

openshift-bot commented Jan 31, 2023

openshift-bot commented Mar 2, 2023

openshift-bot commented Apr 2, 2023

openshift-ci bot commented Apr 2, 2023

openshift-ci bot commented May 18, 2023

openshift-ci bot commented May 18, 2023

msherif1234 commented May 27, 2023

msherif1234 Oct 31, 2022 •

edited

Loading

msherif1234 commented Oct 31, 2022 •

edited

Loading