heal: fix dataplane healing #1362

Merged
merged 1 commit into from
Sep 29, 2022

glazychev-art
Contributor

Signed-off-by: Artem Glazychev <artem.glazychev@xored.com>

Description

Issue link

#1359

How Has This Been Tested?

  • Fixed unit test
  • Tested manually
  • Tested by integration testing
  • Have not tested

Types of changes

  • Bug fix
  • New functionality
  • Documentation
  • Refactoring
  • CI

timeout = 10 * time.Second
tick = 10 * time.Millisecond
requireTimeout = 2 * time.Second
ctxTimeout = 5 * time.Second
Member

How do these timers impact reconvergence time?

Contributor Author

Oh, these were unnecessary changes, made just for testing.
I only wanted to check the timeouts because I changed the check from require.Eventually to require.Never.

Signed-off-by: Artem Glazychev <artem.glazychev@xored.com>
@@ -278,7 +278,7 @@ func TestNSMGRHealEndpoint_DatapathHealthy_CtrlPlaneBroken(t *testing.T) {
 	domain.Nodes[0].NewEndpoint(ctx, nseReg2, sandbox.GenerateTestToken, counter)
 
 	// Should not connect to new NSE
-	require.Eventually(t, func() bool { return counter.UniqueRequests() == 1 }, timeout, tick)
+	require.Never(t, func() bool { return counter.UniqueRequests() > 1 }, time.Second*2, tick)
Member

Why do we need to change this?

Contributor Author

The old condition in the test was invalid. It only worked on CI because of the timeout settings.

This test checks that if the dataplane is alive, we do not select another endpoint. Therefore, the number of unique requests to endpoints should never exceed 1.

In the previous implementation, Eventually returned almost instantly, because the count was already 1, without waiting to see whether a second request would arrive.

Member

@denis-tingaikin denis-tingaikin left a comment

I very much like the channel approach for the datapath healing.

Can we add a test for this scenario to avoid regressions?

@glazychev-art
Contributor Author

@denis-tingaikin
In fact, we already have tests that cover this, but one of them didn't work correctly (fixed by changing Eventually to Never).

@denis-tingaikin denis-tingaikin merged commit 95cee7e into networkservicemesh:main Sep 29, 2022
nsmbot pushed a commit to networkservicemesh/cmd-ipam-vl3 that referenced this pull request Sep 29, 2022
…k@main

PR link: networkservicemesh/sdk#1362

Commit: 95cee7e
Author: Artem Glazychev
Date: 2022-09-29 17:03:43 +0700
Message:
  - heal: fix dataplane healing (#1362)
Signed-off-by: Artem Glazychev <artem.glazychev@xored.com>
Signed-off-by: NSMBot <nsmbot@networkservicmesh.io>
nsmbot pushed a commit with the same message (PR link: networkservicemesh/sdk#1362, commit 95cee7e) to each of the following repositories, all referencing this pull request on Sep 29, 2022:

  • networkservicemesh/cmd-map-ip-k8s
  • networkservicemesh/sdk-kernel
  • networkservicemesh/cmd-cluster-info-k8s
  • networkservicemesh/sdk-k8s
  • networkservicemesh/cmd-nse-remote-vlan
  • networkservicemesh/cmd-nsmgr-proxy
  • networkservicemesh/cmd-registry-memory
  • networkservicemesh/cmd-registry-proxy-dns
  • networkservicemesh/cmd-nse-vfio
  • networkservicemesh/cmd-nsc-init
  • networkservicemesh/cmd-nsmgr