Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Kind-specific manifest and scripts #3413

Merged
merged 3 commits into from Mar 15, 2022

Conversation

antoninbas
Copy link
Contributor

@antoninbas antoninbas commented Mar 7, 2022

Use OVS Kernel datapath for Kind clusters

It seems that we no longer need to use the netdev datapath. In
particular, the Linux VM used for Docker Desktop on macOS now includes
the openvswitch and wireguard Kernel modules.

As a consequence:

  • we remove the Kind-specific YAML manifest and the install_cni_kind script
  • we remove the kind-fix-networking.sh script
  • we give a warning to the user if they try to use the netdev datapath,
    since it is not fully supported and there are no good use cases anymore
  • we enable all tests (except for the IPsec tests) which were previously
    skipped on Kind (requires increasing the test timeout by 10 mins)
  • we no longer need support for wireguard-go (userspace implementation
    of WireGuard)
  • we remove the coreDNS hack in e2e framework introduced for netdev
    datapath

@codecov-commenter
Copy link

codecov-commenter commented Mar 7, 2022

Codecov Report

Merging #3413 (703f2ed) into main (efca1bb) will decrease coverage by 8.53%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3413      +/-   ##
==========================================
- Coverage   63.53%   55.00%   -8.54%     
==========================================
  Files         268      374     +106     
  Lines       26900    41447   +14547     
==========================================
+ Hits        17091    22798    +5707     
- Misses       7944    16253    +8309     
- Partials     1865     2396     +531     
Flag Coverage Δ
integration-tests 35.82% <ø> (?)
kind-e2e-tests 55.81% <ø> (+4.46%) ⬆️
unit-tests 42.47% <ø> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/agent/util/ethtool/ethtool_linux.go 0.00% <0.00%> (-70.00%) ⬇️
pkg/agent/openflow/meters_linux.go 36.36% <0.00%> (-18.19%) ⬇️
pkg/agent/flowexporter/exporter/certificate.go 33.33% <0.00%> (-16.67%) ⬇️
pkg/agent/flowexporter/exporter/exporter.go 74.51% <0.00%> (-6.78%) ⬇️
...kg/agent/flowexporter/connections/conntrack_ovs.go 72.72% <0.00%> (-4.85%) ⬇️
pkg/controller/grouping/controller.go 64.65% <0.00%> (-2.59%) ⬇️
pkg/agent/controller/networkpolicy/fqdn.go 73.76% <0.00%> (-2.34%) ⬇️
pkg/ovs/ovsctl/appctl.go 42.39% <0.00%> (-2.18%) ⬇️
...g/agent/cniserver/interface_configuration_linux.go 15.80% <0.00%> (-0.35%) ⬇️
antrea/pkg/agent/proxy/endpoints.go 0.00% <0.00%> (ø)
... and 151 more

@antoninbas antoninbas force-pushed the remove-kind-specific-manifest branch 4 times, most recently from 7aecad7 to d97175c Compare March 8, 2022 22:09
@antoninbas antoninbas marked this pull request as ready for review March 8, 2022 22:09
@antoninbas
Copy link
Contributor Author

This PR depends on #3421 to avoid some Egress e2e test failures.

@antoninbas
Copy link
Contributor Author

@jianjuns @xliuxu I have kept the --wireguard-go flag for generate-manifest.sh for now. Do you think I should remove it now that we can just use the Wireguard kernel module for Kind clusters?

@antoninbas antoninbas force-pushed the remove-kind-specific-manifest branch from d97175c to ee83b82 Compare March 8, 2022 23:00
@xliuxu
Copy link
Contributor

xliuxu commented Mar 9, 2022

@antoninbas I think it is safe to remove the wireguard related manifests by removing the provider check in wireguard_test.go.

@tnqn tnqn added the action/release-note Indicates a PR that should be included in release notes. label Mar 9, 2022
@tnqn tnqn added this to the Antrea v1.6 release milestone Mar 9, 2022
ci/kind/test-e2e-kind.sh Outdated Show resolved Hide resolved
docs/kind.md Outdated Show resolved Hide resolved
docs/kind.md Outdated Show resolved Hide resolved
docs/kind.md Outdated Show resolved Hide resolved
test/e2e/traceflow_test.go Show resolved Hide resolved
}
}
assert.Contains(t, endpoints, fmt.Sprintf("%s:%d", nodeIP, apis.WireGuardListenPort))
}
}
Copy link
Contributor

@xliuxu xliuxu Mar 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests can pass with the above changes. I can work on a follow-up PR to ensure the traffic went through the tunnel without using the wg command line.

@antoninbas
Copy link
Contributor Author

@tnqn @xliuxu I have removed support for wireguard-go in the last commit since my understanding is that it is not needed any more

Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to see if we clean up this!

tnqn
tnqn previously approved these changes Mar 10, 2022
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one minor comment

docs/kind.md Show resolved Hide resolved
@antoninbas
Copy link
Contributor Author

@tnqn thanks for the reviews. I just finished fixing all the tests, and I squashed my commits.

@tnqn
Copy link
Member

tnqn commented Mar 10, 2022

@antoninbas I found two permanent failures:

  1. FAIL: TestEgress/testEgressUpdateNodeSelector (11.04s) in E2e tests on a Kind cluster on Linux with AntreaProxy all Service support
  2. FAIL: TestIPSec/testIPSecTunnelConnectivity (260.82s) in E2e tests on a Kind cluster on Linux (hybrid)

The first one was caused by an implementation bug and it's great to see this PR increases the test coverage and exposes the bug. I have created #3430 to fix it.

@antoninbas
Copy link
Contributor Author

@tnqn Thanks, I'll rebase when your PR is merged. For IPsec I have decided to disable the test cases in Kind for now. There are a couple of reasons for that:

  • the tests take too long to run 2 * 250s so almost 10 minutes
  • the test fails in the Kind Hybrid cluster: it seems to be because when 2 different docker bridges are used (this is how we "simulate" the hybrid encap case), docker still configures SNAT (masquerade) for cross-bridge traffic. This breaks the IPsec key exchange. I don't want to mess with the docker iptables rules so I'm skipping the tests.

We may want to consider adding a new Kind CI job later on which will run just the IPsec tests. It will take care of both issues above.

It seems that we no longer need to use the netdev datapath. In
particular, the Linux VM used for Docker Desktop on macOS now includes
the openvswitch and wireguard Kernel modules.

As a consequence:
* we remove the Kind-specific YAML manifest and the install_cni_kind script
* we remove the kind-fix-networking.sh script
* we give a warning to the user if they try to use the netdev datapath,
  since it is not fully supported and there are no good use cases anymore
* we enable all tests (except for the IPsec tests) which were previously
  skipped on Kind (requires increasing the test timeout by 10 mins)
* we no longer need support for wireguard-go (userspace implementation
  of WireGuard)
* we remove the coreDNS hack in e2e framework introduced for netdev
  datapath

Signed-off-by: Antonin Bas <abas@vmware.com>
@antoninbas antoninbas force-pushed the remove-kind-specific-manifest branch from f82fef5 to 4a4e079 Compare March 11, 2022 01:47
@antoninbas
Copy link
Contributor Author

/test-all

tnqn
tnqn previously approved these changes Mar 11, 2022
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@antoninbas
Copy link
Contributor Author

/test-e2e
some Flow Aggregator tests failing

@antoninbas
Copy link
Contributor Author

/test-e2e

@antoninbas antoninbas force-pushed the remove-kind-specific-manifest branch from 74be804 to 54a7cb0 Compare March 11, 2022 07:50
@antoninbas
Copy link
Contributor Author

/test-e2e

tnqn
tnqn previously approved these changes Mar 11, 2022
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tnqn
Copy link
Member

tnqn commented Mar 11, 2022

../test/e2e/fixtures.go:236:2: faClusterIP declared but not used

Signed-off-by: Antonin Bas <abas@vmware.com>
@antoninbas
Copy link
Contributor Author

/test-all

tnqn
tnqn previously approved these changes Mar 11, 2022
To avoid the following errors in the Agent logs:

```
E0311 22:19:33.901740       8 exporter.go:211] "Error when initializing flow
exporter" err="cannot retrieve CA cert: error getting ConfigMap
flow-aggregator-ca: configmaps \"flow-aggregator-ca\" is forbidden: User
\"system:serviceaccount:kube-system:antrea-agent\" cannot get resource
\"configmaps\" in API group \"\" in the namespace \"flow-aggregator\""
```

Signed-off-by: Antonin Bas <abas@vmware.com>
@antoninbas
Copy link
Contributor Author

/test-e2e

@antoninbas
Copy link
Contributor Author

/skip-networkpolicy
/skip-conformance

@antoninbas antoninbas merged commit a891eff into antrea-io:main Mar 15, 2022
@antoninbas antoninbas deleted the remove-kind-specific-manifest branch March 15, 2022 18:40
@tnqn tnqn mentioned this pull request Mar 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants