Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix strongSwan logging configuration #1184

Merged
merged 1 commit into from Aug 31, 2020
Merged

Conversation

jianjuns
Copy link
Contributor

strongSwan logging configuration is reported as invalid in some OSes
(e.g Ubuntu 16.04) when starting the strongSwan services. It is probably
due to the strongSwan version change after the Antea Docker image is
updated to use Ubuntu 20.04. This commit fixes the strongSwan logging
configuration.

strongSwan logging configuration is reported as invalid in some OSes
(e.g Ubuntu 16.04) when starting the strongSwan services. It is probably
due to the strongSwan version change after the Antea Docker image is
updated to use Ubuntu 20.04. This commit fixes the strongSwan logging
configuration.
@antrea-bot
Copy link
Collaborator

Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e, conformance and network policy tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests.
  • /skip-e2e: to skip e2e tests.
  • /test-conformance: to trigger conformance tests.
  • /skip-conformance: to skip conformance tests.
  • /test-whole-conformance: to trigger all conformance tests on linux.
  • /skip-whole-conformance: to skip all conformance tests on linux.
  • /test-networkpolicy: to trigger networkpolicy tests.
  • /skip-networkpolicy: to skip networkpolicy tests.
  • /test-windows-conformance: to trigger windows conformance tests.
  • /skip-windows-conformance: to skip windows conformance tests.
  • /test-windows-networkpolicy: to trigger windows networkpolicy tests.
  • /skip-windows-networkpolicy: to skip windows networkpolicy tests.
  • /test-hw-offload: to trigger ovs hardware offload test.
  • /skip-hw-offload: to skip ovs hardware offload test.
  • /test-all: to trigger all tests (except whole conformance).
  • /skip-all: to skip all tests (except whole conformance).

@codecov-commenter
Copy link

codecov-commenter commented Aug 31, 2020

Codecov Report

Merging #1184 into master will decrease coverage by 0.07%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1184      +/-   ##
==========================================
- Coverage   56.10%   56.02%   -0.08%     
==========================================
  Files         106      106              
  Lines       11535    11535              
==========================================
- Hits         6472     6463       -9     
- Misses       4496     4502       +6     
- Partials      567      570       +3     
Flag Coverage Δ
#integration-tests 47.37% <ø> (ø)
#unit-tests 41.44% <ø> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/apiserver/certificate/certificate.go 72.83% <0.00%> (-6.18%) ⬇️
pkg/apiserver/storage/ram/watch.go 85.71% <0.00%> (-3.18%) ⬇️
pkg/apiserver/storage/ram/store.go 80.39% <0.00%> (-1.31%) ⬇️

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@abhiraut abhiraut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this format works in earlier releases too?

@jianjuns
Copy link
Contributor Author

works

I think it is not about Antrea release, but strongSwan version used in our image, depended on the Ubuntu version.

@abhiraut
Copy link
Contributor

abhiraut commented Aug 31, 2020

I think it is not about Antrea release, but strongSwan version used in our image, depended on the Ubuntu version.

sorry i meant earlier ubuntu release.. 16.04 :D

I did not try, but I remember not (must be the reason I used the previous format).

@antoninbas
Copy link
Contributor

Should only depend on the Ubuntu version we use to build the Antrea Docker image. But the person who reported the issue mentioned it only happened for certain Node configurations...

@jianjuns
Copy link
Contributor Author

Should only depend on the Ubuntu version we use to build the Antrea Docker image. But the person who reported the issue mentioned it only happened for certain Node configurations...

Right. But I do not understand why our CI e2e tests passed with no failure. Do you have an idea?

@jianjuns
Copy link
Contributor Author

Should only depend on the Ubuntu version we use to build the Antrea Docker image. But the person who reported the issue mentioned it only happened for certain Node configurations...

Right. But I do not understand why our CI e2e tests passed with no failure. Do you have an idea?

Ok I guess might be the IPsec services failed to start, but traffic goes through without IPsec. Let me think about how to check this.

@antoninbas
Copy link
Contributor

We know that IPsec coverage is broken: #1043. The ovs-monitor-ipsec daemon was actually not starting correctly for a very long time (2 releases), and this did not get caught by the e2e tests. My guess is that the tests do not detect that the Agents do not get started properly after updating the manifest to enable IPsec. The issue is still open and assigned to @lzhecheng because we need to fix the tests. @lzhecheng, do you have cycles to work on it this week? If not, we can re-assign.

@antoninbas
Copy link
Contributor

@jianjuns My guess is this needs to be changed: https://github.com/vmware-tanzu/antrea/blob/88a8df32b482ae27252bdee3a1496a34cbcf5b0d/test/e2e/connectivity_test.go#L211-L214. I think it returns immediately because it thinks all Pods are ready, but it must be looking at pre-update Pods, and it does not realize the new Pods are crashing.

@jianjuns
Copy link
Contributor Author

You mean to check IPsec started correctly or not, or a better solution for Python3? I can look at the former.

@jianjuns
Copy link
Contributor Author

/skip-all

@jianjuns jianjuns merged commit 8817bb5 into antrea-io:master Aug 31, 2020
@antoninbas
Copy link
Contributor

We need a generic solution to make sure that the Antrea Agent Pods run correctly after the update to enable IPsec. Currently we do:

  1. apply new YAML
  2. run kubectl rollout status -w
  3. make sure we have the correct number of Pods available

Apparently this is not enough to detect this case where the Pod crashes after a few seconds, so we need to improve it. We don't use the MinReadySeconds field in the DaemonSet spec, so we have to come up with something else.

@jianjuns
Copy link
Contributor Author

Seems to me we can either check the container status, or run IPsec command inside the antrea-ipsec container.

@antoninbas
Copy link
Contributor

I think the solution does not have to be tied to IPsec (or should not). So yes, maybe the solution is to spin for a couple seconds and make sure the Pods stay ready.

@lzhecheng
Copy link
Contributor

lzhecheng commented Sep 1, 2020

@jianjuns , I found this issue yesterday as well. I changed /var/log/strongswan/charon.log to charon without the path field you just added. It still failed and logs said /etc/strongswan.d/ovs.conf containes unexpected .. I am not sure if this solution solves this as well?

@lzhecheng
Copy link
Contributor

lzhecheng commented Sep 1, 2020

@antoninbas , I have some ipv6 e2e test failure to solve. It is great if you can help on it. Thank you!
I noticed that after antrea-ipsec.yml is applied, antrea-agent pods will repeatly switch between crash and running because they try to restart. Finally they turn crash. I am not sure if wait for just a few seconds will work? Maybe after the check, they turns crash again?

@jianjuns
Copy link
Contributor Author

jianjuns commented Sep 1, 2020

@lzhecheng : this PR is to fix the ovs.conf error: #1191

antoninbas pushed a commit to antoninbas/antrea that referenced this pull request Sep 3, 2020
strongSwan logging configuration is reported as invalid in some OSes
(e.g Ubuntu 16.04) when starting the strongSwan services. It is probably
due to the strongSwan version change after the Antea Docker image is
updated to use Ubuntu 20.04. This commit fixes the strongSwan logging
configuration.
antoninbas pushed a commit to antoninbas/antrea that referenced this pull request Sep 3, 2020
strongSwan logging configuration is reported as invalid in some OSes
(e.g Ubuntu 16.04) when starting the strongSwan services. It is probably
due to the strongSwan version change after the Antea Docker image is
updated to use Ubuntu 20.04. This commit fixes the strongSwan logging
configuration.
antoninbas pushed a commit to antoninbas/antrea that referenced this pull request Sep 3, 2020
strongSwan logging configuration is reported as invalid in some OSes
(e.g Ubuntu 16.04) when starting the strongSwan services. It is probably
due to the strongSwan version change after the Antea Docker image is
updated to use Ubuntu 20.04. This commit fixes the strongSwan logging
configuration.
antoninbas pushed a commit that referenced this pull request Sep 3, 2020
strongSwan logging configuration is reported as invalid in some OSes
(e.g Ubuntu 16.04) when starting the strongSwan services. It is probably
due to the strongSwan version change after the Antea Docker image is
updated to use Ubuntu 20.04. This commit fixes the strongSwan logging
configuration.
GraysonWu pushed a commit to GraysonWu/antrea that referenced this pull request Sep 22, 2020
strongSwan logging configuration is reported as invalid in some OSes
(e.g Ubuntu 16.04) when starting the strongSwan services. It is probably
due to the strongSwan version change after the Antea Docker image is
updated to use Ubuntu 20.04. This commit fixes the strongSwan logging
configuration.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants