Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds a test case for L7 policy with TLS #1406

Merged
merged 2 commits into from
Mar 7, 2023

Conversation

meyskens
Copy link
Member

This adds a test to check if HTTPS traffic can get inspected in a CNP. To enable this it adds new helper functions to provision secrets, certificates and a public CA bundle into the test setup.

It implements part of the work described in cilium/cilium#23784 (but not all, see comment in issue).

A few parts I am not sure of are in here: like the location of the cabundle in the project so feedback is welcome

@meyskens
Copy link
Member Author

cc @jrajahalme (i have no permissions to add you as a reviewer here)

@ldelossa
Copy link
Contributor

@meyskens looks like you have some linter issues to fix up before tests will pass.

@meyskens meyskens temporarily deployed to ci February 21, 2023 15:57 — with GitHub Actions Inactive
@meyskens
Copy link
Member Author

Noticed, cleaned up the small mistake and added the proper gate to the 2nd TLS test :)

@meyskens
Copy link
Member Author

meyskens commented Feb 22, 2023

Multicluster test failed (don't think this is releated to this PR):

job.batch/cilium-cli condition met
Error from server: Get "https://10.101.248.4:10250/containerLogs/kube-system/cilium-cli-zfvq9/test?timestamps=true": No agent available
Error: Process completed with exit code 1.

Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few parts I am not sure of are in here: like the location of the cabundle in the project so feedback is welcome

Storing it in connectivity/check/assets LGTM.

Some spelling nits inline, but feel free to ignore in case there are no other change requests.

Out of curiosity: how would a successful test run for these tests look? It seems like they are currently skipped on all CI workflows. So I guess we'd need to extend the cilium ClusterRole in order to be able to test these?

connectivity/check/features.go Outdated Show resolved Hide resolved
connectivity/check/secrets.go Outdated Show resolved Hide resolved
connectivity/suite.go Outdated Show resolved Hide resolved
connectivity/suite.go Outdated Show resolved Hide resolved
@meyskens
Copy link
Member Author

@tklauser yes indeed, the helmchart has an option to set the secrets in the RBAC with a flag which we can set to run these in the CI. I discussed with @aanm yesterday to add this install option in this repo also to run them, but it seemed L7 wasn't even enabled in some test scenarios so we were unsure. When this is merged a follow up should be made in cilium/cilium to enable the tests there in a CI run.

@meyskens
Copy link
Member Author

install command needed to make them run ./cilium install --version v1.13.0 --kube-proxy-replacement=partial --helm-set ingressController.enabled=true --helm-set tls.secretsBackend=k8s (technically l7 only + secretsBackend is enough but i added ingress as that is in my usual setup)

@meyskens meyskens requested review from a team as code owners February 24, 2023 10:13
@meyskens meyskens temporarily deployed to ci February 24, 2023 10:13 — with GitHub Actions Inactive
@meyskens meyskens temporarily deployed to ci February 24, 2023 10:22 — with GitHub Actions Inactive
@meyskens
Copy link
Member Author

EKS test seems to fail due to too many clusters 2023-02-24 10:23:39 [✖] AWS::EC2::VPC/VPC: CREATE_FAILED – "Resource handler returned message: \"The maximum number of VPCs has been reached. (Service: Ec2, Status Code: 400, Request ID: b74adfb6-004c-4ec4-b826-b265c0d26d62)\" (RequestToken: 8fe7466e-4158-d711-cf22-6860437920c6, HandlerErrorCode: GeneralServiceException)"

@meyskens meyskens temporarily deployed to ci February 24, 2023 10:29 — with GitHub Actions Inactive
Copy link
Member

@jrajahalme jrajahalme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice work! I'd like to see the curl client verify the certificate to guard against possible bugs. We currently do not support proxy choosing a certificate based on the SNI, but we could in future. Or even simpler, Cilium agent could have a bug and pass a wrong cert to the proxy.

connectivity/check/test.go Outdated Show resolved Hide resolved
// PodToWorldWithTLSIntercept sends an HTTPS request to one.one.one.one (default value of ExternalTarget) from from random client
func PodToWorldWithTLSIntercept(curlOpts ...string) check.Scenario {
s := &podToWorldWithTLSIntercept{
curlOpts: []string{"-k"}, // skip TLS verification as it will be our internal cert
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skipping the certificate verification could hide bugs in our secret/policy handling, maybe? It is a bit more work to pass the cacert to curl, but I think it would be worth it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the ability to via a command inject the ca cert into the pod. It is not the best way IMO however we cannot restart the pod or change it's spec to load a secret as the design assumes it to be there "forever".
Any suggestions to improve the way it does this very welcome!

connectivity/tests/world.go Show resolved Hide resolved
ct.NewTest("client-egress-l7-tls-deny-without-headers").
WithFeatureRequirements(check.RequireFeatureEnabled(check.FeatureL7Proxy)).
WithFeatureRequirements(check.RequireFeatureEnabled(check.FeatureSecretBackendK8s)).
WithCABundleSecret().
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the CA bundle Secret be created automatically when needed by WithCertificate() or WithSecret()? As such it does not take any input and feels like unnecessary boilerplate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CA Bundle here refer to the public CA store and isn't really related to a certificate being created or not.
That is why i made the initial decision not to do that. Happy to change that if you want.

@jrajahalme
Copy link
Member

I tested this locally, and the first invocation failed:

  ℹ️  📜 Applying secret 'cabundle' to namespace 'cilium-test'..
  ℹ️  📜 Applying secret 'externaltarget-tls' to namespace 'cilium-test'..
  ℹ️  📜 Applying CiliumNetworkPolicy 'l7-policy-tls' to namespace 'cilium-test'..
  [-] Scenario [client-egress-l7-tls-deny-without-headers/pod-to-world-with-tls-intercept]
  [.] Action [client-egress-l7-tls-deny-without-headers/pod-to-world-with-tls-intercept/https-to-one.one.one.one-0: cilium-test/client-784c67ffc4-l2zfs (10.244.0.79) -> one.one.one.one-https (one.one.one.one:443)]
  [.] Action [client-egress-l7-tls-deny-without-headers/pod-to-world-with-tls-intercept/https-to-one.one.one.one-1: cilium-test/client2-67754cb6fb-zrzk8 (10.244.0.37) -> one.one.one.one-https (one.one.one.one:443)]
  ❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 -k https://one.one.one.one:443" failed with unexpected exit code: command terminated with exit code 35 (expected 22, found 35)
  📄 No flows recorded during action https-to-one.one.one.one-1
  📄 No flows recorded during action https-to-one.one.one.one-1
  ℹ️  📜 Deleting secret 'cabundle' from namespace 'cilium-test'..
  ℹ️  📜 Deleting secret 'externaltarget-tls' from namespace 'cilium-test'..
  ℹ️  📜 Deleting CiliumNetworkPolicy 'l7-policy-tls' from namespace 'cilium-test'..
[=] Test [client-egress-l7-tls-headers]
..

[=] Skipping Test [dns-only]

[=] Skipping Test [to-fqdns]

📋 Test Report
❌ 1/2 tests failed (1/4 actions), 32 tests skipped, 0 scenarios skipped:
Test [client-egress-l7-tls-deny-without-headers]:
  ❌ client-egress-l7-tls-deny-without-headers/pod-to-world-with-tls-intercept/https-to-one.one.one.one-1: cilium-test/client2-67754cb6fb-zrzk8 (10.244.0.37) -> one.one.one.one-https (one.one.one.one:443)
connectivity test failed: 1 tests failed

Unfortunatly I did not have hubble enabled, so did not see the flows. I was not able to reproduce this, after enabling hubble it worked as expected, also after uninstall/install. I tried it like this:

$ cilium install --helm-set tls.secretsBackend=k8s
$ cilium hubble enable
$ cilium hubble port-forward&
$ cilium connectivity test --test tls --debug
...
  🐛 Executing command [curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 -k https://one.one.one.one:443]
  🐛 command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --output /dev/null --connect-timeout 2 --max-time 10 -k https://one.one.one.one:443" failed as expected: command terminated with exit code 22
  📄 Validating flows for peer cilium-test/client-784c67ffc4-564qg
  🐛 Validating 38 flows against 3 requirements
  ✅ DNS request found at 0
  ✅ DNS response found at 4
  ✅ SYN found at 28
  ✅ L3/L4 Drop not found
  ✅ SYN found at 28
  ✅ L7 Drop found at 34
  ✅ L3/L4 Drop not found
  ✅ Flow validation successful for peer cilium-test/client-784c67ffc4-564qg (first: 0, last: 34, matched: 4)
...

@meyskens meyskens temporarily deployed to ci February 28, 2023 13:02 — with GitHub Actions Inactive
@maintainer-s-little-helper
Copy link

Commit 583f9c6 does not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@meyskens
Copy link
Member Author

meyskens commented Mar 3, 2023

https://github.com/cilium/cilium-cli/actions/runs/4293305139/jobs/7480818048#step:11:4833

This is quite interesting the test seems to flake... not entirely sure why, do no merge until this is fixed

@meyskens
Copy link
Member Author

meyskens commented Mar 3, 2023

Not sure if this was caused by a flake in the test or an issue in the flow matching? Aborting flow matching: context deadline exceeded. Any suggestions?

@meyskens meyskens temporarily deployed to ci March 3, 2023 14:03 — with GitHub Actions Inactive
This adds a test to check if HTTPS traffic can get inspected in a CNP.
To enable this it adds new helper functions to provision secrets,
certificates and a public CA bundle into the test setup.
It also adds the functionality to insert a file into a pod,
this is done for having the certificate chain checked in curl.

Signed-off-by: Maartje Eyskens <maartje.eyskens@isovalent.com>
This enables the Envoy l7 proxy backend in the test setups
as well as the tls secrets backend to be k8s.
This allows the L7 ans TLS connectivity tests to run.

Signed-off-by: Maartje Eyskens <maartje.eyskens@isovalent.com>
@meyskens meyskens temporarily deployed to ci March 6, 2023 09:06 — with GitHub Actions Inactive
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 7, 2023
@tklauser tklauser merged commit eecce64 into cilium:master Mar 7, 2023
@meyskens meyskens deleted the meyskens/egress-tls branch March 7, 2023 11:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants