v1.14 Backports 2023-11-07 #29030

Merged 25 commits on Nov 8, 2023
Changes from 14 commits

Commits
dffcc82
pkg/k8s: do not perform an early return on ipcache errors
aanm Oct 27, 2023
4f0bd64
certmanager: solve CannotRegenerateKey
universam1 Oct 25, 2023
c9e0f8d
gh/workflows: Dump Cilium LB node logs in case of failure
brb Oct 26, 2023
c1f7b61
bpf: lb: fix missing drop reason in reverse_map_l4_port()
julianwiedmann Oct 31, 2023
a6abf27
ci: Bump timeout on ci-runtime privileged workflow
jrajahalme Nov 1, 2023
0eeedc3
k8s: Log Warning for Policies that Support "EndPort"
nathanjsweet Nov 1, 2023
41c63a8
ipsec: Remove dead code for IPsec node encryption
pchaigno Oct 31, 2023
ea132c3
datapath: Move linuxNodeHandler IPsec functions to their own file
pchaigno Nov 2, 2023
ab9e534
ingress: update resources on changed ingress class field
mhofstetter Oct 30, 2023
9469cca
bpf: Add TC_ACT_REDIRECT check for nodeport
sayboras Nov 1, 2023
9a0c93f
services: don't wait for clustermesh to delete local stale backends
giorio94 Oct 23, 2023
fb13a78
services: refactor SyncWithK8sFinished to return stale services
giorio94 Oct 30, 2023
eb47b72
ci: disable envoy tracing in multi-pool workflow
tklauser Nov 3, 2023
d0726cc
doc: Add roadmap for mutual authentication
tgraf Nov 6, 2023
d3d1f02
ipsec: Helper function to get direction from XFRM mark
pchaigno Oct 17, 2023
38b24c8
ipsec: Log node ID and direction whenever possible
pchaigno Oct 17, 2023
13be08b
ipsec: Log node ID in hex format for consistency
pchaigno Oct 17, 2023
3245eab
ipsec: Move getSPIFromXfrmPolicy to pkg/common
pchaigno Oct 25, 2023
2745248
cmd: Add confirmation to encrypt flush command
pchaigno Oct 25, 2023
a446e5a
cmd: New flag to flush only XFRM configs for given SPI
pchaigno Oct 25, 2023
cae4b75
cmd: Refactor XFRM filter function to ease generalization
pchaigno Oct 25, 2023
1643318
cmd: Unit test for the filterXFRMs function
pchaigno Oct 25, 2023
c04e850
ipsec: Move getNodeIDFromXfrmMark to pkg/common
pchaigno Oct 25, 2023
2182659
cmd: New flag to flush only XFRM configs for a given node ID
pchaigno Oct 25, 2023
b963ac9
cmd: Unit test for parseNodeID
pchaigno Oct 25, 2023
1 change: 0 additions & 1 deletion .github/workflows/conformance-multi-pool.yaml
@@ -71,7 +71,6 @@ jobs:
           # non-masquerade CIDRs.
           CILIUM_INSTALL_DEFAULTS="--chart-directory=install/kubernetes/cilium \
             --helm-set=debug.enabled=true \
-            --helm-set=debug.verbose=envoy \
             --helm-set=image.repository=quay.io/${{ env.QUAY_ORGANIZATION_DEV }}/cilium-ci \
             --helm-set=image.useDigest=false \
             --helm-set=image.tag=${SHA} \
2 changes: 1 addition & 1 deletion .github/workflows/conformance-runtime.yaml
@@ -364,7 +364,7 @@ jobs:

       - name: Runtime privileged tests
         if: ${{ matrix.focus == 'privileged' }}
-        timeout-minutes: 20
+        timeout-minutes: 30
         uses: cilium/little-vm-helper@908ab1ff8a596a03cd5221a1f8602dc44c3f906d # v0.0.12
         with:
           provision: 'false'
13 changes: 3 additions & 10 deletions .github/workflows/tests-l4lb.yaml
@@ -122,17 +122,10 @@ jobs:
         run: |
           cd ${{ github.workspace }}/test/nat46x64 && sudo ./test.sh ${{ env.QUAY_ORGANIZATION_DEV }} ${{ steps.vars.outputs.sha }}

-      - name: Fetch cilium-sysdump
-        if: ${{ !success() }}
+      - name: Fetch Cilium Standalone LB logs
+        if: ${{ !success() && steps.lb-test.outcome != 'skipped' }}
         run: |
-          sudo cilium sysdump --output-filename cilium-sysdump-out
-
-      - name: Upload cilium-sysdump
-        uses: actions/upload-artifact@a8a3f3ad30e3422c9c7b888a15615d19a852ae32 # v3.1.3
-        if: ${{ !success() }}
-        with:
-          name: cilium-sysdump-out.zip
-          path: cilium-sysdump-out.zip
+          docker exec -t lb-node docker logs cilium-lb

   commit-status-final:
     if: ${{ always() }}
[file path not captured]
@@ -10,7 +10,12 @@
 Mutual Authentication (Beta)
 ****************************

-.. include:: ../../../beta.rst
+.. note::
+
+    This is a beta feature. Please provide feedback and file a GitHub issue if
+    you experience any problems.
+
+    This feature is still incomplete, see :ref:`mutual_auth_roadmap` below for more details.

 Mutual Authentication and mTLS Background
 #########################################
@@ -102,7 +107,7 @@ the mutual authentication feature:

 Limitations
 ###########
-
+* Cilium Mutual Authentication is still in development and considered beta. Several planned security features have not been implemented yet, see below for details.
 * Cilium's Mutual authentication has only been validated with SPIRE, the production-ready implementation of SPIFFE.
   As Cilium uses SPIFFE APIs, it's possible that other SPIFFE implementations may work.
   However, Cilium is currently only tested with the supplied SPIRE install, and using any other SPIFFE implementation is currently not supported.
@@ -111,3 +116,40 @@ Limitations
 * The current support of mutual authentication only works within a Cilium-managed cluster and is not compatible with an external mTLS solution.


+.. _mutual_auth_roadmap:
+
+Detailed Roadmap Status
+#######################
+
+The following table shows the roadmap status of the mutual authentication feature.
+There are several work items outstanding before the feature is complete from a security model perspective.
+For details, see the `roadmap issue <https://github.com/cilium/cilium/issues/28986>`_.
+
+
++----------------------------------------------+--------+
+| SPIFFE/SPIRE Integration                     | Beta   |
++----------------------------------------------+--------+
+| Authentication API for agent                 | Beta   |
++----------------------------------------------+--------+
+| mTLS handshake between agents                | Beta   |
++----------------------------------------------+--------+
+| Auth cache to enable per-identity handshake  | Beta   |
++----------------------------------------------+--------+
+| CiliumNetworkPolicy support                  | Beta   |
++----------------------------------------------+--------+
+| Integrate with WireGuard                     | TODO   |
++----------------------------------------------+--------+
+| Per-connection handshake                     | TODO   |
++----------------------------------------------+--------+
+| Sync ipcache with auth data                  | TODO   |
++----------------------------------------------+--------+
+| Detailed documentation of security model     | TODO   |
++----------------------------------------------+--------+
+| Conduct penetration test of model            | TODO   |
++----------------------------------------------+--------+
+| Minimize packet drops                        | TODO   |
++----------------------------------------------+--------+
+| Use auth secret for network encryption       | TODO   |
++----------------------------------------------+--------+
+| Review maturity and consider for stable      | TODO   |
++----------------------------------------------+--------+
17 changes: 14 additions & 3 deletions bpf/bpf_overlay.c
@@ -58,7 +58,13 @@ static __always_inline int handle_ipv6(struct __ctx_buff *ctx,
 #ifdef ENABLE_NODEPORT
 	if (!ctx_skip_nodeport(ctx)) {
 		ret = nodeport_lb6(ctx, ip6, *identity, ext_err);
-		if (ret < 0)
+		/* nodeport_lb6() returns with TC_ACT_REDIRECT for
+		 * traffic to L7 LB. Policy enforcement needs to take
+		 * place after L7 LB has processed the packet, so we
+		 * return to stack immediately here with
+		 * TC_ACT_REDIRECT.
+		 */
+		if (ret < 0 || ret == TC_ACT_REDIRECT)
 			return ret;
 	}
 #endif
@@ -312,8 +318,13 @@ static __always_inline int handle_ipv4(struct __ctx_buff *ctx,
 #ifdef ENABLE_NODEPORT
 	if (!ctx_skip_nodeport(ctx)) {
 		int ret = nodeport_lb4(ctx, ip4, ETH_HLEN, *identity, ext_err);
-
-		if (ret < 0)
+		/* nodeport_lb4() returns with TC_ACT_REDIRECT for
+		 * traffic to L7 LB. Policy enforcement needs to take
+		 * place after L7 LB has processed the packet, so we
+		 * return to stack immediately here with
+		 * TC_ACT_REDIRECT.
+		 */
+		if (ret < 0 || ret == TC_ACT_REDIRECT)
 			return ret;
 	}
 #endif
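
The condition added to `handle_ipv4()` and `handle_ipv6()` can be read as a small predicate over the return value of `nodeport_lb4()`/`nodeport_lb6()`. The Go model below is only an illustration and not Cilium code: `shouldReturnEarly` and `tcActRedirect` are names made up for this sketch, the value 7 is `TC_ACT_REDIRECT` as defined in the Linux UAPI header `linux/pkt_cls.h`, and -1 stands in for any negative drop code.

```go
package main

import "fmt"

// tcActRedirect mirrors TC_ACT_REDIRECT from linux/pkt_cls.h.
const tcActRedirect = 7

// shouldReturnEarly models the updated check in bpf_overlay.c:
// negative values are errors/drops, and TC_ACT_REDIRECT means the
// packet was handed to the L7 LB, so later processing must be skipped.
// (Hypothetical helper, for illustration only.)
func shouldReturnEarly(ret int) bool {
	return ret < 0 || ret == tcActRedirect
}

func main() {
	// -1 is an arbitrary negative code, 0 corresponds to TC_ACT_OK.
	for _, ret := range []int{-1, 0, tcActRedirect} {
		fmt.Printf("ret=%d returnEarly=%t\n", ret, shouldReturnEarly(ret))
	}
}
```

Before this change only the `ret < 0` half of the predicate was checked, so a redirect verdict fell through to the code after the nodeport block.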
5 changes: 2 additions & 3 deletions bpf/lib/lb.h
@@ -365,9 +365,8 @@ static __always_inline int reverse_map_l4_port(struct __ctx_buff *ctx, __u8 next
 	int ret;

 	/* Port offsets for UDP and TCP are the same */
-	ret = l4_load_port(ctx, l4_off + TCP_SPORT_OFF, &old_port);
-	if (IS_ERR(ret))
-		return ret;
+	if (l4_load_port(ctx, l4_off + TCP_SPORT_OFF, &old_port) < 0)
+		return DROP_INVALID;

 	if (port != old_port) {
 #ifdef ENABLE_SCTP
63 changes: 48 additions & 15 deletions daemon/cmd/state.go
@@ -14,13 +14,16 @@ import (
 	"github.com/sirupsen/logrus"
 	"github.com/vishvananda/netlink"
 	k8serrors "k8s.io/apimachinery/pkg/api/errors"
+	"k8s.io/apimachinery/pkg/util/sets"

 	"github.com/cilium/cilium/pkg/controller"
 	"github.com/cilium/cilium/pkg/endpoint"
 	"github.com/cilium/cilium/pkg/ipam"
+	"github.com/cilium/cilium/pkg/k8s"
 	slim_corev1 "github.com/cilium/cilium/pkg/k8s/slim/k8s/api/core/v1"
 	"github.com/cilium/cilium/pkg/k8s/watchers/resources"
 	"github.com/cilium/cilium/pkg/labels"
+	"github.com/cilium/cilium/pkg/lock"
 	"github.com/cilium/cilium/pkg/logging/logfields"
 	"github.com/cilium/cilium/pkg/maps/ctmap"
 	"github.com/cilium/cilium/pkg/maps/lxcmap"
@@ -455,30 +458,60 @@ func (d *Daemon) initRestore(restoredEndpoints *endpointRestoreState, endpointsR

 		go func() {
 			if d.clientset.IsEnabled() {
+				// Configure the controller which removes any leftover Kubernetes
+				// services that may have been deleted while Cilium was not
+				// running. Once this controller succeeds, because it has no
+				// RunInterval specified, it will not run again unless updated
+				// elsewhere. This means that if, for instance, a user manually
+				// adds a service via the CLI into the BPF maps, it will
+				// not be cleaned up by the daemon until it restarts.
+				syncServices := func(localOnly bool) {
+					d.controllers.UpdateController(
+						"sync-lb-maps-with-k8s-services",
+						controller.ControllerParams{
+							DoFunc: func(ctx context.Context) error {
+								var localServices sets.Set[k8s.ServiceID]
+								if localOnly {
+									localServices = d.k8sWatcher.K8sSvcCache.LocalServices()
+								}
+
+								stale, err := d.svc.SyncWithK8sFinished(localOnly, localServices)
+
+								// Always process the list of stale services, regardless
+								// of whether an error was returned.
+								swg := lock.NewStoppableWaitGroup()
+								for _, svc := range stale {
+									d.k8sWatcher.K8sSvcCache.EnsureService(svc, swg)
+								}
+
+								swg.Stop()
+								swg.Wait()
+
+								return err
+							},
+							Context: d.ctx,
+						},
+					)
+				}
+
 				// Also wait for all shared services to be synchronized with the
 				// datapath before proceeding.
 				if d.clustermesh != nil {
+					// Do a first pass synchronizing only the services which are not
+					// marked as global, so that we can drop their stale backends
+					// without needing to wait for full clustermesh synchronization.
+					syncServices(true /* only local services */)
+
 					err := d.clustermesh.ServicesSynced(d.ctx)
 					if err != nil {
 						log.WithError(err).Fatal("timeout while waiting for all clusters to be locally synchronized")
 					}
 					log.Debug("all clusters have been correctly synchronized locally")
 				}
-				// Start controller which removes any leftover Kubernetes
-				// services that may have been deleted while Cilium was not
-				// running. Once this controller succeeds, because it has no
-				// RunInterval specified, it will not run again unless updated
-				// elsewhere. This means that if, for instance, a user manually
-				// adds a service via the CLI into the BPF maps, that it will
-				// not be cleaned up by the daemon until it restarts.
-				controller.NewManager().UpdateController("sync-lb-maps-with-k8s-services",
-					controller.ControllerParams{
-						DoFunc: func(ctx context.Context) error {
-							return d.svc.SyncWithK8sFinished(d.k8sWatcher.K8sSvcCache.EnsureService)
-						},
-						Context: d.ctx,
-					},
-				)
+
+				// Now that possible global services have also been synchronized, let's
+				// do a final pass to remove the remaining stale services and backends.
+				syncServices(false /* all services */)
 			}
 		}()
 	} else {
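
The two-pass cleanup introduced in `daemon/cmd/state.go` (local services first, everything else after clustermesh has synchronized) can be sketched in isolation. The following is a simplified model under stated assumptions, not the Cilium implementation: the `service` type, the `findStale` helper and the service names are hypothetical, and the real `SyncWithK8sFinished` operates on the load-balancer maps rather than on slices.

```go
package main

import (
	"fmt"
	"sort"
)

// service is a hypothetical stand-in for a service restored from the
// datapath after an agent restart.
type service struct {
	name   string
	global bool // true if backends may also come from remote clusters
}

// findStale returns the names of restored services that are no longer
// known. With localOnly set, global services are skipped, because their
// state is only complete once clustermesh has synchronized.
func findStale(restored []service, known map[string]bool, localOnly bool) []string {
	var stale []string
	for _, s := range restored {
		if localOnly && s.global {
			continue
		}
		if !known[s.name] {
			stale = append(stale, s.name)
		}
	}
	sort.Strings(stale)
	return stale
}

func main() {
	restored := []service{
		{name: "default/a", global: false},
		{name: "default/b", global: true},
		{name: "default/c", global: false},
	}
	known := map[string]bool{"default/c": true}

	// First pass: runs immediately, considers non-global services only.
	fmt.Println("first pass:", findStale(restored, known, true))

	// ... the agent would wait for clustermesh synchronization here ...

	// Final pass: considers every restored service.
	fmt.Println("final pass:", findStale(restored, known, false))
}
```

The point of the split is that stale backends of purely local services no longer have to wait for remote clusters to become reachable before they are dropped.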
[file path not captured]
@@ -13,4 +13,6 @@ spec:
   dnsNames:
   - "*.hubble-relay.cilium.io"
   duration: {{ printf "%dh0m0s" (mul .Values.hubble.tls.auto.certValidityDuration 24) }}
+  privateKey:
+    rotationPolicy: Always
 {{- end }}
[file path not captured]
@@ -22,4 +22,6 @@ spec:
   {{- end }}
   {{- end }}
   duration: {{ printf "%dh0m0s" (mul .Values.hubble.tls.auto.certValidityDuration 24) }}
+  privateKey:
+    rotationPolicy: Always
 {{- end }}
[file path not captured]
@@ -23,4 +23,6 @@ spec:
   {{- end }}
   {{- end }}
   duration: {{ printf "%dh0m0s" (mul .Values.hubble.tls.auto.certValidityDuration 24) }}
+  privateKey:
+    rotationPolicy: Always
 {{- end }}
[file path not captured]
@@ -13,4 +13,6 @@ spec:
   dnsNames:
   - "*.hubble-ui.cilium.io"
   duration: {{ printf "%dh0m0s" (mul .Values.hubble.tls.auto.certValidityDuration 24) }}
+  privateKey:
+    rotationPolicy: Always
 {{- end }}
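
For readers unfamiliar with the `duration` expression in these templates: `printf "%dh0m0s" (mul <value> 24)` multiplies the configured value by 24 and renders it as an hour-based duration string, which suggests the value is expressed in days. The Go sketch below reproduces that arithmetic; `certDuration` is a hypothetical helper and 1095 is just an example input, neither is taken from the chart.

```go
package main

import (
	"fmt"
	"time"
)

// certDuration mirrors the template expression
// printf "%dh0m0s" (mul days 24). (Hypothetical helper.)
func certDuration(days int) string {
	return fmt.Sprintf("%dh0m0s", days*24)
}

func main() {
	s := certDuration(1095) // example value, assumed to be in days

	// The rendered string is a valid Go duration, so it can be parsed
	// back to check the arithmetic.
	d, err := time.ParseDuration(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(s, d.Hours())
}
```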
45 changes: 25 additions & 20 deletions operator/pkg/ingress/ingress.go
@@ -45,6 +45,7 @@ const (
 type ingressAddedEvent struct {
 	ingress *slim_networkingv1.Ingress
 }
+
 type ingressUpdatedEvent struct {
 	oldIngress *slim_networkingv1.Ingress
 	newIngress *slim_networkingv1.Ingress
@@ -281,38 +282,43 @@ func (ic *Controller) handleIngressAddedEvent(event ingressAddedEvent) error {
 }

 func (ic *Controller) handleIngressUpdatedEvent(event ingressUpdatedEvent) error {
-	if !ic.isCiliumIngressEntry(event.newIngress) {
+	oldIngressClassCilium := ic.isCiliumIngressEntry(event.oldIngress)
+	newIngressClassCilium := ic.isCiliumIngressEntry(event.newIngress)
+
+	oldLBModeDedicated := ic.isEffectiveLoadbalancerModeDedicated(event.oldIngress)
+	newLBModeDedicated := ic.isEffectiveLoadbalancerModeDedicated(event.newIngress)
+
+	if !oldIngressClassCilium && !newIngressClassCilium {
 		return nil
 	}

 	ic.secretManager.Add(event)

-	// Perform clean up if there is change in LB mode
-	oldLBMode := ic.isEffectiveLoadbalancerModeDedicated(event.oldIngress)
-	newLBMode := ic.isEffectiveLoadbalancerModeDedicated(event.newIngress)
+	// Cleanup

-	// If the ingress is being switched from dedicated to shared, we need to
-	// clean up the dedicated resources (service, endpoints, envoy config)
-	if oldLBMode && !newLBMode {
+	if oldLBModeDedicated && (!newLBModeDedicated || (oldIngressClassCilium && !newIngressClassCilium)) {
+		// Delete dedicated resources (service, endpoints, CEC)
+		// - if ingress class changed from "cilium" to something else
+		// - if the ingress mode is being switched from dedicated to shared
 		if err := ic.deleteResources(event.oldIngress); err != nil {
 			log.WithError(err).Warn("Failed to delete resources for ingress")
 			return err
 		}
-	}
-
-	// If the ingress is being switched from shared to dedicated, we need to update
-	// shared CiliumEnvoyConfig.
-	if !oldLBMode && newLBMode {
-		err := ic.ensureResources(event.newIngress, true)
-		if err != nil {
+	} else if !oldLBModeDedicated && (newLBModeDedicated || (oldIngressClassCilium && !newIngressClassCilium)) {
+		// Update shared CiliumEnvoyConfig
+		// - if ingress class changed from "cilium" to something else
+		// - if the ingress mode is being switched from shared to dedicated
+		if err := ic.ensureResources(event.newIngress, true); err != nil {
 			return err
 		}
 	}

-	err := ic.ensureResources(event.newIngress, false)
-	if err != nil {
-		return err
+	if !newIngressClassCilium {
+		// skip further processing for non Cilium Ingresses
+		return nil
 	}
-	return nil
+
+	return ic.ensureResources(event.newIngress, false)
 }

 func (ic *Controller) handleIngressDeletedEvent(event ingressDeletedEvent) error {
@@ -597,8 +603,7 @@ func (ic *Controller) regenerate(ing *slim_networkingv1.Ingress, forceShared boo
 		translator = ic.sharedTranslator
 		for _, k := range ic.ingressStore.ListKeys() {
 			item, _ := ic.getByKey(k)
-			if !ic.isCiliumIngressEntry(item) || ic.isEffectiveLoadbalancerModeDedicated(item) ||
-				ing.GetDeletionTimestamp() != nil {
+			if !ic.isCiliumIngressEntry(item) || ic.isEffectiveLoadbalancerModeDedicated(item) || ing.GetDeletionTimestamp() != nil {
 				continue
 			}
 			m.HTTP = append(m.HTTP, ingestion.Ingress(*item, ic.defaultSecretNamespace, ic.defaultSecretName)...)
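
The cleanup branch of the reworked `handleIngressUpdatedEvent` depends on four booleans, which makes it easy to misread in diff form. The sketch below restates just that decision as a pure function so the cases can be checked in isolation. It is an illustrative model only: `action`, `cleanupAction` and the constant names are invented here, and the real handler additionally updates the secret manager and calls `ensureResources` for the new Ingress.

```go
package main

import "fmt"

// action names the cleanup step chosen for an Ingress update.
// (Hypothetical type, for illustration only.)
type action int

const (
	noop            action = iota // nothing to clean up
	deleteDedicated               // delete dedicated service, endpoints, CEC
	refreshShared                 // regenerate the shared CiliumEnvoyConfig
)

var actionNames = [...]string{"noop", "deleteDedicated", "refreshShared"}

func (a action) String() string { return actionNames[a] }

// cleanupAction mirrors the conditions of the updated handler.
func cleanupAction(oldCilium, newCilium, oldDedicated, newDedicated bool) action {
	if !oldCilium && !newCilium {
		return noop // never managed by Cilium, ignore the update
	}
	classDropped := oldCilium && !newCilium
	switch {
	case oldDedicated && (!newDedicated || classDropped):
		return deleteDedicated
	case !oldDedicated && (newDedicated || classDropped):
		return refreshShared
	}
	return noop
}

func main() {
	// Class changed away from "cilium" while in dedicated mode.
	fmt.Println(cleanupAction(true, false, true, true))
	// Class changed away from "cilium" while in shared mode.
	fmt.Println(cleanupAction(true, false, false, false))
	// Mode switched from shared to dedicated, class unchanged.
	fmt.Println(cleanupAction(true, true, false, true))
}
```

The case the old code missed is the first one: an Ingress whose class moves away from "cilium" used to be skipped entirely, which left its resources behind.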