dns: Set --tofqdns-min-ttl to zero by default
This commit changes the default value of --tofqdns-min-ttl from 3600
seconds to zero. This means Cilium honors the TTLs returned from the
upstream DNS server by default. Explicitly configure --tofqdns-min-ttl
if you need to preserve the previous behavior that lets applications
create new connections within the pre-defined --tofqdns-min-ttl time
window after the DNS TTL has expired.

The --tofqdns-min-ttl setting is no longer needed because the poll-based DNS
implementation has been replaced by the proxy-based implementation.
Having the minimum TTL set to 1 hour by default adds unnecessary CPU /
memory overhead, as Cilium ends up keeping track of expired DNS info.
This is especially problematic when the upstream DNS server returns
responses with short TTLs and many unique IP addresses.
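
As a minimal sketch of the semantics described above, a minimum TTL acts as a
lower bound on the TTL from the upstream DNS response. This is an illustration
only, not Cilium's actual implementation; `effectiveTTL` is a hypothetical
helper name.

```go
package main

import "fmt"

// effectiveTTL returns the TTL, in seconds, that a DNS record would be kept
// for under a minimum-TTL setting. Hypothetical helper for illustration; it is
// not taken from the Cilium code base.
func effectiveTTL(upstreamTTL, minTTL int) int {
	if upstreamTTL < minTTL {
		// The upstream TTL is shorter than the configured minimum,
		// so the minimum overrides it.
		return minTTL
	}
	// Otherwise the upstream TTL is honored as-is.
	return upstreamTTL
}

func main() {
	fmt.Println(effectiveTTL(60, 0))    // new default (0): prints 60
	fmt.Println(effectiveTTL(60, 3600)) // previous default (3600): prints 3600
}
```

With a minimum of zero, every upstream TTL passes through unchanged, which is
why the new default means "honor the upstream TTL".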

Co-authored-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
michi-covalent and joestringer committed Mar 8, 2023
1 parent a91e20a commit 72d95c7
Showing 10 changed files with 45 additions and 60 deletions.
2 changes: 1 addition & 1 deletion Documentation/cmdref/cilium-agent.md

39 changes: 21 additions & 18 deletions Documentation/contributing/development/debugging.rst
@@ -274,25 +274,28 @@ included.
 Unintended DNS Policy Drops
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

-``toFQDNSs`` policy enforcement relies on the source POD performing a DNS query
-before using an IP address returned in the DNS response. Sometimes PODs may hold
+``toFQDNSs`` policy enforcement relies on the source pod performing a DNS query
+before using an IP address returned in the DNS response. Sometimes pods may hold
 on to a DNS response and start new connections to the same IP address at a later
-time. This may trigger policy drops if the DNS response has expired as
-requested by the DNS server in the time-to-live (TTL) value in the
-response. When DNS is used for service load balancing the advertised TTL value
-may be short (e.g., 60 seconds). To allow for reasonable POD behavior without
-unintended policy drops Cilium employs a configurable minimum DNS TTL value via
-``--tofqdns-min-ttl`` which defaults to 3600 seconds. This setting overrides
-short TTLs and allows the POD to use the IP address in the DNS response for one
-hour. Existing connections also keep the IP address as allowed in the
-policy. Any new connections opened by the POD using the same IP address without
-performing a new DNS query after the (possibly extended) DNS TTL has expired
-can be dropped by Cilium policy enforcement. To allow PODs to use the DNS
-response after TTL expiry for new connections a command line option
-``--tofqdns-idle-connection-grace-period`` may be used to keep the
-IP-address/name mapping valid in the policy for an extended time after DNS TTL
-expiry. This option takes effect only if the POD has opened at least one
-connection during the DNS TTL period.
+time. This may trigger policy drops if the DNS response has expired as requested
+by the DNS server in the time-to-live (TTL) value in the response. When DNS is
+used for service load balancing the advertised TTL value may be short (e.g., 60
+seconds).
+
+Cilium honors the TTL values returned by the DNS server by default, but you can
+override them by setting a minimum TTL using ``--tofqdns-min-ttl`` flag. This
+setting overrides short TTLs and allows the pod to use the IP address in the DNS
+response for a longer duration. Existing connections also keep the IP address as
+allowed in the policy.
+
+Any new connections opened by the pod using the same IP address without
+performing a new DNS query after the (possibly extended) DNS TTL has expired are
+dropped by Cilium policy enforcement. To allow pods to use the DNS response
+after TTL expiry for new connections, a command line option
+``--tofqdns-idle-connection-grace-period`` may be used to keep the IP address /
+name mapping valid in the policy for an extended time after DNS TTL expiry. This
+option takes effect only if the pod has opened at least one connection during
+the DNS TTL period.

 Datapath Plumbing
 ~~~~~~~~~~~~~~~~~
4 changes: 2 additions & 2 deletions Documentation/helm-values.rst

6 changes: 6 additions & 0 deletions Documentation/operations/upgrade.rst
@@ -311,6 +311,12 @@ Annotations:

 1.14 Upgrade Notes
 ------------------
+* The default value of ``--tofqdns-min-ttl`` has changed from 3600 seconds to
+  zero. This means Cilium DNS network policy now honors the TTLs returned from
+  the upstream DNS server by default. Explicitly configure ``--tofqdns-min-ttl``
+  if you need to preserve the previous DNS network policy behavior that lets
+  applications create new connections after the TTL specified by the upstream
+  DNS server is expired.

 Added Metrics
 ~~~~~~~~~~~~~
34 changes: 2 additions & 32 deletions Documentation/security/policy/language.rst
@@ -36,8 +36,8 @@ can talk to each other. Layer 3 policies can be specified using the following me

 * `DNS based`: Selects remote, non-cluster, peers using DNS names converted to
   IPs via DNS lookups. It shares all limitations of the `CIDR based` rules
-  above. DNS information is acquired by routing DNS traffic via a proxy, or
-  polling for listed DNS targets. DNS TTLs are respected.
+  above. DNS information is acquired by routing DNS traffic via a proxy.
+  DNS TTLs are respected.

 .. _Labels based:

@@ -582,39 +582,9 @@ Example
 .. literalinclude:: ../../../examples/policies/l3/fqdn/fqdn.json


-.. _DNS and Long-Lived Connections:
-
-Managing Long-Lived Connections & Minimum DNS Cache Times
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Often, an application may keep a connection open for longer than the DNS TTL.
-Without further DNS queries the remote IP used in the long-lived connection may
-expire out of the DNS cache. When this occurs, existing connections established
-before the TTL expires will continue to be allowed until they terminate. Unused
-IPs will no longer be allowed, however, even when from the same DNS lookup as
-an in-use IP. This tracking is per-endpoint per-IP and DNS entries in this
-state will be have ``source: connection`` with a single IP listed within the
-``cilium fqdn cache list`` output.
-
-A minimum TTL is used to ensure a lower time bound to DNS data expiration, and
-IPs allowed by a ``toFQDNs`` rule will be allowed at least this long It can be
-configured with the ``--tofqdns-min-ttl`` CLI option. The value is in integer
-seconds and must be 1 or more, the default is 1 hour.
-
-Some care needs to be taken when setting ``--tofqdns-min-ttl`` with DNS data
-that returns many distinct IPs over time. A long TTL will keep each IP cached
-long after the related connections have terminated. Large numbers of IPs each
-have corresponding Security Identities and too many may slow down Cilium policy
-regeneration.
-
 Managing Short-Lived Connections & Maximum IPs per FQDN/endpoint
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-The minimum TTL for DNS entries in the cache is deliberately long with 1 hour
-as the default. This is done to accommodate long-lived persistent connections.
-On the other end of the spectrum are workloads that perform short-lived
-connections in repetition to FQDNs that are backed by a large number of IP
-addresses (e.g. AWS S3).
-
 Many short-lived connections can grow the number of IPs mapping to an FQDN
 quickly. In order to limit the number of IP addresses that map a particular
 FQDN, each FQDN has a per-endpoint max capacity of IPs that will be retained
2 changes: 1 addition & 1 deletion daemon/cmd/daemon_main.go
@@ -835,7 +835,7 @@ func initializeFlags() {
 	flags.MarkHidden(option.CMDRef)
 	option.BindEnv(Vp, option.CMDRef)

-	flags.Int(option.ToFQDNsMinTTL, 0, fmt.Sprintf("The minimum time, in seconds, to use DNS data for toFQDNs policies. (default %d )", defaults.ToFQDNsMinTTL))
+	flags.Int(option.ToFQDNsMinTTL, defaults.ToFQDNsMinTTL, "The minimum time, in seconds, to use DNS data for toFQDNs policies")
 	option.BindEnv(Vp, option.ToFQDNsMinTTL)

 	flags.Int(option.ToFQDNsProxyPort, 0, "Global port on which the in-agent DNS proxy should listen. Default 0 is a OS-assigned port.")
2 changes: 1 addition & 1 deletion install/kubernetes/cilium/README.md
@@ -181,7 +181,7 @@ contributors across the globe, there is almost always someone available to help.
 | dnsProxy.endpointMaxIpPerHostname | int | `50` | Maximum number of IPs to maintain per FQDN name for each endpoint. |
 | dnsProxy.idleConnectionGracePeriod | string | `"0s"` | Time during which idle but previously active connections with expired DNS lookups are still considered alive. |
 | dnsProxy.maxDeferredConnectionDeletes | int | `10000` | Maximum number of IPs to retain for expired DNS lookups with still-active connections. |
-| dnsProxy.minTtl | int | `3600` | The minimum time, in seconds, to use DNS data for toFQDNs policies. |
+| dnsProxy.minTtl | int | `0` | The minimum time, in seconds, to use DNS data for toFQDNs policies. If the upstream DNS server returns a DNS record with a shorter TTL, Cilium overwrites the TTL with this value. Setting this value to zero means that Cilium will honor the TTLs returned by the upstream DNS server. |
 | dnsProxy.preCache | string | `""` | DNS cache data at this path is preloaded on agent startup. |
 | dnsProxy.proxyPort | int | `0` | Global port on which the in-agent DNS proxy should listen. Default 0 is a OS-assigned port. |
 | dnsProxy.proxyResponseMaxDelay | string | `"100ms"` | The maximum time the DNS proxy holds an allowed DNS response before sending it along. Responses are sent as soon as the datapath is updated with the new IP information. |
7 changes: 5 additions & 2 deletions install/kubernetes/cilium/values.yaml

7 changes: 5 additions & 2 deletions install/kubernetes/cilium/values.yaml.tmpl
@@ -2558,8 +2558,11 @@ dnsProxy:
   idleConnectionGracePeriod: 0s
   # -- Maximum number of IPs to retain for expired DNS lookups with still-active connections.
   maxDeferredConnectionDeletes: 10000
-  # -- The minimum time, in seconds, to use DNS data for toFQDNs policies.
-  minTtl: 3600
+  # -- The minimum time, in seconds, to use DNS data for toFQDNs policies. If
+  # the upstream DNS server returns a DNS record with a shorter TTL, Cilium
+  # overwrites the TTL with this value. Setting this value to zero means that
+  # Cilium will honor the TTLs returned by the upstream DNS server.
+  minTtl: 0
   # -- DNS cache data at this path is preloaded on agent startup.
   preCache: ""
   # -- Global port on which the in-agent DNS proxy should listen. Default 0 is a OS-assigned port.
2 changes: 1 addition & 1 deletion pkg/defaults/defaults.go
@@ -136,7 +136,7 @@ const (

 	// ToFQDNsMinTTL is the default lower bound for TTLs used with ToFQDNs rules.
 	// This is used in DaemonConfig.Populate
-	ToFQDNsMinTTL = 3600 // 1 hour in seconds
+	ToFQDNsMinTTL = 0

 	// ToFQDNsMaxIPsPerHost defines the maximum number of IPs to maintain
 	// for each FQDN name in an endpoint's FQDN cache