Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs/ipsec: Extend troubleshooting section #26808

Merged
merged 1 commit into from
Jul 19, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
50 changes: 50 additions & 0 deletions Documentation/security/network/encryption-ipsec.rst
Original file line number Diff line number Diff line change
Expand Up @@ -157,6 +157,8 @@ commands:
15:16:21.627699 IP 10.60.1.1 > 10.60.0.1: ESP(spi=0x00000001,seq=0x57e4), length 100
15:16:21.628408 IP 10.60.1.1 > 10.60.0.1: ESP(spi=0x00000001,seq=0x57e5), length 100

.. _ipsec_key_rotation:

Key Rotation
============

Expand Down Expand Up @@ -207,6 +209,54 @@ Troubleshooting
reaches its maximum value, it will also result in errors. The number of
keys in use should be 2 during a key rotation and always 1 otherwise.

* All XFRM errors correspond to a packet drop in the kernel. Except for
``XfrmFwdHdrError`` and ``XfrmInError``, all XFRM errors indicate a bug in
Cilium or an operational mistake. ``XfrmOutStateSeqError`` and
``XfrmInStateProtoError`` may be caused by operational mistakes, as detailed
in the following points.

* If the sequence number reaches its maximum value for any XFRM OUT state, it
will result in packet drops and XFRM errors of type
``XfrmOutStateSeqError``. A key rotation resets all sequence numbers.
Rotate keys frequently to avoid this issue.

* ``XfrmInStateProtoError`` errors can happen if the key is updated without
incrementing the SPI (also called ``KEYID`` in :ref:`ipsec_key_rotation`
instructions above). It can be fixed by performing a new key rotation,
properly.

* ``XfrmFwdHdrError`` and ``XfrmInError`` happen when the kernel fails to
lookup the route for a packet it decrypted. This can legitimately happen
when a pod was deleted but some packets are still in transit. Note these
errors can also happen under memory pressure when the kernel fails to
allocate memory.

* The following table documents the known explanations for several XFRM errors
that were observed in the past. Many other error types exist, but they are
usually for Linux subfeatures that Cilium doesn't use (e.g., XFRM
expiration).

======================= ==================================================
Error Known explanation
======================= ==================================================
XfrmInError The kernel (1) decrypted and tried to route a
packet for a pod that was deleted or (2) failed to
allocate memory.
XfrmInNoStates Bug in the XFRM configuration for decryption.
XfrmInStateProtoError There is a key mismatch between nodes.
XfrmInTmplMismatch Bug in the XFRM configuration for decryption.
XfrmInNoPols Bug in the XFRM configuration for decryption.
XfrmInPolBlock Explicit drop, not used by Cilium.
XfrmOutNoStates Bug in the XFRM configuration for encryption.
XfrmOutStateSeqError The sequence number of an encryption XFRM
configuration reached its maximum value.
XfrmOutPolBlock Cilium dropped packets that would have otherwise
left the node in plain-text.
XfrmFwdHdrError The kernel (1) decrypted and tried to route a
packet for a pod that was deleted or (2) failed to
allocate memory.
======================= ==================================================

Disabling Encryption
====================

Expand Down
1 change: 1 addition & 0 deletions Documentation/spelling_wordlist.txt
Original file line number Diff line number Diff line change
Expand Up @@ -609,6 +609,7 @@ subclasses
subcommand
subcommands
subdirectories
subfeatures
subnet
subnets
subregister
Expand Down