spire-agent not able to run on tainted nodes #27228

Closed
1 task done
tvonhacht-apple opened this issue Aug 3, 2023 · 0 comments · Fixed by #27229
Labels
area/helm Impacts helm charts and user deployment experience feature/authentication kind/bug This is a bug in the Cilium logic. kind/community-report This was reported by a user in the Cilium community, eg via Slack. sig/agent Cilium agent related.

Comments

@tvonhacht-apple
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

By default, spire-agent does not run on tainted nodes, and the Helm chart provides no option to add tolerations to it.

By contrast, cilium-agent runs on every node by default because the chart adds an allow-all toleration (https://github.com/cilium/cilium/blob/main/install/kubernetes/cilium/values.yaml#L168-L169).

This results in two problems:

  • Pods running on tainted nodes can't create identities.
  • If cilium-operator runs on a tainted node, no cilium:mutual-auth identity can be created.
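
For illustration, here is a minimal sketch of the kind of allow-all toleration referred to above. In Kubernetes, a toleration with `operator: Exists` and no `key` or `effect` matches every taint. Whether the linked `values.yaml` lines contain exactly this form should be checked against the chart version in use; the fragment below is an assumption, not a copy of the chart.

```yaml
# Allow-all toleration (sketch): with no key and no effect,
# `operator: Exists` tolerates every taint on a node.
tolerations:
  - operator: Exists
```

A DaemonSet pod template without such a toleration is not scheduled onto nodes carrying a `NoSchedule` taint, which is consistent with spire-agent being absent from tainted nodes.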

Cilium Version

1.14.0

Kernel Version

irrelevant to bug

Kubernetes Version

1.27

Sysdump

No response

Relevant log output

# cilium-agent


cilium-mhhj8 cilium-agent level=warning msg="Failed to authenticate request" error="failed to authenticate with auth type spire: failed to get certificate for local identity 19737: no SPIFFE ID for spiffe://spiffe.cilium/identity/19737" key="localIdentity=19737, remoteIdentity=37615, remoteNodeID=0, authType=spire" subsys=auth


# cilium-operator

```bash
cilium-operator-688f47cd6d-7rbxt cilium-operator level=error msg="Failed to watch the Workload API: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial unix /run/spire/sockets/agent/agent.sock: connect: no such file or directory\"" subsys=spire-client
```

Anything else?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

@tvonhacht-apple tvonhacht-apple added kind/bug This is a bug in the Cilium logic. kind/community-report This was reported by a user in the Cilium community, eg via Slack. needs/triage This issue requires triaging to establish severity and next steps. labels Aug 3, 2023
@dylandreimerink dylandreimerink added area/helm Impacts helm charts and user deployment experience sig/agent Cilium agent related. feature/authentication and removed needs/triage This issue requires triaging to establish severity and next steps. labels Aug 3, 2023
tvonhacht-apple added a commit to tvonhacht-apple/cilium that referenced this issue Aug 10, 2023
Previously, it was not possible to run the spire-agent on nodes with
taints like the cilium-agent does by default. This feature matches
similar behaviour.

Added as well options to define affinity, nodeSelector and tolerations
for spire-server.

Fixes: cilium#27228

Signed-off-by: Thorben von Hacht <tvonhacht@apple.com>
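
As a rough sketch of how the options named in the commit message above might be set, the Helm values override below gives spire-agent an allow-all toleration and sets scheduling options for spire-server. The key paths under `authentication.mutual.spire.install` are an assumption based on the chart's layout and are not confirmed by this issue; the node selector and the control-plane taint are hypothetical examples. Verify all of them against the `values.yaml` of the chart version you deploy.

```yaml
# Hypothetical Helm values override; key paths are assumptions,
# check them against the chart's values.yaml before use.
authentication:
  mutual:
    spire:
      enabled: true
      install:
        enabled: true
        agent:
          # Let spire-agent schedule onto tainted nodes,
          # matching cilium-agent's default behaviour.
          tolerations:
            - operator: Exists
        server:
          # Example scheduling constraints for spire-server.
          nodeSelector:
            kubernetes.io/os: linux
          tolerations:
            - key: node-role.kubernetes.io/control-plane
              operator: Exists
              effect: NoSchedule
          affinity: {}
```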
lmb pushed a commit that referenced this issue Aug 16, 2023
tklauser pushed a commit to tklauser/cilium that referenced this issue Oct 24, 2023
[ upstream commit b599370 ]

Previously, it was not possible to run the spire-agent on nodes with
taints like the cilium-agent does by default. This feature matches
similar behaviour.

Added as well options to define affinity, nodeSelector and tolerations
for spire-server.

Fixes: cilium#27228

Signed-off-by: Thorben von Hacht <tvonhacht@apple.com>
Signed-off-by: Tobias Klauser <tobias@cilium.io>
dylandreimerink pushed a commit that referenced this issue Oct 25, 2023
sayboras pushed a commit that referenced this issue Nov 25, 2023
[ upstream commit b599370 ]

Previously, it was not possible to run the spire-agent on nodes with
taints like the cilium-agent does by default. This feature matches
similar behaviour.

Added as well options to define affinity, nodeSelector and tolerations
for spire-server.

Fixes: #27228

Signed-off-by: Thorben von Hacht <tvonhacht@apple.com>
Signed-off-by: Tobias Klauser <tobias@cilium.io>
Signed-off-by: Tobias Klauser <tobias@isovalent.com>