MetalLB Version
latest
Deployment method
Not relevant
Main CNI
N/A
Kubernetes Version
No response
Cluster Distribution
No response
Describe the bug
With the webhooks, we do not want to block resources depending on the order in which they are received. For example, we don't want to block a peer specifying a bfdProfile that doesn't exist yet, because the configuration could eventually converge. Today we achieve that with the TransientError type: it is returned only when an error relates to a dependency between resources, and the webhook's validator ignores errors of this type.
The problem with this approach is that in some cases the webhook misses an actual error. The webhook passes multiple objects of the same type to the validation, and the validation returns early on the first error it hits. If, for example, a TransientError occurs for the first resource, the rest of the resources are not validated at all.
As an example, consider the following resources:
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: peer1
  namespace: metallb-system
spec:
  peerAddress: 172.16.30.2
  peerASN: 65433
  myASN: 65439
  bfdProfile: "non-existing"
---
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: peer2
  namespace: metallb-system
spec:
  peerAddress: 172.16.30.3
  peerASN: 0
  myASN: 65439
peer2 is invalid since it specifies peerASN: 0, so we would expect the webhook to block it regardless. Instead, the outcome of the webhook depends on the order: if peer2 is created without peer1 it is blocked, but if it is created after peer1 it passes.
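To make the masking concrete, here is a minimal, self-contained Go sketch of the behaviour described above. It is not MetalLB's actual code: the `peer` struct, `validatePeers`, and `admit` are hypothetical stand-ins, and the local `TransientError` type only imitates the role of the real one. The two checks (unknown bfdProfile, peerASN of 0) are assumptions taken from the example resources.

```go
package main

import (
	"errors"
	"fmt"
)

// TransientError is a local stand-in for the transient error type
// discussed in this issue (hypothetical, not the MetalLB definition).
type TransientError struct{ Message string }

func (e TransientError) Error() string { return e.Message }

// peer is a hypothetical, trimmed-down view of a BGPPeer spec.
type peer struct {
	Name       string
	PeerASN    uint32
	BFDProfile string
}

// validatePeers imitates a validation that returns on the first error.
func validatePeers(peers []peer, bfdProfiles map[string]bool) error {
	for _, p := range peers {
		// Dependency on another resource: reported as transient.
		if p.BFDProfile != "" && !bfdProfiles[p.BFDProfile] {
			return TransientError{Message: fmt.Sprintf("peer %s: bfd profile %q not found", p.Name, p.BFDProfile)}
		}
		// Genuine error: the peer itself is invalid.
		if p.PeerASN == 0 {
			return fmt.Errorf("peer %s: missing peer ASN", p.Name)
		}
	}
	return nil
}

// admit imitates a webhook validator that ignores transient errors.
func admit(peers []peer, bfdProfiles map[string]bool) error {
	err := validatePeers(peers, bfdProfiles)
	var te TransientError
	if errors.As(err, &te) {
		return nil // transient errors are swallowed here
	}
	return err
}

func main() {
	peer1 := peer{Name: "peer1", PeerASN: 65433, BFDProfile: "non-existing"}
	peer2 := peer{Name: "peer2", PeerASN: 0}

	// peer2 alone: the ASN check is reached, so it is rejected.
	fmt.Println("peer2 only:    ", admit([]peer{peer2}, nil))
	// peer1 first: the transient error returns early and peer2 is never checked.
	fmt.Println("peer1 + peer2: ", admit([]peer{peer1, peer2}, nil))
}
```

In this sketch the second call admits both peers, because the early return on peer1's transient error prevents the loop from ever reaching peer2.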
A possible solution: instead of ignoring transient errors in the webhook's validator, the webhook would pass the resources to For with the fields that can cause a transient error reset, so that the webhook never receives a transient error at all.
TransientErrors could stay in the codebase (serving as a "doc"); we would just not act on them in the webhooks.
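A rough Go sketch of that idea follows. Everything in it is hypothetical: `peer`, `validatePeers`, `resetTransientFields`, and `admit` are illustrative names, `validatePeers` merely stands in for the real validation entry point, and bfdProfile is assumed to be the only field that can trigger a transient error.

```go
package main

import "fmt"

// TransientError is a local stand-in kept only as documentation of which
// failures depend on other resources (hypothetical, not the MetalLB type).
type TransientError struct{ Message string }

func (e TransientError) Error() string { return e.Message }

// peer is a hypothetical, trimmed-down view of a BGPPeer spec.
type peer struct {
	Name       string
	PeerASN    uint32
	BFDProfile string
}

// validatePeers stands in for the real validation and returns on the first error.
func validatePeers(peers []peer, bfdProfiles map[string]bool) error {
	for _, p := range peers {
		if p.BFDProfile != "" && !bfdProfiles[p.BFDProfile] {
			return TransientError{Message: fmt.Sprintf("peer %s: bfd profile %q not found", p.Name, p.BFDProfile)}
		}
		if p.PeerASN == 0 {
			return fmt.Errorf("peer %s: missing peer ASN", p.Name)
		}
	}
	return nil
}

// resetTransientFields returns a copy of the peers with every field that
// could cause a transient error cleared. The input slice is left untouched.
func resetTransientFields(peers []peer) []peer {
	out := make([]peer, len(peers))
	copy(out, peers)
	for i := range out {
		out[i].BFDProfile = ""
	}
	return out
}

// admit validates the sanitized copy and acts on every error it gets back,
// since no transient error can be produced from the sanitized input.
func admit(peers []peer, bfdProfiles map[string]bool) error {
	return validatePeers(resetTransientFields(peers), bfdProfiles)
}

func main() {
	peer1 := peer{Name: "peer1", PeerASN: 65433, BFDProfile: "non-existing"}
	peer2 := peer{Name: "peer2", PeerASN: 0}

	fmt.Println("peer1 only:    ", admit([]peer{peer1}, nil))
	fmt.Println("peer1 + peer2: ", admit([]peer{peer1, peer2}, nil))
}
```

With the dependency fields cleared up front, peer1 alone is still admitted, while peer2 is rejected regardless of its position in the list. The trade-off is that the webhook needs to know which fields to reset for each resource type.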
To Reproduce
described above
Expected Behavior
described above
Additional Context
described above
I've read and agree with the following