You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The problem described below is related to most of the resources managed by NIC, but I will describe it based on VirtualServer.
Currently, if an existing VirtualServer resource becomes invalid, NIC removes the corresponding configuration from Nginx (code, doc), resulting in an outage. And it wouldn't be so bad and scary if there was an interface to make "comprehensive validation" before updating VirtualServer resource. But there is no built-in and reliable way to validate VirtualServer resource before applying it - only structural validation by OpenAPI schema (correct me if I'm wrong).
We are migrating to the NIC as the main ingress for an external and internal traffic to our clusters. At the same time, we provide to our users an interface to change VS resource (for example, to add/change routes). How can we protect ourselves and our users from allowing someone to break important ingress endpoints because they added a duplicated path in routes or made some other syntax mistake (which passed structural validation but failed during the comprehensive validation phase)?
Preferred solutions:
Do not remove the corresponding nginx configuration in case if VirtualServer becomes invalid
There is no sense in removing it, and it's difficult to imagine how it can make a situation better - too big a penalty for a configuration error. Should be enough just to mark it “Invalid” and then responsible ppl or automation (by alerts, cd pipelines errors, etc…) can decide what to do with it. But let me know if I'm missing something and there's a legitimate reason for the current behavior.
Provide interface for "comprehensive validation" before applying VirtualServer resource
For example, it could be done via ValidatingAdmissionWebhook for usual and dry-run mode (oss nginx-ingress has something similar). With this approach, there will be much fewer possibilities to make VirtualServer invalid and there will be a possibility to make "pre-validation" with something like "kubectl apply –dry-run=server".
(optionally) To think about auto rollback for invalid resources
It seems this is not necessary if proper "pre-validation" is implemented in the way described above. But maybe there are some cases where it can be useful.
Alternative workarounds:
Please let me know if you see any good workarounds for the current behavior. From what I see:
Install additional NIC in isolated namespace with separate "ingressClassName" for validation purpose only, and integrate it with our cd pipeline (to first create resource in this isolated namespace with dedicated "ingressClassName" for validation purpose only, and only then proceed to change the main resource if validation passed).
This should work, but seems a bit complicated, requires additional “validation” NIC, and duplicates already existing logic.
Create a dedicated validator based on NIC logic and trigger it either by cd pipeline or by policy managers like kyverno.
The same comment: seems complicated, and duplicates already existing logic in multiple places. And there are other problems, like keeping this tool in sync with upstream validation.
Additional comments:
Ideally, to change the behavior for other resources as well (like transportserver, ingress, etc..), but let's focus on VirtualServer first. We can discuss other resources later, if you are agree with the proposed changes for VirtualServer.
The text was updated successfully, but these errors were encountered:
Problem:
The problem described below is related to most of the resources managed by NIC, but I will describe it based on VirtualServer.
Currently, if an existing VirtualServer resource becomes invalid, NIC removes the corresponding configuration from Nginx (code, doc), resulting in an outage. And it wouldn't be so bad and scary if there was an interface to make "comprehensive validation" before updating VirtualServer resource. But there is no built-in and reliable way to validate VirtualServer resource before applying it - only structural validation by OpenAPI schema (correct me if I'm wrong).
We are migrating to the NIC as the main ingress for an external and internal traffic to our clusters. At the same time, we provide to our users an interface to change VS resource (for example, to add/change routes). How can we protect ourselves and our users from allowing someone to break important ingress endpoints because they added a duplicated path in routes or made some other syntax mistake (which passed structural validation but failed during the comprehensive validation phase)?
Preferred solutions:
There is no sense in removing it, and it's difficult to imagine how it can make a situation better - too big a penalty for a configuration error. Should be enough just to mark it “Invalid” and then responsible ppl or automation (by alerts, cd pipelines errors, etc…) can decide what to do with it. But let me know if I'm missing something and there's a legitimate reason for the current behavior.
For example, it could be done via ValidatingAdmissionWebhook for usual and dry-run mode (oss nginx-ingress has something similar). With this approach, there will be much fewer possibilities to make VirtualServer invalid and there will be a possibility to make "pre-validation" with something like "kubectl apply –dry-run=server".
It seems this is not necessary if proper "pre-validation" is implemented in the way described above. But maybe there are some cases where it can be useful.
Alternative workarounds:
Please let me know if you see any good workarounds for the current behavior. From what I see:
This should work, but seems a bit complicated, requires additional “validation” NIC, and duplicates already existing logic.
The same comment: seems complicated, and duplicates already existing logic in multiple places. And there are other problems, like keeping this tool in sync with upstream validation.
Additional comments:
Ideally, to change the behavior for other resources as well (like transportserver, ingress, etc..), but let's focus on VirtualServer first. We can discuss other resources later, if you are agree with the proposed changes for VirtualServer.
The text was updated successfully, but these errors were encountered: