rate: Wait(n=1) would exceed context deadline #362
Comments
@LionelJouin Thanks for the great report. We'll take care of it in v1.8.0. We also found something similar during scale testing: networkservicemesh/deployments-k8s#5494 (comment). For now, I recommend increasing the CPU limits for registry-k8s.
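For anyone else landing here, the limit-raising workaround can be applied with a kubectl one-liner. The namespace (nsm-system) and limit values below are illustrative assumptions; adjust them to your own deployment:

```shell
# Raise CPU/memory limits on the registry-k8s deployment.
# Namespace and values are assumptions -- match them to your install.
kubectl -n nsm-system set resources deployment registry-k8s \
  --limits=cpu=500m,memory=512Mi
```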
Hi @denis-tingaikin, I tried applying the workaround of increasing the CPU and memory limits on registry-k8s, but it did not work. Also, the NetworkServiceEndpoint objects seem to be in flux: a number of them get deleted and recreated in a loop. Is the root cause of this issue known?
@bharath-avesha We've found the root cause and the problem seems to be fixed. For now, the latest main version of the deployments-k8s repository should be fine. We'll also ping you when v1.8.0-rc.1 is available 😉
@LionelJouin Could you please test ghcr.io/networkservicemesh/cmd-registry-k8s:v1.8.0-rc.1? |
Sure, I will.
I just tried with 76 NSCs and 26 NSEs; the problem seems fixed with v1.8.0-rc.1, thank you.
Expected Behavior
The k8s registry should work like the memory registry: it should be able to handle many nodes and many NSEs/NSCs.
Current Behavior
I am not sure yet whether it is due to the number of NSEs and NSCs, the number of Kubernetes workers, or a combination of factors, but with the k8s registry (there is no problem with the memory registry), some of my NSEs fail to register and some of my NSCs fail their requests. I believe the registry is the cause, since the nsmgr and forwarder logs contain errors referring to it.
Failure Information (for bugs)
The issue does not happen every time; sometimes the deployment succeeds with no problem.
Here are some logs from the registry:
Steps to Reproduce
No instructions yet; I hit this problem in my own project. I will check whether I can reproduce it using the examples provided in deployments-k8s.
Context
Number of NS: 3
Number of NSC: 26
Number of NSE: 13
Number of Kubernetes workers: 10
Number of registries: 1 (I haven't noticed any problem with more replicas)
NSM: v1.7.1 (also tried v1.6.1 and v1.5.0; the issue was present in both)
Kubernetes: v1.25 (Kind) / v1.24 (Real cluster)
Failure Logs
registry.log