Policy Base Routing failure; Possible table ID collision. #9119
Comments
Hi @ljkiraly. Yes, there might be a problem if one NSC sends two parallel requests. There are two options for fixing this:
ljkiraly added a commit to Nordix/nsm-sdk-kernel that referenced this issue on May 4, 2023
Related issue: networkservicemesh/deployments-k8s/issues/9119 Added mutex lock to protect table ID selection from parallel runs. Signed-off-by: Laszlo Kiraly <laszlo.kiraly@est.tech>
Hi @NikitaSkrynnik, thanks for the suggestions. I think I misled you. I was wrong to state that the requests cause the problem. Most probably the NSE responses are processed in parallel. I propose fixing this with a preventive lock taken before selecting and assigning table IDs.
ljkiraly added a commit to Nordix/nsm-sdk-kernel that referenced this issue on May 25, 2023
Related issue: networkservicemesh/deployments-k8s/issues/9119 Added mutex lock to protect table ID selection from parallel runs. Signed-off-by: Laszlo Kiraly <laszlo.kiraly@est.tech>
ljkiraly added a commit to Nordix/nsm-sdk-kernel that referenced this issue on May 25, 2023
Related issue: networkservicemesh/deployments-k8s/issues/9119 Added a mechanism to protect table ID selection from parallel runs. Signed-off-by: Laszlo Kiraly <laszlo.kiraly@est.tech>
ljkiraly added a commit to Nordix/nsm-sdk-kernel that referenced this issue on Jun 14, 2023
Related issue: networkservicemesh/deployments-k8s/issues/9119 Added a mechanism to protect table ID selection from parallel runs. Signed-off-by: Laszlo Kiraly <laszlo.kiraly@est.tech>
Expected Behavior
On the target node, the routing table ID should be different for each source IP.
Current Behavior
For some reason, it can happen that a route is missing and the rule points to the wrong table ID:
Failure Information (for bugs)
As far as I can see, the forwarder stores the routing table IDs in a sync.Map keyed by connection ID. In theory, two concurrent connection requests coming from the same NSC but towards different NSs can get the same table ID, because the getFreeTableID call is not protected by any semaphore or lock. https://github.com/networkservicemesh/sdk-kernel/blob/0ca43b02fb6f2b9467d547f79825f12075bda5a4/pkg/kernel/networkservice/connectioncontextkernel/ipcontext/iprule/common.go#L79
Steps to Reproduce
The logs were at info level when this happened. The reproduction steps are not yet known.
I will update the issue if I am able to reproduce it.