-
Notifications
You must be signed in to change notification settings - Fork 327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(kuma-cp) logging of resource conflicts #1254
Conversation
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
if err != nil { | ||
// Error might occur only if we run out of VIPs. There is no point to pass it through, | ||
// we must notify user in logs and proceed | ||
vipsAllocatorLog.Error(err, "failed to allocate new VIPs") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we return after this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, UpdateMeshedVIPs
could allocate some IPs and we should update them in the configmap
@@ -42,6 +43,10 @@ func (r *ConfigMapReconciler) Reconcile(req kube_ctrl.Request) (kube_ctrl.Result | |||
} | |||
|
|||
if err := r.VIPsAllocator.CreateOrUpdateVIPConfig(mesh); err != nil { | |||
if kube_apierrs.IsConflict(err) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should be store.IsResourceConflict
. We are translating Kube errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but ConfigStore didn't handle it and returned an original error. Fixed both ConfigStore and configmap_controller
@@ -42,6 +43,10 @@ func (r *ConfigMapReconciler) Reconcile(req kube_ctrl.Request) (kube_ctrl.Result | |||
} | |||
|
|||
if err := r.VIPsAllocator.CreateOrUpdateVIPConfig(mesh); err != nil { | |||
if kube_apierrs.IsConflict(err) { | |||
r.Log.V(1).Info("resource conflict", "err", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds scary. This should be more descriptive like
r.Log.V(1).Info("VIPs were updated in the other place. Retrying")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also on V1 "updating VIPs for mesh" and after the operation "VIPs for mesh udpated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, fixed these
pkg/core/resources/store/store.go
Outdated
@@ -124,6 +124,10 @@ func ErrorResourceConflict(rt model.ResourceType, name, mesh string) error { | |||
return fmt.Errorf("Resource conflict: type=%q name=%q mesh=%q", rt, name, mesh) | |||
} | |||
|
|||
func IsResourceConflict(err error) bool { | |||
return err != nil && strings.HasPrefix(err.Error(), "Resource conflict") | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merge origin/master, this function should be there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/dns/vips_allocator.go
Outdated
func (d *VIPsAllocator) createOrUpdateVIPConfigs(meshes ...string) (errs error) { | ||
// createOrUpdateVIPConfigs updates VIP configs for provided meshes. | ||
// - fastFail - if 'true' return the first occurred error, stop processing meshes. | ||
// if 'false' then process all meshes, return the list of errors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what failFast
does, but I don't understand why 🤔 can you help me understand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just forget, I got rid of it
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
pkg/xds/server/components.go
Outdated
return reconciler.Clear(&proxyID) | ||
} | ||
return err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this fix the situation you described, because you used ReadOnlyResourceManager
.
When you get the dataplane in the first step, it is cached so it's very high chance you will still retrieve V1. I think you need to either
- Use non cached ResourceManager
- Use the dataplane that was used to compiled the hash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be created as a separate PR imho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you get Dataplane after calculating hash it is the same or newer than a dataplane which reflected in the hash. Because when calculating hash we are also using ReadOnlyResourceManager
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakubdyszkiewicz create separate PR #1313, let's proceed there
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com> # Conflicts: # api/go.sum # go.sum # pkg/plugins/resources/k8s/native/go.sum # pkg/plugins/runtime/k8s/controllers/configmap_controller.go
pkg/dns/vips_allocator_test.go
Outdated
|
||
err = errAllocator.CreateOrUpdateVIPConfigs() | ||
Expect(err).To(HaveOccurred()) | ||
fmt.Println(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
pkg/dns/vips_allocator_test.go
Outdated
err = errAllocator.CreateOrUpdateVIPConfigs() | ||
Expect(err).To(HaveOccurred()) | ||
fmt.Println(err) | ||
Expect(err.Error()).To(Equal("error during update, mesh = mesh-1; error during update, mesh = mesh-2")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Expect(err).To(MatchError(""))
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
(cherry picked from commit b348fb3)
Signed-off-by: Ilya Lobkov ilya.lobkov@konghq.com
Summary
Since we are using a caching client on Kubernetes
Get
might return an old version of the resource and causeResource conflict
error. Despite these errors in the log Kuma CP reconciles resources as expected. So current PR just eliminates such log lines forinsight resyncer
andvip allocator
Documentation
Not required