Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(kuma-cp) logging of resource conflicts #1254

Merged
merged 11 commits into from
Dec 23, 2020

Conversation

lobkovilya
Copy link
Contributor

Signed-off-by: Ilya Lobkov ilya.lobkov@konghq.com

Summary

Since we are using a caching client on Kubernetes Get might return an old version of the resource and cause Resource conflict error. Despite these errors in the log Kuma CP reconciles resources as expected. So current PR just eliminates such log lines for insight resyncer and vip allocator

Documentation

Not required

Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
if err != nil {
// Error might occur only if we run out of VIPs. There is no point to pass it through,
// we must notify user in logs and proceed
vipsAllocatorLog.Error(err, "failed to allocate new VIPs")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we return after this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, UpdateMeshedVIPs could allocate some IPs and we should update them in the configmap

@@ -42,6 +43,10 @@ func (r *ConfigMapReconciler) Reconcile(req kube_ctrl.Request) (kube_ctrl.Result
}

if err := r.VIPsAllocator.CreateOrUpdateVIPConfig(mesh); err != nil {
if kube_apierrs.IsConflict(err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this should be store.IsResourceConflict. We are translating Kube errors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but ConfigStore didn't handle it and returned an original error. Fixed both ConfigStore and configmap_controller

@@ -42,6 +43,10 @@ func (r *ConfigMapReconciler) Reconcile(req kube_ctrl.Request) (kube_ctrl.Result
}

if err := r.VIPsAllocator.CreateOrUpdateVIPConfig(mesh); err != nil {
if kube_apierrs.IsConflict(err) {
r.Log.V(1).Info("resource conflict", "err", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds scary. This should be more descriptive like

r.Log.V(1).Info("VIPs were updated in the other place. Retrying")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also on V1 "updating VIPs for mesh" and after the operation "VIPs for mesh udpated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, fixed these

@@ -124,6 +124,10 @@ func ErrorResourceConflict(rt model.ResourceType, name, mesh string) error {
return fmt.Errorf("Resource conflict: type=%q name=%q mesh=%q", rt, name, mesh)
}

func IsResourceConflict(err error) bool {
return err != nil && strings.HasPrefix(err.Error(), "Resource conflict")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merge origin/master, this function should be there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

func (d *VIPsAllocator) createOrUpdateVIPConfigs(meshes ...string) (errs error) {
// createOrUpdateVIPConfigs updates VIP configs for provided meshes.
// - fastFail - if 'true' return the first occurred error, stop processing meshes.
// if 'false' then process all meshes, return the list of errors
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what failFast does, but I don't understand why 🤔 can you help me understand?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just forget, I got rid of it

pkg/insights/resyncer.go Outdated Show resolved Hide resolved
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
return reconciler.Clear(&proxyID)
}
return err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this fix the situation you described, because you used ReadOnlyResourceManager.
When you get the dataplane in the first step, it is cached so it's very high chance you will still retrieve V1. I think you need to either

  1. Use non cached ResourceManager
  2. Use the dataplane that was used to compiled the hash

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be created as a separate PR imho

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you get Dataplane after calculating hash it is the same or newer than a dataplane which reflected in the hash. Because when calculating hash we are also using ReadOnlyResourceManager

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakubdyszkiewicz create separate PR #1313, let's proceed there

Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>

# Conflicts:
#	api/go.sum
#	go.sum
#	pkg/plugins/resources/k8s/native/go.sum
#	pkg/plugins/runtime/k8s/controllers/configmap_controller.go

err = errAllocator.CreateOrUpdateVIPConfigs()
Expect(err).To(HaveOccurred())
fmt.Println(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

err = errAllocator.CreateOrUpdateVIPConfigs()
Expect(err).To(HaveOccurred())
fmt.Println(err)
Expect(err.Error()).To(Equal("error during update, mesh = mesh-1; error during update, mesh = mesh-2"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Expect(err).To(MatchError(""))

Signed-off-by: Ilya Lobkov <ilya.lobkov@konghq.com>
@lobkovilya lobkovilya merged commit b348fb3 into master Dec 23, 2020
@lobkovilya lobkovilya deleted the fix/resource-conflict-error branch December 23, 2020 13:59
mergify bot pushed a commit that referenced this pull request Dec 23, 2020
nickolaev pushed a commit that referenced this pull request Jan 4, 2021
(cherry picked from commit b348fb3)

Co-authored-by: Ilya Lobkov <ilya.lobkov@konghq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants