GVK differ reports incorrectly with multiple groups of same kind #531

Closed
alaypatel07 opened this issue May 11, 2020 · 3 comments · Fixed by #534
Labels: kind/bug
Milestone: release-1.2.0

Comments

@alaypatel07
Contributor

When multiple groups of the same kind are present and one of the
groups is deprecated on the destination cluster, the GVK differ
complains about the deprecated group, but Velero uses the other
group's preferred group/version and migrates successfully.

Example: let's assume the source has two groups of deployments
as follows:

Group: extensions
Version: v1beta1
Kind: deployments

and

Group: apps
Version: v1
Kind: deployments

Assume the destination has deprecated the extensions group, so
the GVK at the destination is:

Group: apps
Version: v1
Kind: deployments

Problem

The GVK differ will correctly identify that extensions/v1beta1 deployments
is deprecated and incompatible. But because of the way Kubernetes reads and
writes work, the deployments resource will automatically be shifted to the
destination's preferred group/version. Hence extensions workloads will be
migrated as apps workloads in the above example.
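
For illustration only, a small client-go sketch (not mig-controller code; the kubeconfig path is a placeholder) of how the destination cluster's preferred group/version for deployments can be discovered:

```go
package main

import (
	"fmt"

	"k8s.io/client-go/discovery"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Placeholder kubeconfig path for the destination cluster.
	cfg, err := clientcmd.BuildConfigFromFlags("", "/path/to/destination/kubeconfig")
	if err != nil {
		panic(err)
	}
	dc, err := discovery.NewDiscoveryClientForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// ServerPreferredResources returns only the preferred group/version of
	// each resource, e.g. apps/v1 for deployments on a cluster where
	// extensions/v1beta1 has been removed.
	lists, err := dc.ServerPreferredResources()
	if err != nil {
		panic(err)
	}
	for _, list := range lists {
		for _, r := range list.APIResources {
			if r.Name == "deployments" {
				fmt.Printf("preferred GV for deployments: %s\n", list.GroupVersion)
			}
		}
	}
}
```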

However, the controller walks the list of incompatible GVKs calculated
by the differ, checks whether an object[1] of any incompatible GVK exists
in the cluster, and reports the GVK if the resource exists. This creates a
problem, since extensions workloads are found in the source cluster and are
therefore reported.

Hence the problem: for kinds with multiple groups, if at least one group
exists at the destination, the kind can still be migrated, and the controller
should remove all incompatible GVKs for that kind.

Solution

The controller should only complain if an incompatible GVK exists for a kind with
no other group/version available to migrate.

I propose adding a method here[2] to filter the list of incompatible GVKs,
keeping only those which do not have a compatible group/kind alternative
available for Velero to migrate. This filtered list will correctly reflect
the GVKs which cannot be migrated, as required.
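
To make this concrete, a minimal sketch of such a filter (hypothetical name and signature, not the actual mig-controller code), assuming the set of GVKs supported by the destination is already known:

```go
package gvk

import "k8s.io/apimachinery/pkg/runtime/schema"

// filterByAlternativeGroups is a hypothetical helper: it keeps only those
// incompatible GVKs whose Kind cannot be served by any group/version on the
// destination, since a Kind with an alternative group (e.g. deployments
// under apps) will still migrate.
func filterByAlternativeGroups(incompatible, dstSupported []schema.GroupVersionKind) []schema.GroupVersionKind {
	// Index every Kind the destination can serve under some group/version.
	supportedKinds := map[string]bool{}
	for _, gvk := range dstSupported {
		supportedKinds[gvk.Kind] = true
	}

	trulyIncompatible := []schema.GroupVersionKind{}
	for _, gvk := range incompatible {
		if supportedKinds[gvk.Kind] {
			// The destination serves this Kind under another group/version,
			// so Velero can migrate it; do not report it.
			continue
		}
		trulyIncompatible = append(trulyIncompatible, gvk)
	}
	return trulyIncompatible
}
```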

  1. https://github.com/konveyor/mig-controller/blob/d6e18043d84b21b82af530e930f7af41c0e6c1f2/pkg/gvk/gvk.go#L191
  2. https://github.com/konveyor/mig-controller/blob/d6e18043d84b21b82af530e930f7af41c0e6c1f2/pkg/gvk/gvk.go#L53
@jwmatthews
Contributor

jwmatthews commented May 11, 2020

As we work on this, be careful to keep in mind that it's not always true that the same Kind appearing in two groups means one of them is deprecated, as is the case with Deployment and the {extensions, apps} groups.

There are other cases where the same Kind name is reused in two different groups that do not represent the same thing, for example Knative, where 'Route' lives under two different, incompatible groups:

```
$ oc api-resources | grep route
routes          route.openshift.io     true    Route
routes    rt    serving.knative.dev    true    Route
```

@alaypatel07
Contributor Author

alaypatel07 commented May 11, 2020

+1, thanks @jwmatthews for pointing this out. I think the answer is not as simple as filtering the kind for uniqueness. Under the hood, Velero maintains a list of known GVRs for kinds that may have multiple groups as cohabitating resources[1]; we should implement our filter based on what Velero does. I will try to update the issue with something along those lines.

  1. https://github.com/vmware-tanzu/velero/blob/4b0f654a1e322b5f76533fbf28c456934a95bde1/pkg/backup/backup.go#L89-L88

@alaypatel07
Contributor Author

alaypatel07 commented May 11, 2020

Based on discussion with @djwhatle and pain points shared by @jwmatthews, the new proposed solution is as follows:

  1. Introduce a new data structure for co-habitating resources, like upstream Velero does here[1]
  2. Add a sort method on the APIResourceList, like here[2]
  3. Call the sort method here[3]
  4. Skip the loop iteration if this is a co-habitating resource that has been seen before[4]; if it is a co-habitating resource and has not been seen before, mark it as seen, similar to this[5]

The above steps will make sure compareResources does not compare co-habitating resources twice and will pick other groups over extensions. In the example used in this issue, deployments.apps will be favored instead of deployments.extensions, and once deployments.apps is seen and processed, compareResources will not add deployments.extensions to the list of missing GVKs (see the sketch below). This will solve the underlying issue in a manner that is compatible with upstream Velero.
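
A rough Go sketch of steps 1 and 4 (illustrative names, modeled on the upstream Velero structure linked in [1], not the final mig-controller code), assuming the resource lists are already sorted per steps 2 and 3:

```go
package gvk

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// Step 1: a record of a resource that lives under two API groups, with a
// "seen" flag so it is only compared once.
type cohabitatingResource struct {
	resource       string
	groupResource1 schema.GroupResource
	groupResource2 schema.GroupResource
	seen           bool
}

func newCohabitatingResource(resource, group1, group2 string) *cohabitatingResource {
	return &cohabitatingResource{
		resource:       resource,
		groupResource1: schema.GroupResource{Group: group1, Resource: resource},
		groupResource2: schema.GroupResource{Group: group2, Resource: resource},
	}
}

func cohabitatingResources() map[string]*cohabitatingResource {
	return map[string]*cohabitatingResource{
		"deployments": newCohabitatingResource("deployments", "extensions", "apps"),
		"daemonsets":  newCohabitatingResource("daemonsets", "extensions", "apps"),
		"replicasets": newCohabitatingResource("replicasets", "extensions", "apps"),
	}
}

// Step 4: process a co-habitating resource only the first time it is
// encountered. With the lists sorted so that apps comes before extensions,
// the preferred group is compared and the legacy extensions entry is
// skipped instead of being reported as a missing GVK.
func skipIfSeen(cohabitating map[string]*cohabitatingResource, list *metav1.APIResourceList, resource metav1.APIResource) bool {
	cr, ok := cohabitating[resource.Name]
	if !ok {
		return false
	}
	gv, err := schema.ParseGroupVersion(list.GroupVersion)
	if err != nil {
		return false
	}
	gr := schema.GroupResource{Group: gv.Group, Resource: resource.Name}
	if gr != cr.groupResource1 && gr != cr.groupResource2 {
		return false
	}
	if cr.seen {
		// Already compared under the other (preferred) group; skip it.
		return true
	}
	cr.seen = true
	return false
}
```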

Note 1: Long term, this code should be shared; we should work upstream to refactor it into exported functions that can be imported and used by mig-controller.

Note 2: We need to explore co-habitating APIs specific to OpenShift, like DeploymentConfigs. To scope this down, we might take that as a follow-up.

References:

  1. https://github.com/vmware-tanzu/velero/blob/4b0f654a1e322b5f76533fbf28c456934a95bde1/pkg/backup/backup.go#L88-L87
  2. https://github.com/vmware-tanzu/velero/blob/0d97f9400e88d3251ac1367820d5e968fe1a210d/pkg/discovery/helper.go#L214
  3. https://github.com/konveyor/mig-controller/blob/d6e18043d84b21b82af530e930f7af41c0e6c1f2/pkg/gvk/gvk.go#L243
  4. https://github.com/konveyor/mig-controller/blob/d6e18043d84b21b82af530e930f7af41c0e6c1f2/pkg/gvk/gvk.go#L246
  5. https://github.com/vmware-tanzu/velero/blob/30ca0e4322dcc9fd9b4ba9560e9245df81db8d7a/pkg/backup/item_collector.go#L146-L157
