-
Notifications
You must be signed in to change notification settings - Fork 237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ClusterRelocate to move ClusterDeployments between Hive instances #1011
ClusterRelocate to move ClusterDeployments between Hive instances #1011
Conversation
98f2b15
to
a473798
Compare
} | ||
|
||
for _, cd := range clusterDeployments.Items { | ||
if cd.Annotations[constants.RelocatingAnnotation] != clusterRelocate.Name && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wasn't clear to me. Should it be if annotation != name OR !labelMatch ? I suppose the question is, what exactly is this condition catching?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is meant to filter out the ClusterDeployments that are not related to the ClusterRelocate. If the ClusterDeployment does not have a relocating annotation matching the ClusterRelocate AND does not have a label matching the ClusterRelocate, then the ClusterDeployment is not queued.
a473798
to
be97dcd
Compare
This has been tested now and is ready for further review. |
@@ -431,6 +431,18 @@ type CertificateBundleStatus struct { | |||
Generated bool `json:"generated"` | |||
} | |||
|
|||
// RelocateStatus is the status of a cluster relocate | |||
type RelocateStatus string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noting the annotation it's used in would be useful context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not seeing the annotation it's used in mentioned here, something missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I added it to ClusterRelocateStatus
but not here. That brings to light that I have a ClusterRelocateStatus
and a RelocateStatus
. That is too confusing. I need to rename one or both of those.
if err := r.copyResource(dnsZone, destClient, false, logger); err != nil { | ||
return errors.Wrap(err, "failed to copy dnszone") | ||
} | ||
} | ||
|
||
// copy clusterdeployment | ||
if err := r.copyResource(cd, destClient, true, logger); err != nil { | ||
return errors.Wrap(err, "failed to copy clusterdeployment") | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why the naked curly block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To keep the logger
variable isolate to inside the scope, in case there is other code added after the code block later that may use logger
and not want the type and resource field. But, now I see that I replaced the function-scoped logger
variable instead of creating a new logger
variable.
17031c1
to
91144bf
Compare
|
||
"github.com/pkg/errors" | ||
log "github.com/sirupsen/logrus" | ||
"sigs.k8s.io/controller-runtime/pkg/handler" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the grouping of the imports seems inconsistent. only one controller-runtime import is by itself.
Despite the minor annotation thing I'm going to lgtm and see if we can get this to merge today. |
/lgtm |
/retest Please review the full test history for this PR and help us cut down flakes. |
1 similar comment
/retest Please review the full test history for this PR and help us cut down flakes. |
Let me fix the annotation comment and address Joel's comment and squash everything down. |
/hold |
91144bf
to
c753c45
Compare
/hold cancel |
26fec3c
to
cb2aa73
Compare
When a ClusterRelocate has a label selector that matches with a ClusterDeployment, the clusterrelocate controller will relocate the matching ClusterDeployment. 1. Set the hive.openshift.io/relocate annotation to outgoing on the ClusterDeployment and the DNSZone. 2. Copy secrets, configmaps, machinesets, syncsets, and syncidentityproviders. 3. Copy the DNSZone for the ClusterDeployment, with the relocate annotation set to incoming. 4. Copy the ClusterDeployment, with the relocate annotation set to incoming. 5. Set the relocate annotation on the DNSZone and ClusterDeployemnt to complete. When a ClusterDeployment with a relocate annotation set to incoming is reconciled by the clusterRelocate controller, the relocate annotation is removed from the DNSZone and the ClusterDeployment. This indicates that the relocate has completed on the destination side. When a ClusterDeployment has the relocated annotation, no controllers will do any mutation of the remote cluster (such as syncing syncsets). When the ClusterDeployment has the relocated annotation, then controllers will not run finalizer code when the ClusterDeployment is deleted. Add a hive_cluster_relocations counter vec that tracks the total number of successful cluster relocations. The metric has a single cluster_relocate label that is the name of the ClusterRelocate directing the relocation. Add a hive_aborted_cluster_relocations counter vec that tracks the total number of aborted cluster relocations. The metrics has two labels: cluster_relocate and reason. The cluster_relocate label is the name of the ClusterRelocate directing the aborted relocation. The reason label is the reason why the relocate was aborted. Possible values for the reason label are "no_match", "multiple_matches", and "new_match". The number of failing cluster relocates can be seen by looking at the hive_cluster_deployments_conditions metric and filtering on a value of RelocationFailed for the condition label. https://issues.redhat.com/browse/CO-653
cb2aa73
to
dc6f5db
Compare
/test unit |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dgoodwin, staebler The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Please review the full test history for this PR and help us cut down flakes. |
1 similar comment
/retest Please review the full test history for this PR and help us cut down flakes. |
When a ClusterRelocate has a label selector that matches with a ClusterDeployment, the clusterrelocate controller will relocate the matching ClusterDeployment.
When a ClusterDeployment with a relocate annotation set to incoming is reconciled by the clusterRelocate controller, the relocate annotation is removed from the DNSZone and the ClusterDeployment. This indicates that the relocate has completed on the destination side.
When a ClusterDeployment has the relocated annotation, no controllers will do any mutation of the remote cluster (such as syncing syncsets). When the ClusterDeployment has the relocated annotation, then controllers will not run finalizer code when the ClusterDeployment is deleted.
Add a hive_cluster_relocations counter vec that tracks the total number of successful cluster relocations. The metric has a single
cluster_relocate label that is the name of the ClusterRelocate directing the relocation.
Add a hive_aborted_cluster_relocations counter vec that tracks the total number of aborted cluster relocations. The metrics has two labels: cluster_relocate and reason. The cluster_relocate label is the name of the ClusterRelocate directing the aborted relocation. The reason label is the reason why the relocate was aborted. Possible values for the reason label are "no_match", "multiple_matches", and "new_match".
The number of failing cluster relocates can be seen by looking at the hive_cluster_deployments_conditions metric and filtering on
a value of RelocationFailed for the condition label.
https://issues.redhat.com/browse/CO-653