
Ordered initialization for the DRPC condition array #920

Merged
merged 2 commits into RamenDR:main from fix-status-condition-ordering on Aug 3, 2023

Conversation

BenamarMk
Member

@BenamarMk BenamarMk commented Jun 8, 2023

Introduce ordered initialization for the condition array, ensuring each condition has a consistent location. Updates preserve the condition order.

Fixes this bug1 and this bug2

@BenamarMk BenamarMk force-pushed the fix-status-condition-ordering branch from b5e7494 to c7e6792 Compare June 8, 2023 20:04
Member

@nirs nirs left a comment


Why do we need to preserve the order? Is this required to fix some issue?

existingCondition.Status != newCondition.Status ||
existingCondition.ObservedGeneration != newCondition.ObservedGeneration ||
existingCondition.Reason != newCondition.Reason ||
existingCondition.Message != newCondition.Message {
Member


It would be nice to extract this to conditionsEqual() (like https://pkg.go.dev/bytes#Equal) later.
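The suggested extraction could look like the sketch below. The `Condition` struct is a minimal stand-in for `metav1.Condition` so the example is self-contained; `conditionsEqual` is a hypothetical name taken from the review comment, not the PR's actual code.

```go
package main

import "fmt"

// Condition is a minimal stand-in for the metav1.Condition fields compared
// in the PR, so this sketch is self-contained.
type Condition struct {
	Type               string
	Status             string
	ObservedGeneration int64
	Reason             string
	Message            string
}

// conditionsEqual replaces the inline four-field comparison with a single
// call, mirroring the style of bytes.Equal.
func conditionsEqual(a, b *Condition) bool {
	return a.Status == b.Status &&
		a.ObservedGeneration == b.ObservedGeneration &&
		a.Reason == b.Reason &&
		a.Message == b.Message
}

func main() {
	x := Condition{Status: "True", ObservedGeneration: 1, Reason: "Ok", Message: "ready"}
	y := x
	fmt.Println(conditionsEqual(&x, &y)) // true
	y.Message = "changed"
	fmt.Println(conditionsEqual(&x, &y)) // false
}
```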

@BenamarMk
Member Author

BenamarMk commented Jun 12, 2023

Why do we need to preserve the order? Is this required to fix some issue?

@nirs I updated the description to include the bugs that this PR fixes.
The order is not required, however, as mentioned in the bug comment it solves a cosmetic problem when showing DRPC output in wide option mode. It is due to this line. I couldn't find a way to print the PeerReady status without hardcoding the index.

@nirs
Member

nirs commented Jun 12, 2023

@BenamarMk you wrote in the bug:

This is a cosmetic issue. The status Unknown is misleading. It actually belongs to the Available condition. The issue occurs specifically when hub recovery is performed. The DRPC status conditions might be rebuilt out of order. We depend on PeerReady condition to be at index 1.

Why do we care about the index of the condition and not using the condition (unique) name?

How does this change ensure the order of the conditions?

@BenamarMk
Member Author

BenamarMk commented Jun 14, 2023

Why do we care about the index of the condition and not using the condition (unique) name?

This line needs a hardcoded index for the PeerReady column to show with the -o wide option. Let me know if you have a way to achieve what we want without hardcoding it.
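For context, CRD printer columns are declared with kubebuilder markers whose JSONPath cannot filter a list entry by field value, which is presumably why a fixed index is needed. A hypothetical illustration of such a marker (not the PR's actual line):

```go
// If server-side printing supported filtered JSONPath, something like
// .status.conditions[?(@.type=='PeerReady')].status could select the
// condition by name; since it does not, the column pins the index:
// +kubebuilder:printcolumn:name="PeerReady",type=string,JSONPath=".status.conditions[1].status"
```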

How does this change ensure the order of the conditions?

Once the conditions are added to the status, they will not be removed.

controllers/drplacementcontrol_controller.go
@@ -107,7 +107,7 @@ func (d *DRPCInstance) RunInitialDeployment() (bool, error) {

if homeCluster == "" {
err := fmt.Errorf("PreferredCluster not set. Placement (%v)", d.userPlacement)
d.setDRPCCondition(&d.instance.Status.Conditions, rmn.ConditionAvailable, d.instance.Generation,
addOrUpdateCondition(&d.instance.Status.Conditions, rmn.ConditionAvailable, d.instance.Generation,
Member


Why replace calls to d.setDRPCCondition() with addOrUpdateCondition()?

If this refactoring is required to implement ensureDRPCConditionsInited(), it is
better done in a separate commit. Mixing refactoring and a bug fix in the same
commit makes the change harder to understand, both in review and six months later
when someone tries to understand the history while fixing another issue.

If this change is not required to fix the bug, better not mix it with the fix.

Member Author


A fair point.

return false
}

// Initial creation of the DRPC status condition. This will also preserve the ordering of conditions in the array
Member


It would be useful to explain here why the ordering of the conditions matters, and link to the relevant bug caused by not ensuring the order.

Benamar Mekhissi added 2 commits June 14, 2023 17:39
Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
Introduce ordered initialization for the condition array in the DRPC module.
This change ensures that the conditions are consistently located in the array,
allowing for easier management and maintenance. Additionally, the updates made
to the code now properly preserve the order of the conditions, providing more
reliable and predictable behavior.

Signed-off-by: Benamar Mekhissi <bmekhiss@ibm.com>
@BenamarMk BenamarMk force-pushed the fix-status-condition-ordering branch from c7e6792 to 4dd8e37 Compare June 14, 2023 21:45
Member

@nirs nirs left a comment


Thanks for splitting the commits! It is much clearer now.

// Initial creation of the DRPC status condition. This will also preserve the ordering of conditions in the array
func ensureDRPCConditionsInited(conditions *[]metav1.Condition, observedGeneration int64, message string) {
const DRPCTotalConditions = 2
if len(*conditions) == DRPCTotalConditions {
Member


This constant is not helping, it does not provide more info and it is not used elsewhere.

Member Author


Right. It is not used anywhere, but it helps a little in two ways:

  1. It is readable. IOW, now you know what 2 means. And the const tells you that it is really for readability only and that the value of 2 is a fixed expectation: the condition array can't have more than 2 or fewer than 2 entries.
  2. Satisfies the linter.

Member


If we think the maximum number of conditions is important and helpful, why not add a module-level constant? It would make the code nicer even if we don't use it elsewhere. The current form looks like an attempt to make the linter happy.

Member Author


We absolutely can. Every time you stare at a block of code for long enough, you end up thinking of a better way to rewrite it; that is just the nature of software engineering. For now, I will leave it as is for this PR.

@@ -1759,3 +1761,30 @@ func addOrUpdateCondition(conditions *[]metav1.Condition, conditionType string,

return false
}

// Initial creation of the DRPC status condition. This will also preserve the ordering of conditions in the array
func ensureDRPCConditionsInited(conditions *[]metav1.Condition, observedGeneration int64, message string) {
Member


Any reason to accept a separate observedGeneration and message? Since this is used only
at initialization, we always use drpc.Generation and "Initialization". We could accept a single argument, drpc, and use its fields. This makes the function impossible to use incorrectly. Do we really need the flexibility to call it with any generation or message?
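The narrower signature being proposed could look like this sketch. The type definitions are minimal hypothetical stand-ins for the rmn and metav1 types, and the function name and message are illustrative, not the PR's actual code.

```go
package main

// Minimal hypothetical stand-ins for the rmn/metav1 types, so the sketch
// is self-contained.
type Condition struct {
	Type               string
	ObservedGeneration int64
	Message            string
}

type DRPlacementControl struct {
	Generation int64
	Status     struct{ Conditions []Condition }
}

// initDRPCConditions takes the DRPC itself as the only argument, so the
// generation and message cannot diverge from the object being initialized.
func initDRPCConditions(drpc *DRPlacementControl) {
	if len(drpc.Status.Conditions) > 0 {
		return // already initialized
	}

	drpc.Status.Conditions = append(drpc.Status.Conditions,
		Condition{Type: "Available", ObservedGeneration: drpc.Generation, Message: "Initialization"},
		Condition{Type: "PeerReady", ObservedGeneration: drpc.Generation, Message: "Initialization"},
	)
}

func main() {
	drpc := &DRPlacementControl{Generation: 3}
	initDRPCConditions(drpc)
	println(len(drpc.Status.Conditions)) // 2
}
```

With this shape a caller cannot pass a stale generation or an arbitrary message; the function derives both from the object it mutates.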

Status: metav1.ConditionTrue,
LastTransitionTime: time,
Message: message,
})
Member


This works only if both conditions are missing, only rmn.ConditionAvailable exists, or both exist. Is it possible that only rmn.ConditionPeerReady exists and is at index 0 when this is called?

Member


While true, this does not occur at present, nor post this PR. The added logic to copy the existing condition in the right order may be unwanted after this change.

Member Author


This works only if both conditions are missing, only rmn.ConditionAvailable exists, or both exist. Is it possible that only rmn.ConditionPeerReady exists and is at index 0 when this is called?

At the start, we want ConditionAvailable to be at index 0 and ConditionPeerReady at index 1, and that is what this patch does. So there is no window for ConditionPeerReady to exist before ConditionAvailable.

Member


We can add a comment describing this assumption. Typically you assert() on this, so if your assumption is wrong you find it during testing.
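A guard like the one suggested could look like the sketch below. The `Condition` struct and the "Available"/"PeerReady" strings are minimal stand-ins for `metav1.Condition`, `rmn.ConditionAvailable`, and `rmn.ConditionPeerReady`; the panic stands in for whatever assertion mechanism the project prefers.

```go
package main

// Condition is a minimal stand-in for metav1.Condition.
type Condition struct {
	Type               string
	Status             string
	ObservedGeneration int64
	Message            string
}

// ensureDRPCConditionsInited appends the DRPC conditions in a fixed order,
// so Available is always at index 0 and PeerReady at index 1. Per the
// review suggestion, it fails loudly if the assumption (PeerReady never
// exists without Available) is ever violated.
func ensureDRPCConditionsInited(conditions *[]Condition, gen int64, msg string) {
	if len(*conditions) >= 2 {
		return // both conditions already present; order was fixed at creation
	}

	if len(*conditions) == 1 && (*conditions)[0].Type != "Available" {
		panic("assumption violated: a condition other than Available is at index 0")
	}

	if len(*conditions) == 0 {
		*conditions = append(*conditions, Condition{
			Type: "Available", Status: "True", ObservedGeneration: gen, Message: msg,
		})
	}

	*conditions = append(*conditions, Condition{
		Type: "PeerReady", Status: "True", ObservedGeneration: gen, Message: msg,
	})
}

func main() {
	var conds []Condition
	ensureDRPCConditionsInited(&conds, 1, "Initialization")
	println(conds[0].Type, conds[1].Type) // Available PeerReady
}
```

The guard makes the ordering assumption executable: if a future code path ever creates PeerReady first, tests would hit the panic instead of silently producing a misordered array.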

@ShyamsundarR ShyamsundarR added the high Issue is of high priority and needs attention label Jul 3, 2023
Type: rmn.ConditionAvailable,
Reason: string(rmn.Initiating),
ObservedGeneration: observedGeneration,
Status: metav1.ConditionTrue,
Member


Can we start with Unknown as the initial condition status? True always gives me the jitters in other parts of the code, where we at times compare the status and the generation.

Member Author


I like starting with the status set to Unknown as well. But during testing there was a case where that would break; I don't remember which one. In any case, True won't hurt. Especially on the initial run of the DRPC instance, the world is good until it isn't.

Member


We can add a comment suggesting we consider starting with the Unknown state. Starting with a fake "good" state causes a lot of trouble in other cases, like restarting a minikube env: we get fake status from everything for about 30 seconds, and that is very hard to manage. It may not be relevant to DRPC.

Member Author


Again, start positive until there is a need to switch. Starting with Unknown would just cause more trouble for other clients; the UI is one of them.

@ShyamsundarR
Member

Merging this to make progress on the issue; please track and close the remaining review comments as separate issues.

@ShyamsundarR ShyamsundarR merged commit e9a78fc into RamenDR:main Aug 3, 2023
9 checks passed