Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Timestamp for Health Status (#16972) #18660

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

mkieweg
Copy link

@mkieweg mkieweg commented Jun 14, 2024

This PR contains an implementation of the enhancement proposal #16972.
Closes #16972

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
  • The title of the PR states what changed and the related issues number (used for the release note).
  • The title of the PR conforms to the Toolchain Guide
  • I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
  • I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
  • Does this PR require documentation updates?
  • I've updated documentation as required by this PR.
  • I have signed off all my commits as required by DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My build is green (troubleshooting builds).
  • My new feature complies with the feature status guidelines.
  • I have added a brief description of why this PR is necessary and/or what this PR solves.
  • Optional. My organization is added to USERS.md.
  • Optional. For bug fixes, I've indicated what older releases this fix should be cherry-picked into (this may or may not happen depending on risk/complexity).

@mkieweg mkieweg requested a review from a team as a code owner June 14, 2024 10:52
@mkieweg mkieweg marked this pull request as draft June 14, 2024 10:59
@mkieweg mkieweg marked this pull request as ready for review June 17, 2024 10:23
Copy link
Member

@jannfis jannfis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I have a couple of initial things, PTAL :)

controller/health.go Outdated Show resolved Hide resolved
controller/health.go Outdated Show resolved Hide resolved
Copy link
Contributor

@svghadi svghadi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mkieweg for the PR. I have left a suggestion. I think below section also needs an update.

if orig.Status.Health.Status != newStatus.Health.Status {
message := fmt.Sprintf("Updated health status: %s -> %s", orig.Status.Health.Status, newStatus.Health.Status)
ctrl.logAppEvent(orig, argo.EventInfo{Reason: argo.EventReasonResourceUpdated, Type: v1.EventTypeNormal}, message, context.TODO())
}
var newAnnotations map[string]string
if orig.GetAnnotations() != nil {
newAnnotations = make(map[string]string)
for k, v := range orig.GetAnnotations() {
newAnnotations[k] = v
}
delete(newAnnotations, appv1.AnnotationKeyRefresh)
}
patch, modified, err := diff.CreateTwoWayMergePatch(
&appv1.Application{ObjectMeta: metav1.ObjectMeta{Annotations: orig.GetAnnotations()}, Status: orig.Status},
&appv1.Application{ObjectMeta: metav1.ObjectMeta{Annotations: newAnnotations}, Status: *newStatus}, appv1.Application{})

Looks like CreateTwoWayMergePatch will always detect a drift here due to the new timestamp field and try to patch the status when there is no actual change in app health status. We might need to explicitly set it to old/orig value if there is not change in app health status. Update: not required.

controller/health.go Outdated Show resolved Hide resolved
@mkieweg
Copy link
Author

mkieweg commented Jun 19, 2024

Thanks @mkieweg for the PR. I have left a suggestion. I think below section also needs an update.

if orig.Status.Health.Status != newStatus.Health.Status {
message := fmt.Sprintf("Updated health status: %s -> %s", orig.Status.Health.Status, newStatus.Health.Status)
ctrl.logAppEvent(orig, argo.EventInfo{Reason: argo.EventReasonResourceUpdated, Type: v1.EventTypeNormal}, message, context.TODO())
}
var newAnnotations map[string]string
if orig.GetAnnotations() != nil {
newAnnotations = make(map[string]string)
for k, v := range orig.GetAnnotations() {
newAnnotations[k] = v
}
delete(newAnnotations, appv1.AnnotationKeyRefresh)
}
patch, modified, err := diff.CreateTwoWayMergePatch(
&appv1.Application{ObjectMeta: metav1.ObjectMeta{Annotations: orig.GetAnnotations()}, Status: orig.Status},
&appv1.Application{ObjectMeta: metav1.ObjectMeta{Annotations: newAnnotations}, Status: *newStatus}, appv1.Application{})

Looks like CreateTwoWayMergePatch will always detect a drift here due to the new timestamp field and try to patch the status when there is no actual change in app health status. We might need to explicitly set it to old/orig value if there is not change in app health status.

Thanks for pointing that out, I'll give it a look

@svghadi
Copy link
Contributor

svghadi commented Jun 19, 2024

Hi @mkieweg, I think you can ignore my previous suggestion regarding changes due to CreateTwoWayMergePatch in persistAppStatus func. When I looked at the code again, I see the values passed to persistAppStatus func are actually deep copy of each other

app := origApp.DeepCopy()

and will differ only when explicitly updated, which happens at 2 places. We need to ensure timestamp is updated at these places (looks like 2. is already handled in your change).

  1. app.Status.Health.Status = health.HealthStatusUnknown

  2. app.Status.Health = *compareResult.healthStatus

@mkieweg
Copy link
Author

mkieweg commented Jun 19, 2024

Hi @mkieweg, I think you can ignore my previous suggestion regarding changes due to CreateTwoWayMergePatch in persistAppStatus func. When I looked at the code again, I see the values passed to persistAppStatus func are actually deep copy of each other

app := origApp.DeepCopy()

and will differ only when explicitly updated, which happens at 2 places. We need to ensure timestamp is updated at these places (looks like 2. is already handled in your change).

1. https://github.com/argoproj/argo-cd/blob/029b5acd5462e153b72fe0f7b1c5fa38e311d05c/controller/appcontroller.go#L1545

2. https://github.com/argoproj/argo-cd/blob/029b5acd5462e153b72fe0f7b1c5fa38e311d05c/controller/appcontroller.go#L1634

Hi @svghadi, I've added setting the timestamp in case we set the health status to unknown. The second one is indeed already handled by CompareAppState

@mkieweg mkieweg requested review from jannfis and svghadi June 20, 2024 10:45
@jessesuen
Copy link
Member

jessesuen commented Jun 21, 2024

We actually used to have a timestamp, but the updating of the timestamp as part of every app reconciliation would cause a ton of K8s API server pressure. I need to find the PR that removed this, but this could actually cause a performance problem.

@jessesuen
Copy link
Member

I found the old issue. It is here: #1340

Basically, we used to update status.observedAt, but this update would cause immense pressure on K8s API.

@jessesuen
Copy link
Member

this could actually cause a performance problem.

I took a closer look at this implementation, and noticed that we don't needlessly update timestamps (as we did with the problematic status.observedAt field)

			// If the status didn't change, we don't want to update the timestamp
			if healthStatus.Status == statuses[i].Health.Status {
				now = lastTransitionTime
			}

So my performance concerns may be alleviated.

@blakepettersson
Copy link
Member

In the E2E test TestNamespacedConfigMap the output for some reason has changed from my-map Synced configmap/my-map created to my-map Synced Healthy configmap/my-map created.

Copy link
Contributor

@svghadi svghadi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran some tests and found some corner cases that need to be handled:

  1. Timestamp for individual resources is updated to Application's LastTransitionTime even when there is no change in the resource status.
  2. The timestamp for Application's health is not updated when the status changes from Non Healthy to Healthy state.

Comment on lines 66 to 71
// If the status didn't change, we don't want to update the timestamp
if healthStatus.Status == statuses[i].Health.Status {
now = lastTransitionTime
}

resHealth := appv1.HealthStatus{Status: healthStatus.Status, Message: healthStatus.Message, LastTransitionTime: now}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section sets health of individual resources of the application. If there is no status change, I feel it should be set to lastTransitionTime of that particular resource instead of setting it to application's lastTransitionTime, as the values may not necessarily be the same.

Suggested change
// If the status didn't change, we don't want to update the timestamp
if healthStatus.Status == statuses[i].Health.Status {
now = lastTransitionTime
}
resHealth := appv1.HealthStatus{Status: healthStatus.Status, Message: healthStatus.Message, LastTransitionTime: now}
// If the status didn't change, we don't want to update the timestamp
if healthStatus.Status == statuses[i].Health.Status {
now = statuses[i].Health.LastTransitionTime
}
resHealth := appv1.HealthStatus{Status: healthStatus.Status, Message: healthStatus.Message, LastTransitionTime: now}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me have a look and get back to you on this. I vaguely remember, that the current implementation only persists the timestamp on an application level. So while setting it on the resource level here might make sense, but I'm not sure if it has any benefit if the rest of the implementation remains unchanged. Which I would advocate for at the moment.

Copy link
Contributor

@svghadi svghadi Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the code. The resource-level timestamp is indeed difficult to achieve due to the current implementation.
Since the resource timestamp will contain the app health timestamp, I feel it may not be that useful. How about we skip it at the resource level? Just have it at the application health level, i.e. .status.health. If you change the timestamp to a pointer type, I think we can ignore it at resource level and it won't show up in .status.resources.

Comment on lines 89 to 95
if persistResourceHealth {
appHealth.LastTransitionTime = statuses[i].Health.LastTransitionTime
} else {
appHealth.LastTransitionTime = now
}
} else if healthStatus.Status == health.HealthStatusHealthy {
appHealth.LastTransitionTime = lastTransitionTime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to handle case where status transition happens from Non Healthy -> Healthy. Since we have access to old status values, would it make sense to set the time value outside the for loop after the new status of application is calculated? Something like

diff --git a/controller/health.go b/controller/health.go
index b1c457776..e1a2c5d19 100644
--- a/controller/health.go
+++ b/controller/health.go
@@ -86,15 +86,11 @@ func setApplicationHealth(resources []managedResource, statuses []appv1.Resource
 
 		if health.IsWorse(appHealth.Status, healthStatus.Status) {
 			appHealth.Status = healthStatus.Status
-			if persistResourceHealth {
-				appHealth.LastTransitionTime = statuses[i].Health.LastTransitionTime
-			} else {
-				appHealth.LastTransitionTime = now
-			}
-		} else if healthStatus.Status == health.HealthStatusHealthy {
-			appHealth.LastTransitionTime = lastTransitionTime
 		}
 	}
+	if persistResourceHealth {
+          if app.Status.Health.Status == appHealth.Status {
+		  appHealth.LastTransitionTime = lastTransitionTime
+          } else {
+                 appHealth.LastTransitionTime = metav1.Now()
+          }
+	}
 	if persistResourceHealth {
 		app.Status.ResourceHealthSource = appv1.ResourceHealthLocationInline
 	} else {

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've managed to repro the issue. Unfortunately it persists with your suggestion. I'll look into it deeper tomorrow

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added another test to appcontroller_test.go where the issue you're describing doesn't happen. I think the code works as expected, but maybe you can have another look :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran a manual test by creating a pod with delayed start. The time is updated for Missing->Progressing state change but it doesn't get updated when state moves from Progressing->Healthy.

timestamp-no-update.mov

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's interesting. Let me have a closer look

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some more extensive testing. Currently they break because I couldn't find a way to make them work with the way we're using mocks. Any suggestions on how to make TestUpdateHealthStatusProgression work are welcome

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've managed to repro the issue. Unfortunately it persists with your suggestion.

Hmm, I haven't been able to repro this after applying @svghadi's suggestion.

kubectl get app guestbook -o jsonpath='{.status.health}'
{"lastTransitionTime":"2024-06-27T11:13:06Z","status":"Progressing"}
kubectl get app guestbook -o jsonpath='{.status.health}'
{"lastTransitionTime":"2024-06-27T11:13:31Z","status":"Healthy"}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkieweg I added some changes which should fix your tests

Copy link
Member

@blakepettersson blakepettersson Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hiddeco wrote in #16972 that this doesn't quite account for all edge cases.

While I am super supportive of this being added, I have an edge case that I think falls into this category but is not covered by just this addition.

Within Argo CD, the sync operations are completely separated from the application health. However, in certain scenarios, you do want to know if the current state of the health check is the outcome of a previous sync operation.

With the proposal, this would become possible by comparing the .finishedAt from the .status.operationState against the .lastUpdateTime. This, however, only works when you sync workload resources, which means that it will not help in scenarios where you only have non-workload resources, which will always end up Healthy without causing a transition.
...
I wonder if it would be an idea to introduce a "heartbeat" kind of timestamp that updates at most every 30 seconds (to not cause any API pressure) to indicate the "freshness" of the observation even without it transitioning.

By modifying @svghadi's suggestion slightly I think that would be achievable by doing this (haven't tested this so caveat emptor):

diff --git a/controller/health.go b/controller/health.go
index b1c457776..e1a2c5d19 100644
--- a/controller/health.go
+++ b/controller/health.go
@@ -86,15 +86,11 @@ func setApplicationHealth(resources []managedResource, statuses []appv1.Resource
 
 		if health.IsWorse(appHealth.Status, healthStatus.Status) {
 			appHealth.Status = healthStatus.Status
-			if persistResourceHealth {
-				appHealth.LastTransitionTime = statuses[i].Health.LastTransitionTime
-			} else {
-				appHealth.LastTransitionTime = now
-			}
-		} else if healthStatus.Status == health.HealthStatusHealthy {
-			appHealth.LastTransitionTime = lastTransitionTime
 		}
 	}

 	if persistResourceHealth {
+          if app.Status.Health.Status == appHealth.Status && time.Since(lastTransitionTime.Time) < 30*time.Second {
+		  appHealth.LastTransitionTime = lastTransitionTime
+          } else {
+                 appHealth.LastTransitionTime = metav1.Now()
+          }
 		app.Status.ResourceHealthSource = appv1.ResourceHealthLocationInline
 	} else {

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion however would be to have the heartbeat timestamp as a separate field (as e.g. "last observed" or "heartbeat"), as otherwise it could create a false impression that a transition did take place.

controller/appcontroller_test.go Show resolved Hide resolved
controller/appcontroller_test.go Outdated Show resolved Hide resolved
controller/appcontroller_test.go Outdated Show resolved Hide resolved
@svghadi
Copy link
Contributor

svghadi commented Sep 3, 2024

Hi @mkieweg, did you get a chance to work on this again? If you're busy with other tasks, I'm happy to jump in and continue with it.

@mkieweg
Copy link
Author

mkieweg commented Sep 3, 2024

Hi @svghadi, sorry I had to pause working on this due to shifting priorities. I was planning on picking this back up this week. I'm not sure how much time I can actually commit, so if you're happy to take over that would be okay with me

@blakepettersson
Copy link
Member

@svghadi I can also help out if needed

@svghadi
Copy link
Contributor

svghadi commented Sep 3, 2024

@mkieweg - Sure. Could I get collaborator access to your argo-cd fork, or should I cherry-pick the changes and continue in a new PR?

@blakepettersson - Thank you. Should we address @hiddeco’s use case in this implementation, or would it be better to keep the scope limited and handle it in a separate PR?

@mkieweg
Copy link
Author

mkieweg commented Sep 3, 2024

@svghadi I've added you as a contributor to my fork. Can do the same for @blakepettersson if needed

@blakepettersson
Copy link
Member

@blakepettersson - Thank you. Should we address @hiddeco’s use case in this implementation, or would it be better to keep the scope limited and handle it in a separate PR?

Ideally I'd like to see his use case addressed in this PR as well, since we have a particular use case in mind for it (it's related to Kargo).

@blakepettersson
Copy link
Member

Can do the same for @blakepettersson if needed

Sounds good to me, thanks!

@svghadi
Copy link
Contributor

svghadi commented Sep 4, 2024

Hi @blakepettersson , I was thinking about how to incorporate @hiddeco 's use case into this PR. It seems a bit challenging without introducing a new field, as @mkieweg's use case requires a timestamp that doesn't update periodically.

I propose to add a lastUpdateTime timestamp to app.status.health indicating how long the current status is active for.

In contrast, @hiddeco 's use case needs a periodically updating status timestamp.

introduce a "heartbeat" kind of timestamp that updates at most every 30 seconds (to not cause any API pressure) to indicate the "freshness" of the observation even without it transitioning.

Here are my thoughts:

  • Introduce a new observedAt field under application health, which updates periodically to indicate freshness.
  • Make this configurable to address @jessesuen 's concern about performance impact.

The health status would look something like this:

status:
  health:
    lastTransitionTime: "2024-09-03T08:51:32Z"
    observedAt: "2024-09-03T08:51:32Z"   # configurable 
    status: Progressing

Thoughts/suggestions?

@blakepettersson
Copy link
Member

blakepettersson commented Sep 4, 2024

Hi @svghadi,

I agree that we'll need two separate fields, one for the health status and one to indicate "freshness".

Introduce a new observedAt field under application health, which updates periodically to indicate freshness.

👍

Make this configurable to address @jessesuen 's concern about performance impact.

I think it's fine for now to keep it at max 30 secs - if there is a need to configure this at a later stage we can address that in another PR.

@svghadi
Copy link
Contributor

svghadi commented Sep 12, 2024

Hi @blakepettersson, the PR is ready for review.

Changes:

  • Adds a new .status.health.lastTransitionTime field that reflects when the last health status change occurred.
  • Adds a new .status.health.observedAt field that is periodically updated (every 30s) to reflect the freshness of the observed state.
  • A new event handler is set up to update the observedAt field every 30s. It enqueues events into the appRefreshQueue with a comparison option that ensures only the status is refreshed without performing a resource-level comparison. If there are concerns regarding overloading the queue, we can setup a new queue to process these events.

The screen grab below shows that observedAt is updated every 30s without affecting the lastTransitionTime field.

Screen.Recording.2024-09-12.at.6.53.06.PM.mov

Copy link
Member

@blakepettersson blakepettersson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (pending the minor lint fixes), thanks @svghadi!

}

// We are interested only in refreshing the status, not performing a resource-level comparison.
ctrl.requestAppRefresh(newApp.QualifiedName(), ComparisonWithNothing.Pointer(), nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to perhaps only do the refresh while a sync is taking place for the app? As it stands now, this will refresh every 30 seconds throughout the app's lifecycle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry, I am not quite following. Are you suggesting updating the .status.health.observedAt field only when the app is syncing? The refresh done here(every 30s) uses ComparisonWithNothing option, and AFAIU, it just rebuilds the app's resource tree without doing any manifest level comparison. However, I’m not sure how expensive that operation is.

Copy link
Member

@blakepettersson blakepettersson Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting updating the .status.health.observedAt field only when the app is syncing?

Yes exactly 😄

However, I’m not sure how expensive that operation is.

For a single app it's likely negligible, but @crenshaw-dev raised the concern that if there were say 10k apps, every 30 secs there would be 10k requests hammering the K8s API server (updating the .status.health.observedAt field). It's unclear if that is an actual concern or not, but something to keep in mind. The hope is to sidestep that concern by only performing this refresh during app sync.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a fair concern.

The hope is to sidestep that concern by only performing this refresh during app sync.

I believe we can achieve this without significant changes to the current implementation by removing the new AddEventHandlerWithResyncPeriod event handler. However, without a periodic refresh, I am not sure how much value this new field provides. Existing fields already provide the same information that .status.health.observedAt would, if updated only during syncs.

Another approach could be to enable updates to this field during app sync by default and put the periodic update behind a flag.

Copy link
Member

@crenshaw-dev crenshaw-dev Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without a periodic refresh, I am not sure how much value this new field provides

My understanding of the 30s refresh is that it's just a "safety net." My assumption would be that the health is constantly being refreshed based changes in the cluster cache and that the 30s refresh would only catch updates that were somehow missed by the cluster cache, a case which I'd expect to be rare.

Copy link
Member

@crenshaw-dev crenshaw-dev Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, I read above that the intent is to bump the observedAt timestamp to prove that the health has been evaluated more recently than the last sync operation.

I worry about accruing tech debt (with a feature behind a flag) or incurring a performance penalty (with a feature without a flag) to provide a service answering the question "has this app been evaluated to be healthy since the app was synced?" There are already a few ways to do this:

  1. Wait a reasonable amount of time after the app has synced to observe the health (makes it very likely the health was evaluated)
  2. Request an app refresh and then observe the health (guarantees the health was evaluated)
  3. Add a post-sync resource hook to the app (ties the sync success/failure to resource health)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From both the perspective of a user and as an engineer trying to reliably extract information from ArgoCD based on the status it reflects, I do not see how any of these solve the problem reliably or without introducing new actions to take (defeating the whole purpose of the status of a Kubernetes object?).

They feel more like workarounds for something that's an observability issue at its core due to design choices made in both the API and controller, which my suggestion was attempting to address. If the primary concern is API pressure, I am happy with any alternative that would be updated frequently even if no transition occurs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure... I think it's nice for Argo CD to produce a fairly-recent timestamp for the most recent observed health. I just wanna make sure it's worth the performance and maintenance costs as compared to other options. Do we have some sense for a typical comparison_with_nothing_ms distribution on a fairly-busy Argo instance? Is there reason to believe we might need the heartbeat time to be configurable or to include jitter? I'm in favor of spending CPU cycles on useful stuff, I just think we should first have a clear sense for how many CPU cycles we're spending.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, looks like that checkpoint was only added in 2.13, so I don't have any stats from Intuit.

I'm not super worried about it tbh. I'd just recommend trying it out on some relatively busy Argo instance and watching the effect on both the k8s API and Argo controller performance.

@svghadi
Copy link
Contributor

svghadi commented Oct 24, 2024

Hi @blakepettersson , did you get a chance to look into @crenshaw-dev suggestion on performance test? Should we consider splitting the PR to unblock the other feature which doesn't need performance testing.

@blakepettersson
Copy link
Member

Hi @svghadi sadly no-one in our team has had the time to do so - I would support splitting this PR into two parts so we could at least move forward with the least contentious part, and then we will need to find some time to do performance testing of the ObservedAt field.

mkieweg and others added 12 commits November 3, 2024 09:40
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
Co-authored-by: Blake Pettersson <blake.pettersson@gmail.com>
Signed-off-by: Manuel Kieweg <2939765+mkieweg@users.noreply.github.com>
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
Signed-off-by: Manuel Kieweg <mail@manuelkieweg.de>
Due to implementation limitations, setting LastTransitionTime at the resource level is challenging.
Converting it to a pointer type allows it to be skipped at the resource level and prevents it from appearing
in .status.resources of the Application CR. Additionally, it doesn’t provide much value or have a known
use case right now.

Signed-off-by: Siddhesh Ghadi <sghadi1203@gmail.com>
@svghadi
Copy link
Contributor

svghadi commented Nov 3, 2024

Hi @blakepettersson, @crenshaw-dev, I have removed the concerning ObservedAt field changes from this PR. Can you do a fresh review? I will create a separate PR with ObservedAt field changes to address @hiddeco use-case post merge of this PR.

hs.status = "Degraded"
end
end
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to override statuses?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you can use a modernized way of specifying healthchecks (not that it matters here), e.g. https://argo-cd.readthedocs.io/en/latest/operator-manual/health/#argocd-app

for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
deployment.SetLabels(map[string]string{"status": tc.initialStatus})
ctrl.processAppRefreshQueueItem()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have to trigger it manually instead of requesting app refresh?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Ready for final review
Development

Successfully merging this pull request may close these issues.

Add Timestamps to Health Status
8 participants