check-endpoints: handle out of order results #917
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: sanchezl.
73a064d to ac42f38
ac42f38 to 56f66d9
/test e2e-aws
@@ -186,7 +185,7 @@ func isDNSError(err error) bool {

 // manageStatusLogs returns a status update function that updates the PodNetworkConnectivityCheck.Status's
 // Successes/Failures logs reflect the results of the check.
-func manageStatusLogs(check *operatorcontrolplanev1alpha1.PodNetworkConnectivityCheck, checkErr error, latency *trace.LatencyInfo) []v1alpha1helpers.UpdateStatusFunc {
+func manageStatusLogs(check *operatorcontrolplanev1alpha1.PodNetworkConnectivityCheck, checkErr error, latency *trace.LatencyInfo) ([]v1alpha1helpers.UpdateStatusFunc, time.Time) {
Update the doc. Is this time the time the check started?
Updated.
// UpdatesManager manages a queue of updates. The lock must be obtained before
// invoking any of the methods.
type UpdatesManager interface {
	sync.Locker
no direct embedding.
Updated.
// Add an update to the queue. There is a delay equal to the size of the sorting window before
// updates are made available on the queue to allow for updates submitted out of order within
// the sorting window to be sorted by timestamp.
func (u *updatesManager) Add(timestamp time.Time, updates ...v1alpha1helpers.UpdateStatusFunc) {
Why are you managing the lock outside of the method? I'd rather manage the lock locally here.
Refactored.
// outside of a sorting window, anchored on one end by the latest update, for processing.
type updatesManager struct {
	sync.Mutex
	window time.Duration
You need to document these. They aren't obvious.
Done.
// Add an update to the queue. There is a delay equal to the size of the sorting window before
// updates are made available on the queue to allow for updates submitted out of order within
// the sorting window to be sorted by timestamp.
func (u *updatesManager) Add(timestamp time.Time, updates ...v1alpha1helpers.UpdateStatusFunc) {
needs test
Added in followup #924.
_, _, err := v1alpha1helpers.UpdateStatus(ctx, c.client, c.name, c.updates...)
c.updates.Lock()
defer c.updates.Unlock()
if len(c.updates.Queue()) > 20 {
make this function return a copy so that you don't need to manage the lock here.
make this function return a copy so that you don't need to manage the lock here.
oh, blech. The Clear is being caught up in here too. How about just making UpdateStatus native on the updatesManager?
Refactored.
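For readers following the locking discussion above, here is a minimal sketch of the suggested shape: the mutex is an unexported field rather than an embedded `sync.Locker`, and the queue is copied and cleared inside one method so callers never touch the lock. The names `updatesQueue`, `updateFunc`, and `Drain` are hypothetical and are not taken from the PR; `updateFunc` only stands in for `v1alpha1helpers.UpdateStatusFunc`.

```go
package main

import (
	"fmt"
	"sync"
)

// updateFunc is a hypothetical stand-in for v1alpha1helpers.UpdateStatusFunc.
type updateFunc func() string

// updatesQueue is a hypothetical queue that keeps its mutex private,
// so the lock is not part of the exported method set.
type updatesQueue struct {
	lock  sync.Mutex
	queue []updateFunc
}

// Add appends updates while holding the lock.
func (u *updatesQueue) Add(updates ...updateFunc) {
	u.lock.Lock()
	defer u.lock.Unlock()
	u.queue = append(u.queue, updates...)
}

// Drain returns a copy of the queued updates and clears the queue within a
// single critical section, so reading and clearing cannot be interleaved
// with a concurrent Add.
func (u *updatesQueue) Drain() []updateFunc {
	u.lock.Lock()
	defer u.lock.Unlock()
	out := make([]updateFunc, len(u.queue))
	copy(out, u.queue)
	u.queue = nil
	return out
}

func main() {
	q := &updatesQueue{}
	q.Add(func() string { return "a" }, func() string { return "b" })
	drained := q.Drain()
	fmt.Println(len(drained), len(q.Drain())) // prints "2 0"
}
```

The trade-off is one extra allocation per drain in exchange for a caller that cannot forget to lock or unlock.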
})

latestTimestamp := u.timestamps[len(u.timestamps)-1]
tmp := u.timestamps[:0]
What does this do?
I'm re-using the array backing the u.timestamps slice.
I'm re-using the array backing the u.timestamps slice.
let's just make another and burn the memory
Done.
latestTimestamp := u.timestamps[len(u.timestamps)-1]
tmp := u.timestamps[:0]
for _, timestamp := range u.timestamps {
Yeah, you need to comment this. It looks like an attempt to keep a sliding window of results based on time. Your window needs to be at least one second larger than the delay, though.
Do you need to delay the events too?
The window is initialized as delay + check period, so in this case 10s (conn timeout) + 1s (check period).
See followup PR where I've tried to improve this further: #924.
56f66d9 to 079e7a0
/test e2e-aws
/retest
When a check's latency exceeds the check period (1s), its result is reported after the results of subsequent checks. This PR introduces a delay to give long-running checks a chance to be processed in the correct order.
Further enhancements in the followup PR: #924