Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log warnings when too many leases are active #3957

Merged
merged 4 commits into from
Feb 14, 2018
Merged

Conversation

gobins
Copy link
Contributor

@gobins gobins commented Feb 10, 2018

This could be one way of solving #3903 .
If this looks alright, I could add a tunable var to set the threshold.

@jefferai
Copy link
Member

Thanks for digging into this!

One immediate piece of feedback: this should really live inside the expiration manager. The reason is that the check for c.expiration not being nil is racy since the state lock isn't grabbed -- but there's no real reason for it to be grabbed, since we already know the lifecycle of the expiration manager. By moving it there, it can just start when the expiration manager starts and stop when it stops.

The good news is that the expiration manager already has a function that is invoked periodically, called emitMetrics, and furthermore the pending lock is already grabbed in it! That makes it a great place to do this logging. The main issue is that it's invoked every second. So you'd want to put an atomic (for race safety) counter into the expiration manager that is incremented on every hit to emitMetrics; when it gets above a threshold, emit the warning (if needed) and then reset the counter.

@gobins
Copy link
Contributor Author

gobins commented Feb 11, 2018

Thanks for the review @jefferai . I have moved the check into the ExpirationManager emitMetrics function. I have set it to log every 60 seconds. Let me know if this looks alright.

@vishalnayak vishalnayak added this to the 0.9.4 milestone Feb 12, 2018
atomic.AddUint32(&m.leaseCheckCounter, 1)
}
} else {
atomic.StoreUint32(&m.leaseCheckCounter, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This else clause is not necessary.

Copy link
Contributor Author

@gobins gobins Feb 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was an edge case where the counter will not get reset if the value was flapping over the threshold. But, thinking about it now, it is insignificant to warrant the else loop running all the time. I will remove it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't get reset, but I think that's totally fine. It won't display more often than once every 60 seconds, it just may display less if it's flapping.

// Check if lease count is greater than the threshold
if num > maxLeaseThreshold {
if atomic.LoadUint32(&m.leaseCheckCounter) > 59 {
m.logger.Warn("expiration: lease count exceeds maximum lease threshold")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "...exceeds warning lease threshold" is nicer or people might think that they have hit a hard max and might hit issues immediately.

quitContext: c.activeContext,
coreStateLock: &c.stateLock,
quitContext: c.activeContext,
leaseCheckCounter: 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this to be initialized?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but doesn't hurt, going to merge!

@jefferai
Copy link
Member

Thanks so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants