fix(kubernetes): Improve failure mode for unreachable cluster #3770
Merged
Conversation
We currently cache calls to get the namespaces for an account with an expiry time of 30s, using a memoized supplier. When a cluster is unreachable, the call to get the cluster's namespaces hangs and eventually times out; we then log a warning and return an empty list of namespaces. Because the empty-list return value is not cached when kubectl returns an error, every subsequent call to get namespaces invokes kubectl again. This is a bad failure mode: a slow or unresponsive cluster receives more calls than a fast, responsive one. To address this, when a call to get namespaces returns an error, cache the empty list we return for the same amount of time as a successful result.
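The behavior change above can be illustrated with a minimal stdlib sketch (this is not the actual clouddriver code; the `NamespaceCache` class and its `lookup` supplier are hypothetical stand-ins for the kubectl call):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Illustrative sketch: a memoized namespace lookup that caches the
// empty-list fallback on error for the same TTL as a successful result.
class NamespaceCache {
  private final Supplier<List<String>> lookup; // stand-in for the kubectl call
  private final Duration ttl;
  private List<String> cached;
  private Instant cachedAt;

  NamespaceCache(Supplier<List<String>> lookup, Duration ttl) {
    this.lookup = lookup;
    this.ttl = ttl;
  }

  synchronized List<String> get() {
    if (cached != null && cachedAt.plus(ttl).isAfter(Instant.now())) {
      return cached; // cache hit: no call against the cluster
    }
    List<String> result;
    try {
      result = lookup.get();
    } catch (RuntimeException e) {
      // Previously this fallback was returned but NOT cached, so every
      // caller re-invoked kubectl against the unreachable cluster.
      result = Collections.emptyList();
    }
    cached = result;
    cachedAt = Instant.now(); // cache errors and successes alike
    return result;
  }
}
```

With this, an unreachable cluster costs at most one slow kubectl call per TTL window instead of one per caller.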
We're currently using a Guava memoized supplier for calls to get the namespaces and CRDs in a cluster, with an expiration time of 30s. The Guava memoizer records the timestamp when it starts executing the supplier function rather than when the function completes, so if the call to get namespaces takes more than 30s, we never get a cache hit at all: the entry has already expired by the time it is added to the cache. The cache is thus least effective exactly when it is most needed. Instead, write a small Memoizer class that wraps a Caffeine cache: Caffeine timestamps entries at insertion (after the work finishes) rather than when the work starts. Use this for caching kubectl calls instead of the Guava supplier.
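A minimal stdlib sketch of the Memoizer idea (hypothetical names, not the clouddriver implementation): the expiry clock starts when the value is inserted, after the work finishes, mirroring Caffeine's `expireAfterWrite` semantics rather than Guava's `memoizeWithExpiration`, which starts the clock when the supplier begins running.

```java
import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Illustrative sketch of a memoizer whose TTL is measured from the
// moment the computed value is written, not from when work began.
class Memoizer<T> {
  private static final class Entry<T> {
    final T value;
    final long writtenAtNanos;
    Entry(T value, long writtenAtNanos) {
      this.value = value;
      this.writtenAtNanos = writtenAtNanos;
    }
  }

  private final Map<String, Entry<T>> cache = new ConcurrentHashMap<>();
  private final long ttlNanos;

  Memoizer(Duration ttl) {
    this.ttlNanos = ttl.toNanos();
  }

  T memoize(String key, Supplier<T> work) {
    Entry<T> e = cache.get(key);
    if (e != null && System.nanoTime() - e.writtenAtNanos < ttlNanos) {
      return e.value; // still fresh relative to insertion time
    }
    T value = work.get(); // may itself take longer than the TTL
    // Timestamp recorded AFTER the work completes, so even a slow call
    // still yields a full TTL window of cache hits.
    cache.put(key, new Entry<>(value, System.nanoTime()));
    return value;
  }
}
```

Under start-time expiry, a lookup slower than the TTL produces entries that are dead on arrival; write-time expiry guarantees each completed result is served for the full TTL.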
This is intended to be a small incremental improvement to the failure mode for unreachable clusters. Some possible further improvements:
maggieneterval
approved these changes
Jun 10, 2019
Nice!!!!! 🕺
justinrlee
pushed a commit
to justinrlee/clouddriver
that referenced
this pull request
Jun 12, 2019
…aker#3770) * fix(kubernetes): Improve failure mode for unreachable cluster * fix(kubernetes): Use custom memoizer for kubectl calls
@spinnakerbot cherry-pick 1.14
spinnakerbot
pushed a commit
that referenced
this pull request
Jun 27, 2019
* fix(kubernetes): Improve failure mode for unreachable cluster * fix(kubernetes): Use custom memoizer for kubectl calls
Cherry pick successful: #3822
ezimanyi
added a commit
that referenced
this pull request
Jun 27, 2019
…#3823) * fix(kubernetes): Improve failure mode for unreachable cluster * fix(kubernetes): Use custom memoizer for kubectl calls
fix(kubernetes): Improve failure mode for unreachable cluster
We currently cache calls to get the namespaces for an account with an expiry time of 30s, using a memoized supplier.
When a cluster is unreachable, the call to get the cluster's namespaces hangs and eventually times out; we then log a warning and return an empty list of namespaces.
Because the empty-list return value is not cached when kubectl returns an error, every subsequent call to get namespaces invokes kubectl again. This is a bad failure mode: a slow or unresponsive cluster receives more calls than a fast, responsive one.
To address this, when a call to get namespaces returns an error, cache the empty list we return for the same amount of time as a successful result.
fix(kubernetes): Use custom memoizer for kubectl calls
We're currently using a Guava memoized supplier for calls to get the namespaces and CRDs in a cluster, with an expiration time of 30s.
The Guava memoizer records the timestamp when it starts executing the supplier function rather than when the function completes. This means that if the call to get namespaces takes more than 30s, we never get a cache hit at all, because the entry has already expired by the time it is added to the cache. The cache is thus least effective exactly when it is most needed.
Instead, write a small Memoizer class that wraps a Caffeine cache: Caffeine timestamps entries at insertion (after the work finishes) rather than when the work starts. Use this for caching kubectl calls instead of the Guava supplier.