Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-14615: Implement CPU Utilization Based Circuit Breaker #1737

Merged
merged 14 commits into from Aug 20, 2020

Conversation

atris
Copy link
Contributor

@atris atris commented Aug 11, 2020

This commit introduces CPU based circuit breaker. This circuit breaker
tracks the average CPU load per minute and triggers if the value exceeds
a configurable value.

This commit also adds a specific control flag for Memory Circuit Breaker
to allow enabling/disabling the same.

This commit introduces CPU based circuit breaker. This circuit breaker
tracks the average CPU load per minute and triggers if the value exceeds
a configurable value.

This commit also adds a specific control flag for Memory Circuit Breaker
to allow enabling/disabling the same.
@noblepaul
Copy link
Contributor

noblepaul commented Aug 11, 2020

The configuration can be as simple as

<circuitBreaker memoryCircuitBreakerThresholdPct="" isCpuCircuitBreakerEnabled="true" memoryCircuitBreakerThresholdPct ="" cpuCircuitBreakerThresholdPct="">

This way you can just read all the attributes all at once from the PluginInfo .
CircuitBreaker should be a type of plugin. It should be an interface

@atris
Copy link
Contributor Author

atris commented Aug 11, 2020

The configuration can be as simple as

<circuitBreaker memoryCircuitBreakerThresholdPct="" isCpuCircuitBreakerEnabled="true" memoryCircuitBreakerThresholdPct ="" cpuCircuitBreakerThresholdPct="">

This way you can just read all the attributes all at once from the PluginInfo .
CircuitBreaker should be a type of plugin. It should be an interface

As discussed offline, I will refactor circuit breaker infrastructure to use PluginInfo as a part of 8.7 (hence will leave this PR's JIRA open for that effort). Not proceeding with that effort in this PR.

@atris atris requested review from anshumg and madrob August 11, 2020 16:08
@atris atris requested a review from madrob August 13, 2020 05:11
@atris
Copy link
Contributor Author

atris commented Aug 13, 2020

@madrob Any further thoughts on this?

* that data to take a decision. Ideally, we should be able to cache the value
* locally and only query once the minute has elapsed. However, that will introduce
* more complexity than the current structure and might not get us major performance
* wins. If this ever becomes a performance bottleneck, that can be considered.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Javadoc for OSMxBean says "This method is designed to provide a hint about the system load and may be queried frequently." Not sure what "frequently" means here, though...

It will be interesting to see the dynamic behavior of this breaker - I'm somewhat worried that the 1-min average may be too unstable and we end up flip-flopping between the states too often (eg. every dozen requests or so). Depending on the volume of momentary load (eg. an ongoing large merge or split shard) the 1-min average may exceed a threshold even though it doesn't properly reflect a sustained longer-term overload that we should worry about. Average load is a convenient lie ;)

OTOH that's the only method we have in the OS MXBean, so we would have to compute that longer-term average ourselves, which is messy... so I guess we'll see .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, caching doesn't have to be complicated .. you simply check the elapsed time since last request and if it's longer than timeout you refresh the value first, it's probably less than 10 lines of code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I think this is a very good point. If load average only updates every 5 seconds, that seems like a huge interval where we're going to block query execution. Let's think about how we can get better granularity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Javadocs state that the average load is updated per one minute -- so I am not sure how we can get a granularity better than that. Its unfortunate that this unit is not configurable (and I believe it must be internally caching as well since it talks about being prepared for "querying frequently". Any thoughts on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking more about this, the load is an average of one minute. If there is a transient spike -- should it not be smoothened out over the minute so that the average value is reasonable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh ... I see there's still some misunderstanding about this. The call itself is directly passed to the native method that invokes stdlib getloadavg, which in turn reads these values from the /proc pseudo-fs. So, the cost is truly minimal and the call doesn't block - if it turns out that it's still too costly to call for every request then we can introduce some timeout-based caching.

These averages are so called exponentially weighted moving averages, so indeed a 1-min average has traces of past load values from outside the 1-min window, which helps in smoothing it. This may turn out to be sufficient to avoid false positives due to short-term spikes (such as large merges). Linux loadavg represents to some degree a combined CPU + disk IO load, so indeed intensive IO operations will affect it.

We always have an option to use Codahale Meter to easily calculate 5- and 15-min EWMAs if it turns out that we're getting too many false positives. Until then users can configure higher thresholds, thus reducing the number of false positives at the cost of higher contention.

solr/core/src/java/org/apache/solr/core/SolrConfig.java Outdated Show resolved Hide resolved
solr/solr-ref-guide/src/circuit-breakers.adoc Outdated Show resolved Hide resolved
@atris
Copy link
Contributor Author

atris commented Aug 17, 2020

@madrob @sigram Please see the next iteration

@atris atris requested a review from madrob August 17, 2020 18:21
@atris atris requested a review from madrob August 18, 2020 04:13
solr/core/src/java/org/apache/solr/core/SolrConfig.java Outdated Show resolved Hide resolved
* that data to take a decision. Ideally, we should be able to cache the value
* locally and only query once the minute has elapsed. However, that will introduce
* more complexity than the current structure and might not get us major performance
* wins. If this ever becomes a performance bottleneck, that can be considered.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh ... I see there's still some misunderstanding about this. The call itself is directly passed to the native method that invokes stdlib getloadavg, which in turn reads these values from the /proc pseudo-fs. So, the cost is truly minimal and the call doesn't block - if it turns out that it's still too costly to call for every request then we can introduce some timeout-based caching.

These averages are so called exponentially weighted moving averages, so indeed a 1-min average has traces of past load values from outside the 1-min window, which helps in smoothing it. This may turn out to be sufficient to avoid false positives due to short-term spikes (such as large merges). Linux loadavg represents to some degree a combined CPU + disk IO load, so indeed intensive IO operations will affect it.

We always have an option to use Codahale Meter to easily calculate 5- and 15-min EWMAs if it turns out that we're getting too many false positives. Until then users can configure higher thresholds, thus reducing the number of false positives at the cost of higher contention.

solr/solr-ref-guide/src/circuit-breakers.adoc Show resolved Hide resolved
solr/solr-ref-guide/src/circuit-breakers.adoc Outdated Show resolved Hide resolved
Copy link
Contributor

@sigram sigram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks!

Copy link
Contributor

@madrob madrob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor comments, this looks pretty good


=== CPU Utilization Based Circuit Breaker
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sigram we could call this Load Avg Based Circuit Breaker, WDYT? Maybe that is too much implementation detail?

solr/server/solr/configsets/_default/conf/solrconfig.xml Outdated Show resolved Hide resolved
@atris atris merged commit 2f37f40 into apache:master Aug 20, 2020
@atris
Copy link
Contributor Author

atris commented Aug 20, 2020

Thank you @madrob and @ab!

atris added a commit that referenced this pull request Oct 16, 2020
This commit introduces CPU based circuit breaker. This circuit breaker
tracks the average CPU load per minute and triggers if the value exceeds
a configurable value.

This commit also adds a specific control flag for Memory Circuit Breaker
to allow enabling/disabling the same.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants