Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HierarchyCircuitBreakerService #6739

Merged
merged 1 commit into from Jul 28, 2014

Conversation

dakrone
Copy link
Member

@dakrone dakrone commented Jul 4, 2014

Adds a breaker for request BigArrays, which are used for parent/child
queries as well as some aggregations. Certain operations like Netty HTTP
responses and transport responses increment the breaker, but will not
trip.

This also changes the output of the nodes' stats endpoint to show the
parent breaker as well as the fielddata and request breakers.

There are a number of new settings for breakers now:

indices.breaker.total.limit: starting limit for all memory-use breaker,
defaults to 70%

indices.breaker.fielddata.limit: starting limit for fielddata breaker,
defaults to 60%
indices.breaker.fielddata.overhead: overhead for fielddata breaker
estimations, defaults to 1.03

(the fielddata breaker settings also use the backwards-compatible
setting indices.fielddata.breaker.limit and
indices.fielddata.breaker.overhead)

indices.breaker.request.limit: starting limit for request breaker,
defaults to 40%
indices.breaker.request.overhead: request breaker estimation overhead,
defaults to 1.0

The breaker service infrastructure is now generic and opens the path to
adding additional circuit breakers in the future.

Fixes #6129

@dakrone
Copy link
Member Author

dakrone commented Jul 4, 2014

Note that this is currently marked as a breaking change because I've changed the way that /_node/stats returns circuit breaker information. I think the new way of representing it is much better, but I'd like to discuss whether we want to add additional work to this in order to preserve the old node stats format as well.

@jpountz
Copy link
Contributor

jpountz commented Jul 8, 2014

The settings start with indices.breaker while Java packages are common.breaker and indices.memory.breaker. Should we put Java code in the indices.breaker package so that the logger name matches the configuration options?

/**
* @return the maximum size this circuit breaker can grow to
*/
public long getMaximum();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document how getMaximum differs from getLimit?

@dakrone
Copy link
Member Author

dakrone commented Jul 9, 2014

@jpountz I agree it would fit better there, I've moved the classes and incorporated the other feedback you asked for as well.

@s1monw s1monw added v1.4.0 and removed v1.3.0 labels Jul 9, 2014
@s1monw
Copy link
Contributor

s1monw commented Jul 9, 2014

moving out to 1.4 for now

@@ -364,38 +366,62 @@ public T set(long index, T value) {

final PageCacheRecycler recycler;
final AtomicLong ramBytesUsed;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of this thing was to do circuit breaking, I think we can remove it now

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ramBytesUsed is still required to keep track of the used memory so it can be decremented when released.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the used memory is only used for testing so that would be ok to remove it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh okay, yes, since we test the breaker is reset we no longer need this in the tests also, I'll remove it.

@jpountz
Copy link
Contributor

jpountz commented Jul 16, 2014

This looks better but I'm not very happy with this boolean on every BigArrays method. I would rather have one instance that does accounting+tripping and another one that would only do accounting. API-wise maybe what we need is for the singleton BigArrays (the one created by Guice) to only do accounting (no breaking), and then have a BigArrays withCircuitBreaker(CircuitBreaker breaker) method that would return a BigArrays instance on the same page recycler but that would do circuit-breaking with the provided breaker?

Regarding circuit breaking, I'd like to see if we can make it simpler. For instance, do we actually need overheads, limits or the ability to disable distribution?

I think we also need to have logic in ElasticsearchIntegrationTest that makes sure that all breakers (but the one that is used for networking because of the shared cluster) don't use any bytes after a test finishes.

@jpountz jpountz removed the review label Jul 16, 2014
@dakrone
Copy link
Member Author

dakrone commented Jul 17, 2014

... have a BigArrays withCircuitBreaker(CircuitBreaker breaker) method that would return a BigArrays instance on the same page recycler but that would do circuit-breaking with the provided breaker?

I think this would work and make it a bit simpler, I'll try to do this.

Regarding circuit breaking, I'd like to see if we can make it simpler. For instance, do we actually need overheads, limits or the ability to disable distribution?

I think we can get rid of maximum sizes and just use the limit as the max, but keep minimum sizes for breakers (so you can force space to always be available for some field data or aggregations).

For overhead, it's a nob in the event that the field data estimations don't estimate correctly, so I think it's important to keep. It's not used for BigArrays though, so it's set to 1.0. I'm not sure it (overheads) should be removed either.

I think we also need to have logic in ElasticsearchIntegrationTest that makes sure that all breakers (but the one that is used for networking because of the shared cluster) don't use any bytes after a test finishes.

I added the logic to do this already, see: https://github.com/elasticsearch/elasticsearch/pull/6739/files#diff-e6761c31a9d75e74580a93c5aeb3aaa3R198

@jpountz
Copy link
Contributor

jpountz commented Jul 17, 2014

I think we can get rid of maximum sizes and just use the limit as the max, but keep minimum sizes for breakers (so you can force space to always be available for some field data or aggregations).

Agreed on minimum sizes, they are important.

For overhead, it's a nob in the event that the field data estimations don't estimate correctly.

But does it really help to overestimate memory usage reports by 3%?

I added the logic to do this already, see: https://github.com/elasticsearch/elasticsearch/pull/6739/files#diff-e6761c31a9d75e74580a93c5aeb3aaa3R198

I had missed it, great!

@dakrone
Copy link
Member Author

dakrone commented Jul 17, 2014

But does it really help to overestimate memory usage reports by 3%?

The additional 3% is to err on the side of estimating too high rather than too low. I can change the estimation function to include the extra 3%, but having a configurable value in case my estimation isn't accurate for some people is important in my opinion, since it's dynamically changeable and the estimation formula is not.

@jpountz
Copy link
Contributor

jpountz commented Jul 17, 2014

Is it something that could be done outside of the circuit-breaking API only for field data? In the PerValueEstimator impls?

@dakrone
Copy link
Member Author

dakrone commented Jul 17, 2014

@jpountz I added the change to add .withCircuitBreaker() to BigArrays instead of passing through the boolean value, it's definitely cleaner. Let me know if this is what you meant.

Still working on the other changes.

public int compare(CircuitBreaker b1, CircuitBreaker b2) {
long b1Free = b1.getMaximum() - b1.getUsed();
long b2Free = b2.getMaximum() - b2.getUsed();
if (b1Free > b2Free) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't you use Long.compare(b1Free, b2Free)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good idea, I will use that.

@dakrone
Copy link
Member Author

dakrone commented Jul 18, 2014

@jpountz pushed a couple of new commits that remove the concept of a limit and uses min and max only.

I also removed configuring the Parent breaker entirely (now its only purpose is to track the total memory usage across all breakers and the number of times the breaker tripped and wasn't able to redistribute). I think this makes it a lot simpler.

@dakrone
Copy link
Member Author

dakrone commented Jul 18, 2014

Is it something that could be done outside of the circuit-breaking API only for field data? In the PerValueEstimator impls?

So the issue with this is that the PerValueEstimator can use really small amounts (like 8 or 16) for numeric terms and multiplying by 1.03 in estimation means they value is still rounded back to 8 or 16. I originally made these numbers doubles instead of longs, but not losing precision was terrible, which is why it's back to using the overhead parameter.

public void circuitBreak(String fieldName, long bytesNeeded);

/**
* add bytes to the breaker and maybe trip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we also need to document if there is a circuit break then bytes will not be accounted? (that's what the implementation seems to do?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow, the bytes are still accounted, they just aren't allowed to increase. In the case of fielddata an exception is thrown and the generated data never makes it into the cache, for bigarrays the exception is thrown and memory is released in the .close() method of AbstractArray.java

this.bigarrays = bigarrays.withCircuitBreaking();
} else {
this.bigarrays = bigarrays;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we leave that responsibility to the callers of this constructor?

@jpountz
Copy link
Contributor

jpountz commented Jul 21, 2014

@dakrone Agreed that it is much simpler and on the need to keep the overhead! Thanks!

I left some minor comments about code cosmetics. The last thing I'd like to discuss is the redistribution logic, but other than that, this looks good to me!

@dakrone
Copy link
Member Author

dakrone commented Jul 23, 2014

@jpountz I think I've addressed all of the comments.

I also changed this entirely to use the parent-check method we talked about. Instead of redistributing the children's limits, each child has a set limit while still checking the global ("parent") breaker hasn't been tripped.

Changed the default limits to 60% for field data and 40% for bigarrays/requests, overall limit is 70%.

@dakrone dakrone added the review label Jul 23, 2014
this.ordinal = ordinal;
}

public int getOrdinal() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename it to something else? This is close to ordinal which also exists on enums but should not be used for serialization as it would change if we added/removed entries in that enum.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll rename this.

@jpountz
Copy link
Contributor

jpountz commented Jul 24, 2014

@dakrone Left minor comments. I think it's close now.

@jpountz jpountz removed the review label Jul 24, 2014
@jpountz
Copy link
Contributor

jpountz commented Jul 24, 2014

LGTM

@dakrone dakrone changed the title Add RedistributingCircuitBreakerService Add HierarchyCircuitBreakerService Jul 25, 2014
@dakrone
Copy link
Member Author

dakrone commented Jul 25, 2014

Squashed this, rebased on master and resolved all the conflicts (there were quite a few since this PR was opened 21 days ago and we released a new version of ES between now and then).

I also added documentation and changed the version check to be 1.4.0 instead of 2.0.0.

I'm planning on merging this Monday since it received Adrien's +1 if there are no more comments by then.

Adds a breaker for request BigArrays, which are used for parent/child
queries as well as some aggregations. Certain operations like Netty HTTP
responses and transport responses increment the breaker, but will not
trip.

This also changes the output of the nodes' stats endpoint to show the
parent breaker as well as the fielddata and request breakers.

There are a number of new settings for breakers now:

`indices.breaker.total.limit`: starting limit for all memory-use breaker,
defaults to 70%

`indices.breaker.fielddata.limit`: starting limit for fielddata breaker,
defaults to 60%
`indices.breaker.fielddata.overhead`: overhead for fielddata breaker
estimations, defaults to 1.03

(the fielddata breaker settings also use the backwards-compatible
setting `indices.fielddata.breaker.limit` and
`indices.fielddata.breaker.overhead`)

`indices.breaker.request.limit`: starting limit for request breaker,
defaults to 40%
`indices.breaker.request.overhead`: request breaker estimation overhead,
defaults to 1.0

The breaker service infrastructure is now generic and opens the path to
adding additional circuit breakers in the future.

Fixes elastic#6129

Conflicts:
	src/main/java/org/elasticsearch/index/fielddata/IndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/IndexFieldDataService.java
	src/main/java/org/elasticsearch/index/fielddata/RamAccountingTermsEnum.java
	src/main/java/org/elasticsearch/index/fielddata/ordinals/GlobalOrdinalsBuilder.java
	src/main/java/org/elasticsearch/index/fielddata/ordinals/InternalGlobalOrdinalsBuilder.java
	src/main/java/org/elasticsearch/index/fielddata/plain/AbstractIndexOrdinalsFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/DisabledIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/IndexIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/NonEstimatingEstimator.java
	src/main/java/org/elasticsearch/index/fielddata/plain/PackedArrayIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/ParentChildIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/SortedSetDVOrdinalsIndexFieldData.java
	src/main/java/org/elasticsearch/node/internal/InternalNode.java
	src/test/java/org/elasticsearch/index/aliases/IndexAliasesServiceTests.java
	src/test/java/org/elasticsearch/index/codec/CodecTests.java
	src/test/java/org/elasticsearch/index/fielddata/AbstractFieldDataTests.java
	src/test/java/org/elasticsearch/index/fielddata/IndexFieldDataServiceTests.java
	src/test/java/org/elasticsearch/index/mapper/MapperTestUtils.java
	src/test/java/org/elasticsearch/index/query/IndexQueryParserFilterCachingTests.java
	src/test/java/org/elasticsearch/index/query/SimpleIndexQueryParserTests.java
	src/test/java/org/elasticsearch/index/query/guice/IndexQueryParserModuleTests.java
	src/test/java/org/elasticsearch/index/search/FieldDataTermsFilterTests.java
	src/test/java/org/elasticsearch/index/search/child/ChildrenConstantScoreQueryTests.java
	src/test/java/org/elasticsearch/index/similarity/SimilarityTests.java
@dakrone dakrone merged commit 6abe4c9 into elastic:master Jul 28, 2014
@dakrone dakrone changed the title Add HierarchyCircuitBreakerService [BREAKER] Add HierarchyCircuitBreakerService Jul 29, 2014
@jpountz jpountz removed the review label Jul 29, 2014
@clintongormley clintongormley changed the title [BREAKER] Add HierarchyCircuitBreakerService Circuit Breaker: Add HierarchyCircuitBreakerService Sep 8, 2014
@dakrone dakrone deleted the 6129-use-breaker-for-bigarrays branch September 9, 2014 13:48
@clintongormley clintongormley added the :Core/Infra/Circuit Breakers Track estimates of memory consumption to prevent overload label Jun 6, 2015
@clintongormley clintongormley changed the title Circuit Breaker: Add HierarchyCircuitBreakerService Add HierarchyCircuitBreakerService Jun 6, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Switch memory breaking in process requests to use CircuitBreaker infrastructure
4 participants