Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HierarchyCircuitBreakerService #6739

Merged
merged 1 commit into from Jul 28, 2014

Conversation

Projects
None yet
4 participants
@dakrone
Copy link
Member

commented Jul 4, 2014

Adds a breaker for request BigArrays, which are used for parent/child
queries as well as some aggregations. Certain operations like Netty HTTP
responses and transport responses increment the breaker, but will not
trip.

This also changes the output of the nodes' stats endpoint to show the
parent breaker as well as the fielddata and request breakers.

There are a number of new settings for breakers now:

indices.breaker.total.limit: starting limit for all memory-use breaker,
defaults to 70%

indices.breaker.fielddata.limit: starting limit for fielddata breaker,
defaults to 60%
indices.breaker.fielddata.overhead: overhead for fielddata breaker
estimations, defaults to 1.03

(the fielddata breaker settings also use the backwards-compatible
setting indices.fielddata.breaker.limit and
indices.fielddata.breaker.overhead)

indices.breaker.request.limit: starting limit for request breaker,
defaults to 40%
indices.breaker.request.overhead: request breaker estimation overhead,
defaults to 1.0

The breaker service infrastructure is now generic and opens the path to
adding additional circuit breakers in the future.

Fixes #6129

@dakrone

This comment has been minimized.

Copy link
Member Author

commented Jul 4, 2014

Note that this is currently marked as a breaking change because I've changed the way that /_node/stats returns circuit breaker information. I think the new way of representing it is much better, but I'd like to discuss whether we want to add additional work to this in order to preserve the old node stats format as well.

@jpountz

This comment has been minimized.

Copy link
Contributor

commented Jul 8, 2014

The settings start with indices.breaker while Java packages are common.breaker and indices.memory.breaker. Should we put Java code in the indices.breaker package so that the logger name matches the configuration options?

@jpountz

View changes

src/main/java/org/elasticsearch/common/breaker/CircuitBreaker.java Outdated
/**
* @return the maximum size this circuit breaker can grow to
*/
public long getMaximum();

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 8, 2014

Contributor

Can you document how getMaximum differs from getLimit?

@jpountz

View changes

src/main/java/org/elasticsearch/common/breaker/CircuitBreaker.java Outdated
* add bytes to the breaker and maybe trip
* @throws CircuitBreakingException
*/
public double addEstimateBytesAndMaybeBreak(long bytes, String fieldName) throws CircuitBreakingException;

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 8, 2014

Contributor

Can you document the return value?

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 8, 2014

Contributor

Maybe fieldName should be renamed to something more generic to not feel specific to field data loading?

@jpountz

View changes

src/main/java/org/elasticsearch/common/util/BigArrays.java Outdated
* negative, or checkBreaker is false, the breaker will be adjusted
* without tripping
*/
public void adjustBreaker(long delta, boolean checkBreaker) {

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 8, 2014

Contributor

can we make it pkg-private to make sure it is not misused?

@dakrone

This comment has been minimized.

Copy link
Member Author

commented Jul 9, 2014

@jpountz I agree it would fit better there, I've moved the classes and incorporated the other feedback you asked for as well.

@s1monw s1monw added v1.4.0 and removed v1.3.0 labels Jul 9, 2014

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Jul 9, 2014

moving out to 1.4 for now

@jpountz

View changes

src/main/java/org/elasticsearch/common/util/BigArrays.java Outdated
@@ -364,38 +366,62 @@ public T set(long index, T value) {

final PageCacheRecycler recycler;
final AtomicLong ramBytesUsed;

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 16, 2014

Contributor

The purpose of this thing was to do circuit breaking, I think we can remove it now

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 16, 2014

Author Member

ramBytesUsed is still required to keep track of the used memory so it can be decremented when released.

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 21, 2014

Contributor

I think the used memory is only used for testing so that would be ok to remove it?

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 21, 2014

Author Member

Ahh okay, yes, since we test the breaker is reset we no longer need this in the tests also, I'll remove it.

@jpountz

This comment has been minimized.

Copy link
Contributor

commented Jul 16, 2014

This looks better but I'm not very happy with this boolean on every BigArrays method. I would rather have one instance that does accounting+tripping and another one that would only do accounting. API-wise maybe what we need is for the singleton BigArrays (the one created by Guice) to only do accounting (no breaking), and then have a BigArrays withCircuitBreaker(CircuitBreaker breaker) method that would return a BigArrays instance on the same page recycler but that would do circuit-breaking with the provided breaker?

Regarding circuit breaking, I'd like to see if we can make it simpler. For instance, do we actually need overheads, limits or the ability to disable distribution?

I think we also need to have logic in ElasticsearchIntegrationTest that makes sure that all breakers (but the one that is used for networking because of the shared cluster) don't use any bytes after a test finishes.

@jpountz jpountz removed the review label Jul 16, 2014

@dakrone

This comment has been minimized.

Copy link
Member Author

commented Jul 17, 2014

... have a BigArrays withCircuitBreaker(CircuitBreaker breaker) method that would return a BigArrays instance on the same page recycler but that would do circuit-breaking with the provided breaker?

I think this would work and make it a bit simpler, I'll try to do this.

Regarding circuit breaking, I'd like to see if we can make it simpler. For instance, do we actually need overheads, limits or the ability to disable distribution?

I think we can get rid of maximum sizes and just use the limit as the max, but keep minimum sizes for breakers (so you can force space to always be available for some field data or aggregations).

For overhead, it's a nob in the event that the field data estimations don't estimate correctly, so I think it's important to keep. It's not used for BigArrays though, so it's set to 1.0. I'm not sure it (overheads) should be removed either.

I think we also need to have logic in ElasticsearchIntegrationTest that makes sure that all breakers (but the one that is used for networking because of the shared cluster) don't use any bytes after a test finishes.

I added the logic to do this already, see: https://github.com/elasticsearch/elasticsearch/pull/6739/files#diff-e6761c31a9d75e74580a93c5aeb3aaa3R198

@jpountz

This comment has been minimized.

Copy link
Contributor

commented Jul 17, 2014

I think we can get rid of maximum sizes and just use the limit as the max, but keep minimum sizes for breakers (so you can force space to always be available for some field data or aggregations).

Agreed on minimum sizes, they are important.

For overhead, it's a nob in the event that the field data estimations don't estimate correctly.

But does it really help to overestimate memory usage reports by 3%?

I added the logic to do this already, see: https://github.com/elasticsearch/elasticsearch/pull/6739/files#diff-e6761c31a9d75e74580a93c5aeb3aaa3R198

I had missed it, great!

@dakrone

This comment has been minimized.

Copy link
Member Author

commented Jul 17, 2014

But does it really help to overestimate memory usage reports by 3%?

The additional 3% is to err on the side of estimating too high rather than too low. I can change the estimation function to include the extra 3%, but having a configurable value in case my estimation isn't accurate for some people is important in my opinion, since it's dynamically changeable and the estimation formula is not.

@jpountz

This comment has been minimized.

Copy link
Contributor

commented Jul 17, 2014

Is it something that could be done outside of the circuit-breaking API only for field data? In the PerValueEstimator impls?

@dakrone

This comment has been minimized.

Copy link
Member Author

commented Jul 17, 2014

@jpountz I added the change to add .withCircuitBreaker() to BigArrays instead of passing through the boolean value, it's definitely cleaner. Let me know if this is what you meant.

Still working on the other changes.

@s1monw

View changes

src/main/java/org/elasticsearch/indices/breaker/RedistributingCircuitBreakerService.java Outdated
public int compare(CircuitBreaker b1, CircuitBreaker b2) {
long b1Free = b1.getMaximum() - b1.getUsed();
long b2Free = b2.getMaximum() - b2.getUsed();
if (b1Free > b2Free) {

This comment has been minimized.

Copy link
@s1monw

s1monw Jul 17, 2014

Contributor

can't you use Long.compare(b1Free, b2Free)?

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 17, 2014

Author Member

Yes, good idea, I will use that.

@dakrone

This comment has been minimized.

Copy link
Member Author

commented Jul 18, 2014

@jpountz pushed a couple of new commits that remove the concept of a limit and uses min and max only.

I also removed configuring the Parent breaker entirely (now its only purpose is to track the total memory usage across all breakers and the number of times the breaker tripped and wasn't able to redistribute). I think this makes it a lot simpler.

@dakrone

This comment has been minimized.

Copy link
Member Author

commented Jul 18, 2014

Is it something that could be done outside of the circuit-breaking API only for field data? In the PerValueEstimator impls?

So the issue with this is that the PerValueEstimator can use really small amounts (like 8 or 16) for numeric terms and multiplying by 1.03 in estimation means they value is still rounded back to 8 or 16. I originally made these numbers doubles instead of longs, but not losing precision was terrible, which is why it's back to using the overhead parameter.

public void circuitBreak(String fieldName, long bytesNeeded);

/**
* add bytes to the breaker and maybe trip

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 21, 2014

Contributor

I think we also need to document if there is a circuit break then bytes will not be accounted? (that's what the implementation seems to do?)

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 21, 2014

Author Member

Not sure I follow, the bytes are still accounted, they just aren't allowed to increase. In the case of fielddata an exception is thrown and the generated data never makes it into the cache, for bigarrays the exception is thrown and memory is released in the .close() method of AbstractArray.java

@jpountz

View changes

src/main/java/org/elasticsearch/common/breaker/CircuitBreaker.java Outdated
}

public static void writeTo(Name name, StreamOutput out) throws IOException {
out.writeString(name.toString());

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 21, 2014

Contributor

Can you give each of them a numeric id instead? This will allow to rename the enum constants without breaking the bw compat of the stream

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 21, 2014

Author Member

I'll do this, good idea!

@jpountz

View changes

src/main/java/org/elasticsearch/common/bytes/PagedBytesReference.java Outdated
@@ -78,7 +84,7 @@ public BytesReference slice(int from, int length) {
throw new ElasticsearchIllegalArgumentException("can't slice a buffer with length [" + length() + "], with slice parameters from [" + from + "], length [" + length + "]");
}

return new PagedBytesReference(bigarrays, bytearray, offset + from, length);
return new PagedBytesReference(bigarrays, bytearray, offset + from, length, checkBreaker);

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 21, 2014

Contributor

This seems redundant with BigArrays.checkBreaker?

@jpountz

View changes

src/main/java/org/elasticsearch/common/io/stream/BytesStreamOutput.java Outdated
this.bigarrays = bigarrays.withCircuitBreaking();
} else {
this.bigarrays = bigarrays;
}

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 21, 2014

Contributor

Should we leave that responsibility to the callers of this constructor?

@jpountz

View changes

src/main/java/org/elasticsearch/indices/breaker/CircuitBreakerStats.java Outdated
@@ -82,6 +103,13 @@ public void readFrom(StreamInput in) throws IOException {
} else {
this.trippedCount = -1;
}
if (in.getVersion().onOrAfter(Version.V_2_0_0)) {

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 21, 2014

Contributor

not 1.4?

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 21, 2014

Author Member

This PR includes HTTP-response-breaking changes, not sure we'd want to include it in 1.4 or not. I can adjust this once we decide which release this will go into.

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 24, 2014

Contributor

ok

@jpountz

View changes

src/main/java/org/elasticsearch/indices/breaker/RedistributingCircuitBreakerService.java Outdated
* to be rethrown. This method is synchronized so that multiple breakers
* tripping does not cause redistributions to overwrite each other
*/
public synchronized void breakOrRedistributeLimits(ChildMemoryCircuitBreaker breaker, CircuitBreakingException e) throws CircuitBreakingException {

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 21, 2014

Contributor

Do we really need redistribution? It feels to me that if we need it it somehow means that the min/max settings were not set correctly in the first place.

@jpountz

View changes

src/main/java/org/elasticsearch/search/aggregations/metrics/tophits/TopHitsContext.java Outdated
@@ -339,7 +339,7 @@ public PageCacheRecycler pageCacheRecycler() {

@Override
public BigArrays bigArrays() {
return context.bigArrays();
return context.bigArrays().withCircuitBreaking();

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 21, 2014

Contributor

This should not be required as the BigArrays from the agg context comes from the search context which already enables circuit breaking?

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 21, 2014

Author Member

Yep, I'll remove this.

@jpountz

This comment has been minimized.

Copy link
Contributor

commented Jul 21, 2014

@dakrone Agreed that it is much simpler and on the need to keep the overhead! Thanks!

I left some minor comments about code cosmetics. The last thing I'd like to discuss is the redistribution logic, but other than that, this looks good to me!

@dakrone

This comment has been minimized.

Copy link
Member Author

commented Jul 23, 2014

@jpountz I think I've addressed all of the comments.

I also changed this entirely to use the parent-check method we talked about. Instead of redistributing the children's limits, each child has a set limit while still checking the global ("parent") breaker hasn't been tripped.

Changed the default limits to 60% for field data and 40% for bigarrays/requests, overall limit is 70%.

@dakrone dakrone added the review label Jul 23, 2014

@jpountz

View changes

src/main/java/org/elasticsearch/common/breaker/CircuitBreaker.java Outdated
this.ordinal = ordinal;
}

public int getOrdinal() {

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 24, 2014

Contributor

Can you rename it to something else? This is close to ordinal which also exists on enums but should not be used for serialization as it would change if we added/removed entries in that enum.

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 24, 2014

Author Member

Sure, I'll rename this.

@jpountz

View changes

src/main/java/org/elasticsearch/indices/breaker/CircuitBreakerService.java Outdated
*/
public CircuitBreaker getBreaker() {
return getBreaker(CircuitBreaker.Name.FIELDDATA);
}

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 24, 2014

Contributor

I'm not sure the fielddata breaker makes more sense as a default than another breaker? Should we remove this method and make the breaker to use explicit?

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 24, 2014

Author Member

I agree, I'll remove this and force the caller to specify which breaker.

@jpountz

View changes

src/test/java/org/elasticsearch/common/util/BigArraysTests.java Outdated
array.close();
assertEquals(0, bigArrays.sizeInBytes());

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 24, 2014

Contributor

Can you add back assertions in these tests using the circuit breaker?

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 24, 2014

Author Member

Yep! Good idea :)

@jpountz

View changes

src/test/java/org/elasticsearch/test/cache/recycler/MockBigArrays.java Outdated
public MockBigArrays(Settings settings, PageCacheRecycler recycler) {
super(settings, recycler);
public MockBigArrays(Settings settings, PageCacheRecycler recycler, CircuitBreakerService breakerService) {
super(settings, recycler, breakerService);
long seed;
try {
seed = SeedUtils.parseSeed(RandomizedContext.current().getRunnerSeedAsString());

This comment has been minimized.

Copy link
@jpountz

jpountz Jul 24, 2014

Contributor

You need to override withCircuitBreaking so that it returns a MockBigArrays as well

This comment has been minimized.

Copy link
@dakrone

dakrone Jul 24, 2014

Author Member

Ahh yes, good catch, I'll add that.

@jpountz

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2014

@dakrone Left minor comments. I think it's close now.

@jpountz jpountz removed the review label Jul 24, 2014

@jpountz

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2014

LGTM

@dakrone dakrone changed the title Add RedistributingCircuitBreakerService Add HierarchyCircuitBreakerService Jul 25, 2014

@dakrone

This comment has been minimized.

Copy link
Member Author

commented Jul 25, 2014

Squashed this, rebased on master and resolved all the conflicts (there were quite a few since this PR was opened 21 days ago and we released a new version of ES between now and then).

I also added documentation and changed the version check to be 1.4.0 instead of 2.0.0.

I'm planning on merging this Monday since it received Adrien's +1 if there are no more comments by then.

Add HierarchyCircuitBreakerService
Adds a breaker for request BigArrays, which are used for parent/child
queries as well as some aggregations. Certain operations like Netty HTTP
responses and transport responses increment the breaker, but will not
trip.

This also changes the output of the nodes' stats endpoint to show the
parent breaker as well as the fielddata and request breakers.

There are a number of new settings for breakers now:

`indices.breaker.total.limit`: starting limit for all memory-use breaker,
defaults to 70%

`indices.breaker.fielddata.limit`: starting limit for fielddata breaker,
defaults to 60%
`indices.breaker.fielddata.overhead`: overhead for fielddata breaker
estimations, defaults to 1.03

(the fielddata breaker settings also use the backwards-compatible
setting `indices.fielddata.breaker.limit` and
`indices.fielddata.breaker.overhead`)

`indices.breaker.request.limit`: starting limit for request breaker,
defaults to 40%
`indices.breaker.request.overhead`: request breaker estimation overhead,
defaults to 1.0

The breaker service infrastructure is now generic and opens the path to
adding additional circuit breakers in the future.

Fixes #6129

Conflicts:
	src/main/java/org/elasticsearch/index/fielddata/IndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/IndexFieldDataService.java
	src/main/java/org/elasticsearch/index/fielddata/RamAccountingTermsEnum.java
	src/main/java/org/elasticsearch/index/fielddata/ordinals/GlobalOrdinalsBuilder.java
	src/main/java/org/elasticsearch/index/fielddata/ordinals/InternalGlobalOrdinalsBuilder.java
	src/main/java/org/elasticsearch/index/fielddata/plain/AbstractIndexOrdinalsFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/DisabledIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/IndexIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/NonEstimatingEstimator.java
	src/main/java/org/elasticsearch/index/fielddata/plain/PackedArrayIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/ParentChildIndexFieldData.java
	src/main/java/org/elasticsearch/index/fielddata/plain/SortedSetDVOrdinalsIndexFieldData.java
	src/main/java/org/elasticsearch/node/internal/InternalNode.java
	src/test/java/org/elasticsearch/index/aliases/IndexAliasesServiceTests.java
	src/test/java/org/elasticsearch/index/codec/CodecTests.java
	src/test/java/org/elasticsearch/index/fielddata/AbstractFieldDataTests.java
	src/test/java/org/elasticsearch/index/fielddata/IndexFieldDataServiceTests.java
	src/test/java/org/elasticsearch/index/mapper/MapperTestUtils.java
	src/test/java/org/elasticsearch/index/query/IndexQueryParserFilterCachingTests.java
	src/test/java/org/elasticsearch/index/query/SimpleIndexQueryParserTests.java
	src/test/java/org/elasticsearch/index/query/guice/IndexQueryParserModuleTests.java
	src/test/java/org/elasticsearch/index/search/FieldDataTermsFilterTests.java
	src/test/java/org/elasticsearch/index/search/child/ChildrenConstantScoreQueryTests.java
	src/test/java/org/elasticsearch/index/similarity/SimilarityTests.java

@dakrone dakrone merged commit 6abe4c9 into elastic:master Jul 28, 2014

@dakrone dakrone changed the title Add HierarchyCircuitBreakerService [BREAKER] Add HierarchyCircuitBreakerService Jul 29, 2014

@jpountz jpountz removed the review label Jul 29, 2014

@clintongormley clintongormley changed the title [BREAKER] Add HierarchyCircuitBreakerService Circuit Breaker: Add HierarchyCircuitBreakerService Sep 8, 2014

@dakrone dakrone deleted the dakrone:6129-use-breaker-for-bigarrays branch Sep 9, 2014

@clintongormley clintongormley changed the title Circuit Breaker: Add HierarchyCircuitBreakerService Add HierarchyCircuitBreakerService Jun 6, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.