Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relax error handling in transport disk usage API #84774

Merged
merged 3 commits into from
Mar 8, 2022

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Mar 8, 2022

Relax the failure handling to cover other exceptions such as BroadcastShardOperationFailedException.

An assertion was tripped while @nik9000 was integrating the disk usage API to Rally.

» [2022-03-03T13:15:00,065][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [runTask-0] fatal error in thread [elasticsearch[runTask-0][analyze][T#1]], exiting
»  java.lang.AssertionError: unknown response [[logs-191998/AM-V3iMAS4W1CceHvQtajg][[logs-191998][2]] org.elasticsearch.action.support.broadcast.BroadcastShardOperationFailedException:]
»       at org.elasticsearch.action.admin.indices.diskusage.TransportAnalyzeIndexDiskUsageAction.newResponse(TransportAnalyzeIndexDiskUsageAction.java:115) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
»       at org.elasticsearch.action.admin.indices.diskusage.TransportAnalyzeIndexDiskUsageAction.newResponse(TransportAnalyzeIndexDiskUsageAction.java:42) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
»       at org.elasticsearch.action.support.broadcast.TransportBroadcastAction$AsyncBroadcastAction.finishHim(TransportBroadcastAction.java:258) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
»       at org.elasticsearch.action.support.broadcast.TransportBroadcastAction$AsyncBroadcastAction.onOperation(TransportBroadcastAction.java:206) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
»       at org.elasticsearch.action.support.broadcast.TransportBroadcastAction$AsyncBroadcastAction$1.handleResponse(TransportBroadcastAction.java:186) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
»       at org.elasticsearch.action.support.broadcast.TransportBroadcastAction$AsyncBroadcastAction$1.handleResponse(TransportBroadcastAction.java:178) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
»       at org.elasticsearch.transport.TransportService$4.handleResponse(TransportService.java:718) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
»       at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1339) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]
»       at org.elasticsearch.transport.TransportService$DirectResponseChannel.processResponse(TransportService.java:1417) ~[elasticsearch-8.2.0-SNAPSHOT.jar:8.2.0-SNAPSHOT]

@dnhatn dnhatn added >bug v8.2.0 v8.1.1 v7.17.2 :Search/Search Search-related issues that do not fall into other categories labels Mar 8, 2022
@dnhatn dnhatn marked this pull request as ready for review March 8, 2022 19:05
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Mar 8, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@dnhatn dnhatn requested a review from ywelsch March 8, 2022 19:05
@elasticsearchmachine
Copy link
Collaborator

Hi @dnhatn, I've created a changelog YAML for you.

@@ -89,9 +90,12 @@ protected AnalyzeDiskUsageShardResponse shardOperation(AnalyzeDiskUsageShardRequ
final CancellableTask cancellableTask = (CancellableTask) task;
final Runnable checkForCancellation = cancellableTask::ensureNotCancelled;
final IndexShard shard = indicesService.indexServiceSafe(shardId.getIndex()).getShard(shardId.id());
shard.store().incRef();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we move this to the InternalEngine instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened #84776 for this.

@dnhatn dnhatn changed the title Increase store ref before analyzing disk usage Relax error handling in transport disk usage API Mar 8, 2022
Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't really comment on the code because I don't know the area but it does seem to fix my problem somewhat.

Now I'm getting rejected execution exceptions, but that's better!

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I'm pretty sure I can approve this one!

@dnhatn
Copy link
Member Author

dnhatn commented Mar 8, 2022

Thanks Nik!

@dnhatn dnhatn added the auto-backport-and-merge Automatically create backport pull requests and merge when ready label Mar 8, 2022
@dnhatn dnhatn merged commit 423a0a5 into elastic:master Mar 8, 2022
@dnhatn dnhatn deleted the fix_disk_usage branch March 8, 2022 21:22
@dnhatn dnhatn added v8.1.2 and removed v8.1.1 labels Mar 17, 2022
dnhatn added a commit that referenced this pull request Mar 26, 2022
Relax the failure handling to cover other exceptions such as
BroadcastShardOperationFailedException.
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Mar 26, 2022
Relax the failure handling to cover other exceptions such as
BroadcastShardOperationFailedException.
@dnhatn dnhatn removed the auto-backport-and-merge Automatically create backport pull requests and merge when ready label Mar 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v7.17.2 v8.1.2 v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants