ClusterInfoService should wipe local cache upon unknown exceptions #9449

Closed
bleskes wants to merge 4 commits from the cluster_info_clear_on_unknown branch

@bleskes bleskes commented Jan 27, 2015

The InternalClusterInfoService reaches out to the nodes to get information about their disk usage and shard store sizes. On a node-level error we currently remove that node's info from the local cache. We should also clear the cache when we run into an error at the action level (that is, discard the info from all nodes).

This also adds settings for the timeout used when waiting for nodes.
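
To illustrate the difference between the two failure modes described above, here is a minimal sketch. It is not the code from this PR: `ClusterInfoCacheSketch` and all of its method names are hypothetical, and the cache is reduced to a map from node id to free disk bytes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for the cache handling described above.
// None of these names are taken from the Elasticsearch source.
final class ClusterInfoCacheSketch {

    // node id -> last reported free disk space in bytes
    private final Map<String, Long> freeBytesByNode = new ConcurrentHashMap<>();

    // A node answered successfully: store or refresh its entry.
    void onNodeResponse(String nodeId, long freeBytes) {
        freeBytesByNode.put(nodeId, freeBytes);
    }

    // Node-level error: only that node's entry is stale, so drop just that one.
    void onNodeFailure(String nodeId) {
        freeBytesByNode.remove(nodeId);
    }

    // Action-level error: the whole request failed, so no cached entry can be
    // assumed current. Drop everything rather than keep stale data.
    void onActionFailure(Throwable cause) {
        freeBytesByNode.clear();
    }

    Long freeBytes(String nodeId) {
        return freeBytesByNode.get(nodeId);
    }

    int size() {
        return freeBytesByNode.size();
    }

    public static void main(String[] args) {
        ClusterInfoCacheSketch cache = new ClusterInfoCacheSketch();
        cache.onNodeResponse("node1", 100L);
        cache.onNodeResponse("node2", 200L);

        cache.onNodeFailure("node1");
        System.out.println("entries after node-level failure: " + cache.size());

        cache.onActionFailure(new RuntimeException("simulated action failure"));
        System.out.println("entries after action-level failure: " + cache.size());
    }
}
```

The point of the sketch is only the asymmetry: a node-level failure removes one entry, while an action-level failure empties the cache so that consumers never act on outdated disk usage.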

@@ -140,6 +151,11 @@ public void onMaster() {
}
}

// called from tests as well
void updateOnce() {

This needs actual Javadocs; just looking at it doesn't tell you that it executes the job with rescheduling set to false (you have to read the constructor argument for ClusterInfoUpdateJob to see that).

for (DiscoveryNode node : internalTestCluster.clusterService().state().getNodes()) {
mockTransportService.addDelegate(node, new MockTransportService.DelegateTransport(mockTransportService.original()) {
@Override
public void sendRequest(DiscoveryNode node, long requestId, String action, TransportRequest request, TransportRequestOptions options) throws IOException, TransportException {

Dude! Linebreaks! Strive for 80 columns! (or at least 100 or 120) :D

bleskes commented Jan 27, 2015

@dakrone pushed another commit. thx

dakrone commented Jan 27, 2015

LGTM

bleskes added a commit that referenced this pull request Jan 27, 2015
Closes #9449
@bleskes bleskes closed this in 9ac6d78 Jan 27, 2015
@bleskes bleskes deleted the cluster_info_clear_on_unknown branch January 27, 2015 21:40
@clintongormley clintongormley changed the title from "Internal: ClusterInfoService should wipe local cache upon unknown exceptions" to "ClusterInfoService should wipe local cache upon unknown exceptions" Jun 7, 2015
mute pushed a commit to mute/elasticsearch that referenced this pull request Jul 29, 2015
Closes elastic#9449