Warning Courier Fetch: #3221
Same here: "Courier Fetch: 5 of 129 shards failed." elasticsearch.log says:

[2015-03-02 11:08:02,345][DEBUG][action.search.type] [es1] [otrs-2015.03][0], node[cmGk6z9BQXyycYMURhmu8A], [R], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@3b6d8170] lastShard [true]

How do I fix the failed shards?
Update: I just found out that this may be a problem from our move to another Elasticsearch cluster. All old indexes had only 1 primary shard; now we are using the default of 5 shards, so the new automatically created index "otrs-2015.03" uses 5 shards. Is this a problem for Kibana?
Just solved my problem.
This is not a Kibana error, but rather an Elasticsearch issue. You have failing or otherwise unavailable primary shards.
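(For reference, a quick way to check whether any primary shards really are unavailable; a minimal sketch assuming Elasticsearch listens on localhost:9200:)

```
# Cluster status: "red" means at least one primary shard is unassigned
curl -s 'localhost:9200/_cluster/health?pretty'

# List any shards that are not in the STARTED state
curl -s 'localhost:9200/_cat/shards?v' | grep -v STARTED
```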
Not completely true. All of my shards were OK. I also saw the same error when the elasticsearch.yml option "threadpool.search.queue_size" is set too low... Maybe Kibana's error message could be improved...
+1 to @monotek's statement. For example, if you only have two indexes (one with one template, the other with a different one) and you set up the "*" pattern in Settings, you will receive a "Courier Fetch: 1 of 2 shards failed." warning. Yes, the settings are incorrect, but the message is misleading.
...and it is definitely not an Elasticsearch issue.
Even when the state of my cluster is green, it still returns this error, and it doesn't happen with Kibana 3.
We ran into this as well. By using the developer tools in the browser to look at the response of the failing request, we were able to see the underlying error.
I've encountered the same issue; you can check the Elasticsearch log for details.
And what if I don't have threadpool.search.queue_size in my elasticsearch.yml? I can't find it, but I also have this "shards failed" error, which has ruined my indexes :(
Just add it.

Regards, André Bauer
Do you know the right string to add? Is it threadpool.search.queue_size: high?
I use it like this:

# Search pool
threadpool.search.type: fixed

Regards, André Bauer
Actually, setting threadpool.search.queue_size: 10000 fixes it. Changing threadpool.search.size is probably not a good idea, and the type defaults to fixed anyway.
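(For anyone hunting for the exact syntax: a minimal elasticsearch.yml sketch combining the settings mentioned above. The value 10000 is the arbitrary figure from this thread, not a recommendation, and a node restart is required for the change to take effect:)

```yaml
# elasticsearch.yml (1.x/2.x setting names)
threadpool.search.type: fixed         # "fixed" is already the default for the search pool
threadpool.search.queue_size: 10000   # default is 1000; requests beyond this are rejected
```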
Hi all, I have the same problem here. I posted the issue on the Elasticsearch forum (https://discuss.elastic.co/t/metric-aggregations-how-to-divide-value/27630/1); can someone help? Jason
threadpool.search.queue_size: 10000 fixed it for me as well. I'd love to know more about how to tune the queue_size parameter for search instead of arbitrarily using 10k, if anyone has some insight into where to start...
Ditto here. Adding the line below fixed this issue for me:

threadpool.search.queue_size: 10000

It's mentioned in the ES docs too: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html
threadpool.search.queue_size: 2000 did it for me. Thanks for the screenshot @spuder; the Chrome debug messages were extremely helpful in figuring out what I needed to set the queue_size to. I was able to increment it slowly from 200 up to 2000 until it was resolved on my system, without having to jump up to 10k, and I confirmed the issue each time with the corresponding log message.
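(Instead of watching the browser console, you can also watch the queue directly; a minimal sketch assuming Elasticsearch 1.x/2.x column names, where a growing rejected count means the search queue is still too small:)

```
curl -s 'localhost:9200/_cat/thread_pool?v&h=host,search.active,search.queue,search.rejected'
```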
+1 from me too: threadpool.search.queue_size: 10000
To clarify: this is neither a Kibana nor an Elasticsearch problem. The root cause of these errors is missing (or badly allocated) resources for Elasticsearch. Some questions to start with: How many shards does a single query hit? How many nodes does the cluster have, and with how many cores and how much RAM?
At the moment I have queries which hit 53 indexes (* 5 shards = 265 shards). So if 1 shard means 1 thread, would it be a good idea to move away from the default of 5 shards per index? I started with 1- or 2-core VMs but now have 2 nodes with 4 cores and 16 GB RAM each. So maybe I could go down from threadpool.search.queue_size: 10000 anyway.
Yes, but also try to stay below ~30 GB per shard.
Which means you have 2 * (((4 * 3) / 2) + 1) = 14 search threads (across both nodes), since the default size of the search thread pool is ((cores * 3) / 2) + 1 per node.
This would mean the default threadpool.search.queue_size of 1000 should be enough, or not? I just tried it and commented the setting out, but instantly got the old errors in Kibana again, like "Courier Fetch: 55 of 265 shards failed." I use threadpool.search.queue_size: 5000 now.
How many queries are being sent in parallel? If you have a dashboard with 4 visualizations, this will be (at least) 4 * 265 = 1060 shard-level search requests, which already exceeds the default queue size of 1000.
Ah, good to know! So for long-term usage I will consider going back to 1 shard per index, and maybe also reindex my old data into 1 index per year instead of 1 index per month.
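(A minimal sketch of that reindex, assuming Elasticsearch 2.3 or later, where the _reindex API is available; the otrs-* index names follow the naming used earlier in this thread and are illustrative:)

```
# Create the yearly target index with a single primary shard
curl -s -XPUT 'localhost:9200/otrs-2015' -H 'Content-Type: application/json' -d '{
  "settings": { "number_of_shards": 1 }
}'

# Copy the monthly indexes into it (list every month, or use a wildcard pattern)
curl -s -XPOST 'localhost:9200/_reindex' -H 'Content-Type: application/json' -d '{
  "source": { "index": ["otrs-2015.01", "otrs-2015.02"] },
  "dest":   { "index": "otrs-2015" }
}'
```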
On which node should threadpool.search.queue_size be applied? Client nodes? Or is this queue on the data nodes? |
Where should this be applied?
I know this thread is closed, but just for the record: I also got a misleading "failed shard" message that was actually caused by a template error in a scripted field.
coooool~ |
thread_pool.search.queue_size: 10000 (note: in Elasticsearch 5.x and later, the threadpool.* settings were renamed to thread_pool.*)
I had this issue because I already had a Metricbeat index before importing the Kibana dashboards. After deleting all the dashboards, visualizations, and indexes, I reimported the dashboards from Metricbeat and then started it again to create a fresh index.
I encountered this error message as well, on a green cluster. By using dev tools as per #3221 (comment), I saw the underlying error message. This is certainly a poor error message given that the underlying problem was just a query entry issue...
Recently I got the warning:

Courier Fetch: 17 of 100 shards failed.

What causes it, and what can I do to fix it?
Thanks, Akiva