
Performance regression for search query in single node cluster #31877

Closed
ctrlaltdel opened this issue Jul 6, 2018 · 9 comments

Comments

@ctrlaltdel commented Jul 6, 2018

Elasticsearch version (bin/elasticsearch --version): 6.3.1

Plugins installed: []

JVM version (java -version): OpenJDK Runtime Environment (build 10.0.1+10-Ubuntu-3ubuntu1)

OS version (uname -a if on a Unix-like system): Ubuntu 18.04

Linux es-perf 4.13.0-25-generic #29-Ubuntu SMP Mon Jan 8 21:14:41 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:

A single search query running on a single-node Elasticsearch cluster is much slower with Elasticsearch 6.3.1 than it was with 5.5.3, while it would be expected to take about the same time.

The likely culprit is inefficient use of the CPU caused by the per-search limit on concurrent shard requests (introduced in #25632).
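
As a rough illustration (this is a toy model, not Elasticsearch code), batched shard execution shows why capping a search at 5 concurrent shard requests leaves an 8-core machine underused; the function name and the 25 ms per-shard cost are made up for the sketch:

```python
import math

def modeled_query_time_ms(shards, per_shard_ms, max_concurrent, cores):
    """Toy model: shards execute in waves whose width is capped both by
    the search throttle and by the number of CPU cores available."""
    parallelism = min(max_concurrent, cores)
    waves = math.ceil(shards / parallelism)
    return waves * per_shard_ms

# 100 shards, ~25 ms of CPU work each (assumed figure), 8 cores:
print(modeled_query_time_ms(100, 25, 5, 8))    # 500 -- throttled to 5-wide waves
print(modeled_query_time_ms(100, 25, 16, 8))   # 325 -- limited only by the cores
```

The absolute numbers are arbitrary, but the ratio is in the same ballpark as the 727 ms vs. 309 ms measurements below.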

Steps to reproduce:

The following steps were tested in a virtual machine with 8 cores and 16 GB of RAM.

  1. Install elasticsearch 5.5.3 from deb package
  2. Create indices with test data using populate.sh (see below)
  3. Run a single match_all search query while showing CPU usage
$ ./query.sh
Elasticsearch version 5.5.3

%Cpu(s): 55.7 us,  4.0 sy,  0.0 ni, 31.0 id,  9.1 wa,  0.0 hi,  0.0 si,  0.1 st
%Cpu(s):  1.1 us,  0.0 sy,  0.0 ni, 98.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  0.0 us,  1.1 sy,  0.0 ni, 98.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

GET /_search
{"query":{"match_all":{}}}

%Cpu(s):  1.1 us,  2.2 sy,  0.0 ni, 96.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s): 76.5 us,  0.0 sy,  0.0 ni, 23.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s): 85.9 us,  1.2 sy,  0.0 ni, 12.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
{
  "hits": 670830085,
  "took": 334,
  "_shards": {
    "total": 100,
    "successful": 100,
    "failed": 0
  }
}
%Cpu(s): 24.7 us,  1.1 sy,  0.0 ni, 74.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  0.0 us,  1.1 sy,  0.0 ni, 98.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  1.1 us,  1.1 sy,  0.0 ni, 96.7 id,  1.1 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  1.1 us,  0.0 sy,  0.0 ni, 98.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
$
  4. Upgrade to 6.3.1 by installing the new package and restarting the service
  5. Run the same query again
$ ./query.sh
Elasticsearch version 6.3.1

%Cpu(s): 54.8 us,  4.0 sy,  0.0 ni, 32.0 id,  9.1 wa,  0.0 hi,  0.0 si,  0.1 st
%Cpu(s):  2.2 us,  1.1 sy,  0.0 ni, 96.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

GET /_search
{"query":{"match_all":{}}}

%Cpu(s):  0.0 us,  1.1 sy,  0.0 ni, 98.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s): 25.6 us,  1.1 sy,  0.0 ni, 72.2 id,  0.0 wa,  0.0 hi,  1.1 si,  0.0 st
%Cpu(s): 52.8 us,  1.1 sy,  0.0 ni, 46.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s): 62.1 us,  0.0 sy,  0.0 ni, 37.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s): 59.8 us,  1.1 sy,  0.0 ni, 39.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s): 58.8 us,  0.0 sy,  0.0 ni, 41.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s): 56.5 us,  0.0 sy,  0.0 ni, 43.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s): 44.3 us,  2.3 sy,  0.0 ni, 53.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
{
  "hits": 670830085,
  "took": 727,
  "_shards": {
    "total": 100,
    "successful": 100,
    "skipped": 0,
    "failed": 0
  }
}
%Cpu(s):  1.1 us,  2.3 sy,  0.0 ni, 96.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  0.0 us,  1.1 sy,  0.0 ni, 98.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  1.1 us,  1.1 sy,  0.0 ni, 97.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  1.1 us,  1.1 sy,  0.0 ni, 97.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
$
  6. Run the same query again while setting max_concurrent_shard_requests to 16
$ ./query.sh ?max_concurrent_shard_requests=16
Elasticsearch version 6.3.1

%Cpu(s): 53.4 us,  3.9 sy,  0.0 ni, 33.8 id,  8.8 wa,  0.0 hi,  0.0 si,  0.1 st
%Cpu(s):  1.1 us,  1.1 sy,  0.0 ni, 97.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  1.1 us,  2.3 sy,  0.0 ni, 96.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

GET /_search?max_concurrent_shard_requests=16
{"query":{"match_all":{}}}

%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s): 78.2 us,  2.3 sy,  0.0 ni, 19.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
{
  "hits": 670830085,
  "took": 309,
  "_shards": {
    "total": 100,
    "successful": 100,
    "skipped": 0,
    "failed": 0
  }
}
%Cpu(s):  5.7 us,  2.3 sy,  0.0 ni, 92.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  1.1 us,  0.0 sy,  0.0 ni, 98.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  1.1 us,  0.0 sy,  0.0 ni, 98.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu(s):  1.1 us,  1.1 sy,  0.0 ni, 97.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
$

Summary:

Version  Took (ms)  Peak user CPU                            Extra query parameter
5.5.3    334        100 %
6.3.1    727        62 % (only 5 threads on an 8-core CPU)
6.3.1    309        100 %                                    max_concurrent_shard_requests=16
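
As a quick sanity check on the 62 % figure: with the throttle admitting 5 shard requests at a time on an 8-core machine, peak user CPU should level off near 5/8 of the box:

```python
# 5 concurrently executing shard requests out of 8 cores:
peak_cpu_pct = 5 / 8 * 100
print(f"{peak_cpu_pct:.1f} %")  # 62.5 % -- close to the observed 62 %
```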

Test scripts:

populate.sh:

#!/bin/bash

# Create index-1 with 10M dummy documents

for j in $(seq 0 10); do
        for i in $(seq 0 1000000); do
                echo -e '{ "index" : { "_index" : "index-1", "_type" : "doc" } }\n{ "message" : "Hello World!" }'
        done \
          | curl -s -X POST "localhost:9200/_bulk" -H 'Content-Type: application/json' --data-binary @- \
          | jq '{errors: .errors, took: .took}'
done

# Clone index-1

for index in $(seq 2 20); do
        curl -s -X POST "localhost:9200/_reindex" -H 'Content-Type: application/json' \
             -d "{\"source\": {\"index\": \"index-1\"}, \"dest\": {\"index\": \"index-$index\"}}" &
done
wait  # don't exit until all background reindex requests have completed

query.sh:

#!/bin/bash

echo -n "Elasticsearch version "
curl -s localhost:9200 | jq -r .version.number
echo

curl -s -X POST "localhost:9200/_cache/clear" > /dev/null

QUERY='{"query":{"match_all":{}}}'

top -d 0.1 -b | grep Cpu &

sleep 0.5
echo
echo "GET /_search$1"
echo "$QUERY"
echo
curl -s -X GET "localhost:9200/_search$1" -H 'Content-Type: application/json' -d "$QUERY" | jq "{hits: .hits.total, took: .took, _shards: ._shards}"
sleep 0.5

kill $(jobs -p)
wait

@ctrlaltdel changed the title from "Performance regression for search query in single node cluster since 5.6.0" to "Performance regression for search query in single node cluster" Jul 6, 2018

@elasticmachine (Collaborator) commented Jul 9, 2018

Pinging @elastic/es-search-aggs


@jpountz (Contributor) commented Jul 13, 2018

@ctrlaltdel I see this as a feature rather than a bug. With that many shards on a single node, having this protection is important to keep things reasonable with many concurrent users. To me the problem is more the number of shards per node here.

@ctrlaltdel (Author) commented Jul 13, 2018

@jpountz thanks for the comment.

My typical use case here is an ELK stack running on a single machine and serving as a search engine for syslog messages. There's usually only a single query running at a given time, coming from a Kibana dashboard.

In this case, running on a 24-core machine, typical query time is now about 5 times slower with the latest version of Elasticsearch than it was with 5.5.3.

Could you please elaborate on how changing the number of shards here would improve performance? AFAIK, since #25632 was merged, a single query can only run on 5 threads (so use at most 5 cores) per node by default. The only way to improve this would be to actually increase the number of shards per index.

Unfortunately there's no way to tell Kibana to set the max_concurrent_shard_requests parameter by default, and it makes use of _msearch queries, which don't support it anyway.

@s1monw (Contributor) commented Aug 14, 2018

> My typical use-case here is an ELK stack running on a single machine and serving as a search engine for syslog messages. There's usually only a single query running at a given time which is coming from a kibana dashboard.

I agree it's a shame you can't use all your resources here. This use case is atypical in that you don't gain concurrency from multiple requests. If 5 requests hit your server at the same time, you will have at most 25 shard requests hitting your node, which is a good protection. Yet your case wants to maximize resource utilization per node, and it's a shame that Kibana can't trigger that. I wonder if we should allow multi search to override this and then have Kibana expose it? @jpountz WDYT?

@jpountz (Contributor) commented Aug 14, 2018

+1

@s1monw self-assigned this Aug 20, 2018

s1monw added a commit to s1monw/elasticsearch that referenced this issue Aug 21, 2018
Expose `max_concurrent_shard_requests` in `_msearch`
Today `_msearch` doesn't allow modifying the `max_concurrent_shard_requests`
per sub search request. This change adds support for setting this parameter on
all sub-search requests in an `_msearch`.

Relates to elastic#31877
@s1monw (Contributor) commented Aug 21, 2018

I created a PR to expose this setting on _msearch and opened a Kibana issue to expose it there. I will keep this issue open until we have code committed in both Elasticsearch and Kibana.
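
Assuming the change lands as described in the commit message above, a sketch of how a client could build an `_msearch` request and raise the throttle for every sub-search via the URL parameter (the helper function and index names are hypothetical):

```python
import json

def msearch_body(searches):
    """Build the NDJSON body for an _msearch request:
    one header line and one query line per sub-search."""
    lines = []
    for header, query in searches:
        lines.append(json.dumps(header))
        lines.append(json.dumps(query))
    return "\n".join(lines) + "\n"  # _msearch bodies must end with a newline

body = msearch_body([
    ({"index": "index-1"}, {"query": {"match_all": {}}}),
    ({"index": "index-2"}, {"query": {"match_all": {}}}),
])
# POST /_msearch?max_concurrent_shard_requests=16 with `body` as the payload
# would then apply the higher limit to all sub-search requests.
```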

@ctrlaltdel (Author) commented Aug 21, 2018

@s1monw looks good, thanks a lot for taking care of this issue :)

s1monw added a commit that referenced this issue Aug 22, 2018
Expose `max_concurrent_shard_requests` in `_msearch` (#33016)
Today `_msearch` doesn't allow modifying the `max_concurrent_shard_requests`
per sub search request. This change adds support for setting this parameter on
all sub-search requests in an `_msearch`.

Relates to #31877
@s1monw (Contributor) commented Sep 11, 2018

This has been integrated in Kibana. I am closing this.

@s1monw closed this Sep 11, 2018
