Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit the number of concurrent requests per node #31206

Merged
merged 5 commits into from
Jun 11, 2018

Conversation

s1monw
Copy link
Contributor

@s1monw s1monw commented Jun 8, 2018

With max_concurrent_shard_requests we used to throttle / limit
the number of concurrent shard requests a high level search request
can execute per node. This had several problems since it limited the
number on a global level based on the number of nodes. This change
now throttles the number of concurrent requests per node while still
allowing concurrency across multiple nodes.

Closes #31192

With `max_concurrent_shard_requests` we used to throttle / limit
the number of concurrent shard requests a high level search request
can execute per node. This had several problems since it limited the
number on a global level based on the number of nodes. This change
now throttles the number of concurrent requests per node while still
allowing concurrency across multiple nodes.

Closes elastic#31192
@s1monw s1monw added >enhancement :Search/Search Search-related issues that do not fall into other categories v7.0.0 labels Jun 8, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you had a search/replace issue but I like it in general. Given that it is an expert option I don't think we need sophisticated backward compatibility, let's just add a note about it in the breaking changes for 7.0?


synchronized void add(Runnable runnable) {
if (queue == null) { // create this lazily
queue = new LinkedList<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not that it matters much but ArrayDeque should be better?

}

synchronized void add(Runnable runnable) {
if (queue == null) { // create this lazily
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does it need to be lazy?

@@ -258,7 +254,7 @@ protected void doExecute(Task task, SearchRequest searchRequest, ActionListener<
}
String[] aliases = aliasFilter.getAliases();
String[] finalIndices = aliases.length == 0 ? new String[] {shardId.getIndexName()} : aliases;
// here we have to map the filters to the UUID since from now on we use the uuid for the lookup
// here we have to pendingExecutionsPerNode the filters to the UUID since from now on we use the uuid for the lookup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

search/replace issue?

try {
executePhaseOnShard(shardIt, shard, new SearchActionListener<FirstResult>(new SearchShardTarget(shard
.currentNodeId(),
shardIt.shardId(), shardIt.getClusterAlias(), shardIt.getOriginalIndices()), shardIndex) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation of the two above lines doesn't help readability

@@ -199,7 +199,7 @@ public void scrollId(String scrollId) {
* If profiling was enabled, this returns an object containing the profile results from
* each shard. If profiling was not enabled, this will return null
*
* @return The profile results or an empty map
* @return The profile results or an empty pendingExecutionsPerNode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

search/replace issue?

@@ -90,7 +90,7 @@ public final int getNumReducePhases() {

/**
* Returns the profile results for this search response (including all shards).
* An empty map is returned if profiling was not enabled
* An empty pendingExecutionsPerNode is returned if profiling was not enabled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

search/replace issue?

@@ -203,8 +203,8 @@ public RemoteClusterService getRemoteClusterService() {
}

/**
* Return a map of nodeId to pending number of search requests.
* This is a snapshot of the current pending search and not a live map.
* Return a pendingExecutionsPerNode of nodeId to pending number of search requests.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

search/replace issue?

fork(() -> onShardFailure(shardIndex, shard, shard.currentNodeId(), shardIt, e));
}
};
if (pendingExecutions == null || pendingExecutions.tryAcquire()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe remove this tryAcquire and go through the else block directly? I think it's simpler if the management of the semaphore permits are contained in PendingExecutions?

@s1monw
Copy link
Contributor Author

s1monw commented Jun 8, 2018

@jpountz I addressed your comments

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@s1monw s1monw merged commit f825a53 into elastic:master Jun 11, 2018
@s1monw s1monw deleted the fix_max_con_search_request branch June 11, 2018 13:53
dnhatn added a commit that referenced this pull request Jun 14, 2018
* master:
  Remove RestGetAllAliasesAction (#31308)
  Temporary fix for broken build
  Reenable Checkstyle's unused import rule (#31270)
  Remove remaining unused imports before merging #31270
  Fix non-REST doc snippet
  [DOC] Extend SQL docs
  Immediately flush channel after writing to buffer (#31301)
  [DOCS] Shortens ML API intros
  Use quotes in the call invocation (#31249)
  move security ingest processors to a sub ingest directory (#31306)
  Add 5.6.11 version constant.
  Fix version detection.
  SQL: Whitelist SQL utility class for better scripting (#30681)
  [Docs] All Rollup docs experimental, agg limitations, clarify DeleteJob (#31299)
  CCS: don't proxy requests for already connected node (#31273)
  Mute ScriptedMetricAggregatorTests testSelfReferencingAggStateAfterMap
  [test] opensuse packaging turn up debug logging
  Add unreleased version 6.3.1
  Removes experimental tag from scripted_metric aggregation (#31298)
  [Rollup] Metric config parser must use builder so validation runs (#31159)
  [ML] Check licence when datafeeds use cross cluster search  (#31247)
  Add notion of internal index settings (#31286)
  Test: Remove broken yml test feature (#31255)
  REST hl client: cluster health to default to cluster level (#31268)
  [ML] Update test thresholds to account for changes to memory control (#31289)
  Log warnings when cluster state publication failed to some nodes (#31233)
  Fix AntFixture waiting condition (#31272)
  Ignore numeric shard count if waiting for ALL (#31265)
  [ML] Implement new rules design (#31110)
  index_prefixes back-compat should test 6.3 (#30951)
  Core: Remove plain execute method on TransportAction (#30998)
  Update checkstyle to 8.10.1 (#31269)
  Set analyzer version in PreBuiltAnalyzerProviderFactory (#31202)
  Modify pipelining handlers to require full requests (#31280)
  Revert upgrade to Netty 4.1.25.Final (#31282)
  Use armored input stream for reading public key (#31229)
  Fix Netty 4 Server Transport tests. Again.
  REST hl client: adjust wait_for_active_shards param in cluster health (#31266)
  REST high-level Client: remove deprecated API methods (#31200)
  [DOCS] Mark SQL feature as experimental
  [DOCS] Updates machine learning custom URL screenshots (#31222)
  Fix naming conventions check for XPackTestCase
  Fix security Netty 4 transport tests
  Fix race in clear scroll (#31259)
  [DOCS] Clarify audit index settings when remote indexing (#30923)
  Delete typos in SAML docs (#31199)
  REST high-level client: add Cluster Health API (#29331)
  [ML][TEST] Mute tests using rules (#31204)
  Support RequestedAuthnContext (#31238)
  SyncedFlushResponse to implement ToXContentObject (#31155)
  Add Get Aliases API to the high-level REST client (#28799)
  Remove some line length supressions (#31209)
  Validate xContentType in PutWatchRequest. (#31088)
  [INGEST] Interrupt the current thread if evaluation grok expressions take too long (#31024)
  Suppress extras FS on caching directory tests
  Revert "[DOCS] Added 6.3 info & updated the upgrade table. (#30940)"
  Revert "Fix snippets in upgrade docs"
  Fix snippets in upgrade docs
  [DOCS] Added 6.3 info & updated the upgrade table. (#30940)
  LLClient: Support host selection (#30523)
  Upgrade to Netty 4.1.25.Final (#31232)
  Enable custom credentials for core REST tests (#31235)
  Move ESIndexLevelReplicationTestCase to test framework (#31243)
  Encapsulate Translog in Engine (#31220)
  HLRest: Add get index templates API (#31161)
  Remove all unused imports and fix CRLF (#31207)
  [Tests] Fix self-referencing tests
  [TEST] Fix testRecoveryAfterPrimaryPromotion
  [Docs] Remove mention pattern files in Grok processor (#31170)
  Use stronger write-once semantics for Azure repository (#30437)
  Don't swallow exceptions on replication (#31179)
  Limit the number of concurrent requests per node (#31206)
  Call ensureNoSelfReferences() on _agg state variable after scripted metric agg script executions (#31044)
  Move java version checker back to its own jar (#30708)
  [test] add fix for rare virtualbox error (#31212)
javanna added a commit to javanna/elasticsearch that referenced this pull request May 20, 2019
Some of the docs were outdated as they did not mention that the limit is
not per node. Also, The default value changed.

Relates to elastic#31206
javanna added a commit that referenced this pull request May 23, 2019
Some of the docs were outdated as they did not mention that the limit is
not per node. Also, The default value changed.

Relates to #31206
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
Some of the docs were outdated as they did not mention that the limit is
not per node. Also, The default value changed.

Relates to elastic#31206
javanna added a commit to javanna/elasticsearch that referenced this pull request Jun 11, 2019
Some of the docs were outdated as they did not mention that the limit is
not per node. Also, The default value changed.

Relates to elastic#31206
javanna added a commit that referenced this pull request Jun 12, 2019
Some of the docs were outdated as they did not mention that the limit is
not per node. Also, The default value changed.

Relates to #31206
javanna added a commit that referenced this pull request Jun 12, 2019
Some of the docs were outdated as they did not mention that the limit is
not per node. Also, The default value changed.

Relates to #31206
javanna added a commit that referenced this pull request Jun 12, 2019
Some of the docs were outdated as they did not mention that the limit is
not per node. Also, The default value changed.

Relates to #31206
javanna added a commit that referenced this pull request Jun 12, 2019
Some of the docs were outdated as they did not mention that the limit is
not per node. Also, The default value changed.

Relates to #31206
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants