Improve performance of extracting warning value #24114

jasontedor · 2017-04-14T16:07:50Z

When building headers for a REST response, we de-duplicate the warning headers based on the actual warning value. The current implementation uses a capturing regular expression that is prone to excessive backtracking. When a request involves a large number of warnings, this extraction can impose a severe performance penalty. An example where this can arise is a bulk indexing request that uses a deprecated feature (e.g., deprecated forms of boolean values). This commit is an attempt to address this performance regression. We already know the format of the warning header, so we do not need a regular expression to parse it; we can instead parse it by hand to extract the warning value. This gains back the vast majority of the performance lost due to the usage of a deprecated feature. There is still a performance loss due to logging the deprecation message, but we do not address that concern in this commit.

Closes #24018
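For illustration, here is a minimal sketch of what parsing the header by hand could look like. It is not the code from this PR. It assumes the header has the shape `299 <agent> "<warning text>" "<warn-date>"`, and the class and method names are hypothetical.

```java
public final class WarningValueSketch {

    /**
     * Extracts the warning text from a header assumed (hypothetically) to look like:
     *   299 <agent> "<warning text>" "<warn-date>"
     * Only index lookups are used, so there is no regex backtracking.
     */
    static String extractWarningValue(final String header) {
        // Opening quote of the warning text.
        final int firstQuote = header.indexOf('"');
        // Closing quote of the trailing warn-date.
        final int lastQuote = header.lastIndexOf('"');
        // Opening quote of the trailing warn-date.
        final int penultimateQuote = header.lastIndexOf('"', lastQuote - 1);
        // The warning text's closing quote sits two characters before the
        // date's opening quote (a quote followed by one space).
        return header.substring(firstQuote + 1, penultimateQuote - 2);
    }

    public static void main(String[] args) {
        final String header =
            "299 Elasticsearch-5.4.0 \"[foo] is deprecated\" \"Fri, 14 Apr 2017 16:07:50 GMT\"";
        System.out.println(extractWarningValue(header));
    }
}
```

Because the sketch searches for the date's quotes from the end of the string, quotes that appear inside the warning text itself do not affect the result.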


jasontedor · 2017-04-14T16:21:11Z

Thank you @jpountz.

This commit improves the performance of warning value extraction in the low-level REST client, and is similar to the approach taken in #24114. There are some differences since the low-level REST client might be connected to Elasticsearch through a proxy that injects its own warnings.
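One way a client could cope with warnings injected by a proxy is sketched below. This is an assumption for illustration only, not the client's actual code: headers that do not match the assumed shape `299 <agent> "<warning text>" "<warn-date>"` are returned unchanged, and the class and method names are hypothetical.

```java
public final class ClientWarningSketch {

    /**
     * Returns the warning text if the header matches the assumed shape
     *   299 <agent> "<warning text>" "<warn-date>"
     * and otherwise returns the header unchanged (for example, a warning
     * added by an intermediate proxy in some other format).
     */
    static String extractOrPassThrough(final String header) {
        final int firstQuote = header.indexOf('"');
        final int lastQuote = header.lastIndexOf('"');
        final int penultimateQuote = lastQuote > 0 ? header.lastIndexOf('"', lastQuote - 1) : -1;

        // Cheap structural checks instead of a regular expression.
        final boolean looksLikeAssumedShape = header.startsWith("299 ")
            && lastQuote == header.length() - 1
            && penultimateQuote - 2 > firstQuote
            && header.charAt(penultimateQuote - 1) == ' '
            && header.charAt(penultimateQuote - 2) == '"';

        if (!looksLikeAssumedShape) {
            return header; // unknown format: hand it back untouched
        }
        return header.substring(firstQuote + 1, penultimateQuote - 2);
    }

    public static void main(String[] args) {
        System.out.println(extractOrPassThrough(
            "299 Elasticsearch-5.4.0 \"[foo] is deprecated\" \"Fri, 14 Apr 2017 16:07:50 GMT\""));
        System.out.println(extractOrPassThrough("110 proxy \"Response is stale\""));
    }
}
```

The pass-through branch is the design choice here: it avoids throwing on, or mangling, a header whose format the client cannot be sure of.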


jasontedor added the :Core/Infra/Core, >bug, review, v5.3.1, v5.4.0, and v6.0.0-alpha1 labels Apr 14, 2017

jasontedor requested a review from jpountz April 14, 2017 16:07

jpountz approved these changes Apr 14, 2017


jasontedor merged commit 09efdc3 into elastic:master Apr 14, 2017

jasontedor deleted the warning-value-performance branch April 14, 2017 16:21

jasontedor mentioned this pull request Apr 27, 2017

Crashes after upgrading to 5.3.0 #23955

Closed

darrenfoong mentioned this pull request Dec 15, 2019

Improve warning value extraction performance in Response #50208

Merged

This was referenced Feb 3, 2020

[meta] 7.6 release elastic/elasticsearch-net#4340

Closed

[meta] 7.6 release elastic/elasticsearch-net#4341

Closed
