Reduce ES response size through use of filter_path #1153

Closed
jsvd opened this issue Sep 28, 2023 · 3 comments · Fixed by #1154

@jsvd
Contributor

jsvd commented Sep 28, 2023

See elastic/beats#36275.

During the _bulk request, if "errors,items.*.error,items.*.status" is applied as the filter_path, it should reduce the response size from ES by more than 90%, reducing bandwidth usage and also making response processing faster (less JSON to deserialize).
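To make the size claim concrete, here is a minimal Python sketch that approximates the effect of that filter locally on a made-up _bulk response. The sample response and the helper `filter_bulk_response` are hypothetical and only mimic what Elasticsearch does server-side; the actual savings depend on the real response contents.

```python
import json

# Hypothetical _bulk response; the values are invented for illustration only.
full_response = {
    "took": 30,
    "errors": False,
    "items": [
        {
            "index": {
                "_index": "logs",
                "_id": str(i),
                "_version": 1,
                "result": "created",
                "_shards": {"total": 2, "successful": 1, "failed": 0},
                "_seq_no": i,
                "_primary_term": 1,
                "status": 201,
            }
        }
        for i in range(100)
    ],
}


def filter_bulk_response(response):
    """Local approximation of filter_path=errors,items.*.error,items.*.status."""
    filtered = {"errors": response["errors"]}
    items = []
    for item in response["items"]:
        kept = {}
        for action, result in item.items():
            # Keep only the per-item fields needed to detect and report failures.
            kept[action] = {k: v for k, v in result.items() if k in ("error", "status")}
        items.append(kept)
    filtered["items"] = items
    return filtered


filtered_response = filter_bulk_response(full_response)
full_size = len(json.dumps(full_response))
filtered_size = len(json.dumps(filtered_response))
reduction = 1 - filtered_size / full_size
print(f"full={full_size} bytes, filtered={filtered_size} bytes, reduction={reduction:.0%}")
```

The filtered response still carries the top-level errors flag plus each item's status and error, which is what a client needs to decide on retries.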

@flexitrev

Cool, can we track performance implications on Logstash instances as this gets rolled out?

@roaksoax changed the title from "reduce ES response size through use of filter_path" to "Reduce ES response size through use of filter_path" on Sep 28, 2023
@jsvd
Contributor Author

jsvd commented Oct 4, 2023

For those wanting to test it out, current versions of Logstash allow configuring the bulk path, where the filter_path can be set:

output {
  elasticsearch {
    hosts => [ ... ]
    bulk_path => '_bulk?filter_path=errors,items.*.error,items.*.status'
  }
}

@strawgate

I did a quick initial benchmark with 2GB of data in a simple file input/elasticsearch output pipeline and I see a roughly 5% drop in CPU usage and a 75% drop in response traffic from Elasticsearch.

robbavey added a commit to robbavey/logstash-output-elasticsearch that referenced this issue Oct 20, 2023
This commit sets the `filter_path` query parameter when sending messages
to Elasticsearch using the bulk API. This should significantly reduce
the size of the query response from Elasticsearch, which should help
reduce bandwidth usage, and improve response processing speed due to
the smaller amount of JSON to deserialize.

Resolves: logstash-plugins#1153
jsvd pushed a commit that referenced this issue Oct 25, 2023
This commit sets the `filter_path` query parameter when sending messages
to Elasticsearch using the bulk API. This should significantly reduce
the size of the query response from Elasticsearch, which should help
reduce bandwidth usage, and improve response processing speed due to
the smaller amount of JSON to deserialize.

* Add query to expected results to fix integration tests
* Rework PR to respect existing `filter_path` settings in custom bulk endpoints
* Add doc and comment explaining filter_path addition to /_bulk

Resolves: #1153
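
The idea of respecting an existing `filter_path` in a custom bulk endpoint can be sketched as follows. This is an illustrative Python sketch, not the plugin's actual code; the function name `with_default_filter_path` and the constant `DEFAULT_FILTER_PATH` are hypothetical.

```python
from urllib.parse import parse_qsl

# Filter value taken from the issue discussion.
DEFAULT_FILTER_PATH = "errors,items.*.error,items.*.status"


def with_default_filter_path(bulk_path):
    """Append the default filter_path unless the path already sets one."""
    path, _, query = bulk_path.partition("?")
    names = [name for name, _ in parse_qsl(query, keep_blank_values=True)]
    if "filter_path" in names:
        # A user-supplied filter_path wins; leave the path untouched.
        return bulk_path
    extra = "filter_path=" + DEFAULT_FILTER_PATH
    return path + "?" + (query + "&" + extra if query else extra)


print(with_default_filter_path("_bulk"))
print(with_default_filter_path("_bulk?pipeline=p"))
print(with_default_filter_path("_bulk?filter_path=items.*.status"))
```

Leaving a user-supplied value untouched means a custom `bulk_path` that requests more (or fewer) response fields keeps working as configured.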