x-pack/filebeat/input/httpjson: drop response bodies at end of execution (#38116)

The response bodies of the first and last responses were being held in a
closed-over variable, resulting in high static memory loads in some
situations. The bodies are not used between periodic executions, and the
documentation states that only cursor values are persisted across
restarts. The difference in behaviour between using the body field over
a restart versus over a sequence of executions in the same run makes
relying on the bodies unsafe, so clarify the persistence behaviour in the
documentation and free the bodies at the end of an execution.

A survey of integrations that use the httpjson input did not identify
any that use the behaviour being removed, but we will need to
keep an eye on cases that may have been missed. In general, if
persistence is depended on, the cursor should be used.

(cherry picked from commit 353dab3)
efd6 authored and mergify[bot] committed Feb 23, 2024
1 parent 2fc1cb5 commit 5f393cf
Showing 3 changed files with 26 additions and 1 deletion.
19 changes: 19 additions & 0 deletions CHANGELOG.next.asciidoc
@@ -47,6 +47,25 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]

*Filebeat*

+- Fix nil pointer dereference in the httpjson input {pull}37591[37591]
+- [Gcs Input] - Added missing locks for safe concurrency {pull}34914[34914]
+- Fix the ignore_inactive option being ignored in Filebeat's filestream input {pull}34770[34770]
+- Fix TestMultiEventForEOFRetryHandlerInput unit test of CometD input {pull}34903[34903]
+- Add input instance id to request trace filename for httpjson and cel inputs {pull}35024[35024]
+- Fixes "Can only start an input when all related states are finished" error when running under Elastic-Agent {pull}35250[35250] {issue}33653[33653]
+- [system] sync system/auth dataset with system integration 1.29.0. {pull}35581[35581]
+- [GCS Input] - Fixed an issue where bucket_timeout was being applied to the entire bucket poll interval and not individual bucket object read operations. Fixed a map write concurrency issue arising from data races when using a high number of workers. Fixed the flaky tests that were present in the GCS test suit. {pull}35605[35605]
+- Fixed concurrency and flakey tests issue in azure blob storage input. {issue}35983[35983] {pull}36124[36124]
+- Fix panic when sqs input metrics getter is invoked {pull}36101[36101] {issue}36077[36077]
+- Fix handling of Juniper SRX structured data when there is no leading junos element. {issue}36270[36270] {pull}36308[36308]
+- Fix Filebeat Cisco module with missing escape character {issue}36325[36325] {pull}36326[36326]
+- Added a fix for Crowdstrike pipeline handling process arrays {pull}36496[36496]
+- Fix m365_defender cursor value and query building. {pull}37116[37116]
+- Fix TCP/UDP metric queue length parsing base. {pull}37714[37714]
+- Update github.com/lestrrat-go/jwx dependency. {pull}37799[37799]
+- [threatintel] MISP pagination fixes {pull}37898[37898]
+- Fix file handle leak when handling errors in filestream {pull}37973[37973]
+- Prevent HTTPJSON holding response bodies between executions. {issue}35219[35219] {pull}38116[38116]

*Heartbeat*

2 changes: 1 addition & 1 deletion x-pack/filebeat/docs/inputs/input-httpjson.asciidoc
@@ -117,7 +117,7 @@ The state has the following elements:
- `body`: A map containing the body. References the next request body when used in <<request-transforms-headers>> or <<response-pagination>> configuration sections, and to the last response body when used in <<response-transforms>> or <<response-split>> configuration sections.
- `cursor`: A map containing any data the user configured to be stored between restarts (See <<cursor>>).

-All of the mentioned objects are only stored at runtime, except `cursor`, which has values that are persisted between restarts.
+All of the mentioned objects are only stored at runtime during the execution of the periodic request, except `cursor`, which has values that are persisted between periodic request and restarts.

[[transforms]]
==== Transforms
6 changes: 6 additions & 0 deletions x-pack/filebeat/input/httpjson/input.go
@@ -153,6 +153,12 @@ func run(ctx v2.Context, cfg config, pub inputcursor.Publisher, crsr *inputcurso
 	trCtx.cursor.load(crsr)

 	doFunc := func() error {
+		defer func() {
+			// Clear response bodies between evaluations.
+			trCtx.firstResponse.body = nil
+			trCtx.lastResponse.body = nil
+		}()
+
 		log.Info("Process another repeated request.")

 		startTime := time.Now()
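
The pattern in this hunk can be tried in isolation. The following is a minimal sketch, not the input's actual code: the `response` and `transformContext` types, the `[]byte` body type, and the `newExecution` and `fetch` names are simplified, hypothetical stand-ins chosen for illustration. It shows a closure that captures a long-lived context and uses a deferred function to drop the response bodies at the end of every execution, so they can be garbage collected between periodic runs.

```go
package main

import "fmt"

// response is a hypothetical stand-in for the input's response type;
// only the body field matters for this sketch.
type response struct {
	body []byte
}

// transformContext is a hypothetical stand-in for the long-lived context
// that the periodic closure captures.
type transformContext struct {
	firstResponse *response
	lastResponse  *response
}

// newExecution returns a closure that closes over trCtx. Without the
// deferred cleanup, the bodies stored on trCtx would stay reachable
// until the next execution overwrites them.
func newExecution(trCtx *transformContext, fetch func() []byte) func() error {
	return func() error {
		defer func() {
			// Clear response bodies at the end of each execution,
			// including when the execution returns early.
			trCtx.firstResponse.body = nil
			trCtx.lastResponse.body = nil
		}()

		trCtx.firstResponse.body = fetch()
		trCtx.lastResponse.body = fetch()

		total := len(trCtx.firstResponse.body) + len(trCtx.lastResponse.body)
		fmt.Printf("processed %d bytes\n", total)
		return nil
	}
}

func main() {
	trCtx := &transformContext{
		firstResponse: &response{},
		lastResponse:  &response{},
	}
	// Each simulated fetch allocates 1 MiB to mimic a large response.
	do := newExecution(trCtx, func() []byte { return make([]byte, 1<<20) })

	for i := 0; i < 3; i++ {
		if err := do(); err != nil {
			fmt.Println("execution failed:", err)
			return
		}
		// Between executions the context no longer references the bodies.
		fmt.Println("bodies cleared:",
			trCtx.firstResponse.body == nil && trCtx.lastResponse.body == nil)
	}
}
```

Placing the cleanup in a `defer` at the top of the closure means it runs on every return path, which is presumably why the commit uses that form rather than clearing the fields at the end of the function body.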