fix high IO after sudden filebeat stop (elastic#35893)
If the log file is corrupted (which is likely after a sudden, unclean
system shutdown), we set a flag that forces an immediate checkpoint, but
the flag is never cleared afterwards. As a result, filebeat checkpoints
on every log operation, which causes a high IO load on the server and
makes filebeat fall behind.

This change resets the logInvalid flag after a successful checkpoint.
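
The effect described above can be reproduced with a minimal sketch. This is not the memlog implementation: `toyStore`, `maxLogSize`, `resetOnSuccess`, and `countCheckpoints` are hypothetical names introduced for illustration, the size threshold stands in for `checkpointPred`, and the sketch assumes that a checkpoint starts a fresh log. Only `logInvalid`, `mustCheckpoint`, and `resetLogState` mirror names from the diff.

```go
package main

// toyStore is a hypothetical, heavily simplified stand-in for the memlog
// diskstore. It only tracks what is needed to count checkpoints.
type toyStore struct {
	logInvalid     bool // set when the on-disk log was found to be corrupted
	logFileSize    int  // bytes appended since the last checkpoint
	maxLogSize     int  // assumed size threshold, standing in for checkpointPred
	resetOnSuccess bool // true models the behaviour after this change
	checkpoints    int  // number of checkpoints written so far
}

// mustCheckpoint has the same shape as the predicate shown in the diff.
func (s *toyStore) mustCheckpoint() bool {
	return s.logInvalid || s.logFileSize >= s.maxLogSize
}

func (s *toyStore) resetLogState() {
	s.logInvalid = false
}

// logOperation models one state update of opSize bytes. Checkpoints in this
// sketch always succeed.
func (s *toyStore) logOperation(opSize int) {
	if s.mustCheckpoint() {
		// Assumption: a checkpoint rewrites the full state and starts a fresh log.
		s.checkpoints++
		s.logFileSize = 0
		if s.resetOnSuccess {
			s.resetLogState()
		}
		return
	}
	s.logFileSize += opSize
}

// countCheckpoints starts from a corrupted log, applies ops operations of
// 100 bytes each, and returns how many checkpoints were written.
func countCheckpoints(resetOnSuccess bool, ops int) int {
	s := &toyStore{logInvalid: true, maxLogSize: 1 << 20, resetOnSuccess: resetOnSuccess}
	for i := 0; i < ops; i++ {
		s.logOperation(100)
	}
	return s.checkpoints
}
```

Under these assumptions, `countCheckpoints(false, 1000)` writes a checkpoint for every one of the 1000 operations, while `countCheckpoints(true, 1000)` writes a single checkpoint and then goes back to appending to the log.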
emmanueltouzery committed May 3, 2024
1 parent 4e6d762 commit 911ed90
Showing 2 changed files with 7 additions and 0 deletions.
4 changes: 4 additions & 0 deletions libbeat/statestore/backend/memlog/diskstore.go
@@ -213,6 +213,10 @@ func (s *diskstore) mustCheckpoint() bool {
 	return s.logInvalid || s.checkpointPred(s.logFileSize)
 }

+func (s *diskstore) resetLogState() {
+	s.logInvalid = false
+}
+
 func (s *diskstore) Close() error {
 	if s.logFile != nil {
 		// always sync log file on ordinary shutdown.
3 changes: 3 additions & 0 deletions libbeat/statestore/backend/memlog/store.go
@@ -227,6 +227,9 @@ func (s *store) logOperation(op op) error {
 			// appending the log operation.
 			// idea: make append configurable and retry checkpointing with backoff.
 			_ = s.disk.LogOperation(op)
+		} else {
+			// after successfully checkpointing, reset the logInvalid flag.
+			s.disk.resetLogState()
 		}

 		return err
