Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

s3 output: resolve stale-detection races #19

Merged

Conversation

yaauie
Copy link
Contributor

@yaauie yaauie commented Dec 23, 2022

This is largely a forward-port of logstash-plugins/logstash-output-s3#252 with some minor changes to deal with the integrated plugin's usage of the java-native ConcurrentHashMap in place of the stand-alone plugin's ruby-native Concurrent::Map.

Refactor of S3::FileRepository to avoid several closely-related race conditions:

  • prevent get_factory() from yielding a factory that was mid-deletion by the stale watcher, which could cause the plugin crash due to the file no longer existing on disk. This is solved by marking a factory's prefix wrapper as deleted while the stale watcher has exclusive access to it, and checking for deletion status before yielding exclusive access to a prefix wrapper's factory.
  • introduce each_factory, which avoids creating new factories or yielding deleted ones.
  • refactor each_files to use new each_factory to avoid yielding files whose factories have been deleted.
  • void-return methods now explicitly emit nil to prevent accidental leaks of synchronization-required resources.

Additionally, S3#rotate_if_needed was migrated to use the now-safer S3::FileRepository#each_factory that avoids initializing new factories (and therefore avoids creating empty files on disk after the existing ones had been stale-reaped).

This is largely a forward-port of logstash-plugins/logstash-output-s3#252 with
some minor changes to deal with the integrated plugin's usage of the
java-native `ConcurrentHashMap` in place of the stand-alone plugin's
ruby-native `Concurrent::Map`.

Refactor of `S3::FileRepository` to avoid several closely-related race
conditions:

 - prevent `get_factory()` from yielding a factory that was
   mid-deletion by the stale watcher, which could cause the plugin
   crash due to the file no longer existing on disk. This is solved
   by marking a factory's prefix wrapper as deleted while the stale
   watcher has exclusive access to it, and checking for deletion
   status before yielding exclusive access to a prefix wrapper's factory.
 - introduce `each_factory`, which _avoids_ creating new factories
   or yielding deleted ones.
 - refactor `each_files` to use new `each_factory` to avoid yielding
   files whose factories have been deleted.
 - void-return methods now explicitly emit `nil` to prevent accidental
   leaks of synchronization-required resources.

Additionally, `S3#rotate_if_needed` was migrated to use the now-safer
`S3::FileRepository#each_factory` that _avoids_ initializing new
factories (and therefore avoids creating empty files on disk after
the existing ones had been stale-reaped).
Copy link
Contributor

@mashhurs mashhurs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CHANGELOG.md Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@yaauie yaauie merged commit 4f3b1fe into logstash-plugins:main Dec 23, 2022
@yaauie yaauie deleted the s3-output-stale-detection-race-conditions branch December 23, 2022 20:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants