Commit 1bc8a75

file_chunk: add stricter checks for broken meta files (#4998)
**Which issue(s) this PR fixes**:

* Related to #3970

**What this PR does / why we need it**:

This PR improves corruption checking for meta files. A meta file contains at least the following fields:
https://github.com/fluent/fluentd/blob/fa2eb58922e1c36f83bf1d5243b325a860f72864/lib/fluent/plugin/buffer/file_chunk.rb#L249-L254

This PR reinforces #1874.

Without this change, fluentd may log the following error on every launch when a meta file is broken:

```
2025-06-06 12:11:26 +0900 [error]: unexpected error while checking flushed chunks. ignored. error_class=NoMethodError error="undefined method '<' for nil"
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin/output.rb:1479:in 'block in Fluent::Plugin::Output#enqueue_thread_run'
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin/buffer.rb:548:in 'block in Fluent::Plugin::Buffer#enqueue_all'
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin/buffer.rb:542:in 'Array#each'
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin/buffer.rb:542:in 'Fluent::Plugin::Buffer#enqueue_all'
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin/output.rb:1479:in 'Fluent::Plugin::Output#enqueue_thread_run'
  2025-06-06 12:11:26 +0900 [error]: /Users/watson/src/fluentd/lib/fluent/plugin_helper/thread.rb:78:in 'block in Fluent::PluginHelper::Thread#thread_create'
```

This error occurs when the timekey value is corrupted. Since there is no appropriate way to check timekey directly, this PR checks the `id`, `c`, and `m` fields instead, because when timekey is broken, other fields may also be broken.

`@size` can legitimately be 0. `@unique_id`, `@created_at`, and `@modified_at`, by contrast, are set when FileChunk is initialized, so they always have values, and I think they should always be written to the meta file. This PR therefore adds checks for the `id`, `c`, and `m` fields.

Previously, fluentd fell back to default values when the metadata was broken. That fallback could hide the corruption and lead to unexpected errors later. This PR instead strengthens detection of broken meta files rather than substituting default values.

This change is backward compatible with v0.14 behavior.

**Docs Changes**:

Not necessarily required.

**Release Note**:

buf_file: reinforce buffer file corruption check

---------

Signed-off-by: Shizuo Fujita <fujita@clear-code.com>
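As a minimal illustration of the error class in the log above: calling `<` on `nil` raises `NoMethodError`. This sketch assumes, as the log message suggests, that a broken meta file leaves the timekey as `nil`. The helper name and its arguments are hypothetical, and this is not the fluentd code path itself.

```ruby
# Hypothetical helper (not part of fluentd): performs the kind of comparison
# that fails when the timekey was lost, and returns the raised exception.
def timekey_comparison_error(timekey, current_timekey)
  timekey < current_timekey
  nil
rescue NoMethodError => e
  e
end

# A nil timekey stands in for a value lost to meta file corruption.
error = timekey_comparison_error(nil, Time.now.to_i)
puts error.class
puts error.message

# A valid integer timekey compares without raising, so nil is returned.
puts timekey_comparison_error(1_749_178_800, Time.now.to_i).inspect
```

Because the error is raised long after the meta file is read, rejecting the broken file at restore time is what lets the existing backup handling take over.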
1 parent 1a773b0 commit 1bc8a75

2 files changed: +37 -3 lines changed

lib/fluent/plugin/buffer/file_chunk.rb

Lines changed: 7 additions & 3 deletions
@@ -219,13 +219,17 @@ def restore_metadata(bindata)
     # old type of restore
     data = Fluent::MessagePackFactory.msgpack_unpacker(symbolize_keys: true).feed(bindata).read rescue {}
   end
+  raise FileChunkError, "invalid meta data" if data.nil? || !data.is_a?(Hash)
+  raise FileChunkError, "invalid unique_id" unless data[:id]
+  raise FileChunkError, "invalid created_at" unless data[:c].to_i > 0
+  raise FileChunkError, "invalid modified_at" unless data[:m].to_i > 0
 
   now = Fluent::Clock.real_now
 
-  @unique_id = data[:id] || self.class.unique_id_from_path(@path) || @unique_id
+  @unique_id = data[:id]
   @size = data[:s] || 0
-  @created_at = data.fetch(:c, now.to_i)
-  @modified_at = data.fetch(:m, now.to_i)
+  @created_at = data[:c]
+  @modified_at = data[:m]
 
   @metadata.timekey = data[:timekey]
   @metadata.tag = data[:tag]
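The four checks added to `restore_metadata` can be exercised in isolation with a plain Hash. The following is a minimal sketch, not the fluentd implementation: `MetaError`, `validate_meta!`, and `meta_error_for` are hypothetical names introduced here so the example runs without fluentd loaded. It relies on `nil.to_i` returning 0, which is why a missing `c` or `m` field fails the `> 0` test.

```ruby
# Hypothetical stand-in for Fluent's FileChunkError so the sketch is self-contained.
class MetaError < StandardError; end

# Applies the same four conditions as the diff to an already-decoded value.
def validate_meta!(data)
  raise MetaError, "invalid meta data" if data.nil? || !data.is_a?(Hash)
  raise MetaError, "invalid unique_id" unless data[:id]
  # nil.to_i is 0, so a missing field is rejected along with a zero timestamp.
  raise MetaError, "invalid created_at" unless data[:c].to_i > 0
  raise MetaError, "invalid modified_at" unless data[:m].to_i > 0
  data
end

# Returns the validation error message, or nil when the data passes.
def meta_error_for(data)
  validate_meta!(data)
  nil
rescue MetaError => e
  e.message
end

puts meta_error_for({ id: "abc", c: 1_749_178_800, m: 1_749_178_800 }).inspect
puts meta_error_for({ c: 1_749_178_800, m: 1_749_178_800 })
puts meta_error_for({ id: "abc", m: 1_749_178_800 })
puts meta_error_for(nil)
```

Raising early means the assignments that follow no longer need fallbacks, which matches the removal of `data.fetch(:c, now.to_i)` and the `unique_id_from_path` fallback in the diff.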

test/plugin/test_buf_file.rb

Lines changed: 30 additions & 0 deletions
@@ -1322,6 +1322,36 @@ def compare_log(plugin, msg)
     assert { File.exist?("#{@bufdir}/backup/worker0/#{@id_output}/#{@d.dump_unique_id_hex(c2id)}.log") }
   end
 
+  test '#resume backups enqueued broken metadata which has broken id, c, m fields' do
+    setup_plugins({'path' => @bufpath})
+    cid, path = create_first_chunk('q')
+    metadata = File.read(path + '.meta')
+    File.open(path + '.meta', 'wb') { |f| f.write(metadata[0..6] + "\0" * (metadata.size - 6)) } # create enqueued broken meta file
+
+    Fluent::SystemConfig.overwrite_system_config('root_dir' => @bufdir) do
+      @p.start
+    end
+
+    compare_log(@p, 'enqueued meta file is broken')
+    assert { not File.exist?(path) }
+    assert { File.exist?("#{@bufdir}/backup/worker0/#{@id_output}/#{@d.dump_unique_id_hex(cid)}.log") }
+  end
+
+  test '#resume backups enqueued broken metadata by truncated' do
+    setup_plugins({'path' => @bufpath})
+    cid, path = create_first_chunk('q')
+    metadata = File.read(path + '.meta')
+    File.open(path + '.meta', 'wb') { |f| f.write(metadata[0..-2]) } # create enqueued broken meta file with last byte truncated
+
+    Fluent::SystemConfig.overwrite_system_config('root_dir' => @bufdir) do
+      @p.start
+    end
+
+    compare_log(@p, 'enqueued meta file is broken')
+    assert { not File.exist?(path) }
+    assert { File.exist?("#{@bufdir}/backup/worker0/#{@id_output}/#{@d.dump_unique_id_hex(cid)}.log") }
+  end
+
   test '#resume throws away broken chunk with disable_chunk_backup' do
     setup_plugins({'path' => @bufpath, 'disable_chunk_backup' => true})
     c1id, _ = create_first_chunk('b')
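The two new tests corrupt the meta file with plain string slicing. The sketch below applies the same two expressions to a hypothetical 12-byte payload (`sample_meta`, `zero_filled`, and `truncated` are illustrative names, not test helpers from the repository) to show what bytes each one produces.

```ruby
# Hypothetical 12-byte payload standing in for the contents of a .meta file.
def sample_meta
  "0123456789AB".b
end

# First test's expression: keep bytes 0..6 (7 bytes), then append NUL bytes.
def zero_filled(meta)
  meta[0..6] + "\0" * (meta.size - 6)
end

# Second test's expression: drop only the last byte.
def truncated(meta)
  meta[0..-2]
end

puts zero_filled(sample_meta).bytesize
puts zero_filled(sample_meta).inspect
puts truncated(sample_meta).inspect
```

As written, the first expression yields a string one byte longer than its input (7 kept bytes plus `size - 6` NUL bytes). That does not matter for the test's purpose, since everything after the seventh byte is overwritten either way.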
