@eschabell
Collaborator

@eschabell eschabell commented Nov 24, 2025

Updated blob input plugin documentation with examples. Fixes #2185.

Summary by CodeRabbit

  • Documentation
    • Significantly expanded Blob input plugin docs with a clear functional overview, step‑by‑step behavior, and state-tracking details.
    • Reworked configuration guidance for all parameters (aliases, paths/globs, scanning/upload options, buffering, logging, storage, threading, tags) and added practical use cases and multiple example configurations.


Signed-off-by: Eric D. Schabell <eric@schabell.org>
@eschabell eschabell self-assigned this Nov 24, 2025
@eschabell eschabell requested review from a team as code owners November 24, 2025 08:44
@coderabbitai
Contributor

coderabbitai bot commented Nov 24, 2025

Walkthrough

Expanded and clarified the Blob input plugin documentation: detailed behavior, all configuration parameters, operational notes (database/WAL, storage), use cases, examples (YAML and fluent-bit.conf), and command-line/config usage.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Documentation Enhancement: `pipeline/inputs/blob.md` | Comprehensive rewrite and expansion: detailed plugin overview, parameter descriptions (`alias`, `database_file`, `exclude_pattern`, `log_level`, `log_suppress_interval`, `mem_buf_limit`, `path`, `routable`, `scan_refresh_interval`, `storage.*` options, `tag`, `threaded.*`, `upload_*`), "How it works", "Database file", "Use cases", multiple examples (YAML & fluent-bit.conf), and concrete usage snippets. |

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Scanner as Directory Scanner
    participant DB as State DB (WAL)
    participant Reader as File Reader
    participant Parser as Record Creator
    participant Forwarder as Fluent Bit Forwarder

    Note over Scanner,DB: Periodic scan (scan_refresh_interval)
    Scanner->>DB: query processed state
    DB-->>Scanner: list unprocessed files
    Scanner->>Reader: open & read file
    Reader->>Parser: produce record(s) (binary payload)
    Parser->>DB: persist processing state (WAL entry)
    Parser->>Forwarder: forward record with `tag`
    Forwarder-->>Scanner: acknowledge (implicit)
    Note over DB: Optional upload_* actions or file post-processing occur after forwarding
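
The flow in the diagram above can be sketched in a few lines of Python. This is only an illustration of the scan, state lookup, read, persist, and forward steps, not the plugin's actual implementation. The `processed` table, the helper names `open_state_db` and `scan_once`, and the record layout are all hypothetical and chosen for this example.

```python
import sqlite3
from pathlib import Path


def open_state_db(db_path):
    """Open (or create) a state database and request WAL journal mode."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")
    # Hypothetical schema: one row per file that was already processed.
    conn.execute("CREATE TABLE IF NOT EXISTS processed (path TEXT PRIMARY KEY)")
    return conn


def scan_once(directory, conn, tag, forward):
    """Run one scan pass: read each unprocessed file, record it, forward it."""
    emitted = 0
    for file_path in sorted(Path(directory).iterdir()):
        if not file_path.is_file():
            continue
        seen = conn.execute(
            "SELECT 1 FROM processed WHERE path = ?", (str(file_path),)
        ).fetchone()
        if seen:
            continue  # already handled in an earlier pass
        payload = file_path.read_bytes()  # binary payload of the record
        conn.execute("INSERT INTO processed (path) VALUES (?)", (str(file_path),))
        conn.commit()  # persist processing state before forwarding
        forward(tag, {"path": str(file_path), "payload": payload})
        emitted += 1
    return emitted
```

Calling `scan_once` repeatedly, for example once per `scan_refresh_interval`, forwards each file only once, because the second pass finds the file in the state table and skips it.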

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Single documentation file changed, consistent expansion and clarification.
  • Review focus:
    • Accuracy of parameter semantics (units, defaults).
    • Examples and YAML/fluent-bit.conf correctness.
    • Database/WAL behavior description and storage options.

Suggested labels

waiting-on-review

Suggested reviewers

  • esmerel
  • alexakreizinger

Poem

🐰 I hopped through docs to tidy the trail,

Files scanned and tagged without fail,
DB keeps track, WAL hums along,
Examples ready, neat and strong,
Hop, deploy — logs now sing their song! 🥕📦

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: updating blob input plugin documentation and adding examples, directly addressing the linked issue #2185. |
| Linked Issues check | ✅ Passed | The pull request fulfills the objectives from issue #2185 by expanding documentation, correcting issues, and adding comprehensive sections with examples and use cases. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to blob input plugin documentation improvements, with no unrelated modifications to code or other plugin documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (1)
pipeline/inputs/blob.md (1)

16-16: Reduce repetitive sentence structure in mem_buf_limit description.

Three consecutive sentences begin with "If", which impacts readability. Restructure to improve flow.

-| `mem_buf_limit`                     | Set a memory buffer limit for the input plugin instance in bytes. If the limit is reached, the plugin will pause until the buffer is drained. If set to `0`, the buffer limit is disabled. If the plugin has enabled filesystem buffering, this limit won't apply. The value must be according to the [Unit Size](../../administration/configuring-fluent-bit/unit-sizes.md) specification. | `0`      |
+| `mem_buf_limit`                     | Set a memory buffer limit for the input plugin instance in bytes. When the limit is reached, the plugin will pause until the buffer is drained. Set to `0` to disable the buffer limit. Note: filesystem buffering may override this limit. The value must be according to the [Unit Size](../../administration/configuring-fluent-bit/unit-sizes.md) specification. | `0`      |
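
The pause-and-drain behavior that this description refers to can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not Fluent Bit code: the class name `BoundedInputBuffer`, its methods, and the choice to resume as soon as usage drops below the limit are all hypothetical.

```python
class BoundedInputBuffer:
    """Toy model of an input buffer with a byte limit (0 disables the limit)."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self.paused = False

    def ingest(self, chunk):
        """Buffer a chunk; return False if the input is currently paused."""
        if self.paused:
            return False
        self.used_bytes += len(chunk)
        # A limit of 0 is falsy, so the pause check is skipped entirely.
        if self.limit_bytes and self.used_bytes >= self.limit_bytes:
            self.paused = True
        return True

    def drain(self, nbytes):
        """Release buffered bytes; resume once usage is back under the limit."""
        self.used_bytes = max(0, self.used_bytes - nbytes)
        if not self.limit_bytes or self.used_bytes < self.limit_bytes:
            self.paused = False
```

With a limit of 10 bytes, ingesting a 10-byte chunk pauses the buffer, further chunks are refused, and draining the buffer lets ingestion continue. With a limit of `0`, the buffer never pauses.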
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d548103 and a44d59c.

📒 Files selected for processing (1)
  • pipeline/inputs/blob.md (2 hunks)
🧰 Additional context used
🪛 GitHub Actions: Lint PRs
pipeline/inputs/blob.md

[warning] 3-3: FluentBit.Simplicity: Avoid words like "useful" that imply ease of use, because the user may find this action difficult.


[warning] 11-11: FluentBit.Simplicity: Avoid words like "Useful" that imply ease of use, because the user may find this action difficult.


[warning] 18-18: FluentBit.Simplicity: Avoid words like "Useful" that imply ease of use, because the user may find this action difficult.

🪛 GitHub Check: runner / vale
pipeline/inputs/blob.md

[warning] 61-61:
[vale] reported by reviewdog 🐶
[FluentBit.Simplicity] Avoid words like "useful" that imply ease of use, because the user may find this action difficult.

Raw Output:
{"message": "[FluentBit.Simplicity] Avoid words like \"useful\" that imply ease of use, because the user may find this action difficult.", "location": {"path": "pipeline/inputs/blob.md", "range": {"start": {"line": 61, "column": 26}}}, "severity": "WARNING"}


[failure] 19-19:
[vale] reported by reviewdog 🐶
[FluentBit.Latin] Use 'for example' instead of 'e.g.,'.

Raw Output:
{"message": "[FluentBit.Latin] Use 'for example' instead of 'e.g.,'.", "location": {"path": "pipeline/inputs/blob.md", "range": {"start": {"line": 19, "column": 293}}}, "severity": "ERROR"}


[warning] 18-18:
[vale] reported by reviewdog 🐶
[FluentBit.Simplicity] Avoid words like "Useful" that imply ease of use, because the user may find this action difficult.

Raw Output:
{"message": "[FluentBit.Simplicity] Avoid words like \"Useful\" that imply ease of use, because the user may find this action difficult.", "location": {"path": "pipeline/inputs/blob.md", "range": {"start": {"line": 18, "column": 171}}}, "severity": "WARNING"}


[warning] 11-11:
[vale] reported by reviewdog 🐶
[FluentBit.Simplicity] Avoid words like "Useful" that imply ease of use, because the user may find this action difficult.

Raw Output:
{"message": "[FluentBit.Simplicity] Avoid words like \"Useful\" that imply ease of use, because the user may find this action difficult.", "location": {"path": "pipeline/inputs/blob.md", "range": {"start": {"line": 11, "column": 104}}}, "severity": "WARNING"}


[warning] 3-3:
[vale] reported by reviewdog 🐶
[FluentBit.Simplicity] Avoid words like "useful" that imply ease of use, because the user may find this action difficult.

Raw Output:
{"message": "[FluentBit.Simplicity] Avoid words like \"useful\" that imply ease of use, because the user may find this action difficult.", "location": {"path": "pipeline/inputs/blob.md", "range": {"start": {"line": 3, "column": 228}}}, "severity": "WARNING"}
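
To show how a word-based style rule like the one reported above yields line and column positions, here is a minimal Python sketch. It is not how Vale itself is implemented. The function `find_flagged_words` and the `FLAGGED_WORDS` list are hypothetical, and the sketch assumes 1-based line and column numbers.

```python
import re

# Hypothetical word list; the real rule may flag more words than this.
FLAGGED_WORDS = ("useful",)


def find_flagged_words(text, words=FLAGGED_WORDS):
    """Return (line, column, matched_word) tuples for each flagged word."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in words) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            # match.start() is 0-based, so add 1 for a 1-based column.
            hits.append((line_no, match.start() + 1, match.group(0)))
    return hits
```

Running it over a draft before pushing gives a quick local approximation of the warnings, although the CI linter remains the authority on what passes.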

🪛 LanguageTool
pipeline/inputs/blob.md

[style] ~16-~16: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...t to 0, the buffer limit is disabled. If the plugin has enabled filesystem buffe...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🔇 Additional comments (1)
pipeline/inputs/blob.md (1)

123-250: Comprehensive examples with realistic configurations.

The three example scenarios (database tracking, file exclusion with filesystem storage, and post-processing file actions) effectively demonstrate practical use cases. Concrete paths and properly formatted YAML/conf pairs provide clear implementation guidance.

Signed-off-by: Eric D. Schabell <eric@schabell.org>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (4)
pipeline/inputs/blob.md (4)

3-3: Replace subjective language "useful" with objective phrasing.

This word was flagged in the previous review and should be replaced per the style guide to avoid implying ease of use.

-The _Blob_ input plugin monitors a directory and processes binary (blob) files. It scans the specified path at regular intervals, reads binary files, and forwards them as records through the Fluent Bit pipeline. This plugin is useful for processing binary log files, artifacts, or any binary data that needs to be collected and forwarded to outputs.
+The _Blob_ input plugin monitors a directory and processes binary (blob) files. It scans the specified path at regular intervals, reads binary files, and forwards them as records through the Fluent Bit pipeline. This plugin applies to processing binary log files, artifacts, or any binary data that needs to be collected and forwarded to outputs.

11-11: Replace "Useful" with objective language in alias parameter description.

This word was flagged in the previous review and violates the style guide.

-| `alias`                             | Sets an alias for multiple instances of the same input plugin. Useful when you need to run multiple blob input instances with different configurations.                                                                                                                                                                                                  | _none_   |
+| `alias`                             | Sets an alias for multiple instances of the same input plugin. When running multiple blob input instances with different configurations, each can be assigned a unique alias for identification.                                                                                                                                                                                                  | _none_   |

18-18: Revise routable parameter description for objective language.

The current wording "Use this for testing" is imperative rather than objective. Align with the previous review suggestion for clarity.

-| `routable`                          | If `true`, the data generated by the plugin can be forwarded to other plugins or outputs. If `false`, the data will be discarded. Use this for testing or when you want to process data but not forward it.                                                                                                                                                | `true`   |
+| `routable`                          | If `true`, the data generated by the plugin can be forwarded to other plugins or outputs. If `false`, the data will be discarded. Set to `false` for testing scenarios or when processing data without forwarding to outputs.                                                                                                                                | `true`   |

59-66: Revise "Use cases" section heading for style guide compliance.

The current phrasing "The Blob input plugin common use cases are:" is grammatically awkward. The previous review suggested more objective phrasing.

-The Blob input plugin common use cases are:
+The Blob input plugin applies to the following scenarios:
🧹 Nitpick comments (1)
pipeline/inputs/blob.md (1)

16-16: Reduce repetitive sentence structure; three consecutive sentences begin with "If".

While the content is accurate, consider rewording to improve readability and reduce the repetitive structure flagged by LanguageTool.

-| `mem_buf_limit`                     | Set a memory buffer limit for the input plugin instance in bytes. If the limit is reached, the plugin will pause until the buffer is drained. If set to `0`, the buffer limit is disabled. If the plugin has enabled filesystem buffering, this limit won't apply. The value must be according to the [Unit Size](../../administration/configuring-fluent-bit/unit-sizes.md) specification. | `0`      |
+| `mem_buf_limit`                     | Set a memory buffer limit for the input plugin instance in bytes. When the limit is reached, the plugin pauses until the buffer is drained; set to `0` to disable. Filesystem buffering, if enabled, will not be subject to this limit. The value must be according to the [Unit Size](../../administration/configuring-fluent-bit/unit-sizes.md) specification. | `0`      |
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a44d59c and 7648716.

📒 Files selected for processing (1)
  • pipeline/inputs/blob.md (2 hunks)
🧰 Additional context used
🪛 GitHub Check: runner / vale
pipeline/inputs/blob.md

[warning] 11-11:
[vale] reported by reviewdog 🐶
[FluentBit.Simplicity] Avoid words like "Useful" that imply ease of use, because the user may find this action difficult.

Raw Output:
{"message": "[FluentBit.Simplicity] Avoid words like \"Useful\" that imply ease of use, because the user may find this action difficult.", "location": {"path": "pipeline/inputs/blob.md", "range": {"start": {"line": 11, "column": 104}}}, "severity": "WARNING"}


[warning] 3-3:
[vale] reported by reviewdog 🐶
[FluentBit.Simplicity] Avoid words like "useful" that imply ease of use, because the user may find this action difficult.

Raw Output:
{"message": "[FluentBit.Simplicity] Avoid words like \"useful\" that imply ease of use, because the user may find this action difficult.", "location": {"path": "pipeline/inputs/blob.md", "range": {"start": {"line": 3, "column": 228}}}, "severity": "WARNING"}

🪛 LanguageTool
pipeline/inputs/blob.md

[style] ~16-~16: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...t to 0, the buffer limit is disabled. If the plugin has enabled filesystem buffe...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🔇 Additional comments (1)
pipeline/inputs/blob.md (1)

33-66: Excellent expansion of documentation with new sections and examples.

The additions of "How it works", "Database file", and "Use cases" sections significantly improve clarity. The examples in both YAML and fluent-bit.conf formats are well-structured, practical, and demonstrate realistic configurations. The database file explanation with SQLite3/WAL details is particularly helpful for users operating Fluent Bit in production.

@eschabell
Collaborator Author

AI reviews passed, so merging.

@eschabell eschabell merged commit a060b43 into fluent:master Nov 24, 2025
8 checks passed
@eschabell eschabell deleted the erics_in_blob_updates branch November 24, 2025 08:55
Development

Successfully merging this pull request may close these issues.

Blob input plugin - general updates to provide more complete documentation

1 participant