Description
Bug Report
Describe the bug
We are running Fluent Bit in Kubernetes pods to upload logs to S3 buckets. We are using the multipart upload approach, but we see a lot of small files being uploaded to S3. Our Fluent Bit setup is as follows:
- There are n publisher pods that forward logs to a single aggregator pod on a certain port.
- This aggregator pod listens on the port and writes to s3 bucket.
Issues:
- We are using multipart S3 upload, yet we see a lot of small files (a few KB each) being uploaded to the S3 bucket.
- Objects are created separately for each publisher in the S3 bucket; no merging happens.
Ideally, logs from all publishers would be merged into a single S3 object until the total_file_size limit is hit, after which a new object is started and the process repeats.
Please help with the fluent-bit configuration that would let us achieve this.
Our configuration is as below:
PUBLISHER :
input.conf: |
[INPUT]
Name tail
Tag kube.<namespace_name>.<pod_name>.<container_name>.<docker_id>
Tag_Regex (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})
Path /var/log/containers/*.log
Parser cri
DB /var/log/flb_kube_system_pub.db
Mem_Buf_Limit 100MB
Buffer_Chunk_Size 32k
Buffer_Max_Size 32k
Skip_Long_Lines On
Refresh_Interval 5
output.conf: |
[OUTPUT]
Name forward
Match kube.*
Host ${AGGREGATOR_HOST}
Port ${AGGREGATOR_PORT}
AGGREGATOR :
input.conf: |
[INPUT]
Name forward
Listen 0.0.0.0
Port ${AGGREGATOR_PORT}
Buffer_Chunk_Size 1M
Buffer_Max_Size 6M
output.conf: |
[OUTPUT]
Name s3
Match kube.*
bucket test-bucket
region us-west-1
s3_key_format /security/unknown/%Y/%m/%d/
s3_key_format_tag_delimiters .
total_file_size 5M
upload_timeout 3m
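For illustration, a minimal sketch of the aggregator-side behavior we are after. It assumes (not confirmed here) that the S3 output buffers and uploads separately per tag, so giving every record the same tag before it reaches the output would let them share one buffer. The tag name `merged`, the emitter name, and the use of the `stream` key (which the cri parser is assumed to emit) are hypothetical choices for this sketch:

```
# Aggregator: collapse all per-pod tags into one tag before the S3 output.
[FILTER]
    Name          rewrite_tag
    Match         kube.*
    # Rule format: $KEY REGEX NEW_TAG KEEP
    # Assumption: every record carries a "stream" key (stdout/stderr) from the cri parser.
    # KEEP=false drops the record under its original kube.* tag.
    Rule          $stream .* merged false
    Emitter_Name  s3_merge_emitter

[OUTPUT]
    Name             s3
    # Match the rewritten tag only; "merged" does not match kube.*,
    # so the filter above does not re-process its own output.
    Match            merged
    bucket           test-bucket
    region           us-west-1
    s3_key_format    /security/unknown/%Y/%m/%d/
    total_file_size  5M
    upload_timeout   3m
```

With a single tag, the per-pod information is no longer available in the tag itself, so any `$TAG`-based key formatting would not distinguish publishers.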
Your Environment
- Version used: amazon/aws-for-fluent-bit:2.27.0
- Configuration:
- Environment name and version (e.g. Kubernetes? What version?): Kubernetes
  Client Version: v1.25.2
  Kustomize Version: v4.5.7
  Server Version: v1.24.0
- Operating System and version: Linux 4.19.84-33.70.amzn2.x86_64
- Filters and plugins:
Additional context
We are aggregating logs from the different pods running in our cluster and pushing them to S3, where they are later used for debugging. We want to avoid uploading a lot of small objects to S3. As the system grows, the number of calls to S3 will become exorbitant.
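One setting that may contribute to the small objects is upload_timeout: as documented for the S3 output, once that interval elapses the plugin completes the upload and starts a new object even if total_file_size has not been reached. With a 3m timeout and low log volume, objects of a few KB would be expected. A hedged sketch of a variant with a longer timeout and a larger size limit follows; the values 50M and 60m are illustrative assumptions, not tested recommendations:

```
[OUTPUT]
    Name             s3
    Match            kube.*
    bucket           test-bucket
    region           us-west-1
    # $UUID is a documented key variable; it keeps object names unique.
    s3_key_format    /security/unknown/%Y/%m/%d/$UUID
    # Illustrative values: larger objects, finalized at most once per hour.
    total_file_size  50M
    upload_timeout   60m
```

The trade-off is latency: logs stay in the local buffer for up to the timeout before they appear in the bucket.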
