S3 Multi part upload is not uploading the final chunk correctly #2962
I believe this is explained better in #2905 (comment), but I think this behavior needs to be enhanced, because we lose the continuity of the log and clients are expected to stitch the logs together to recreate the chronological order of the logs.
I don't understand this... if you are uploading logs over time, you will always end up with multiple log files. Are you collecting a fixed amount of logs and want them all to be in the same file in S3? Also, the logs shouldn't be out of order in this case (though the plugin also can't guarantee that logs will necessarily be in order across files). The reason I chose to implement it this way is that I felt it was safer. Basically, on shutdown the plugin wants to complete the in-progress multipart upload as it stands, and then send any remaining buffered data as a separate, independent upload.
Essentially, you're requesting that the plugin chain together two calls to upload the remaining data, which I think is slightly riskier than having the calls be independent. I could change the code to do what you are asking for, but the question for me is still: why? I understand it's a bit weird to have two files, but it's not clear to me why that is a real problem.
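For illustration, here is a minimal Python sketch of the two shutdown strategies under discussion. This is not fluent-bit's actual code; the function and object names are hypothetical, and the sizes are only bookkeeping numbers standing in for S3 API calls (UploadPart, CompleteMultipartUpload, PutObject):

```python
def shutdown_independent(uploaded_parts, leftover):
    """Current behavior: complete the multipart upload as-is, then
    flush any leftover buffered data as a separate, independent
    upload. Returns the resulting S3 objects as (key, size) pairs."""
    objects = [("multipart.log", sum(uploaded_parts))]
    if leftover:
        objects.append(("leftover.log", leftover))
    return objects


def shutdown_chained(uploaded_parts, leftover):
    """Requested behavior: upload the leftover buffer as one final
    part, then complete the multipart upload -- a single object."""
    parts = uploaded_parts + ([leftover] if leftover else [])
    return [("multipart.log", sum(parts))]


# Three 5 MiB parts already uploaded, 2 MiB still buffered at shutdown.
MiB = 1024 * 1024
print(shutdown_independent([5 * MiB] * 3, 2 * MiB))  # two objects
print(shutdown_chained([5 * MiB] * 3, 2 * MiB))      # one object
```

The chained version produces a single key, but it makes the final CompleteMultipartUpload depend on one more UploadPart succeeding first, which is the added risk described above.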
Hi, thanks for the reply. The use case here is that we have a long-running job (for example, a build command which runs for 45 minutes to 2 hours) via an ECS task. Once the job is complete, the same task will not run again, and the EC2 machines will be used for some other build at a later point in time. So we would like to capture all the logs belonging to that particular task, which we can use for debugging purposes. Perhaps you can make it configurable so that the current functionality is retained for other use cases.
Yes, this is correct. Since we know a task runs for a given amount of time, we would like to collect all of its logs into a single file. Thanks
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Hello @PettitWesley, any update on this request? By the way, we also spoke to our AWS TAM, who said they will contact you to get priority for this particular use case. Thanks
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity. |
Bug Report
Describe the bug
When the S3 plugin is set up to use multipart upload to S3 with a chunk size, the final chunk is not uploaded properly: it is written as a separate file instead of becoming part of the multipart upload. In the example below, the chunk size was set to 5M and the overall log size is about 17M, but I see 2 files with 11.7
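The arithmetic behind the reported split can be sketched as follows (the exact byte counts in the report are approximate; this only shows why a remainder smaller than one chunk is left over at shutdown):

```python
def split_into_chunks(total_size, chunk_size):
    """Return (number of full chunks, remainder in bytes) for a
    stream of total_size bytes flushed in chunk_size pieces."""
    return total_size // chunk_size, total_size % chunk_size


M = 1024 * 1024
full, rest = split_into_chunks(17 * M, 5 * M)
# Three full 5M parts go into the multipart upload; the 2M
# remainder is what ends up in the second, separate S3 object
# when the plugin shuts down.
print(full, rest // M)  # 3 2
```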
To Reproduce
```
[SERVICE]
    Flush 1
    Grace 120

[OUTPUT]
    Name file
    Match *
    Path /tmp
    File ${TaskId}.log
    Format plain

#[OUTPUT]
#    Name stdout
#    Match *

[OUTPUT]
    Name s3
    Match *
    region us-west-2
    bucket raviteja-test
    s3_key_format /fluent-bit-logs/${TaskId}.log
```
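The config above does not show where the 5M chunk size mentioned in the description was set. Assuming it was set through the plugin's `upload_chunk_size` option, the s3 output section would look something like this (the `total_file_size` value here is purely illustrative):

```
[OUTPUT]
    Name s3
    Match *
    region us-west-2
    bucket raviteja-test
    s3_key_format /fluent-bit-logs/${TaskId}.log
    upload_chunk_size 5M
    total_file_size 50M
```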
Expected behavior
Expected a single file 12.log
Screenshots
See the final chunk not being part of the multipart upload
![image](https://user-images.githubusercontent.com/24725089/105399255-62823300-5c49-11eb-8c41-4a92074a7854.png)
S3 bucket looks like this
![image](https://user-images.githubusercontent.com/24725089/105399384-8e051d80-5c49-11eb-9a45-c9f423968b8d.png)