S3 Output Compression not working #3676
I have the same issue. I'm wondering if setting the content encoding to gzip is the problem. Does S3 automatically decompress the file on its side?
I know of no AWS S3 function that would be capable of doing that. S3 is just an object store. I verified this issue by downloading the S3 object directly after it was uploaded, to eliminate the fluentd input that was pulling it down as the source of the problem.
I should have been more specific in my reply, my apologies. Certainly the
Content-Type field is available to store what you want in it. However, to
my knowledge, S3 won't mutate the data object stored in any way based on
the value of that field.
On Thu, Jun 24, 2021 at 8:37 AM Matthieu Paret wrote:
AWS S3 is capable of that; it is written directly in the documentation:
Content-Encoding
Specifies what content encodings have been applied to the object and thus
what decoding mechanisms must be applied to obtain the media-type
referenced by the Content-Type header field. For more information, see
https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html
I have the same issue using fluent/fluent-bit:1.7.9. Any idea whether the configuration is wrong, or whether this is an actual bug? Is it possible there's a threshold for compression? For example, if the file is less than 1K, does it skip the compression step?
Nope. I haven't seen any comment from someone at the project even acknowledging the issue.
ACK, the issue is reproduced with the same config and Fluent Bit version.
@justchris1 @mtparet After some more testing with the same config provided in this issue, the file is auto-decompressed if downloaded on a MacBook, but the file uploaded to S3 is already compressed according to a size-based comparison. Could you compare the size of the file uploaded to S3 with the local file before upload, to confirm whether it's really uncompressed?
@DrewZhang13 - When I was debugging this, I eliminated the automated ingestion of the file into fluentd on the other side. To confirm it was uncompressed, I downloaded the file directly from S3 after it was uploaded by fluent-bit. When I inspected the file stored in S3, it was uncompressed. S3 has no 'auto-compress' or 'uncompress' functions, so downloading it represents what was stored in S3. The content is plaintext & readable.
@justchris1 So I have verified from both a MacBook and Linux; I don't see the same uncompressed result on my side.
No, I meant from a 'dumb' text editor like Windows Notepad. When I download, from the AWS console, the file that was uploaded by fluent-bit with the configuration shown in this issue, I am able to open it in Notepad immediately and see clear text.
@DrewZhang13 I did another test. I use fluentd to consume these log files, and when I use
@canidam Yeah, macOS will automatically decompress the file when you download it from S3. I think this is the reason you are seeing weird behavior.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
I still see this behavior. Please do not close. |
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
This is caused by the Content-Encoding: gzip attribute. When you download a file tagged with Content-Encoding: gzip, the client (Chrome, macOS, etc.) automatically decompresses it, so the local copy looks uncompressed even though the object stored in S3 is gzipped. An easy solution is to just remove that attribute. There seems to be no way to turn off this behavior.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
This would not explain why I got parsing errors in fluentd with compression turned on (and the corresponding fluentd configuration set to expect compressed input), yet it worked immediately after changing only the fluentd side to expect no compression. I still see this behavior. Please do not close.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
I still see this behavior. Issue is not resolved. |
I can see this behavior on multiple systems too with Fluent Bit 1.8.10 (standalone Fluent Bit and Fluent Bit in a Docker container). I also experienced this behavior on previous 1.8.x versions. This is my config:
It works for me (i.e. I checked on S3 and the results are gzipped; Athena can read them because the files end with .gz). Here are my settings:
Sure, I tried that; please read my test configuration above. Maybe it has been fixed, but I was using a recent Helm installation.
One thing: I'm not sure the configuration you use works, as I'm quite sure you need to set the "use_put_object On" option (I got an error saying I had to turn it on when I omitted the option, and the container wouldn't start). If you don't get that error, it's another sign that the version you are testing might have been updated.
@Spritekin Compression has always only worked with Use_Put_object On
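For reference, here is a minimal sketch of the combination described in the comment above (compression gzip together with use_put_object On), using only documented out_s3 options; the bucket, region, and key format are placeholders, not values taken from this thread:

```
[OUTPUT]
    Name              s3
    Match             *
    # Placeholders - substitute your own bucket and region
    bucket            my-log-bucket
    region            us-east-1
    # Per the comment above, gzip compression requires PutObject uploads
    use_put_object    On
    compression       gzip
    total_file_size   50M
    upload_timeout    10m
    # Ending the key in .gz helps downstream readers (e.g. Athena) recognize the format
    s3_key_format     /logs/$TAG/%Y/%m/%d/%H_%M_%S.gz
```

Note that, as discussed above, checking the object's size and raw bytes in S3 is more reliable than opening a downloaded copy, since browsers and macOS may transparently decompress objects tagged with Content-Encoding: gzip.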
@PettitWesley |
WHY CHROME, WHY!? |
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the |
This issue was closed because it has been stalled for 5 days with no activity. |
For anyone still facing this issue:
Bug Report
Describe the bug
Using td-agent-bit version 1.7.8 with the S3 output, the compression setting seems to be ignored, even when using use_put_object true.
To Reproduce
Here is my configuration of the output s3 block.
Regardless of whether the compression setting is omitted (implying none) or set to gzip, the uploaded files are always cleartext / uncompressed.
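The exact [OUTPUT] block was not preserved in this copy of the issue, so the following is only a hypothetical reconstruction of the kind of block being described (td-agent-bit 1.7.8, use_put_object true, compression gzip); every value shown is a placeholder, not the reporter's actual configuration:

```
[OUTPUT]
    Name              s3
    Match             *
    # Hypothetical placeholders - not the reporter's actual values
    bucket            example-bucket
    region            us-east-1
    use_put_object    true
    compression       gzip
    store_dir         /tmp/fluent-bit/s3
    total_file_size   10M
    upload_timeout    5m
```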
Expected behavior
Logs uploaded would be compressed with gzip before upload.
Your Environment
I can find nothing in the error logs about a failed compression. On every upload, I get a 'happy' message: Successfully uploaded object. However, the file is still cleartext. I saw references in @PettitWesley's thread in #2700 that this was working, so I am unsure if this is a regression or something else.