[Filebeat] Check content type when reading s3 files #15252

Merged: 7 commits, Jan 7, 2020

@kaiyan-sheng kaiyan-sheng commented Dec 23, 2019

When a file name has a .gz suffix but the object has a text/plain content type, the newS3BucketReader function fails when using the s3 input in Filebeat. Instead of checking only the file name, check the actual content type from the response and then decide how to build the new reader.

How to test it:

Upload a file to an S3 bucket and change the content type in the file's metadata properties to test this PR:
[Screenshot: Screen Shot 2020-01-07 at 7 48 59 AM]

Upload a test1.txt.gz file and change its content type to text/plain. The s3 input should still be able to read the file.

closes #15225

@kaiyan-sheng self-assigned this Dec 23, 2019
@kaiyan-sheng added the Filebeat, review, Team:Integrations, bug, and needs_backport labels Dec 23, 2019
@kaiyan-sheng added the test-plan label Jan 3, 2020
@kaiyan-sheng
Contributor Author

jenkins, test this please

ynirk commented Jan 6, 2020

For information, AWS CloudTrail sets the content type to application/json, and I had the same problem.
Applying this patch to Filebeat 7.5.1 solved the issue.

@ynirk ynirk left a comment

I've tested it and it works fine

Contributor

@mtojek mtojek left a comment

Ship it!

@kaiyan-sheng
Contributor Author

jenkins, test this please

@kaiyan-sheng
Contributor Author

CI failures are not related.

@kaiyan-sheng merged commit 034e719 into elastic:master Jan 7, 2020
@kaiyan-sheng deleted the fb_s3 branch January 7, 2020 20:12
@kaiyan-sheng added the v7.6.0 label and removed the needs_backport label Jan 7, 2020
kaiyan-sheng added a commit that referenced this pull request Jan 8, 2020
* Check resp.ContentType and filename
* Remove case "text/plain" to use default instead

(cherry picked from commit 034e719)
kaiyan-sheng added a commit that referenced this pull request Jan 8, 2020
… s3 files (#15369)

* [Filebeat] Check content type when reading s3 files (#15252)

* Check resp.ContentType and filename
* Remove case "text/plain" to use default instead

(cherry picked from commit 034e719)

* Fix changelog
@kaiyan-sheng added the skip-test-plan label and removed the test-plan label Jan 15, 2020
@kaiyan-sheng
Contributor Author

This PR will be tested when testing #15370

leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…reading s3 files (elastic#15369)

* [Filebeat] Check content type when reading s3 files (elastic#15252)

* Check resp.ContentType and filename
* Remove case "text/plain" to use default instead

(cherry picked from commit 6692049)

* Fix changelog

Successfully merging this pull request may close these issues.

[filebeat] Gzip does not work with different content type for s3 input