[Filebeat] Unescape characters in s3 file names #18370

Merged
merged 3 commits into from
May 11, 2020

@kaiyan-sheng (Contributor) commented May 7, 2020

What does this PR do?

This PR unescapes percent-encoded sequences (three characters each, such as "%3D") in S3 object keys back to the characters they represent. For example, "%3D" becomes "=".

Why is it important?

When users organize log files into folders in an S3 bucket, the SQS notification encodes the log file path, as in the message below:

{"Records":[{"eventVersion":"2.1","eventSource":"aws:s3","awsRegion":"us-east-1","eventTime":"2020-05-07T21:02:38.676Z","eventName":"ObjectCreated:Put","userIdentity":{"principalId":"AWS:AIDAWHL7AXDB2IME26NB3"},"requestParameters":{"sourceIPAddress":"174.29.210.150"},"responseElements":{"x-amz-request-id":"B86806D8E85F4A5F","x-amz-id-2":"yw1VtzJIcckqIpEjmp/oeesJj3yplnGR48PDCX0/nS8Qj5VXWGdzCrjSzAv0cRio/oAXihdDwscZE7pm32CHQH8gbjM60Qu6"},"s3":{"s3SchemaVersion":"1.0","configurationId":"ObjectCreated","bucket":{"name":"test-fb-ks","ownerIdentity":{"principalId":"A2EOMKP5A4DS45"},"arn":"arn:aws:s3:::test-fb-ks"},"object":{"key":"year%3D2020/month%3D05/test1.txt","size":6,"eTag":"3e7705498e8be60520841409ebc69bc1","sequencer":"005EB47773C10D8A81"}}}]}

The file in S3 is year=2020/month=05/test1.txt, but in the SQS message the key appears as year%3D2020/month%3D05/test1.txt.

This encoding needs to be undone before Filebeat reads the S3 file that the SQS message points to. Otherwise, the file will not be found.
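As a minimal sketch of the idea (not the actual Filebeat change), the decoding can be illustrated with Go's standard `url.QueryUnescape`, which later commits referencing this PR name as the function that was introduced. The helper name `unescapeKey` is hypothetical and exists only for this example; the key is the one from the SQS message above.

```go
package main

import (
	"fmt"
	"net/url"
)

// unescapeKey is a hypothetical helper for illustration only.
// It decodes a percent-encoded object key taken from an SQS notification.
// Note that url.QueryUnescape also turns "+" into a space.
func unescapeKey(key string) (string, error) {
	return url.QueryUnescape(key)
}

func main() {
	key, err := unescapeKey("year%3D2020/month%3D05/test1.txt")
	if err != nil {
		fmt.Println("unescape failed:", err)
		return
	}
	fmt.Println(key) // prints "year=2020/month=05/test1.txt"
}
```

The decoded key is what would then be passed to the S3 request, so the object lookup uses the same name the file has in the bucket.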

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

https://discuss.elastic.co/t/filebeat-s3-issue/230441

@botelastic bot added the needs_team label May 7, 2020
@kaiyan-sheng kaiyan-sheng self-assigned this May 7, 2020
@elasticmachine (Collaborator) commented May 7, 2020

💚 Build Succeeded

Test stats 🧪

Test Results
Failed 0
Passed 1184
Skipped 128
Total 1312

@kaiyan-sheng added the review and Team:Platforms labels May 8, 2020
@elasticmachine (Collaborator)

Pinging @elastic/integrations-platforms (Team:Platforms)

@kaiyan-sheng added the bug and needs_backport labels May 9, 2020
@andresrc added the [zube]: Inbox and [zube]: In Review labels and removed the needs_team and [zube]: Inbox labels May 9, 2020
@ChrsMark (Member) left a comment

lgtm

@kaiyan-sheng merged commit 1e2ec4e into elastic:master May 11, 2020
@kaiyan-sheng deleted the s3_filename_unescape branch May 11, 2020 13:49
@kaiyan-sheng added the v7.9.0 label and removed the needs_backport label May 11, 2020
kaiyan-sheng added a commit that referenced this pull request May 11, 2020
…names (#18412)

* [Filebeat] Unescape characters in s3 file names (#18370)

* unescape characters in s3 file names

(cherry picked from commit 1e2ec4e)

* update changelog
kaiyan-sheng added a commit that referenced this pull request May 11, 2020
…names (#18411)

* [Filebeat] Unescape characters in s3 file names (#18370)

* unescape characters in s3 file names

(cherry picked from commit 1e2ec4e)

* update changelog
kaiyan-sheng added a commit that referenced this pull request May 12, 2020
…names (#18413)

* [Filebeat] Unescape characters in s3 file names (#18370)

* unescape characters in s3 file names

(cherry picked from commit 1e2ec4e)
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…3 file names (elastic#18413)

* [Filebeat] Unescape characters in s3 file names (elastic#18370)

* unescape characters in s3 file names

(cherry picked from commit ebddb94)
zmoog added a commit to zmoog/beats that referenced this pull request Feb 23, 2024
We introduced [^1] the `url.QueryUnescape()` function to unescape
object keys from S3 notification in SQS messages.

However, the object keys in the S3 list object responses do not
require [^2] unescape.

We must remove the unescape to avoid unintended changes to the S3
object key.

[^1]: elastic#18370
[^2]: elastic#38012 (comment)
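
To illustrate why unescaping a key that is already literal can change it, here is a small sketch using the same standard-library call. The helper name `unescapeListedKey` and the sample keys are hypothetical, chosen only to show the effect; this is not the code that was removed.

```go
package main

import (
	"fmt"
	"net/url"
)

// unescapeListedKey is a hypothetical helper for illustration only.
// It applies url.QueryUnescape to a key that is assumed to be literal
// already, as the commit message says list object responses are.
func unescapeListedKey(key string) (string, error) {
	return url.QueryUnescape(key)
}

func main() {
	// Sample keys are made up: each contains a character that
	// QueryUnescape treats as an escape even though it is literal here.
	keys := []string{"logs/a+b.txt", "logs/report%3D.txt", "logs/100%.txt"}
	for _, key := range keys {
		got, err := unescapeListedKey(key)
		if err != nil {
			// A bare "%" not followed by two hex digits is rejected.
			fmt.Printf("%q: error: %v\n", key, err)
			continue
		}
		fmt.Printf("%q -> %q\n", key, got)
	}
}
```

In this sketch a literal "+" becomes a space and a literal "%3D" becomes "=", so the resulting name no longer matches the object in the bucket, and a stray "%" produces an error instead of a key.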
zmoog added a commit to zmoog/beats that referenced this pull request Mar 4, 2024
We introduced [^1] the `url.QueryUnescape()` function to unescape
object keys from S3 notification in SQS messages.

However, the object keys in the S3 list object responses do not
require [^2] unescape.

We must remove the unescape to avoid unintended changes to the S3
object key.

[^1]: elastic#18370
[^2]: elastic#38012 (comment)
zmoog added a commit that referenced this pull request Mar 4, 2024
…de (#38125)

* Remove url.QueryUnescape()

We introduced [^1] the `url.QueryUnescape()` function to unescape
object keys from S3 notification in SQS messages.

However, the object keys in the S3 list object responses do not
require [^2] unescape.

We must remove the unescape to avoid unintended changes to the S3
object key.

[^1]: #18370
[^2]: #38012 (comment)

---------

Co-authored-by: Andrea Spacca <andrea.spacca@elastic.co>
mergify bot pushed a commit that referenced this pull request Mar 4, 2024
…de (#38125)

* Remove url.QueryUnescape()

We introduced [^1] the `url.QueryUnescape()` function to unescape
object keys from S3 notification in SQS messages.

However, the object keys in the S3 list object responses do not
require [^2] unescape.

We must remove the unescape to avoid unintended changes to the S3
object key.

[^1]: #18370
[^2]: #38012 (comment)

---------

Co-authored-by: Andrea Spacca <andrea.spacca@elastic.co>
(cherry picked from commit 5f1e656)
zmoog pushed a commit that referenced this pull request Mar 4, 2024
…s-s3 input in polling mode (#38165)

* [AWS] [S3] Remove url.QueryUnescape() from aws-s3 input in polling mode (#38125)

We introduced [^1] the `url.QueryUnescape()` function to unescape
object keys from S3 notification in SQS messages.

However, the object keys in the S3 list object responses do not
require [^2] unescape.

We must remove the unescape to avoid unintended changes to the S3
object key.

[^1]: #18370
[^2]: #38012 (comment)

---------

Co-authored-by: Andrea Spacca <andrea.spacca@elastic.co>
Labels
bug, review, Team:Platforms, v7.7.0, v7.8.0, v7.9.0
4 participants