x-pack/filebeat/input/awss3: relax queue_url constraints for non-standard endpoints #35520

Merged
merged 6 commits into from Jun 8, 2023

@efd6 efd6 commented May 22, 2023

What does this PR do?

Currently, the queue_url option is used to obtain the region name for the SQS receiver. This prevents pointing the input at other sources for testing, so add a region_name option as an alternative way to supply the region name for non-AWS endpoints.

To avoid confusion, log a warning if the user configures the region option together with an amazonaws.com endpoint and the two disagree.
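
The warning check described above could look roughly like the following Go sketch. This is not the code from this PR: `regionFromAWSHost` and `regionWarning` are hypothetical helpers, and the sketch assumes that only hosts of the form `sqs.<region>.amazonaws.com` carry a region.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// regionFromAWSHost extracts the region from a host of the assumed form
// "sqs.<region>.amazonaws.com". It reports false for any other host.
func regionFromAWSHost(host string) (string, bool) {
	parts := strings.Split(host, ".")
	if len(parts) == 4 && parts[0] == "sqs" && parts[2] == "amazonaws" && parts[3] == "com" {
		return parts[1], true
	}
	return "", false
}

// regionWarning returns a warning message when the configured region
// disagrees with the region embedded in an amazonaws.com queue URL.
// It returns an empty string when there is nothing to warn about.
func regionWarning(queueURL, configured string) (string, error) {
	u, err := url.Parse(queueURL)
	if err != nil {
		return "", fmt.Errorf("invalid queue_url: %w", err)
	}
	fromURL, ok := regionFromAWSHost(u.Hostname())
	if !ok || configured == "" || configured == fromURL {
		return "", nil
	}
	return fmt.Sprintf("configured region %q disagrees with queue_url region %q", configured, fromURL), nil
}
```

A caller would log the returned message at warning level when it is non-empty and carry on with the configured region, so a non-AWS endpoint never triggers the warning.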

Why is it important?

This is needed for supporting tests of inputs using s3.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@efd6 efd6 added enhancement Filebeat Filebeat Team:Security-External Integrations backport-skip Skip notification from the automated backport with mergify 8.9-candidate labels May 22, 2023
@efd6 efd6 self-assigned this May 22, 2023
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels May 22, 2023
@efd6 efd6 requested a review from a team May 22, 2023 03:40
@elasticmachine
Collaborator

elasticmachine commented May 22, 2023

💚 Build Succeeded

Build stats

  • Start Time: 2023-06-08T08:33:58.468+0000

  • Duration: 129 min 20 sec

Test stats 🧪

Test Results
Failed 0
Passed 5968
Skipped 360
Total 6328

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@efd6 efd6 marked this pull request as ready for review May 22, 2023 07:01
@efd6 efd6 requested a review from a team as a code owner May 22, 2023 07:01
@elasticmachine
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

x-pack/filebeat/input/awss3/config.go Outdated
x-pack/filebeat/input/awss3/input.go Outdated
	if err != nil {
		return fmt.Errorf("invalid queue_url: %w", err)
	}
	if strings.HasSuffix(u.Host, ".amazonaws.com") {
Contributor

There are other AWS domains besides just amazonaws.com that you should include. See https://github.com/elastic/beats/blob/main/x-pack/filebeat/input/awss3/input.go#L350

Contributor Author

Thanks. I was looking for a list.
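
For illustration, a suffix check against a list of AWS domains might be sketched in Go as follows. The list here is an assumption and deliberately incomplete: it holds only `amazonaws.com` and the China partition's `amazonaws.com.cn`, and the authoritative list is the one in the linked `input.go`. The name `isAWSHost` is hypothetical.

```go
package main

import "strings"

// awsDomainSuffixes is an illustrative, incomplete list of AWS domain
// suffixes. The real input keeps its own list; this one is an assumption.
var awsDomainSuffixes = []string{
	".amazonaws.com",
	".amazonaws.com.cn",
}

// isAWSHost reports whether host ends in one of the known AWS suffixes.
// Matching on the suffix with its leading dot avoids accepting hosts
// such as "amazonaws.com.example.org".
func isAWSHost(host string) bool {
	host = strings.ToLower(host)
	for _, suffix := range awsDomainSuffixes {
		if strings.HasSuffix(host, suffix) {
			return true
		}
	}
	return false
}
```

Keeping the suffixes in one slice means the validation and the region parsing can share a single definition of what counts as an AWS endpoint.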

Contributor

@bhapas bhapas left a comment

LGTM

Member

@andrewkroh andrewkroh left a comment

This is complicated because of all the ways in which a user can set the AWS region. I think those are:

  • default_region in input config
  • AWS_DEFAULT_REGION env var
  • AWS_REGION env var
  • extracted from the queue_url
  • region_name in input config (proposed)

My personal preference in behavior is to require an explicit region name to be configured either through env (which is part of the AWS SDK) or via the existing default_region config option. This would have the most clear semantics IMO -- you must set it or else it won't work and there's no guessing about the region being used. But our own SDK config wrapper injects a "default default" region of us-east-1 so we can't detect if the user did not set a region.

But we can't change that behavior without a breaking change, so if we make region_name take precedence over queue_url parsing, I still think that will be clear.

One thing I want to be sure of here is that we have tested this before merging to ensure that it works to pull data from Localstack. @bhapas can help with that step.
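
One way to make the region sources listed above explicit is to resolve them in a fixed order and report which one won. The Go sketch below is an assumption, not the input's implementation: only "region_name before queue_url parsing" comes from this discussion, the rest of the ordering is illustrative, and `resolveRegion` and `regionSource` are hypothetical names. The environment lookup is injected so the function can be exercised without touching the process environment.

```go
package main

// regionSource pairs a candidate region with the name of the place
// it came from, so the caller can log which source was used.
type regionSource struct {
	name  string
	value string
}

// resolveRegion returns the first non-empty region and the name of its
// source. The explicit option is checked before the region parsed from
// queue_url; the order of the remaining sources is an assumption.
func resolveRegion(explicit, fromQueueURL, defaultRegion string, getenv func(string) string) (region, source string) {
	candidates := []regionSource{
		{"region_name", explicit},
		{"queue_url", fromQueueURL},
		{"default_region", defaultRegion},
		{"AWS_REGION", getenv("AWS_REGION")},
		{"AWS_DEFAULT_REGION", getenv("AWS_DEFAULT_REGION")},
	}
	for _, c := range candidates {
		if c.value != "" {
			return c.value, c.name
		}
	}
	return "", ""
}
```

In real use the caller would pass `os.Getenv` as `getenv`. Returning the source name alongside the region makes it cheap to log where the effective region came from, which addresses the "no guessing" concern raised in the comment.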

@@ -78,6 +79,13 @@ func (c *config) Validate() error {
		return fmt.Errorf("number_of_workers <%v> must be greater than 0", c.NumberOfWorkers)
	}

	if c.QueueURL != "" {
		region, _ := getRegionFromQueueURL(c.QueueURL, c.AWSConfig.Endpoint)
Member

Do we need to have this validation or could we relax it? As a user, I think the semantics of explicitly setting a region name would be my preference in my config. (It's just a shame that we cannot rely on the AWS SDK config for this because it injects a default region so we can't tell if the user explicitly configured it).

This will steer users that are having trouble with their combination of queue_url and endpoint toward the region_name which is documented as for testing purposes. Not sure this is bad but I wanted to call it out. A common problem is that users get an error from getRegionFromQueueURL until they learn that they need to set endpoint.

Contributor Author

I'd be happy to relax this if we emit a log warning instead. I don't think that is possible at this point in config validation, but it can be done in input.go at some additional cost per start. WDYT?

@@ -308,9 +311,11 @@ func getRegionFromQueueURL(queueURL string, endpoint string) (string, error) {
return "", fmt.Errorf(queueURL + " is not a valid URL")
Member

We need a unit test for this function. It has zero test coverage. When it was added it had some, but I must have lost them in a refactoring. Can you please add back the test case from 7b729da.

@@ -257,6 +257,12 @@ configuring multiline options.

URL of the AWS SQS queue that messages will be received from. (Required when `bucket_arn` and `non_aws_bucket_name` are not set).

[float]
==== `region_name`
Member

I think using region is more consistent with everything in the AWS realm.

@efd6 efd6 changed the title x-pack/filebeat/input/awss3: relax queue_url constraints for testing x-pack/filebeat/input/awss3: relax queue_url constraints for non-standard endpoints May 25, 2023

Member

@andrewkroh andrewkroh left a comment

LGTM.

Before merging, @bhapas can you please test that this works with Localstack based SQS and S3.


Contributor

@zmoog zmoog left a comment

LGTM.

@mergify
Contributor

mergify bot commented Jun 6, 2023

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b 35496-awss3 upstream/35496-awss3
git merge upstream/main
git push upstream 35496-awss3

@efd6
Contributor Author

efd6 commented Jun 7, 2023

/test

x-pack/filebeat/input/awss3/input.go Outdated
x-pack/filebeat/input/awss3/input.go Outdated

efd6 added 4 commits June 8, 2023 11:39
Currently, the queue_url option is used to obtain the region name for
the SQS receiver. This prevents pointing the input at other sources for
testing, so add a region_name option to provide an alternative way to
provide the region name for non-AWS endpoints.

To avoid confusion, prevent using the region_name option in conjunction
with amazonaws.com endpoints and document this.
Also improve comprehensibility of test cases.
efd6 added 2 commits June 8, 2023 13:29
Make getRegionFromQueueURL aware of configured region name, this means
that it will return either a non-zero region name or a non-nil error.
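
The contract stated in this commit message (a non-empty region or a non-nil error, never neither) can be sketched in Go as below. This is a simplified illustration under assumptions, not the merged function: `regionForQueue` and `errNoRegion` are hypothetical names, the host is assumed to look like `sqs.<region>.<endpoint>`, and the Localstack-style URL in the usage note is made up.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// errNoRegion is returned when neither the configuration nor the
// queue URL yields a region.
var errNoRegion = errors.New("no region configured and none found in queue_url")

// regionForQueue returns the configured region when set, otherwise the
// region parsed from a host of the form "sqs.<region>.<endpoint>".
// It returns either a non-empty region or a non-nil error.
func regionForQueue(queueURL, endpoint, configured string) (string, error) {
	u, err := url.Parse(queueURL)
	if err != nil {
		return "", fmt.Errorf("invalid queue_url: %w", err)
	}
	if configured != "" {
		return configured, nil
	}
	if endpoint == "" {
		endpoint = "amazonaws.com" // assumed default when no endpoint is configured
	}
	parts := strings.SplitN(u.Hostname(), ".", 3)
	if len(parts) == 3 && parts[0] == "sqs" && parts[1] != "" && parts[2] == endpoint {
		return parts[1], nil
	}
	return "", errNoRegion
}
```

With this shape, a queue URL such as `http://localhost:4566/000000000000/q` fails with `errNoRegion` unless a region is configured, while a standard `sqs.us-east-1.amazonaws.com` URL resolves without any extra configuration.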
Contributor

@bhapas bhapas left a comment

LGTM

@efd6 efd6 merged commit 7b1a839 into elastic:main Jun 8, 2023
22 checks passed
Labels
8.9-candidate backport-skip Skip notification from the automated backport with mergify enhancement Filebeat Filebeat
Development

Successfully merging this pull request may close these issues.

[Filebeat] aws-s3 input should support any queue_url
6 participants