Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

cloudwatch_logs: remove sequence tokens from API calls #7973

Merged
merged 1 commit into from Oct 17, 2023

Conversation

matthewfala
Copy link
Contributor

@matthewfala matthewfala commented Sep 26, 2023

Allow multiple workers on cloudwatch_logs by removing sequence tokens 馃帀馃帀馃帀

This PR removes the sequence tokens from CloudWatch Log API requests entirely (as per the recent CWL API update).

Also removes error handling for InvalidSequence and DataAlreadyAccepted exceptions which no longer occur.

CWL API Docs:

The sequence token is now ignored in PutLogEvents actions. PutLogEvents actions are always accepted and never return InvalidSequenceTokenException or DataAlreadyAcceptedException even if the sequence token is not valid. You can use parallel PutLogEvents actions on the same log stream.

https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_PutLogEvents.html

Reopened from: #6741

Note: This is a cleaned up version of the following branch https://github.com/PettitWesley/fluent-bit.git 1_9-sequence-token


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Comment on lines 584 to 585
/* error can not be parsed, print raw response */
flb_plg_error(ins, "Raw response: %s", response);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not be the right way to think about things, but one interesting thought I just had- we already have folks complaining about the rates of hits for "error" from Fluent Bit: https://github.com/aws/aws-for-fluent-bit/blob/mainline/troubleshooting/debugging.md#rate-of-network-errors

So I wonder if this should be warn, because its printing a raw response which if its an error, will probably have error in it, so we don't need to further increase our error hits anymore...

馃

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's reasonable.

@@ -596,6 +598,7 @@ void flb_aws_print_error(char *response, size_t response_len,
}

flb_sds_destroy(error);
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add empty return?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason I can recall. Can remove.

@PettitWesley
Copy link
Contributor

Minus one small comment, I think this is probably good to go.

@matthewfala matthewfala temporarily deployed to pr September 28, 2023 00:59 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 28, 2023 00:59 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 28, 2023 00:59 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 28, 2023 01:23 — with GitHub Actions Inactive
/* error can not be parsed, print raw response to debug */
flb_plg_debug(ctx->ins, "Raw response: %s", c->resp.payload);
/* error can not be parsed, print raw response */
flb_plg_error(ctx->ins, "Raw response: %s", c->resp.payload);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matthewfala for same reasoning as my earlier comment on error => warn for raw responses, let's also do warn here?

/* error can not be parsed, print raw response to debug */
flb_plg_debug(ctx->ins, "Raw response: %s", c->resp.payload);
/* error can not be parsed, print raw response */
flb_plg_error(ctx->ins, "Raw response: %s", c->resp.payload);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment here

@matthewfala matthewfala temporarily deployed to pr September 29, 2023 02:53 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 02:53 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 02:53 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:02 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:02 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:02 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:03 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:03 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:03 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:13 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:13 — with GitHub Actions Inactive
Signed-off-by: Matthew Fala <falamatt@amazon.com>
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:21 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:21 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:21 — with GitHub Actions Inactive
@matthewfala matthewfala temporarily deployed to pr September 29, 2023 03:51 — with GitHub Actions Inactive
@@ -581,6 +581,8 @@ void flb_aws_print_error(char *response, size_t response_len,

error = flb_json_get_val(response, response_len, "__type");
if (!error) {
/* error can not be parsed, print raw response */
flb_plg_warn(ins, "Raw response: %s", response);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lacks full context that it could have: we can print the api value in the raw response here, like: PutLogEvents: Raw Response:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess based on the response, the API should be obvious? Is this true? I think if this does really add context, it'd be nice to add.

We could also follow up with it later I suppose, as long we stage it now, so we don't forget it..

What are your thoughts on how important this context is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is nice to have but not essential. I'll update this for master, and include it in aws-for-fluent-bit after release of 2.32.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool and thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@PettitWesley PettitWesley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to go minus one minor CX comment

@edsiper edsiper merged commit 284a4eb into fluent:master Oct 17, 2023
41 of 42 checks passed
@edsiper
Copy link
Member

edsiper commented Oct 17, 2023

thanks, merged.

note: commits must be with proper plugin name e.g: out_cloudwatch_logs.

leonardo-albertovich pushed a commit that referenced this pull request Nov 3, 2023
Signed-off-by: Matthew Fala <falamatt@amazon.com>
@cdancy
Copy link

cdancy commented Dec 1, 2023

@edsiper @PettitWesley @matthewfala is it safe to say that with 2.2.0 release of fluent-bit workers are at least supported with cloudwatch_logs output? I don't see the documentation updated but just wanted to double check with you all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants