
Resource-based policy permissions are exceeded when many log group triggers are provided #37

@nik-11


Describe the Bug

When a large number of keys (>50) is supplied to cloudwatch_forwarder_log_groups, the Lambda function's resource-based policy exceeds its size limit and lambda/AddPermission fails with a PolicyLengthExceededException, with no way to recover. In addition, once the AWS SDK throws the exception, Terraform does not stop: it keeps attempting PutSubscriptionFilter for every log group provided, with up to 25 retries each. This leads to extremely long terraform apply times with no apparent cause unless TF_LOG is set.
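
For a rough sense of where the limit bites, the size of one permission statement can be estimated in a scratch configuration. This is only a sketch: the statement shape follows the usual form of a Lambda resource-based policy statement, the region, account ID, function name, and log group name are placeholders, and the size AWS actually measures may differ from `length(jsonencode(...))`.

```hcl
locals {
  # Placeholder values for illustration only.
  example_region     = "us-east-1"
  example_account_id = "123456789012"
  example_log_group  = "/aws/lambda/example-prefix-my-function"

  # Approximate shape of one statement added by lambda/AddPermission.
  example_statement = {
    Sid       = "datadog-forwarder-aws-lambda-example-prefix-my-function-permission"
    Effect    = "Allow"
    Principal = { Service = "logs.${local.example_region}.amazonaws.com" }
    Action    = "lambda:InvokeFunction"
    Resource  = "arn:aws:lambda:${local.example_region}:${local.example_account_id}:function:example-forwarder"
    Condition = {
      ArnLike = {
        "AWS:SourceArn" = "arn:aws:logs:${local.example_region}:${local.example_account_id}:log-group:${local.example_log_group}:*"
      }
    }
  }

  # Characters in the serialized statement (an estimate, not AWS's own count).
  statement_chars = length(jsonencode(local.example_statement))

  # Rough number of such statements that fit under the 20480 limit
  # quoted in the error message.
  max_statements_estimate = floor(20480 / local.statement_chars)
}

output "max_statements_estimate" {
  value = local.max_statements_estimate
}
```

Longer log group names and function names shrink the estimate, so the practical ceiling depends on naming conventions in the account.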

Expected Behavior

The module should accept any number of cloudwatch_forwarder_log_groups entries and scale without hitting the policy size limit.

Potential fix?

Allow the user to supply a "catch-all" source ARN that grants permission to a larger number of log groups with a single policy statement.
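
A minimal sketch of what such a catch-all could look like inside the module, assuming a new, hypothetical variable `cloudwatch_forwarder_log_group_source_arn` (it is not part of the module today). It reuses the locals and resource names from the module's existing `aws_lambda_permission.cloudwatch_groups` resource and assumes Lambda matches the source ARN as a pattern, so that a wildcard log group ARN covers many log groups with one statement.

```hcl
# Hypothetical variable; the name is an assumption for illustration.
variable "cloudwatch_forwarder_log_group_source_arn" {
  type        = string
  default     = null
  description = "Optional wildcard ARN allowing CloudWatch Logs to invoke the forwarder for many log groups at once."
}

# One statement in the resource-based policy instead of one per log group.
resource "aws_lambda_permission" "cloudwatch_groups_catch_all" {
  count = local.lambda_enabled && var.forwarder_log_enabled && var.cloudwatch_forwarder_log_group_source_arn != null ? 1 : 0

  statement_id  = "datadog-forwarder-catch-all-permission"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.forwarder_log[0].function_name
  principal     = "logs.${local.aws_region}.amazonaws.com"
  source_arn    = var.cloudwatch_forwarder_log_group_source_arn
}
```

The per-log-group permission resource would then need to be skipped when the catch-all is set, while the subscription filters themselves would still need one resource per log group.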

Steps to Reproduce

Steps to reproduce the behavior:

  1. Add a new data source referencing your account's log groups.
     data "aws_cloudwatch_log_groups" "user_log_groups" {
       log_group_name_prefix = "/aws/lambda/${var.resource_prefix}"
     }
  2. Map the data source to a local variable.
     locals {
       log_groups = { for value in data.aws_cloudwatch_log_groups.user_log_groups.log_group_names :
         # Remove invalid key characters
         trim(replace(value, "/", "-"), "-") => {
           name           = value
           filter_pattern = ""
         }
       }
     }
  3. Provide it to the cloudposse/datadog-lambda-forwarder/aws module.
     cloudwatch_forwarder_log_groups = local.log_groups
  4. Run terraform apply with TF_LOG set to DEBUG to see the SDK calls lambda/AddPermission and logs/PutSubscriptionFilter failing.

Screenshots


module.DataDogIntegration.module.datadog_lambda_forwarder.aws_lambda_permission.cloudwatch_groups[REDACTED]: Creating...
2022-09-12T14:23:19.719+0800 [INFO]  provider.terraform-provider-aws_v3.75.2_x5: 2022/09/12 14:23:19 [DEBUG] [aws-sdk-go] DEBUG: Response lambda/AddPermission Details:
---[ RESPONSE ]--------------------------------------
HTTP/2.0 400 Bad Request
Content-Length: 91
Content-Type: application/json
Date: Mon, 12 Sep 2022 06:23:19 GMT
X-Amzn-Errortype: PolicyLengthExceededException
X-Amzn-Requestid: 5e0ad4a0-f35d-4314-8601-fb2c48d7be13


-----------------------------------------------------: timestamp=2022-09-12T14:23:19.719+0800
2022-09-12T14:23:19.719+0800 [INFO]  provider.terraform-provider-aws_v3.75.2_x5: 2022/09/12 14:23:19 [DEBUG] [aws-sdk-go] {"Type":"User","message":"The final policy size (20898) is bigger than the limit (20480)."}: timestamp=2022-09-12T14:23:19.719+0800
2022-09-12T14:23:19.719+0800 [INFO]  provider.terraform-provider-aws_v3.75.2_x5: 2022/09/12 14:23:19 [DEBUG] [aws-sdk-go] DEBUG: Validate Response lambda/AddPermission failed, attempt 0/25, error PolicyLengthExceededException: The final policy size (20898) is bigger than the limit (20480).
{
  RespMetadata: {
    StatusCode: 400,
    RequestID: "5e0ad4a0-f35d-4314-8601-fb2c48d7be13"
  },
  Message_: "The final policy size (20898) is bigger than the limit (20480).",
  Type: "User"
}: timestamp=2022-09-12T14:23:19.719+0800

Environment (please complete the following information):


terraform -v

Terraform v1.2.6
on darwin_arm64
+ provider registry.terraform.io/franckverrot/stripe v1.9.0
+ provider registry.terraform.io/hashicorp/archive v2.2.0
+ provider registry.terraform.io/hashicorp/aws v3.75.2
+ provider registry.terraform.io/hashicorp/external v2.2.2
+ provider registry.terraform.io/hashicorp/local v2.2.3
+ provider registry.terraform.io/hashicorp/random v3.4.3
+ provider registry.terraform.io/hashicorp/template v2.2.0

Additional Context

resource "aws_lambda_permission" "cloudwatch_groups" {
  for_each      = local.lambda_enabled && var.forwarder_log_enabled ? var.cloudwatch_forwarder_log_groups : {}
  statement_id  = "datadog-forwarder-${each.key}-permission"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.forwarder_log[0].function_name
  principal     = "logs.${local.aws_region}.amazonaws.com"
  source_arn    = "${local.arn_format}:logs:${local.aws_region}:${local.aws_account_id}:log-group:${each.value.name}:*"
}
