CCM-16073 - Enhanced callbacks #145

mjewildnhs · 2026-04-15T11:22:38Z

Not related to the phase 2 work, but I think we can delete lines 3-6: those secrets have never been in our repo at any point.

mjewildnhs · 2026-04-15T11:57:11Z

Do we want an alarm on storage, e.g. at 80% used?
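
A minimal sketch of what such a storage alarm might look like, modelled on the other ElastiCache alarms in this PR. The resource name `elasticache_storage_used` and the alarm wording are hypothetical. The threshold assumes the 1 GB `data_storage` maximum from this PR and treats 1 GB as 10^9 bytes. The dimension key is copied from the existing alarms and should be checked against the dimensions CloudWatch actually publishes for serverless caches.

```hcl
# Sketch only: fires when stored data exceeds 80% of the configured 1 GB limit.
resource "aws_cloudwatch_metric_alarm" "elasticache_storage_used" {
  alarm_name = "${local.csi}-elasticache-storage-used"
  alarm_description = join(" ", [
    "CAPACITY: ElastiCache data storage is above 80% of the configured maximum.",
    "Review per-client key usage or raise the storage limit.",
  ])
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "BytesUsedForCache"
  namespace           = "AWS/ElastiCache"
  period              = 300
  statistic           = "Maximum"
  # 80% of 1 GB, expressed in bytes (assumes 1 GB = 1,000,000,000 bytes)
  threshold           = 0.8 * 1000 * 1000 * 1000
  actions_enabled     = true
  treat_missing_data  = "notBreaching"
  dimensions = {
    # Assumption: same dimension key as the other alarms in this PR
    CacheClusterId = aws_elasticache_serverless_cache.delivery_state.name
  }
  tags = local.default_tags
}
```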

mjewildnhs · 2026-04-15T11:11:06Z

For cost saving, I think we should switch to Valkey.
You are billed in gigabyte-hours (GB-hrs), and the minimum for Redis is 1 GB vs 100 MB for Valkey.
Not sure we'll go above 100 MB even in prod.
https://aws.amazon.com/elasticache/pricing/

cgitim · 2026-04-16

Just waiting on agreement.

aidenvaines-cgi · 2026-04-15T11:18:25Z

Valkey is Redis-compatible, half the cost, and faster. Can we use that?

cgitim · 2026-04-16

Just waiting on agreement.
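
If the switch is agreed, the change to the cache resource could be as small as the sketch below. It assumes an AWS provider version that accepts `valkey` as an `engine` value for `aws_elasticache_serverless_cache`, and the major version shown is an assumption rather than a recommendation. Valkey should keep the same port (6379), so the security group rules in this PR would not need to change.

```hcl
# Sketch only: the delivery-state cache with the engine swapped to Valkey.
resource "aws_elasticache_serverless_cache" "delivery_state" {
  name                 = "${local.csi}-delivery-state"
  engine               = "valkey" # was "redis" in this PR
  major_engine_version = "8"      # assumption: pick whichever Valkey major version is supported and agreed

  # Remaining arguments (description, networking, KMS key, usage limits, tags)
  # are omitted here and would stay as they are in the PR.
}
```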

mjewildnhs · 2026-04-15T11:53:09Z

1 GB is the minimum with Redis, but it can go down to 100 MB if we make the Valkey switch.
Keeping this low for dev/test environments is good for cost saving.
We should see how much storage each client will take.
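
One way to keep the limit low in dev/test while leaving room to raise it in prod is to drive it from a variable. This is a sketch only: `elasticache_max_data_storage_gb` is a hypothetical variable that does not exist in this PR, and the default mirrors the 1 GB value currently hard-coded.

```hcl
# Hypothetical variable (not in this PR): per-environment ceiling on cache storage.
variable "elasticache_max_data_storage_gb" {
  type        = number
  description = "Maximum data storage for the delivery-state cache, in GB"
  default     = 1
}

resource "aws_elasticache_serverless_cache" "delivery_state" {
  name   = "${local.csi}-delivery-state"
  engine = "redis" # as in this PR

  cache_usage_limits {
    data_storage {
      maximum = var.elasticache_max_data_storage_gb
      unit    = "GB"
    }
    ecpu_per_second {
      maximum = 1000
    }
  }

  # Remaining arguments are omitted here and would stay as they are in the PR.
}
```

Each environment's tfvars could then override the value once real per-client storage figures are known.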

     cd9c0efec38c5d63053dd865e5d4e207c0760d91:docs/guides/Perform_static_analysis.md:sonar-api-token:37
     96096685ab3d6876671e2bc9a6ff4d48fc56e521:src/helloworld/helloworld.sln:ipv4:4
 f4e8c15629b2cb09356a7fed4d72953590227ce:docs/Gemfile.lock:ipv4:4
+b9cb259d92c3defc27de00a4196682d11c231:lambdas/https-client-lambda/src/__tests__/tls-agent-factory.test.ts:private-key:49

@@ -28,6 +28,7 @@ export default defineConfig([
         "**/test-results",
         "**/playwright-report*",
         "eslint.config.mjs",
+        "**/lua-transform.js",
       ]),
       //imports
@@ -200,7 +201,7 @@ export default defineConfig([
         },
       },
       {
-        files: ["**/utils/**", "tests/test-team/**", "tests/performance/helpers/**", "lambdas/**/src/**"],
+        files: ["**/utils/**", "tests/test-team/**", "tests/performance/helpers/**", "lambdas/**/src/**", "src/**/src/**"],
         rules: {
           "import-x/prefer-default-export": 0,
         },

@@ -19,7 +19,7 @@
     | <a name="input_default_tags"></a> [default\_tags](#input\_default\_tags) | A map of default tags to apply to all taggable resources within the component | `map(string)` | `{}` | no |
     | <a name="input_deploy_mock_clients"></a> [deploy\_mock\_clients](#input\_deploy\_mock\_clients) | Flag to deploy mock webhook lambda for integration testing (test/dev environments only) | `bool` | `false` | no |
     | <a name="input_enable_event_anomaly_detection"></a> [enable\_event\_anomaly\_detection](#input\_enable\_event\_anomaly\_detection) | Enable CloudWatch anomaly detection alarm for inbound event queue message reception | `bool` | `true` | no |
-    | <a name="input_enable_xray_tracing"></a> [enable\_xray\_tracing](#input\_enable\_xray\_tracing) | Enable AWS X-Ray active tracing for Lambda functions | `bool` | `false` | no |
+    | <a name="input_enable_xray_tracing"></a> [enable\_xray\_tracing](#input\_enable\_xray\_tracing) | Enable AWS X-Ray active tracing for Lambda functions | `bool` | `true` | no |
     | <a name="input_environment"></a> [environment](#input\_environment) | The name of the tfscaffold environment | `string` | n/a | yes |
     | <a name="input_event_anomaly_band_width"></a> [event\_anomaly\_band\_width](#input\_event\_anomaly\_band\_width) | The width of the anomaly detection band. Higher values (e.g. 4-6) reduce sensitivity and noise, lower values (e.g. 2-3) increase sensitivity. Recommended: 2-4. | `number` | `3` | no |
     | <a name="input_event_anomaly_evaluation_periods"></a> [event\_anomaly\_evaluation\_periods](#input\_event\_anomaly\_evaluation\_periods) | Number of evaluation periods for the anomaly alarm. Each period is defined by event\_anomaly\_period. | `number` | `2` | no |
@@ -30,6 +30,12 @@
     | <a name="input_log_level"></a> [log\_level](#input\_log\_level) | The log level to be used in lambda functions within the component. Any log with a lower severity than the configured value will not be logged: https://docs.python.org/3/library/logging.html#levels | `string` | `"INFO"` | no |
     | <a name="input_log_retention_in_days"></a> [log\_retention\_in\_days](#input\_log\_retention\_in\_days) | The retention period in days for the Cloudwatch Logs events to be retained, default of 0 is indefinite | `number` | `0` | no |
     | <a name="input_message_root_uri"></a> [message\_root\_uri](#input\_message\_root\_uri) | The root URI used for constructing message links in callback payloads | `string` | n/a | yes |
+    | <a name="input_mtls_cert_secret_arn"></a> [mtls\_cert\_secret\_arn](#input\_mtls\_cert\_secret\_arn) | Secrets Manager ARN for the shared mTLS client certificate (production) | `string` | `""` | no |
+    | <a name="input_mtls_mock_server_cert_s3_key"></a> [mtls\_mock\_server\_cert\_s3\_key](#input\_mtls\_mock\_server\_cert\_s3\_key) | S3 key for the mock webhook server certificate PEM (signed by the test CA) | `string` | `""` | no |
+    | <a name="input_mtls_mock_server_key_s3_key"></a> [mtls\_mock\_server\_key\_s3\_key](#input\_mtls\_mock\_server\_key\_s3\_key) | S3 key for the mock webhook server private key PEM | `string` | `""` | no |
+    | <a name="input_mtls_test_ca_s3_key"></a> [mtls\_test\_ca\_s3\_key](#input\_mtls\_test\_ca\_s3\_key) | S3 key for the test CA certificate PEM bundle used for server verification and the mock webhook server cert chain | `string` | `""` | no |
+    | <a name="input_mtls_test_cert_s3_key"></a> [mtls\_test\_cert\_s3\_key](#input\_mtls\_test\_cert\_s3\_key) | S3 key for the test mTLS client certificate bundle (non-production) | `string` | `""` | no |
+    | <a name="input_mtls_test_certs_s3_bucket"></a> [mtls\_test\_certs\_s3\_bucket](#input\_mtls\_test\_certs\_s3\_bucket) | S3 bucket containing test mTLS certificate material (non-production) | `string` | `""` | no |
     | <a name="input_parent_acct_environment"></a> [parent\_acct\_environment](#input\_parent\_acct\_environment) | Name of the environment responsible for the acct resources used, affects things like DNS zone. Useful for named dev environments | `string` | `"main"` | no |
     | <a name="input_pipe_event_patterns"></a> [pipe\_event\_patterns](#input\_pipe\_event\_patterns) | value | `list(string)` | `[]` | no |
     | <a name="input_pipe_log_level"></a> [pipe\_log\_level](#input\_pipe\_log\_level) | Log level for the EventBridge Pipe. | `string` | `"ERROR"` | no |
@@ -45,7 +51,7 @@
     | Name | Source | Version |
     |------|--------|---------|
     | <a name="module_client_config_bucket"></a> [client\_config\_bucket](#module\_client\_config\_bucket) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.7/terraform-s3bucket.zip | n/a |
-    | <a name="module_client_destination"></a> [client\_destination](#module\_client\_destination) | ../../modules/client-destination | n/a |
+    | <a name="module_client_delivery"></a> [client\_delivery](#module\_client\_delivery) | ../../modules/client-delivery | n/a |
     | <a name="module_client_transform_filter_lambda"></a> [client\_transform\_filter\_lambda](#module\_client\_transform\_filter\_lambda) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.7/terraform-lambda.zip | n/a |
     | <a name="module_kms"></a> [kms](#module\_kms) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.7/terraform-kms.zip | n/a |
     | <a name="module_mock_webhook_lambda"></a> [mock\_webhook\_lambda](#module\_mock\_webhook\_lambda) | https://github.com/NHSDigital/nhs-notify-shared-modules/releases/download/3.0.7/terraform-lambda.zip | n/a |

@@ -2,3 +2,9 @@ resource "aws_cloudwatch_event_bus" "main" {
       name               = local.csi
       kms_key_identifier = module.kms.key_arn
     }
+    resource "aws_cloudwatch_event_archive" "main" {
+      name             = "${local.csi}-archive"
+      event_source_arn = aws_cloudwatch_event_bus.main.arn
+      retention_days   = 7
+    }

@@ -0,0 +1,178 @@
+    resource "aws_elasticache_serverless_cache" "delivery_state" {
+      name                 = "${local.csi}-delivery-state"
+      engine               = "redis"
+      major_engine_version = "7"
+      description          = "Per-target rate limiting and circuit breaker state for callback delivery"
+      snapshot_retention_limit = 0
+      security_group_ids = [aws_security_group.elasticache_delivery_state.id]
+      subnet_ids         = local.acct.private_subnet_ids
+      kms_key_id = module.kms.key_arn
+      cache_usage_limits {
+        data_storage {
+          maximum = 1
+          unit    = "GB"
+        }
+        ecpu_per_second {
+          maximum = 1000
+        }
+      }
+      tags = merge(
+        local.default_tags,
+        {
+          Name        = "${local.csi}-delivery-state"
+          Description = "Callback delivery rate limiter and circuit breaker state"
+        },
+      )
+    }
+    resource "aws_security_group" "elasticache_delivery_state" {
+      name        = "${local.csi}-elasticache-delivery-state"
+      description = "Security group for ElastiCache delivery state cluster"
+      vpc_id      = local.acct.vpc_id
+      tags = merge(
+        local.default_tags,
+        {
+          Name = "${local.csi}-elasticache-delivery-state"
+        },
+      )
+    }
+    resource "aws_vpc_security_group_ingress_rule" "elasticache_from_lambda" {
+      security_group_id            = aws_security_group.elasticache_delivery_state.id
+      referenced_security_group_id = aws_security_group.https_client_lambda.id
+      from_port                    = 6379
+      to_port                      = 6379
+      ip_protocol                  = "tcp"
+      description                  = "Allow HTTPS Client Lambda to connect to ElastiCache"
+      tags = local.default_tags
+    }
+    resource "aws_security_group" "https_client_lambda" {
+      name        = "${local.csi}-https-client-lambda"
+      description = "Security group for per-client HTTPS Client Lambda functions"
+      vpc_id      = local.acct.vpc_id
+      tags = merge(
+        local.default_tags,
+        {
+          Name = "${local.csi}-https-client-lambda"
+        },
+      )
+    }
+    resource "aws_vpc_security_group_egress_rule" "lambda_to_elasticache" {
+      security_group_id            = aws_security_group.https_client_lambda.id
+      referenced_security_group_id = aws_security_group.elasticache_delivery_state.id
+      from_port                    = 6379
+      to_port                      = 6379
+      ip_protocol                  = "tcp"
+      description                  = "Allow Lambda to connect to ElastiCache"
+      tags = local.default_tags
+    }
+    resource "aws_vpc_security_group_egress_rule" "lambda_to_https" {
+      security_group_id = aws_security_group.https_client_lambda.id
+      cidr_ipv4         = "0.0.0.0/0"
+      from_port         = 443
+      to_port           = 443
+      ip_protocol       = "tcp"
+      description       = "Allow Lambda outbound HTTPS for webhook delivery"
+      tags = local.default_tags
+    }
+    resource "aws_cloudwatch_metric_alarm" "elasticache_ecpu_utilisation" {
+      alarm_name = "${local.csi}-elasticache-ecpu-utilisation"
+      alarm_description = join(" ", [
+        "PERFORMANCE: ElastiCache processing units utilisation is high.",
+        "Consider scaling up or optimising Redis commands.",
+      ])
+      comparison_operator = "GreaterThanThreshold"
+      evaluation_periods  = 3
+      metric_name         = "ElastiCacheProcessingUnits"
+      namespace           = "AWS/ElastiCache"
+      period              = 300
+      statistic           = "Average"
+      threshold           = 80
+      actions_enabled     = true
+      treat_missing_data  = "notBreaching"
+      dimensions = {
+        CacheClusterId = aws_elasticache_serverless_cache.delivery_state.name
+      }
+      tags = merge(
+        local.default_tags,
+        {
+          Name = "${local.csi}-elasticache-ecpu-utilisation"
+        },
+      )
+    }
+    resource "aws_cloudwatch_metric_alarm" "elasticache_connections" {
+      alarm_name = "${local.csi}-elasticache-connections"
+      alarm_description = join(" ", [
+        "RELIABILITY: ElastiCache connection count is high.",
+        "Review per-client Lambda connection pool sizing.",
+      ])
+      comparison_operator = "GreaterThanThreshold"
+      evaluation_periods  = 2
+      metric_name         = "CurrConnections"
+      namespace           = "AWS/ElastiCache"
+      period              = 300
+      statistic           = "Maximum"
+      threshold           = 500
+      actions_enabled     = true
+      treat_missing_data  = "notBreaching"
+      dimensions = {
+        CacheClusterId = aws_elasticache_serverless_cache.delivery_state.name
+      }
+      tags = merge(
+        local.default_tags,
+        {
+          Name = "${local.csi}-elasticache-connections"
+        },
+      )
+    }
+    resource "aws_cloudwatch_metric_alarm" "elasticache_throttled_ops" {
+      alarm_name = "${local.csi}-elasticache-throttled-ops"
+      alarm_description = join(" ", [
+        "PERFORMANCE: ElastiCache throttled operations detected.",
+        "Increase ECPU limit or reduce request rate.",
+      ])
+      comparison_operator = "GreaterThanThreshold"
+      evaluation_periods  = 2
+      metric_name         = "ThrottledCmds"
+      namespace           = "AWS/ElastiCache"
+      period              = 300
+      statistic           = "Sum"
+      threshold           = 0
+      actions_enabled     = true
+      treat_missing_data  = "notBreaching"
+      dimensions = {
+        CacheClusterId = aws_elasticache_serverless_cache.delivery_state.name
+      }
+      tags = merge(
+        local.default_tags,
+        {
+          Name = "${local.csi}-elasticache-throttled-ops"
+        },
+      )
+    }

@@ -20,47 +20,37 @@ locals {
           targets = [
             for target in try(client.targets, []) :
             merge(target, {
-              invocationEndpoint = "${aws_lambda_function_url.mock_webhook[0].function_url}${target.targetId}"
+              invocationEndpoint = try(target.mtls.enabled, false) ? "https://${aws_lb.mock_webhook_mtls[0].dns_name}/${target.targetId}" : "${aws_lambda_function_url.mock_webhook[0].function_url}${target.targetId}"
               apiKey             = merge(target.apiKey, { headerValue = random_password.mock_webhook_api_key[0].result })
             })
           ]
         })
       } : local.config_clients
-      config_targets = merge([
-        for client_id, data in local.config_clients : {
-          for target in try(data.targets, []) : target.targetId => {
-            client_id                        = client_id
-            target_id                        = target.targetId
-            invocation_endpoint              = var.deploy_mock_clients ? "${aws_lambda_function_url.mock_webhook[0].function_url}${target.targetId}" : target.invocationEndpoint
-            invocation_rate_limit_per_second = target.invocationRateLimit
-            http_method                      = target.invocationMethod
-            header_name                      = target.apiKey.headerName
-            header_value                     = var.deploy_mock_clients ? random_password.mock_webhook_api_key[0].result : target.apiKey.headerValue
-          }
-        }
-      ]...)
-      config_subscriptions = merge([
-        for client_id, data in local.config_clients : {
-          for subscription in try(data.subscriptions, []) : subscription.subscriptionId => {
-            client_id       = client_id
+      client_subscriptions = {
+        for client_id, data in local.config_clients :
+        client_id => {
+          for subscription in try(data.subscriptions, []) :
+          subscription.subscriptionId => {
             subscription_id = subscription.subscriptionId
             target_ids      = try(subscription.targetIds, [])
           }
         }
-      ]...)
-      subscription_targets = merge([
-        for subscription_id, subscription in local.config_subscriptions : {
-          for target_id in subscription.target_ids :
-          "${subscription_id}-${target_id}" => {
-            subscription_id = subscription_id
-            target_id       = target_id
+      }
+      client_subscription_targets = {
+        for client_id, data in local.config_clients :
+        client_id => merge([
+          for subscription in try(data.subscriptions, []) : {
+            for target_id in try(subscription.targetIds, []) :
+            "${subscription.subscriptionId}-${target_id}" => {
+              subscription_id = subscription.subscriptionId
+              target_id       = target_id
+            }
           }
-        }
-      ]...)
+        ]...)
+      }
       applications_map_parameter_name = coalesce(var.applications_map_parameter_name, "/${var.project}/${var.environment}/${var.component}/applications-map")
     }

@@ -0,0 +1,47 @@
+    module "client_delivery" {
+      source   = "../../modules/client-delivery"
+      for_each = local.config_clients
+      project        = var.project
+      aws_account_id = var.aws_account_id
+      region         = var.region
+      component      = var.component
+      environment    = var.environment
+      group          = var.group
+      client_id       = each.key
+      client_bus_name = aws_cloudwatch_event_bus.main.name
+      kms_key_arn     = module.kms.key_arn
+      subscriptions        = local.client_subscriptions[each.key]
+      subscription_targets = local.client_subscription_targets[each.key]
+      client_config_bucket     = module.client_config_bucket.bucket
+      client_config_bucket_arn = module.client_config_bucket.arn
+      applications_map_parameter_name = local.applications_map_parameter_name
+      lambda_s3_bucket      = local.acct.s3_buckets["lambda_function_artefacts"]["id"]
+      lambda_code_base_path = local.aws_lambda_functions_dir_path
+      force_lambda_code_deploy = var.force_lambda_code_deploy
+      log_level                = var.log_level
+      log_retention_in_days    = var.log_retention_in_days
+      enable_xray_tracing      = var.enable_xray_tracing
+      log_destination_arn       = local.log_destination_arn
+      log_subscription_role_arn = local.acct.log_subscription_role_arn
+      elasticache_endpoint     = aws_elasticache_serverless_cache.delivery_state.endpoint[0].address
+      elasticache_cache_name   = aws_elasticache_serverless_cache.delivery_state.name
+      elasticache_iam_username = "${var.project}-${var.environment}-${var.component}-elasticache-user"
+      mtls_cert_secret_arn     = var.mtls_cert_secret_arn
+      mtls_test_cert_s3_bucket = var.mtls_test_certs_s3_bucket
+      mtls_test_cert_s3_key    = var.mtls_test_cert_s3_key
+      vpc_subnet_ids           = local.acct.private_subnet_ids
+      lambda_security_group_id = aws_security_group.https_client_lambda.id
+      deploy_mock_clients = var.deploy_mock_clients
+    }
