aws-cdk/hotswap: ecs hotswap not updating image in new task definition #27343

Closed
akbisw opened this issue Sep 28, 2023 · 6 comments · Fixed by #27358
Labels
@aws-cdk/aws-ecs Related to Amazon Elastic Container bug This issue is a bug. p1

akbisw commented Sep 28, 2023

Describe the bug

ECS hotswap is supposed to push a new image on change, create a new task definition, and update the ECS service with that new task definition.

It is doing all of that except updating the new image ARN in the new task definition. Hotswap is effectively swapping in old code.

Expected Behavior

When there is a code change, I expect a new image to be built and uploaded to ECR, and the task definition to be updated with that new image.

Current Behavior

The hotswap command completes without errors:

cdk deploy web-api-service-dev --hotswap -e -vvv --force

[14:38:33] CDK toolkit version: 2.99.0 (build 0aa1096)
[14:38:33] Command line arguments: {
  _: [ 'deploy' ],
  hotswap: true,
  e: true,
  exclusively: true,
  v: 3,
  verbose: 3,
  force: true,
  f: true,
  app: 'development/dev.py',
  a: 'development/dev.py',
  lookups: true,
  'ignore-errors': false,
  ignoreErrors: false,
  json: false,
  j: false,
  debug: false,
  ec2creds: undefined,
  i: undefined,
  'version-reporting': undefined,
  versionReporting: undefined,
  'path-metadata': undefined,
  pathMetadata: undefined,
  'asset-metadata': undefined,
  assetMetadata: undefined,
  'role-arn': undefined,
  r: undefined,
  roleArn: undefined,
  staging: true,
  'no-color': false,
  noColor: false,
  ci: false,
  all: false,
  'build-exclude': [],
  E: [],
  buildExclude: [],
  parameters: [ {} ],
  'previous-parameters': true,
  previousParameters: true,
  logs: true,
  concurrency: 1,
  'asset-prebuild': true,
  assetPrebuild: true,
  '$0': '/opt/homebrew/bin/cdk',
  STACKS: [ 'web-api-service-dev' ],
  'S-t-a-c-k-s': [ 'web-api-service-dev' ]
}
[14:38:33] cdk.json: {
  "app": "python3 app.py",
  "watch": {
    "include": [
      "../cognito/**",
      "../src/**"
    ],
    "exclude": [
      "README.md",
      "cdk*.json",
      "../src/**/__pycache__",
      "../src/**/tests",
      "../src/**/.mypy_cache",
      "../src/**/.venv"
    ]
  }
}
[14:38:33] cdk.context.json: {
  ******
}
[14:38:33] merged settings: {
  versionReporting: true,
  assetMetadata: true,
  pathMetadata: true,
  output: 'cdk.out',
  app: 'development/dev.py',
  watch: {
    include: [ '../cognito/**', '../src/**' ],
    exclude: [
      'README.md',
      'cdk*.json',
      '../src/**/__pycache__',
      '../src/**/tests',
      '../src/**/.mypy_cache',
      '../src/**/.venv'
    ]
  },
  context: {},
  debug: false,
  toolkitBucket: {},
  staging: true,
  bundlingStacks: [ 'web-api-service-dev' ],
  lookups: true,
  assetPrebuild: true
}
[14:38:33] [trace] SdkProvider#withAwsCliCompatibleDefaults()
[14:38:33] Determining if we're on an EC2 instance.
[14:38:33] Does not look like an EC2 instance.
[14:38:33] Reading cached notices from /Users/ab/.cdk/cache/notices.json
[14:38:33] Unable to determine AWS region from environment or AWS configuration (profile: "default"), defaulting to 'us-east-1'
[14:38:33] Toolkit stack: CDKToolkit
[14:38:33] Setting "CDK_DEFAULT_REGION" environment variable to us-east-1
[14:38:33] [trace] SdkProvider#defaultAccount()
[14:38:33] [trace]   SdkProvider#defaultCredentials()
[14:38:33] Resolving default credentials
[14:38:34] [trace]   SDK#currentAccount()
[14:38:34] [trace]     SDK#forceCredentialRetrieval()
[14:38:34] Looking up default account ID from STS
[14:38:34] [AWS sts 200 0.193s 0 retries] getCallerIdentity({})
[14:38:34] Default account ID: ***************
[14:38:34] Setting "CDK_DEFAULT_ACCOUNT" environment variable to ***************
[14:38:34] context: {********}
[14:38:34] outdir: cdk.out
[14:38:34] env: {
  CDK_DEFAULT_REGION: 'us-east-1',
  CDK_DEFAULT_ACCOUNT: '***************',
  CDK_OUTDIR: 'cdk.out',
  CDK_CLI_ASM_VERSION: '34.0.0',
  CDK_CLI_VERSION: '2.99.0'
}

✨  Synthesis time: 6.49s

⚠️ The --hotswap and --hotswap-fallback flags deliberately introduce CloudFormation drift to speed up deployments
⚠️ They should only be used for development - never use them for your production Stacks!

[14:38:40] [trace] SdkProvider#resolveEnvironment()
[14:38:40] [trace] SdkProvider#baseCredentialsPartition()
[14:38:40] [trace]   SdkProvider#resolveEnvironment()
[14:38:40] [trace]   SdkProvider#obtainBaseCredentials()
[14:38:40] [trace]     SdkProvider#defaultAccount()
[14:38:40] [trace]     SdkProvider#defaultCredentials()
[14:38:40] [trace]   SDK#currentAccount()
[14:38:40] [trace]     SDK#forceCredentialRetrieval()
[14:38:40] Retrieved account ID *************** from disk cache
[14:38:40] [trace] SdkProvider#forEnvironment()
[14:38:40] [trace]   SdkProvider#resolveEnvironment()
[14:38:40] [trace]   SdkProvider#obtainBaseCredentials()
[14:38:40] [trace]     SdkProvider#defaultAccount()
[14:38:40] [trace]     SdkProvider#defaultCredentials()
[14:38:40] [trace]   SdkProvider#withAssumedRole()
[14:38:40] Assuming role 'arn:aws:iam::***************:role/cdk-hnb659fds-deploy-role-***************-us-east-2'.
[14:38:40] [trace]   SDK#forceCredentialRetrieval()
[14:38:40] [trace] SDK#ssm()
[14:38:40] [trace]   SDK#wrapServiceErrorHandling()
[14:38:40] [AWS ssm 200 0.124s 0 retries] getParameter({ Name: '/cdk-bootstrap/hnb659fds/version' })
web-api-service-dev:  start: Building e6506d3c290d4f96e4eefd2b0cee18f0e9ea7d985a8be166ade2212318343f30:***************-us-east-2
web-api-service-dev:  success: Built e6506d3c290d4f96e4eefd2b0cee18f0e9ea7d985a8be166ade2212318343f30:***************-us-east-2
[14:38:40] [trace] SdkProvider#resolveEnvironment()
[14:38:40] [trace]   SdkProvider#resolveEnvironment()
[14:38:40] [trace] SdkProvider#baseCredentialsPartition()
[14:38:40] [trace]   SdkProvider#resolveEnvironment()
[14:38:40] [trace]     SdkProvider#baseCredentialsPartition()
[14:38:40] [trace]       SdkProvider#resolveEnvironment()
[14:38:40] [trace]     SdkProvider#obtainBaseCredentials()
[14:38:40] [trace]       SdkProvider#defaultAccount()
[14:38:40] [trace]         SdkProvider#obtainBaseCredentials()
[14:38:40] [trace]           SdkProvider#defaultAccount()
[14:38:40] [trace]         SdkProvider#defaultCredentials()
[14:38:40] [trace]           SdkProvider#defaultCredentials()
[14:38:40] [trace]     SDK#currentAccount()
[14:38:40] [trace]       SDK#forceCredentialRetrieval()
[14:38:40] [trace]         SDK#currentAccount()
[14:38:40] [trace]           SDK#forceCredentialRetrieval()
[14:38:40] Retrieved account ID *************** from disk cache
web-api-service-dev:  start: Building b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd:***************-us-east-2
[14:38:40] [trace]     SdkProvider#baseCredentialsPartition()
[14:38:40] [trace]       SdkProvider#resolveEnvironment()
[14:38:40] [trace]       SdkProvider#obtainBaseCredentials()
[14:38:40] [trace]         SdkProvider#defaultAccount()
[14:38:40] [trace]         SdkProvider#defaultCredentials()
[14:38:40] [trace]       SDK#currentAccount()
[14:38:40] [trace]         SDK#forceCredentialRetrieval()
[14:38:40] Retrieved account ID *************** from disk cache
web-api-service-dev:  start: Publishing e6506d3c290d4f96e4eefd2b0cee18f0e9ea7d985a8be166ade2212318343f30:***************-us-east-2
[14:38:40] [trace]     SdkProvider#baseCredentialsPartition()
[14:38:40] [trace]       SdkProvider#resolveEnvironment()
[14:38:40] [trace]       SdkProvider#obtainBaseCredentials()
[14:38:40] [trace]         SdkProvider#defaultAccount()
[14:38:40] [trace]         SdkProvider#defaultCredentials()
[14:38:40] [trace]       SDK#currentAccount()
[14:38:40] [trace]         SDK#forceCredentialRetrieval()
[14:38:40] Retrieved account ID *************** from disk cache
[14:38:40] [trace]     SdkProvider#forEnvironment()
[14:38:40] [trace]       SdkProvider#resolveEnvironment()
[14:38:40] [trace]       SdkProvider#obtainBaseCredentials()
[14:38:40] [trace]         SdkProvider#defaultAccount()
[14:38:40] [trace]         SdkProvider#defaultCredentials()
[14:38:40] [trace]       SdkProvider#withAssumedRole()
[14:38:40] Assuming role 'arn:aws:iam::***************:role/cdk-hnb659fds-image-publishing-role-***************-us-east-2'.
[14:38:40] [trace]       SDK#forceCredentialRetrieval()
[14:38:40] Retrieved account ID *************** from disk cache
[14:38:40] [trace]     SdkProvider#forEnvironment()
[14:38:40] [trace]       SdkProvider#resolveEnvironment()
[14:38:40] [trace]       SdkProvider#obtainBaseCredentials()
[14:38:40] [trace]         SdkProvider#defaultAccount()
[14:38:40] [trace]         SdkProvider#defaultCredentials()
[14:38:40] [trace]       SdkProvider#withAssumedRole()
[14:38:40] Assuming role 'arn:aws:iam::***************:role/cdk-hnb659fds-file-publishing-role-***************-us-east-2'.
[14:38:40] [trace]       SDK#forceCredentialRetrieval()
[14:38:40] [trace]     SDK#ecr()
[14:38:40] [trace]       SDK#wrapServiceErrorHandling()
[14:38:40] [trace] SDK#s3()
[14:38:40] [trace]   SDK#wrapServiceErrorHandling()
[14:38:40] web-api-service-dev:  check: Check s3://cdk-hnb659fds-assets-***************-us-east-2/e6506d3c290d4f96e4eefd2b0cee18f0e9ea7d985a8be166ade2212318343f30.json
[14:38:40] [AWS s3 200 0.148s 0 retries] getBucketLocation({ Bucket: 'cdk-hnb659fds-assets-***************-us-east-2' })
[14:38:40] [AWS ecr 200 0.174s 0 retries] describeRepositories({
  repositoryNames: [
    'cdk-hnb659fds-container-assets-***************-us-east-2',
    [length]: 1
  ]
})
[14:38:40] web-api-service-dev:  check: Check ***************.dkr.ecr.us-east-2.amazonaws.com/cdk-hnb659fds-container-assets-***************-us-east-2:b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd
[14:38:40] [AWS s3 200 0.151s 0 retries] listObjectsV2({
  Bucket: 'cdk-hnb659fds-assets-***************-us-east-2',
  Prefix: 'e6506d3c290d4f96e4eefd2b0cee18f0e9ea7d985a8be166ade2212318343f30.json',
  MaxKeys: 1
})
[14:38:40] web-api-service-dev:  found: Found s3://cdk-hnb659fds-assets-***************-us-east-2/e6506d3c290d4f96e4eefd2b0cee18f0e9ea7d985a8be166ade2212318343f30.json
web-api-service-dev:  success: Published e6506d3c290d4f96e4eefd2b0cee18f0e9ea7d985a8be166ade2212318343f30:***************-us-east-2
[14:38:41] [AWS ecr 200 0.169s 0 retries] describeImages({
  repositoryName: 'cdk-hnb659fds-container-assets-***************-us-east-2',
  imageIds: [
    {
      imageTag: 'b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd'
    },
    [length]: 1
  ]
})
[14:38:41] web-api-service-dev:  found: Found ***************.dkr.ecr.us-east-2.amazonaws.com/cdk-hnb659fds-container-assets-***************-us-east-2:b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd
web-api-service-dev:  success: Built b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd:***************-us-east-2
[14:38:41] [trace] SdkProvider#resolveEnvironment()
[14:38:41] [trace] SdkProvider#baseCredentialsPartition()
[14:38:41] [trace]   SdkProvider#resolveEnvironment()
[14:38:41] [trace]   SdkProvider#obtainBaseCredentials()
[14:38:41] [trace]     SdkProvider#defaultAccount()
[14:38:41] [trace]     SdkProvider#defaultCredentials()
[14:38:41] [trace]   SDK#currentAccount()
[14:38:41] [trace]     SDK#forceCredentialRetrieval()
[14:38:41] Retrieved account ID *************** from disk cache
web-api-service-dev:  start: Publishing b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd:***************-us-east-2
web-api-service-dev:  success: Published b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd:***************-us-east-2
[14:38:41] Reading existing template for stack web-api-service-dev (web-api-service-dev).
[14:38:41] [trace] SdkProvider#resolveEnvironment()
[14:38:41] [trace] SdkProvider#baseCredentialsPartition()
[14:38:41] [trace]   SdkProvider#resolveEnvironment()
[14:38:41] [trace]   SdkProvider#obtainBaseCredentials()
[14:38:41] [trace]     SdkProvider#defaultAccount()
[14:38:41] [trace]     SdkProvider#defaultCredentials()
[14:38:41] [trace]   SDK#currentAccount()
[14:38:41] [trace]     SDK#forceCredentialRetrieval()
[14:38:41] Retrieved account ID *************** from disk cache
[14:38:41] [trace] SdkProvider#forEnvironment()
[14:38:41] [trace]   SdkProvider#resolveEnvironment()
[14:38:41] [trace]   SdkProvider#obtainBaseCredentials()
[14:38:41] [trace]     SdkProvider#defaultAccount()
[14:38:41] [trace]     SdkProvider#defaultCredentials()
[14:38:41] [trace]   SdkProvider#withAssumedRole()
[14:38:41] Assuming role 'arn:aws:iam::***************:role/cdk-hnb659fds-lookup-role-***************-us-east-2'.
[14:38:41] [trace]   SDK#forceCredentialRetrieval()
[14:38:41] [trace] SDK#cloudFormation()
[14:38:41] [trace]   SDK#wrapServiceErrorHandling()
[14:38:41] [AWS cloudformation 200 0.167s 0 retries] describeStacks({ StackName: 'web-api-service-dev' })
[14:38:41] [AWS cloudformation 200 0.258s 0 retries] getTemplate({ StackName: 'web-api-service-dev', TemplateStage: 'Original' })
web-api-service-dev (web-api-service-dev): deploying... [1/1]
[14:38:41] [trace] SdkProvider#resolveEnvironment()
[14:38:41] [trace] SdkProvider#baseCredentialsPartition()
[14:38:41] [trace]   SdkProvider#resolveEnvironment()
[14:38:41] [trace]   SdkProvider#obtainBaseCredentials()
[14:38:41] [trace]     SdkProvider#defaultAccount()
[14:38:41] [trace]     SdkProvider#defaultCredentials()
[14:38:41] [trace]   SDK#currentAccount()
[14:38:41] [trace]     SDK#forceCredentialRetrieval()
[14:38:41] Retrieved account ID *************** from disk cache
[14:38:41] [trace] SDK#appendCustomUserAgent()
[14:38:41] [trace] SDK#cloudFormation()
[14:38:41] [trace]   SDK#wrapServiceErrorHandling()
[14:38:42] [AWS cloudformation 200 0.145s 0 retries] describeStacks({ StackName: 'web-api-service-dev' })
[14:38:42] web-api-service-dev: checking if we can skip deploy
[14:38:42] web-api-service-dev: forced deployment
[14:38:42] web-api-service-dev: deploying...
[14:38:42] [trace] SDK#getEndpointSuffix()
[14:38:42] [trace] SdkProvider#resolveEnvironment()
[14:38:42] [trace] SdkProvider#forEnvironment()
[14:38:42] [trace]   SdkProvider#resolveEnvironment()
[14:38:42] [trace]   SdkProvider#obtainBaseCredentials()
[14:38:42] [trace]     SdkProvider#defaultAccount()
[14:38:42] [trace]     SdkProvider#defaultCredentials()
[14:38:42] [trace]   SDK#validateCredentials()
[14:38:42] [AWS sts 200 0.115s 0 retries] getCallerIdentity({})
[14:38:42] [trace] SDK#currentAccount()
[14:38:42] [trace]   SDK#forceCredentialRetrieval()
[14:38:42] Retrieved account ID *************** from disk cache
[14:38:42] [trace] SDK#cloudFormation()
[14:38:42] [trace]   SDK#wrapServiceErrorHandling()
[14:38:42] [AWS cloudformation 200 0.147s 0 retries] describeStacks({ StackName: 'web-api-service-dev' })
[14:38:42] [AWS cloudformation 200 0.304s 0 retries] getTemplate({ StackName: 'web-api-service-dev', TemplateStage: 'Original' })
[14:38:42] [trace] SDK#cloudFormation()
[14:38:42] [trace]   SDK#wrapServiceErrorHandling()
[14:38:43] [AWS cloudformation 200 0.221s 0 retries] listStackResources({ StackName: 'web-api-service-dev', NextToken: undefined })
[14:38:43] [trace] SDK#getEndpointSuffix()

✨ hotswapping resources:
[14:38:43] [trace] SDK#appendCustomUserAgent()
   ✨ ECS Task Definition 'web-api-service-family-dev'
   ✨ ECS Service 'web-api'
[14:38:43] [trace] SDK#ecs()
[14:38:43] [trace]   SDK#wrapServiceErrorHandling()
[14:38:43] [AWS ecs 200 0.153s 0 retries] describeTaskDefinition({
  taskDefinition: 'arn:aws:ecs:us-east-2:***************:task-definition/web-api-service-family-dev:722',
  include: [ 'TAGS', [length]: 1 ]
})
[14:38:43] [trace] SDK#ecs()
[14:38:43] [trace]   SDK#wrapServiceErrorHandling()
[14:38:43] [AWS ecs 200 0.226s 0 retries] registerTaskDefinition({
  containerDefinitions: [
    {
      name: 'datadog-agent',
      image: '***************.dkr.ecr.us-east-2.amazonaws.com/cdk-hnb659fds-container-assets-***************-us-east-2:b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd',
      cpu: 512,
      memoryReservation: 512,
      links: [ [length]: 0 ],
      portMappings: [
        { containerPort: 8126, hostPort: 8126, protocol: 'tcp' },
        [length]: 1
      ],
      essential: false,
      entryPoint: [ [length]: 0 ],
      command: [ [length]: 0 ],
      environment: [
        { name: 'ECS_FARGATE', value: 'true' },
        { name: 'DD_APM_ENABLED', value: 'true' },
        {
          name: 'DD_AC_EXCLUDE',
          value: 'name:datadog-agent name:ecs-agent'
        },
        { name: 'DD_REMOTE_CONFIGURATION_ENABLED', value: 'true' },
        [length]: 4
      ],
      environmentFiles: [ [length]: 0 ],
      mountPoints: [ [length]: 0 ],
      volumesFrom: [ [length]: 0 ],
      secrets: [
        {
          name: 'DD_API_KEY',
          valueFrom: 'arn:aws:secretsmanager:us-east-2:***************:secret:DdApiKeySecret-VbX7JsCRRXlz-5SBQWo'
        },
        [length]: 1
      ],
      dnsServers: [ [length]: 0 ],
      dnsSearchDomains: [ [length]: 0 ],
      extraHosts: [ [length]: 0 ],
      dockerSecurityOptions: [ [length]: 0 ],
      dockerLabels: { 'org.opencontainers.image.revision': 'v6.9.0' },
      ulimits: [ [length]: 0 ],
      logConfiguration: {
        logDriver: 'awsfirelens',
        options: {
          Host: 'http-intake.logs.datadoghq.com',
          Name: 'datadog',
          TLS: 'on',
          dd_service: 'web-api-agent',
          dd_source: 'ecs',
          dd_tags: 'env:dev',
          provider: 'ecs'
        },
        secretOptions: [
          {
            name: 'apikey',
            valueFrom: 'arn:aws:secretsmanager:us-east-2:***************:secret:DdApiKeySecret-VbX7JsCRRXlz-5SBQWo'
          },
          [length]: 1
        ]
      },
      healthCheck: {
        command: [ 'CMD-SHELL', 'agent health', [length]: 2 ],
        interval: 30,
        timeout: 5,
        retries: 2,
        startPeriod: 15
      },
      systemControls: [ [length]: 0 ]
    },
    {
      name: 'log_router',
      image: 'amazon/aws-for-fluent-bit:stable',
      cpu: 256,
      memoryReservation: 256,
      links: [ [length]: 0 ],
      portMappings: [ [length]: 0 ],
      essential: false,
      entryPoint: [ [length]: 0 ],
      command: [ [length]: 0 ],
      environment: [ [length]: 0 ],
      environmentFiles: [ [length]: 0 ],
      mountPoints: [ [length]: 0 ],
      volumesFrom: [ [length]: 0 ],
      secrets: [ [length]: 0 ],
      user: '0',
      dnsServers: [ [length]: 0 ],
      dnsSearchDomains: [ [length]: 0 ],
      extraHosts: [ [length]: 0 ],
      dockerSecurityOptions: [ [length]: 0 ],
      dockerLabels: {},
      ulimits: [ [length]: 0 ],
      systemControls: [ [length]: 0 ],
      firelensConfiguration: {
        type: 'fluentbit',
        options: {
          'config-file-type': 'file',
          'config-file-value': '/fluent-bit/configs/parse-json.conf',
          'enable-ecs-log-metadata': 'true'
        }
      }
    },
    {
      name: 'web-api',
      image: '***************.dkr.ecr.us-east-2.amazonaws.com/cdk-hnb659fds-container-assets-***************-us-east-2:8a2e97dddc47d3dc5151a4d29b812f2f5e5fb0921283f821cad6cf58086a3a73',
      cpu: 0,
      links: [ [length]: 0 ],
      portMappings: [
        { containerPort: 80, hostPort: 80, protocol: 'tcp' },
        [length]: 1
      ],
      essential: true,
      entryPoint: [ [length]: 0 ],
      command: [ './webapi_entrypoint.sh', [length]: 1 ],
      environment: [
        *********
      ]
      environmentFiles: [ [length]: 0 ],
      mountPoints: [ [length]: 0 ],
      volumesFrom: [ [length]: 0 ],
      linuxParameters: {
        capabilities: { add: [ [length]: 0 ], drop: [ [length]: 0 ] },
        devices: [ [length]: 0 ],
        initProcessEnabled: true,
        tmpfs: [ [length]: 0 ]
      },
      secrets: [ [length]: 0 ],
      dnsServers: [ [length]: 0 ],
      dnsSearchDomains: [ [length]: 0 ],
      extraHosts: [ [length]: 0 ],
      dockerSecurityOptions: [ [length]: 0 ],
      dockerLabels: {
        'com.datadoghq.ad.check_names': '["web-api"]',
        'com.datadoghq.ad.init_configs': '[{}]',
        'com.datadoghq.ad.instances': '[{"host": "%%host%%", "port": 80}]',
        'com.datadoghq.tags.env': 'dev',
        'com.datadoghq.tags.service': 'web-api',
        'org.opencontainers.image.revision': 'v6.9.2',
      },
      ulimits: [ [length]: 0 ],
      logConfiguration: {
        logDriver: 'awsfirelens',
        options: {
          Host: 'http-intake.logs.datadoghq.com',
          Name: 'datadog',
          TLS: 'on',
          dd_service: 'web-api',
          dd_source: 'ecs',
          dd_tags: 'env:dev',
          provider: 'ecs'
        },
        secretOptions: [
          {
            name: 'apikey',
            valueFrom: 'arn:aws:secretsmanager:us-east-2:***************:secret:DdApiKeySecret-VbX7JsCRRXlz-5SBQWo'
          },
          [length]: 1
        ]
      },
      healthCheck: {
        command: [ 'CMD-SHELL', 'exit 0', [length]: 2 ],
        interval: 30,
        timeout: 15,
        retries: 3,
        startPeriod: 60
      },
      systemControls: [ [length]: 0 ]
    },
    [length]: 3
  ],
  family: 'web-api-service-family-dev',
  taskRoleArn: 'arn:aws:iam::***************:role/web-api-service-dev-webapiservicedefinitiondevTask-1ECRP01L7MBS2',
  executionRoleArn: 'arn:aws:iam::***************:role/web-api-service-dev-webapiservicedefinitiondevExec-1B49MK9BMUOJY',
  networkMode: 'awsvpc',
  volumes: [ [length]: 0 ],
  placementConstraints: [ [length]: 0 ],
  runtimePlatform: { cpuArchitecture: 'ARM64', operatingSystemFamily: 'LINUX' },
  requiresCompatibilities: [ 'FARGATE', [length]: 1 ],
  cpu: '2048',
  memory: '8192'
})
[14:38:43] [trace] SDK#ecs()
[14:38:43] [trace]   SDK#wrapServiceErrorHandling()
[14:38:43] [AWS ecs 200 0.399s 0 retries] updateService({
  service: 'arn:aws:ecs:us-east-2:***************:service/cluster-dev/web-api',
  taskDefinition: 'arn:aws:ecs:us-east-2:***************:task-definition/web-api-service-family-dev:729',
  cluster: 'cluster-dev',
  forceNewDeployment: true,
  deploymentConfiguration: { minimumHealthyPercent: 0 }
})
[14:38:43] [trace] SDK#ecs()
[14:38:43] [trace]   SDK#wrapServiceErrorHandling()
[14:38:43] [trace] SDK#ecs()
[14:38:43] [trace]   SDK#wrapServiceErrorHandling()
[14:38:44] [AWS ecs 200 0.176s 0 retries] describeServices({
  cluster: 'cluster-dev',
  services: [
    'arn:aws:ecs:us-east-2:***************:service/cluster-dev/web-api',
    [length]: 1
  ]
})
✨ ECS Task Definition 'web-api-service-family-dev' hotswapped!
✨ ECS Service 'web-api' hotswapped!
[14:38:44] [trace] SDK#removeCustomUserAgent()

 ✅  web-api-service-dev (web-api-service-dev)

✨  Deployment time: 2.12s

✨  Total time: 8.61s

This ECS service has 3 containers:

  1. web-api

  2. datadog-agent

  3. log_router

In the logs we see that the web-api image and the datadog-agent image are being built due to code changes:

[14:38:40] web-api-service-dev:  check: Check s3://cdk-hnb659fds-assets-***************-us-east-2/e6506d3c290d4f96e4eefd2b0cee18f0e9ea7d985a8be166ade2212318343f30.json
[14:38:40] [AWS s3 200 0.148s 0 retries] getBucketLocation({ Bucket: 'cdk-hnb659fds-assets-***************-us-east-2' })
[14:38:40] [AWS ecr 200 0.174s 0 retries] describeRepositories({
  repositoryNames: [
    'cdk-hnb659fds-container-assets-***************-us-east-2',
    [length]: 1
  ]
})
[14:38:40] web-api-service-dev:  check: Check ***************.dkr.ecr.us-east-2.amazonaws.com/cdk-hnb659fds-container-assets-***************-us-east-2:b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd
[14:38:40] [AWS s3 200 0.151s 0 retries] listObjectsV2({
  Bucket: 'cdk-hnb659fds-assets-***************-us-east-2',
  Prefix: 'e6506d3c290d4f96e4eefd2b0cee18f0e9ea7d985a8be166ade2212318343f30.json',
  MaxKeys: 1
})
[14:38:40] web-api-service-dev:  found: Found s3://cdk-hnb659fds-assets-***************-us-east-2/e6506d3c290d4f96e4eefd2b0cee18f0e9ea7d985a8be166ade2212318343f30.json
web-api-service-dev:  success: Published e6506d3c290d4f96e4eefd2b0cee18f0e9ea7d985a8be166ade2212318343f30:***************-us-east-2
[14:38:41] [AWS ecr 200 0.169s 0 retries] describeImages({
  repositoryName: 'cdk-hnb659fds-container-assets-***************-us-east-2',
  imageIds: [
    {
      imageTag: 'b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd'
    },
    [length]: 1
  ]
})
[14:38:41] web-api-service-dev:  found: Found ***************.dkr.ecr.us-east-2.amazonaws.com/cdk-hnb659fds-container-assets-***************-us-east-2:b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd
web-api-service-dev:  success: Built b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd:***************-us-east-2
[14:38:41] [trace] SdkProvider#resolveEnvironment()
[14:38:41] [trace] SdkProvider#baseCredentialsPartition()
[14:38:41] [trace]   SdkProvider#resolveEnvironment()
[14:38:41] [trace]   SdkProvider#obtainBaseCredentials()
[14:38:41] [trace]     SdkProvider#defaultAccount()
[14:38:41] [trace]     SdkProvider#defaultCredentials()
[14:38:41] [trace]   SDK#currentAccount()
[14:38:41] [trace]     SDK#forceCredentialRetrieval()
[14:38:41] Retrieved account ID *************** from disk cache
web-api-service-dev:  start: Publishing b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd:***************-us-east-2
web-api-service-dev:  success: Published b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd:***************-us-east-2

But only the datadog-agent image is updated in the task definition. The web-api image does not change and does not match the one that was just published:

      name: 'web-api',
      image: '***************.dkr.ecr.us-east-2.amazonaws.com/cdk-hnb659fds-container-assets-***************-us-east-2:8a2e97dddc47d3dc5151a4d29b812f2f5e5fb0921283f821cad6cf58086a3a73',
      name: 'datadog-agent',
      image: '***************.dkr.ecr.us-east-2.amazonaws.com/cdk-hnb659fds-container-assets-***************-us-east-2:b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd',

b03bf6f842d1d9783f11fe80fd3d324a3d47f2f4bbc14552fe97cbaf2ff99ccd matches the logs above. 8a2e97dddc47d3dc5151a4d29b812f2f5e5fb0921283f821cad6cf58086a3a73 does not

Note that web-api is marked as the only essential container.
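The comparison above can also be done mechanically. The following is a minimal sketch, assuming the `containerDefinitions` from the `registerTaskDefinition` call have been copied into a plain dict. The helper names `image_tag` and `containers_using_tag` are hypothetical, and the registry name and tags are shortened placeholders rather than the real values from the logs.

```python
def image_tag(image_uri):
    """Return the tag after the last ':' in an image URI, or None if there is none."""
    _repo, sep, tag = image_uri.rpartition(":")
    # A '/' after the last colon means the colon belonged to a registry port.
    return tag if sep and "/" not in tag else None


def containers_using_tag(task_def, tag):
    """List the names of containers whose image carries the given tag."""
    return [
        c["name"]
        for c in task_def["containerDefinitions"]
        if image_tag(c["image"]) == tag
    ]


# Placeholder data shaped like the registerTaskDefinition request (tags shortened).
task_def = {
    "containerDefinitions": [
        {"name": "datadog-agent", "image": "registry.example/assets:b03bf6f8"},
        {"name": "log_router", "image": "amazon/aws-for-fluent-bit:stable"},
        {"name": "web-api", "image": "registry.example/assets:8a2e97dd"},
    ]
}

published_tag = "b03bf6f8"  # the tag reported as "Published" in the logs
users = containers_using_tag(task_def, published_tag)
print(users)
```

With this placeholder data, the published tag shows up on `datadog-agent` only, which is the mismatch described in this report.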

Reproduction Steps

Simplified version of the ECS app:
https://github.com/akbisw/localstack-sample/tree/main

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.99.0

Framework Version

2.99.0

Node.js Version

v20.7.0

OS

macOS

Language

Python

Language Version

3.9.6

Other information

No response

@akbisw akbisw added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Sep 28, 2023
@github-actions github-actions bot added the @aws-cdk/aws-ecs Related to Amazon Elastic Container label Sep 28, 2023
@tmokmss
Contributor

tmokmss commented Sep 29, 2023

Hi, thanks for reporting the issue. I confirmed that it reproduces. Sorry for the inconvenience 🙏

From v2.93.0 (#26404), we hotswap ECS task definition in the following steps:

  1. Calculate patch from old/new CFn template
  2. Apply the patch to the task definition fetched from describeTaskDefinition API
  3. register the new task definition and update services

The root cause is that the order of the containerDefinitions array is somehow shuffled when deployed to ECS, so the patch calculated from the CFn template becomes invalid.

I guess we'd better revert #26404 for now, as it turned out to be unreliable (e.g. there may be more arrays that get shuffled, but we can't tell because the behavior is undocumented). I'll create a PR later.

In the meantime, you should be able to use hotswap properly on cdk v2.92.0.
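To make the failure mode concrete, here is a minimal sketch in Python. It is not the CDK's actual implementation. It assumes the patch is keyed by array index, as a diff between two templates would be; the helper `apply_index_patch` and the sample images are hypothetical.

```python
import copy


def apply_index_patch(task_def, patch):
    """Apply {(container_index, field): value} to a copy of task_def."""
    patched = copy.deepcopy(task_def)
    for (index, field), value in patch.items():
        patched["containerDefinitions"][index][field] = value
    return patched


# In the CFn template, web-api is container 0, so the template diff says
# "the image of container 0 changed".
patch = {(0, "image"): "repo:new-tag"}

# Order as returned by describeTaskDefinition: shuffled relative to the template.
deployed = {
    "containerDefinitions": [
        {"name": "datadog-agent", "image": "repo:agent-tag"},
        {"name": "web-api", "image": "repo:old-tag"},
    ]
}

result = apply_index_patch(deployed, patch)
images = {c["name"]: c["image"] for c in result["containerDefinitions"]}
print(images)
```

In this sketch the new image lands on `datadog-agent` while `web-api` keeps its old image, because index 0 refers to different containers in the template and in the deployed task definition.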

@akbisw
Author

akbisw commented Sep 29, 2023

Thanks for confirming @tmokmss. #26404 allows ecs hotswap to work in the first place for an app like https://github.com/akbisw/localstack-sample/tree/main with references from other stacks as env vars 😓 so I think I'll be waiting a while 😢

@tmokmss
Contributor

tmokmss commented Sep 29, 2023

Fortunately, the ongoing #27292 will make it possible to evaluate Fn::ImportValue, so you will be able to hotswap task definition with cross-stack references, even without #26404 :)

@akbisw
Author

akbisw commented Oct 2, 2023

Many thanks! @tmokmss

@indrora
Contributor

indrora commented Oct 2, 2023

CDK Team Member here.

Thank you for reporting this. This does indeed look like a regression. I can see that @tmokmss has put in a PR for this - Thank you!

@indrora indrora added p1 needs-review and removed needs-triage This issue or PR still needs to be triaged. labels Oct 2, 2023
@mergify mergify bot closed this as completed in #27358 Oct 12, 2023
mergify bot pushed a commit that referenced this issue Oct 12, 2023
…tain intrinsics" (#27358)

Closes #27343

From v2.93.0 (#26404), we hotswap ECS task definition in the following steps:

1. Calculate patch from old/new CFn template
2. Apply the patch to the task definition fetched from describeTaskDefinition API
3. register the new task definition and update services

The root cause of issue #27343 is that the order of the containerDefinitions array is somehow shuffled when deployed to ECS, so the patch calculated from the CFn template becomes invalid.

For example, when the containerDefinitions in a CFn template look like this:

```json
    "ContainerDefinitions": [
     {
      "Name": "main",
      "Image": "imageA"
     },
     {
      "Name": "sidecar",
      "Image": "imageB"
     }
    ],
```

the deployed task definition can sometimes become like this:

```json
    "ContainerDefinitions": [
     {
      "Name": "sidecar",
      "Image": "imageB"
     },
     {
      "Name": "main",
      "Image": "imageA"
     }
    ],
```

This makes a patch calculated from the CFn template diff completely invalid. We could sort both the CFn template and the response of the describeTaskDefinition API into a deterministic order, but that would still be unreliable because there may be more arrays whose order gets shuffled. [The response of describeTaskDefinition](https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_DescribeTaskDefinition.html#API_DescribeTaskDefinition_ResponseSyntax) has many array fields, and it is not documented whether they may be shuffled.
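For illustration only, the sorting idea mentioned above could look like the following Python sketch. It assumes container names are unique within a task definition and uses them as the sort key; the helper `sort_containers` is hypothetical, and the sample data mirrors the `main`/`sidecar` example.

```python
def sort_containers(containers, key):
    """Return the containers ordered by the given name field."""
    return sorted(containers, key=lambda c: c[key])


# CFn templates use PascalCase keys, the ECS API response uses camelCase.
template = [
    {"Name": "main", "Image": "imageA"},
    {"Name": "sidecar", "Image": "imageB"},
]
deployed = [
    {"name": "sidecar", "image": "imageB"},
    {"name": "main", "image": "imageA"},
]

template_sorted = sort_containers(template, "Name")
deployed_sorted = sort_containers(deployed, "name")

# After sorting, index i refers to the same container on both sides.
aligned = all(
    t["Name"] == d["name"] for t, d in zip(template_sorted, deployed_sorted)
)
print(aligned)
```

This only aligns containerDefinitions, which happens to have a natural key. Other array fields in the response may have no such key, which is why the approach is still considered unreliable here.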

I guess we should completely abandon this approach, because it cannot be made reliable enough. I have an idea for a more reliable approach, but at the very least the change should be reverted as soon as possible, as it is breaking the ECS hotswap feature.

I'm really sorry that I was not aware of this behavior 🙏


----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*