cdk-assets: Remove asset from staging bucket on failed deployment #14474

Closed
2 tasks
bgshacklett opened this issue Apr 30, 2021 · 12 comments · Fixed by aws/aws-cdk-cli#197
Labels
@aws-cdk/assets Related to the @aws-cdk/assets package effort/small Small work item – less than a day of effort feature-request A feature should be added or improved. p2

Comments

@bgshacklett

In #12536, it was noted that part of the problem is that a corrupted zip file may be uploaded to the staging bucket. Once that happens, CDK will no longer attempt to upload the asset again, because it detects that an asset with the corresponding hash already exists in the bucket. After reaching this state, the affected asset or assets must be removed manually from the staging bucket before a deployment can succeed. When the deployment of a given asset fails, the asset should be removed from the staging bucket so that this "broken" state is never reached.
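For anyone stuck in this state today, the manual cleanup described above could be scripted roughly as follows. This is only a sketch under assumptions: `remove_staged_assets` is a hypothetical helper, the client is expected to behave like a boto3 S3 client (`list_objects_v2` and `delete_object`), and it assumes the object key in the staging bucket starts with the asset hash, which may not hold for every bootstrap setup.

```python
def remove_staged_assets(s3_client, bucket: str, asset_hash: str) -> list:
    """Delete every object whose key starts with asset_hash; return the deleted keys.

    Assumption: staged asset keys begin with the asset hash (for example
    "<hash>.zip"). Check the key layout of your own staging bucket first.
    """
    # A single list call returns at most 1000 keys, which should be plenty
    # for a prefix as specific as a content hash.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=asset_hash)
    deleted = []
    for obj in response.get("Contents", []):
        s3_client.delete_object(Bucket=bucket, Key=obj["Key"])
        deleted.append(obj["Key"])
    return deleted
```

With boto3 installed, `s3_client` would be `boto3.client("s3")`, and the bucket name would be whatever staging bucket your bootstrap stack created.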

Use Case

This change would help ensure that CDK does not attempt to use a corrupt pre-existing asset from the staging bucket during deployment.

Alternatives

Provide a CLI flag to ensure that assets are overwritten in the staging bucket on every deployment.

Other

  • 👋 I may be able to implement this feature request
  • ⚠️ This feature might incur a breaking change

This is a 🚀 Feature Request

@bgshacklett bgshacklett added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Apr 30, 2021
@github-actions github-actions bot added the @aws-cdk/assets Related to the @aws-cdk/assets package label Apr 30, 2021
@eladb
Contributor

eladb commented May 2, 2021

Reassigning to @rix0rrr

@eladb eladb assigned rix0rrr and unassigned eladb May 2, 2021
@eladb eladb added p1 effort/small Small work item – less than a day of effort labels May 2, 2021
@dariagrudzien

We seem to be experiencing the same issue.

@ryparker ryparker removed the needs-triage This issue or PR still needs to be triaged. label Jun 1, 2021
@github-actions

This issue has not received any attention in 1 year. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Jun 17, 2022
@bgshacklett
Author

Please do not auto-close this issue.

@github-actions github-actions bot removed the closing-soon This issue will automatically close in 4 days unless further comments are made. label Jun 17, 2022
@sannies

sannies commented Feb 21, 2023

When you interrupt a deploy with Ctrl-C, you might also end up with corrupted asset directories (*) for 3rd party layers. These asset directories will then be zipped, uploaded and cached. It is very hard to recover from that state.

(*) In my case, the 3rd party layer is created by 'pip' in a Docker container.

@rix0rrr
Contributor

rix0rrr commented Feb 21, 2023

Good find! If we can, we should try and switch to multipart uploads. Those are atomic by default, and the file will only appear if the upload completes.

It depends on whether we already have the correct S3 permissions on the asset role, though...

@rix0rrr
Contributor

rix0rrr commented Feb 21, 2023

Multipart shouldn't need any additional permissions, so we should be good to deploy that.

It does need an additional lifecycle rule on the bucket to remove old incomplete multipart uploads, though.
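As an illustration of the lifecycle rule mentioned here, the following sketch builds an S3 lifecycle configuration that aborts incomplete multipart uploads. The helper name, the rule ID and the 7-day default are assumptions made for this example, not values taken from the CDK bootstrap template.

```python
def abort_incomplete_uploads_config(days: int = 7) -> dict:
    """Build an S3 lifecycle configuration that cleans up unfinished multipart uploads.

    The rule ID and the default number of days are illustrative choices.
    """
    return {
        "Rules": [
            {
                "ID": "AbortIncompleteMultipartUploads",
                "Status": "Enabled",
                # An empty prefix applies the rule to every object in the bucket.
                "Filter": {"Prefix": ""},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }
```

With boto3, this dictionary could be passed as `LifecycleConfiguration` to `put_bucket_lifecycle_configuration`. Note that this call replaces the bucket's existing lifecycle configuration, so any existing rules would have to be merged in first.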

@sannies

sannies commented Feb 21, 2023

I don't think we are talking about exactly the same issue here. In my case, I hit Ctrl-C while the pip install (*) was running. The asset directory (asset.0aff....cd54) had been created and some, but not all, of the 3rd party libraries had been installed in it. At that moment, Ctrl-C interrupted the installation. The directory is then there, but its content is corrupt.
The next cdk synth will not rebuild this specific asset again. It is already there, so there is no reason to do it. The directory will then be zipped and uploaded. At this point the cdk asset bucket is 'poisoned', and you can only recover by forcing the asset to change, e.g. by changing the requirements.txt. A force flag would allow recovery without actually performing a dummy change.

(*)

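# Imports assume CDK v2 for Python (aws-cdk-lib).
from aws_cdk import BundlingOptions
from aws_cdk.aws_lambda import AssetCode, LayerVersion, Runtime

LayerVersion(
    stack, '3rdpLayer',
    code=AssetCode(
        "lambdas",
        bundling=BundlingOptions(
            # The original snippet was cut off after "PYTHON_3_9."; bundling_image is assumed.
            image=Runtime.PYTHON_3_9.bundling_image,
            # Install the third-party dependencies into the layer's python/ directory.
            command=[
                'bash', '-c',
                'pip install -r requirements.txt -t /asset-output/python',
            ])))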

@rix0rrr
Contributor

rix0rrr commented Feb 22, 2023

Oh I see, this isn't about the upload but about the build. I misunderstood.

We've fixed this for zipping (by building to a tempfile), but apparently not for bundling. That'll be the solution for bundling as well then.

@rix0rrr
Contributor

rix0rrr commented Mar 5, 2025

The root cause is being fixed in #33692

To get past cases where a broken asset has been published, rm -rf cdk.out && cdk deploy --force will do a clean build and forced overwrite of a potentially corrupted asset.

That last bit is not true yet, as cdk-assets does its own short-circuiting.

@rix0rrr
Contributor

rix0rrr commented Mar 5, 2025

After aws/aws-cdk-cli#197 is released, the above command rm -rf cdk.out && cdk deploy --force will do what it needs to do.

mergify bot pushed a commit that referenced this issue Mar 5, 2025

#33692)

When a bundling command is interrupted with Ctrl-C, the asset output directory has already been created. On the next synthesis, we assume the asset has already successfully been produced, don't do any bundling, and upload it.

We will then have produced and uploaded a broken asset.

Instead, the common pattern to handle this is:

- Do the work into a temporary directory
- Rename the temporary directory to the target directory only if the work succeeds.
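The two steps above can be sketched in Python as follows. This is a minimal illustration of the temp-directory-then-rename pattern, not the code from the pull request; `build_atomically` and the `build` callback are hypothetical names.

```python
import os
import shutil
import tempfile
from typing import Callable


def build_atomically(target_dir: str, build: Callable[[str], None]) -> bool:
    """Run build(tmp_dir), then rename tmp_dir to target_dir only if build succeeds.

    Returns False if target_dir already exists (nothing to do), True if it was built.
    """
    if os.path.isdir(target_dir):
        return False  # A complete asset from an earlier run is already in place.

    parent = os.path.dirname(os.path.abspath(target_dir))
    os.makedirs(parent, exist_ok=True)
    # Create the temp directory next to the target so the rename stays on one filesystem.
    tmp_dir = tempfile.mkdtemp(prefix=".tmp-asset-", dir=parent)
    try:
        build(tmp_dir)
        os.rename(tmp_dir, target_dir)
    except BaseException:
        # BaseException also covers KeyboardInterrupt (Ctrl-C).
        shutil.rmtree(tmp_dir, ignore_errors=True)
        raise
    return True
```

Because the target directory only ever appears through the final rename, an interrupted build leaves at worst a stray temporary directory, never a half-filled asset directory that a later synthesis would mistake for a finished one.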

Closes #33201, closes #32869, relates to #14474.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
github-merge-queue bot pushed a commit to aws/aws-cdk-cli that referenced this issue Mar 7, 2025

The `cdk deploy --force` flag is intended to disable all smartness
around saving work. If set, the CLI won't check whether assets already
exist in the cloud, and so won't remove the build and publishing steps
from the work graph.

However, this by itself is not enough to make sure the asset truly gets
published again, because the `publish()` action has its own
short-circuiting.

Rather than remove the short-circuiting behavior from `cdk-assets`, we
add another `{ force }` flag there as well, which gets its value from
the CLI's `--force` flag.

This will make it possible to recover from corrupted assets which were
accidentally published, as fixed in
aws/aws-cdk#33692, by running `rm -rf cdk.out &&
cdk deploy --force`.
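The interaction between short-circuiting and a force flag can be illustrated with a small sketch. It is not the `cdk-assets` implementation: the `publish` function below is a hypothetical stand-in, and a plain dictionary plays the role of the staging bucket.

```python
def publish(asset_hash: str, payload: bytes, store: dict, force: bool = False) -> bool:
    """Upload payload under asset_hash; return True if an upload actually happened.

    Without force, an existing entry with the same hash short-circuits the upload,
    even if the stored content is corrupt.
    """
    if not force and asset_hash in store:
        return False  # Short-circuit: trust whatever is already stored.
    store[asset_hash] = payload
    return True
```

For example, with `store = {"abc": b"corrupt"}`, calling `publish("abc", b"good", store)` leaves the corrupt entry in place, while `publish("abc", b"good", store, force=True)` overwrites it.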

Fixes aws/aws-cdk#14474.

---
By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache-2.0 license

---------

Signed-off-by: github-actions <github-actions@github.com>
Co-authored-by: github-actions <github-actions@github.com>
Co-authored-by: Kaizen Conroy <36202692+kaizencc@users.noreply.github.com>
Co-authored-by: Momo Kornher <kornherm@amazon.co.uk>

github-actions bot commented Mar 7, 2025

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 7, 2025

7 participants