cdk deploy in endless loop cause of Fargate Service cant fire up task #7746
Comments
I have the same issue with the ApplicationLoadBalancedFargateService pattern. I use fromAsset(pathName) and deploy via cdk deploy locally. On the task page there was an error message saying it couldn't download the image from ECR. To me, that sounds like a problem with the preconfigured IAM permissions. When I launched the task via the Console, it successfully reached the status RUNNING.
CDK version 1.36.1 |
@jonny-rimek that's a different scenario than mine. I don't have IAM problems. I dug deeper, and to me it looks like a chicken-and-egg problem. When I remove my EcsDeployment stage from CodePipeline and deploy my stack from scratch, everything works. Of course, this gets me a Docker image in my ECR repo (because CodePipeline runs). When I then re-add the ECS deployment stage in my code and redeploy the stack, everything works because now there is a Docker image in the ECR repo. Subsequent CodePipeline runs triggered by GitHub repo changes work too, and I get full auto-deployment. So currently I must deploy my stack in two steps: first without the deployment stage, then with it included. That looks wrong to me. IMO the problem is that ApplicationLoadBalancedFargateService wants to bootstrap an image directly via:
It doesn't know that it's embedded in an EcsDeployAction, where it should act only when there is an imagedefinitions.json in the artifact passed to its input attribute. |
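For readers unfamiliar with the imagedefinitions.json file mentioned above: it is a JSON array with one entry per container, each giving the container name and the image URI to deploy. The following is only a minimal sketch of how a build step might produce that content; the helper name `buildImageDefinitions` and all example values (account ID, region, repository and container names) are hypothetical and not taken from this thread.

```typescript
// Shape of one entry in imagedefinitions.json, the file the CodePipeline
// ECS deploy action reads from its input artifact.
interface ImageDefinition {
  name: string;     // container name as declared in the task definition
  imageUri: string; // full image reference, e.g. <repository-uri>:<tag>
}

// Hypothetical helper (not part of the CDK): builds the file content
// for a single container.
function buildImageDefinitions(
  containerName: string,
  repositoryUri: string,
  tag: string,
): string {
  const definitions: ImageDefinition[] = [
    { name: containerName, imageUri: `${repositoryUri}:${tag}` },
  ];
  return JSON.stringify(definitions);
}

// Example with placeholder values (account ID, region, and names are made up).
const exampleImageDefinitions = buildImageDefinitions(
  "my-container",
  "123456789012.dkr.ecr.eu-central-1.amazonaws.com/my-repo",
  "latest",
);
```

The container name has to match the one in the task definition (the containerName option in the snippets in this thread), otherwise the deploy action has nothing to update.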
Do you deploy via |
I dropped the ECS pattern and did everything from scratch, and the deployment works just fine (no ALB yet). If I remove assignPublicIp I get the following error message: |
I need to correct my previous statement. It is indeed an IAM issue, but I don't understand why. This is what I've seen in the task logs:
Before, I thought it was a chicken-and-egg problem and tried to circumvent it by using:
Here "hello-world" is the tiniest image on Docker Hub I could find; it acts as a placeholder until my CodePipeline has run. Now a clean "cdk deploy" finishes OK, but the problem is that when my CodePipeline finishes, the new image won't be pulled into ECS. The scary thing is that my pipeline worked yesterday without problems, and I could easily trigger new builds and ECS was updated accordingly. |
OK, I am facing a two-headed snake here. It looks like if I use the "placeholder" image approach with fromRegistry("hello-world"), CDK can't know that I want to pull from ECR when my CodePipeline finishes, so the taskExecutionRole doesn't get the correct permissions. I can fix that with:
When I use the fromEcrRepository(ecrRepository, "latest") approach, I still think the problem is that no image is available at first deploy time, which leaves me with the endless deploy loop. I don't think it's a permission problem here, because CDK does know that I want to interact with ECR and should create the default taskExecutionRole accordingly. I will try both approaches from scratch now to see if my findings hold. It always takes ages to test, of course, because destroying and deploying complex stacks takes a while. |
OK. This is the detailed error when directly referencing a non-existing image with fromEcrRepository():
So to me it looks like the placeholder dummy image for the first deployment is the only way to go. If you do it this way, you need to add a policy like the one mentioned in my previous post, because otherwise the CDK-created TaskExecutionRole doesn't have enough permissions. I hope I haven't put too much info in here, but this way other people can get an idea of what to do. To the AWS CDK dev team: is there an elegant way to solve this? |
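The policy snippet referred to above was not preserved here; a later comment in this thread attaches the managed policy AmazonEC2ContainerRegistryPowerUser to the execution role. As a narrower alternative, the sketch below lists the IAM actions a task execution role typically needs just to pull an image from a single ECR repository. It is plain TypeScript that builds a policy document object, not CDK code, and the helper name `ecrPullStatements` and the example ARN are hypothetical.

```typescript
// Minimal IAM policy statement shape (JSON policy grammar).
interface PolicyStatement {
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string | string[];
}

// Hypothetical helper: statements that let a role pull images from
// one ECR repository.
function ecrPullStatements(repositoryArn: string): PolicyStatement[] {
  return [
    // Registry login token; this action cannot be scoped to a repository.
    { Effect: "Allow", Action: ["ecr:GetAuthorizationToken"], Resource: "*" },
    // Layer and manifest reads, scoped to the one repository.
    {
      Effect: "Allow",
      Action: [
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
      ],
      Resource: repositoryArn,
    },
  ];
}

// Example policy document with a made-up repository ARN.
const examplePullPolicy = {
  Version: "2012-10-17",
  Statement: ecrPullStatements(
    "arn:aws:ecr:eu-central-1:123456789012:repository/my-repo",
  ),
};
```

In a CDK app the same effect can usually be had without writing the statements by hand, for example by calling grantPull on the repository with the execution role as the grantee; the explicit form is shown only to make clear which permissions the placeholder-image approach is missing.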
Hey @logemann, yes, this is an issue. Basically, the problem is that the CDK is currently missing a concept that represents "an image that doesn't exist yet, but will be created when the CodePipeline runs". In a demo project we did a long time ago, we have a class that represents exactly that. This is how it is used: [1], [2]. Would adding this class to the main CDK project solve your issue, @logemann? If so, I will convert this issue to a feature request. Thanks, |
Hey @skinny85, thanks for commenting. I just checked the project and the class you mentioned. That is quite a lot of code (not only the class but also its surroundings) to solve this particular issue. I would rather use my placeholder image than go that way. You might check the tutorial I've just finished regarding this issue to see how I approached it. From a dev standpoint it would be super nice if ApplicationLoadBalancedFargateService were smart enough to know it is wrapped in EcsDeployAction and somehow initialized differently. But I know way too little about CloudFormation's inner workings to tell if this is even possible. |
Well, this would solve the following issue that you talk about in your tutorial:
Instead, you would simply do this:
createLoadBalancedFargateService(scope: Construct, vpc: Vpc, ecrRepository: ecr.Repository, pipelineProject: PipelineProject) {
  const fargateService = new ecspatterns.ApplicationLoadBalancedFargateService(scope, 'myLbFargateService', {
    // ...
    taskImageOptions: {
      containerName: repoName,
      image: new PipelineContainerImage(ecrRepository),
      containerPort: 8080,
    },
  });
  fargateService.taskDefinition.executionRole?.addManagedPolicy(
    ManagedPolicy.fromAwsManagedPolicyName('AmazonEC2ContainerRegistryPowerUser'));
  return fargateService;
}
And no weird workarounds are needed... isn't that strictly better? |
Indeed... I somehow didn't fully understand the project you mentioned, because I thought you need to use CloudFormationCreateUpdateStackAction and some other things to get PipelineContainerImage up and running. Somehow I couldn't see that PipelineContainerImage alone will suffice. To me that's definitely worth a feature request then. Can you write in two short sentences what imageName in PipelineContainerImage gets resolved to and why it doesn't have the same problem (not finding an image)? Thanks. Note: I will update the tutorial then to say that there might be something coming to make it even better. But for that I should at least understand it, haha. Update: From what I can gather from the class, there is some lazy evaluation going on with regard to the imageName in the ECR repo. But I really don't get what PipelineParam is, and I can't make sense of the (in my case) unused methods like paramName(). |
The trick with If you see, then the parameter is filled in the If you don't want to use However, there is a problem: the action will update the image "out-of-band", causing an intentional drift in the CloudFormation state (the actual image will be something different than the image parameter in CloudFormation). This might prove problematic (for example, an update to your service's properties might make the image be reverted to its original), and in general is a bad practice. For those reasons, I would advise against using Does this make sense? |
Yeah, makes sense, and so I got it right that you can't use PipelineContainerImage in isolation. But I still don't think it's developer friendly. Using CloudFormationCreateUpdateStackAction feels like quite a big workaround too if in the end you just want to use EcsDeployAction. I think we can close this one; adding PipelineContainerImage to the distro would only make sense with a ton of documentation on how to use it in conjunction with CloudFormationCreateUpdateStackAction as a kind of replacement for EcsDeployAction for this specific use case. A use case which is IMO quite mainstream. |
@logemann were you able to solve this issue? |
…ant to be used in CodePipeline While CDK Pipelines is the idiomatic way of deploying ECS applications in CDK, it does not handle the case where the application's source code is kept in a separate source code repository from the CDK infrastructure code. This adds a new class to the ECS module, `TagParameterContainerImage`, that allows deploying a service managed that way through CodePipeline. Related to aws#1237 Related to aws#7746
… be used in CodePipeline (#11795) While CDK Pipelines is the idiomatic way of deploying ECS applications in CDK, it does not handle the case where the application's source code is kept in a separate source code repository from the CDK infrastructure code. This adds a new class to the ECS module, `TagParameterContainerImage`, that allows deploying a service managed that way through CodePipeline. Related to #1237 Related to #7746 ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
@jonny-rimek did you find a solution to this? I also dropped the ECS pattern and now I am facing the same issue. |
same problem here |
I am having a similar problem with ecs_pattern and QueueProcessingFargateService:
|
I cant solve this problem.... |
Hi, I had the same problem, I managed to get a workaround with the pattern. It's more code though
At this point it may be better to drop the pattern and create everything manually. With this, though, I can use GitHub Actions to push to the ECR repo with no issues, and the Node.js app starts. |
Still happening in newer cdk version: 2.41 |
this is happening to me as well |
Hello, I am still trying to deploy the most basic stack just to get It hangs.
import { Construct } from 'constructs';
import { App, Stack, StackProps } from 'aws-cdk-lib';
import { Vpc } from 'aws-cdk-lib/aws-ec2';
import { Cluster, ContainerImage } from 'aws-cdk-lib/aws-ecs';
import { ApplicationLoadBalancedFargateService } from 'aws-cdk-lib/aws-ecs-patterns';

const app = new App();

export class ApiSixStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    const vpc = Vpc.fromLookup(this, 'VPC', { isDefault: true });
    const cluster = new Cluster(this, 'Cluster', { vpc });
    new ApplicationLoadBalancedFargateService(this, 'Service', {
      cluster,
      memoryLimitMiB: 1024,
      desiredCount: 1,
      cpu: 512,
      taskImageOptions: { image: ContainerImage.fromRegistry('amazon/amazon-ecs-sample') },
      loadBalancerName: 'application-lb-name',
    });
  }
}

const env = {
  account: process.env.CDK_DEFAULT_ACCOUNT,
  region: process.env.CDK_DEFAULT_REGION,
};

new ApiSixStack(app, 'stack-app', { env });
app.synth();
Any help is appreciated, |
@0xBradock check out the container status in ECS. I would bet the port |
@0xBradock I think it's because the load balancer cannot reach Fargate services' health checks. The default VPC you're using doesn't have private subnets and since You can solve it in two ways:
|
I am deploying a CodePipeline stack with deployment to a Fargate service. The problem is that when there is an issue starting the Fargate task, the deployment never returns, because Fargate tries to start the task over and over again (about every minute or so).
Roughly my code is:
My problem could be that I define an image in the LoadBalancedFargateService which isn't available during deployment of the stack, because CodePipeline hasn't run yet. I don't know for sure.
The question remains whether it's wise for "cdk deploy" to never terminate because of never-ending attempts to fire up a task in the backend.
Reproduction Steps
Hard to reproduce out of context.
Error Log
No error in the console on cdk deploy. Hard to find the real error. Tried it via the AWS console without success.
Environment
This is 🐛 Bug Report