strange cycle errors in 1.3.0 #31843

Closed
jbg opened this issue Sep 22, 2022 · 9 comments · Fixed by #31857 or #31917
Labels: bug, v1.3, waiting for reproduction

Comments


jbg commented Sep 22, 2022

Terraform Version

Terraform v1.3.0
on darwin_arm64
+ provider registry.terraform.io/hashicorp/aws v4.31.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.13.1

Terraform Configuration Files

Unable to provide.

Debug Output

Unable to provide.

Expected Behavior

In Terraform 1.2.9, the configuration applies without any issue.

Actual Behavior

In Terraform 1.3.0, a cycle is detected, even though several of the resources listed adjacently in the cycle do not have any (implicit or explicit) dependency on each other.

│ Error: Cycle: aws_subnet.sin["ap-southeast-1a"], aws_iam_role.sin_cluster, aws_subnet.sin["ap-southeast-1c"], aws_vpc.sin, aws_subnet.sin["ap-southeast-1b"], provider["registry.terraform.io/hashicorp/kubernetes"], kubernetes_manifest.eventlistener (destroy), aws_kms_key.sin_cluster, aws_eks_cluster.sin

Perhaps notably, the kubernetes_manifest.eventlistener resource is tainted and will be replaced.

Downgrading to 1.2.9, the exact same configuration applies without any issue.

Steps to Reproduce

  1. terraform apply

Additional Context

No response

References

No response

jbg added the bug and new labels on Sep 22, 2022
Member

jbardin commented Sep 22, 2022

Hi @jbg,

Thanks for filing the issue. Without any configuration or logging there's not much to go by here. Would it be possible to create a more minimal reproduction, or a redacted trace log? Using TF_LOG_CORE=trace will significantly reduce the log output, and contains graph information which might lead to the root cause.

Thanks!

jbardin added the waiting for reproduction label and removed the new label on Sep 22, 2022
jbardin self-assigned this on Sep 22, 2022
Author

jbg commented Sep 29, 2022

@jbardin I've just tested with v1.3.1 and the problem still exists. I haven't had time to try to make a minimal reproduction yet. The issue seems to happen when resources are going to be destroyed and recreated, and the outputted cycle error always lists those resources as well as their providers in the cycle. The problem exists on both v1.3.0 and v1.3.1, but not on v1.2.9. I will try to get time to make a reproduction or redacted trace log this week.

Member

jbardin commented Sep 29, 2022

Thanks @jbg, this cycle looked very much like what the linked PR handles, so I linked them. They may still be related, but I'll reopen this for further investigation.

jbardin reopened this on Sep 29, 2022
jbardin added the v1.3 label on Sep 29, 2022

Zordrak commented Sep 29, 2022

I don't know how easy it's going to be to provide evidence, as the module I'm using is absolutely huge, but I just wanted to confirm that this is definitely a problem in 1.3.1 and fine in 1.2.9.

Member

jbardin commented Sep 29, 2022

Would it be possible to compare trace logs from a plan in each version? You can avoid all the provider output by using TF_LOG_CORE=trace, though it's still going to be quite a lot if the config is that large.

The cycle output here actually looks a bit strange, with references going back and forth from the same aws_subnet.sin resource. That would imply that the aws_subnet.sin resource somehow references itself, which should always have been a cycle:

aws_subnet.sin["ap-southeast-1a"],
aws_iam_role.sin_cluster,
aws_subnet.sin["ap-southeast-1c"],
aws_vpc.sin,
aws_subnet.sin["ap-southeast-1b"]

I'm wondering if this cycle is just a manifestation of one of the unsolvable ordering situations you can get when a provider depends on a managed resource, which is why it's recommended to not ever have that situation in the first place. The usual issue is just that the provider fails because it cannot get the necessary config data at the right time, but maybe we'll find that the ordering prior to v1.3 was underspecified in some cases.
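The reporter's configuration was not provided, so the following is only a hypothetical sketch of the "provider depends on a managed resource" pattern described above. The resource names are borrowed from the cycle output; every argument value, the `aws_eks_cluster_auth` data source, and the shape of the `provider "kubernetes"` block are assumptions made for illustration.

```hcl
# Hypothetical sketch: a provider configured from a managed resource.
# aws_iam_role.sin_cluster and aws_subnet.sin are assumed to be defined elsewhere.
resource "aws_eks_cluster" "sin" {
  name     = "sin"
  role_arn = aws_iam_role.sin_cluster.arn

  vpc_config {
    subnet_ids = [for s in aws_subnet.sin : s.id]
  }
}

data "aws_eks_cluster_auth" "sin" {
  name = aws_eks_cluster.sin.name
}

# The provider configuration cannot be evaluated until the cluster exists,
# so every kubernetes_* resource (such as kubernetes_manifest.eventlistener)
# indirectly depends on aws_eks_cluster.sin and everything it depends on.
provider "kubernetes" {
  host                   = aws_eks_cluster.sin.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.sin.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.sin.token
}
```

In a layout like this, replacing a tainted `kubernetes_manifest` resource adds a destroy node that also needs the provider, which is the kind of ordering constraint the comment above refers to.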


Zordrak commented Sep 29, 2022

After applying with 1.2.9, 1.3.1 plans successfully. So it's definitely the same issue as above, involving a destroy/create. I'm going to have to revert to 1.2.9 until this is resolved - I can't risk getting locked into a position where I can't make a change due to a cycling plan.

The cycle I had looked like the below, and the cause for the change was a number of lambda functions having their layers changed. I know what you mean about the dependence of a provider on something (which for me is completely unavoidable), but as far as I can tell there is nothing in the list of things being changed upon which a provider depends.

Error: Cycle: module.ous.aws_organizations_organizational_unit.level3["<redacted>"], <snipped lots of: module.ous.aws_organizations_organizational_unit.level<levelnum>["<redacted>"],>, module.bs_gr_audit.module.lambdacron_sechub_stds_controls.module.kms.aws_iam_policy.user (destroy), aws_organizations_organization.main, module.ous.var.organization_root_id (expand), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_iam_policy.lambda_insights[0] (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_iam_policy.lambda_execution (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_cloudwatch_log_group.main (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_iam_policy.lambda_insights[0] (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.module.kms.aws_iam_policy.admin (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.module.kms.aws_kms_alias.main (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_sns_topic.main[0] (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.insights_logs[0] (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_iam_role.main (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_cloudwatch_event_rule.main[0] (destroy), module.bs_gr_security.aws_iam_role_policy_attachment.lambdacron_sechub_stds_controls_kms_sns_alerts (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.module.kms.aws_iam_policy.user (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_iam_policy.lambda_execution (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.lambda_execution (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_cloudwatch_event_target.main[0] (destroy), 
module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_lambda_permission.events[0] (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_lambda_function.main (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_cloudwatch_log_group.main (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.sns_publish_lambda (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_iam_policy.sns_publish_lambda (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.xray (destroy), module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_iam_policy.xray (destroy), aws_organizations_account.security, provider["registry.terraform.io/hashicorp/aws"].security, module.bs_gr_security.module.lambdacron_sechub_stds_controls.aws_sns_topic_policy.main[0] (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.module.kms.aws_iam_policy.admin (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_cloudwatch_log_group.main (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_iam_policy.xray (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_cloudwatch_event_rule.main[0] (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_iam_policy.lambda_insights[0] (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.module.kms.aws_iam_policy.user (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_sns_topic_policy.main[0] (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_sns_topic.main[0] (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_iam_policy.lambda_execution (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_iam_policy.sns_publish_lambda (destroy), 
module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.insights_logs[0] (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.xray (destroy), module.bs_gr_shared.aws_iam_role_policy_attachment.lambdacron_sechub_stds_controls_kms_sns_alerts (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_cloudwatch_event_target.main[0] (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_lambda_permission.events[0] (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_lambda_function.main (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.sns_publish_lambda (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.lambda_execution (destroy), module.bs_gr_shared.module.lambdacron_sechub_stds_controls.aws_iam_role.main (destroy), aws_organizations_account.shared, provider["registry.terraform.io/hashicorp/aws"].shared, module.bs_gr_shared.module.lambdacron_sechub_stds_controls.module.kms.aws_kms_alias.main (destroy), module.ous.aws_organizations_organizational_unit.level1["graveyard"], module.ous.local.ous (expand), module.ous.output.ous (expand), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.module.kms.aws_iam_policy.admin (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.module.kms.aws_kms_alias.main (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.insights_logs[0] (destroy), module.bs_gr_audit.aws_iam_role_policy_attachment.lambdacron_sechub_stds_controls_kms_sns_alerts (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_iam_role.main (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.lambda_execution (destroy), 
module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_sns_topic_policy.main[0] (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_iam_policy.sns_publish_lambda (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_sns_topic.main[0] (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.sns_publish_lambda (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_lambda_function.main (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_iam_role_policy_attachment.xray (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_iam_policy.xray (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_cloudwatch_event_target.main[0] (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_lambda_permission.events[0] (destroy), module.bs_gr_audit.module.lambdacron_sechub_stds_controls.aws_cloudwatch_event_rule.main[0] (destroy), aws_organizations_account.audit, provider["registry.terraform.io/hashicorp/aws"].audit

I've saved my console output around this issue for future reference, but cannot now reproduce the problem without significant impact to the project, so I'm not going to be able to provide a TRACE, although given the size of it for my module it'd be hard to make use of one.

EDIT: Scratch that. I don't know why the AWS Organization is in that cycle. I need to find that out, as it will probably be key: the organization would cause a dependent-provider issue because the AWS accounts depend on the Org and the subordinate providers depend on the accounts. That said, it really doesn't account for the fact that 1.2.9 had no problem planning and applying the changes; only 1.3.1 cared. So either it's a graph construction issue in 1.3.1, or the change to 1.3.1 somehow caused a resource to change that 1.2.9 didn't need to change, which then got itself caught in a dependency death roll.

EDIT 2: I found why the org was in the list: one of the changes was a service control policy. So yes, I agree that this is somehow a combination of changes where enough resources either side of a provider were changing that the provider couldn't assert itself, but:

  1. It doesn't explain why that would be a dependency cycle instead of a provider explosion
  2. It doesn't explain why 1.2.9 didn't care

If it is in fact a tightening of the order specification in 1.3.1, then I'm concerned that a number of use cases will break. When it comes to Terraform dependency ordering, it's often so hard to work out what's theoretically allowable that we determine the rules by seeing what works, and what works reproducibly. If ordering rules have been tightened, it could literally be changing the game while the ball is in play and make certain things that were always possible before impossible now - and that's scary.

Member

jbardin commented Sep 29, 2022

Prior to v1.3 the apply graph could discard most of the objects entirely, which simplified things greatly and avoided problems with use cases like this. Basically the dependencies were there, but we could ignore them in almost every case. The addition of preconditions and postconditions means that we need to keep everything around longer to validate the values, which is surfacing the problem.
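For readers unfamiliar with the feature mentioned here, the following is a minimal, hypothetical sketch of a resource with `precondition` and `postcondition` blocks. The resource, variable, AMI ID, and conditions are illustrative assumptions and are not taken from any configuration in this thread.

```hcl
# Hypothetical sketch of custom conditions on a resource.
variable "instance_type" {
  type = string
}

resource "aws_instance" "example" {
  ami           = "ami-0123456789abcdef0" # placeholder value
  instance_type = var.instance_type

  lifecycle {
    # Checked before the resource is planned or applied.
    precondition {
      condition     = can(regex("^t3\\.", var.instance_type))
      error_message = "The instance type must be in the t3 family."
    }

    # Checked after the resource has been applied, using its final values.
    postcondition {
      condition     = self.private_ip != ""
      error_message = "The instance must have a private IP address."
    }
  }
}
```

Because such conditions are evaluated against resource values during apply, the objects they refer to have to remain available for evaluation, which is the constraint described in the paragraph above.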

Something I didn't think about previously though, which appears to be the case here, is that you have aws resources feeding into different aws providers' configurations! This of course seems obvious now, but the temporary workaround we have in place is only going to compare provider type (because this use case has almost always been between different providers entirely, and that's the easiest thing to check), but because they are the same type of provider we're not catching the problem at all!
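A hypothetical sketch of the same-provider-type case described above, using the `aws_organizations_account.security` and `provider["registry.terraform.io/hashicorp/aws"].security` names that appear in the cycle output. The email address, region, role name, and the `aws_sns_topic` resource are assumptions for illustration; the actual configuration was not shared.

```hcl
# Hypothetical sketch: an aws resource feeding an aliased aws provider.
# The default (unaliased) aws provider manages the organization's accounts.
resource "aws_organizations_account" "security" {
  name  = "security"
  email = "security@example.com" # placeholder value
}

# The aliased provider's configuration is derived from the account above,
# so both sides of the dependency are the same provider type ("aws").
provider "aws" {
  alias  = "security"
  region = "eu-west-1" # placeholder value

  assume_role {
    # Role name assumed for illustration.
    role_arn = "arn:aws:iam::${aws_organizations_account.security.id}:role/OrganizationAccountAccessRole"
  }
}

# Resources managed through the aliased provider depend on it, and
# therefore indirectly on aws_organizations_account.security.
resource "aws_sns_topic" "alerts" {
  provider = aws.security
  name     = "alerts"
}
```

A check that only compares provider type would treat the resource and the provider it configures as belonging to the same provider, which is why this arrangement is not caught by the workaround mentioned above.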

I think we may be able to fix the most recent patch to work here, so that a more precise solution can be worked out in a future release.

@jack-parsons-bjss

@jbardin I do not believe this cycle bug has been resolved - we've seen cycles on a number of occasions using 1.3.2 which do not occur with 1.2.9:
Error: Cycle: module.ous.aws_organizations_organizational_unit.level<redacted>["<redacted>"], module.ous.aws_organizations_organizational_unit.level<redacted>["<redacted>"], module.ous.aws_organizations_organizational_unit.level<redacted>["<redacted>"], module.ous.aws_organizations_organizational_unit.level<redacted>["<redacted>"], module.ous.aws_organizations_organizational_unit.level<redacted>["<redacted>"], module.ous.aws_organizations_organizational_unit.level<redacted>["<redacted>"], module.bs_gr_audit.module.lambdacron_remove_shield.aws_iam_role.main (destroy), module.bs_gr_shared.module.lambdacron_remove_shield.aws_iam_role.main (destroy), module.bs_gr_audit.aws_cloudwatch_event_rule.ec2_deletetags[0] (destroy), module.bs_gr_audit.module.lambdacron_remove_shield.aws_sns_topic_policy.main[0] (destroy), module.bs_gr_audit.module.lambdacron_remove_shield.aws_sns_topic.main[0] (destroy), module.bs_gr_audit.module.lambdacron_remove_shield.module.kms.aws_kms_key.main (destroy), module.bs_gr_shared.aws_cloudwatch_event_rule.elasticloadbalancing_removetags[0] (destroy), module.bs_gr_shared.module.lambdacron_remove_shield.module.kms.aws_iam_policy.user (destroy), module.bs_gr_audit.module.lambdacron_remove_shield.module.kms.aws_iam_policy.admin (destroy), module.bs_gr_audit.module.lambdacron_remove_shield.module.kms.aws_iam_policy.user (destroy), aws_organizations_organization.main, module.ous.var.organization_root_id (expand), module.bs_gr_audit.aws_lambda_permission.cloudtrail_delivery[0] (destroy), module.bs_gr_shared.module.lambdacron_remove_shield.module.kms.aws_iam_policy.admin (destroy), module.bs_gr_audit.module.lambdacron_remove_shield.module.kms.aws_kms_alias.main (destroy), module.bs_gr_audit.module.lambdacron_remove_shield.aws_lambda_function.main (destroy), module.bs_gr_audit.aws_lambda_permission.remove_shield_ec2_deletetags[0] (destroy), module.bs_gr_audit.aws_cloudwatch_event_target.ec2_deletetags[0] (destroy), 
module.bs_gr_audit.module.lambdacron_remove_shield.aws_cloudwatch_log_group.main (destroy), module.bs_gr_shared.module.lambdacron_remove_shield.aws_iam_role_policy_attachment.lambda_execution (destroy), module.bs_gr_shared.module.lambdacron_remove_shield.aws_iam_policy.lambda_execution (destroy), module.bs_gr_shared.aws_cloudwatch_event_rule.ec2_deletetags[0] (destroy), module.bs_gr_network.aws_cloudwatch_event_rule.ec2_deletetags[0] (destroy), module.bs_gr_network.module.lambdacron_remove_shield.aws_iam_role.main (destroy), module.bs_gr_network.module.lambdacron_remove_shield.module.kms.aws_iam_policy.admin (destroy), module.bs_gr_network.module.lambdacron_remove_shield.aws_iam_policy.lambda_execution (destroy), module.bs_gr_network.module.lambdacron_remove_shield.module.kms.aws_iam_policy.user (destroy), module.bs_gr_network.module.lambdacron_remove_shield.aws_iam_role_policy_attachment.lambda_execution (destroy), module.bs_gr_network.aws_cloudwatch_event_rule.elasticloadbalancing_removetags[0] (destroy), module.bs_gr_network.aws_lambda_permission.cloudtrail_delivery[0] (destroy), module.bs_gr_network.module.lambdacron_remove_shield.aws_lambda_function.main (destroy), module.bs_gr_network.aws_lambda_permission.remove_shield_ec2_deletetags[0] (destroy), module.bs_gr_network.module.lambdacron_remove_shield.aws_sns_topic.main[0] (destroy), module.bs_gr_network.aws_cloudwatch_event_target.ec2_deletetags[0] (destroy), module.bs_gr_network.aws_cloudwatch_event_target.elasticloadbalancing_removetags[0] (destroy), module.bs_gr_network.module.lambdacron_remove_shield.module.kms.aws_kms_alias.main (destroy), module.bs_gr_network.module.lambdacron_remove_shield.aws_sns_topic_policy.main[0] (destroy), module.bs_gr_network.module.lambdacron_remove_shield.module.kms.aws_kms_key.main (destroy), aws_organizations_account.network, provider["registry.terraform.io/hashicorp/aws"].network, module.bs_gr_network.aws_lambda_permission.remove_shield_elasticloadbalancing_removetags[0] 
(destroy), module.bs_gr_network.module.lambdacron_remove_shield.aws_cloudwatch_log_group.main (destroy), module.bs_gr_audit.aws_cloudwatch_event_target.elasticloadbalancing_removetags[0] (destroy), module.bs_gr_audit.aws_lambda_permission.remove_shield_elasticloadbalancing_removetags[0] (destroy), module.bs_gr_audit.aws_cloudwatch_event_rule.elasticloadbalancing_removetags[0] (destroy), module.bs_gr_shared.module.lambdacron_remove_shield.aws_cloudwatch_log_group.main (destroy), module.bs_gr_shared.module.lambdacron_remove_shield.module.kms.aws_kms_alias.main (destroy), module.bs_gr_shared.module.lambdacron_remove_shield.aws_sns_topic_policy.main[0] (destroy), module.bs_gr_shared.aws_lambda_permission.remove_shield_elasticloadbalancing_removetags[0] (destroy), module.bs_gr_shared.aws_cloudwatch_event_target.ec2_deletetags[0] (destroy), module.bs_gr_shared.module.lambdacron_remove_shield.aws_lambda_function.main (destroy), module.bs_gr_shared.aws_cloudwatch_event_target.elasticloadbalancing_removetags[0] (destroy), module.bs_gr_shared.module.lambdacron_remove_shield.aws_sns_topic.main[0] (destroy), module.bs_gr_shared.aws_lambda_permission.remove_shield_ec2_deletetags[0] (destroy), module.bs_gr_shared.module.lambdacron_remove_shield.module.kms.aws_kms_key.main (destroy), module.bs_gr_security.aws_cloudwatch_event_rule.ec2_deletetags[0] (destroy), module.bs_gr_security.module.lambdacron_remove_shield.aws_iam_policy.lambda_execution (destroy), module.bs_gr_security.module.lambdacron_remove_shield.aws_iam_role_policy_attachment.lambda_execution (destroy), module.bs_gr_security.module.lambdacron_remove_shield.aws_iam_role.main (destroy), module.bs_gr_security.aws_cloudwatch_event_rule.elasticloadbalancing_removetags[0] (destroy), module.bs_gr_security.module.lambdacron_remove_shield.aws_cloudwatch_log_group.main (destroy), module.bs_gr_security.aws_lambda_permission.cloudtrail_delivery[0] (destroy), 
module.bs_gr_security.module.lambdacron_remove_shield.module.kms.aws_iam_policy.user (destroy), module.bs_gr_security.module.lambdacron_remove_shield.aws_sns_topic_policy.main[0] (destroy), module.bs_gr_security.module.lambdacron_remove_shield.aws_lambda_function.main (destroy), module.bs_gr_security.aws_cloudwatch_event_target.elasticloadbalancing_removetags[0] (destroy), module.bs_gr_security.module.lambdacron_remove_shield.aws_sns_topic.main[0] (destroy), module.bs_gr_security.aws_lambda_permission.remove_shield_ec2_deletetags[0] (destroy), module.bs_gr_security.aws_lambda_permission.remove_shield_elasticloadbalancing_removetags[0] (destroy), module.bs_gr_security.module.lambdacron_remove_shield.module.kms.aws_kms_alias.main (destroy), module.bs_gr_security.aws_cloudwatch_event_target.ec2_deletetags[0] (destroy), module.bs_gr_security.module.lambdacron_remove_shield.module.kms.aws_kms_key.main (destroy), aws_organizations_account.security, provider["registry.terraform.io/hashicorp/aws"].security, module.bs_gr_security.module.lambdacron_remove_shield.module.kms.aws_iam_policy.admin (destroy), module.bs_gr_audit.module.lambdacron_remove_shield.aws_iam_role_policy_attachment.lambda_execution (destroy), module.bs_gr_audit.module.lambdacron_remove_shield.aws_iam_policy.lambda_execution (destroy), module.ous.aws_organizations_organizational_unit.level<redacted>["<redacted>"], module.ous.local.ous (expand), module.ous.output.ous (expand), aws_organizations_account.shared, provider["registry.terraform.io/hashicorp/aws"].shared, module.bs_gr_shared.aws_lambda_permission.cloudtrail_delivery[0] (destroy), aws_organizations_account.audit, provider["registry.terraform.io/hashicorp/aws"].audit

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked this as resolved and limited conversation to collaborators on Nov 18, 2022
4 participants