Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ForceNew and non-empty plans caused by PodSpec defaults #1074

Merged
merged 1 commit into from Dec 21, 2020

Conversation

dak1n1
Copy link
Contributor

@dak1n1 dak1n1 commented Nov 19, 2020

Description

Acceptance tests

  • Have you added an acceptance test for the functionality being added?
  • Have you run the acceptance tests on this branch?

Output from acceptance testing:

Release Note

Release note for CHANGELOG:

* Fix `init_container` ForceNew for resources using PodSpec.

References

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

@ghost ghost added the size/XL label Nov 19, 2020
kubernetes/provider_test.go Outdated Show resolved Hide resolved
resource.TestCheckResourceAttr("kubernetes_pod.test", "spec.0.dns_config.0.option.0.value", "1"),
resource.TestCheckResourceAttr("kubernetes_pod.test", "spec.0.dns_config.0.option.1.name", "use-vc"),
resource.TestCheckResourceAttr("kubernetes_pod.test", "spec.0.dns_config.0.option.1.value", ""),
resource.TestCheckResourceAttr("kubernetes_pod.test", "spec.0.dns_policy", "Default"),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed these because it was just a bunch of duplicate fields that don't need to be re-tested.

image = "%s"
command = ["wget", "-O", "/work-dir/index.html", "http://kubernetes.io"]
command = ["sh", "-c", "until nslookup init-service.default.svc.cluster.local; do echo waiting for init-service; sleep 2; done"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced our initContainer tests with the example from the Kubernetes site, since ours didn't seem to work anymore.

}
}
`, name)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea with this test is to check that our defaults are being set correctly. I was finding that most our tests are explicitly setting attributes like automount_service_account_token, but when you leave that attribute out of your config, it caused an non-empty plan. So I wanted to check all our defaults relating to PodSpec.

@ghost ghost added size/XXL and removed size/XL labels Nov 19, 2020
@dak1n1 dak1n1 mentioned this pull request Nov 20, 2020
3 tasks
@dak1n1 dak1n1 force-pushed the podspec branch 2 times, most recently from 0ad6d16 to 04b7cc2 Compare November 21, 2020 21:54
@dak1n1 dak1n1 force-pushed the podspec branch 2 times, most recently from 18519a8 to 8514ca2 Compare December 2, 2020 23:05
@dak1n1 dak1n1 marked this pull request as ready for review December 2, 2020 23:08
@dak1n1
Copy link
Contributor Author

dak1n1 commented Dec 2, 2020

@dak1n1 dak1n1 force-pushed the podspec branch 2 times, most recently from 672b17d to 4c4c32e Compare December 4, 2020 22:50
@dak1n1
Copy link
Contributor Author

dak1n1 commented Dec 5, 2020

I'll be working on a couple failures from the TC test, like this one:

=== RUN   TestAccKubernetesPod_with_projected_volume

------- Stderr: -------
panic: Invalid address to set: []string{"spec", "0", "volume", "0", "projected", "0", "sources", "3", "downward_api", "0", "items", "1", "resource_field_ref", "0", "divisor"}

goroutine 281 [running]:
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*ResourceData).Set(0xc000c54a80, 0x220a208, 0x4, 0x1e55560, 0xc000eef380, 0x0, 0xc000eeab00)
	/opt/teamcity-agent/work/992907ecae0fda26/src/github.com/terraform-providers/terraform-provider-kubernetes/vendor/github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema/resource_data.go:188 +0x388
github.com/hashicorp/terraform-provider-kubernetes/kubernetes.resourceKubernetesPodRead(0x2765f80, 0xc000dcaa20, 0xc000c54a80, 0x20aeb20, 0xc000ca34e0, 0x0, 0x0, 0xc0018d5ae0)
	/opt/teamcity-agent/work/992907ecae0fda26/src/github.com/terraform-providers/terraform-provider-kubernetes/kubernetes/resource_kubernetes_pod.go:178 +0x7ff
github.com/hashicorp/terraform-provider-kubernetes/kubernetes.resourceKubernetesPodCreate(0x2765f80, 0xc000dcaa20, 0xc000c54a80, 0x20aeb20, 0xc000ca34e0, 0xc000c81f10, 0x78d02a, 0xc00159ef20)
	/opt/teamcity-agent/work/992907ecae0fda26/src/github.com/terraform-providers/terraform-provider-kubernetes/kubernetes/resource_kubernetes_pod.go:103 +0xc61
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).create(0xc001393970, 0x2765f00, 0xc000ab01c0, 0xc000c54a80, 0x20aeb20, 0xc000ca34e0, 0x0, 0x0, 0x0)
	/opt/teamcity-agent/work/992907ecae0fda26/src/github.com/terraform-providers/terraform-provider-kubernetes/vendor/github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema/resource.go:275 +0x1ec
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).Apply(0xc001393970, 0x2765f00, 0xc000ab01c0, 0xc0004a8230, 0xc00159ef20, 0x20aeb20, 0xc000ca34e0, 0x0, 0x0, 0x0, ...)

I'm currently hunting down another Pod volume issue that causes the Pod to be recreated each time terraform runs. (It has something to do with the information placed into the volume and volume_mount attributes when auto_mount_service_account_token is set to true. Interestingly, the Deployment resource doesn't add this information to terraform state. And the Deployment resource works, whereas the Pod resource can use the exact same container spec as the deployment, but fails. It always contains a non-empty plan, due to the addition of the service account token volume and mount.

@dak1n1
Copy link
Contributor Author

dak1n1 commented Dec 7, 2020

I got stuck on one of those bugs, so I opened a new bug report for it here. #1085

go build -o /tmp/.terraform.d/localhost/test/kubernetes/9.9.9/$(OS_ARCH)/terraform-provider-kubernetes_9.9.9_$(OS_ARCH)
cd $(EXT_PROV_DIR) && TF_CLI_CONFIG_FILE=$(EXT_PROV_DIR)/.terraformrc TF_PLUGIN_CACHE_DIR=$(EXT_PROV_DIR)/.terraform terraform init -upgrade
TF_CLI_CONFIG_FILE=$(EXT_PROV_DIR)/.terraformrc TF_PLUGIN_CACHE_DIR=$(EXT_PROV_DIR)/.terraform TF_ACC=1 go test $(TEST) -v -run 'regression' $(TESTARGS)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change makes the testacc command faster. If you need to run update-related tests (the ones with _regression in the name, now you can run make test-update.

It also works with non-update related tests. Sometimes I like to do this if I've made schema changes that I want to be sure are in the test run:

TESTARGS="-run 'TestAccKubernetesJob'" make test-update

@@ -10,7 +10,7 @@ import (
)

func dataSourceKubernetesPod() *schema.Resource {
podSpecFields := podSpecFields(false, false, false)
podSpecFields := podSpecFields(false, false)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the isDeprecated field from the podSpecFields function because we're about to release a major update. The replication controller resource was the one using isDeprecrated, since originally the provider was created with a replication controller spec that didn't match the API. The issue was fixed in January 2019 (commit d5ff12c) and the affected resources have had deprecation warnings since then.

busyboxImageVersion1 = "busybox:1.31"
alpineImageVersion = "alpine:3.12.1"
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added these after Docker Hub started rate-limiting pulls. I was hitting the limit with my local testing. (100 pulls per 6 hours). This is a much more polite way to use their service anyway 😄

@@ -27,9 +35,6 @@ var testAccProviderFactories = map[string]func() (*schema.Provider, error){

func init() {
testAccProvider = Provider()
testAccProviders = map[string]*schema.Provider{
"kubernetes": testAccProvider,
}
testAccProviderFactories = map[string]func() (*schema.Provider, error){
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed since we're using ProviderFactories now instead.

},
},
})
}
Copy link
Contributor Author

@dak1n1 dak1n1 Dec 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few tests with the name _minimal. The purpose of the minimal tests is to see what happens when the user specifies the bare minimum required to create the resource. This tests all of our default values for the resource, and ensures the plan is empty after apply. Non-empty plans have been a problem across a few resources when defaults were applied incorrectly.

Config: requiredProviders() + testAccKubernetesDeploymentConfigLocal("kubernetes-local", name, imageName),
},
{
Config: requiredProviders() + testAccKubernetesDeploymentConfigLocal("kubernetes-local", name, imageName),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change here is to allow a different configuration for the Released version of the provider vs the locally-compiled version. It should test the new StateUpgraders.

}
ops = append(ops, specOps...)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled in Patrick's changes from here #769 (This came over during my last bug on-duty, since I was reviewing high priority PRs).

},
})
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole test is a result of looking at Patrick's work and seeing that certain fields are allowed to change in the Job resource. I tested manually using kubectl edit to figure out exactly which fields are mutable vs immutable. The API docs don't always accurately reflect that information.

),

// FIXME remove this when TTLSecondsAfterFinished defaults to true
ExpectNonEmptyPlan: true,
},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't actually mean to remove this comment... but then after it was removed, I figured I'd like the tests to keep telling me that this field isn't working yet. It's kind of noise, but I'd rather it be noisy than have to remember to come back and fix it later. Also, I learned that minikube can actually test this!

minikube start --feature-gates=TTLAfterFinished=true

@@ -41,7 +40,7 @@ func resourceKubernetesPod() *schema.Resource {
Required: true,
MaxItems: 1,
Elem: &schema.Resource{
Schema: podSpecFields,
Schema: podSpecFields(false, false),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just moved this down here because that's how it's laid out in the Deployment resource, and I was comparing the two when investigating the Pod volume bug.

},
},
})
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This tests all Pod defaults.

},
},
})
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test will error until we fix bug 961 😅

PlanOnly: true,
},
},
})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with bug 1085.

MaxItems: 1,
Elem: &schema.Resource{
Schema: podSpecFields(false, false, true),
Schema: podSpecFields(false, true),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I'm reviewing this again, I see it sets isComputed to true even though this isn't a computed field. I know I tested the end result of this pretty thoroughly, but I wonder if there were any unforeseen impacts of having isComputed=true,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's far away in my mind now, but re-reading the change https://github.com/hashicorp/terraform-provider-kubernetes/pull/193/files#diff-5f4831c5771d0f4db1c84186dd377b99c788922fe9dfd33171a92de100426f15R7, I believe I did pass true to isComputed here to allow for a transition from the deprecated fields to the new ones in existing configurations. Not doing this would have triggered use diffs on those new fields, I think.

@dak1n1
Copy link
Contributor Author

dak1n1 commented Dec 9, 2020

I opened this bug since I wasn't able to fix the issue in this PR. #1088 There is still an idempotency issue with the stateful_set resource. But it can be worked around easily by specifying an update_strategy.

@dak1n1
Copy link
Contributor Author

dak1n1 commented Dec 14, 2020

I looked at the crash, and it turns out only to be a crash during the acceptance test run. During a normal apply, it simply errors. So I made a new bug for that one too. #1094

@dak1n1
Copy link
Contributor Author

dak1n1 commented Dec 15, 2020

TC results look good. The remaining test failures have known causes (like missing the StateUpgraders, or some of the bugs I've opened during this PR). https://ci-oss.hashicorp.engineering/viewLog.html?buildId=163100

This PR is ready for review.

Copy link
Contributor

@jrhouston jrhouston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the heroic effort untangling all this @dak1n1. I like the addition of _minimal tests to test the smallest surface area of a resource.

* Remove deprecated replicationController spec.

* Make PodSpec ForceNew more explicit.

* Add make target for regression upgrade tests.

* Standardize image versions used in acceptance tests
  to minimize docker pulls from dockerhub.

* Ensure init container is always updatable.

* Ensure mutable JobSpec fields are updatable.

Co-authored-by: Ilia Lazebnik <Ilia.lazebnik@gmail.com>
Co-authored-by: Patrick Decat <pdecat@gmail.com>
@dak1n1 dak1n1 merged commit 484112d into hashicorp:master Dec 21, 2020
@ghost
Copy link

ghost commented Jan 21, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@hashicorp hashicorp locked as resolved and limited conversation to collaborators Jan 21, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants