Add a limited generic YAML resource to the provider #195

Status: Open · wants to merge 14 commits into base: master

Conversation (3 participants)
@lawrencegripper

lawrencegripper commented Oct 26, 2018

Fixes #183

Hi,

First up, hear me out: I think this is a good approach with serious upside for a large number of users. I know this has been looked at before; I've spent time reading those threads and hope this approach is different enough to warrant consideration.

What doesn't this solve?

Creation of standard types (services, pods, etc.). I propose that the preferred method for creating these remains the existing rich schema resources.

What does it solve?

This allows users to create other, more exotic types in Kubernetes through Terraform. The main aim is to support CRDs (I'm currently testing with an ingress resource but plan to test extensively against CRDs).

It follows a similar model to the existing Azure TF provider, in which rich types are provided but a 'fallback' resource lets you supply JSON directly to the Azure APIs. Like that resource, it has heavy limitations (updates are performed as delete-then-create, the YAML must contain only a single resource, and there are no outputs).

Why is it different from past attempts?

The code doesn't require any knowledge of the resources it's creating. Using the UniversalDecoder and PartialObjectMetadata along with the ServiceDiscoveryClient and RestClient, it is able to parse the YAML provided, validate that the Kubernetes cluster can support the type defined, and create a RestClient capable of performing CRUD operations on the resource in K8s.

This means that, like the ARM Deployment resource, it can support new K8s resources going forward without code changes.
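To make the mechanics concrete, here is a purely illustrative sketch of the path construction such a discovery-driven client ends up performing. This is not the PR's code: in the real provider the plural resource name and namespacing rules come from the discovery client rather than being passed in as arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// restPath derives the CRUD endpoint for any resource from its
// apiVersion alone: core ("v1") resources live under /api, everything
// else (group/version) under /apis. No per-type schema is required.
func restPath(apiVersion, plural, namespace, name string) string {
	parts := []string{}
	if strings.Contains(apiVersion, "/") {
		parts = append(parts, "apis", apiVersion) // e.g. "apps/v1"
	} else {
		parts = append(parts, "api", apiVersion) // core group, e.g. "v1"
	}
	if namespace != "" {
		parts = append(parts, "namespaces", namespace)
	}
	parts = append(parts, plural)
	if name != "" {
		parts = append(parts, name)
	}
	return "/" + strings.Join(parts, "/")
}

func main() {
	fmt.Println(restPath("apps/v1", "deployments", "default", "web"))
	// /apis/apps/v1/namespaces/default/deployments/web
	fmt.Println(restPath("v1", "services", "default", ""))
	// /api/v1/namespaces/default/services
}
```

Because the path is derived entirely from the manifest's own metadata, a newly installed CRD is addressable the moment the cluster's discovery endpoint advertises it.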

Secondly, the code avoids odd parsing and reflection over the YAML and the cluster's version of the resource to spot diffs. It uses a CustomizeDiff function along with the ResourceVersion and UID fields present on K8s resources. If any change is made to the K8s resource outside of TF, it is picked up by the diff and the resource is recreated in the correct configuration.
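The diff rule this describes reduces to a comparison of identity fields. A toy version (illustrative names only, not the actual wiring through the Terraform SDK's CustomizeDiff) might look like:

```go
package main

import "fmt"

// driftAction sketches the decision: if the live object's UID differs,
// the object was replaced out-of-band; if only ResourceVersion differs,
// it was edited. Either way the declared YAML wins and the resource is
// recreated; if the object is gone entirely, it is created afresh.
func driftAction(stateUID, stateRV, liveUID, liveRV string) string {
	switch {
	case liveUID == "":
		return "create" // object no longer exists in the cluster
	case liveUID != stateUID || liveRV != stateRV:
		return "recreate" // out-of-band change detected
	default:
		return "no-op" // cluster matches recorded state
	}
}

func main() {
	fmt.Println(driftAction("uid-1", "1001", "uid-1", "1001")) // no-op
	fmt.Println(driftAction("uid-1", "1001", "uid-1", "1042")) // recreate
	fmt.Println(driftAction("uid-1", "1001", "", ""))          // create
}
```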

Status

Currently this is manually tested. The code is commented to aid understanding, but I haven't updated the provider docs. If there is interest in taking this forward, I will:

  • Add Documentation
  • Add Tests (CRDs and review for edge cases)
  • Tidy up code

@hashibot hashibot bot added the size/XL label Oct 26, 2018

@pdecat


Contributor

pdecat commented Oct 26, 2018

That would be a great addition!

What about using JSON instead of YAML, to be able to leverage Terraform's jsonencode() function to configure the resource (and the soon-to-come jsondecode() function)?

Maybe both languages could be supported, and the resource named something like kubernetes_generic with mutually exclusive json_body and yaml_body fields.

Edit: the current implementation does not seem too tied to YAML, so maybe just a body field with automated encoding detection could do it.
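A sketch of what that automated detection could look like (a hypothetical helper, not part of this PR): since JSON is itself valid YAML, only bodies that actually parse as JSON need a separate code path.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"unicode"
)

// detectEncoding guesses the body's language: a body whose first
// non-space character is '{' and that parses as JSON is treated as
// JSON; anything else falls back to the YAML path.
func detectEncoding(body string) string {
	trimmed := strings.TrimLeftFunc(body, unicode.IsSpace)
	if strings.HasPrefix(trimmed, "{") {
		var v interface{}
		if json.Unmarshal([]byte(body), &v) == nil {
			return "json"
		}
	}
	return "yaml"
}

func main() {
	fmt.Println(detectEncoding(`{"kind": "Pod"}`)) // json
	fmt.Println(detectEncoding("kind: Pod\n"))     // yaml
}
```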

@paultyng


Contributor

paultyng commented Oct 26, 2018

@pdecat JSON is valid YAML, so I think in theory it would just work.

@pdecat


Contributor

pdecat commented Oct 26, 2018

@paultyng that's what I figured reviewing the code; it is not much tied to YAML.

    // Create the resource in Kubernetes
    err = client.Post().AbsPath(absPath["POST"]).Body(rawObj).Do().Into(metaObj)
    if err != nil {
        return fmt.Errorf("failed to create resource in kuberentes: %+v", err)

@pdecat (Contributor), Oct 26, 2018: Typo. Suggested change:

    - return fmt.Errorf("failed to create resource in kuberentes: %+v", err)
    + return fmt.Errorf("failed to create resource in kubernetes: %+v", err)

    // defined in the YAML
    client, absPath, rawObj, err := getRestClientFromYaml(yaml, meta.(KubeProvider))
    if err != nil {
        return fmt.Errorf("failed to create kuberentes rest client for resource: %+v", err)

@pdecat (Contributor), Oct 26, 2018: Typo. Suggested change:

    - return fmt.Errorf("failed to create kuberentes rest client for resource: %+v", err)
    + return fmt.Errorf("failed to create kubernetes rest client for resource: %+v", err)

    // defined in the YAML
    client, absPaths, _, err := getRestClientFromYaml(yaml, meta.(KubeProvider))
    if err != nil {
        return fmt.Errorf("failed to create kuberentes rest client for resource: %+v", err)

@pdecat (Contributor), Oct 26, 2018: Same typo. Suggested change:

    - return fmt.Errorf("failed to create kuberentes rest client for resource: %+v", err)
    + return fmt.Errorf("failed to create kubernetes rest client for resource: %+v", err)

    client, absPaths, _, err := getRestClientFromYaml(yaml, meta.(KubeProvider))
    if err != nil {
        return fmt.Errorf("failed to create kuberentes rest client for resource: %+v", err)

@pdecat (Contributor), Oct 26, 2018: Same typo. Suggested change:

    - return fmt.Errorf("failed to create kuberentes rest client for resource: %+v", err)
    + return fmt.Errorf("failed to create kubernetes rest client for resource: %+v", err)

    metaObj := &meta_v1beta1.PartialObjectMetadata{}
    err = client.Delete().AbsPath(absPaths["DELETE"]).Do().Into(metaObj)
    if err != nil {
        return fmt.Errorf("failed to delete kuberentes resource '%s': %+v", metaObj.SelfLink, err)

@pdecat (Contributor), Oct 26, 2018: Same typo. Suggested change:

    - return fmt.Errorf("failed to delete kuberentes resource '%s': %+v", metaObj.SelfLink, err)
    + return fmt.Errorf("failed to delete kubernetes resource '%s': %+v", metaObj.SelfLink, err)

    client, absPaths, _, err := getRestClientFromYaml(yaml, meta.(KubeProvider))
    if err != nil {
        return false, fmt.Errorf("failed to create kuberentes rest client for resource: %+v", err)

@pdecat (Contributor), Oct 26, 2018: Same typo. Suggested change:

    - return false, fmt.Errorf("failed to create kuberentes rest client for resource: %+v", err)
    + return false, fmt.Errorf("failed to create kubernetes rest client for resource: %+v", err)
@paultyng


Contributor

paultyng commented Oct 26, 2018

@lawrencegripper thanks for this PR, this is a common ask (see #150, #141, #109 and also terraform-provider-k8s), and definitely something we are interested in doing. We also do something similar for the nomad_job resource, but we have learned lessons from both the way that is implemented, and with the Azure resource you mention.

Due to the way the type and diffing system was implemented in Terraform 0.11 and earlier, we couldn't get the UX to work quite the way we wanted. But in Terraform 0.12, coupled with some of the improvements to the type system that allow for better nested complex types, we think we can provide this capability in a way that fits much more cleanly with how Terraform users expect it to work (and how Kubernetes natives expect it to work). It does, however, require changes to the Terraform Provider SDK to enable those new features.

You've also run into some of the complications with the CustomizeDiff work you started. Kubernetes uses its own three-way diffing logic, different from Terraform's. This resource should really replicate that functionality so that it works in a way Kubernetes users expect (the de facto standard being how kubectl apply works). Unfortunately, that diffing logic isn't packaged up in an easy-to-use way, and even though there is work to implement server-side apply, relying on it would limit the resource's usefulness to the Kubernetes versions where that lands and onward.

We have an ongoing project internally where we have collected the requirements for this new resource, and we are just waiting for 0.12 to be released (and for the new maintainer of this provider, @alexsomesan, to ramp up somewhat) before working on the proposal for implementation. I would think once 0.12 has landed and "baked" somewhat, we will probably post the proposal on this repository for community feedback.

@lawrencegripper


lawrencegripper commented Oct 29, 2018

@paultyng awesome, great to hear there will be improvements that can be made to this when TF 0.12 arrives. In the meantime, I think the UX of the code in this PR does fit the brief of matching the expectations of TF and K8s users.

I've recorded a quick run-through of using the resource. First I use TF to create the resource in the cluster, then I use kubectl to delete the resource and show TF recreating it. Next I edit the resource with kubectl (changing port 80 -> 81) and re-run TF. As the edit changed the resource_version, this is picked up by TF and an update is planned, then applied. The user sees the update reason referencing resource_version as having changed.

(screen recording "yamlux" omitted)

Regarding the three-way merges and kubectl apply: I spent a bit of time using govendor to pull in the jsonmergepath logic from kubectl and, after playing for a while, discounted that approach (technically it felt doable, although I didn't prove that).

One thing that put me off this approach was that it didn't seem to fit my expectations of how the Terraform provider should work (these may well not be right). I expect TF to follow a desired-configuration approach and fix any drift between the declared configuration and the actual running state. As such, if I declare a YAML definition for a resource, I expect that to be what is running. I didn't feel users would expect manual edits made to the resource through kubectl to be maintained, as they would be with a three-way merge (for example, if they added configuration fields not specified in the YAML in TF).

Another limitation of the three-way merge approach is that the provider has no knowledge of which fields in the YAML support updating. Essentially you would need to dynamically decide which fields are likely to require ForceNew, which I didn't see as feasible. kubectl offers a force option, which makes apply fall back to delete-then-create. I looked at using this, but it creates another issue: in some circumstances TF would update the resource and in others recreate it, and when running a plan the user wouldn't know which was going to happen. That didn't feel nice.

That being said, if three-way merge support is a blocker to this work being merged, I'm happy to revisit the problem and attempt to vendor in the kubectl apply patch logic for use within the provider.
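For readers unfamiliar with the semantics being debated, kubectl-style three-way merge can be sketched over flat string maps. The real implementation handles nested objects and strategic-merge-patch directives; this toy version only shows why cluster-added fields survive an apply.

```go
package main

import "fmt"

// threeWayMerge combines three views of an object: the previously
// applied config (original), the new desired config (modified), and
// the live object (current). Fields dropped from the desired config
// are deleted, desired fields win, and fields set only in the cluster
// are preserved, which is exactly the "manual edits survive" behaviour
// discussed above.
func threeWayMerge(original, modified, current map[string]string) map[string]string {
	result := map[string]string{}
	for k, v := range current {
		result[k] = v // start from live state: unmanaged fields survive
	}
	for k := range original {
		if _, still := modified[k]; !still {
			delete(result, k) // field removed from desired config
		}
	}
	for k, v := range modified {
		result[k] = v // desired config wins for managed fields
	}
	return result
}

func main() {
	original := map[string]string{"port": "80", "old": "x"}
	modified := map[string]string{"port": "81"}
	current := map[string]string{"port": "80", "old": "x", "clusterAdded": "y"}
	fmt.Println(threeWayMerge(original, modified, current))
	// map[clusterAdded:y port:81]
}
```

It is precisely the preservation of clusterAdded here that a pure desired-state approach would instead revert.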

@paultyng


Contributor

paultyng commented Oct 29, 2018

> One thing that put me off this approach was that it didn't seem to fit with my expectations of how the Terraform provider should work [...] I didn't feel like users would expect manual edits made to the resource through Kubectl to be maintained, as they would be with a 3 way merge.

I think the issue with this is that users aren't the only thing editing the YAML. Admission controllers, scaling systems, etc. all modify resources, so there is potential for a lot of non-user drift in the configuration.

An ingress is an interesting example: in a cloud provider, if you provision an ingress, I would imagine the cloud's ingress system writes updates back to the resource, potentially leading to an update loop?

@paultyng

Contributor

paultyng commented Oct 29, 2018 (comment minimized; body not shown)

@lawrencegripper


lawrencegripper commented Oct 30, 2018

Damn - good spot, hadn't considered that.

I had a play this morning and I believe I have a solution that will work with mutating admission controllers and other controllers updating the object. Happy to take questions on this, as I went quite far down the rabbit hole and found it quite hard to explain. Here goes:

I loop through the fields set in the user-provided YAML (for example the "image" field) and pull the corresponding field from the resource as it was returned to me by Kubernetes after creation in the cluster.

I then build a hash from these values at creation time. When TF does a Read to compare state, it rebuilds this hash and compares it against the one stored in state at creation time, triggering an update if there is a diff. The hash ignores status and other server-side fields, along with any fields not specified by the user in their YAML. (Note: we can choose here whether to enforce matching on ALL non-server-side fields or only compare those explicitly set.)

For admission controllers, like this example mutating controller I wrote, my expectation is that, as the mutation occurs at the time of resource creation, the RESTClient will receive the mutated object back in the response. This allows TF to deal with a definition that will be mutated on deployment into the cluster, as the hash represents the mutated values.

To test this worked as expected, I added a service of type LoadBalancer into the YAML example. This causes the cloud provider to assign an IP and set it in the status object asynchronously. When using only the ResourceVersion, this would have wrongly triggered an update from TF. With the new hash approach it works as expected: TF ignores the change to the status object.

The end result is that, in my limited testing, it now plays nicely with a mutating admission controller and with updates to the status fields of the resource (or other fields not directly set by the user). I still use the ResourceVersion to optimize the process, only recalculating the hash when there has been a change.
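A minimal sketch of this hashing approach, assuming both the user's YAML and the live object have been decoded into generic maps (the helper names are illustrative, not the PR's actual code):

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// userFieldsHash hashes only the fields the user set in their YAML,
// taking the values from the live object returned by the API server.
// Server-populated fields (status, resourceVersion, ...) never enter
// the hash, so asynchronous updates to them cause no diff.
func userFieldsHash(userSpec, live map[string]interface{}) string {
	canonical, _ := json.Marshal(project(userSpec, live)) // Marshal sorts map keys, so bytes are stable
	return fmt.Sprintf("%x", sha256.Sum256(canonical))
}

// project returns live's values for exactly the keys present in
// userSpec, recursing into nested objects.
func project(userSpec, live map[string]interface{}) map[string]interface{} {
	out := map[string]interface{}{}
	for k, v := range userSpec {
		lv, ok := live[k]
		if !ok {
			continue // user-set field gone from the cluster: absent from hash input, so the hash changes
		}
		if sub, isMap := v.(map[string]interface{}); isMap {
			if lsub, isMap2 := lv.(map[string]interface{}); isMap2 {
				out[k] = project(sub, lsub)
				continue
			}
		}
		out[k] = lv
	}
	return out
}

func main() {
	user := map[string]interface{}{"spec": map[string]interface{}{"port": 80.0}}
	live := map[string]interface{}{
		"spec":   map[string]interface{}{"port": 80.0},
		"status": map[string]interface{}{"loadBalancerIP": "1.2.3.4"},
	}
	h1 := userFieldsHash(user, live)

	// A server-side status update does not change the hash...
	live["status"].(map[string]interface{})["loadBalancerIP"] = "5.6.7.8"
	fmt.Println(h1 == userFieldsHash(user, live)) // prints true

	// ...but an out-of-band edit to a user-set field does.
	live["spec"].(map[string]interface{})["port"] = 81.0
	fmt.Println(h1 == userFieldsHash(user, live)) // prints false
}
```

Because the hash is computed from the object as returned by the API server, a mutating admission controller's changes are folded in at creation time, matching the behaviour described in the comment above.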

@paultyng Is this something that you would consider merging assuming I add the appropriate tests to prove out this behavior and update the docs accordingly?
