
First pass at last_operation description proposal #537

Merged

Conversation

maleck13
Contributor

@maleck13 maleck13 commented Nov 6, 2017

Describe what this PR does and why we need it:

Proposal to allow APBs to write a last operation description and have the broker collect and return it as part of the last_operation API
Which issue this PR fixes (This will close that issue when PR gets merged)
#475

Note
While I am becoming more familiar with the broker code, I am still very new to the code base, so I would appreciate any feedback on the approach outlined here.

On the APB side I would need to investigate the code in order to better understand how we could add a module to allow writing to the log file.

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 6, 2017
@maleck13
Contributor Author

maleck13 commented Nov 6, 2017

@rthallisey As per the discussion on the issue, I have made an attempt at a proposal here to implement the last operation description. Any feedback is welcome when you get a chance.



## Proposed Implementation
Create an append only log file in a specified location ```/var/log/last_operation```, which is collected periodically and also a final time after
Contributor

terms:
final operation - the operation that indicates the apb is 100% complete
last_operation - the most recent operation an apb performed

One thing I like about waiting to collect the final operation is that we know the apb is complete. That might come in handy with dependencies. On the other hand, does the broker really need the final operation if it's only a description? How do you view the importance of the final operation?

Member

@djzager djzager left a comment

Thank you for this proposal. I really like this idea. I think it would be better implemented by manipulating the Pod's annotations rather than relying on log files and exec'ing into containers.

## Proposed Implementation
Create an append only log file in a specified location ```/var/log/last_operation```, which is collected periodically and also a final time after
the apb has completed but before the pod is deleted.
In the bind workflow, we already do something similar by execing into the container in order to collect
Member

I'm wondering if all of this is possible without using a file inside the APB container. Kubernetes already has a downward API for exposing things like the pod name and pod namespace; combined with Kubernetes annotations, this means we could simply have annotation(s) for last_operation and final_operation available through API calls without needing to exec.

$ kubectl annotate pods asb-1-h597s last_operation='Last operation added'
pod "asb-1-h597s" annotated
$ kubectl get pods asb-1-h597s -o go-template='{{ .metadata.annotations.last_operation }}'
Last operation added

Contributor Author

@djzager Interesting that you mention the annotations approach. This was something I also thought about. I shied away from it as the file approach seemed more fitting given the pod was already being exec'd into. I agree though this could be a simpler approach, particularly taking into account using the downward API to inform the apb.
Curious on your thoughts about using a watch while the pod was executing, would this be a reasonable way to pick up updates to the annotations in the broker?
I will rework the proposal to make use of the downward API and pod annotations.

Contributor

So, the APB pod that is running would add annotations on itself?

Contributor Author

Yes that is my understanding. I believe the apb module could make this transparent to the apb developer

Member

I like the idea of using a watch; unfortunately, this is where I get a little fuzzy on how the broker handles the provision/deprovision jobs, their associated channels, and writing state information to etcd. My immediate impression is that we may need to make some changes to the broker in order to support additional status changes if we are going to write the last_operation to etcd.

You may be able to get by with handling the in progress state with an associated request for the last_operation annotation on the pod when the service catalog is asking for the last operation.
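For illustration only, a minimal client-go sketch of that on-demand lookup (the `last_operation` annotation key and the surrounding names are placeholders, not a settled API):

```go
// Hedged sketch: when the service catalog polls the broker's last_operation
// endpoint, read the annotation straight off the running APB pod.
package apb

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func lastOperationDescription(client kubernetes.Interface, ns, podName string) (string, error) {
	pod, err := client.CoreV1().Pods(ns).Get(podName, metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	// An absent annotation simply means the APB has not reported anything yet.
	return pod.Annotations["last_operation"], nil
}
```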

Contributor

I am wondering if using labels instead of annotations might make more sense. I don't fully understand the difference, but I'm just starting the discussion.

Contributor Author

@maleck13 maleck13 Nov 8, 2017

I am a bit hazy on the difference also. However, I know labels are used as selectors and for querying via the client, whereas I have often seen annotations used for specific metadata, for example https://github.com/openshift/origin/blob/master/examples/quickstarts/cakephp-mysql-persistent.json#L11
So I think that in this case annotations would make more sense.
Although, perhaps just keys in the metadata, as in https://github.com/openshift/origin/blob/master/examples/quickstarts/cakephp-mysql-persistent.json#L18

Contributor Author

A question on this approach: if we go with the watch on the pod, I am wondering whether it would at some point also make sense to do the same thing for the credentials from a bind. Rather than exec'ing into the pod, have the pod create a secret in the pod's namespace and have the broker watch for secrets in that namespace.
Obviously this is separate from this work, but it could be a good paradigm for these kinds of interactions. I can create a separate issue to discuss this further.

Contributor

I don't know the security impact, if any, of the apb creating a secret and populating it. But I think it's a good idea. The broker could also create it and the apb populate it. There are a few avenues we could take here. We'll have to explore this separately.

So I think that in this case annotations would make more sense.

+1 for annotations. I think labels are meant for kubernetes services identifying other kubernetes services.

Curious on your thoughts about using a watch while the pod was executing, would this be a
reasonable way to pick up updates to the annotations in the broker?

If the broker watches the resource, it should allow us to always get the final operation because the logic will exist in the broker. We can grab the final operation right before the namespace is cleaned up. +1

Contributor

@jmrodri jmrodri left a comment

I agree with the comments about looking into a better approach than adding another exec.

@jmrodri
Contributor

jmrodri commented Nov 9, 2017

@djzager we may need an experiment with the annotations to see if that is something we can definitely do.

@jmrodri
Contributor

jmrodri commented Nov 9, 2017

I'm assuming this is what we're talking about: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/

@maleck13
Contributor Author

maleck13 commented Nov 9, 2017

@jmrodri yes that is what we're talking about

@djzager
Member

djzager commented Nov 9, 2017

It would be nice if I could thread conversations. I'm working on an experiment for you @jmrodri by creating a broker that passes the pod_name and pod_namespace to the pod and creating a hack APB that annotates itself.

@shawn-hurley
Contributor

I thought there was some way to pull this data from the pod. I know the namespace is possible, but the pod name might be a little harder.

@djzager
Member

djzager commented Nov 9, 2017

You may have heard me celebrating @shawn-hurley that this worked.

	pod := &v1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name:   executionContext.PodName,
			Labels: labels,
		},
		Spec: v1.PodSpec{
			Containers: []v1.Container{
				{
					Name:  "apb",
					Image: spec.Image,
					Args: []string{
						action,
						"--extra-vars",
						extraVars,
					},
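					// Downward API: expose the pod's own name and namespace as env vars so the APB can annotate itself.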
					Env: []v1.EnvVar{
						v1.EnvVar{
							Name: "POD_NAME",
							ValueFrom: &v1.EnvVarSource{
								FieldRef: &v1.ObjectFieldSelector{
									FieldPath: "metadata.name",
								},
							},
						},
						v1.EnvVar{
							Name: "POD_NAMESPACE",
							ValueFrom: &v1.EnvVarSource{
								FieldRef: &v1.ObjectFieldSelector{
									FieldPath: "metadata.namespace",
								},
							},
						},
					},
					ImagePullPolicy: pullPolicy,
					VolumeMounts:    volumeMounts,
				},
			},
			RestartPolicy:      v1.RestartPolicyNever,
			ServiceAccountName: executionContext.ServiceAccount,
			Volumes:            volumes,
		},
	}

@djzager
Member

djzager commented Nov 9, 2017

I'm actually genuinely surprised that it was this easy and it worked on the first try. I updated the broker with these lines (we may want to use volumes in practice but 🤷‍♂️):

// pkg/apb/executor.go
		Spec: v1.PodSpec{
			Containers: []v1.Container{
				{
					Name:  "apb",
					Image: spec.Image,
					Args: []string{
						action,
						"--extra-vars",
						extraVars,
					},
					Env: []v1.EnvVar{
						v1.EnvVar{
							Name: "POD_NAME",
							ValueFrom: &v1.EnvVarSource{
								FieldRef: &v1.ObjectFieldSelector{
									FieldPath: "metadata.name",
								},
							},
						},
						v1.EnvVar{
							Name: "POD_NAMESPACE",
							ValueFrom: &v1.EnvVarSource{
								FieldRef: &v1.ObjectFieldSelector{
									FieldPath: "metadata.namespace",
								},
							},
						},
					},
					ImagePullPolicy: pullPolicy,
					VolumeMounts:    volumeMounts,
				},
			},
			RestartPolicy:      v1.RestartPolicyNever,
			ServiceAccountName: executionContext.ServiceAccount,
			Volumes:            volumes,
		},

And then I created a custom hello-world-apb whose last step was to annotate itself:

- name: annotate myself
  k8s_v1_pod:
    name: "{{ lookup('env', 'POD_NAME') }}"
    namespace: "{{ lookup('env', 'POD_NAMESPACE') }}"
    annotations:
      status: Finished

Then I ran the apb, keeping the namespace after completion:

$ oc get pods -n apb-push-hello-world-apb-prov-6l746 -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: Pod
  metadata:
    annotations:
      openshift.io/scc: restricted
      status: Finished

🕺

@jmrodri
Contributor

jmrodri commented Nov 9, 2017

@djzager so would an apb be annotating itself regularly? Like if I'm giving status of my installation, will I have to call that annotation multiple times? What are the restrictions for the status or annotation fields?

With the logfile approach, we can put ANYTHING in there. So I'd want to be able to get as much information as possible. Also, will annotations overwrite themselves? Say the apb writes 50%, then 60%, then 70%, then 80%. We only see the 50 and 80 because we didn't catch the 60 and 70 in time. Could this happen?

I like the changes, just want to see what the gotchas will be if any.

@jmrodri
Contributor

jmrodri commented Nov 9, 2017

@shawn-hurley you mentioned "pull this data". Are you wondering if there is a need to have the apb annotate itself for the name, etc?

@shawn-hurley
Contributor

@jmrodri I was referring to being able to use the known location of /var/run/secrets/serviceaccount/namespace to retrieve the namespace, and was wondering if the same thing existed for the pod name.

@djzager
Member

djzager commented Nov 9, 2017

@jmrodri

so would an apb be annotating itself regularly? Like if I'm giving status of my
installation, will I have to call that annotation multiple times?

An apb could annotate itself as often as it wants. My guess is that we would want to support annotations like last_operation and final_operation, and there are two prominent ways I see to do that: 1) use asb-modules to have something like asb_last_operation and asb_final_operation do this kind of thing under the covers (I like this because it makes it easier for APB developers); 2) do what we are doing in my experiment.

What are the restrictions for the status or annotation fields?

Annotations are unstructured key:value pairs, so we could put whatever we want in there, but I think the simpler the better.

With the logfile approach, we can put ANYTHING in there. So I'd want to be
able to get as much information as possible. Also, will annotations overwrite
themselves? Say the apb writes 50%, then 60%, then 70%, then 80%. We
only see the 50 and 80 because we didn't catch the 60 and 70 in time.
Could this happen?

Every time we update the annotation field (i.e. last_operation) it is overwritten, so the scenario you describe is possible and likely a common one.

@jmrodri
Contributor

jmrodri commented Nov 9, 2017

With the logfile approach, we can put ANYTHING in there. So I'd want to be
able to get as much information as possible. Also, will annotations overwrite
themselves? Say the apb writes 50%, then 60%, then 70%, then 80%. We
only see the 50 and 80 because we didn't catch the 60 and 70 in time.
Could this happen?

Every time we update the annotation field (i.e. last_operation) it is overwritten, so the scenario you describe is possible and likely a common one.

I only ask in order to set expectations. I can see bugs being written against this when the progress goes from 0 to 100% because we missed the 2 operations in between.

The workflow would be: the service catalog polls the broker's last_operation endpoint on a periodic basis; the broker watches the pod for annotation changes and, upon a change, sends a message to the channel with the status update; the apb in the pod does its work and calls a module to update the pod annotation.

My original concern might not be a problem; there is enough latency in this whole workflow, from the polling of the last_operation endpoint to the watch interval down to the time it takes for the apb to update its status. I think it will be okay if we miss a few. Also, getting some status is better than what we have today: NO STATUS!
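
For illustration, here's a rough client-go sketch of that broker-side watch, under the assumption that the APB writes a `last_operation` annotation on its own pod (function and channel names are made up for the example):

```go
// Hedged sketch: watch the APB pod and forward every last_operation annotation
// value to a channel the provision/deprovision subscriber can consume.
package apb

import (
	"k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
)

// watchLastOperation streams annotation updates for podName in namespace ns
// until the pod is deleted; the last value sent stands as the final operation.
func watchLastOperation(client kubernetes.Interface, ns, podName string, out chan<- string) error {
	opts := metav1.ListOptions{
		FieldSelector: fields.OneTermEqualSelector("metadata.name", podName).String(),
	}
	wi, err := client.CoreV1().Pods(ns).Watch(opts)
	if err != nil {
		return err
	}
	defer wi.Stop()

	for event := range wi.ResultChan() {
		pod, ok := event.Object.(*v1.Pod)
		if !ok {
			continue
		}
		if event.Type == watch.Deleted {
			return nil // pod is gone; stop watching
		}
		if desc, ok := pod.Annotations["last_operation"]; ok {
			out <- desc
		}
	}
	return nil
}
```

If two consecutive events carry the same description this will send duplicates, which the subscriber can de-dupe; that seems acceptable given the agreement above that missing (or repeating) intermediate updates is fine.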

@maleck13
Contributor Author

maleck13 commented Nov 9, 2017

An apb could annotate itself as often as it wants. My guess is that we would want to support annotations like last_operation and final_operation, and there are two prominent ways I see to do that: 1) use asb-modules to have something like asb_last_operation and asb_final_operation do this kind of thing under the covers (I like this because it makes it easier for APB developers)

I think going the asb module route also makes a lot of sense. That module could then handle retrieving the values of those annotations and updating the value by appending to the existing one, ensuring an append-only log.
I think from an end user's point of view, if it goes from "10%: creating X" to "10%: creating X, 20%: updating Y, 40%: creating X, 80%: deleting Z", you are still getting the full picture, which I think is fine.

One thing I think we should discuss is error scenarios. Is there a common or best way to "trap" an error during the apb execution to allow the developer to add some context and update the log before the apb exits?

@djzager
Member

djzager commented Nov 9, 2017

@maleck13

I think from an end user point of view, if it goes from 10%: creating X 
to then 10%: creating X, 20%: updating Y, 40%: creating X, 80% deleting 
Z you are still getting the full picture which I think is fine.

When going the asb-module route, we could: 1) get the last_operation and 2) make the annotation be `{{ previous_operation }}, 80% deleting` to do as you are suggesting. Personally, I just don't think we should do that. I believe the asb-module should keep writing its last_operation annotation like:

// at start
  metadata:
    annotations:
      last_operation: "10%: creating"
// in the middle
  metadata:
    annotations:
      last_operation: "50%: running"
// in the end
  metadata:
    annotations:
      last_operation: "100%: done"
      final_operation: "Woot, we're done"

I say this because it seems appropriate for us to be reporting the last_operation when asked, not all the operations performed.

@maleck13
Contributor Author

maleck13 commented Nov 9, 2017

@djzager Yeah, I see where you are coming from, and strictly speaking I think you are correct: it should be the last operation only. The concern I would have then is around consistency; as mentioned in a previous comment, you could potentially get very different feedback depending on the timings of the polling.
Is it stretching the use of this API too far to have the last operation present (i.e. the last thing appended to the log) along with the previous operations?
Perhaps this is something that needs discussion with the service catalog to see if there is interest in supporting something like {status: "in progress", "description": "doing something", "operation_log": ["something I did before"]}?

@maleck13
Contributor Author

maleck13 commented Nov 9, 2017

I will wait to update the proposal until we iron out these different pieces

@rthallisey
Contributor

@djzager Yeah, I see where you are coming from, and strictly speaking I think you are correct: it should
be the last operation only. The concern I would have then is around consistency; as mentioned in a
previous comment, you could potentially get very different feedback depending on the timings of the
polling. Is it stretching the use of this API too far to have the last operation present (i.e. the last thing
appended to the log) along with the previous operations?

I think it's going to be hard to expect consistency from last_operation because the description field can be anything. It's plain text so we're not requiring a percentage or anything, but it gives the user a way to demonstrate progress that is meaningful to them. So to me, it doesn't matter if we're missing any messages because the description isn't guaranteed to be a definition of state.

Here is a summary of my expectations:

  1. The last_operation demonstrates APB progress.
  2. No restrictions or requirements for user to demonstrate APB progress.
  3. No guarantee that last_operation shows every operation.
  4. The final_operation is the last_operation gathered before the APB is deleted.

@maleck13
Contributor Author

maleck13 commented Nov 10, 2017

OK, this sounds like a good definition. Based on the thumbs up, it looks like there is agreement, so I think I can proceed with an update to the proposal based on the conversations here.

  • using the pod annotations to add the last operation and final operation
  • using a watch on the pod to receive changes to this field and store it as part of the Job status
  • using the downward API to get the pod name and namespace
  • using an apb module that will use the pod name and namespace to allow the developer to add a last operation description

I am away next week so it may be after next week before I complete it

@maleck13
Contributor Author

Updated proposal based on discussions here

@coveralls

Coverage Status

Changes Unknown when pulling 603e07d on maleck13:proposal-last-operation-description into openshift:master.

Member

@djzager djzager left a comment

Looking pretty good to me. A bunch of nitpicks and a few meaningful requests for you.

- The last_operation demonstrates APB progress.
- No restrictions or requirements for user to demonstrate APB progress.
- No guarantee that last_operation shows every operation.
- The final_operation is the last_operation gathered before the APB is deleted.
Member

`deleted` → `completes`


**Last Operation:** the most recent operation an apb performed

**Final operation:** the operation that indicates the apb is 100% complete
Member

`Final operation` → `Final Operation`

Member

I'd also recommend lining this up with the expectations: `Final Operation: the last operation before the APB completed.`


## Terms

**Last Operation:** the most recent operation an apb performed
Member

nit: strings.ToUpper("apb"), I see this elsewhere


```go

wi, err := k8client.CoreV1().Pods(ns).Watch(meta_v1.ListOptions{})
Member

🎉 It wasn't immediately obvious to me how to accomplish this, nice.

- Modify the existing provision and deprovision subscribers along with the corresponding work messages.
A namespace field would be added to these messages as the pod name is already present.
Inside the subscriber, the go routine to watch the pod and update the JobStatus would start when a new work msg was recieved.
This routine would stop once the a deleted change was received in the watch.
Member

nit: "once the a deleted change" → "once a deleted change"


- Modify last operation handler to pull the description out of the Job state and send it back as part of the response.

### APB changes
Member

Actually, I don't think you are targeting the APB specifically here, more the underlying ansible-asb-modules. I don't think you would be putting the last operation on disk; you should just modify the running pod's annotation directly with the new asb_last_operation_description.

Something like the following may make sense:

```
asb_last_operation_description:
Member

This applies to both new ansible-asb-module actions: drop `_description` from the name.

Contributor Author

👍


In order to collect this information we would use a watch via the Kubernetes client on the pod resource within the temporary namespace [Pod Rest API](https://docs.openshift.com/container-platform/3.5/rest_api/kubernetes_v1.html#list-or-watch-objects-of-kind-pod).
This would allow us to react to changes made (i.e to the annotations) on Pod Object. Whenever a change occurred, an update to the JobStatus would happen. If the ```apb_final_operation``` annotation is present this would take precedence over the last_operation annotation.
Once the pod was deleted we would stop the watch on the pod and update the JobStatus ```final_operation``` annotation value.
Contributor

Do we need two annotations? If the pod completes, the final_operation = last_operation. Since we're watching the pod anyway, we'll know when it's complete. That will allow us to get both data points with a single annotation.

Contributor Author

Right, that's actually a good point, and it simplifies things further. I will update based on this.

@maleck13 maleck13 force-pushed the proposal-last-operation-description branch from 603e07d to 8e4dc8f Compare November 17, 2017 13:33
@maleck13
Contributor Author

@jmrodri @djzager I have updated based on all the feedback here now. PTAL and let me know if there is anything further required.

Member

@djzager djzager left a comment

LGTM, one nit

description:"10%: creating deployment and routes"
```

***Note not very familiar with how the ansible apb modules wors under the hood so would need some guidance here***
Member

nit: `wors` → `works`

Member

ansibleplaybookbundle/ansible-asb-modules#8 actually has an example for you of using the kubernetes API to create a secret. If you need any more guidance, I think I can help you out.

Contributor

@rthallisey rthallisey left a comment

Nice job Craig

@coveralls

Coverage Status

Changes Unknown when pulling 8e4dc8f on maleck13:proposal-last-operation-description into openshift:master.

@maleck13 maleck13 force-pushed the proposal-last-operation-description branch from 8e4dc8f to 55ded73 Compare November 17, 2017 14:35
@maleck13
Contributor Author

Fixed the typo. Thanks for the help, guys. I am hoping to start looking at implementation next week. However, we have a lot going on at the moment with the mobile next project, as we have finished our PoC and are planning for tech preview, so I will have to see how things pan out.

@coveralls

Coverage Status

Changes Unknown when pulling 55ded73 on maleck13:proposal-last-operation-description into openshift:master.

@jmrodri jmrodri merged commit 8f96a6a into openshift:master Nov 17, 2017
@maleck13
Contributor Author

@jmrodri @rthallisey After spending some time looking at the implementation of this, I have a question about the subscribers and the role they play, in particular the setting of the JobState. Does it make sense for these subscribers to set the JobState, or would it make sense to change the jobs to have access to the DAO and set the JobState directly?
I believe this would simplify the overall design somewhat. Or is there a principle or design philosophy that I am unaware of here for why the subscribers should be doing this? The channels seem to be about sending a work msg (something that should be done async as a piece of background work), whereas setting the state of an executing job seems like something that should be reflected as soon as possible.

@rthallisey
Contributor

One of the reasons behind subscribers was async, where the work and the state handling are two separate threads. A second thing about subscribers is that it makes a general API for an $action that handles all its logging and its DAO connection. This may come in handy later with the recent discussion in the community about making the broker a more general API. I think for now we should keep it as is. We can always change it down the road.
