-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Instance controller refactoring #850
Conversation
@@ -232,7 +245,7 @@ func prepareKubeResources(plan *activePlan, meta *executionMetadata, renderer ku | |||
} | |||
|
|||
for _, phase := range plan.Spec.Phases { | |||
phaseState, _ := getPhaseFromStatus(phase.Name, plan.State) | |||
phaseState, _ := getPhaseFromStatus(phase.Name, plan.PlanStatus) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check error or remove it from the return signature of getPhaseFromStatus
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike the ignored error as well at first glance... however the error for an invalid name and we are getting the name for the phases. It is hard to see that this could error unless the plans change from underneath...
this is a private function that is use in only this use case... we should remove the error return from the function. I would also be inclined to from the prefix get
-> phaseFromStatus
. all of this is minor.
} | ||
} | ||
return nil | ||
} | ||
|
||
func prettyPrint(i interface{}) string { | ||
s, _ := json.MarshalIndent(i, "", " ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could fail for custom marshalers, make sure if this errors to set some reasonable default at least ("{}"?) (so you can preserve your return signature)
} | ||
|
||
return newState, nil | ||
} | ||
|
||
func executeStep(step v1alpha1.Step, state *v1alpha1.StepStatus, resources []runtime.Object, c client.Client) error { | ||
if isInProgress(state.State) { | ||
state.State = v1alpha1.PhaseStateInProgress | ||
if isInProgress(state.Status) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General comment: I'm wondering if instead of doing all of this inline setting of status fields everywhere, we could create a set of transformer functions that take a state and the next step, executes it, and returns the state of the PE at that point. Could make everything a lot more fluent and set the stage for various transformations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yeah, that was my original intentioon behing this executePlan refactoring but at that point I did not do it because it's hard to "compute the next step" - you need to figure out if it's update or create (that's easy) and then what to update (so compute the diff) which is harder. I think moving in this direction would make the test really testable though. Are you ok with me pursuing that direction in the next phase? as it's rather more a testability improvement in the execution engine than the Instance controller thing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with pursuing this in the next phase and agree 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking to have a number of passes for review.
This pass is looking at plan-execution changes:
Beyond the suggestion provided... there is a label for PlanExecutionAnnotation
https://github.com/kudobuilder/kudo/blob/master/pkg/util/kudo/labels.go#L14
It isn't clear how it is used now... it is used in instance/apply_convention.go
which is an applyConventionsToTemplates. I don't think this is needed any longer (meaning the PEAnnonation)
pkg/kudoctl/cmd/plan/plan_history.go
Outdated
labelSelector = fmt.Sprintf("instance=%s", options.Instance) | ||
} else { | ||
fmt.Printf("History of plan-executions for instance \"%s\" in namespace \"%s\" to operator-version \"%s\":\n", options.Instance, options.Namespace, args[0]) | ||
fmt.Printf("History of plan-executions for instance \"%s\" in namespace \"%s\" for operator-version \"%s\":\n", options.Instance, options.Namespace, args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fmt.Printf("History of plan-executions for instance \"%s\" in namespace \"%s\" for operator-version \"%s\":\n", options.Instance, options.Namespace, args[0]) | |
fmt.Printf("History of plan executions for instance \"%s\" in namespace \"%s\" for operator-version \"%s\":\n", options.Instance, options.Namespace, args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at plan history cmd more... I'm not sure any of this makes sense. you mentioned in our walk thru that this is slated for the next release... which I completely support. I wonder if we should just remove these commands or stub them out. they will not be functional.
the following:
planExecutionsGVR := schema.GroupVersionResource{
Group: "kudo.dev",
Version: "v1alpha1",
Resource: "planexecutions",
}
is not functional
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What Ken said! ^^
func(a handler.MapObject) []reconcile.Request { | ||
requests := make([]reconcile.Request, 0) | ||
// We want to query and queue up operators Instances | ||
instances := &kudov1alpha1.InstanceList{} | ||
// we are listing all instances here, which could come with some performance penalty | ||
// a possible optimization is to introduce filtering based on operatorversion (or operator) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm less concerned with the optimization concern (which we should do... we should put a todo: here)
I'm more concerned that when the watch condition is triggered for 1 instance we perform this operation on ALL instances. Why do we not work solely with the instance associated with the trigger?
We want to handle edge and level... we need to handle level in a different way IMO... one that isn't driven by events in the system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok... after further review... I see that we loop thru all instances... but we only operator on the one defined by "a". my concern isn't valid... I still suggest the //TODO:
but it isn't a big deal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nothing new in this PR, this was already in before. I don't like TODOs in code, I am happy to file an issue though :) #860
for i2, p := range v.Phases { | ||
planStatus.Phases[i2].Status = ExecutionPending | ||
for i3 := range p.Steps { | ||
i.Status.PlanStatus[n1].Phases[i2].Steps[i3].Status = ExecutionPending |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is unfortunate that i is instance here... and we have i2 and i3 as indexes. I would encourage expanding i into instance
in this case. and use i
, j
,k
as indexes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. it's a nit, but still: using n1, i2, i3
for indexes/keys is confusing. short var names are good until they aren't ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I rename this one I would have to rename all of them in the whole file :/ and I kind of like to use i for it, feels go-ish... I renamed the index to planIndex
let me know if that helps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alenkacz i've also seen Idx to go (heh heh) shorter in a lot of Go, so you could have:
planIdx, phaseIdx, stepIdx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a rule/situation where we can favor clarity over conciseness, because as it stands it would be really easy to refactor this, not thinking about indexes, and create a runtime out of bounds error. making those index names look like that embeds the semantic right in the access pattern.
|
||
// getInstance retrieves the instance by namespaced name | ||
// returns nil, nil when instance is not found (not found is not considered an error) | ||
func (r *Reconciler) getInstance(request ctrl.Request) (instance *kudov1alpha1.Instance, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like this should take a ctx context.Context
as an argument. This function is called once from Reconcile
and each step creates it's own context (which works of course) but the point of context is to carry the context throughout the execution.
same with getOperatorVersion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
additionally, would it be helpful to create our own Context struct embedding conext.Context here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, to counter this, including myself, we should question ourselves every time we think we need a context somewhere because context typically ends up being very viral in your codebase. I don't think it's an antipattern but should be very carefully considered as a whole before we do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure, let's talk about this... right now we don't really use the context at all, we always use context.TODO()
... until we start doing something meaningful with it I don't see much advantage of having it as dependency as it's only bloating the number of parameters that method has... Having it as dependency and then always passing in context.TODO() for now that feels a bit meaningless
// after have an active plan (which may be complete) | ||
return instance.Status.ActivePlan.Name == "" | ||
if len(missingRequiredParameters) != 0 { | ||
return nil, &executionError{err: fmt.Errorf("parameters are missing when evaluating template: %s", strings.Join(missingRequiredParameters, ",")), fatal: true, eventName: kudo.String("Missing parameter")} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should consider creating a Kubernetes event here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we actually do! :) the execution error contains name of the event and that is then emitted "automatically" in handleError
so the error type already holds the event information
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do. this executionError
will bubble up and land in handleError
method where an event will be generated.
👍 to @kensipe's comments, and I need another pass after taking some more time to think about it all. This is a great PR. I have a lot of little things I want to think about in the full context before I suggest anything that actually blocks this PR from landing. In general, with maybe disabling the Plan History command for the moment (not having tried it or delved into @kensipe's comments there), I'm open to landing this and continuing to increment on a lot of those nits and see how it evolves with Tasker. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a number of recommendations / nits... but nothing that should stop this.
nice work
ov = &kudov1alpha1.OperatorVersion{} | ||
err = c.Get(ctx, | ||
err = r.Get(context.TODO(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like it should be passed in
if err != nil { | ||
return nil, err | ||
} | ||
type executionError struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
executionError
is confusing to me... many things could be an execution error... I'm not sure when to use it and how it is different from InstanceError. The places it is used it looks like a PlanExecutionError. perhaps it is fine... but I had to hunt for it's use to understand it which means like some challenge in readability.
and... although we collapse planexec into instance... I could still see having a planexecution package which would allow for planexec.NewError etc. or plan... like plan.NewError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after more thought... I really don't like the broken CLI commands for plan. While I dislike adding more to a large PR... we can't merge this and then release... we will need to do something regarding plan commands.
I'm ok with merge with plan cmd fixes... which could be removing them... or commenting them out.. or responding with a "coming soon message" |
Co-Authored-By: Ken Sipe <kensipe@gmail.com>
what if we just don't actually register them with the command handler so we can iterate on it and it's tests, then introduce it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm! First batch of nits/comments out, more to come
P.S. don't forget to publish the events ;)
// v v | ||
//+------+------+ +-------+--------+ | ||
//| Fatal error | | Complete | | ||
//+-------------+ +----------------+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 👏 👏
// EnsurePlanStatusInitialized initializes plan status for all plans this instance supports | ||
// it does not trigger run of any plan | ||
// it either initializes everything for a fresh instance without any status or tries to adjust status after OV was updated | ||
func (i *Instance) EnsurePlanStatusInitialized(ov *OperatorVersion) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We talked about it offline, so it's merely a reminder: we need to unit tests this method. It's not very wild but merging Json manually deserves a few tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, as mentioned offline, we might wanna use apimachinery patch module to do a 2/3 way merge automatically.
const PhaseStatePending PhaseState = "PENDING" | ||
// isUpgradePlan returns true if this could be an upgrade plan - this is just an approximation because deploy plan can be used for both | ||
func isUpgradePlan(planName string) bool { | ||
return planName == "deploy" || planName == "upgrade" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we should put these magic strings into a const()
at least.
for i2, p := range v.Phases { | ||
planStatus.Phases[i2].Status = ExecutionPending | ||
for i3 := range p.Steps { | ||
i.Status.PlanStatus[n1].Phases[i2].Steps[i3].Status = ExecutionPending |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. it's a nit, but still: using n1, i2, i3
for indexes/keys is confusing. short var names are good until they aren't ;)
|
||
// PhaseState captures the state of the rollout. | ||
type PhaseState string | ||
// TODO in the future when we again support manual plan execution, snapshot should be saved only manually executed plans |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate on why this should be the case?
|
||
// new instance, need to run deploy plan | ||
if i.NoPlanEverExecuted() { | ||
return kudo.String("deploy"), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
magic strings ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's your point here? :DD
okay okay, I'll do better :)
|
||
// did the instance change so that we need to run deploy/upgrade/update plan? | ||
instanceSnapshot, err := i.getSnapshotedSpec() | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getSnapshotedSpec
returns nil, nil
when no annotations
or no snapshotAnnotation
was found but you only check the err != nil
here (which could lead to a npe below)
Co-Authored-By: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>
Co-Authored-By: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>
…do into av/instance-controller-v2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part 2 done. I left a few more nits but nothing that should prevent this PR from landing! 🚢 Great work!
// TODO in the future when we again support manual plan execution, snapshot should be saved only manually executed plans | ||
err := i.SaveSnapshot() | ||
if err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we wrap it into an InstanceError
to make it visible on the event bus?
// Automatically generate RBAC rules to allow the Controller to read and write Deployments | ||
// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete | ||
// +kubebuilder:rbac:groups=kudo.dev,resources=instances,verbs=get;list;watch;create;update;patch;delete | ||
func (r *Reconciler) Reconcile(request ctrl.Request) (ctrl.Result, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one good looking Reconcile
method! 👏
if err != nil { | ||
log.Printf("InstanceController: Error creating PlanExecution") | ||
return err | ||
fmt.Printf("Pre-error instance %v", instance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: remove this line - logging will happen in r. handleError
fmt.Printf("Pre-error instance %v", instance) |
// after have an active plan (which may be complete) | ||
return instance.Status.ActivePlan.Name == "" | ||
if len(missingRequiredParameters) != 0 { | ||
return nil, &executionError{err: fmt.Errorf("parameters are missing when evaluating template: %s", strings.Join(missingRequiredParameters, ",")), fatal: true, eventName: kudo.String("Missing parameter")} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do. this executionError
will bubble up and land in handleError
method where an event will be generated.
pkg/kudoctl/cmd/plan/plan_history.go
Outdated
labelSelector = fmt.Sprintf("instance=%s", options.Instance) | ||
} else { | ||
fmt.Printf("History of plan-executions for instance \"%s\" in namespace \"%s\" to operator-version \"%s\":\n", options.Instance, options.Namespace, args[0]) | ||
fmt.Printf("History of plan-executions for instance \"%s\" in namespace \"%s\" for operator-version \"%s\":\n", options.Instance, options.Namespace, args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What Ken said! ^^
pkg/util/health/ready.go
Outdated
|
||
if plan.Status.State == kudov1alpha1.PhaseStateComplete { | ||
if obj.Status.AggregatedStatus.Status == kudov1alpha1.ExecutionComplete { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wdyt?
if obj.Status.AggregatedStatus.Status == kudov1alpha1.ExecutionComplete { | |
if isFinished(obj.Status.AggregatedStatus.Status){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
What this PR does / why we need it:
This is an implementation of controller based on KEP-18.
Some highlights:
Fixes #422