K8s Jobs for teams #204

Closed
31 of 33 tasks
Morriz opened this issue Nov 16, 2020 · 3 comments · Fixed by #258
Labels: Story (Scrum story)

Morriz (Contributor) commented Nov 16, 2020

Link to discussion: https://github.com/redkubes/otomi-core/discussions/295

Tasks

Core

Meta-schema proof of concept

  • Add proof-of-concept functionality to values-schema.yaml for Kubernetes Jobs.
  • Add a proof-of-concept jobs.demo.yaml.
  • Modify requirements based on the initial review.

Previous configuration refactoring

  • Delete the previous configuration that allowed specifying a job for development purposes.
  • Refactor the Job/CronJob chart to enable the spec.enabled flag by default.
  • Refactor the Job/CronJob chart to allow specifying an array of Jobs.

Team jobs basic functionality

  • Teams can enter jobs in jobs.demo.yaml.
  • Add functionality to Helmfile (helmfile-60.teams.yaml).

Extra functionality for Jobs (e.g. when the Job spec changes; this should be exposed to the end user)

  • Explore Helm hooks.
  • Explore Kubernetes Job specification (preferred over Helm hooks).
  • Add a proof of concept to values-schema.yaml for review.
  • Add functionality to the relevant Helm Charts.
  • Propose support for running a job always or onChange, so the default behaviour (a job is always created) can be overridden (a sketch of the schema addition follows this list).
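
To make the always/onChange idea concrete, here is a rough idea of what the proof-of-concept addition to values-schema.yaml could look like. Only the always/onChange options come from this issue; the property names and the surrounding schema structure are assumptions for illustration.

```yaml
# Hypothetical values-schema.yaml fragment for a team job entry.
# Only the always/onChange options come from this issue; property names
# and nesting are illustrative assumptions.
definitions:
  teamJob:
    type: object
    required:
      - name
    properties:
      name:
        type: string
      deployment:
        description: When to (re)run the job.
        type: string
        enum:
          - always
          - onChange
        default: always   # assumed default, matching "a job is always created"
```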

Missing functionality, see #295 (an illustrative jobs.demo.yaml sketch follows this list)

  • jobs.demo.yaml
    • Job Spec
      • .init.env
        • values-schema.yaml
        • Job Chart wiring
      • .init.image
        • values-schema.yaml
        • Job Chart wiring
      • .init.secretRefs: needs to reference a Secret
        • values-schema.yaml
        • Job Chart wiring
      • .secretRefs
        • values-schema.yaml
        • Job Chart wiring
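
For reference, a minimal sketch of how a single job entry in jobs.demo.yaml might look once the fields above are wired through. The field names (init.env, init.image, init.secretRefs, secretRefs) come from the list; the values, image names, and surrounding structure are hypothetical.

```yaml
# Hypothetical jobs.demo.yaml entry; only the init.env, init.image,
# init.secretRefs and secretRefs field names come from the list above,
# everything else is illustrative.
jobs:
  - name: db-migrate
    image:
      repository: otomi/sample-migrate         # hypothetical image
      tag: latest
    init:                                      # optional init container
      image:
        repository: otomi/wait-for-db          # hypothetical init image
        tag: latest
      env:
        DB_HOST: db.team-demo.svc.cluster.local
      secretRefs:                              # Kubernetes Secrets exposed as env
        - team-demo-db-credentials
    env:
      MIGRATIONS_DIR: /migrations
    secretRefs:
      - team-demo-db-credentials
```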

Definition Of Done

  • Tasks are done
  • (Unit) Tests are performed
  • (Architecture) Design Record(s) have been added as adr/*.md and appended to the list in adr/_index.md
  • Specs and demo files have been updated to reflect code changes
  • Documentation has been updated
  • Functionality, code, and/or documentation has been peer reviewed
  • Relevant team members have been notified
0-sv self-assigned this Nov 26, 2020
0-sv added the enhancement (New feature or request) label Nov 26, 2020
0-sv linked a pull request Dec 10, 2020 that will close this issue
0-sv linked a pull request Dec 14, 2020 that will close this issue
0-sv linked a pull request Dec 21, 2020 that will close this issue
0-sv (Contributor) commented Dec 22, 2020

As of now I'm awaiting responses to the discussions in #248 and the review of #258, @Morriz @j-zimnowoda.

0-sv added the Task (Scrum task) and Story (Scrum story) labels and removed the enhancement (New feature or request) and Task (Scrum task) labels Dec 22, 2020
0-sv mentioned this issue Jan 5, 2021
0-sv (Contributor) commented Jan 5, 2021

Relating to task: "Extra functionality for when the Job specification changes."

The core problem is that a Job defined as a standard release (e.g. in jobs.demo.yaml) will run on every execution, whether or not its spec has changed.

@Morriz showed me an issue in the helmfile repo that explains the problem. He did, however, merge a workaround.

My question is whether this would be enough to solve our problem as well. We can of course use this command as we see fit, but for someone just using otomi-values/otomi-console I don't think it would be enough. My research concluded that native Kubernetes offers no way to express in the Job specification what kind of execution is wanted (once, always, onSpecChanged, etc.). There was also an assumption that a Job would only run if its spec changed; that is not true, since a new Job is always created.

So the solution/work-around seems to be to implement some scripting based on a key-value pair that is only exposed in otomi-values (something like: .teamConfig.teams./team-\w/.jobs[].deployment: always|onChange).
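
To illustrate, the key from the path above could surface in otomi-values roughly as follows; only the deployment field and its always|onChange values are taken from the proposal, the team and job names are placeholders.

```yaml
# Hypothetical otomi-values snippet; only the deployment key and its
# always|onChange values come from the proposal above.
teamConfig:
  teams:
    team-demo:
      jobs:
        - name: seed-database
          deployment: onChange   # re-run only when the job spec changes
        - name: rotate-keys
          deployment: always     # force a run on every deploy
```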

If so, we've covered the following cases:

  • Always, even if the spec didn't change.
  • Once, as it's just targeting the release

There remain the following cases:

  • If it is always run, should we be responsible for idempotence? I can imagine many cases where idempotency is the default, such as generating a new secret key, as long as the original references the variable. However, if someone specifies a database migration it becomes quite cumbersome, and there is no protection against letting him do it. I guess it would then only be useful if someone defines the database migration and then deletes it, which is the core issue described here.
  • If the spec changes, it doesn't actually matter, because it is always run.
  • The user wants to specify it in otomi-values, but it should only run the one time he needs it, so in between the execution of jobs (if there are multiple), or as part of a dependency. But then the user would also need to keep track of the last time the job was executed, or else it won't make sense in version control, as it is unclear when the job has run.

Thinking about all these cases of version-controlled jobs that should actually run only once, isn't it easier not to define them as releases, but to give the end user a tool to run a script using kubectl? That would reduce a lot of hassle imo. I would think a job by itself isn't very useful as a release, only as part of a release to glue something together. I think the end user doesn't have a lot of fine-grained control over the conditions for running a job, since I'm assuming he doesn't know about otomi-core, just that he wants to execute something at that moment.

As one commenter defined it in the mentioned issue: "I see this as a job for helm hooks as either a pre-upgrade or a post-upgrade hook. The chart that needs the job to run should be the one running the job." I'm inclined to agree with him.

I am thinking about whether adding this option would make the end user's life easier or harder; I'm inclined to say it would introduce some potential for error. If someone has CLI access to a Kubernetes cluster, one of the first things that pops up in most tutorials is how to echo "hello world" once, so I don't think the Job specification introduces any complexity that we should take away from the end user.

Thoughts? @redkubes/devops

Morriz (Contributor, Author) commented Jan 5, 2021

> Relating to task: "Extra functionality for when the Job specification changes."
>
> The core problem is that a Job defined as a standard release (e.g. in jobs.demo.yaml) will run on every execution, whether or not its spec has changed.
>
> @Morriz showed me an issue in the helmfile repo that explains the problem. He did, however, merge a workaround.

Unfortunately that is not a workaround, just a manual sync invocation to deploy a job.

> My question is whether this would be enough to solve our problem as well. We can of course use this command as we see fit, but for someone just using otomi-values/otomi-console I don't think it would be enough. My research concluded that native Kubernetes offers no way to express in the Job specification what kind of execution is wanted (once, always, onSpecChanged, etc.). There was also an assumption that a Job would only run if its spec changed; that is not true, since a new Job is always created.
>
> So the solution/work-around seems to be to implement some scripting based on a key-value pair that is only exposed in otomi-values (something like: .teamConfig.teams./team-\w/.jobs[].deployment: always|onChange).

Indeed. As I mentioned in a previous review: we use apply and not sync in our pipeline, so by default this results in a sync only when there is a diff. To guarantee deployment: always we need to add a script that iterates over the job releases and forces a sync on them. For that to work, EACH job must be registered as a helmfile release. I propose this naming scheme for such releases: team-$name-job-$jobname.

> If so, we've covered the following cases:
>
>   • Always, even if the spec didn't change.
>   • Once, as it's just targeting the release
>
> There remain the following cases:
>
>   • If it is always run, should we be responsible for idempotence? I can imagine many cases where idempotency is the default, such as generating a new secret key, as long as the original references the variable. However, if someone specifies a database migration it becomes quite cumbersome, and there is no protection against letting him do it. I guess it would then only be useful if someone defines the database migration and then deletes it, which is the core issue described here.
>   • If the spec changes, it doesn't actually matter, because it is always run.

This is actually in line with the idempotency expectation. onSpecChange will be fine, as a dev would only want to re-run when the script has changed.

>   • The user wants to specify it in otomi-values, but it should only run the one time he needs it, so in between the execution of jobs (if there are multiple), or as part of a dependency. But then the user would also need to keep track of the last time the job was executed, or else it won't make sense in version control, as it is unclear when the job has run.

Not relevant any more in light of what I just elaborated on: the user can be sure it ran once when onSpecChange was chosen, and it will only re-run when he changes it.

> Thinking about all these cases of version-controlled jobs that should actually run only once, isn't it easier not to define them as releases, but to give the end user a tool to run a script using kubectl? That would reduce a lot of hassle imo. I would think a job by itself isn't very useful as a release, only as part of a release to glue something together. I think the end user doesn't have a lot of fine-grained control over the conditions for running a job, since I'm assuming he doesn't know about otomi-core, just that he wants to execute something at that moment.

Our requirements already target the end user's needs, and the three chosen options for deployment cover them. We will add a special release file helmfile.tpl/jobs.yaml (so that jobs are not deployed by accident when a dev calls otomi sync) in which we iterate over the jobs and create the needed releases. It will be used by bin/jobs.sh, which becomes responsible for jobs only, and that script will be included in bin/deploy.sh, which is called from our pipeline.
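
A rough sketch of what such a helmfile.tpl/jobs.yaml could look like, following the team-$name-job-$jobname naming scheme proposed above; the chart path, labels, and value wiring are assumptions, not the final implementation:

```yaml
# Hypothetical helmfile.tpl/jobs.yaml: one helmfile release per team job,
# named team-$name-job-$jobname. Chart path, labels and value wiring are
# assumptions for illustration only.
releases:
{{- range $teamName, $team := .Values.teamConfig.teams }}
{{- range $job := $team.jobs }}
  - name: team-{{ $teamName }}-job-{{ $job.name }}
    namespace: team-{{ $teamName }}
    chart: ../charts/job                  # assumed shared Job/CronJob chart
    labels:
      pipeline: jobs                      # lets bin/jobs.sh select job releases only
      team: team-{{ $teamName }}
    values:
      - name: {{ $job.name }}
        deployment: {{ $job.deployment | default "always" }}   # assumed default
{{- end }}
{{- end }}
```

With one release per job, bin/jobs.sh could then select only these releases (e.g. via the label) and force a sync on the ones marked always, while onChange jobs only redeploy when their rendered spec produces a diff.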

> As one commenter defined it in the mentioned issue: "I see this as a job for helm hooks as either a pre-upgrade or a post-upgrade hook. The chart that needs the job to run should be the one running the job." I'm inclined to agree with him.

> I am thinking about whether adding this option would make the end user's life easier or harder; I'm inclined to say it would introduce some potential for error.

I think we eliminated room for error with our requirements. If you see more please elaborate.

> If someone has CLI access to a Kubernetes cluster, one of the first things that pops up in most tutorials is how to echo "hello world" once, so I don't think the Job specification introduces any complexity that we should take away from the end user.

I am not sure what you are trying to say here.

> Thoughts? @redkubes/devops

I hope you can proceed now ;)
