Chart refactor #3

FedeBev · 2021-06-14T16:01:55Z

Since there's still an open issue in the main repo and I've the feeling this chart has been dropped, I've decided to rework it completely.

The main differences are:

different values for server and agent
HPA
it works (seriously, the original didn't work for me because of this issue)
server is deployed using a StatefulSet, allowing the usage of PV (through volume claim template), resolving support K8s pvc #2
a custom image can be specified, useful when an image with a custom plugin should be used
I've no longer seen Auto-Join issues with Kubernetes #1, maybe just because of dkron upgrade (3.1.6)

Known issues:

no health check specified, always because of this (help wanted)

Let me know what are your thoughts, I'm open to feedback available to make any changes

vcastellm · 2021-11-07T22:37:38Z

Sorry for the mega-delay here @FedeBev, thank you very much for the work! I had not prioritized the K8s work until now, going to test this and merge if it works.

vcastellm

Aside from the nitpicks, it LGTM but I have been unable to do a rolling restart without breaking the cluster, what's your strategy?

dkron/templates/envsConfigmap.yaml

dkron/templates/serviceaccount.yaml

FedeBev · 2021-11-09T10:08:47Z

What kind of problem are you facing @Victorcoder? I just tried a rolling update from version v3.1.8 to v3.1.10 and everything went good except for a slight leader election delay.

Do you mind sharing your values and some details about your cluster?

I've just fixed the missing newlines and cleaned up the sample values file

vcastellm · 2021-11-14T22:27:09Z

I just do a helm install helm upgrade and the rolling is too fast for the leader election to happen correctly, when done rolling there's no leader in the cluster, all nodes ends up as followers.

Are you tweeking some other options?

FedeBev · 2021-11-24T09:32:46Z

No additional tweek on my configuration.

This also happens to me, but after a few seconds after the end of the rollout a new leader is elected again and everything goes fine.

We can try to tune the rollout strategy, but taking into account the issue about the kubernetes discovery, I don't think we can achieve something better with helm. An operator would solve the issue, but it's a huge effort.

vcastellm · 2021-11-25T09:03:37Z

@FedeBev what about a liveness, readiness check?

FedeBev · 2021-11-25T09:11:33Z

@vcastellm in the issue about the kubernetes discovery I've mentioned before, you can find why I wasn't able to use those.

I don't see any other possible way right now. I guess we should change something about how dkron starts or request a PR to the kubernetes discovery library.
In my opinion, the point is that dkron should be able to discover itself when the pod is running (/health returns 200), NOT when is ready. This way the k8s discovery works and we can implement a useful /health and ready probe.

vcastellm · 2021-11-25T09:14:46Z

Understood, thanks, I will take a look on how we can improve the /health endpoint

FedeBev · 2021-11-25T11:19:34Z

Keep me posted, I'd be glad to help.
Dkron could become a core component for a project I'm working on however our platform relies entirely on K8s and this is a big issue

Espina2 · 2021-12-05T13:40:16Z

I'm also very interested in this fix. @vcastellm @FedeBev anything that I can do to make this happen? Also, is this not better than the currently available chart?

FedeBev · 2021-12-07T09:28:36Z

@Espina2 the chart is working on my cluster in a dev environment, but it's still very young and there's still a long way to go before being production ready.
There's an issue with the k8s discovery that makes the dkron cluster unavailable due to master election during the rolling upgrade.

* Add Server Pod Disruption Budget * Add missing newline at EOF * Add preStop command to mitigate raft peers total caos in Server StatefulSet Co-authored-by: Alberto Pinardi <alberto.pinardi@criptalia.it>

omarzouk · 2023-03-01T16:52:36Z

hello! also highly interested in this. Has anyone come up with a way to work around the discovery issue? any thoughts on how to proceed?

vcastellm · 2023-03-05T22:42:42Z

Merged this PR and added some changes, I'll take over from here.

Federico Bevione and others added 3 commits June 14, 2021 17:39

chart refactor

8c1b0eb

refactor folder struct

01d942e

add env variables

c2459d2

vcastellm mentioned this pull request Nov 8, 2021

Auto-Join issues with Kubernetes #1

Closed

vcastellm requested changes Nov 8, 2021

View reviewed changes

dkron/templates/envsConfigmap.yaml Outdated Show resolved Hide resolved

dkron/templates/serviceaccount.yaml Outdated Show resolved Hide resolved

FedeBev requested a review from vcastellm November 24, 2021 09:32

Add Server PDR to mitigate fast roll-over while upgrading (#1)

873d8b9

* Add Server Pod Disruption Budget * Add missing newline at EOF * Add preStop command to mitigate raft peers total caos in Server StatefulSet Co-authored-by: Alberto Pinardi <alberto.pinardi@criptalia.it>

vcastellm merged commit 873d8b9 into distribworks:master Mar 5, 2023

vcastellm mentioned this pull request Jul 23, 2023

k8s retry-join exclude itself because of non-readiness distribworks/dkron#1323

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Chart refactor #3

Chart refactor #3

FedeBev commented Jun 14, 2021 •

edited

vcastellm commented Nov 7, 2021

vcastellm left a comment

FedeBev commented Nov 9, 2021

vcastellm commented Nov 14, 2021

FedeBev commented Nov 24, 2021

vcastellm commented Nov 25, 2021

FedeBev commented Nov 25, 2021

vcastellm commented Nov 25, 2021

FedeBev commented Nov 25, 2021

Espina2 commented Dec 5, 2021

FedeBev commented Dec 7, 2021 •

edited

omarzouk commented Mar 1, 2023

vcastellm commented Mar 5, 2023

Chart refactor #3

Chart refactor #3

Conversation

FedeBev commented Jun 14, 2021 • edited

vcastellm commented Nov 7, 2021

vcastellm left a comment

Choose a reason for hiding this comment

FedeBev commented Nov 9, 2021

vcastellm commented Nov 14, 2021

FedeBev commented Nov 24, 2021

vcastellm commented Nov 25, 2021

FedeBev commented Nov 25, 2021

vcastellm commented Nov 25, 2021

FedeBev commented Nov 25, 2021

Espina2 commented Dec 5, 2021

FedeBev commented Dec 7, 2021 • edited

omarzouk commented Mar 1, 2023

vcastellm commented Mar 5, 2023

FedeBev commented Jun 14, 2021 •

edited

FedeBev commented Dec 7, 2021 •

edited