
Kubernetes provisioning example fails #924

Closed
bittrance opened this issue Mar 14, 2021 · 6 comments

@bittrance
Contributor

Describe the bug
The Cloud Auto-join page for Kubernetes suggests dkron agent --retry-join "provider=k8s label_selector=\"app=dkron,component=server\"". However, this command fails with the CLI argument parse error: invalid argument "provider=k8s label_selector=\"app=dkron,component=server\"" for "--retry-join" flag: parse error on line 1, column 28: bare " in non-quoted-field.

To Reproduce

// Belongs in dkron's cmd package, where ConfigFlagSet is defined;
// requires github.com/stretchr/testify/assert.
func Test_k8sQuotedLabels(t *testing.T) {
	flagset := ConfigFlagSet()
	err := flagset.Parse([]string{"--retry-join", `provider=k8s label_selector="foo=bar"`})
	assert.NoError(t, err)
}

Expected behavior
When trying without quoting, e.g. label_selector=foo=bar, the dkron log complains

ERRO[2021-03-14T23:15:05+01:00] agent: Error Joining                          cluster=LAN error="discover: label_selector: - equals in key's value, enclosing double-quote needed label_selector=\"value-with-=-symbol\"" node=jaron

so I would have expected this advice to work.

Specifications
Encountered with Docker image and on master branch at 2b2a2b2.

Additional context
While researching this, I ran into containers/podman#2574, which I believe describes a similar issue and suggests using StringArray instead of StringSlice. I do not have an easy way to run a locally built container in my Kubernetes cluster, so I can't easily verify whether that fix fully addresses this issue; I can only confirm that the parsing test given above turns green with it.

@bittrance changed the title from "Kubernetes provisioning example does fails" to "Kubernetes provisioning example fails" on Mar 14, 2021
@smullins3000

Does anyone know of a fix for this issue? I'm hitting the same problem and can't see a workaround for this.

@joshniec

joshniec commented May 29, 2022

Yes, I ran into this problem too when trying to pass the options as a CLI argument. Instead, I opted to set my dkron configuration in a YAML file, dkron.yml.

For example, my dkron.yml file (stored in the same directory as my manifests for the dkron agent) looks something like this:

server: false
data-dir: /opt/local/dkron
disable-http-tls: true
enable-prometheus: true
log-level: debug
ui: false
retry-join:
  - provider=k8s namespace=dkron label_selector="app=dkron,component=server"
serf-reconnect-timeout: 5s
tags:
  role: agent
  type: worker

I use kustomize's configMapGenerator to create the ConfigMap resource, because I also pass in another YAML file containing the Slack processor configuration:

configMapGenerator:
  - name: dkron-agent-config
    files:
      - dkron.yml
      - dkron-processor-slack.yml

And my manifest looks something like this; I pass --config=/etc/dkron/dkron.yml as a CLI argument. (Note: I run dkron agents as a DaemonSet on my Kubernetes cluster and my dkron server as a StatefulSet. I use a customized Docker image that includes both the dkron binary and kubectl, and the dkron agents execute jobs via kubectl exec and some other magic. I've also omitted some details that I cannot share here.)

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dkron-agent
  labels:
    app: dkron
    component: agent
spec:
  selector:
    matchLabels:
      app: dkron
      component: agent
  template:
    metadata:
      labels:
        app: dkron
        component: agent
    spec:
      serviceAccountName: dkron-agent-sa
      volumes:
        - name: dkron-agent-config
          projected:
            sources:
              - configMap:
                  name: dkron-agent-config
      containers:
        - name: dkron-agent
          image: <omitted>
          args:
            - agent
            - --config=/etc/dkron/dkron.yml
          ports:
            - name: http
              containerPort: 80
            - name: grpc
              containerPort: 6868
            - name: serf
              containerPort: 8946
          imagePullPolicy: Always
          volumeMounts:
            - name: dkron-agent-config
              mountPath: /etc/dkron
              readOnly: true

I use RBAC resources to control which APIs and resources the dkron agents can interact with on the api-server when calling kubectl.
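A minimal sketch of what such RBAC might look like, under some assumptions: the go-discover k8s provider lists pods by label, so the agent's ServiceAccount needs at least get/list on pods; the pods/exec rule applies only if, as above, agents run jobs via kubectl exec. The names (dkron-agent, dkron namespace) are hypothetical, matching the manifest above.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dkron-agent
  namespace: dkron
rules:
  # Pod discovery for --retry-join "provider=k8s ..."
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
  # Only needed if agents execute jobs via kubectl exec
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dkron-agent
  namespace: dkron
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: dkron-agent
subjects:
  - kind: ServiceAccount
    name: dkron-agent-sa
    namespace: dkron
```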

@nikunjbadjatya

Hi @joshniec

Can you share how you set up your dkron cluster on Kubernetes?
Is the dkron.yml the same for all the pods in the StatefulSet? For our setup, cloud auto-join does not work (version 3.2.1).

I have added an issue here #1191

Thanks.

@joshniec

I no longer work at the place where I first posted this comment, but I'll try to remember.

I had a 3-node server cluster running on StatefulSets. I recall it being extremely unstable; I would not at all recommend running your dkron servers in Kubernetes, though the agents are fine.

Yes, dkron.yml was the same for all pods in the StatefulSet.

Cloud auto-discovery did not work for me when defined in dkron.yml; it had to be passed via the CLI.

I would never recommend running the servers in Kubernetes; the Raft implementation here seems extremely buggy and unexpectedly breaks down or stops working at seemingly random times, or when a host reboots.

@smullins3000

Cloud auto-discovery did not work for me when defined in dkron.yml; it had to be passed via the CLI.

How did you get the CLI to work? That was the original problem in this issue.

@vcastellm
Member

Take a look at https://github.com/distribworks/dkron-helm/blob/master/dkron/templates/server-statefulSet.yaml#L46 — it should work just fine.

Closing in the meantime.
