Wrong pvc for middle manager #135

Open
sneerin opened this issue Feb 9, 2021 · 12 comments

Comments

@sneerin

sneerin commented Feb 9, 2021

create Pod druid-tiny-cluster-middlemanagers-0 in StatefulSet druid-tiny-cluster-middlemanagers failed error: failed to create PVC -druid-tiny-cluster-middlemanagers-0: PersistentVolumeClaim "-druid-tiny-cluster-middlemanagers-0" is invalid: metadata.name: Invalid value: "-druid-tiny-cluster-middlemanagers-0": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')

@AdheipSingh
Contributor

Can you send your YAML?
Ideally tiny-cluster.yaml should not throw this error. Regardless, can you tell me which CRD you installed? I wonder if the OpenAPI schema has something to do with it.

@sneerin
Author

sneerin commented Feb 9, 2021

I'm using the sample. What I'm basically trying to do is add more peons so that I can add more datasources.
```yaml
middlemanagers:
  nodeType: "middleManager"
  druid.port: 8080
  replicas: 3
  nodeConfigMountPath: /opt/druid/conf/druid/cluster/data/middleManager
  ports:
    -
      name: peon-0-pt
      containerPort: 8100
    -
      name: peon-1-pt
      containerPort: 8101
    -
      name: peon-2-pt
      containerPort: 8102
    -
      name: peon-3-pt
      containerPort: 8103
    -
      name: peon-4-pt
      containerPort: 8104
    -
      name: peon-5-pt
      containerPort: 8105
    -
      name: peon-6-pt
      containerPort: 8106
    -
      name: peon-7-pt
      containerPort: 8107
    -
      name: peon-8-pt
      containerPort: 8108
    -
      name: peon-9-pt
      containerPort: 8109

  runtime.properties: |-
    druid.service=druid/middleManager
    druid.worker.capacity=3
    druid.indexer.runner.javaOpts=-server -XX:MaxDirectMemorySize=10240g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/druid/data/tmp -Dlog4j.debug -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=50 -XX:GCLogFileSize=10m -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:+UseG1GC -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Xloggc:/druid/data/logs/peon.gc.%t.%p.log -XX:HeapDumpPath=/druid/data/logs/peon.%t.%p.hprof -Xms1G -Xmx1G
    druid.indexer.task.baseTaskDir=/druid/data/baseTaskDir
    druid.server.http.numThreads=10
    druid.indexer.fork.property.druid.processing.buffer.sizeBytes=268435456
    druid.indexer.fork.property.druid.processing.numMergeBuffers=1
    druid.indexer.fork.property.druid.processing.numThreads=10
    druid.indexer.task.hadoopWorkingPath=/druid/data/hadoop-working-path
    druid.indexer.task.defaultHadoopCoordinates=[\"org.apache.hadoop:hadoop-client:2.7.3\"]
  extra.jvm.options: |-
    -Xmx1G
    -Xms1G
  volumeClaimTemplates:
   - metadata:
       name: data-volume
     spec:
       accessModes:
       - ReadWriteOnce
       resources:
         requests:
           storage: 2Gi

  volumeMounts:
    - mountPath: /druid/data
      name: data-volume
  resources:
    requests:
      memory: "3Gi"
      cpu: "4"
    limits:
      memory: "3Gi"
      cpu: "4"
```

@dsx
Contributor

dsx commented Feb 16, 2021

Same symptoms:

create Pod druid-druid-historicals-0 in StatefulSet druid-druid-historicals failed error: failed to create PVC -druid-druid-historicals-0: PersistentVolumeClaim "-druid-druid-historicals-0" is invalid: metadata.name: Invalid value: "-druid-druid-historicals-0": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')

Kubernetes v1.20.2
druid-operator v0.0.5

@himanshug
Member

@sneerin @dsx thanks for reporting the issue.

My first guess is that this is a YAML indentation issue: there is one extra space before the whole of the following section.

   - metadata:
       name: data-volume
     spec:
       accessModes:
       - ReadWriteOnce
       resources:
         requests:
           storage: 2Gi

Can you retry with one space removed from the beginning of all of those lines? Please confirm whether that fixes the issue; if it does, I will add some validation logic to the operator to catch this specific case.

If that does not fix the issue, then:

It sounds like the StatefulSet is getting created successfully. Can you post the volumeClaimTemplates snippet from the StatefulSet spec that the druid operator created in k8s?

E.g. run `kubectl get sts druid-tiny-cluster-middlemanagers -o yaml` and copy the volumeClaimTemplates section from that.

My guess is that somehow volumeClaimTemplates's name is getting set to nil/empty instead of data-volume but want to verify that.
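
A quick way to test that guess is to look for claim templates without a name in the JSON form of the StatefulSet. The Python sketch below is illustrative only: the helper name `unnamed_claim_templates` and the sample dict are made up for this example and are not part of the operator.

```python
def unnamed_claim_templates(sts: dict) -> list:
    """Return the indexes of volumeClaimTemplates that have no metadata.name.

    `sts` is a StatefulSet parsed from JSON (hypothetical helper, not an
    operator or Kubernetes API).
    """
    templates = (sts.get("spec") or {}).get("volumeClaimTemplates") or []
    return [
        index
        for index, template in enumerate(templates)
        if not (template.get("metadata") or {}).get("name")
    ]


# Made-up sample: one template with a name, one without.
sample_sts = {
    "spec": {
        "volumeClaimTemplates": [
            {"metadata": {"name": "data-volume"}},
            {"metadata": {"creationTimestamp": None}},
        ]
    }
}

missing = unnamed_claim_templates(sample_sts)
print(missing)
```

Against a live cluster, the same function could be fed the parsed output of `kubectl get sts druid-tiny-cluster-middlemanagers -o json`, for example via `json.load(sys.stdin)`.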

@dsx
Contributor

dsx commented Feb 17, 2021

> My guess is that somehow volumeClaimTemplates's name is getting set to nil/empty instead of data-volume but want to verify that.

That is exactly what's happening:

  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 120Gi
      storageClassName: do-block-storage
      volumeMode: Filesystem
    status:
      phase: Pending
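
The missing `metadata.name` accounts for the leading hyphen in the rejected claim name. Kubernetes names each claim `<template name>-<pod name>`, so an empty template name leaves just `-<pod name>`, which fails the DNS-1123 subdomain check. A minimal Python sketch of that logic, using the pattern Kubernetes reports in the error message (the function names are illustrative, not Kubernetes APIs):

```python
import re

# Pattern reported by Kubernetes for DNS-1123 subdomain validation.
DNS1123_SUBDOMAIN = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)


def pvc_name(template_name: str, pod_name: str) -> str:
    """Compose a claim name the way a StatefulSet does: <template>-<pod>."""
    return f"{template_name}-{pod_name}"


def is_valid_subdomain(name: str) -> bool:
    """Check the whole name against the pattern and the 253-character limit."""
    return len(name) <= 253 and DNS1123_SUBDOMAIN.fullmatch(name) is not None


pod = "druid-tiny-cluster-middlemanagers-0"
good = pvc_name("data-volume", pod)  # template name present
bad = pvc_name("", pod)              # template name lost

print(good, is_valid_subdomain(good))
print(bad, is_valid_subdomain(bad))
```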

@himanshug
Member

@dsx did you check for the indentation issue? If that did not fix the problem, could you post your full tiny-cluster.yaml (or a version of it) that I can use to reproduce the issue?

@dsx
Contributor

dsx commented Feb 19, 2021

@himanshug indentation seems to be fine. Here is the manifest I'm using to deploy.

@himanshug
Member

I took another look at this today and I think there is a bug in the CRD definition that contains schema validation.

For now, you can use the CRD that has no schema validation, which will hopefully make things work: https://github.com/druid-io/druid-operator/blob/master/deploy/crds/druid.apache.org_druids_crd.yaml

I will look further into what is wrong with the schema validation in https://github.com/druid-io/druid-operator/blob/master/deploy/crds/druid.apache.org_druids.yaml

@lum-splunk

lum-splunk commented Mar 12, 2021

I was able to get the latest CRD to work by making this modification to the volumeClaimTemplates.metadata section:

                          metadata:
                            description: 'Standard object''s metadata. More info:
                              https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata'
                            properties:
                              name:
                                description: 'Name of volume.'
                                type: string
                              labels:
                                additionalProperties:
                                  type: string
                                description: 'Volume labels.'
                                type: object
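
One plausible explanation for why this helps, not confirmed in this thread, is field pruning: for CRDs with a structural schema, the API server drops object fields that the schema does not declare, so a `metadata` schema without a `name` property would silently lose the template name. The Python sketch below is a deliberately simplified model of that behaviour, not the API server's actual algorithm; `prune` and both schemas are hypothetical.

```python
def prune(value, schema):
    """Drop object fields that the schema does not declare (simplified model)."""
    if schema.get("x-kubernetes-preserve-unknown-fields"):
        return value
    if isinstance(value, dict):
        declared = schema.get("properties", {})
        extra = schema.get("additionalProperties")
        kept = {}
        for key, item in value.items():
            if key in declared:
                kept[key] = prune(item, declared[key])
            elif isinstance(extra, dict):
                # Map-like objects keep arbitrary keys of the declared type.
                kept[key] = prune(item, extra)
        return kept
    if isinstance(value, list):
        item_schema = schema.get("items", {})
        return [prune(item, item_schema) for item in value]
    return value


# Hypothetical "before" schema: metadata is an object with no declared fields.
schema_before = {"type": "object"}

# Hypothetical "after" schema, mirroring the modification above.
schema_after = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "labels": {"type": "object", "additionalProperties": {"type": "string"}},
    },
}

metadata = {"name": "data-volume", "labels": {"app": "druid"}}

print(prune(metadata, schema_before))
print(prune(metadata, schema_after))
```

Under this model the first call loses `name`, which would match the empty `metadata` seen in the generated StatefulSet, while the second call keeps it.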

@sneerin
Author

sneerin commented Mar 16, 2021

Hi, thanks a lot for sharing all the details. Is there a PR I can use?

@AdheipSingh
Contributor

@lum-splunk can you raise a PR for the fix?
Thanks

@lum-splunk

@AdheipSingh I created a PR here: #152

@AdheipSingh AdheipSingh mentioned this issue Apr 16, 2021