Can't bind PVC to StorageClass using gluster block #23881

Open
uselessidbr opened this issue Sep 28, 2019 · 9 comments

@uselessidbr

commented Sep 28, 2019

Unable to get a PV provisioned for a PVC created from a gluster-block StorageClass.

The claim stays in the Pending state. The events show:

  waiting for a volume to be created, either by external provisioner "gluster.org/glusterblock-infra-storage" or manually created by system administrator
Version

oc v3.11.0+62803d0-1
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

openshift v3.11.0+d545883-301
kubernetes v1.11.0+d4cacc0

Steps To Reproduce
  1. Create Storage
  2. Select StorageClass: glusterfs-registry-block
  3. See storage in pending state forever (a minimal PVC that reproduces this is shown below)
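
For reference, a minimal PVC such as the one below reproduces the same Pending behaviour against the glusterfs-registry-block StorageClass (the claim name and size are illustrative, not the ones the metrics installer creates):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gluster-block-test   # illustrative name
spec:
  storageClassName: glusterfs-registry-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi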
Current Result

PVC in pending state showing:

  waiting for a volume to be created, either by external provisioner "gluster.org/glusterblock-infra-storage" or manually created by system administrator

Expected Result

PVC should bind to PV dynamically.

Additional Information

OC ADM DIAGNOSTICS:
oc_adm_diag.txt

OC GET ALL
oc_get_all.txt

[root@master1 ~]# oc get pvc
NAME                  STATUS    VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS               AGE
metrics-cassandra-1   Pending                                       glusterfs-registry-block   1h
[root@master1 ~]# oc describe pvc metrics-cassandra-1
Name: metrics-cassandra-1
Namespace: openshift-infra
StorageClass: glusterfs-registry-block
Status: Pending
Volume:
Labels:
Annotations: volume.beta.kubernetes.io/storage-class=glusterfs-registry-block
volume.beta.kubernetes.io/storage-provisioner=gluster.org/glusterblock-infra-storage
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
Events:
Type Reason Age From Message


Normal ExternalProvisioning 3m (x332 over 1h) persistentvolume-controller waiting for a volume to be created, either by external provisioner "gluster.org/glusterblock-infra-storage" or manually created by system administrator

[root@master1 ~]# oc get storageclass
NAME                       PROVISIONER                              AGE
glusterfs-registry-block   gluster.org/glusterblock-infra-storage   1h
glusterfs-storage          kubernetes.io/glusterfs                  1h
glusterfs-storage-block    gluster.org/glusterblock-app-storage     1h

[root@master1 ~]# oc describe storageclass glusterfs-registry-block
Name: glusterfs-registry-block
IsDefaultClass: No
Annotations:
Provisioner: gluster.org/glusterblock-infra-storage
Parameters: chapauthenabled=true,hacount=3,restsecretname=heketi-registry-admin-secret-block,restsecretnamespace=infra-storage,resturl=http://heketi-registry.infra-storage.svc:8080,restuser=admin
AllowVolumeExpansion:
MountOptions:
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events:
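
For completeness, the provisioner side can be checked like this; the DeploymentConfig name below is an assumption based on what openshift-ansible creates for the registry namespace, so adjust it to whatever oc get dc -n infra-storage shows:

# is the external provisioner pod running, and is it logging anything?
oc -n infra-storage get pods | grep glusterblock
oc -n infra-storage logs dc/glusterblock-registry-provisioner-dc --tail=50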

Inventory:

openshift_master_dynamic_provisioning_enabled=true

openshift_hosted_registry_storage_kind=glusterfs
openshift_hosted_registry_storage_volume_size=20Gi
openshift_hosted_registry_selector='node-role.kubernetes.io/infra=true'

openshift_metrics_install_metrics=true
openshift_metrics_cassandra_storage_type=pv
openshift_metrics_hawkular_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_cassandra_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_heapster_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_storage_volume_size=10Gi
openshift_metrics_cassandra_pvc_storage_class_name="glusterfs-registry-block"

openshift_logging_install_logging=true
openshift_logging_kibana_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_curator_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_pvc_size=10Gi
openshift_logging_elasticsearch_storage_type=pvc
openshift_logging_es_pvc_storage_class_name="glusterfs-registry-block"
openshift_logging_es_pvc_dynamic=true

openshift_storage_glusterfs_timeout=900
openshift_storage_glusterfs_namespace=app-storage
openshift_storage_glusterfs_storageclass=true
openshift_storage_glusterfs_storageclass_default=false
openshift_storage_glusterfs_block_deploy=true
openshift_storage_glusterfs_block_host_vol_size=50
openshift_storage_glusterfs_block_storageclass=true
openshift_storage_glusterfs_block_storageclass_default=false
openshift_storage_glusterfs_wipe=true

openshift_storage_glusterfs_registry_timeout=900
openshift_storage_glusterfs_registry_namespace=infra-storage
openshift_storage_glusterfs_registry_block_deploy=true
openshift_storage_glusterfs_registry_block_host_vol_size=30
openshift_storage_glusterfs_registry_block_storageclass=true
openshift_storage_glusterfs_registry_block_storageclass_default=false
openshift_storage_glusterfs_registry_wipe=true

[glusterfs]
node1.openshift.local glusterfs_devices='[ "/dev/sdc" ]'
node2.openshift.local glusterfs_devices='[ "/dev/sdc" ]'
node3.openshift.local glusterfs_devices='[ "/dev/sdc" ]'

[glusterfs_registry]
master1.openshift.local glusterfs_devices='[ "/dev/sdc" ]'
master2.openshift.local glusterfs_devices='[ "/dev/sdc" ]'
master3.openshift.local glusterfs_devices='[ "/dev/sdc" ]'

@uselessidbr

Author

commented Sep 29, 2019

On the heketi pod I got this output on deploy:

Setting up heketi database
File: /var/lib/heketi/heketi.db
Size: 65536 Blocks: 104 IO Block: 131072 regular file
Device: 100004h/1048580d Inode: 12260247531261666053 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2019-09-29 02:35:55.934796553 +0000
Modify: 2019-09-29 02:45:25.856185586 +0000
Change: 2019-09-29 02:45:25.858185601 +0000
Birth: -
Heketi v9.0.0-1-g57a5f356-release-9
[heketi] INFO 2019/09/29 02:50:15 Loaded kubernetes executor
[heketi] INFO 2019/09/29 02:50:15 Volumes per cluster limit is set to default value of 1000
[heketi] INFO 2019/09/29 02:50:15 Block: Auto Create Block Hosting Volume set to true
[heketi] INFO 2019/09/29 02:50:15 Block: New Block Hosting Volume size 30 GB
[heketi] INFO 2019/09/29 02:50:15 Started Node Health Cache Monitor
[heketi] INFO 2019/09/29 02:50:15 GlusterFS Application Loaded
[heketi] INFO 2019/09/29 02:50:15 Started background pending operations cleaner

But no block volume was created.
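
To double-check from the heketi side, something like the following should list any block-hosting volumes and block volumes; the pod label selector and the HEKETI_ADMIN_KEY variable are assumptions based on the openshift-ansible heketi template:

# open a shell in the heketi-registry pod
oc -n infra-storage rsh $(oc -n infra-storage get pods -l glusterfs=heketi-registry-pod -o name | head -1)

# inside the pod; HEKETI_ADMIN_KEY is assumed to be set by the deployment
heketi-cli --server http://localhost:8080 --user admin --secret "${HEKETI_ADMIN_KEY}" volume list
heketi-cli --server http://localhost:8080 --user admin --secret "${HEKETI_ADMIN_KEY}" blockvolume list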

@uselessidbr

Author

commented Sep 29, 2019

Hello!

I think I managed to solve the problem.

Since the PVC was waiting for the provisioner, and I couldn't see any logs from the provisioner, I suspected the "provisioner" field in the StorageClass definition was wrong:

Provisioner: gluster.org/glusterblock-infra-storage

Although there was an environment variable set in the provisioner pod:

[screenshot of the provisioner pod's environment variables]

So I changed the provisioner to "gluster.org/glusterblock" in the StorageClass:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: glusterfs-registry-block-teste
provisioner: gluster.org/glusterblock
parameters:
  chapauthenabled: 'true'
  hacount: '3'
  restsecretname: heketi-registry-admin-secret-block
  restsecretnamespace: infra-storage
  resturl: 'http://heketi-registry.infra-storage.svc:8080'
  restuser: admin
reclaimPolicy: Delete
volumeBindingMode: Immediate

After that I had to change the secret "heketi-registry-admin-secret-block" in the namespace "infra-storage", because it had the type "gluster.org/glusterblock-infra-storage" while the provisioner was looking for "gluster.org/glusterblock".

I had to copy the content of the secret, delete it, and recreate it with the expected type "gluster.org/glusterblock".
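
A sketch of that recreation (the data key name "key" is an assumption based on the secrets openshift-ansible creates for gluster-block; verify against your own secret before deleting anything):

# back up the existing secret, then recreate it with the type the provisioner expects
oc -n infra-storage get secret heketi-registry-admin-secret-block -o yaml > heketi-secret-backup.yaml
oc -n infra-storage delete secret heketi-registry-admin-secret-block
oc -n infra-storage create secret generic heketi-registry-admin-secret-block \
  --type="gluster.org/glusterblock" \
  --from-literal=key="<admin key copied from the backup>"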

@uselessidbr

Author

commented Sep 29, 2019

Update.

The only thing I'm concerned about is that I have two provisioners (app-storage and infra-storage) and both are receiving requests to delete volumes. I'm not sure whether that would be a problem unless a volume name happens to match on both ends.

Is that expected behaviour? Both StorageClasses have distinct resturl values, but something is definitely wrong with the PROVISIONER_NAME environment variable, which is what I think should isolate the two provisioners.

@cgruver


commented Oct 17, 2019

Good catch! I was having similar issues with a newly deployed OKD 3.11 cluster, while an older cluster is working fine. After comparing the two, your notes matched up: the older cluster uses a provisioner of "gluster.org/glusterblock". I suspect a bug was introduced into the Ansible playbooks, or an Ansible version upgrade has broken deprecated logic in the playbooks.

@cgruver


commented Oct 17, 2019

This may be it right here:

openshift-ansible/roles/openshift_storage_glusterfs/templates/gluster-block-storageclass.yml.j2

You will see that it sets "provisioner: gluster.org/glusterblock-{{ glusterfs_namespace }}". I suspect that this should really be "provisioner: gluster.org/glusterblock".


apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-{{ glusterfs_name }}-block
{% if glusterfs_block_storageclass_default is defined and glusterfs_block_storageclass_default %}
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
{% endif %}
provisioner: gluster.org/glusterblock-{{ glusterfs_namespace }}
parameters:
  resturl: "http://{% if glusterfs_heketi_is_native %}heketi-{{ glusterfs_name }}.{{ glusterfs_namespace }}.svc:8080{% else %}{{ glusterfs_heketi_url }}:{{ glusterfs_heketi_port }}{% endif %}"
  restuser: "admin"
  chapauthenabled: "true"
  hacount: "3"
{% if glusterfs_heketi_admin_key is defined %}
  restsecretnamespace: "{{ glusterfs_namespace }}"
  restsecretname: "heketi-{{ glusterfs_name }}-admin-secret-block"
{%- endif -%}

@cgruver


commented Oct 17, 2019

This change was introduced on Aug. 28 to address this:

https://bugzilla.redhat.com/show_bug.cgi?id=1738394

So, we may be missing some additional context? Perhaps there's a necessary change to the provisioner itself to indicate which namespace it is in.

At this point, I suspect that the glusterblock-provisioner container image is not properly consuming the PROVISIONER_NAME parameter from the deployment config. Perhaps there is a bug there, or we may just need a newer image that properly consumes that parameter.
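
A quick way to compare what the provisioner is told its name is with what the StorageClass expects (the DeploymentConfig name is an assumption based on the openshift-ansible defaults for the registry namespace):

oc -n infra-storage set env dc/glusterblock-registry-provisioner-dc --list | grep PROVISIONER_NAME
oc get storageclass glusterfs-registry-block -o jsonpath='{.provisioner}{"\n"}'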

@cgruver


commented Oct 17, 2019

Also look here:

kubernetes-incubator/external-storage#1168

The provisioner was modified to use the PROVISIONER_NAME provided in the environment.

This functionality is either still not working, or we ended up with older container images that still have the bug.

@cgruver


commented Oct 18, 2019

The container images currently on quay.io or Docker Hub are older than the code changes to the Ansible role for the Gluster install. There also appear to be some lingering issues with the glusterblock-provisioner code: it has "gluster.org/glusterblock" set as a global constant for the provisioner name, but does not replace that constant everywhere with the environment variable.

I'm testing a fix this morning.

@cgruver


commented Oct 18, 2019

I have a fix working in my OKD 3.11 cluster. The repo at "https://github.com/kubernetes-incubator/external-storage" is marked as deprecated, so I don't know if anyone will answer a pull request. However, you can get the fix from here: "https://github.com/cgruver/external-storage".

You will need to build a local instance of the glusterblock-provisioner container image and push it to your registry. I'm using Sonatype Nexus as a local and proxy registry for my OKD clusters. I have a local registry path called openshift which is where I put the container images for installation and updates.

git clone https://github.com/cgruver/external-storage.git
cd external-storage/gluster/block
export REGISTRY=your.registry.com:5000/openshift/
export VERSION=v3.11
make container

docker login your.registry.com:5000
docker push ${REGISTRY}glusterblock-provisioner:${VERSION}

The last thing that you will need to do is modify the DeploymentConfig for the glusterblock-provisioner to pull the correct image, if you are not doing a clean install.
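
For the existing-cluster case, something like this should do it; the DeploymentConfig and container names are assumptions based on the openshift-ansible defaults, so check yours with oc get dc -n infra-storage first:

# point the provisioner DeploymentConfig at the rebuilt image and roll it out
oc -n infra-storage set image dc/glusterblock-registry-provisioner-dc \
  glusterblock-provisioner=your.registry.com:5000/openshift/glusterblock-provisioner:v3.11
# the config-change trigger may already start a rollout; if not, trigger one
oc -n infra-storage rollout latest dc/glusterblock-registry-provisioner-dc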

If you are doing a clean install, then your Ansible inventory file just needs to know where the image is:

openshift_storage_glusterfs_block_image=your.registry.com:5000/openshift/glusterblock-provisioner:v3.11
openshift_storage_glusterfs_registry_block_image=your.registry.com:5000/openshift/glusterblock-provisioner:v3.11

That should fix it...

I'm going to provide cleaner code in the fix, and submit a pull request to the original repo.
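
Once the fixed image is running, the stuck claim from the original report should bind on its own, since the external provisioner keeps retrying Pending claims that carry its annotation:

# watch the previously Pending claim bind (namespace/claim name from the original report)
oc -n openshift-infra get pvc metrics-cassandra-1 -w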
