
harbor-core cannot find redis #786

Closed
mpdevul opened this issue Oct 1, 2021 · 17 comments · Fixed by #792
Labels
kind/bug Something isn't working

Comments

@mpdevul

mpdevul commented Oct 1, 2021

Trying release 1.1.1 on an OpenShift 4.7 cluster, the harbor-core pod fails:

NAME                                                                  READY   STATUS             RESTARTS   AGE
pod/harborcluster-sample-harbor-harbor-chartmuseum-7567d7df57-xqzjt   1/1     Running            0          78m
pod/harborcluster-sample-harbor-harbor-core-6747f59db7-q8knj          0/1     CrashLoopBackOff   18         78m
pod/harborcluster-sample-harbor-harbor-notaryserver-64b55cf98fnm67g   1/1     Running            0          78m
pod/harborcluster-sample-harbor-harbor-notarysigner-dc956749c-5glsc   1/1     Running            0          78m
pod/harborcluster-sample-harbor-harbor-portal-65d4d4b895-68rsv        1/1     Running            0          78m
pod/harborcluster-sample-harbor-harbor-registry-5996b5684d-rglg8      1/1     Running            0          78m
pod/harborcluster-sample-harbor-harbor-registryctl-7d997f9dc7-kls89   1/1     Running            0          78m
pod/harborcluster-sample-harbor-harbor-trivy-86d87c695d-9jgsz         1/1     Running            0          78m
pod/postgresql-cluster-sample-ns-harborcluster-sample-0               1/1     Running            0          15m

Logs below.

2021-10-01T09:38:41Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 500ms : redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host
2021-10-01T09:38:41Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 842.78964ms : redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host
2021-10-01T09:38:42Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 1.126973569s : redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host
2021-10-01T09:38:43Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 3.373485054s : redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host
2021-10-01T09:38:47Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 6.475083242s : redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host
2021-10-01T09:38:53Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 10s : redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host
2021-10-01T09:39:03Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 10s : redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host
2021-10-01T09:39:13Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 5.958670678s : redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host
2021-10-01T09:39:19Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 10s : redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host
2021-10-01T09:39:29Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 10s : redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host
2021-10-01T09:39:39Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 10s : redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host
2021-10-01T09:39:49Z [FATAL] [/core/main.go:165]: failed to initialize cache: retry timeout: redigo: no sentinels available; last error: dial tcp: lookup rfs-harborcluster-sample-redis on 11.32.0.10:53: no such host

I don't see any pod called rfs-harborcluster-sample-redis being created, nor the corresponding service.

I did see the Redis pod getting created in the older release. Could this have been missed?

@cndoit18
Collaborator

Hi, can you provide your installation steps?

@mpdevul
Author

mpdevul commented Oct 11, 2021

Hi, I have switched to the release tag v1.1.1, then ran
kubectl apply -f manifests/cluster/deployment.yaml
and I see the pods up and running.

Then I do kubectl apply -f manifests/samples/standard_stack_fs.yaml

Also, for the Postgres DB to come up on OpenShift I had to refer to the issue below and update the RBAC as mentioned.

zalando/postgres-operator#985

The Postgres DB comes up fine after this, but I do not see any Redis-related pods or services getting created in cluster-sample-ns. As mentioned originally, the harbor-core pod goes into CrashLoopBackOff because it cannot find the Redis service, which is never created.

@chlins
Member

chlins commented Oct 11, 2021

@mpdevul Hi, can you provide the redis operator logs and check whether the Redis CR has been created?

@mpdevul
Author

mpdevul commented Oct 11, 2021

I see forbidden errors; looks like we may need some more RBAC updates?


time="2021-10-11T13:39:41Z" level=info msg="Listening on :9710 for metrics exposure" src="asm_amd64.s:1337"

time="2021-10-11T13:39:41Z" level=warning msg="controller name not provided, it should have a name, fallback name to: *redisfailover.RedisFailoverHandler" controller=redisfailover operator=redis-operator src="generic.go:64"

time="2021-10-11T13:39:41Z" level=info msg="starting operator" operator=redis-operator src="main.go:87"

time="2021-10-11T13:39:41Z" level=info msg="operator initialized" operator=redis-operator src="operator.go:103"

time="2021-10-11T13:39:41Z" level=info msg="starting controller" controller=redisfailover operator=redis-operator src="generic.go:178"

time="2021-10-11T13:39:41Z" level=warning msg="error processing cluster-sample-ns/harborcluster-sample-redis job (requeued): services \"rfs-harborcluster-sample-redis\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>" controller=redisfailover operator=redis-operator src="generic.go:224"

time="2021-10-11T13:39:42Z" level=warning msg="error processing cluster-sample-ns/harborcluster-sample-redis job (requeued): services \"rfs-harborcluster-sample-redis\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>" controller=redisfailover operator=redis-operator src="generic.go:224"

time="2021-10-11T13:39:42Z" level=warning msg="error processing cluster-sample-ns/harborcluster-sample-redis job (requeued): services \"rfs-harborcluster-sample-redis\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>" controller=redisfailover operator=redis-operator src="generic.go:224"

time="2021-10-11T13:39:42Z" level=error msg="Error processing cluster-sample-ns/harborcluster-sample-redis: services \"rfs-harborcluster-sample-redis\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>" controller=redisfailover operator=redis-operator src="generic.go:224"

time="2021-10-11T13:40:11Z" level=warning msg="error processing cluster-sample-ns/harborcluster-sample-redis job (requeued): services \"rfs-harborcluster-sample-redis\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>" controller=redisfailover operator=redis-operator src="generic.go:224"

time="2021-10-11T13:40:11Z" level=warning msg="error processing cluster-sample-ns/harborcluster-sample-redis job (requeued): services \"rfs-harborcluster-sample-redis\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>" controller=redisfailover operator=redis-operator src="generic.go:224"

time="2021-10-11T13:40:11Z" level=warning msg="error processing cluster-sample-ns/harborcluster-sample-redis job (requeued): services \"rfs-harborcluster-sample-redis\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>" controller=redisfailover operator=redis-operator src="generic.go:224"

time="2021-10-11T13:40:11Z" level=error msg="Error processing cluster-sample-ns/harborcluster-sample-redis: services \"rfs-harborcluster-sample-redis\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>" controller=redisfailover operator=redis-operator src="generic.go:224"

@bitsf added the kind/bug label on Oct 12, 2021
@chlins
Member

chlins commented Oct 13, 2021

Upstream redis operator related issue: spotahome/redis-operator#98

@chlins
Member

chlins commented Oct 13, 2021

@mpdevul Hi, can you share your redis clusterrole yaml? (kubectl get clusterrole redisoperator -o yaml)

I think you can try to edit the clusterrole manually. ref: spotahome/redis-operator#161
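For anyone hitting the same blockOwnerDeletion error: based on the linked upstream issue, OpenShift enforces ownerReferences permissions, so the operator's ClusterRole may need an extra rule along the lines of the sketch below (the exact subresource path is an assumption taken from that discussion, not verified here):

```yaml
# Sketch of an additional rule for the redisoperator ClusterRole
# (assumed from spotahome/redis-operator#161): lets the operator
# update finalizers on the owning RedisFailover, which OpenShift
# requires before blockOwnerDeletion can be set on child resources.
- apiGroups:
  - databases.spotahome.com
  resources:
  - redisfailovers/finalizers
  verbs:
  - update
```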

@mpdevul
Author

mpdevul commented Oct 13, 2021

@chlins, the current clusterrole is below. Thanks for the link; I will update it manually and try to deploy again as suggested.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    goharbor.io/deploy-engine: Kustomization
    goharbor.io/deploy-mode: cluster
    goharbor.io/operator-version: v1.1.1
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{"goharbor.io/deploy-engine":"Kustomization","goharbor.io/deploy-mode":"cluster","goharbor.io/operator-version":"v1.1.1"},"name":"redisoperator"},"rules":[{"apiGroups":["databases.spotahome.com"],"resources":["redisfailovers"],"verbs":["*"]},{"apiGroups":["apiextensions.k8s.io"],"resources":["customresourcedefinitions"],"verbs":["*"]},{"apiGroups":[""],"resources":["pods","services","endpoints","events","configmaps"],"verbs":["*"]},{"apiGroups":[""],"resources":["secrets"],"verbs":["get"]},{"apiGroups":["apps"],"resources":["deployments","statefulsets"],"verbs":["*"]},{"apiGroups":["policy"],"resources":["poddisruptionbudgets"],"verbs":["*"]}]}
  creationTimestamp: "2021-10-06T10:47:06Z"
  managedFields:
  - apiVersion: rbac.authorization.k8s.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:goharbor.io/deploy-engine: {}
          f:goharbor.io/deploy-mode: {}
          f:goharbor.io/operator-version: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
      f:rules: {}
    manager: kubectl-client-side-apply
    operation: Update
    time: "2021-10-06T10:47:06Z"
  name: redisoperator
  resourceVersion: "12914869"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/redisoperator
  uid: 5399cd4b-507b-461f-92e8-10ac3b2b467a
rules:
- apiGroups:
  - databases.spotahome.com
  resources:
  - redisfailovers
  verbs:
  - '*'
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - pods
  - services
  - endpoints
  - events
  - configmaps
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - get
- apiGroups:
  - apps
  resources:
  - deployments
  - statefulsets
  verbs:
  - '*'
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - '*'

@mpdevul
Author

mpdevul commented Oct 13, 2021

@chlins

After updating the RBAC permissions I can see the Redis pod running, but the core pod still goes into CrashLoopBackOff.

kubectl get pods -n cluster-sample-ns
NAME                                                              READY   STATUS             RESTARTS   AGE
harborcluster-sample-harbor-harbor-chartmuseum-5799f5b8bb-wsrd4   1/1     Running            0          34h
harborcluster-sample-harbor-harbor-core-c4497774b-797hn           0/1     CrashLoopBackOff   5          10m
harborcluster-sample-harbor-harbor-core-ddd9f555d-kr95x           0/1     Running            448        34h
harborcluster-sample-harbor-harbor-notaryserver-6d6477d648cfqvq   1/1     Running            1          34h
harborcluster-sample-harbor-harbor-notarysigner-67789cb587lm74s   1/1     Running            1          34h
harborcluster-sample-harbor-harbor-portal-849769b95b-g9s55        1/1     Running            0          34h
harborcluster-sample-harbor-harbor-registry-5759d8d7b4-5mp7r      1/1     Running            0          34h
harborcluster-sample-harbor-harbor-registryctl-55db969444-cnzcf   1/1     Running            0          34h
harborcluster-sample-harbor-harbor-trivy-6856fc7c57-tbqbs         1/1     Running            0          4m42s
postgresql-cluster-sample-ns-harborcluster-sample-0               1/1     Running            0          27m
rfs-harborcluster-sample-redis-6b7f4c4756-xhgz7                   1/1     Running            0          45m

 kubectl logs harborcluster-sample-harbor-harbor-core-c4497774b-797hn  -n cluster-sample-ns
Appending internal tls trust CA to ca-bundle ...
find: '/etc/harbor/ssl': No such file or directory
Internal tls trust CA appending is Done.
Appending trust CA to ca-bundle ...
 /harbor_cust_cert/ca.crt Appended ...
CA appending is Done.
2021-10-13T07:09:16Z [INFO] [/controller/artifact/annotation/parser.go:71]: the annotation parser to parser artifact annotation version v1alpha1 registered
2021-10-13T07:09:16Z [INFO] [/controller/artifact/processor/processor.go:59]: the processor to process media type application/vnd.cncf.helm.config.v1+json registered
2021-10-13T07:09:16Z [INFO] [/controller/artifact/processor/processor.go:59]: the processor to process media type application/vnd.cnab.manifest.v1 registered
2021-10-13T07:09:16Z [INFO] [/controller/artifact/processor/processor.go:59]: the processor to process media type application/vnd.oci.image.index.v1+json registered
2021-10-13T07:09:16Z [INFO] [/controller/artifact/processor/processor.go:59]: the processor to process media type application/vnd.docker.distribution.manifest.list.v2+json registered
2021-10-13T07:09:16Z [INFO] [/controller/artifact/processor/processor.go:59]: the processor to process media type application/vnd.docker.distribution.manifest.v1+prettyjws registered
2021-10-13T07:09:16Z [INFO] [/controller/artifact/processor/processor.go:59]: the processor to process media type application/vnd.oci.image.config.v1+json registered
2021-10-13T07:09:16Z [INFO] [/controller/artifact/processor/processor.go:59]: the processor to process media type application/vnd.docker.container.image.v1+json registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/native/adapter.go:36]: the factory for adapter docker-registry registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/harbor/adaper.go:31]: the factory for adapter harbor registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/dockerhub/adapter.go:25]: Factory for adapter docker-hub registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/huawei/huawei_adapter.go:42]: the factory of Huawei adapter was registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/googlegcr/adapter.go:35]: the factory for adapter google-gcr registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/awsecr/adapter.go:43]: the factory for adapter aws-ecr registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/azurecr/adapter.go:15]: Factory for adapter azure-acr registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/aliacr/adapter.go:31]: the factory for adapter ali-acr registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/jfrog/adapter.go:47]: the factory of jfrog artifactory adapter was registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/quay/adapter.go:55]: the factory of Quay adapter was registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/helmhub/adapter.go:30]: the factory for adapter helm-hub registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/gitlab/adapter.go:17]: the factory for adapter gitlab registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/dtr/adapter.go:22]: the factory of dtr adapter was registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/artifacthub/adapter.go:30]: the factory for adapter artifact-hub registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/tencentcr/adapter.go:42]: the factory for adapter tencent-tcr registered
2021-10-13T07:09:16Z [INFO] [/pkg/reg/adapter/githubcr/adapter.go:29]: the factory for adapter github-ghcr registered
2021-10-13T07:09:16Z [INFO] [/core/controllers/base.go:155]: Config path: /etc/core/app.conf
2021-10-13T07:09:16Z [INFO] [/core/main.go:163]: initializing cache ...
2021-10-13T07:09:16Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 500ms : dial tcp 127.0.0.1:6379: connect: connection refused
2021-10-13T07:09:16Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 723.609052ms : dial tcp 127.0.0.1:6379: connect: connection refused
2021-10-13T07:09:17Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 1.455154283s : dial tcp 127.0.0.1:6379: connect: connection refused
2021-10-13T07:09:18Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 3.295431403s : dial tcp 127.0.0.1:6379: connect: connection refused
2021-10-13T07:09:22Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 6.038750412s : dial tcp 127.0.0.1:6379: connect: connection refused
2021-10-13T07:09:28Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 8.046064995s : dial tcp 127.0.0.1:6379: connect: connection refused
2021-10-13T07:09:36Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 10s : dial tcp 127.0.0.1:6379: connect: connection refused
2021-10-13T07:09:46Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 10s : dial tcp 127.0.0.1:6379: connect: connection refused
2021-10-13T07:09:56Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 10s : dial tcp 127.0.0.1:6379: connect: connection refused
2021-10-13T07:10:06Z [ERROR] [/lib/cache/cache.go:110]: failed to ping redis+sentinel://:xxxxx@rfs-harborcluster-sample-redis:26379/mymaster/0?idle_timeout_seconds=30, retry after 10s : dial tcp 127.0.0.1:6379: connect: connection refused
2021-10-13T07:10:16Z [FATAL] [/core/main.go:165]: failed to initialize cache: retry timeout: dial tcp 127.0.0.1:6379: connect: connection refused

All the service details are below.

kubectl get svc -n cluster-sample-ns
NAME                                                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)            AGE
harborcluster-sample-harbor-harbor-chartmuseum           ClusterIP   11.32.242.236   <none>        443/TCP            41h
harborcluster-sample-harbor-harbor-core                  ClusterIP   11.33.201.34    <none>        443/TCP,8001/TCP   41h
harborcluster-sample-harbor-harbor-notaryserver          ClusterIP   11.33.81.2      <none>        443/TCP            41h
harborcluster-sample-harbor-harbor-notarysigner          ClusterIP   11.33.59.23     <none>        7899/TCP           41h
harborcluster-sample-harbor-harbor-portal                ClusterIP   11.33.190.230   <none>        443/TCP            41h
harborcluster-sample-harbor-harbor-registry              ClusterIP   11.32.63.100    <none>        443/TCP,8001/TCP   41h
harborcluster-sample-harbor-harbor-registryctl           ClusterIP   11.33.60.163    <none>        443/TCP            41h
harborcluster-sample-harbor-harbor-trivy                 ClusterIP   11.33.142.249   <none>        443/TCP            41h
postgresql-cluster-sample-ns-harborcluster-sample        ClusterIP   11.33.82.70     <none>        5432/TCP           41h
postgresql-cluster-sample-ns-harborcluster-sample-repl   ClusterIP   11.33.134.49    <none>        5432/TCP           41h
rfs-harborcluster-sample-redis                           ClusterIP   11.32.151.32    <none>        26379/TCP          55m

kubectl describe svc rfs-harborcluster-sample-redis -n cluster-sample-ns
Name:              rfs-harborcluster-sample-redis
Namespace:         cluster-sample-ns
Labels:            app.kubernetes.io/component=sentinel
                   app.kubernetes.io/instance=cluster-sample-ns
                   app.kubernetes.io/managed-by=redis-operator
                   app.kubernetes.io/name=harborcluster-sample-redis
                   app.kubernetes.io/part-of=redis-failover
                   goharbor.io/harbor-cluster=harborcluster-sample
                   redisfailovers.databases.spotahome.com/name=harborcluster-sample-redis
Annotations:       <none>
Selector:          app.kubernetes.io/component=sentinel,app.kubernetes.io/name=harborcluster-sample-redis,app.kubernetes.io/part-of=redis-failover
Type:              ClusterIP
IP:                11.32.151.32
Port:              sentinel  26379/TCP
TargetPort:        26379/TCP
Endpoints:         11.1.3.44:26379
Session Affinity:  None
Events:            <none>

@chlins
Member

chlins commented Oct 13, 2021

@mpdevul Can you check the redis operator logs again? Under normal circumstances Redis should set up two pods, but yours has only one, so it cannot work properly; harbor-core cannot connect to Redis successfully and crashes.

@chlins
Member

chlins commented Oct 13, 2021

In my environment

rfr-harborcluster-sample-redis-0                                  1/1     Running   0          28h
rfs-harborcluster-sample-redis-6b7f4c4756-gl64f                   1/1     Running   0          28h

rfr-harborcluster-sample-redis-0 is managed by a statefulset and rfs-harborcluster-sample-redis-6b7f4c4756-gl64f is managed by a deployment.

@mpdevul
Author

mpdevul commented Oct 13, 2021

@chlins below are the logs I see

kubectl logs redisoperator-5f86d99d49-v7v5r -n harbor-operator-ns|more
time="2021-10-13T06:21:57Z" level=info msg="Listening on :9710 for metrics exposure" src="asm_amd64.s:1337"
time="2021-10-13T06:21:57Z" level=warning msg="controller name not provided, it should have a name, fallback name to: *redisfailover.RedisFailoverHandler" controller=redisfailover operator=redis-operator src="generic.go:64"
time="2021-10-13T06:21:57Z" level=info msg="starting operator" operator=redis-operator src="main.go:87"
time="2021-10-13T06:21:57Z" level=info msg="operator initialized" operator=redis-operator src="operator.go:103"
time="2021-10-13T06:21:57Z" level=info msg="starting controller" controller=redisfailover operator=redis-operator src="generic.go:178"
time="2021-10-13T06:21:57Z" level=info msg="configMap updated" configMap=rfs-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:57Z" level=info msg="configMap updated" configMap=rfr-s-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:57Z" level=info msg="configMap updated" configMap=rfr-readiness-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:57Z" level=info msg="configMap updated" configMap=rfr-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:57Z" level=info msg="podDisruptionBudget updated" namespace=cluster-sample-ns podDisruptionBudget=rfr-harborcluster-sample-redis service=k8s.podDisruptionBudget src="poddisruptionbudget.go:77"
time="2021-10-13T06:21:57Z" level=info msg="statefulSet updated" namespace=cluster-sample-ns service=k8s.statefulSet src="statefulset.go:101" statefulSet=rfr-harborcluster-sample-redis
time="2021-10-13T06:21:57Z" level=warning msg="error processing cluster-sample-ns/harborcluster-sample-redis job (requeued): admission webhook \"validate-pdbs.gs.com\" denied the request: Validation failed for pdb rfs-harborcluster-sample-redis because unable to validate minAvailability in pdb rfs-harborcluster-sample-redis because ... spec.MinAvailable cannot be 0 for pdb rfs-harborcluster-sample-redis in cluster-sample-ns namespace" controller=redisfailover operator=redis-operator src="generic.go:224"
time="2021-10-13T06:21:57Z" level=info msg="configMap updated" configMap=rfs-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:57Z" level=info msg="configMap updated" configMap=rfr-s-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:57Z" level=info msg="configMap updated" configMap=rfr-readiness-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:57Z" level=info msg="configMap updated" configMap=rfr-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:57Z" level=info msg="podDisruptionBudget updated" namespace=cluster-sample-ns podDisruptionBudget=rfr-harborcluster-sample-redis service=k8s.podDisruptionBudget src="poddisruptionbudget.go:77"
time="2021-10-13T06:21:57Z" level=info msg="statefulSet updated" namespace=cluster-sample-ns service=k8s.statefulSet src="statefulset.go:101" statefulSet=rfr-harborcluster-sample-redis
time="2021-10-13T06:21:59Z" level=warning msg="error processing cluster-sample-ns/harborcluster-sample-redis job (requeued): admission webhook \"validate-pdbs.gs.com\" denied the request: Validation failed for pdb rfs-harborcluster-sample-redis because unable to validate minAvailability in pdb rfs-harborcluster-sample-redis because ... spec.MinAvailable cannot be 0 for pdb rfs-harborcluster-sample-redis in cluster-sample-ns namespace" controller=redisfailover operator=redis-operator src="generic.go:224"
time="2021-10-13T06:21:59Z" level=info msg="configMap updated" configMap=rfs-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:59Z" level=info msg="configMap updated" configMap=rfr-s-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:59Z" level=info msg="configMap updated" configMap=rfr-readiness-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:59Z" level=info msg="configMap updated" configMap=rfr-harborcluster-sample-redis namespace=cluster-sample-ns service=k8s.configMap src="configmap.go:76"
time="2021-10-13T06:21:59Z" level=info msg="podDisruptionBudget updated" namespace=cluster-sample-ns podDisruptionBudget=rfr-harborcluster-sample-redis service=k8s.podDisruptionBudget src="poddisruptionbudget.go:77"
time="2021-10-13T06:21:59Z" level=info msg="statefulSet updated" namespace=cluster-sample-ns service=k8s.statefulSet src="statefulset.go:101" statefulSet=rfr-harborcluster-sample-redis
time="2021-10-13T06:22:01Z" level=warning msg="error processing cluster-sample-ns/harborcluster-sample-redis job (requeued): admission webhook \"validate-pdbs.gs.com\" denied the request: Validation failed for pdb rfs-harborcluster-sample-redis because unable to validate minAvailability in pdb rfs-harborcluster-sample-redis because ... spec.MinAvailable cannot be 0 for pdb rfs-harborcluster-sample-redis in cluster-sample-ns namespace" controller=redisfailover operator=redis-operator src="generic.go:224"

@chlins

Could the deployment manifest and its dependent objects be provided here? Since I do not see any redis operator log showing that it is trying to create the deployment, I will try to create it manually.

@chlins
Member

chlins commented Oct 13, 2021

@mpdevul Did your environment install any other validation webhooks? From the logs: admission webhook \"validate-pdbs.gs.com\" denied the request.

@mpdevul
Author

mpdevul commented Oct 13, 2021

@chlins, I see the below for the statefulset; I believe we need to add RBAC for PVCs as well. I will try that and get back.

Events:
  Type     Reason        Age                    From                    Message
  ----     ------        ----                   ----                    -------
  Warning  FailedCreate  17m (x44 over 162m)    statefulset-controller  create Pod rfr-harborcluster-sample-redis-0 in StatefulSet rfr-harborcluster-sample-redis failed error: failed to create PVC harborcluster-sample-rfr-harborcluster-sample-redis-0: persistentvolumeclaims "harborcluster-sample-rfr-harborcluster-sample-redis-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
  Warning  FailedCreate  7m24s (x46 over 162m)  statefulset-controller  create Claim harborcluster-sample-rfr-harborcluster-sample-redis-0 for Pod rfr-harborcluster-sample-redis-0 in StatefulSet rfr-harborcluster-sample-redis failed error: persistentvolumeclaims "harborcluster-sample-rfr-harborcluster-sample-redis-0" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>
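If the PVC RBAC route is attempted, one sketch (an assumption, mirroring the operator's existing core-group rules; not verified on OpenShift) would be to extend the redisoperator ClusterRole with persistentvolumeclaims:

```yaml
# Hypothetical extension of the redisoperator ClusterRole: add a rule
# for persistentvolumeclaims so the PVCs created for the rfr statefulset
# can carry owner references. Resource names and verbs are assumptions.
- apiGroups:
  - ""
  resources:
  - persistentvolumeclaims
  verbs:
  - '*'
```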

@mpdevul
Author

mpdevul commented Oct 14, 2021

@chlins, updating the RBAC didn't help, so I created the PVC manually. The pod controlled by the statefulset comes up, but the readiness probe fails and the core pods are still in CrashLoopBackOff status.

[devulm@d59323-005 k8s-cmd]$ kubectl get pods -n cluster-sample-ns
NAME                                                              READY   STATUS             RESTARTS   AGE
harborcluster-sample-harbor-harbor-chartmuseum-5799f5b8bb-wsrd4   1/1     Running            0          2d7h
harborcluster-sample-harbor-harbor-core-c4497774b-797hn           0/1     CrashLoopBackOff   284        21h
harborcluster-sample-harbor-harbor-core-ddd9f555d-kr95x           0/1     CrashLoopBackOff   727        2d7h
harborcluster-sample-harbor-harbor-notaryserver-6d6477d648cfqvq   1/1     Running            1          2d7h
harborcluster-sample-harbor-harbor-notarysigner-67789cb587lm74s   1/1     Running            1          2d7h
harborcluster-sample-harbor-harbor-portal-849769b95b-g9s55        1/1     Running            0          2d7h
harborcluster-sample-harbor-harbor-registry-5759d8d7b4-5mp7r      1/1     Running            0          2d7h
harborcluster-sample-harbor-harbor-registryctl-55db969444-cnzcf   1/1     Running            0          2d7h
harborcluster-sample-harbor-harbor-trivy-6856fc7c57-tbqbs         1/1     Running            0          21h
postgresql-cluster-sample-ns-harborcluster-sample-0               1/1     Running            0          20m
rfr-harborcluster-sample-redis-0                                  0/1     Running            0          17h
rfs-harborcluster-sample-redis-6b7f4c4756-xhgz7                   1/1     Running            0          22h


Logs of the Redis pod.

1:C 13 Oct 2021 11:19:02.625 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 13 Oct 2021 11:19:02.625 # Redis version=5.0.10, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 13 Oct 2021 11:19:02.625 # Configuration loaded
1:S 13 Oct 2021 11:19:02.625 * Running mode=standalone, port=6379.
1:S 13 Oct 2021 11:19:02.625 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:S 13 Oct 2021 11:19:02.625 # Server initialized
1:S 13 Oct 2021 11:19:02.625 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:S 13 Oct 2021 11:19:02.626 * Ready to accept connections
1:S 13 Oct 2021 11:19:02.627 * Connecting to MASTER 127.0.0.1:6379
1:S 13 Oct 2021 11:19:02.627 * MASTER <-> REPLICA sync started
1:S 13 Oct 2021 11:19:02.627 * Non blocking connect for SYNC fired the event.
1:S 13 Oct 2021 11:19:02.627 * Master replied to PING, replication can continue...
1:S 13 Oct 2021 11:19:02.627 * Partial resynchronization not possible (no cached master)
1:S 13 Oct 2021 11:19:02.627 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
1:S 13 Oct 2021 11:19:03.629 * Connecting to MASTER 127.0.0.1:6379
1:S 13 Oct 2021 11:19:03.629 * MASTER <-> REPLICA sync started
1:S 13 Oct 2021 11:19:03.629 * Non blocking connect for SYNC fired the event.
1:S 13 Oct 2021 11:19:03.629 * Master replied to PING, replication can continue...
1:S 13 Oct 2021 11:19:03.629 * Partial resynchronization not possible (no cached master)
1:S 13 Oct 2021 11:19:03.629 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
1:S 13 Oct 2021 11:19:04.631 * Connecting to MASTER 127.0.0.1:6379
1:S 13 Oct 2021 11:19:04.631 * MASTER <-> REPLICA sync started
1:S 13 Oct 2021 11:19:04.631 * Non blocking connect for SYNC fired the event.
1:S 13 Oct 2021 11:19:04.631 * Master replied to PING, replication can continue...
1:S 13 Oct 2021 11:19:04.631 * Partial resynchronization not possible (no cached master)
1:S 13 Oct 2021 11:19:04.631 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
1:S 13 Oct 2021 11:19:05.633 * Connecting to MASTER 127.0.0.1:6379
1:S 13 Oct 2021 11:19:05.633 * MASTER <-> REPLICA sync started

@chlins
Member

chlins commented Oct 14, 2021

@chlins, updating the RBAC didn't help, so I created the PVC manually. The pod controlled by the statefulset now comes up, but its readiness probe fails, and the core pods are still in CrashLoopBackOff status.

[devulm@d59323-005 k8s-cmd]$ kubectl get pods -n cluster-sample-ns
NAME                                                              READY   STATUS             RESTARTS   AGE
harborcluster-sample-harbor-harbor-chartmuseum-5799f5b8bb-wsrd4   1/1     Running            0          2d7h
harborcluster-sample-harbor-harbor-core-c4497774b-797hn           0/1     CrashLoopBackOff   284        21h
harborcluster-sample-harbor-harbor-core-ddd9f555d-kr95x           0/1     CrashLoopBackOff   727        2d7h
harborcluster-sample-harbor-harbor-notaryserver-6d6477d648cfqvq   1/1     Running            1          2d7h
harborcluster-sample-harbor-harbor-notarysigner-67789cb587lm74s   1/1     Running            1          2d7h
harborcluster-sample-harbor-harbor-portal-849769b95b-g9s55        1/1     Running            0          2d7h
harborcluster-sample-harbor-harbor-registry-5759d8d7b4-5mp7r      1/1     Running            0          2d7h
harborcluster-sample-harbor-harbor-registryctl-55db969444-cnzcf   1/1     Running            0          2d7h
harborcluster-sample-harbor-harbor-trivy-6856fc7c57-tbqbs         1/1     Running            0          21h
postgresql-cluster-sample-ns-harborcluster-sample-0               1/1     Running            0          20m
rfr-harborcluster-sample-redis-0                                  0/1     Running            0          17h
rfs-harborcluster-sample-redis-6b7f4c4756-xhgz7                   1/1     Running            0          22h

Logs of the Redis pod.

1:C 13 Oct 2021 11:19:02.625 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 13 Oct 2021 11:19:02.625 # Redis version=5.0.10, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 13 Oct 2021 11:19:02.625 # Configuration loaded
1:S 13 Oct 2021 11:19:02.625 * Running mode=standalone, port=6379.
1:S 13 Oct 2021 11:19:02.625 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:S 13 Oct 2021 11:19:02.625 # Server initialized
1:S 13 Oct 2021 11:19:02.625 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:S 13 Oct 2021 11:19:02.626 * Ready to accept connections
1:S 13 Oct 2021 11:19:02.627 * Connecting to MASTER 127.0.0.1:6379
1:S 13 Oct 2021 11:19:02.627 * MASTER <-> REPLICA sync started
1:S 13 Oct 2021 11:19:02.627 * Non blocking connect for SYNC fired the event.
1:S 13 Oct 2021 11:19:02.627 * Master replied to PING, replication can continue...
1:S 13 Oct 2021 11:19:02.627 * Partial resynchronization not possible (no cached master)
1:S 13 Oct 2021 11:19:02.627 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
1:S 13 Oct 2021 11:19:03.629 * Connecting to MASTER 127.0.0.1:6379
1:S 13 Oct 2021 11:19:03.629 * MASTER <-> REPLICA sync started
1:S 13 Oct 2021 11:19:03.629 * Non blocking connect for SYNC fired the event.
1:S 13 Oct 2021 11:19:03.629 * Master replied to PING, replication can continue...
1:S 13 Oct 2021 11:19:03.629 * Partial resynchronization not possible (no cached master)
1:S 13 Oct 2021 11:19:03.629 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
1:S 13 Oct 2021 11:19:04.631 * Connecting to MASTER 127.0.0.1:6379
1:S 13 Oct 2021 11:19:04.631 * MASTER <-> REPLICA sync started
1:S 13 Oct 2021 11:19:04.631 * Non blocking connect for SYNC fired the event.
1:S 13 Oct 2021 11:19:04.631 * Master replied to PING, replication can continue...
1:S 13 Oct 2021 11:19:04.631 * Partial resynchronization not possible (no cached master)
1:S 13 Oct 2021 11:19:04.631 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
1:S 13 Oct 2021 11:19:05.633 * Connecting to MASTER 127.0.0.1:6379
1:S 13 Oct 2021 11:19:05.633 * MASTER <-> REPLICA sync started

@mpdevul It looks like redis is still not working properly. You could try deleting the two redis pods and waiting for k8s to recreate them (but I'm not sure whether that will help).

@mpdevul
Author

mpdevul commented Oct 14, 2021

@chlins

Tried that; it did not help. Could you please point me to an article on using redis in external mode?

NAME                                                              READY   STATUS             RESTARTS   AGE
harborcluster-sample-harbor-harbor-chartmuseum-5799f5b8bb-wsrd4   1/1     Running            0          2d8h
harborcluster-sample-harbor-harbor-core-c4497774b-797hn           0/1     CrashLoopBackOff   292        22h
harborcluster-sample-harbor-harbor-core-ddd9f555d-kr95x           0/1     CrashLoopBackOff   734        2d8h
harborcluster-sample-harbor-harbor-notaryserver-6d6477d648cfqvq   1/1     Running            1          2d8h
harborcluster-sample-harbor-harbor-notarysigner-67789cb587lm74s   1/1     Running            1          2d8h
harborcluster-sample-harbor-harbor-portal-849769b95b-g9s55        1/1     Running            0          2d8h
harborcluster-sample-harbor-harbor-registry-5759d8d7b4-5mp7r      1/1     Running            0          2d8h
harborcluster-sample-harbor-harbor-registryctl-55db969444-cnzcf   1/1     Running            0          2d8h
harborcluster-sample-harbor-harbor-trivy-6856fc7c57-tbqbs         1/1     Running            0          22h
postgresql-cluster-sample-ns-harborcluster-sample-0               1/1     Running            0          24m
rfr-harborcluster-sample-redis-0                                  0/1     Running            0          4m5s
rfs-harborcluster-sample-redis-6b7f4c4756-xzrc8                   1/1     Running            0          4m16s

@chlins
Member

chlins commented Oct 14, 2021

@chlins

Tried that; it did not help. Could you please point me to an article on using redis in external mode?

NAME                                                              READY   STATUS             RESTARTS   AGE
harborcluster-sample-harbor-harbor-chartmuseum-5799f5b8bb-wsrd4   1/1     Running            0          2d8h
harborcluster-sample-harbor-harbor-core-c4497774b-797hn           0/1     CrashLoopBackOff   292        22h
harborcluster-sample-harbor-harbor-core-ddd9f555d-kr95x           0/1     CrashLoopBackOff   734        2d8h
harborcluster-sample-harbor-harbor-notaryserver-6d6477d648cfqvq   1/1     Running            1          2d8h
harborcluster-sample-harbor-harbor-notarysigner-67789cb587lm74s   1/1     Running            1          2d8h
harborcluster-sample-harbor-harbor-portal-849769b95b-g9s55        1/1     Running            0          2d8h
harborcluster-sample-harbor-harbor-registry-5759d8d7b4-5mp7r      1/1     Running            0          2d8h
harborcluster-sample-harbor-harbor-registryctl-55db969444-cnzcf   1/1     Running            0          2d8h
harborcluster-sample-harbor-harbor-trivy-6856fc7c57-tbqbs         1/1     Running            0          22h
postgresql-cluster-sample-ns-harborcluster-sample-0               1/1     Running            0          24m
rfr-harborcluster-sample-redis-0                                  0/1     Running            0          4m5s
rfs-harborcluster-sample-redis-6b7f4c4756-xzrc8                   1/1     Running            0          4m16s

@mpdevul I will try to reproduce your issue in my environment to find out the root cause. If you want to use an out-of-cluster redis service, we currently have no article guiding the configuration, but you can see the fields in the CRD spec (see the code reference) and update your harborcluster CR to configure external redis.

A simple example:

spec:
  cache:
    kind: Redis
    spec:
      redis:
        host: 127.0.0.1
        port: 6379

Tip: if updating the harborcluster CR directly does not work, you may need to delete harbor first and then re-apply the harborcluster CR.
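For completeness, the snippet above can be wrapped into a full HarborCluster resource. This is a hedged sketch: the apiVersion may differ per operator release (check your installed CRDs with `kubectl api-resources`), and the host value is a hypothetical external endpoint:

```yaml
apiVersion: goharbor.io/v1beta1   # assumption: verify against your installed CRD version
kind: HarborCluster
metadata:
  name: harborcluster-sample
  namespace: cluster-sample-ns
spec:
  cache:
    kind: Redis
    spec:
      redis:
        host: my-external-redis.example.com   # hypothetical external redis endpoint
        port: 6379
```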

4 participants