
KubeClarity installation using Helm fails | grype-server pod CrashLoopBackOff status #377

Closed
sprathod369 opened this issue Mar 24, 2023 · 4 comments

Comments

@sprathod369

What happened:

KubeClarity installation using Helm fails after following the steps outlined at https://github.com/openclarity/kubeclarity#install-using-helm

What you expected to happen:

All pods are expected to be up and running, and the KubeClarity UI should be accessible after port forwarding.

How to reproduce it (as minimally and precisely as possible):

  1. Add the KubeClarity Helm chart repository
# helm repo add kubeclarity https://openclarity.github.io/kubeclarity
"kubeclarity" has been added to your repositories
  2. Update the Helm repository to ensure the latest chart version
helm repo update kubeclarity
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "kubeclarity" chart repository
Update Complete. ⎈Happy Helming!⎈
  3. Save the KubeClarity default chart values
    No changes were made to the default values.yaml.
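For reference, assuming a standard Helm 3 setup and that values.yaml is simply the exported defaults, the values can be captured with:
# helm show values kubeclarity/kubeclarity > values.yaml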

  4. Deploy KubeClarity with Helm

# helm install --values values.yaml --create-namespace kubeclarity kubeclarity/kubeclarity -n kubeclarity
NAME: kubeclarity
LAST DEPLOYED: Fri Mar 24 14:20:03 2023
NAMESPACE: kubeclarity
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Thank you for installing KUBECLARITY.

Your release is named kubeclarity.
Here is how to access the KubeClarity UI:
    $ kubectl port-forward --namespace kubeclarity svc/kubeclarity-kubeclarity 9999:8080
    $ Open KubeClarity UI in the browser: http://localhost:9999/
  5. Confirm that all pods are running
# kubectl get pods --namespace kubeclarity
NAME                                                    READY   STATUS     RESTARTS      AGE
kubeclarity-kubeclarity-76b95bb8c-k8l6q                 0/1     Init:2/3   0             29s
kubeclarity-kubeclarity-grype-server-7d5bb4fbfc-gr4mv   0/1     Error      2 (22s ago)   29s
kubeclarity-kubeclarity-postgresql-0                    1/1     Running    0             29s
kubeclarity-kubeclarity-sbom-db-748c45594b-mzjv4        1/1     Running    0             29s

Are there any error messages in KubeClarity logs?

(e.g. kubectl logs -n kubeclarity --selector=app=kubeclarity)

The readiness probe seems to be failing for the grype-server pod. Please see the pod description below:

# kubectl describe pod/kubeclarity-kubeclarity-grype-server-7d5bb4fbfc-gr4mv --namespace kubeclarity
Name:             kubeclarity-kubeclarity-grype-server-7d5bb4fbfc-gr4mv
Namespace:        kubeclarity
Priority:         0
Service Account:  kubeclarity-kubeclarity-grype-server
Node:             kind-worker2/172.20.0.2
Start Time:       Fri, 24 Mar 2023 14:20:04 +0530
Labels:           app=kubeclarity-kubeclarity-grype-server
                  pod-template-hash=7d5bb4fbfc
Annotations:      <none>
Status:           Running
IP:               10.244.1.100
IPs:
  IP:           10.244.1.100
Controlled By:  ReplicaSet/kubeclarity-kubeclarity-grype-server-7d5bb4fbfc
Containers:
  grype-server:
    Container ID:  containerd://01823316ed798f55c2de2b793b5e3ee40f166bc9447961c924ae873ac1a7dca4
    Image:         gcr.io/eticloud/k8sec/grype-server:v0.1.5
    Image ID:      gcr.io/eticloud/k8sec/grype-server@sha256:0791bcb8688a437ba71ae4d9d1bf350be88fff6124c661d40d1317124e860fae
    Port:          <none>
    Host Port:     <none>
    Args:
      run
      --log-level
      warning
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Fri, 24 Mar 2023 14:20:49 +0530
      Finished:     Fri, 24 Mar 2023 14:20:49 +0530
    Ready:          False
    Restart Count:  3
    Limits:
      cpu:     1
      memory:  1G
    Requests:
      cpu:      200m
      memory:   200Mi
    Liveness:   http-get http://:8080/healthz/live delay=10s timeout=10s period=30s #success=1 #failure=5
    Readiness:  http-get http://:8080/healthz/ready delay=0s timeout=10s period=30s #success=1 #failure=5
    Environment:
      DB_ROOT_DIR:  /tmp/
    Mounts:
      /tmp from tmp-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qcp6m (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  tmp-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-qcp6m:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  83s                default-scheduler  Successfully assigned kubeclarity/kubeclarity-kubeclarity-grype-server-7d5bb4fbfc-gr4mv to kind-worker2
  Normal   Pulled     81s                kubelet            Successfully pulled image "gcr.io/eticloud/k8sec/grype-server:v0.1.5" in 925.975547ms
  Normal   Pulled     77s                kubelet            Successfully pulled image "gcr.io/eticloud/k8sec/grype-server:v0.1.5" in 867.647423ms
  Warning  Unhealthy  76s (x2 over 79s)  kubelet            Readiness probe failed: Get "http://10.244.1.100:8080/healthz/ready": dial tcp 10.244.1.100:8080: connect: connection refused
  Normal   Pulled     62s                kubelet            Successfully pulled image "gcr.io/eticloud/k8sec/grype-server:v0.1.5" in 891.527807ms
  Normal   Pulling    40s (x4 over 82s)  kubelet            Pulling image "gcr.io/eticloud/k8sec/grype-server:v0.1.5"
  Normal   Created    39s (x4 over 81s)  kubelet            Created container grype-server
  Normal   Pulled     39s                kubelet            Successfully pulled image "gcr.io/eticloud/k8sec/grype-server:v0.1.5" in 851.217379ms
  Normal   Started    38s (x4 over 80s)  kubelet            Started container grype-server
  Warning  BackOff    25s (x7 over 75s)  kubelet            Back-off restarting failed container


Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version --short):
    Client Version: v1.26.1, Kustomize Version: v4.5.7, Server Version: v1.25.3

  • Helm version (use helm version):
    v3.11.1

  • KubeClarity version (use kubectl -n kubeclarity exec deploy/kubeclarity -- ./backend version)

  • KubeClarity Helm Chart version (use helm -n kubeclarity list)
    NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
    kubeclarity kubeclarity 1 2023-03-24 14:20:03.624246496 +0530 IST deployed kubeclarity-v2.14.0 v2.14.0

  • Cloud provider or hardware configuration:
    On-prem, Ubuntu 20.x

  • Others:

@FrimIdan
Member

Hi @sprathod369, thanks for reporting the issue.
I see that the pod is in CrashLoopBackOff. Can you please share the logs from before the crash? You can use logs -p to get them:

kubectl logs -p pod/kubeclarity-kubeclarity-grype-server-7d5bb4fbfc-gr4mv --namespace kubeclarity

@sprathod369
Author

Thanks @FrimIdan, here are the logs:
kubectl logs -p pod/kubeclarity-kubeclarity-grype-server-7d5bb4fbfc-gr4mv --namespace kubeclarity
time="2023-04-05T05:07:42Z" level=fatal msg="Failed to start scanner: failed to load DB: unable to check for vulnerability database update: unable to download listing: Get "https://toolbox-data.anchore.io/grype/databases/listing.json\": x509: certificate signed by unknown authority" func=main.run file="/build/grype-server/cmd/grype-server/main.go:53"

While using Grype standalone in the past, I observed a similar intermittent issue with the database update: my pipeline sometimes works, but the same pipeline fails repeatedly because of the database download issue. There are already some open issues regarding vulnerability database downloads and updates in the Grype GitHub repository.

I'm able to curl https://toolbox-data.anchore.io/grype/databases/listing.json from my Ubuntu VM, so I doubt this is being blocked by a firewall on my end.
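For anyone hitting the same x509 error, a quick way to see which CA actually signs the certificate presented to the host (and whether an intercepting proxy is re-signing it), assuming curl and openssl are available on the VM:
# curl -sI https://toolbox-data.anchore.io/grype/databases/listing.json
# openssl s_client -connect toolbox-data.anchore.io:443 -servername toolbox-data.anchore.io </dev/null 2>/dev/null | openssl x509 -noout -issuer -subject -dates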

@sprathod369
Author

Unless I'm missing something, I've also commented on issue #310 in the Grype community, as even the latest KubeClarity version (v2.18.1) fails to set up using the default, as-is values.yaml.

Below are my K8s logs.

root@xxxx:~/KubeClarity# kubectl logs pod/kubeclarity-kubeclarity-grype-server-7d969b865-9lxks --namespace kubeclarity
time="2023-05-25T11:12:48Z" level=fatal msg="Failed to start scanner: failed to load DB: unable to check for vulnerability database update: unable to download listing: Get \"https://toolbox-data.anchore.io/grype/databases/listing.json\": tls: failed to verify certificate: x509: certificate signed by unknown authority" func=main.run file="/build/grype-server/cmd/grype-server/main.go:53"

How do I check which Grype image version is in use by KubeClarity v2.18.1?

@FrimIdan
Member

FrimIdan commented May 28, 2023

How do I check which Grype image version is in use by KubeClarity v2.18.1?

You can see it in the go.mod of the grype-server repo; the current Grype version is v0.61.0.
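To check which grype-server image your cluster is actually running, something like the following should work (deployment name taken from the pod output earlier in this issue):
# kubectl -n kubeclarity get deployment kubeclarity-kubeclarity-grype-server -o jsonpath='{.spec.template.spec.containers[0].image}'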

BTW, I've tested the latest version and it works fine in my environment, so it might be something in your outbound network configuration. Are you running behind a proxy or something similar?
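If it helps narrow things down, you could also try the same download from inside the cluster rather than from the host, e.g. with a throwaway pod (the curlimages/curl image and the pod name here are just examples):
# kubectl run tls-check -n kubeclarity --rm -it --restart=Never --image=curlimages/curl --command -- curl -vI https://toolbox-data.anchore.io/grype/databases/listing.json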

@FrimIdan closed this as not planned (won't fix, can't repro, duplicate, stale) on Jul 11, 2023