improve setup for a more production ready workload #2

Closed
wants to merge 46 commits into from

Conversation

anubisg1

  • split gui, poller and rest into separate pods
  • use env variables where needed (see the sketch after this list)
  • poller configured as a statefulset
  • use ingress to expose suzieq webui and rest server
  • updated configuration files to match current suzieq requirements and to show multiple namespaces
  • use official suzieq image
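As an illustration of the "env variables" item above, the usual pattern is to inject values from a ConfigMap or Secret into the container spec. A minimal sketch follows; only the sq-credentials secret name appears elsewhere in this thread, while the container name, variable names and key names are hypothetical:

containers:
  - name: poller                       # hypothetical container spec, for illustration only
    image: netenglabs/suzieq:0.17.0
    env:
      - name: SQ_USERNAME              # hypothetical variable name
        valueFrom:
          secretKeyRef:
            name: sq-credentials
            key: username              # hypothetical key
      - name: SQ_PASSWORD              # hypothetical variable name
        valueFrom:
          secretKeyRef:
            name: sq-credentials
            key: password              # hypothetical key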

@hyposcaler-bot
Contributor

Hey! Thanks for the PR, more than happy to merge it; I just want to take some time this weekend to try a test deployment first.

@bitcollector1

Thanks folks, this has been a perfect project to get me introduced to K8s; I really appreciate the updates to support the newest version of suzieq!

@bitcollector1

I did manage to find a couple of typos; one of them is in gui.yml:

 image: docker pull netenglabs/suzieq:0.17.0
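(Presumably the fix is just to drop the "docker pull" prefix, since the Kubernetes image field takes only the image reference:)

 image: netenglabs/suzieq:0.17.0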

@bitcollector1

Using manual storage currently

[root@kube4 suzieq-0.17.0]# kubectl get storageclass
No resources found
[root@kube4 suzieq-0.17.0]# kubectl get pv parquet
NAME      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS   REASON   AGE
parquet   1000Gi     RWX            Retain           Bound    default/parquet-pvc   manual                  2m19s
[root@kube4 suzieq-0.17.0]# kubectl get pvc parquet-pvc
NAME          STATUS   VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS   AGE
parquet-pvc   Bound    parquet   1000Gi     RWX            manual         2m43s

@bitcollector1

[root@kube4 suzieq-0.17.0]# kubectl describe pvc parquet-pvc
Name:          parquet-pvc
Namespace:     default
StorageClass:  manual
Status:        Bound
Volume:        parquet
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      1000Gi
Access Modes:  RWX
VolumeMode:    Filesystem
Used By:       sq-gui-deploy-674f644bc6-dlfmf
               sq-poller-0
               sq-rest-deploy-7f485f66b5-5zklc
Events:        <none>

@bitcollector1

[root@kube4 suzieq-0.17.0]# kubectl describe pv parquet
Name:            parquet
Labels:          type=local
Annotations:     pv.kubernetes.io/bound-by-controller: yes
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    manual
Status:          Bound
Claim:           default/parquet-pvc
Reclaim Policy:  Retain
Access Modes:    RWX
VolumeMode:      Filesystem
Capacity:        1000Gi
Node Affinity:   <none>
Message:         
Source:
    Type:          HostPath (bare host directory volume)
    Path:          /home/suzieq
    HostPathType:  
Events:            <none>
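For reference, a PV/PVC pair matching the describe output above could be written roughly like this (a sketch reconstructed from the output, not the actual manifests in use):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: parquet
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 1000Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /home/suzieq        # bare host directory, as shown in the Source above
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: parquet-pvc
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1000Gi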

@anubisg1
Author

anubisg1 commented Apr 8, 2022

OK, that is a weird setup you have there...
I fully tested the manifests now and I know they work.

In my example:

  1. I have two storage classes defined: one is an NFS shared storage class, and the other is a single-node disk class, "microk8s-hostpath"
  2. I know I have permission issues on the NFS side, so I use microk8s-hostpath in this example, and I do that by editing the PVC manifest
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: parquet-pvc
spec:
  storageClassName: microk8s-hostpath    # <<<<--- HERE
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

I then apply all manifests:

admin@aks-node-0:~/suzieq-on-k8s/manifest$ kubectl create namespace suzieq
namespace/suzieq created
admin@aks-node-0:~/suzieq-on-k8s/manifest$ kubectl apply -f . -n suzieq
configmap/sq-inventory created
configmap/sq-conf created
deployment.apps/sq-gui-deploy created
service/sq-gui-service created
ingress.networking.k8s.io/suzieq-ingress created
service/sq-poller-headless created
statefulset.apps/sq-poller created
persistentvolumeclaim/parquet-pvc created
deployment.apps/sq-rest-deploy created
service/sq-rest-service created
secret/sq-credentials created

Everything works:

admin@aks-node-0:~/suzieq-on-k8s/manifest$ kubectl get all -n suzieq
NAME                                  READY   STATUS    RESTARTS   AGE
pod/sq-gui-deploy-674f644bc6-8r2m5    1/1     Running   0          43s
pod/sq-poller-0                       1/1     Running   0          43s
pod/sq-rest-deploy-7f485f66b5-78cjm   1/1     Running   0          42s

NAME                         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
service/sq-gui-service       NodePort    10.152.183.23    <none>        80:30080/TCP   43s
service/sq-poller-headless   ClusterIP   None             <none>        <none>         43s
service/sq-rest-service      NodePort    10.152.183.140   <none>        80:30081/TCP   42s

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/sq-gui-deploy    1/1     1            1           43s
deployment.apps/sq-rest-deploy   1/1     1            1           43s

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/sq-gui-deploy-674f644bc6    1         1         1       43s
replicaset.apps/sq-rest-deploy-7f485f66b5   1         1         1       43s

NAME                         READY   AGE
statefulset.apps/sq-poller   1/1     43s


admin@aks-node-0:~/suzieq-on-k8s/manifest$ kubectl logs sq-poller-0  -n suzieq
2022-04-08 08:08:36,665 - suzieq.poller.controller - WARNING - log level WARNING


admin@aks-node-0:~/suzieq-on-k8s/manifest$ kubectl logs sq-gui-deploy-674f644bc6-8r2m5  -n suzieq

  Starting Suzieq GUI
2022-04-08 08:08:46.640 INFO    matplotlib.font_manager: generated new fontManager

  You can now view your Streamlit app in your browser.

  Network URL: http://10.1.7.231:8501
  External URL: http://xx.xx.xx.xx:8501


admin@aks-node-0:~/suzieq-on-k8s/manifest$ kubectl logs sq-rest-deploy-7f485f66b5-78cjm  -n suzieq
INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

@bitcollector1

Nice work! I appreciate all your efforts... yeah, I inherited this cluster#*ck from the guy we fired ;) I'm trying to come up to speed, but there is so much depth and I'm just getting started.

@anubisg1
Author

anubisg1 commented Apr 8, 2022

After fixing my NFS share, it works over there too without problems:

admin@aks-node-0:~/suzieq-on-k8s/manifest$ kubectl get pvc -n suzieq
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
parquet-pvc   Bound    pvc-ef00660c-a31e-4208-911c-a6dbcd30c802   5Gi        RWO            nfs-csi        61s


admin@aks-node-0:~/suzieq-on-k8s/manifest$ kubectl get pod -n suzieq
NAME                              READY   STATUS    RESTARTS   AGE
sq-gui-deploy-674f644bc6-dd8tb    1/1     Running   0          64s
sq-poller-0                       1/1     Running   0          63s
sq-rest-deploy-7f485f66b5-576cn   1/1     Running   0          63s


admin@aks-node-0:~/suzieq-on-k8s/manifest$ kubectl logs sq-poller-0 -n suzieq
2022-04-08 08:13:18,144 - suzieq.poller.controller - WARNING - log level WARNING

Therefore I would suggest:

  1. use the manifests as they are; you only need to change the configmap if there is something you need to change, but for testing you don't need to change anything at all
  2. ensure that your Kubernetes cluster is configured appropriately with regard to storage and storage classes (see the commands after this list)
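For point 2, a quick way to check is shown below; if you rely on dynamic provisioning you usually also want one class marked as the cluster default (a generic sketch; the class name microk8s-hostpath is just the one from the example above):

kubectl get storageclass
# mark a class as the default so PVCs without an explicit storageClassName still get provisioned
kubectl patch storageclass microk8s-hostpath \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'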

@bitcollector1

bitcollector1 commented Apr 8, 2022

Looks like the storage is the issue now; no more pods crashing:

<invalid>   Warning   FailedMount         pod/sq-gui-deploy-6dd9fd9f89-ck5jr     MountVolume.SetUp failed for volume "sq-conf" : configmap "sq-conf" not found
<invalid>   Warning   FailedMount         pod/sq-poller-0                        MountVolume.SetUp failed for volume "inventory" : configmap "sq-inventory" not found
<invalid>   Warning   FailedMount         pod/sq-poller-0                        MountVolume.SetUp failed for volume "sq-conf" : configmap "sq-conf" not found
<invalid>   Warning   FailedMount         pod/sq-rest-deploy-7f485f66b5-2fd8d    MountVolume.SetUp failed for volume "sq-conf" : configmap "sq-conf" not found
[root@kube4 manifest]# kubectl get all -n suzieq
NAME                                  READY   STATUS              RESTARTS   AGE
pod/sq-gui-deploy-6dd9fd9f89-ck5jr    0/1     ContainerCreating   0          4m3s
pod/sq-poller-0                       0/1     ContainerCreating   0          4m3s
pod/sq-rest-deploy-7f485f66b5-2fd8d   0/1     ContainerCreating   0          4m2s

NAME                         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
service/sq-gui-service       NodePort    10.98.85.55      <none>        80:30080/TCP   4m3s
service/sq-poller-headless   ClusterIP   None             <none>        <none>         4m3s
service/sq-rest-service      NodePort    10.105.205.132   <none>        80:30081/TCP   4m2s

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/sq-gui-deploy    0/1     1            0           4m3s
deployment.apps/sq-rest-deploy   0/1     1            0           4m3s

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/sq-gui-deploy-6dd9fd9f89    1         1         0       4m3s
replicaset.apps/sq-rest-deploy-7f485f66b5   1         1         0       4m3s

NAME                         READY   AGE
statefulset.apps/sq-poller   0/1     4m3s

@anubisg1
Author

anubisg1 commented Apr 8, 2022

Did you apply the configmaps?

kubectl get configmap -n suzieq

@bitcollector1

Thanks again for everything, pretty sure I can get this last issue figured out. I learned sooooo much today!

@bitcollector1

bitcollector1 commented Apr 8, 2022

Apparently I did not... I thought it would be applied with the one command if I chucked it into the manifest folder, but it appears it was not... so thanks again... I owe you a few beers ;)

[root@kube4 manifest]# kubectl get configmap
NAME               DATA   AGE
kube-root-ca.crt   1      106d
sq-conf            1      55s
sq-inventory       1      55s
[root@kube4 manifest]# kubectl get configmap -n suzieq
NAME               DATA   AGE
kube-root-ca.crt   1      37d

@anubisg1
Author

anubisg1 commented Apr 8, 2022

Thanks again for everything, pretty sure I can get this last issue figured out. I learned sooooo much today!

Don't worry...

If you just want to test that everything works, follow the "TLDR" section in the readme.
In the issue you just showed me, you didn't apply "configmap.yaml" (or maybe you did, but in a different namespace).

You can apply all manifests at once (and you should ;) ).. again, check the quick instructions below:

  • create a namespace
kubectl create namespace suzieq
  • provide your own certificates or generate self-signed ones
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes
kubectl -n suzieq create secret tls suzieq-tls-secret \
  --cert=cert.pem \
  --key=key.pem
  • deploy suzieq
kubectl -n suzieq apply -f manifest/.
kubectl -n suzieq get all

@anubisg1
Author

anubisg1 commented Apr 8, 2022

@hyposcaler-bot

All issues are solved; what's left is just due to bitcollector's knowledge gap.
Please review and merge :)

@bitcollector1

Okay, so the issue was permissions on the worker node where these files were created! It turns out there were no write permissions by default. Man, I'm ready for the weekend now! Fun times, LOL.

update to 0.17.1
@bitcollector1

I ran into an issue where the poller kept crashing and also seemed to take a while to get results, so I added a second worker and bumped up the resources, and things are looking much better now!

Here is the error I managed to catch, along with the poller container spec:

[root@kube4 suzieq-0.17.0]# kubectl get all -n suzieq
NAME                                  READY   STATUS      RESTARTS   AGE
pod/haproxy                           1/1     Running     0          18h
pod/sq-gui-deploy-b9dd5f6bd-vd9rl     1/1     Running     0          2m53s
pod/sq-poller-0                       0/1     OOMKilled   1          2m52s
pod/sq-rest-deploy-6dc7c876f4-bxcpk   1/1     Running     0          2m52s

containers:
        - name: poller
          image: netenglabs/suzieq:0.17.1
          command: ["sq-poller", "-I", "inventory.yml", "-w 2"]
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1000Mi"
              cpu: "1000m"

@bitcollector1

Last step is to figure out the HA proxy for my bare-metal setup; this has been kicking my butt pretty hard. It's so much easier to just use a cloud provider, it seems.

@anubisg1
Author

Last step is to figure out the HA proxy for my bare-metal setup; this has been kicking my butt pretty hard. It's so much easier to just use a cloud provider, it seems.

That's easy: just install MetalLB and configure your ingress Service with type LoadBalancer.
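As a rough sketch (assuming MetalLB 0.13+ with its CRD-based configuration; the address range below is a placeholder for your LAN):

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.1.240-192.168.1.250   # placeholder: pick free IPs on your LAN
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool

After that, switching the service you want exposed to type LoadBalancer is enough, e.g. kubectl -n suzieq patch svc sq-gui-service -p '{"spec": {"type": "LoadBalancer"}}' (or do the same for your ingress controller's service).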

@anubisg1
Author

Hello @hyposcaler-bot, is there any expectation to merge?

@hyposcaler-bot
Contributor

hyposcaler-bot commented May 24, 2022 via email

@anubisg1 anubisg1 closed this by deleting the head repository Mar 18, 2024