Update readme k8s and helm chart

vectornguyen76 · Nov 27, 2023 · 5d4ed64
1 parent 109db23
commit 5d4ed64

Showing 2 changed files with 102 additions and 48 deletions.
diff --git a/kubernetes/README.md b/kubernetes/README.md
@@ -19,6 +19,7 @@
    - [Deploy Backend](#deploy-backend)
    - [Deploy Frontend](#deploy-frontend)
    - [Deploy Qdrant](#deploy-qdrant)
+   - [Deploy Triton Inference Server](#deploy-triton-inference-server)
    - [Deploy Image Search](#deploy-image-search)
    - [Deploy Text Search](#deploy-text-search)
 5. [References](#references)
@@ -63,13 +64,29 @@ Guidelines for setting up a Kubernetes environment suitable for production.
 ### Create and Manage Cluster and NodeGroup
 
 - **Creating a Cluster and Node Group**
+
   ```
   eksctl create cluster -f cluster-config-eksctl.yaml
   ```
+
+   <p align="center">
+   <img src="./assets/create-cluster.png" alt="Creating a Cluster and Node Group" />
+   <br>
+   <em>Fig: Creating a Cluster and Node Group</em>
+   </p>
+
 - **Deleting a Cluster and Node Group**
+
   ```
   eksctl delete cluster -f cluster-config-eksctl.yaml --disable-nodegroup-eviction
   ```
+
+   <p align="center">
+   <img src="./assets/delete-cluster.png" alt="Deleting a Cluster and Node Group" />
+   <br>
+   <em>Fig: Deleting a Cluster and Node Group</em>
+   </p>
+
 - **Creating only Node Group**
   ```
   eksctl create nodegroup -f cluster-config-eksctl.yaml
@@ -212,81 +229,110 @@ Instructions to install the AWS EBS CSI driver in the production environment.
    kubectl delete pvc -l app.kubernetes.io/instance=qdrant-db
    ```
 
-### Deploy Image Search
+### Deploy Triton Inference Server
 
-1. **Install Image Search Service**
+1.  **Model Repository**
 
-   ```
-   kubectl apply -f image-search-deployment.yaml,image-search-service.yaml
-   ```
+    - Create an S3 bucket
+      ```
+      aws s3api create-bucket --bucket qai-triton-repository --region us-east-1
+      ```
+    - Copy the local model repository to S3
+      ```
+      aws s3 cp ./../image-search-engine/model_repository s3://qai-triton-repository/model_repository --recursive
+      ```
+    - To load the model from AWS S3, encode the following AWS credentials in base64 and add them to values.yaml
+
+      ```
+      echo -n 'REGION' | base64
+      ```
+
+      ```
+      echo -n 'SECRET_KEY_ID' | base64
+      ```
+
+      ```
+      echo -n 'SECRET_ACCESS_KEY' | base64
+      ```
+
+    - Update the model repository path in values.yaml
+      ```
+      modelRepositoryPath: s3://qai-triton-repository/model_repository
+      ```
+
+2. **Deploy Prometheus and Grafana**
+
+   The inference server metrics are collected by Prometheus and viewable in Grafana. The inference server Helm chart assumes that Prometheus
+   and Grafana are available, so this step must be followed even if you don't want to use Grafana.
 
-2. **Uninstall Image Search Service**
    ```
-   kubectl delete -f image-search-deployment.yaml,image-search-service.yaml
+   helm install search-engine-metrics --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false prometheus-community/kube-prometheus-stack
    ```
 
-### Deploy Text Search
-
-1. **Install Text Search Service**
+   Then port-forward to the Prometheus and Grafana services so you can access them from
+   your local browser.
 
    ```
-   kubectl apply -f text-search-deployment.yaml,text-search-service.yaml
+   kubectl port-forward service/search-engine-metrics-grafana 8080:80
+   kubectl port-forward service/search-engine-metrics-kube-prometheus 9090:9090
    ```
 
-2. **Uninstall Text Search Service**
+3. **Deploy the Inference Server**
+   Deploy the inference server using the default configuration with the
+   following commands.
+
    ```
-   kubectl delete -f text-search-deployment.yaml,text-search-service.yaml
+   cd helm-charts/triton-inference-server
+
+   helm install search-engine-serving .
    ```
 
-### Deploy Triton Inference Server
+4. **Clean up**
 
-1.  Model Repository
+   Once you've finished using the inference server you should use helm to
+   delete the deployment.
 
-- Create s3
-  ```
-  aws s3api create-bucket --bucket qai-triton-repository --region us-east-1
-  ```
-- Copy model repository from local to s3
-  ```
-  aws s3 cp ./../image-search-engine/model_repository s3://qai-triton-repository/model_repository --recursive
-  ```
-  To load the model from the AWS S3, you need to convert the following AWS credentials in the base64 format and add it to the values.yaml
+   ```
+   helm uninstall search-engine-metrics
+   helm uninstall search-engine-serving
+   ```
 
-```
-echo -n 'REGION' | base64
-```
+   For the Prometheus and Grafana services, you should [explicitly delete
+   CRDs](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#uninstall-helm-chart):
 
-```
-echo -n 'SECRECT_KEY_ID' | base64
-```
+   ```
+   kubectl delete crd alertmanagerconfigs.monitoring.coreos.com alertmanagers.monitoring.coreos.com podmonitors.monitoring.coreos.com probes.monitoring.coreos.com prometheuses.monitoring.coreos.com prometheusrules.monitoring.coreos.com servicemonitors.monitoring.coreos.com thanosrulers.monitoring.coreos.com
+   ```
 
-```
-echo -n 'SECRET_ACCESS_KEY' | base64
-```
+   You may also want to delete the S3 bucket you created to hold the
+   model repository.
 
-## Prometheus Grafana
+   ```
+   aws s3 rb s3://qai-triton-repository --force
+   ```
 
-helm install search-engine-metrics --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=true prometheus-community/kube-prometheus-stack
+### Deploy Image Search
 
-kubectl port-forward service/search-engine-metrics-grafana 8080:80
-kubectl port-forward service/search-engine-metrics-kube-prometheus 9090:9090
+1. **Install Image Search Service**
 
-## Deploy the Inference Server
+   ```
+   kubectl apply -f image-search-deployment.yaml,image-search-service.yaml
+   ```
 
-```
-cd helm-charts/triton-inference-server
-helm install qai-triton-inference .
+2. **Uninstall Image Search Service**
+   ```
+   kubectl delete -f image-search-deployment.yaml,image-search-service.yaml
+   ```
 
-helm install search-engine-serving .
-```
+### Deploy Text Search
 
-1. **Install Triton Inference Server Service**
+1. **Install Text Search Service**
 
    ```
    kubectl apply -f text-search-deployment.yaml,text-search-service.yaml
    ```
 
-2. **Uninstall Triton Inference Server Service**
+2. **Uninstall Text Search Service**
    ```
    kubectl delete -f text-search-deployment.yaml,text-search-service.yaml
    ```
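
A note on step 1 (Model Repository) of the Triton section above: the three `echo -n ... | base64` commands could be wrapped in one helper so every value is encoded the same way. The sketch below is only an illustration and is not part of this commit. The function name `encode_b64` and the fallback placeholder values are assumptions, and the key names that values.yaml expects are not shown in this diff.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): base64-encode a value without a
# trailing newline, equivalent to `echo -n 'VALUE' | base64`.
encode_b64() {
  # printf avoids the portability quirks of `echo -n`; tr strips the line
  # wraps that some base64 implementations insert for long inputs.
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Placeholder inputs: real values are taken from the environment when set.
region="${AWS_DEFAULT_REGION:-us-east-1}"
key_id="${AWS_ACCESS_KEY_ID:-EXAMPLE_KEY_ID}"
secret="${AWS_SECRET_ACCESS_KEY:-EXAMPLE_SECRET_KEY}"

# Print the encoded values so they can be copied into values.yaml.
printf 'region: %s\n' "$(encode_b64 "$region")"
printf 'key id: %s\n' "$(encode_b64 "$key_id")"
printf 'secret: %s\n' "$(encode_b64 "$secret")"
```

Base64 is an encoding, not encryption, so a values.yaml holding these strings should be kept out of version control.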

diff --git a/kubernetes/helm-charts/README.md b/kubernetes/helm-charts/README.md
@@ -1,5 +1,13 @@
 # Helm Charts
 
-https://github.com/nginxinc/kubernetes-ingress
-https://github.com/kubernetes/ingress-nginx
-https://kubernetes.github.io/ingress-nginx/deploy/#quick-start
+Helm is a package manager for Kubernetes, which simplifies the process of defining, installing, and upgrading even the most complex Kubernetes applications. Below are references to Helm Charts and documentation for various Kubernetes services and tools.
+
+## References
+
+- **Helm Installation**: [Helm Official Install Guide](https://helm.sh/docs/intro/install/)
+- **Ingress Nginx**: [GitHub Repository](https://github.com/kubernetes/ingress-nginx) | [Quick Start Guide](https://kubernetes.github.io/ingress-nginx/deploy/#quick-start)
+- **Triton Inference Server on AWS**: [Deployment Guide](https://github.com/triton-inference-server/server/tree/main/deploy/aws)
+- **Prometheus Community Helm Charts**: [GitHub Repository](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack)
+- **Qdrant Helm Charts**: [GitHub Repository](https://github.com/qdrant/qdrant-helm/tree/main/charts/qdrant)
+- **Elastic Cloud on Kubernetes**: [Documentation](https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-stack-helm-chart.html) | [Kubernetes Deployment](https://github.com/elastic/cloud-on-k8s/tree/main/deploy/eck-stack)
+- **Grafana Dashboards**: [Explore Dashboards](https://grafana.com/grafana/dashboards)