You need the following tools installed:
- [eksctl](https://eksctl.io/installation/)
- [aws cli](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
- [helm cli](https://helm.sh/docs/intro/install/#helm)
- [kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl)
Ensure you have quotas for:
- `${gpu_count}*4` vCPUs for On-Demand G and VT instances in the region of your choice
- At least one load balancer for each model you want to serve (not one per running server)
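
The GPU quota requirement above can be checked from the command line. The following is a minimal sketch, not part of the original setup scripts. It assumes that `L-DB2E81BA` is the Service Quotas code for "Running On-Demand G and VT instances" in your account (verify this in the Service Quotas console), and `gpu_count=2` and the function name `show_gpu_quota` are hypothetical values chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical example value: the number of GPUs you plan to run.
gpu_count=2

# The quota needed is gpu_count * 4, as stated in the requirements above.
required_vcpus=$((gpu_count * 4))
echo "Required G and VT quota: ${required_vcpus}"

# Print the current quota value for the given region.
# Assumption: L-DB2E81BA is the quota code for
# "Running On-Demand G and VT instances"; confirm it for your account.
show_gpu_quota() {
  local region="$1"
  aws service-quotas get-service-quota \
    --service-code ec2 \
    --quota-code L-DB2E81BA \
    --region "$region" \
    --query 'Quota.Value' \
    --output text
}

# Usage (requires configured AWS credentials):
#   show_gpu_quota us-east-1
```

If the printed quota is lower than the required value, request an increase before creating the cluster.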
Modify the following lines in create_cluster.sh
To get your account ID, run
`aws sts get-caller-identity`
Run `./create_cluster.sh` to generate the cluster
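
When editing `create_cluster.sh`, it may be convenient to capture only the account ID rather than reading it out of the full JSON response. A minimal sketch follows; the function name `get_account_id` and the variable `AWS_ACCOUNT_ID` are hypothetical and are not names used by `create_cluster.sh`.

```shell
#!/usr/bin/env bash
# Print only the 12-digit account ID of the current AWS credentials.
# --query selects the Account field; --output text strips the JSON quoting.
get_account_id() {
  aws sts get-caller-identity --query Account --output text
}

# Usage (requires configured AWS credentials):
#   AWS_ACCOUNT_ID="$(get_account_id)"
#   echo "Using account ${AWS_ACCOUNT_ID}"
```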
- Specify your embedding models
Modify embedding_models.yaml for the models that you want to use
- Install the helm chart
`helm upgrade -i embedding-release oci://registry-1.docker.io/trieve/embeddings-helm -f embedding_models.yaml`
- Get your model endpoints
`kubectl get ing`
- Clean up when you are done
`helm uninstall embedding-release`
./delete_cluster.sh
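
While the release is still installed, the endpoint lookup step can be scripted so that each model's address is printed on its own line. This is a minimal sketch under two assumptions: the chart creates one ingress per model in the current namespace, and the load balancer reports a hostname (as AWS load balancers typically do) rather than an IP address. The function name `list_model_endpoints` is hypothetical.

```shell
#!/usr/bin/env bash
# Print "<ingress name><TAB><load balancer hostname>" for every ingress
# in the current namespace.
# Assumption: the load balancer exposes a hostname in the ingress status;
# the hostname column is empty until the load balancer is provisioned.
list_model_endpoints() {
  kubectl get ingress \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.loadBalancer.ingress[0].hostname}{"\n"}{end}'
}

# Usage (requires kubectl pointed at the cluster):
#   list_model_endpoints
```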