This project is a RAG (retrieval-augmented generation) application that embeds the Polish Penal Code and answers questions with the gpt-4o-mini LLM. It is deployed on Azure Kubernetes Service (AKS).
The 'rag-api' is a FastAPI backend service that handles the RAG communication. It exposes a single POST endpoint that retrieves the relevant context from Milvus DB, feeds it to the LLM, and returns the LLM's answer based on that context.
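As a rough illustration of the retrieve-then-generate flow behind that endpoint, here is a minimal sketch in plain Python. It is not the project's code: the in-memory list with cosine ranking stands in for the Milvus search, and `embed` and `llm` are hypothetical callables standing in for the embedding model and gpt-4o-mini. All names (`Chunk`, `retrieve`, `build_prompt`, `answer`) and the prompt wording are assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    vector: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector, store, top_k=2):
    """Stand-in for the vector DB search: return the top_k most similar chunks."""
    ranked = sorted(store, key=lambda c: cosine(query_vector, c.vector), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Combine the retrieved chunks and the question into one prompt (assumed wording)."""
    context = "\n\n".join(c.text for c in context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def answer(question, embed, store, llm, top_k=2):
    """Embed the question, retrieve context, and pass the prompt to the LLM callable."""
    chunks = retrieve(embed(question), store, top_k)
    return llm(build_prompt(question, chunks))
```

In a real service, `answer` would be called from the POST handler, with `retrieve` replaced by a query against the Milvus collection.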
The 'frontend' is a Streamlit application that communicates with the rag-api and provides a user-friendly interface for chatting with the RAG model.
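The call a client such as the frontend makes to the rag-api might look like the sketch below, which uses only the Python standard library. The endpoint path `/ask`, the `question` request field, and the `answer` response field are hypothetical; check the rag-api source for the real contract.

```python
import json
import urllib.request


def build_ask_request(base_url: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the rag-api."""
    body = json.dumps({"question": question}).encode("utf-8")  # hypothetical payload shape
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ask",  # hypothetical endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url: str, question: str, timeout: float = 30.0) -> str:
    """Send the question and return the answer text from the JSON response."""
    request = build_ask_request(base_url, question)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["answer"]  # hypothetical response field
```

Keeping request construction separate from sending makes the payload easy to inspect without a running backend.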
The 'ingesting' component is an Argo Workflows workflow that chunks the Polish Penal Code, embeds it, and stores the embeddings in the Milvus vector database.
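The chunk, embed, and store steps can be sketched as follows. This is only an illustration: the fixed-size character windows with overlap are an assumption (the workflow's actual chunking strategy is not described here), `embed` is a hypothetical callable standing in for the embedding model, and a plain Python list stands in for the Milvus collection.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def ingest(text, embed, store, chunk_size=500, overlap=50):
    """Chunk the text, embed each chunk, and append records to a list-like store."""
    for chunk in chunk_text(text, chunk_size, overlap):
        store.append({"text": chunk, "vector": embed(chunk)})
    return len(store)  # total number of records now in the store
```

The overlap keeps a sentence that straddles a window boundary at least partly present in both neighbouring chunks, which tends to help retrieval.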
The application is designed to run in Kubernetes pods.
The Azure Kubernetes Service (AKS) cluster is deployed using Terraform.
- Terraform Deployment
- Navigate to the terraform directory.
- Create a terraform.tfvars file and assign values:
resource_group_name = "law-rag-model-rg"
resource_group_location = "West US"
aks_name = "law-rag-model-aks"
dns_prefix = "law-rag-model-dns"
node_count = 1
vm_size = "Standard_B2s"
os_disk_size_gb = 32
subscription_id = "<YOUR_SUBSCRIPTION_ID>"
- Initialize Terraform:
terraform init
- Apply the Terraform configuration to deploy the AKS cluster:
terraform apply
- Application Deployment
- Navigate to the deploy directory.
- Use Helm to deploy the application:
helm install <release-name> ./helm-chart
- Alternatively, apply the Kubernetes YAML files:
kubectl apply -f sample.yaml -n <namespace>