Udayel/RAGElastic-LLM

amazon-sagemaker-elastic

Retrieval-augmented generation (RAG) implementation using Amazon SageMaker JumpStart and Elastic

Prerequisites:

  1. Sign up for a free trial of an Elasticsearch cluster on Elastic Cloud
  2. Create a new deployment on AWS following the steps
  3. Add a new machine learning node by following the steps below. This enables your deployment to run machine learning models
    1. Click on Edit under Deployment Name in the left navigation bar
    2. Scroll down to the Machine Learning instances box
    3. Click +Add Capacity
    4. Under Size per zone, select 2 GB RAM
    5. Click on Save and then Confirm
  4. Reset and download the elastic user password following these steps
  5. Copy the deployment ID from the Overview page under Deployment name
  6. Load an embedding model into Elasticsearch. This example uses the all-distilroberta-v1 model hosted on the Hugging Face model hub; you can choose another sentence-transformer model to suit your use case. Import the Python notebook load_embedding_llm_to_elastic.ipynb into Amazon SageMaker and run it. Provide the Cloud ID, Elasticsearch username, and Elasticsearch password when prompted. The notebook downloads the model from Hugging Face, splits it into chunks, loads it into Elasticsearch, and deploys it onto the machine learning node of the Elasticsearch cluster.
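
For readers who want to see what the model-loading step in item 6 could look like outside the notebook, here is a minimal sketch. It assumes the Eland client's eland_import_hub_model command-line tool is installed and that the full Hugging Face ID of the model is sentence-transformers/all-distilroberta-v1; the notebook itself may do this differently. The helper names are hypothetical, and the code only builds the command rather than running it.

```python
import shlex

# Assumption: full Hugging Face hub ID of the model named in the README.
HUB_MODEL_ID = "sentence-transformers/all-distilroberta-v1"


def elasticsearch_model_id(hub_model_id: str) -> str:
    """Approximate the ID Eland assigns on import: lowercase, '/' becomes '__'."""
    return hub_model_id.replace("/", "__").lower()


def build_import_command(cloud_id, username, password, hub_model_id=HUB_MODEL_ID):
    """Build (but do not run) an eland_import_hub_model invocation."""
    return [
        "eland_import_hub_model",
        "--cloud-id", cloud_id,
        "-u", username,
        "-p", password,
        "--hub-model-id", hub_model_id,
        "--task-type", "text_embedding",
        "--start",  # deploy the model right after it is uploaded
    ]


if __name__ == "__main__":
    # Placeholder values; substitute your own deployment details.
    cmd = build_import_command("<cloud-id>", "elastic", "<password>")
    print(shlex.join(cmd))
    print(elasticsearch_model_id(HUB_MODEL_ID))
```

The list returned by build_import_command can be passed to subprocess.run once real credentials are supplied. The derived model ID is what later search requests would reference, provided the import follows Eland's usual naming.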

Implementation Steps:

  1. Choose your LLM. Amazon SageMaker JumpStart offers a wide selection of proprietary and publicly available foundation models from various model providers. Log in to Amazon SageMaker Studio, open Amazon SageMaker JumpStart, and search for your preferred foundation model.
  2. Deploy your LLM. Amazon SageMaker JumpStart in Studio also provides a no-code interface for deploying the model, so you can deploy it with a few clicks. After the deployment succeeds, copy the endpoint name.
  3. Download this Git repo and set up the RAG application. Launch an EC2 instance and clone the code from this GitHub link. Set up a virtual environment. Install the required Python libraries by running the command pip install -r requirements.txt. Update the config.sh file with the following:
    1. ES_CLOUD_ID: Elastic Cloud Deployment ID
    2. ES_USERNAME: Elasticsearch Cluster User
    3. ES_PASSWORD: Elasticsearch User password
    4. FLAN_T5_ENDPOINT: Amazon SageMaker Endpoint Name pointing to Flan T5
    5. FALCON_40B_ENDPOINT: Amazon SageMaker Endpoint Name pointing to Falcon 40B
    6. AWS_REGION: AWS Region
  4. Run the application using the command streamlit run rag_elastic_aws.py. This starts the application and prints its URL to the command line; open the URL in a web browser if one does not open automatically.
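
The settings above feed a retrieve-then-generate flow. The sketch below shows one way that flow could be wired together; it is not the code in rag_elastic_aws.py. It assumes config.sh exports the six values as environment variables, that documents are indexed with a dense vector field (the field name text_embedding.predicted_value and the text source field are hypothetical), and that the endpoint accepts a JSON body with an inputs key, which varies by JumpStart model.

```python
import json
import os

REQUIRED_KEYS = (
    "ES_CLOUD_ID", "ES_USERNAME", "ES_PASSWORD",
    "FLAN_T5_ENDPOINT", "FALCON_40B_ENDPOINT", "AWS_REGION",
)


def load_config(env=None):
    """Read the config.sh settings from the environment and fail fast if any is unset."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("Missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


def build_knn_search(question, model_id, field="text_embedding.predicted_value", k=3):
    """Build an Elasticsearch kNN search body that embeds the question server-side."""
    return {
        "knn": {
            "field": field,  # hypothetical dense_vector field name
            "k": k,
            "num_candidates": 10 * k,
            "query_vector_builder": {
                "text_embedding": {"model_id": model_id, "model_text": question}
            },
        },
        "_source": ["text"],  # hypothetical field holding the passage text
    }


def build_prompt(question, passages):
    """Combine the retrieved passages and the question into a single prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )


def ask_llm(endpoint_name, region, prompt):
    """Send the prompt to a SageMaker endpoint (payload shape is an assumption)."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("sagemaker-runtime", region_name=region)
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"inputs": prompt}).encode("utf-8"),
    )
    return json.loads(response["Body"].read())


if __name__ == "__main__":
    body = build_knn_search("What is RAG?", "sentence-transformers__all-distilroberta-v1")
    print(json.dumps(body, indent=2))
    print(build_prompt("What is RAG?", ["passage one", "passage two"]))
```

Keeping the query and prompt builders as plain functions makes them easy to check without a live cluster or endpoint; only ask_llm needs AWS credentials.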