amazon-sagemaker-elastic

RAG implementation using Amazon SageMaker JumpStart and Elastic

Prerequisites:

Sign up for a free trial of an Elasticsearch cluster on Elastic Cloud
Create a new deployment on AWS following the steps
Add a new machine learning node by following the steps below. This lets you run machine learning models in your deployment
1. Click on Edit under Deployment Name in the left navigation bar
2. Scroll down to the Machine Learning instances box
3. Click +Add Capacity
4. Under Size per zone, click and select 2GB RAM
5. Click on Save and then Confirm
Reset and download the elastic user password following these steps
Copy the deployment ID from the Overview page under Deployment name
Load an embedding model into Elasticsearch. This example uses the all-distilroberta-v1 model hosted on the Hugging Face model hub; you can choose another sentence-transformer model to suit your use case. Import the Python notebook load_embedding_llm_to_elastic.ipynb into Amazon SageMaker and run it. Provide the Cloud ID, Elasticsearch username, and Elasticsearch password when prompted. The notebook downloads the model from Hugging Face, splits it into chunks, loads it into Elasticsearch, and deploys it onto the machine learning node of the Elasticsearch cluster.
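Once the embedding model is deployed, a search request can ask Elasticsearch to embed the question with that model at query time. The following is a minimal sketch of such a request body, not the code used by this repo. It assumes Elasticsearch 8.7 or later (for query_vector_builder), a model ID of sentence-transformers__all-distilroberta-v1, and hypothetical field names text and text_embedding; adjust all of these to match your index.

```python
from typing import Any, Dict

# Assumption: the ID under which the model was stored in Elasticsearch.
# Check the Trained Models page in Kibana for the actual value.
MODEL_ID = "sentence-transformers__all-distilroberta-v1"


def build_knn_search(
    question: str,
    field: str = "text_embedding",  # hypothetical dense_vector field name
    k: int = 3,
    num_candidates: int = 50,
    model_id: str = MODEL_ID,
) -> Dict[str, Any]:
    """Build a kNN search body that embeds the question with the deployed model."""
    if k > num_candidates:
        raise ValueError("k must not exceed num_candidates")
    return {
        "knn": {
            "field": field,
            "k": k,
            "num_candidates": num_candidates,
            # Elasticsearch runs the deployed model on model_text to get the query vector.
            "query_vector_builder": {
                "text_embedding": {"model_id": model_id, "model_text": question}
            },
        },
        # Hypothetical source field holding the passage text.
        "_source": ["text"],
    }


if __name__ == "__main__":
    print(build_knn_search("How do I add a machine learning node?"))
```

The returned dictionary can be sent as the body of a search request with the Elasticsearch client of your choice.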

Choose your LLM. Amazon SageMaker JumpStart offers a wide selection of proprietary and publicly available foundation models from various model providers. Log in to Amazon SageMaker Studio, open Amazon SageMaker JumpStart, and search for your preferred foundation model.
Deploy your LLM. Amazon SageMaker JumpStart also provides a no-code interface for deploying the model, so you can deploy it in a few clicks. After the deployment succeeds, copy the endpoint name.
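To illustrate how the copied endpoint name is used, here is a minimal sketch of calling a deployed model with boto3. It is not taken from this repo. The prompt wording, the build_prompt and build_payload helpers, and the text_inputs / max_length payload keys are assumptions; the request format differs between models and JumpStart versions, so check the example notebook for the model you deployed.

```python
import json
from typing import Any, List


def build_prompt(question: str, passages: List[str]) -> str:
    """Combine retrieved passages and the question into one prompt (hypothetical wording)."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def build_payload(prompt: str, max_length: int = 200) -> bytes:
    """Serialize the request body. The key names are an assumption about the model's schema."""
    return json.dumps({"text_inputs": prompt, "max_length": max_length}).encode("utf-8")


def ask_endpoint(endpoint_name: str, region: str, prompt: str) -> Any:
    """Send the prompt to a SageMaker endpoint and return the parsed JSON response."""
    import boto3  # imported here so the helpers above work without AWS libraries installed

    client = boto3.client("sagemaker-runtime", region_name=region)
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(prompt),
    )
    return json.loads(response["Body"].read())


if __name__ == "__main__":
    demo = build_prompt("What does the ML node do?", ["It runs machine learning models."])
    print(demo)
```

Calling ask_endpoint requires AWS credentials with permission to invoke the endpoint; the structure of the returned JSON depends on the model.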
Download this Git repo and set up the RAG application. Launch an EC2 instance and clone the code from this GitHub link. Set up a virtual environment. Install the required Python libraries by running the command pip install -r requirements.txt. Update the config.sh file with the following:
1. ES_CLOUD_ID: Elastic Cloud Deployment ID
2. ES_USERNAME: Elasticsearch Cluster User
3. ES_PASSWORD: Elasticsearch User password
4. FLAN_T5_ENDPOINT: Amazon SageMaker Endpoint Name pointing to Flan T5
5. FALCON_40B_ENDPOINT: Amazon SageMaker Endpoint Name pointing to Falcon 40B
6. AWS_REGION: AWS Region
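As a sanity check before starting the application, the settings above can be validated from Python. This is a minimal sketch that assumes config.sh exports the six values as environment variables and has been sourced in the current shell; load_config is a hypothetical helper, not part of this repo.

```python
import os
from typing import Dict, Mapping

# The six settings listed above.
REQUIRED_KEYS = (
    "ES_CLOUD_ID",
    "ES_USERNAME",
    "ES_PASSWORD",
    "FLAN_T5_ENDPOINT",
    "FALCON_40B_ENDPOINT",
    "AWS_REGION",
)


def load_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


if __name__ == "__main__":
    try:
        config = load_config()
        print("All settings found for region", config["AWS_REGION"])
    except ValueError as err:
        print(err)
```

Failing early with the list of missing names is easier to debug than a connection error raised later by the Elasticsearch or AWS client.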
Run the application using the command streamlit run rag_elastic_aws.py. This opens the application in a web browser and prints its URL to the command line
