**Resource:** https://github.com/aws-samples/Meta-Llama-on-AWS/blob/main/RAG-recipes/llama3-rag-langchain-smjs.ipynb

* Embedding Model Used: *HuggingFace BGE Large EN Embedding model*
* LLM Used: *HuggingFace Llama 3 8b Instruct LLM model*

In [146]:
%%writefile requirements.txt
langchain==0.1.14
pypdf==4.1.0
faiss-cpu==1.8.0
boto3==1.34.58
sqlalchemy==2.0.29

Overwriting requirements.txt


In [147]:
import sqlalchemy
print(sqlalchemy.__version__)

2.0.30


In [148]:
pip install nvidia-ml-py3==7.352.0

Note: you may need to restart the kernel to use updated packages.


In [149]:
pip install sqlparse==0.5.0

Note: you may need to restart the kernel to use updated packages.


In [150]:
pip install scikit-learn==1.3.0

Note: you may need to restart the kernel to use updated packages.


In [151]:
pip install omegaconf==2.2.3

Note: you may need to restart the kernel to use updated packages.


In [152]:
pip install gluonts==0.15.1

Note: you may need to restart the kernel to use updated packages.


In [153]:
pip install langchain==0.1.14

Note: you may need to restart the kernel to use updated packages.


In [154]:
pip install boto3==1.34.58

Note: you may need to restart the kernel to use updated packages.


In [155]:
!pip install -U -r requirements.txt



In [156]:
import langchain
print(langchain.__version__)

0.1.14


In [157]:
try:
    import sagemaker
except ImportError:
    !pip install sagemaker

In [13]:
# Import the JumpStartModel class from the SageMaker JumpStart library
from sagemaker.jumpstart.model import JumpStartModel

In [14]:
# Specify the model ID for the HuggingFace Llama 3 8b Instruct LLM model
model_id = "meta-textgeneration-llama-3-8b-instruct"
accept_eula = True
model = JumpStartModel(model_id=model_id, model_version="2.7.0", instance_type= "ml.g5.2xlarge")

In [15]:
predictor = model.deploy(accept_eula=accept_eula, instance_type="ml.g5.2xlarge")

----------------!

In [16]:
# Specify the model ID for the HuggingFace BGE Large EN Embedding model
model_id = "huggingface-sentencesimilarity-bge-large-en-v1-5"
text_embedding_model = JumpStartModel(model_id=model_id,model_version="1.1.1")

In [17]:
embedding_predictor = text_embedding_model.deploy(instance_type="ml.g5.2xlarge")

---------!

In [158]:
import json
import sagemaker

from langchain_core.prompts import PromptTemplate
from langchain_community.llms import SagemakerEndpoint
from langchain_community.embeddings import SagemakerEndpointEmbeddings
from langchain_community.llms.sagemaker_endpoint import LLMContentHandler
from langchain_community.embeddings.sagemaker_endpoint import EmbeddingsContentHandler

In [159]:
sess = sagemaker.session.Session()
region = sess._region_name

In [160]:
llm_endpoint_name = "meta-textgeneration-llama-3-8b-instruct-2024-10-20-09-25-36-231"
embedding_endpoint_name = "hf-sentencesimilarity-bge-large-en-v1-5-2024-10-20-09-35-36-025"

In [161]:
from typing import Dict

class Llama38BContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: dict) -> bytes:
        payload = {
            "inputs": prompt,
            # default parameters
            "parameters": {
                # Controls the maximum number of tokens the model can generate
                "max_new_tokens": 700, #started at 1000
                # lower value makes the model more deterministic / higher value allows for more diverse responses
                "top_p": 0.9,
                # randomness
                "temperature": 0.4, #started at 0.6
                "stop": ["<|eot_id|>"],
            },
        }
        input_str = json.dumps(
            payload,
        )
        print(input_str)
        return input_str.encode("utf-8")

    def transform_output(self, output: bytes) -> str:
        response_json = json.loads(output.read().decode("utf-8"))
        print(response_json)
        content = response_json["generated_text"].strip()
        return content

In [162]:
# Instantiate the content handler for Llama3-8B
llama_content_handler = Llama38BContentHandler()

# Setup for using the Llama3-8B model with SageMaker Endpoint
llm = SagemakerEndpoint(
     endpoint_name=llm_endpoint_name,
     region_name=region,
     model_kwargs={"max_new_tokens": 1024, "top_p": 0.9, "temperature": 0.4},
     content_handler=llama_content_handler
 )

In [163]:
from typing import List

class BGEContentHandlerV15(EmbeddingsContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, text_inputs: List[str], model_kwargs: dict) -> bytes:
        """
        Transforms the input into bytes that can be consumed by SageMaker endpoint.
        Args:
            text_inputs (list[str]): A list of input text strings to be processed.
            model_kwargs (Dict): Additional keyword arguments to be passed to the endpoint.
               Possible keys and their descriptions:
               - mode (str): Inference method. Valid modes are 'embedding', 'nn_corpus', and 'nn_train_data'.
               - corpus (str): Corpus for Nearest Neighbor. Required when mode is 'nn_corpus'.
               - top_k (int): Top K for Nearest Neighbor. Required when mode is 'nn_corpus'.
               - queries (list[str]): Queries for Nearest Neighbor. Required when mode is 'nn_corpus' or 'nn_train_data'.
        Returns:
            The transformed bytes input.
        """
        input_str = json.dumps(
            {
                "text_inputs": text_inputs,
                **model_kwargs
            }
        )
        return input_str.encode("utf-8")

    def transform_output(self, output: bytes) -> List[List[float]]:
        """
        Transforms the bytes output from the endpoint into a list of embeddings.
        Args:
            output: The bytes output from SageMaker endpoint.
        Returns:
            The transformed output - list of embeddings
        Note:
            The length of the outer list is the number of input strings.
            The length of the inner lists is the embedding dimension.
        """
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json["embedding"]

In [164]:
bge_content_handler = BGEContentHandlerV15()
sagemaker_embeddings = SagemakerEndpointEmbeddings(
    endpoint_name=embedding_endpoint_name,
    region_name=region,
    model_kwargs={"mode": "embedding"},
    content_handler=bge_content_handler,
)

In [165]:
!pip install openpyxl



In [166]:
import os

# Set the directory path to the correct folder
#directory_path = 'user-default-efs'

contents = os.listdir()
# Filter the list to get only PDF files
xlsx_files = [item for item in contents if item.endswith('.xlsx')]

# Print the PDF files
print("Excel files in the specified directory:")
for item in xlsx_files:
    print(item)

Excel files in the specified directory:
recipe_final1_split_1.xlsx


In [167]:
import json
from typing import List, Dict
from langchain import PromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain.indexes.vectorstore import VectorStoreIndexWrapper
from langchain.schema import HumanMessage
import pandas as pd
from langchain.docstore.document import Document

# Define the optimized prompt template
prompt_template = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a helpful assistant specialized in providing recipes. When answering questions about recipes, always structure your response as follows:
1. Recipe Name
2. Ingredients (as a numbered list)
3. Instructions (as a numbered list)
4. Any additional notes or tips
Ensure that ingredients and instructions are always presented as numbered lists for clarity and ease of use.
Use the following context to answer the user's query:
{context}
<|eot_id|><|start_header_id|>user<|end_header_id|>
{query}
<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Provide a complete recipe following the structure outlined above.
"""

PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "query"]
)

# 1. Document Processing
documents = []
metadata = [{} for _ in xlsx_files]  # Initialize metadata for each Excel file

for idx, file in enumerate(xlsx_files):
    # Load Excel file
    df = pd.read_excel(file)
    
    # Process each row in the Excel file
    for _, row in df.iterrows():
        recipe = f"Title: {row['title']}\nIngredients: {row['ingredients']}\nInstructions: {row['instructions']}"
        document = Document(page_content=recipe, metadata=metadata[idx])
        documents.append(document)

# Apply text splitting if needed
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=50,
)
docs = text_splitter.split_documents(documents)

# 2. Vector Store Creation
vectorstore_faiss = FAISS.from_documents(
    docs,
    sagemaker_embeddings,
)
wrapper_store_faiss = VectorStoreIndexWrapper(vectorstore=vectorstore_faiss)

# 3. Load Allergens
with open('allergens.json', 'r') as f:
    allergens = json.load(f)

# 4. Define Filtering and Query Functions
def filter_documents_by_allergen(docs, allergen):
    if allergen:
        return [doc for doc in docs if allergen.lower() not in doc.page_content.lower()]
    return docs

def query_with_allergen_filter(query: str, allergen: str = None):
    # Retrieve relevant documents
    retrieved_docs = wrapper_store_faiss.vectorstore.similarity_search(query)
    
    # Filter documents based on allergen
    filtered_docs = filter_documents_by_allergen(retrieved_docs, allergen)
    
    # Generate response using filtered documents
    context = "\n".join([doc.page_content for doc in filtered_docs])
    
    formatted_prompt = PROMPT.format(context=context, query=query)
    
    # Use the invoke method instead of calling the LLM directly
    response = llm.invoke([HumanMessage(content=formatted_prompt)])
    
    # Check if response is a string or an object with content attribute
    if isinstance(response, str):
        return response
    elif hasattr(response, 'content'):
        return response.content
    else:
        return str(response)  # fallback to string representation

# Modify the process_query function to include the allergen information in the query
def process_query(user_query: str):
    allergen = None
    if "allergic to" in user_query.lower():
        allergen = user_query.lower().split("allergic to")[-1].split(",")[0].strip()
        query = f"Please provide a recipe that doesn't include {allergen}. {user_query}"
    else:
        query = user_query
    
    result = query_with_allergen_filter(query, allergen)
    return result

In [168]:
# Example Usage
query = "I am allergic to milk, please give me a dairy-free recipe."
result = process_query(query)
print(result)

{"inputs": "Human: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\nYou are a helpful assistant specialized in providing recipes. When answering questions about recipes, always structure your response as follows:\n1. Recipe Name\n2. Ingredients (as a numbered list)\n3. Instructions (as a numbered list)\n4. Any additional notes or tips\nEnsure that ingredients and instructions are always presented as numbered lists for clarity and ease of use.\nUse the following context to answer the user's query:\nTitle: Banana-Cream Cheese Breakfast Cakes\nIngredients: 8 ounces cream cheese, at room temperature, 1 teaspoon vanilla extract, 2 large egg whites, at room temperature, 1/2 cup sugar, 1 1/2 cups plus 3 tablespoons cake flour, 1 1/2 teaspoons baking soda, 1/2 teaspoon kosher salt, 1 stick plus 2 tablespoons unsalted butter, at room temperature, 1 cup plus 6 tablespoons sugar, 2 teaspoons vanilla extract, 3 large eggs, at room temperature, 3 very ripe bananas, mashed\nTitle: \"V\" 

In [169]:
# Another Example Usage run previously by Phoebe
query = "I have brussel sprouts and carrots, can you give me a healthy dairy-free recipe to make."
result = process_query(query)
print(result)

{"inputs": "Human: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\nYou are a helpful assistant specialized in providing recipes. When answering questions about recipes, always structure your response as follows:\n1. Recipe Name\n2. Ingredients (as a numbered list)\n3. Instructions (as a numbered list)\n4. Any additional notes or tips\nEnsure that ingredients and instructions are always presented as numbered lists for clarity and ease of use.\nUse the following context to answer the user's query:\nInstructions: Bring a stockpot of lightly salted water to a boil over high heat. Add the potatoes, and cook until tender, 15 to 20 minutes. Drain potatoes. Cut potatoes into 1/3-inch-thick rounds, and transfer to a bowl. Set aside. Bring a stockpot of lightly salted water to a boil over high heat, and prepare an ice bath. Trim bottoms from Brussels sprouts. Peel away the inner leaves, one by one. Add carrots to boiling water, and cook until tender, about 4 minutes. Using a slotted

In [170]:
# Example with multiple queries
queries = [
    "Give me a recipe for a vegetarian pasta dish.",
    "I'm allergic to nuts. Can you give me a dessert recipe?",
    "I need a gluten-free breakfast recipe."
]

for q in queries:
    print(f"\nQuery: {q}")
    result = process_query(q)
    print(result)
    print("-" * 50)


Query: Give me a recipe for a vegetarian pasta dish.
{"inputs": "Human: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\nYou are a helpful assistant specialized in providing recipes. When answering questions about recipes, always structure your response as follows:\n1. Recipe Name\n2. Ingredients (as a numbered list)\n3. Instructions (as a numbered list)\n4. Any additional notes or tips\nEnsure that ingredients and instructions are always presented as numbered lists for clarity and ease of use.\nUse the following context to answer the user's query:\nTitle: Baked Pasta with Tomatoes, Shiitakes, and Prosciutto\nIngredients: 2 cups finely chopped onion, 2 large garlic cloves, minced, 1/4 teaspoon dried hot red pepper flakes, or to taste, 1 teaspoon dried basil, crumbled, 1 teaspoon dried oregano, crumbled, 2 tablespoons olive oil, 1 pound fresh shiitakes, stems discarded and the caps sliced, 8 ounces (1 stick) unsalted butter, 6 tablespoons all-purpose flour, 4 cups milk, 2 (

In [171]:
# Another Example Usage run previously by Phoebe
query = "How to cook a dish with pork tenderloin?"
result = process_query(query)
print(result)

{"inputs": "Human: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\nYou are a helpful assistant specialized in providing recipes. When answering questions about recipes, always structure your response as follows:\n1. Recipe Name\n2. Ingredients (as a numbered list)\n3. Instructions (as a numbered list)\n4. Any additional notes or tips\nEnsure that ingredients and instructions are always presented as numbered lists for clarity and ease of use.\nUse the following context to answer the user's query:\nInstructions: Preheat oven to 450 degrees F.\nCoat tenderloins in steak seasoning or season with salt and pepper. Combine brown sugar and 5 spice powder. Rub sugar and spice over pork. Brush baking dish or broiler pan with oil. Place pork in oven and bake 10 minutes. Combine barbecue sauce with orange zest and juice and sesame oil. Baste meat liberally with sauce and roast meat 15 minutes longer. Remove meat from oven and transfer to a serving platter. Coat meat with sesame and ci

In [173]:
# Another query example
query = "I have brussel sprouts and carrots, can you give me a healthy dairy-free recipe to make."
result = process_query(query)
print(result)

{"inputs": "Human: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\nYou are a helpful assistant specialized in providing recipes. When answering questions about recipes, always structure your response as follows:\n1. Recipe Name\n2. Ingredients (as a numbered list)\n3. Instructions (as a numbered list)\n4. Any additional notes or tips\nEnsure that ingredients and instructions are always presented as numbered lists for clarity and ease of use.\nUse the following context to answer the user's query:\nInstructions: Bring a stockpot of lightly salted water to a boil over high heat. Add the potatoes, and cook until tender, 15 to 20 minutes. Drain potatoes. Cut potatoes into 1/3-inch-thick rounds, and transfer to a bowl. Set aside. Bring a stockpot of lightly salted water to a boil over high heat, and prepare an ice bath. Trim bottoms from Brussels sprouts. Peel away the inner leaves, one by one. Add carrots to boiling water, and cook until tender, about 4 minutes. Using a slotted

In [174]:
# Another Example Usage run previously by Phoebe
query = "How to cook a dish with pork tenderloin?"
result = process_query(query)
print(result)

{"inputs": "Human: <|begin_of_text|><|start_header_id|>system<|end_header_id|>\nYou are a helpful assistant specialized in providing recipes. When answering questions about recipes, always structure your response as follows:\n1. Recipe Name\n2. Ingredients (as a numbered list)\n3. Instructions (as a numbered list)\n4. Any additional notes or tips\nEnsure that ingredients and instructions are always presented as numbered lists for clarity and ease of use.\nUse the following context to answer the user's query:\nInstructions: Preheat oven to 450 degrees F.\nCoat tenderloins in steak seasoning or season with salt and pepper. Combine brown sugar and 5 spice powder. Rub sugar and spice over pork. Brush baking dish or broiler pan with oil. Place pork in oven and bake 10 minutes. Combine barbecue sauce with orange zest and juice and sesame oil. Baste meat liberally with sauce and roast meat 15 minutes longer. Remove meat from oven and transfer to a serving platter. Coat meat with sesame and ci