<a href="https://colab.research.google.com/github/tyoungg/Data-Warehouse/blob/master/intro_to_rag.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Introduction to RAG**
### *Michaela Buchanan - Mark III Systems*

This notebook references the Intro to LLM workshop notebook which is linked here:

---
Imports for all the code below. Please run this before anything else!

In [None]:
!pip -q install bitsandbytes

from transformers import AutoTokenizer, FalconForCausalLM, BitsAndBytesConfig
import sys
import os
from sentence_transformers.util import semantic_search, dot_score
from sentence_transformers import SentenceTransformer
import pandas as pd
import torch

---
### Introduction

In this notebook we will be going through a basic RAG workflow using a dataset of recipes and [Falcon 7B Instruct](https://huggingface.co/tiiuae/falcon-7b-instruct). The goal will be to see the model use specific information from [this recipe dataset](https://huggingface.co/datasets/Shengtao/recipe) in its responses after we build out the RAG pipeline.

---
### Import Model

The first step is to import our LLM and its associated tokenizer, in this case Falcon 7B Instruct. Just like in the Intro to LLM notebook, we are going to use a sharded version of the model to avoid running out of RAM in our free-tier Google Colab environment. Sharding means the checkpoint is split into many smaller files rather than a few large ones, so the model can be loaded piece by piece instead of crashing the compute environment. We are also loading the model in 4-bit precision to avoid running out of GPU memory.

In [None]:
model_name = "vilsonrodrigues/falcon-7b-instruct-sharded"

bb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

falcon_model = FalconForCausalLM.from_pretrained(
    model_name,
    quantization_config=bb_config,
    use_cache=False,
    low_cpu_mem_usage=True
)

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True, clean_up_tokenization_spaces=True)
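To see why 4-bit loading helps, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at different precisions. The parameter count is an assumption (a round 7 billion for a "7B" model), and the estimate ignores activations, the generation cache, and quantization metadata, so treat the numbers as ballpark figures only.

```python
# Back-of-the-envelope weight memory at different precisions.
# ASSUMPTION: a round 7 billion parameters; real usage is higher because
# activations, caches, and quantization metadata are not counted here.
NUM_PARAMS = 7_000_000_000

BITS_PER_PARAM = {
    "float32": 32,
    "bfloat16": 16,
    "4-bit (nf4)": 4,
}


def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Approximate memory needed to hold the weights, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1024**3


for name, bits in BITS_PER_PARAM.items():
    print(f"{name:>12}: {weight_memory_gib(NUM_PARAMS, bits):5.1f} GiB")
```

Going from 32-bit to 4-bit weights cuts this estimate by a factor of eight, which is what lets the model fit on a small GPU.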


---
### Baseline Inference

Let's establish a baseline by recording how the model responds without RAG, so that we can compare these responses with the RAG-augmented ones later. First we set up the prompts we would like to use. Feel free to edit or add to these if you are curious about another prompt.

In [None]:
prompt_task = "Question: "
prompt_end = "\nAnswer: "

prompts = [
    "What two ingredients do I need for two-ingredient pizza dough?",
    "What do I need for a quick tartar sauce?",
    "What should I preheat the oven to to make blueberry muffins?"
]

Now we feed them into our model to see what responses we get.

In [None]:
for prompt in prompts:
    prompt = prompt_task + prompt + prompt_end
    inputs = tokenizer(prompt, return_tensors="pt").to('cuda')

    generate_ids = falcon_model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_new_tokens=25,
        pad_token_id=tokenizer.eos_token_id
    )

    print("\n")
    print(tokenizer.decode(generate_ids[0]))
    print("\n------------------------------------------------------------")

print("\n\nAll done")



Question: What two ingredients do I need for two-ingredient pizza dough?
Answer: 2 cups of all-purpose flour and 1/2 teaspoon of salt.<|endoftext|>

------------------------------------------------------------


Question: What do I need for a quick tartar sauce?
Answer: 1 cup mayonnaise, 1/4 cup ketchup, 1/4 cup apple cider vinegar, 1/4 cup

------------------------------------------------------------


Question: What should I preheat the oven to to make blueberry muffins?
Answer: 350°F (175°C)<|endoftext|>

------------------------------------------------------------


All done
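The decoded outputs above repeat the prompt and sometimes end with the `<|endoftext|>` token. If you only want the generated answer, a small helper like the one below can pull it out. This is a minimal sketch: `extract_answer` is a hypothetical helper name, and it assumes the answer is whatever follows the last `Answer: ` marker in the decoded string.

```python
def extract_answer(decoded: str,
                   answer_marker: str = "Answer: ",
                   eos_token: str = "<|endoftext|>") -> str:
    """Return only the text after the last answer marker, without the EOS token."""
    # rpartition splits on the LAST occurrence of the marker; if the marker
    # is missing, the whole string is treated as the answer.
    _, _, answer = decoded.rpartition(answer_marker)
    return answer.replace(eos_token, "").strip()


# Example string copied from the baseline output above.
sample = (
    "Question: What should I preheat the oven to to make blueberry muffins?\n"
    "Answer: 350°F (175°C)<|endoftext|>"
)
print(extract_answer(sample))
```

Alternatively, `tokenizer.decode` accepts `skip_special_tokens=True`, which drops the end-of-text token but still leaves the prompt in the string.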


---
### Import Recipe Dataset

The results seem reasonable. However, we would like our model to use information from a recipe dataset rather than whatever it remembers from pretraining. To start that process, we first need to import our dataset. We download it from Hugging Face and keep only the first 500 entries, because creating embeddings for every entry takes a long time and a demo does not need the full dataset.

In [None]:
df = pd.read_csv("hf://datasets/Shengtao/recipe/recipe.csv")
df = df.head(500)

df

Unnamed: 0,title,url,category,author,description,rating,rating_count,review_count,ingredients,directions,...,vitamin_k_mcg,biotin_mcg,vitamin_b12_mcg,mono_fat_g,poly_fat_g,trans_fatty_acid_g,omega_3_fatty_acid_g,omega_6_fatty_acid_g,instructions_list,image
0,Simple Macaroni and Cheese,https://www.allrecipes.com/recipe/238691/simpl...,main-dish,g0dluvsugly,A very quick and easy fix to a tasty side-dish...,4.42,834,575,1 (8 ounce) box elbow macaroni ; ¼ cup butter ...,Bring a large pot of lightly salted water to a...,...,,,,,,,,,['Bring a large pot of lightly salted water to...,https://www.allrecipes.com/thmb/GZrTl8DBwmRuor...
1,Gourmet Mushroom Risotto,https://www.allrecipes.com/recipe/85389/gourme...,main-dish,Myleen Sagrado Sjödin,Authentic Italian-style risotto cooked the slo...,4.80,3388,2245,"6 cups chicken broth, divided ; 3 tablespoons ...","In a saucepan, warm the broth over low heat. W...",...,,,,,,,,,"['Warm broth in a saucepan over low heat.', 'M...",https://www.allrecipes.com/thmb/xCk4IEjfAYBikO...
2,Dessert Crepes,https://www.allrecipes.com/recipe/19037/desser...,breakfast-and-brunch,ANN57,Essential crepe recipe. Sprinkle warm crepes ...,4.80,1156,794,"4 eggs, lightly beaten ; 1 ⅓ cups milk ; 2 ta...","In large bowl, whisk together eggs, milk, melt...",...,,,,,,,,,"['Whisk together eggs, milk, flour, melted but...",https://www.allrecipes.com/thmb/VwULr05JFDluPI...
3,Pork Steaks,https://www.allrecipes.com/recipe/70463/pork-s...,meat-and-poultry,BABYLOVE1222,My mom came up with this recipe when I was a c...,4.57,689,539,¼ cup butter ; ¼ cup soy sauce ; 1 bunch green...,"Melt butter in a skillet, and mix in the soy s...",...,,,,,,,,,['Melt butter in a skillet over medium heat; s...,https://www.allrecipes.com/thmb/mYkvln7o9pb35l...
4,Quick and Easy Pizza Crust,https://www.allrecipes.com/recipe/20171/quick-...,bread,CHEF RIDER,This is a great recipe when you don't want to ...,4.70,3741,2794,1 (.25 ounce) package active dry yeast ; 1 tea...,Preheat oven to 450 degrees F (230 degrees C)....,...,,,,,,,,,['Preheat oven to 450 degrees F (230 degrees C...,https://www.allrecipes.com/thmb/V3Llo-ottudIs_...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
495,Spicy Grilled Shrimp,https://www.allrecipes.com/recipe/12775/spicy-...,seafood,SUBEAST,"So fast and easy to prepare, these shrimp are ...",4.60,1121,836,1 large clove garlic ; 1 teaspoon coarse salt ...,Preheat grill for medium heat. In a small bowl...,...,,,,,,,,,"['Preheat a grill for medium heat.', 'Crush ga...",https://www.allrecipes.com/thmb/bH-TT4Cifwew96...
496,Grilled Chicken Marinade,https://www.allrecipes.com/recipe/241890/grill...,meat-and-poultry,Jennifer,Simply the best chicken marinade for any occas...,4.77,236,167,¼ cup red wine vinegar ; ¼ cup reduced-sodium ...,"Whisk vinegar, soy sauce, olive oil, parsley, ...",...,,,,,,,,,"['Whisk vinegar, soy sauce, olive oil, parsley...",https://www.allrecipes.com/thmb/b8GwNUs4HIz9-X...
497,Wool Roll Bread,https://www.allrecipes.com/recipe/284058/wool-...,uncategorized,Chef John,I've spun quite a few yarns but one thing I've...,0.00,0,0,½ cup water ; ¼ cup all-purpose flour ; ½ cup ...,"To make ""water roux,"" whisk together water and...",...,,,,,,,,,"['To make ""water roux,"" whisk together water a...",https://www.allrecipes.com/thmb/BL8waIMOSis0wl...
498,Easy Pizza Sauce I,https://www.allrecipes.com/recipe/11771/easy-p...,side-dish,Frank Sweterlitsch,A simple pizza sauce used by many pizzerias. T...,4.34,570,430,1 (6 ounce) can tomato paste ; 1 ½ cups water ...,"Mix together the tomato paste, water, and oliv...",...,,,,,,,,,"['Mix together water, tomato paste, and olive ...",https://www.allrecipes.com/thmb/aYLD9MOpFl7p7A...


Now we are going to combine the `title`, `directions`, and `ingredients` columns into one entry per row. This will be what we embed for our model to reference using RAG.

In [None]:
text = []
for i, row in df.iterrows():
    text.append(row['title'] + ": " + row['directions'] + " Ingredients list: " + row['ingredients'])

print("Made recipe list")

Made recipe list
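The loop above concatenates strings directly, which raises a `TypeError` if any of the three fields is missing, since pandas represents missing text as a float `NaN`. If you try a larger slice of the dataset and run into this, a more defensive variant might look like the sketch below. `build_recipe_text` is a hypothetical helper, and the example row is made up for illustration rather than taken from the dataset.

```python
import math


def build_recipe_text(row: dict) -> str:
    """Join title, directions, and ingredients in the same layout as the loop above.

    Missing values (None or NaN) become empty strings instead of raising.
    """
    def clean(value) -> str:
        if value is None or (isinstance(value, float) and math.isnan(value)):
            return ""
        return str(value)

    return (
        clean(row.get("title")) + ": " + clean(row.get("directions"))
        + " Ingredients list: " + clean(row.get("ingredients"))
    )


# Hypothetical example row, not taken from the dataset.
example = {
    "title": "Toast",
    "directions": "Toast the bread.",
    "ingredients": "1 slice bread",
}
print(build_recipe_text(example))
```

With a DataFrame, the same function could be applied to each row by converting it with `row.to_dict()` inside the loop.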


---
### Embedding

Now we are going to set up our embedding model. We will be using the [SentenceTransformers](https://sbert.net/) library for this example, but there are many other libraries and frameworks for doing RAG.

There are many models available through SentenceTransformers. Some are more accurate but larger and slower to run, while others are more lightweight but may be less accurate. We will be using `all-mpnet-base-v2`, which offers a nice balance of size and quality. If you would like to learn more or try out a different model, see the [documentation linked here](https://sbert.net/docs/sentence_transformer/pretrained_models.html).

In [None]:
st_model = SentenceTransformer("all-mpnet-base-v2")



In the cell below we take each entry in our list of recipes and embed it using the `all-mpnet-base-v2` model. `.encode` creates our embeddings which we append to a list called `embeddings`.

In [None]:
embeddings = []

for line in text:
    embedding = st_model.encode(line, convert_to_tensor=True)
    embeddings.append(embedding)

print("Made recipe embeddings")

Made recipe embeddings
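To build some intuition for what these embeddings are for: texts with similar meaning are mapped to vectors that point in similar directions, and retrieval ranks the stored vectors by how close they are to the query vector. The `semantic_search` helper imported at the top of the notebook scores with cosine similarity by default. The toy sketch below mimics that ranking in plain Python. The 3-dimensional vectors are made up for illustration (real embeddings have hundreds of dimensions), and `top_k_hits` is a hypothetical stand-in, not the library function.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_hits(query, corpus, k=2):
    """Rank corpus vectors by similarity to the query and keep the best k."""
    scored = [
        {"corpus_id": i, "score": cosine_similarity(query, vec)}
        for i, vec in enumerate(corpus)
    ]
    scored.sort(key=lambda hit: hit["score"], reverse=True)
    return scored[:k]


# Made-up toy vectors standing in for recipe embeddings.
toy_corpus = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.2, 0.0]

toy_hits = top_k_hits(toy_query, toy_corpus, k=2)
for hit in toy_hits:
    print(hit["corpus_id"], round(hit["score"], 3))
```

Each hit carries a `corpus_id` that points back into the original list, which is how a retrieved vector is mapped back to its text.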


---
### Inference Using RAG

Now we are ready to see the results of using our embeddings in our prompts for Falcon 7B Instruct. For each prompt in our list of prompts, we run through the loop below. We first encode the prompt and then use `semantic_search` from SentenceTransformers to find the most similar embedded recipes. With the `top_k=2` argument we only get the top two results. We then use the `corpus_id` of each hit to look up the matching entries in our `text` recipe list. These recipes are prepended to the prompt, and the augmented prompt is passed to our LLM just like before.
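The prompt-building step described here can be written so that it works for any number of retrieved recipes rather than exactly two. The sketch below assumes the hits for a single query are a list of dicts with a `corpus_id` key, which is the layout `semantic_search` returns. `build_rag_prompt` is a hypothetical helper, and the corpus entries are placeholders.

```python
def build_rag_prompt(question, hits, corpus,
                     task="Question: ", end="\nAnswer: "):
    """Prepend every retrieved passage to the question prompt."""
    # One retrieved passage per line, in ranking order.
    context = "\n".join(corpus[hit["corpus_id"]] for hit in hits)
    # A blank line separates the retrieved context from the question.
    return context + "\n\n" + task + question + end


# Placeholder corpus and hits, made up for illustration.
demo_corpus = ["Recipe A: ...", "Recipe B: ...", "Recipe C: ..."]
demo_hits = [
    {"corpus_id": 2, "score": 0.9},
    {"corpus_id": 0, "score": 0.7},
]

demo_prompt = build_rag_prompt("Which recipe uses yogurt?", demo_hits, demo_corpus)
print(demo_prompt)
```

In the notebook's loop, the equivalent call would pass `hits[0]` (the hits for the first and only query) together with the `text` list.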

In [None]:
for prompt in prompts:
    prompt = prompt_task + prompt + prompt_end

    prompt_embed = st_model.encode(prompt, convert_to_tensor=True)
    hits = semantic_search(prompt_embed, embeddings, top_k=2)

    result_line1 = hits[0][0]['corpus_id']
    result_line2 = hits[0][1]['corpus_id']

    prompt = text[result_line1] + "\n" + text[result_line2] + "\n\n" + prompt

    inputs = tokenizer(prompt, return_tensors="pt")

    generate_ids = falcon_model.generate(
        input_ids=inputs.input_ids.to('cuda'),
        attention_mask=inputs.attention_mask.to('cuda'),
        max_new_tokens=25,
        pad_token_id=tokenizer.eos_token_id
    )

    print("\n")
    print(tokenizer.decode(generate_ids[0]))
    print("\n------------------------------------------------------------")

print("\n\nAll done")



Two-Ingredient Pizza Dough: Mix flour and Greek yogurt together in a bowl; transfer to a work surface floured with self-rising flour. Knead dough, adding more flour as needed to keep dough from being too sticky, for 8 to 10 minutes. Spray a 12-inch pizza pan with cooking spray and spread dough to edges of pan. Ingredients list: 1 ½ cups self-rising flour, plus more for kneading ; 1 cup plain Greek yogurt ;   cooking spray
No-Yeast Pizza Crust: Mix flour, baking powder, and salt together in a bowl; stir in milk and olive oil until a soft dough forms. Turn dough onto a lightly floured surface and knead 10 times. Shape dough into a ball. Cover dough with an inverted bowl and let sit for 10 minutes. Roll dough into a 12-inch circle on a baking sheet. Ingredients list: 1 ⅓ cups all-purpose flour ; 1 teaspoon baking powder ; ½ teaspoon salt ; ½ cup fat-free milk ; 2 tablespoons olive oil

Question: What two ingredients do I need for two-ingredient pizza dough?
Answer: 1.5 cups self-rising 