# Demo parts of Recipe Recommender

Contents:
* Get sample of recipe data from S3
* Show test request
* Get model endpoint (most recent version from group)
* Create request embedding
* Get top 5 results from Vector DB

Note: this demo does not cover every part of our end-to-end (E2E) system. It highlights where our data is stored, what information we add along the way, and the components that get us to our result.

### Getting the data

Our primary dataset is a list of recipes and associated information. We got this [dataset from Kaggle](https://www.kaggle.com/datasets/irkaal/foodcom-recipes-and-reviews), which includes over 500,000 recipes. To keep the system manageable while we test the design, we chose to use a subset of 1,000 recipes.

As part of our system, the raw data is uploaded to S3 and a Lambda function handles all of our data preprocessing. A sample of our final recipe data is below.

In [1]:
import pandas as pd

uri = "s3://cs401r-mlops-final/preprocessed-data/preprocessed_data.csv"
full_data = pd.read_csv(uri)

full_data.head()


| Unnamed: 0 | RecipeId | Name | AuthorId | AuthorName | CookTime | PrepTime | TotalTime | DatePublished | Description | Images | ... | SodiumContent | CarbohydrateContent | FiberContent | SugarContent | ProteinContent | RecipeServings | RecipeYield | RecipeInstructions | EmbeddingSentence | AverageRating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 38 | Low-Fat Berry Blue Frozen Dessert | 1533 | Dancer | 24.0 | 0.75 | 24.75 | 1999-08-09T21:46:00Z | Make and share this Low-Fat Berry Blue Frozen ... | ['https://img.sndimg.com/food/image/upload/w_5... | ... | 29.8 | 37.1 | 3.6 | 30.2 | 3.2 | 4.0 | | ['Toss 2 cups berries with sugar.', 'Let stand... | Dessert Low Protein Low Cholesterol Healthy Fr... | 4.25 |
| 1 | 39 | Biryani | 1567 | elly9812 | 0.416667 | 4.0 | 4.416667 | 1999-08-29T13:12:00Z | Make and share this Biryani recipe from Food.com. | ['https://img.sndimg.com/food/image/upload/w_5... | ... | 368.4 | 84.4 | 9.0 | 20.4 | 63.4 | 6.0 | | ['Soak saffron in warm milk for 5 minutes and ... | Chicken Thigh & Leg Chicken Poultry Meat Asian... | 3.0 |
| 2 | 40 | Best Lemonade | 1566 | Stephen Little | 0.083333 | 0.5 | 0.583333 | 1999-09-05T19:52:00Z | This is from one of my first Good House Keepi... | ['https://img.sndimg.com/food/image/upload/w_5... | ... | 1.8 | 81.5 | 0.4 | 77.2 | 0.3 | 4.0 | | ['Into a 1 quart Jar with tight fitting lid, p... | Low Protein Low Cholesterol Healthy Summer < 6... | 4.333333 |
| 3 | 41 | Carina's Tofu-Vegetable Kebabs | 1586 | Cyclopz | 0.333333 | 24.0 | 24.333333 | 1999-09-03T14:54:00Z | This dish is best prepared a day in advance to... | ['https://img.sndimg.com/food/image/upload/w_5... | ... | 1558.6 | 64.2 | 17.3 | 32.1 | 29.3 | 2.0 | 4 kebabs | ['Drain the tofu, carefully squeezing out exce... | Beans Vegetable Low Cholesterol Weeknight Broi... | 4.5 |
| 4 | 42 | Cabbage Soup | 1538 | Duckie067 | 0.5 | 0.333333 | 0.833333 | 1999-09-19T06:19:00Z | Make and share this Cabbage Soup recipe from F... | "https://img.sndimg.com/food/image/upload/w_55... | ... | 959.3 | 25.1 | 4.8 | 17.7 | 4.3 | 4.0 | | ['Mix everything together and bring to a boil.... | Low Protein Vegan Low Cholesterol Healthy Wint... | 2.666667 |


Our system also depends on user information and requests. For the purposes of this demo, we leave out the user profile information, which is primarily used to filter out recipes that contain the user's allergens so they are never considered. Below is a test request for a recipe.

In [6]:
import boto3
import json

bucket = "cs401r-mlops-final"
json_key = "raw-data/test_request.txt"

s3 = boto3.client("s3")

json_response = s3.get_object(Bucket=bucket, Key=json_key)
json_body = json_response['Body'].read().decode('utf-8')
dict_request = json.loads(json_body)

query = str(dict_request.get('request'))

In [4]:
print(dict_request)

{'user': {'username': 'josh_phelps23', 'allergies': ['peanuts', 'shellfish'], 'likes': ['chicken', 'avocado', 'rice'], 'dislikes': ['mushrooms', 'blue cheese'], 'macros': {'protein': 'high', 'carbs': 'medium', 'fats': 'low'}, 'recipes_tried': [{'recipe_id': 23891, 'rating': '5'}, {'recipe_id': 10485, 'rating': '3'}, {'recipe_id': 78652, 'rating': '1'}]}, 'request': {'ingredients_available': ['chicken', 'brown rice', 'broccoli', 'garlic'], 'max_time_minutes': 30, 'meal_type': 'Dinner', 'preferences': {'spice_level': 'medium', 'diet_type': 'gluten-free'}}}
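
The allergy filtering mentioned earlier is not part of this demo. As a rough illustration only, the sketch below shows one way such a filter could work on the `allergies` list from the request. The function name `filter_allergens`, the recipe dictionaries, and the crude plural handling are all assumptions made for this example and are not taken from our actual pipeline.

```python
def filter_allergens(recipes, allergies):
    """Return the recipes whose ingredients mention none of the allergens."""
    # Crude normalization (an assumption for this sketch): lowercase and drop
    # a trailing "s" so that "peanuts" also matches "peanut butter".
    terms = []
    for allergen in allergies:
        term = allergen.lower()
        terms.append(term[:-1] if term.endswith("s") else term)

    safe = []
    for recipe in recipes:
        ingredients = [ingredient.lower() for ingredient in recipe["ingredients"]]
        # Keep the recipe only if no allergen term appears in any ingredient.
        if not any(term in ingredient for term in terms for ingredient in ingredients):
            safe.append(recipe)
    return safe


# Hypothetical recipes, hand-written for this example.
sample_recipes = [
    {"id": "a", "ingredients": ["chicken", "brown rice", "garlic"]},
    {"id": "b", "ingredients": ["peanut butter", "noodles"]},
    {"id": "c", "ingredients": ["shrimp", "shellfish stock"]},
]

safe_ids = [r["id"] for r in filter_allergens(sample_recipes, ["peanuts", "shellfish"])]
print(safe_ids)
```

Substring matching like this is simplistic: it would not catch "shrimp" as shellfish, so a real filter would likely need an allergen-to-ingredient mapping.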


### Setting up Model & Embedding

We use our model to create an embedding for the incoming request. We chose the model by scoring candidates on two factors: how similar their top result is to the incoming request, and how long they take to find that result. We included the time factor so that requests are still processed quickly as we scale our system.
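
To make the similarity-versus-time trade-off concrete, here is a minimal sketch of a combined selection score. The formula, the `time_weight` value, the candidate names, and all numbers are invented for illustration; they are not our measured results or the exact scoring we used.

```python
def selection_score(similarity, latency_seconds, time_weight=0.1):
    """Higher is better: reward similarity, penalize slow responses."""
    return similarity - time_weight * latency_seconds


# Made-up candidates and numbers, for illustration only.
candidates = {
    "model-a": {"similarity": 0.82, "latency_seconds": 0.40},
    "model-b": {"similarity": 0.85, "latency_seconds": 1.50},
}

best = max(candidates, key=lambda name: selection_score(**candidates[name]))
print(best)
```

With these made-up numbers the slightly less accurate but much faster candidate wins, which is the kind of outcome the time factor is meant to allow.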

We register our model in the group "recipe-recommender-models" and use the current version as the endpoint for our UI. The endpoint generates the embedding that is compared against the recipes.

In [None]:
!pip install sentence-transformers
!pip install pinecone

In [None]:
from monitoring.model_management import GetEndpointName
group_name = "recipe-recommender-models"
endpoint_name = GetEndpointName(group_name)
print(endpoint_name)

all-MiniLM-L6-v2-endpoint-12
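
`GetEndpointName` is a helper from our own codebase and its implementation is not shown here. As a hedged sketch of the "most recent version from group" idea, the function below picks the newest entry from a list shaped like the `ModelPackageSummaryList` field returned by SageMaker's `list_model_packages`. It assumes the group is a SageMaker model package group. The sample entries and ARNs are hand-written placeholders.

```python
def latest_version(summaries):
    """Return the summary with the highest ModelPackageVersion."""
    if not summaries:
        raise ValueError("model group has no registered versions")
    return max(summaries, key=lambda summary: summary["ModelPackageVersion"])


# Hand-written stand-in for a ModelPackageSummaryList (placeholder ARNs).
sample_summaries = [
    {"ModelPackageVersion": 11, "ModelPackageArn": "arn:example:11"},
    {"ModelPackageVersion": 12, "ModelPackageArn": "arn:example:12"},
    {"ModelPackageVersion": 3, "ModelPackageArn": "arn:example:3"},
]

newest = latest_version(sample_summaries)
print(newest["ModelPackageVersion"])
```

Against a real account, the list could come from `boto3.client("sagemaker").list_model_packages(ModelPackageGroupName=group_name)["ModelPackageSummaryList"]`. How a version maps to an endpoint name is specific to our helper and is not covered by this sketch.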


In [7]:
import torch

request = json.dumps({"inputs": query})
client = boto3.client("sagemaker-runtime")

response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Body=request
)

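# The endpoint returns one embedding per token; drop the batch dimension
embedding = json.loads(response["Body"].read().decode("utf-8"))
token_embeddings = torch.tensor(embedding).squeeze()
# Max-pool over the token axis to get a single sentence-level vector
sentence_embedding = token_embeddings.max(dim=0)[0].detach().tolist()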

print(len(sentence_embedding))

384
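
The pooling step in the cell above turns a tokens-by-384 matrix into a single 384-dimensional vector by taking the element-wise maximum over tokens. The sketch below reproduces that idea in plain Python on a tiny hand-written matrix and contrasts it with mean pooling, another common choice. The toy values and the helper names `max_pool` and `mean_pool` are assumptions for illustration.

```python
def max_pool(token_embeddings):
    """Element-wise maximum over the token axis."""
    return [max(column) for column in zip(*token_embeddings)]


def mean_pool(token_embeddings):
    """Element-wise average over the token axis."""
    count = len(token_embeddings)
    return [sum(column) / count for column in zip(*token_embeddings)]


# Two tokens with three dimensions each (toy values, not real model output).
toy_tokens = [
    [0.1, -0.5, 0.3],
    [0.4, 0.2, -0.1],
]

print(max_pool(toy_tokens))
print(mean_pool(toy_tokens))
```

Whichever pooling is used, the recipe vectors stored in the vector database need to be produced the same way as the request vector, otherwise the similarity scores are not comparable.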


### Sending Request

After generating the request embedding, we find the closest matches in our recipe database. An example of the UI that would normally be used to create and send the request is shown below:

![Example of UI input screen](input_example.png "")

The vector database stores each embedding under its recipe ID and includes key features in the metadata, such as the recipe name, ingredient list, and instructions. Here we'll get the top 5 recipes related to our request and print their IDs, including the ID of the top match.

In [13]:
import os

from pinecone import Pinecone

# Read the Pinecone API key from the environment rather than hardcoding it
PINECONE_API_KEY = os.environ["PINECONE_API_KEY"]
pc = Pinecone(api_key=PINECONE_API_KEY)

index = pc.Index("recipe-recommendations")
query_vector = sentence_embedding

result = index.query(
    vector=query_vector,
    top_k=5,
    include_metadata=True,
    # include_values=True
)

matches = result.get('matches')

# Collect the IDs of all returned matches (at most top_k of them)
ids = [match.get('id') for match in matches]

# The first match is the closest one
recipe_id = matches[0].get('id')

In [15]:
print(f"Top Recipe ID: {recipe_id}")
print(f"All top recipe IDs: {ids}")

Top Recipe ID: 316
All top recipe IDs: ['316', '451', '525', '667', '1857']
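
For intuition about what the query does, here is a brute-force sketch of a top-k nearest-neighbor search using cosine similarity. It assumes cosine similarity as the metric, since the metric configured on our index is not stated in this demo, and it scans every vector, whereas a vector database uses its own indexing internally. The toy vectors and IDs are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Return the k (id, score) pairs with the highest similarity to query."""
    scored = [(rid, cosine_similarity(query, vec)) for rid, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy two-dimensional "index" (real embeddings here have 384 dimensions).
toy_index = {
    "r1": [1.0, 0.0],
    "r2": [0.0, 1.0],
    "r3": [1.0, 1.0],
}

toy_matches = top_k([1.0, 0.1], toy_index, k=2)
top_ids = [rid for rid, _ in toy_matches]
print(top_ids)
```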


### Returning Recipe Info

The final data returned to our UI needs to be in a format that is easy for users to follow and use for their meals! Below we show the raw output for our top match; our UI presents these results in a much cleaner layout, as in this example:

![Example of UI Output](output_example.png)

In [22]:
# NOTE: this is a different recipe than the one in the UI screenshot above, because the screenshot came from a different request!
result.get('matches')[0]

{'id': '316',
 'metadata': {'amounts': "['olive oil', 'olive oil', 'onion', 'garlic', 'long "
                         "grain white rice', 'cumin', 'canned chicken broth', "
                         "'currants', 'pine nuts', 'fresh parsley', 'of fresh "
                         "mint', 'grape leaves', 'plain yogurt', 'of fresh "
                         "mint', 'garlic', 'lemon wedge']",
              'cook_time': '1.5166666666666666',
              'description': 'Make and share this Dolmades with Yogurt-Mint '
                             'Sauce recipe from Food.com.',
              'ingredients': "['1/4', '3', '1', '3', '1', '1 1/2', '4', '1/3', "
                             "'1/3', '1/3', '1/4', '1', '1', '1/4', '1']",
              'instructions': "['FOR GRAPE LEAVES:', 'Heat 1/4 cup oil in "
                              'heavy medium saucepan over medium low heat. '
                              'Add  onion and garlic. Saute until very tender, '
                              "a

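The metadata fields in the raw match are stored as stringified Python lists, so they need to be parsed before the UI can display them. The sketch below shows one possible way to do that with `ast.literal_eval`. The helper names and the sample metadata are hand-written for illustration and are not our UI code. It also assumes, based on the printed match, that `amounts` holds the ingredient names and `ingredients` holds the quantities.

```python
import ast


def parse_list_field(value):
    """Turn a stringified list such as "['a', 'b']" into a real list."""
    parsed = ast.literal_eval(value)
    if not isinstance(parsed, list):
        raise ValueError("expected a list literal")
    return parsed


def format_ingredients(metadata):
    """Pair each quantity with an ingredient name, position by position."""
    names = parse_list_field(metadata["amounts"])
    quantities = parse_list_field(metadata["ingredients"])
    # zip stops at the shorter list, so unmatched trailing items are dropped.
    return [f"{quantity} {name}" for quantity, name in zip(quantities, names)]


# Hand-written sample shaped like the metadata above (not real output).
sample_metadata = {
    "amounts": "['rice', 'garlic']",
    "ingredients": "['1', '3']",
}

ingredient_lines = format_ingredients(sample_metadata)
print(ingredient_lines)
```

In the printed match the two lists are not the same length, so purely positional pairing like this may misalign entries and would need checking before use.
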
And that's the end of our demo! We hope this shows you some of the basics of how we created our recipe recommender system!