<a href="https://colab.research.google.com/github/davidyu8/gouda-group-project/blob/main/find_recipe.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Recipe Recommender ™

## Preparing the Data

In [167]:
import json
import pandas as pd
import sqlite3
import numpy as np


# set up data set (this is the smaller one, with about 40,000 recipes)
with open("recipes_raw/recipes_raw_nosource_ar.json") as f:
    data = json.load(f)
df = pd.DataFrame(data)
df = df.T

# can also use larger data set, just put database in working directory
conn = sqlite3.connect("recipes1M.db")


In [168]:
df.head()

Unnamed: 0,title,ingredients,instructions,picture_link
rmK12Uau.ntP510KeImX506H6Mr6jTu,Slow Cooker Chicken and Dumplings,"[4 skinless, boneless chicken breast halves AD...","Place the chicken, butter, soup, and onion in ...",55lznCYBbs2mT8BTx6BTkLhynGHzM.S
5ZpZE8hSVdPk2ZXo1mZTyoPWJRSCPSm,Awesome Slow Cooker Pot Roast,[2 (10.75 ounce) cans condensed cream of mushr...,"In a slow cooker, mix cream of mushroom soup, ...",QyrvGdGNMBA2lDdciY0FjKu.77MM0Oe
clyYQv.CplpwJtjNaFGhx0VilNYqRxu,Brown Sugar Meatloaf,"[1/2 cup packed brown sugar ADVERTISEMENT, 1/2...",Preheat oven to 350 degrees F (175 degrees C)....,LVW1DI0vtlCrpAhNSEQysE9i/7rJG56
BmqFAmCrDHiKNwX.IQzb0U/v0mLlxFu,Best Chocolate Chip Cookies,"[1 cup butter, softened ADVERTISEMENT, 1 cup w...",Preheat oven to 350 degrees F (175 degrees C)....,0SO5kdWOV94j6EfAVwMMYRM3yNN8eRi
N.jCksRjB4MFwbgPFQU8Kg.yF.XCtOi,Homemade Mac and Cheese Casserole,[8 ounces whole wheat rotini pasta ADVERTISEME...,Preheat oven to 350 degrees F. Line a 2-quart ...,YCnbhplMgiraW4rUXcybgSEZinSgljm


_note from Colby_:

To avoid a ton of for loops, I reshaped the ingredients column a bit using the `join` method. It combines all the elements of a list into a single string, separated by a comma and a space, so each recipe's ingredients can be searched as one string instead of looping over a nested list.
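As a small illustration of the reshaping described above, here is a sketch on a made-up two-row frame. The `toy` frame and its contents are assumptions for illustration, not rows from the dataset.

```python
import pandas as pd

# hypothetical toy data standing in for the real ingredients column
toy = pd.DataFrame({
    "ingredients": [
        ["1 cup butter", "1 cup white sugar"],
        ["4 pork chops", "1/2 cup flour"],
    ]
})

# each list of strings becomes one comma-separated string
toy["ingredients"] = toy["ingredients"].str.join(", ")

# substring checks now work directly on each row, with no inner loop over a list
has_butter = toy["ingredients"].apply(lambda x: "butter" in x)
```

After the join, the first row reads `1 cup butter, 1 cup white sugar`, and `has_butter` is `True` only for that row.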

In [169]:
# cleaning up and preparing the data

# create the Score column to track matching recipes
df["Score"] = 0

# reshape ingredients column from a list into a single string
df["ingredients"] = df["ingredients"].str.join(', ')

# clean up row names, drop NaN values
df = df.reset_index(drop = True)
df = df.dropna()

In [170]:
df.head()

Unnamed: 0,title,ingredients,instructions,picture_link,Score
0,Slow Cooker Chicken and Dumplings,"4 skinless, boneless chicken breast halves ADV...","Place the chicken, butter, soup, and onion in ...",55lznCYBbs2mT8BTx6BTkLhynGHzM.S,0
1,Awesome Slow Cooker Pot Roast,2 (10.75 ounce) cans condensed cream of mushro...,"In a slow cooker, mix cream of mushroom soup, ...",QyrvGdGNMBA2lDdciY0FjKu.77MM0Oe,0
2,Brown Sugar Meatloaf,"1/2 cup packed brown sugar ADVERTISEMENT, 1/2 ...",Preheat oven to 350 degrees F (175 degrees C)....,LVW1DI0vtlCrpAhNSEQysE9i/7rJG56,0
3,Best Chocolate Chip Cookies,"1 cup butter, softened ADVERTISEMENT, 1 cup wh...",Preheat oven to 350 degrees F (175 degrees C)....,0SO5kdWOV94j6EfAVwMMYRM3yNN8eRi,0
4,Homemade Mac and Cheese Casserole,8 ounces whole wheat rotini pasta ADVERTISEMEN...,Preheat oven to 350 degrees F. Line a 2-quart ...,YCnbhplMgiraW4rUXcybgSEZinSgljm,0


## Implementing the Function

_note from Colby_:

I already worked out a recipe function for the smaller dataset, so I started a brand new one for the bigger dataset that queries the database instead. I'm keeping both for documentation, and so we can see how to combine the best aspects of each. The second function is still a bit broken at the moment though :/
I'm not sure how to use the LIKE keyword to match several different ingredients in one query, so for now the second function can only look for 1-ingredient matches.
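One possible way to handle the multi-ingredient LIKE question, sketched under assumptions: in SQLite a comparison such as `ingredients LIKE ?` evaluates to 1 or 0, so adding one such term per ingredient gives a score much like the one used for the smaller dataset. The helper name `build_match_query` and the in-memory table below are hypothetical; only the `recipes` table name and its `title` and `ingredients` columns come from the notebook.

```python
import sqlite3

def build_match_query(ingredients, min_score=1):
    """Hypothetical helper: build a scored LIKE query plus its bound parameters."""
    # one "(ingredients LIKE ?)" term per ingredient; in SQLite each term is 1 or 0
    score_expr = " + ".join("(ingredients LIKE ?)" for _ in ingredients)
    query = f"""
        SELECT title, ingredients, Score
        FROM (SELECT title, ingredients, {score_expr} AS Score FROM recipes)
        WHERE Score >= ?
        ORDER BY Score DESC
    """
    # wrap each ingredient in % wildcards; values are bound, never pasted into the SQL
    params = [f"%{ingr}%" for ingr in ingredients] + [min_score]
    return query, params

# made-up in-memory table standing in for recipes1M.db
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recipes (title TEXT, ingredients TEXT)")
conn.executemany(
    "INSERT INTO recipes VALUES (?, ?)",
    [
        ("Pesto Chicken Pasta", '["1/2 pound linguine", "1 jar pesto", "2 chicken breasts"]'),
        ("Pork Chops", '["4 pork chops", "1 tablespoon olive oil"]'),
        ("Tomato Soup", '["4 tomatoes", "1 onion"]'),
    ],
)

# recipes matching at least 2 of the 3 ingredients
query, params = build_match_query(["chicken", "pesto", "pork"], min_score=2)
rows = conn.execute(query, params).fetchall()
conn.close()
```

The same `query` and `params` should also work with `pd.read_sql_query(query, conn, params=params)`. The sketch assumes a non-empty ingredient list, since an empty one would produce invalid SQL.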

In [205]:
def find_recipe_1(ingredients, min_score = 1):
    """
    A function that recommends a recipe to cook based on the user's available ingredients. Uses the smaller dataset.
    
    ingredients: a list of ingredients supplied as strings
    min_score: the minimum number of ingredient matches a recipe has to satisfy in order to be recommended
    returns: a portion of the original dataframe only containing recipes that the user may want to cook
    """
    
    # reset the Score column every time the function is called
    df["Score"] = 0
    
    # iterate through list of input ingredients
    for ingr in ingredients:
      
        # increment score by 1 every time the matching ingredient name is found in a recipe
        df["Score"] += df["ingredients"].str.contains(ingr, regex=False)
  
    # return recipes in which the minimum score is satisfied
    return df[df["Score"] >= min_score]
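A design note on the function above: it rewrites the global `df` on every call, and the substring match is case-sensitive. A possible variant, sketched here on made-up data, keeps the scores in a local Series, matches case-insensitively, and sorts the best matches first. The name `score_recipes` and the `toy` frame are assumptions for illustration only.

```python
import pandas as pd

def score_recipes(recipes, ingredients, min_score=1):
    """Hypothetical variant of find_recipe_1 that does not modify its input."""
    # start every recipe at zero matches
    score = pd.Series(0, index=recipes.index)
    for ingr in ingredients:
        # plain (non-regex), case-insensitive substring match
        hits = recipes["ingredients"].str.contains(ingr, case=False, regex=False)
        score += hits.astype(int)
    # attach the scores to a copy, keep rows meeting the threshold, best first
    result = recipes.assign(Score=score)
    return result[result["Score"] >= min_score].sort_values("Score", ascending=False)

# made-up rows in the same shape as the cleaned dataframe
toy = pd.DataFrame({
    "title": ["Pesto Pasta", "Pork Chops", "Fruit Salad"],
    "ingredients": [
        "1/2 pound linguine, 1 jar pesto, 2 chicken breasts",
        "4 Pork chops, 1 tablespoon olive oil",
        "2 apples, 1 banana",
    ],
})
picked = score_recipes(toy, ["chicken", "pesto", "pork"])
```

Here `picked` holds the pasta recipe (two matches) followed by the pork chops (one match), and `toy` itself gains no Score column.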

In [214]:
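def find_recipe_2(ingredients, min_score = 1):
    """
    Like find_recipe_1, but queries the bigger dataset in recipes1M.db.
    
    ingredients: a single ingredient supplied as a string (multi-ingredient matching is not implemented yet)
    min_score: currently unused, kept so the signature matches find_recipe_1
    returns: a dataframe of recipes whose ingredients contain the given string
    """
    
    # open up the database; a sqlite3 connection used in a with-block only
    # commits or rolls back, so close it explicitly when done
    conn = sqlite3.connect("recipes1M.db")
    try:
        
        # grab ingredient matches (only works with one ingredient right now);
        # the ? placeholder binds the search string instead of pasting it into the SQL
        query = """
        SELECT R.title, R.ingredients, R.url
        FROM recipes R
        WHERE R.ingredients LIKE ?
        """
        
        # query database
        matches = pd.read_sql_query(query, conn, params=(f"%{ingredients}%",))
    finally:
        conn.close()
    
    # return matching recipes
    return matches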

## Testing it Out

In [206]:
find_recipe_1(["chicken"]).head() # first five recipes where chicken is used

Unnamed: 0,title,ingredients,instructions,picture_link,Score
0,Slow Cooker Chicken and Dumplings,"4 skinless, boneless chicken breast halves ADV...","Place the chicken, butter, soup, and onion in ...",55lznCYBbs2mT8BTx6BTkLhynGHzM.S,1
9,Singapore Chili Crabs,"Sauce: ADVERTISEMENT, 1/2 cup ketchup ADVERTIS...","Whisk ketchup, chicken broth, egg, soy sauce, ...",OFp6yXFwzlrkMQ5STffYPllxQvMVLUS,1
20,Delicious Ham and Potato Soup,3 1/2 cups peeled and diced potatoes ADVERTISE...,"Combine the potatoes, celery, onion, ham and w...",Ve83fJ5ulFEOFoIjWsgDl8Ro56GAtby,1
21,Chicken Pot Pie IX,"1 pound skinless, boneless chicken breast halv...",Preheat oven to 425 degrees F (220 degrees C.)...,39ec69lJMInLCpNnHSASWBMhWBjpd5i,1
28,Ang's Balsamic Maple Chicken,"2 tablespoons maple syrup ADVERTISEMENT, 1 tab...","Whisk maple syrup, balsamic vinegar, garlic po...",sf9JzBcRWe9QjxqeXANbq/Uotw707.q,1


In [207]:
# recipes that use 4 or more of the ingredients below
find_recipe_1(["chicken", "pesto", "pork", "linguine", "tomato", "mushroom"], min_score = 4).head()

Unnamed: 0,title,ingredients,instructions,picture_link,Score
2352,Pesto Cream Sauce,1 (16 ounce) package linguine pasta ADVERTISEM...,Bring a large pot of lightly salted water to a...,EiKd3QgOIy32MreDpvxlJZop1I6RYrq,4
3511,Bolognese Sauce,"2 tablespoons olive oil ADVERTISEMENT, 4 slice...","In a large skillet, warm oil over medium heat ...",7cvE0A6paIcqgXkjwcXrolny7Rptj.y,4
6590,Spence's Pesto Chicken Pasta,"1/2 pound linguine pasta ADVERTISEMENT, 1 (8 o...",Fill a large pot with lightly salted water and...,J5shZmliz5dEC619zPYPoyEGXIX1G3S,4
8075,Herbed Chicken Pasta,"1 pound uncooked linguine ADVERTISEMENT, 2 tea...",Cook pasta in about 4 quarts of boiling salted...,afZJdecFNXTXnrO8BLekSs76rFN0eii,4
8198,Zucchini and Pork Soup,"4 pork chops ADVERTISEMENT, 1/2 cup all-purpos...",Place flour in a resealable plastic bag. Add p...,EQ8arB0P9ouk3HJGtY.J5DJ2rmctz4C,4


In [208]:
# recipes that use 4 or more of the ingredients below
find_recipe_1(["oysters", "clam", "tomato", "lemon", "scallop", "fish", "squid"], min_score = 4)

Unnamed: 0,title,ingredients,instructions,picture_link,Score
9992,Linguine Pescadoro,1 (16 ounce) package linguini pasta ADVERTISEM...,In a large pot of boiling salted water cook li...,CrLeGbg/vdFL7WLpKgag4/kQrv/2Lme,4
29623,Ivan's Mega Frutti Di Mare,1 1/2 (16 ounce) packages linguine pasta ADVER...,Fill a large pot with lightly salted water and...,qfy8U7GdYXEnnkjqrJAJ/6NcIJ.LbK2,5
30392,Authentic Seafood Paella,"2 tablespoons olive oil ADVERTISEMENT, 1 onion...",Heat olive oil in a large skillet or paella pa...,lq2X0br9s0kSC5Mmas4f2UfOL2/4WK6,4
31993,Mediterranean Seafood Medley,"20 baby squid (tubes and tentacles), cleaned A...",Soak squid in milk for 1 to 5 hours; the longe...,Ij/xrJa9S/GiMLS.eNZpNVI5N3EFJsu,4


In [209]:
# recipes that use 5 or more of the ingredients below
find_recipe_1(["pita", "beef", "yogurt", "cucumber", "dill"], min_score = 5)

Unnamed: 0,title,ingredients,instructions,picture_link,Score
19539,Beef Gyro,2 (8 ounce) containers plain yogurt ADVERTISEM...,"Blend yogurt, cucumbers, 2 tablespoons olive o...",he3rkjm/q1ZJz62lLhD2iIi5mLLDkdq,5


In [215]:
# only able to match 1 ingredient
find_recipe_2("pork")

Unnamed: 0,title,ingredients,url
0,Fennel-Rubbed Pork Tenderloin with Roasted Fen...,"[""1 teaspoon fennel seeds"", ""1 pound pork tend...",http://www.epicurious.com/recipes/food/views/f...
1,Sausage and marmalade plait recipe,"[""750 g (26.5oz) pork sausage meat"", ""1 onion,...",http://www.lovefood.com/guide/recipes/27146/sa...
2,My Family's Meat Stuffing for a Turkey,"[""2 12 lbs ground beef, your choice of cut"", ""...",http://www.food.com/recipe/my-familys-meat-stu...
3,Calico Beans,"[""1 lb ground beef"", ""12 lb bacon"", ""1 cup oni...",http://www.food.com/recipe/calico-beans-22876
4,Cheese Stuffed Pork Roast W/ Cream Sauce,"[""1 tablespoon all-purpose flour"", ""14 teaspoo...",http://www.food.com/recipe/cheese-stuffed-pork...
...,...,...,...
39772,Can't Wait Tailgate BBQ Tacos Recipe alender,"[""1 pkg. angelhair or finely shredded cole sla...",http://www.chowhound.com/recipes/wait-tailgate...
39773,Pork Chop 'n' Rice Casserole,"[""4 pork loin chops, bone in (about 1-1/2 poun...",http://www.food.com/recipe/pork-chop-n-rice-ca...
39774,Breakfast Enchiladas,"[""1 ENCHILADA FILLING"", ""2 lb hot ground pork ...",https://cookpad.com/us/recipes/349310-breakfas...
39775,Real Italian Meatballs,"[""1 lb ground beef"", ""1 lb ground pork"", ""23 c...",http://www.food.com/recipe/real-italian-meatba...
