# Project: Recipe Recommendations Based on Grocery Store Promotions

#### Notebook 1: 
This notebook covers the following: (1) web scraping the Food Network's website for recipes, and (2) cleaning the dataframe after web scraping.

---------------------------------------------------------------------------------------------------------------------------

# SECTION 1 OF NOTEBOOK 1: 
# WEB SCRAPING THE FOOD NETWORK'S WEBSITE FOR RECIPES

For this project, I obtained recipes from the following 10 Food Network chefs: Bobby Flay, Ina Garten, Geoffrey Zakarian, Molly Yeh, Giada De Laurentiis, Alex Guarnaschelli, Scott Conant, Brooke Williamson, Michael Symon and Michael Voltaggio.

Each of these chefs has contributed a significant number of recipes (breakfast, cocktails, dinner, side dishes, etc.) to the Food Network's website. I scraped the website to obtain all the recipes that these chefs either created or that another chef shared with them on their television show.

To scrape these recipes, the following steps have been completed (see the sections below):
1. Obtained all the webpages that contain the recipes for each chef (list of webpages);
2. Obtained all the URLs for every recipe listed on each chef's webpages;
3. Extracted all the recipe info (title, ingredients, time, level, etc.) from each URL;
4. Created a dataframe for each chef with all their recipes.

## 1. Imports

In [3]:
import requests
from bs4 import BeautifulSoup as bs
import pandas as pd
from ast import literal_eval
import nltk
import re

## 2. Food Network Website Scraping

### 2.1 Obtaining the Recipe Pages (containing the URLs) for Each Chef

In [618]:
# Chef Bobby Flay
urls_bf=[]
for i in range(1,337): # pages 1 to 336 of the search results
    urls_bf.append("https://www.foodnetwork.com/search/bobby-flay-recipes-/p/{}/CUSTOM_FACET:RECIPE_FACET".format(i))
#urls_bf

In [322]:
# Chef : Ina Garten
urls_ig=[]
for i in range(1,148):
    urls_ig.append("https://www.foodnetwork.com/search/ina-garten-recipes-/p/{}/CUSTOM_FACET:RECIPE_FACET".format(i))

# Chef: Geoffrey Zakarian
urls_gz=[]
for i in range(1,73):
    urls_gz.append("https://www.foodnetwork.com/search/geoffrey-zakarian-/p/{}/CUSTOM_FACET:RECIPE_FACET".format(i))

# Chef : Molly Yeh
urls_my=[]
for i in range(1,70):
    urls_my.append("https://www.foodnetwork.com/search/molly-yeh-recipes-/p/{}/CUSTOM_FACET:RECIPE_FACET".format(i))

# Chef : Giada De Laurentiis
urls_gd=[]
for i in range(1,236):
    urls_gd.append("https://www.foodnetwork.com/search/giada-/p/{}/CUSTOM_FACET:RECIPE_FACET".format(i))

# Chef : Alex Guarnaschelli
urls_ag=[]
for i in range(1,54):
    urls_ag.append("https://www.foodnetwork.com/search/alex-guarnaschelli-/p/{}/CUSTOM_FACET:RECIPE_FACET".format(i))

# Chef : Scott Conant
urls_sc=[]
for i in range(1,8):
    urls_sc.append("https://www.foodnetwork.com/search/scott-conant-/p/{}/CUSTOM_FACET:RECIPE_FACET".format(i))

# Chef : Brooke Williamson
urls_bw=[]
for i in range(1,3):
    urls_bw.append("https://www.foodnetwork.com/search/brooke-williamson-/p/{}/CUSTOM_FACET:RECIPE_FACET".format(i))

# Chef : Michael Symon
urls_ms=[]
for i in range(1,64):
    urls_ms.append("https://www.foodnetwork.com/search/michael-symon-/p/{}/CUSTOM_FACET:RECIPE_FACET".format(i))

# Chef : Michael Voltaggio
urls_mv=[]
for i in range(1,7):
    urls_mv.append("https://www.foodnetwork.com/search/michael-voltaggio-/p/{}/CUSTOM_FACET:RECIPE_FACET".format(i))
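The ten loops above differ only in the search slug and the page count, so they can be folded into one helper. The following is a minimal sketch, not the notebook's own code: `SEARCH_URL`, `build_page_urls` and `last_page_by_slug` are hypothetical names, and only three chefs are listed for brevity.

```python
# Hypothetical helper; the URL pattern is copied from the cells above.
SEARCH_URL = "https://www.foodnetwork.com/search/{slug}/p/{page}/CUSTOM_FACET:RECIPE_FACET"


def build_page_urls(slug, last_page):
    """Return the search-result URLs for pages 1..last_page of one chef."""
    return [SEARCH_URL.format(slug=slug, page=page) for page in range(1, last_page + 1)]


# Slug -> last page number. range(1, 8) above covers pages 1 to 7,
# so Scott Conant maps to 7, and so on.
last_page_by_slug = {
    "scott-conant-": 7,
    "brooke-williamson-": 2,
    "michael-voltaggio-": 6,
}

page_urls = {slug: build_page_urls(slug, n) for slug, n in last_page_by_slug.items()}
print(page_urls["brooke-williamson-"])
```

Keeping the page counts in one dictionary makes it easier to update them when the site adds or removes result pages.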

### 2.2 Scraping Every Recipe URL Associated with Each Chef
- Each webpage has approximately 5-10 recipe URLs

#### 2.2.1. Bobby Flay 

In [17]:
all_recipes=[]
for url in urls_bf:
    r=requests.get(url)
    if r.status_code != 200:
        print("CAN'T MOVE FORWARD")
        break
    else: 
        soup=bs(r.content,'html.parser')
        recipes= soup.find_all(class_='o-RecipeResult o-ResultCard')

        for recipe in recipes:
            links = recipe.find_all('a', href=True)
            for link in links:
                all_recipes.append((link['href']))

In [185]:
# Some recipe URLs appear two or three times; keep only the unique ones
unique_list_bf= list(set(all_recipes))
len(unique_list_bf)

2721
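`list(set(...))` removes the duplicates, but a set does not guarantee any particular order, so the position of a given URL can change if the notebook is rerun in a new session. Since later cells index into this list by position, an order-preserving alternative may be easier to reproduce. A minimal sketch, where `unique_in_order` is a hypothetical helper and the sample links are made up:

```python
def unique_in_order(items):
    """Drop duplicates while keeping the first occurrence of each item."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(items))


# Made-up protocol-relative links, in the style returned by the scraper.
sample = [
    "//www.foodnetwork.com/recipes/a",
    "//www.foodnetwork.com/recipes/b",
    "//www.foodnetwork.com/recipes/a",
]
print(unique_in_order(sample))
```

Within a single session the list built with `set` stays fixed, so the approach above works as written; the sketch only matters for reruns.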

#### 2.2.2. All the Other Chefs (Function to Scrape Every Recipe URL Associated with a Chef)

In [333]:
# Accepted inputs: urls_ig, urls_gz, urls_my, urls_gd, urls_ag, urls_sc, urls_bw, urls_ms, urls_mv

def url_scrapping(urls):
    """Return every recipe link found on the given search-result pages."""
    all_recipes = []

    for url in urls:
        r = requests.get(url)
        if r.status_code != 200:
            # Stop at the first page that fails to load
            print("CAN'T MOVE FORWARD")
            break

        soup = bs(r.content, 'html.parser')
        # Each result card wraps one recipe
        recipes = soup.find_all(class_='o-RecipeResult o-ResultCard')

        for recipe in recipes:
            # Keep every link in the card; duplicates are removed afterwards
            for link in recipe.find_all('a', href=True):
                all_recipes.append(link['href'])

    return all_recipes


In [339]:
mv=list(set(url_scrapping(urls_mv)))

In [342]:
ms=list(set(url_scrapping(urls_ms)))

In [343]:
bw=list(set(url_scrapping(urls_bw)))

In [344]:
sc=list(set(url_scrapping(urls_sc)))

In [345]:
ag=list(set(url_scrapping(urls_ag)))

In [358]:
my=list(set(url_scrapping(urls_my)))

In [359]:
gz=list(set(url_scrapping(urls_gz)))

In [362]:
gd=list(set(url_scrapping(urls_gd)))

In [364]:
ig=list(set(url_scrapping(urls_ig)))

### 2.3. Web Scraping the Recipes (name, ingredients, chef, level of difficulty, total time)
- Scraping is run separately for each chef, and a dataframe is created for each chef to confirm that the data downloaded properly

#### 2.3.1. Function to Scrape Each URL from Sections 2.2.1 and 2.2.2 Above

In [350]:
def recipe_scrapping(recipes):
    """Scrape each recipe page and return one list per field."""
    recipe_url = []
    chef_list = []
    title_list = []
    level_list = []
    time_list = []
    ingredients_list = []

    for index, path in enumerate(recipes):
        url = "https:" + path  # scraped links are protocol-relative
        r = requests.get(url)
        if r.status_code != 200:
            # Stop at the first page that fails to load
            print("CAN'T MOVE FORWARD")
            break

        soup = bs(r.content, 'html.parser')

        # Recipe website
        recipe_url.append(url)

        # Recipe chef
        try:
            chef = soup.find('span', class_="o-Attribution__a-Name").text
            chef_list.append(chef)
        except AttributeError:
            chef_list.append('NOT AVAILABLE')

        # Recipe title
        try:
            title = soup.find('h1', class_='o-AssetTitle__a-Headline').span.text
            title_list.append(title)
        except AttributeError:
            title_list.append('NOT AVAILABLE')

        # Recipe ingredients
        try:
            items = []  # ingredient strings for this recipe
            ingredients = soup.find_all('p', class_="o-Ingredients__a-Ingredient")
            for ingredient in ingredients:
                label = ingredient.find(class_="o-Ingredients__a-Ingredient--CheckboxLabel").text
                items.append(label)
            ingredients_list.append(items)
        except AttributeError:
            ingredients_list.append('NOT AVAILABLE')

        # Recipe level (the first description span; on some pages this is
        # the servings or the total time rather than a difficulty level)
        try:
            level = soup.find('span', class_="o-RecipeInfo__a-Description").text
            level_list.append(level)
        except AttributeError:
            level_list.append("NOT AVAILABLE")

        # Recipe total time
        try:
            total_time = soup.find('span', class_="o-RecipeInfo__a-Description m-RecipeInfo__a-Description--Total").text
            time_list.append(total_time)
        except AttributeError:
            time_list.append("NOT AVAILABLE")

        print((index/len(recipes))*100)  # to monitor the progress of web scraping

    return recipe_url, chef_list, title_list, ingredients_list, level_list, time_list
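Because `recipe_scrapping` breaks on the first response that is not a 200, a single dead link stops the run and the remaining URLs have to be scraped in a second pass. A possible alternative is to record the failures and keep going. The sketch below assumes a hypothetical `scrape_one(url)` callable that returns a dict for a good page and `None` otherwise; `scrape_all` and `BASE` are illustrative names as well.

```python
from urllib.parse import urljoin

BASE = "https://www.foodnetwork.com"


def scrape_all(paths, scrape_one):
    """Apply scrape_one to every URL, collecting failures instead of stopping."""
    rows, failed = [], []
    for path in paths:
        url = urljoin(BASE, path)  # "//host/path" inherits the https scheme
        row = scrape_one(url)      # expected: a dict, or None on failure
        if row is None:
            failed.append(url)
        else:
            rows.append(row)
    return rows, failed


# Demonstration with a fake scraper, so no network access is needed.
def fake_scrape_one(url):
    return None if url.endswith("missing") else {"Link to Recipe": url}


rows, failed = scrape_all(
    ["//www.foodnetwork.com/recipes/ok", "//www.foodnetwork.com/recipes/missing"],
    fake_scrape_one,
)
print(len(rows), "scraped;", len(failed), "failed")
```

The `failed` list can then be inspected or retried without slicing the original URL list by hand.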

#### 2.3.2 Web Scraping the Bobby Flay Recipes

In [320]:
# Bobby Flay
bf=unique_list_bf
recipe_url, chef_list, title_list, ingredients_list, level_list, time_list=recipe_scrapping(bf)

0.0
0.03675119441381845
0.0735023888276369
...
67.69570011025358
67.7324513046674

In [321]:
bobbyflay= pd.DataFrame({
        'Chef': chef_list,
        'Recipe Title': title_list,
        'Ingredients': ingredients_list,
        'Level of Difficulty': level_list,
        'Total Time Required': time_list,
        'Link to Recipe': recipe_url
        })

bobbyflay

Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,\n \n \...,Bbq Chicken Cobb Salad,"[Deselect All, 4 chicken thighs, bone in, 2 cu...",4 servings,NOT AVAILABLE,https://www.foodnetwork.com/recipes/bobby-flay...
1,\n \n \...,"Breakfast Burritos with Mocha-Rubbed Steak, Gr...","[Deselect All, 2 tablespoons canola oil, 1 med...",Intermediate,1 hr 40 min,https://www.foodnetwork.com/recipes/bobby-flay...
2,\n \n \...,Rajas Salsa,"[Deselect All, 2 roasted red and yellow bell p...",Easy,30 min,https://www.foodnetwork.com/recipes/rajas-sals...
3,\n \n \...,Grilled Corn,"[Deselect All, 8 ears of corn, Unsalted butter...",Easy,NOT AVAILABLE,https://www.foodnetwork.com/recipes/bobby-flay...
4,\n \n \...,Korean-style BBQ Short Ribs,"[Deselect All, 4 large short ribs, 2 scallions...",Easy,2 hr 50 min,https://www.foodnetwork.com/recipes/korean-sty...
...,...,...,...,...,...,...
1974,\n \n \...,Cumin Tortillas,"[Deselect All, 6 cups peanut oil, 10 (8-inch) ...",NOT AVAILABLE,NOT AVAILABLE,https://www.foodnetwork.com/recipes/bobby-flay...
1975,\n \n \...,Herb and Garlic Crusted Halibut with Oven Bake...,"[Deselect All, 2 tablespoons chopped parsley, ...",1 hr 50 min,1 hr 50 min,https://www.foodnetwork.com/recipes/bobby-flay...
1976,\n \n \...,"Grilled White Corn Taco with Bbq Pork Loin, Ro...","[Deselect All, 2 pork loins, about 1 1/2 pound...",1 hr 40 min,1 hr 40 min,https://www.foodnetwork.com/recipes/bobby-flay...
1977,\n \n \...,Prickly Pear Sangria,"[Deselect All, One 750ml bottle Zinfandel, One...",Easy,1 hr 13 min,https://www.foodnetwork.com/recipes/prickly-pe...
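The output above shows two kinds of scraping residue: the Chef column is padded with newlines and spaces, and every ingredient list starts with the "Deselect All" checkbox label. A minimal cleaning sketch follows; `clean_chef` and `clean_ingredients` are hypothetical helpers, and the sample values are assumed for illustration because the real chef strings are truncated in the display.

```python
def clean_chef(raw):
    """Collapse the whitespace and newlines that surround the scraped chef name."""
    return " ".join(raw.split())


def clean_ingredients(items):
    """Drop the 'Deselect All' checkbox label that leads each ingredient list."""
    return [item.strip() for item in items if item.strip() != "Deselect All"]


# Assumed sample values, shaped like the scraped output.
print(clean_chef("\n   \n   Bobby Flay\n  "))
print(clean_ingredients(["Deselect All", "8 ears of corn", "Unsalted butter"]))
```

On the dataframe these could be applied column by column, for example with `bobbyflay['Chef'].apply(clean_chef)`, taking care that rows holding the string 'NOT AVAILABLE' instead of a list are handled separately.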


In [379]:
# Bobby Flay web scraping stopped; investigate where it stopped

recipe1979=unique_list_bf[1979]

url="https:"+recipe1979
r=requests.get(url)
print(r.status_code)

# Continue scraping the remaining recipes
remaining_recipes_bf=unique_list_bf[1980:]

404
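`recipe_scrapping` appends to its lists only after a successful request and breaks on the first failure, so the number of rows it returns should equal the position of the URL that failed. Under that assumption, and provided the returned lists are still in memory, the failing URL and the remaining work can be derived rather than looked up by hand. `split_at_failure` is a hypothetical helper, and the sample paths are made up:

```python
def split_at_failure(paths, n_scraped):
    """Return (failed_path, remaining_paths) given how many rows were scraped."""
    if n_scraped >= len(paths):
        return None, []  # nothing failed; every path was scraped
    return paths[n_scraped], paths[n_scraped + 1:]


# Made-up example: three pages scraped, the fourth one failed.
paths = ["//a", "//b", "//c", "//broken", "//d", "//e"]
failed_path, remaining = split_at_failure(paths, n_scraped=3)
print(failed_path, remaining)
```

In the notebook, `n_scraped` would be `len(recipe_url)` from the interrupted run.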


In [380]:
# Web scraping continued for the remaining Bobby Flay recipes

recipe_url, chef_list, title_list, ingredients_list, level_list, time_list=recipe_scrapping(remaining_recipes_bf)
bobbyflay2= pd.DataFrame({
        'Chef': chef_list,
        'Recipe Title': title_list,
        'Ingredients': ingredients_list,
        'Level of Difficulty': level_list,
        'Total Time Required': time_list,
        'Link to Recipe': recipe_url
        })

bobbyflay2

0.0
0.1349527665317139
0.2699055330634278
...
67.61133603238866
67.74628879892038

Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,\n \n \...,Shrimp and Littleneck Clams with Wild Rice Waf...,"[Deselect All, 1 tablespoon olive oil, 12 larg...",Easy,3 hr 30 min,https://www.foodnetwork.com/recipes/bobby-flay...
1,\n \n \...,Cranberry-Serrano Relish,[],Easy,25 min,https://www.foodnetwork.com/recipes/bobby-flay...
2,\n \n \...,Pan-Fried Flounder,"[Deselect All, 4 skinless flounder-fillets, Sa...",15 min,15 min,https://www.foodnetwork.com/recipes/pan-fried-...
3,\n \n \...,The Serengeti,"[Deselect All, 1/2 lemon, 4 to 6 large basil l...",Easy,10 min,https://www.foodnetwork.com/recipes/the-sereng...
4,\n \n \...,Huey's Beef Brisket Rub,"[Deselect All, 16 ounces light brown sugar, 8 ...",Easy,5 min,https://www.foodnetwork.com/recipes/hueys-beef...
...,...,...,...,...,...,...
736,\n \n \...,Penne with Summer Tomato Sauce with Fresh Mozz...,"[Deselect All, 4 ripe tomatoes, seeded and dic...",Easy,NOT AVAILABLE,https://www.foodnetwork.com/recipes/penne-with...
737,\n \n \...,Tangerine-Serrano Frozen Treats,"[Deselect All, 5 to 6 cups fresh tangerine jui...",Easy,8 hr 35 min,https://www.foodnetwork.com/recipes/bobby-flay...
738,\n \n \...,Nachos on the Grill with Tomatillo-Poblano Sal...,"[Deselect All, 16 ounces baked or fried tortil...",Intermediate,1 hr 25 min,https://www.foodnetwork.com/recipes/bobby-flay...
739,\n \n \...,Roasted Turkey a la Tangerine,"[Deselect All, 4 cups fresh tangerine juice, 2...",Easy,6 hr 45 min,https://www.foodnetwork.com/recipes/bobby-flay...
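The same six-column dictionary is written out for every chef. A small helper can build it once from the tuple that `recipe_scrapping` returns. This is a sketch rather than the notebook's own code: `results_to_columns` is a hypothetical name, and the sample tuple is made up.

```python
def results_to_columns(results):
    """Map the tuple returned by recipe_scrapping to a column-name -> list dict."""
    recipe_url, chef, title, ingredients, level, total_time = results
    return {
        "Chef": chef,
        "Recipe Title": title,
        "Ingredients": ingredients,
        "Level of Difficulty": level,
        "Total Time Required": total_time,
        "Link to Recipe": recipe_url,
    }


# Made-up one-recipe result, in the order recipe_scrapping returns its lists.
sample_results = (
    ["https://www.foodnetwork.com/recipes/example"],
    ["Example Chef"],
    ["Example Recipe"],
    [["1 cup flour"]],
    ["Easy"],
    ["30 min"],
)
columns = results_to_columns(sample_results)
print(list(columns))
```

Each chef's dataframe then becomes a one-liner such as `pd.DataFrame(results_to_columns(recipe_scrapping(mv)))`.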


#### 2.3.3 Web Scraping the Other Chefs' Recipes

In [351]:
# Chef Michael Voltaggio Dataframe
recipe_url, chef_list, title_list, ingredients_list, level_list, time_list=recipe_scrapping(mv)

michaelvoltaggio= pd.DataFrame({
        'Chef': chef_list,
        'Recipe Title': title_list,
        'Ingredients': ingredients_list,
        'Level of Difficulty': level_list,
        'Total Time Required': time_list,
        'Link to Recipe': recipe_url
        })

michaelvoltaggio

0.0
1.9607843137254901
3.9215686274509802
...
96.07843137254902
98.0392156862745


Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,\n \n \...,Scallops with Bacon and Yeast Sauce,"[Deselect All, 3/4 pound (3 sticks) unsalted b...",Intermediate,1 hr,https://www.foodnetwork.com/recipes/scallops-w...
1,\n \n \...,"""Charcoal"" Potatoes with Sour Cream and Vinegar","[Deselect All, 1 1/2 pounds creamer potatoes (...",Intermediate,55 min,https://www.foodnetwork.com/recipes/charcoal-p...
2,\n \n \...,Smoked Carrots with Coffee Mole Dirt,"[Deselect All, 150 grams (3/4 cup) sugar, 150 ...",Intermediate,1 hr 5 min,https://www.foodnetwork.com/recipes/smoked-car...
3,\n \n \...,SpanaKALEpita,"[Deselect All, 1 cup melted clarified butter, ...",Intermediate,2 hr 5 min,https://www.foodnetwork.com/recipes/spanakalep...
4,\n \n \...,Cucumber Salad with Sheep's Milk Feta and Lemo...,"[Deselect All, 1/4 cup olive oil, plus additio...",Easy,30 min,https://www.foodnetwork.com/recipes/cucumber-s...
5,\n \n \...,Candy Bar Pain Au Chocolat,"[Deselect All, 2 sheets puff pastry, fully tha...",Easy,40 min,https://www.foodnetwork.com/recipes/candy-bar-...
6,\n \n \...,Green Juice Martini,"[Deselect All, 1 cup fresh parsley leaves, 1 c...",Easy,20 min,https://www.foodnetwork.com/recipes/green-juic...
7,\n \n \...,Avocado Toast Pizza,"[Deselect All, One 10.5-ounce goat cheese log,...",Easy,50 min,https://www.foodnetwork.com/recipes/avocado-to...
8,\n \n \...,Pan con Tomate with Serrano Ham and Manchego C...,"[Deselect All, All-purpose flour, for dusting,...",Easy,30 min,https://www.foodnetwork.com/recipes/pan-con-to...
9,\n \n \...,Greek Yogurt Mousse with Stone Fruit,"[Deselect All, 225 grams egg yolks, 375 grams ...",Intermediate,1 hr 30 min,https://www.foodnetwork.com/recipes/greek-yogu...


In [352]:
# Chef Michael Symon Dataframe
recipe_url, chef_list, title_list, ingredients_list, level_list, time_list=recipe_scrapping(ms)

michaelsimon= pd.DataFrame({
        'Chef': chef_list,
        'Recipe Title': title_list,
        'Ingredients': ingredients_list,
        'Level of Difficulty': level_list,
        'Total Time Required': time_list,
        'Link to Recipe': recipe_url
        })

michaelsimon

0.0
0.18181818181818182
0.36363636363636365
...
97.45454545454545
97.63636363636363

Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,\n \n \...,Scrambled Egg Tacos,"[Deselect All, 8 tomatillos, husked, then halv...",Easy,25 min,https://www.foodnetwork.com/fnk/recipes/scramb...
1,\n \n \...,Spicy Beet Ice Cream,"[Deselect All, 3 large beets, 1 cup orange jui...",4 hr 10 min,4 hr 10 min,https://www.foodnetwork.com/recipes/michael-sy...
2,\n \n \...,Hot Smoked Salmon,"[Deselect All, 1/2 cup soy sauce, 1/2 cup hone...",Easy,2 hr,https://www.foodnetwork.com/recipes/michael-sy...
3,\n \n \...,Homemade Bagels,"[Deselect All, 2 tablespoons barley malt syrup...",Intermediate,7 hr 30 min,https://www.foodnetwork.com/recipes/michael-sy...
4,\n \n \...,Cabbage and Noodles,"[Deselect All, Kosher salt and freshly ground ...",Easy,25 min,https://www.foodnetwork.com/fnk/recipes/cabbag...
...,...,...,...,...,...,...
545,\n \n \...,Braised Shortribs with Orange and Olive Salad,"[Deselect All, 4 tablespoons olive oil, 6 poun...",Advanced,13 hr 8 min,https://www.foodnetwork.com/recipes/michael-sy...
546,\n \n \...,Grilled Halloumi and Watermelon Kebobs,"[Deselect All, 2 cups cubed halloumi, 2 cups c...",Easy,15 min,https://www.foodnetwork.com/recipes/michael-sy...
547,\n \n \...,The Red Bell,"[Deselect All, 1/2 red bell pepper, 4 fresh ba...",Easy,5 min,https://www.foodnetwork.com/fnk/recipes/the-re...
548,\n \n \...,Chocolate Molten Cake with White Chocolate Skull,"[Deselect All, Nonstick cooking spray, for the...",Easy,1 hr,https://www.foodnetwork.com/recipes/michael-sy...


In [353]:
# Chef Brooke Williamson Dataframe
recipe_url, chef_list, title_list, ingredients_list, level_list, time_list=recipe_scrapping(bw)

brookewilliamson= pd.DataFrame({
        'Chef': chef_list,
        'Recipe Title': title_list,
        'Ingredients': ingredients_list,
        'Level of Difficulty': level_list,
        'Total Time Required': time_list,
        'Link to Recipe': recipe_url
        })

brookewilliamson

0.0
5.555555555555555
11.11111111111111
16.666666666666664
22.22222222222222
27.77777777777778
33.33333333333333
38.88888888888889
44.44444444444444
50.0
55.55555555555556
61.111111111111114
66.66666666666666
72.22222222222221
77.77777777777779
83.33333333333334
88.88888888888889
94.44444444444444


Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,\n \n \...,Twice-Baked Potatoes Stuffed with Lobster,"[Deselect All, 3 medium russet potatoes, 2 who...",Advanced,1 hr 40 min,https://www.foodnetwork.com/recipes/twice-bake...
1,\n \n \...,Italian Wedding Soup with Sausage Meatballs,"[Deselect All, 2 1/2 cups chicken broth, 12 to...",Easy,55 min,https://www.foodnetwork.com/recipes/italian-we...
2,\n \n \...,Vesper Martinis with Blue Cheese Stuffed Olive...,"[Deselect All, 2.5 ounces gin, 2.5 ounces vodk...",Easy,35 min,https://www.foodnetwork.com/recipes/vesper-mar...
3,\n \n \...,Linguine with Mussels and Cashew-Chile Pesto,"[Deselect All, 12 ounces heirloom cherry tomat...",Easy,40 min,https://www.foodnetwork.com/recipes/linguine-w...
4,\n \n \...,Marinated Beets with Charred Onion Crema,"[Deselect All, 8 small beets (1 1/2 inches in ...",Intermediate,3 hr 20 min,https://www.foodnetwork.com/recipes/marinated-...
5,\n \n \...,Whipped Coconut Cheesecake with Marinated Mango,"[Deselect All, 2 cups diced mango, 1/4 cup dar...",Easy,2 hr 10 min,https://www.foodnetwork.com/recipes/whipped-co...
6,\n \n \...,Grilled Halibut with Crispy Rice and Green Bea...,"[Deselect All, Kosher salt and freshly ground ...",Intermediate,1 hr 10 min,https://www.foodnetwork.com/recipes/grilled-ha...
7,\n \n \...,Baked 'Ulu with Shrimp Head Butter,"[Deselect All, Two 540-gram cans 'ulu (or brea...",Easy,45 min,https://www.foodnetwork.com/recipes/baked-ulu-...
8,\n \n \...,"Chicory and Apple Salad with Pine Nuts, Blue C...","[Deselect All, 1 small shallot, minced, 2 tabl...",Easy,15 min,https://www.foodnetwork.com/recipes/chicory-an...
9,\n \n \...,Devil’s Food Cake with Vanilla Baked Cherries,"[Deselect All, 1 cup all-purpose flour, 3/4 cu...",Easy,2 hr 35 min,https://www.foodnetwork.com/recipes/devils-foo...


In [354]:
# Chef Scott Conant Dataframe
recipe_url, chef_list, title_list, ingredients_list, level_list, time_list=recipe_scrapping(sc)

scottconan= pd.DataFrame({
        'Chef': chef_list,
        'Recipe Title': title_list,
        'Ingredients': ingredients_list,
        'Level of Difficulty': level_list,
        'Total Time Required': time_list,
        'Link to Recipe': recipe_url
        })

scottconan

0.0
1.6129032258064515
3.225806451612903
...
87.09677419354838
88.70967741935483

Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,\n \n \...,Eggplant and Vegetable Rice,"[Deselect All, Extra-virgin olive oil, as need...",Easy,40 min,https://www.foodnetwork.com/fnk/recipes/eggpla...
1,\n \n \...,Flourless Dark Chocolate Cake,"[Deselect All, 8 tablespoons (4 ounces) unsalt...",Easy,40 min,https://www.foodnetwork.com/recipes/scott-cona...
2,\n \n \...,BLT Skewers,"[Deselect All, Six 1-inch-thick slices good-qu...",Easy,45 min,https://www.foodnetwork.com/recipes/scott-cona...
3,\n \n \...,Pastiera,"[Deselect All, 4 cups all-purpose flour, 1 cup...",Easy,2 hr 45 min,https://www.foodnetwork.com/recipes/scott-cona...
4,\n \n \...,Aleppo Popcorn with Parmesan and Herbs,"[Deselect All, 3 tablespoons plus 1/2 cup vege...",Easy,NOT AVAILABLE,https://www.foodnetwork.com/recipes/scott-cona...
...,...,...,...,...,...,...
57,\n \n \...,"Olive Oil Poached Tuna Infused with Thyme, Lem...","[Deselect All, Four 5-ounce pieces sushi-grade...",Intermediate,1 hr 15 min,https://www.foodnetwork.com/recipes/olive-oil-...
58,\n \n \...,Spaghetti with Tomato Sauce,"[Deselect All, Kosher salt, Tomato Sauce, as f...",Advanced,5 hr,https://www.foodnetwork.com/fnk/recipes/spaghe...
59,\n \n \...,"Moist Roasted Whole Red Snapper with Tomatoes,...","[Deselect All, One 2 1/2- to 3-pound whole (he...",Easy,1 hr,https://www.foodnetwork.com/recipes/scott-cona...
60,\n \n \...,"Burger with Taleggio, Pancetta and Onion-Musta...","[Deselect All, 2 pounds freshly ground beef (p...",Easy,55 min,https://www.foodnetwork.com/recipes/scott-cona...


In [357]:
# Chef Alex Guarnaschelli Dataframe
recipe_url, chef_list, title_list, ingredients_list, level_list, time_list=recipe_scrapping(ag)

alexguarnaschelli= pd.DataFrame({
        'Chef': chef_list,
        'Recipe Title': title_list,
        'Ingredients': ingredients_list,
        'Level of Difficulty': level_list,
        'Total Time Required': time_list,
        'Link to Recipe': recipe_url
        })

alexguarnaschelli


Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,\n \n \...,Hoisin Sauce Noodles with Chicken,"[Deselect All, 2 tablespoons sesame seeds, 2 c...",Easy,1 hr,https://www.foodnetwork.com/recipes/hoisin-sau...
1,\n \n \...,Croque Madame Sandwich,"[Deselect All, 6 ounces unsalted butter, divid...",Easy,30 min,https://www.foodnetwork.com/recipes/alexandra-...
2,\n \n \...,Tomato and Watermelon Salad with Mozzarella,"[Deselect All, 1 cup extra-virgin olive oil, 6...",Easy,15 min,https://www.foodnetwork.com/recipes/alexandra-...
3,\n \n \...,Cauliflower Fritters,"[Deselect All, 1 1/2 cups all-purpose flour, 2...",Easy,36 min,https://www.foodnetwork.com/recipes/alexandra-...
4,\n \n \...,Citrus Flan,"[Deselect All, One 14-ounce can sweetened cond...",Intermediate,1 hr 35 min,https://www.foodnetwork.com/recipes/alexandra-...
...,...,...,...,...,...,...
462,\n \n \...,Sausage-Apple Stuffing,"[Deselect All, 1 stick unsalted butter, plus m...",Easy,2 hr 10 min,https://www.foodnetwork.com/recipes/alexandra-...
463,\n \n \...,Whole Roasted Fish,"[Deselect All, 2 whole American black sea bass...",Easy,45 min,https://www.foodnetwork.com/recipes/whole-roas...
464,\n \n \...,Pressed Cheese Sandwiches,"[Deselect All, 5 tablespoons olive oil, 1/2 te...",Easy,1 day 20 min,https://www.foodnetwork.com/recipes/alexandra-...
465,\n \n \...,Alex's Simple Whole Wheat Pasta Salad,"[Deselect All, 1 large red bell pepper, 1 teas...",Easy,55 min,https://www.foodnetwork.com/fnk/recipes/alexs-...


In [360]:
# Chef Molly Yeh Dataframe
recipe_url, chef_list, title_list, ingredients_list, level_list, time_list=recipe_scrapping(my)

mollyyeh= pd.DataFrame({
        'Chef': chef_list,
        'Recipe Title': title_list,
        'Ingredients': ingredients_list,
        'Level of Difficulty': level_list,
        'Total Time Required': time_list,
        'Link to Recipe': recipe_url
        })

mollyyeh


Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,\n \n \...,"Sausage, Peppers and Onion Deep Dish Pizza","[Deselect All, 4 cups all-purpose flour, plus ...",Intermediate,3 hr 50 min,https://www.foodnetwork.com/recipes/sausage-pe...
1,\n \n \...,Strawberry Milk Tea,"[Deselect All, 1/4 cup (84 grams) honey, 2 jas...",Easy,1 hr,https://www.foodnetwork.com/recipes/strawberry...
2,\n \n \...,Muenster Monster Hash Brown Fingers,"[Deselect All, Vegetable oil cooking spray, fo...",Intermediate,50 min,https://www.foodnetwork.com/recipes/muenster-m...
3,\n \n \...,Savory Monkey Bread with Creamy Veggie Dip,"[Deselect All, Neutral oil, for the bowl, 3 3/...",Intermediate,3 hr 30 min,https://www.foodnetwork.com/recipes/savory-mon...
4,\n \n \...,Bear-y Cookie Salad,"[Deselect All, 1 1/2 cups (339 grams) cold, pl...",Easy,2 hr 10 min,https://www.foodnetwork.com/recipes/bear-y-coo...
...,...,...,...,...,...,...
561,\n \n \...,Latkes with Garlic-and-Onion Sour Cream,"[Deselect All, 2 1/2 pounds russet potatoes, 2...",Easy,1 hr 10 min,https://www.foodnetwork.com/recipes/latkes-wit...
562,\n \n \...,Refried Beans and Rice Bowl,"[Deselect All, 6 ounces sliced bacon, diced, 1...",Easy,1 hr,https://www.foodnetwork.com/recipes/refried-be...
563,\n \n \...,Gochujang Meatloaf Sandwiches,"[Deselect All, 1 tablespoon (13 grams) neutral...",Intermediate,2 hr 45 min,https://www.foodnetwork.com/recipes/gochujang-...
564,\n \n \...,Peking Chicken,"[Deselect All, 1 3-to-4 pound whole chicken, g...",Intermediate,12 hr,https://www.foodnetwork.com/recipes/peking-chi...


In [361]:
# Chef Geoffrey Zakarian Dataframe
recipe_url, chef_list, title_list, ingredients_list, level_list, time_list=recipe_scrapping(gz)

geoffreyzakarian= pd.DataFrame({
        'Chef': chef_list,
        'Recipe Title': title_list,
        'Ingredients': ingredients_list,
        'Level of Difficulty': level_list,
        'Total Time Required': time_list,
        'Link to Recipe': recipe_url
        })

geoffreyzakarian


Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,\n \n \...,"Bacon, Onion and Cheese Tart","[Deselect All, 8 slices thick-cut bacon, cut c...",Easy,1 hr 5 min,https://www.foodnetwork.com/recipes/geoffrey-z...
1,\n \n \...,Salt Cod Brandade,"[Deselect All, 2 Idaho potatoes, 1 pound salt ...",Intermediate,1 day 1 hr 5 min,https://www.foodnetwork.com/recipes/geoffrey-z...
2,\n \n \...,Shrimp Cakes with Zucchini Salad,"[Deselect All, 1 large zucchini, cubed small, ...",Easy,1 hr 40 min,https://www.foodnetwork.com/recipes/geoffrey-z...
3,\n \n \...,Espresso Martini,"[Deselect All, 6 ounces vodka, such as Russian...",Easy,5 min,https://www.foodnetwork.com/recipes/geoffrey-z...
4,\n \n \...,Spiced Pecans,"[Deselect All, Cooking spray, Kosher salt, 1/2...",Easy,30 min,https://www.foodnetwork.com/recipes/geoffrey-z...
...,...,...,...,...,...,...
563,\n \n \...,Sour Puss,"[Deselect All, 1 ounce whiskey, such as Jack D...",Easy,5 min,https://www.foodnetwork.com/recipes/geoffrey-z...
564,\n \n \...,Sausage and Taleggio Casserole with Swiss Chard,"[Deselect All, 1 tablespoon extra-virgin olive...",Easy,1 hr,https://www.foodnetwork.com/recipes/geoffrey-z...
565,\n \n \...,Elbow Macaroni with Crispy Breadcrumbs and Bro...,"[Deselect All, Kosher salt, 6 tablespoons extr...",Easy,25 min,https://www.foodnetwork.com/recipes/elbow-maca...
566,\n \n \...,"Branzino with Polenta, Wild Mushrooms and Wate...","[Deselect All, 1 tablespoon canola oil, 3 tabl...",Easy,1 hr,https://www.foodnetwork.com/recipes/geoffrey-z...


In [363]:
# Chef Giada De Laurentiis Dataframe
recipe_url, chef_list, title_list, ingredients_list, level_list, time_list=recipe_scrapping(gd)

giadadelaurentiis= pd.DataFrame({
        'Chef': chef_list,
        'Recipe Title': title_list,
        'Ingredients': ingredients_list,
        'Level of Difficulty': level_list,
        'Total Time Required': time_list,
        'Link to Recipe': recipe_url
        })

giadadelaurentiis


Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,\n \n \...,Fennel Slaw with Prosciutto and Pistachio Pesto,"[Deselect All, Pistachio Pesto:, 2 cups (light...",10 min,10 min,https://www.foodnetwork.com/recipes/giada-de-l...
1,\n \n \...,Lemon Profiteroles,"[Deselect All, 1 quart whole milk, 6 whole egg...",Intermediate,50 min,https://www.foodnetwork.com/recipes/lemon-prof...
2,\n \n \...,Baked Penne with Squash and Goat Cheese,"[Deselect All, 1/2 cup panko breadcrumbs, 2 ta...",Easy,1 hr 10 min,https://www.foodnetwork.com/recipes/giada-de-l...
3,\n \n \...,Raspberry Ice Cream Sodas,"[Deselect All, 1 pint raspberry sherbet, 1 pin...",Easy,1 hr 20 min,https://www.foodnetwork.com/recipes/giada-de-l...
4,\n \n \...,Artichoke Gratinata,"[Deselect All, 3 tablespoons olive oil, 1 garl...",Easy,25 min,https://www.foodnetwork.com/recipes/giada-de-l...
...,...,...,...,...,...,...
1856,\n \n \...,Grilled Plum Salad,"[Deselect All, 1 tablespoon honey, 1 tablespoo...",Easy,35 min,https://www.foodnetwork.com/recipes/giada-de-l...
1857,\n \n \...,Chocolate-Raspberry Mascarpone Bars,"[Deselect All, Vegetable oil cooking spray, 8 ...",Intermediate,8 hr 40 min,https://www.foodnetwork.com/recipes/giada-de-l...
1858,\n \n \...,Chocolate-Hazelnut Gelato,"[Deselect All, 2 cups whole milk, 1 cup heavy ...",Easy,2 hr 55 min,https://www.foodnetwork.com/recipes/giada-de-l...
1859,\n \n \...,Smoked Salmon with Creamy Cucumbers,"[Deselect All, 1 English cucumber, Kosher salt...",Easy,15 min,https://www.foodnetwork.com/recipes/giada-de-l...


In [365]:
# Chef Ina Garten Dataframe
recipe_url, chef_list, title_list, ingredients_list, level_list, time_list=recipe_scrapping(ig)

inagarten= pd.DataFrame({
        'Chef': chef_list,
        'Recipe Title': title_list,
        'Ingredients': ingredients_list,
        'Level of Difficulty': level_list,
        'Total Time Required': time_list,
        'Link to Recipe': recipe_url
        })

inagarten


Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,\n \n \...,Roast Turkey with Truffle Butter,"[Deselect All, 1 (12 to 14-pound) fresh turkey...",Easy,3 hr 45 min,https://www.foodnetwork.com/recipes/ina-garten...
1,\n \n \...,Iced Coffee,"[Deselect All, 3 tablespoons chocolate almond ...",Easy,1 hr 5 min,https://www.foodnetwork.com/recipes/ina-garten...
2,\n \n \...,Fettunta with Prosciutto,"[Deselect All, 6 slices good French boule, sli...",Easy,25 min,https://www.foodnetwork.com/recipes/ina-garten...
3,\n \n \...,Sweet Potato Puree,"[Deselect All, 3 pounds sweet potatoes, peeled...",Easy,45 min,https://www.foodnetwork.com/recipes/ina-garten...
4,\n \n \...,Red Berry Shortcakes with Honey Yogurt,"[Deselect All, 2 1/4 cups all-purpose flour, p...",Easy,1 hr 40 min,https://www.foodnetwork.com/recipes/ina-garten...
...,...,...,...,...,...,...
1250,\n \n \...,French Apple Tart,"[Deselect All, 2 cups all-purpose flour, 1/2 t...",Easy,2 hr 15 min,https://www.foodnetwork.com/recipes/ina-garten...
1251,\n \n \...,Gorgonzola Sauce,"[Deselect All, 4 cups heavy cream, 3 to 4 ounc...",Easy,1 hr,https://www.foodnetwork.com/recipes/ina-garten...
1252,\n \n \...,Caviar Dip,"[Deselect All, 8 ounces cream cheese, at room ...",Easy,30 min,https://www.foodnetwork.com/recipes/ina-garten...
1253,\n \n \...,"""16 Bean"" Pasta E Fagioli","[Deselect All, 1 (1-pound) bag Goya 16 Bean So...",Easy,1 hr 30 min,https://www.foodnetwork.com/recipes/ina-garten...


## 2.4. Merge All the Dataframes (for Each Chef) & Save As a CSV File

In [383]:
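# Stack the per-chef dataframes into a single dataframe with a fresh 0..n-1 index
chef_dfs=[bobbyflay, bobbyflay2, inagarten, giadadelaurentiis, geoffreyzakarian, mollyyeh,
          alexguarnaschelli, scottconan, brookewilliamson, michaelsimon, michaelvoltaggio]
df=pd.concat(chef_dfs, axis=0, ignore_index=True)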

In [390]:
df.to_csv('recipes.csv', index=False)
df

Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,...,Bbq Chicken Cobb Salad,"[Deselect All, 4 chicken thighs, bone in, 2 cu...",4 servings,NOT AVAILABLE,https://www.foodnetwork.com/recipes/bobby-flay...
1,...,"Breakfast Burritos with Mocha-Rubbed Steak, Gr...","[Deselect All, 2 tablespoons canola oil, 1 med...",Intermediate,1 hr 40 min,https://www.foodnetwork.com/recipes/bobby-flay...
2,...,Rajas Salsa,"[Deselect All, 2 roasted red and yellow bell p...",Easy,30 min,https://www.foodnetwork.com/recipes/rajas-sals...
3,...,Grilled Corn,"[Deselect All, 8 ears of corn, Unsalted butter...",Easy,NOT AVAILABLE,https://www.foodnetwork.com/recipes/bobby-flay...
4,...,Korean-style BBQ Short Ribs,"[Deselect All, 4 large short ribs, 2 scallions...",Easy,2 hr 50 min,https://www.foodnetwork.com/recipes/korean-sty...
...,...,...,...,...,...,...
8113,...,Kale Noodles with Chorizo Breadcrumbs and Parm...,"[Deselect All, 2 tablespoons extra-virgin oliv...",Intermediate,2 hr 30 min,https://www.foodnetwork.com/recipes/kale-noodl...
8114,...,"Cauliflower Hummus, Cauliflower Pickles and Ca...","[Deselect All, Neutral oil, like canola, for f...",Intermediate,1 hr 40 min,https://www.foodnetwork.com/recipes/cauliflowe...
8115,...,Seaweed Mashed Potatoes,"[Deselect All, 5 cups (2000 g) water, 1 sheet ...",Intermediate,1 hr 50 min,https://www.foodnetwork.com/recipes/seaweed-ma...
8116,...,Hibiscus Whiskey Sour,"[Deselect All, 2 cups dried hibiscus flowers, ...",Easy,1 hr 10 min,https://www.foodnetwork.com/recipes/hibiscus-w...
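
To illustrate why `ignore_index=True` is used in the merge above, here is a minimal sketch on two toy dataframes. The frames `chef_a` and `chef_b` and their rows are hypothetical and stand in for the per-chef dataframes.

```python
import pandas as pd

# Two toy per-chef dataframes (hypothetical rows, for illustration only)
chef_a = pd.DataFrame({'Recipe Title': ['Recipe A1', 'Recipe A2']})
chef_b = pd.DataFrame({'Recipe Title': ['Recipe B1']})

# Without ignore_index, each frame keeps its own 0-based index, so labels repeat
kept = pd.concat([chef_a, chef_b], axis=0)

# With ignore_index=True, the merged frame gets one continuous 0..n-1 index
merged = pd.concat([chef_a, chef_b], axis=0, ignore_index=True)

print(list(kept.index))    # [0, 1, 0]
print(list(merged.index))  # [0, 1, 2]
```

A continuous index means each recipe in the merged dataframe can be referred to by a single unambiguous row label.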


# SECTION 2 OF NOTEBOOK 1: 
# CLEANING THE DATAFRAME AFTER WEB SCRAPING 
- Web scraping the Food Network's website yielded a total of 8118 recipes, which were saved in 'recipes.csv'
- This dataframe must be cleaned before it can be used to recommend recipes to a user

## 3. Clean-Up

In [4]:
# Read the csv file
recipe_df=pd.read_csv('recipes.csv')

In [5]:
recipe_df

Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,...,Bbq Chicken Cobb Salad,"['Deselect All', '4 chicken thighs, bone in', ...",4 servings,NOT AVAILABLE,https://www.foodnetwork.com/recipes/bobby-flay...
1,...,"Breakfast Burritos with Mocha-Rubbed Steak, Gr...","['Deselect All', '2 tablespoons canola oil', '...",Intermediate,1 hr 40 min,https://www.foodnetwork.com/recipes/bobby-flay...
2,...,Rajas Salsa,"['Deselect All', '2 roasted red and yellow bel...",Easy,30 min,https://www.foodnetwork.com/recipes/rajas-sals...
3,...,Grilled Corn,"['Deselect All', '8 ears of corn', 'Unsalted b...",Easy,NOT AVAILABLE,https://www.foodnetwork.com/recipes/bobby-flay...
4,...,Korean-style BBQ Short Ribs,"['Deselect All', '4 large short ribs', '2 scal...",Easy,2 hr 50 min,https://www.foodnetwork.com/recipes/korean-sty...
...,...,...,...,...,...,...
8113,...,Kale Noodles with Chorizo Breadcrumbs and Parm...,"['Deselect All', '2 tablespoons extra-virgin o...",Intermediate,2 hr 30 min,https://www.foodnetwork.com/recipes/kale-noodl...
8114,...,"Cauliflower Hummus, Cauliflower Pickles and Ca...","['Deselect All', 'Neutral oil, like canola, fo...",Intermediate,1 hr 40 min,https://www.foodnetwork.com/recipes/cauliflowe...
8115,...,Seaweed Mashed Potatoes,"['Deselect All', '5 cups (2000 g) water', '1 s...",Intermediate,1 hr 50 min,https://www.foodnetwork.com/recipes/seaweed-ma...
8116,...,Hibiscus Whiskey Sour,"['Deselect All', '2 cups dried hibiscus flower...",Easy,1 hr 10 min,https://www.foodnetwork.com/recipes/hibiscus-w...


### 3.1 Remove Blanks (from Column 'Ingredients')

In [6]:
# Remove rows where the 'Ingredients' value is an empty list (stored as the string "[]" in the CSV)
recipe_df=recipe_df[recipe_df['Ingredients']!="[]"]

len(recipe_df) # 8118 recipes reduced to 8090

8090
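
The comparison with the string "[]" works because lists written to a CSV file come back as text. A minimal sketch, using an in-memory buffer and a hypothetical two-row dataframe instead of 'recipes.csv':

```python
import io
import pandas as pd

# Toy dataframe with one empty ingredient list (hypothetical data)
toy = pd.DataFrame({
    'Recipe Title': ['With ingredients', 'Without ingredients'],
    'Ingredients': [['1 egg', 'Salt'], []],
})

# Round-trip through CSV in memory, mimicking to_csv followed by read_csv
buffer = io.StringIO()
toy.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

# Lists come back as their string representation, so the empty list is "[]"
print(reloaded['Ingredients'].tolist())

# The string comparison therefore drops the recipe with no ingredients
non_empty = reloaded[reloaded['Ingredients'] != "[]"]
print(len(reloaded), '->', len(non_empty))  # 2 -> 1
```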

### 3.2 Remove Recipe Duplicates (based on the URL)

In [7]:
# Remove duplicate recipes, identified by their URL (keep the first occurrence)
recipe_df=recipe_df.drop_duplicates(subset="Link to Recipe", keep='first')

len(recipe_df) # 8090 recipes reduced to 8080

8080
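
As a small illustration of the de-duplication step, the sketch below applies the same call to toy data. The `example.com` URLs are hypothetical placeholders.

```python
import pandas as pd

# Toy data: the same recipe URL appears twice (hypothetical URLs)
toy = pd.DataFrame({
    'Recipe Title': ['Grilled Corn', 'Grilled Corn', 'Rajas Salsa'],
    'Link to Recipe': [
        'https://example.com/recipes/grilled-corn',
        'https://example.com/recipes/grilled-corn',
        'https://example.com/recipes/rajas-salsa',
    ],
})

# keep='first' retains the first occurrence of each URL and drops later repeats
deduped = toy.drop_duplicates(subset='Link to Recipe', keep='first')

print(len(toy), '->', len(deduped))  # 3 -> 2
print(list(deduped.index))           # [0, 2]
```

The surviving rows keep their original index labels, which is why the index of `recipe_df` still has gaps after rows are removed.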

### 3.3 Column 'Chef' 

In [8]:
# Strip the newlines and surrounding whitespace left over from scraping
recipe_df['Chef']=recipe_df['Chef'].str.replace('\n', '', regex=False).str.strip()

# Count the distinct chef credits (the selected chefs plus other chefs credited on their pages)
chefs=recipe_df['Chef'].unique()
len(chefs) # Total of 604 chefs contributing

604
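
A minimal sketch of what this whitespace clean-up does, on toy strings that imitate the raw scraped text. The exact spacing of the real values is an assumption.

```python
import pandas as pd

# Toy values imitating the raw scraped 'Chef' text (hypothetical strings)
raw = pd.Series([
    '\n      \n   Recipe courtesy of Bobby Flay\n  ',
    '\n Recipe courtesy of Linda Hacker \n',
])

# Remove newline characters, then trim leading and trailing spaces
cleaned = raw.str.replace('\n', '', regex=False).str.strip()

print(cleaned.tolist())
# Once the padding is gone, identical credits compare equal and are counted once
print(cleaned.nunique())  # 2
```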

### 3.4 Column 'Level of Difficulty' 

In [9]:
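# Inspect the raw values: besides the three expected levels, the scraper sometimes captured other text (e.g. '4 servings')
levels_unique=recipe_df['Level of Difficulty'].unique()

levels_expected=['Easy','Intermediate','Advanced']

# Replace anything that is not an expected level with an explicit placeholder
recipe_df['Level of Difficulty']=recipe_df['Level of Difficulty'].apply(lambda x: x if x in levels_expected else 'Level of Difficulty Not Available')

recipe_df['Level of Difficulty'].value_counts() # 471 recipes have no level of difficulty assigned on the website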

Level of Difficulty
Easy                                 5884
Intermediate                         1638
Level of Difficulty Not Available     471
Advanced                               87
Name: count, dtype: int64
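
The same replacement can also be written without a lambda. This is only an alternative sketch on a toy column, using `isin` and `where`. The stray values '4 servings' and '10 min' are taken from the scraped tables above.

```python
import pandas as pd

levels_expected = ['Easy', 'Intermediate', 'Advanced']

# Toy column mixing valid levels with stray values like those in the scraped tables
levels = pd.Series(['Easy', '4 servings', 'Intermediate', '10 min', 'Advanced'])

# Keep values found in the expected list; replace everything else with a label
cleaned = levels.where(levels.isin(levels_expected), 'Level of Difficulty Not Available')

print(cleaned.tolist())
```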

### 3.5 Column 'Total Time Required'

In [10]:
# Inspect the raw values, then replace the scraper's 'NOT AVAILABLE' placeholder with a clearer label
recipe_df['Total Time Required'].unique()
recipe_df['Total Time Required']=recipe_df['Total Time Required'].replace('NOT AVAILABLE', 'Cooking Time Not Available')
#recipe_df['Total Time Required'].unique()
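
The notebook keeps the cooking time as text. If a numeric duration were needed later (for example, to filter recipes by time), one possible approach is sketched below. The helper `to_minutes` is hypothetical and is not part of this notebook. It assumes the only units are 'day', 'hr' and 'min', as seen in the scraped tables.

```python
import re

# Minutes per unit, for the units that appear in the scraped time strings
MINUTES_PER_UNIT = {'day': 24 * 60, 'hr': 60, 'min': 1}

def to_minutes(text):
    """Convert a string such as '1 day 1 hr 5 min' to minutes; return None if no time is found."""
    parts = re.findall(r'(\d+)\s*(day|hr|min)', text)
    if not parts:
        return None
    return sum(int(amount) * MINUTES_PER_UNIT[unit] for amount, unit in parts)

print(to_minutes('40 min'))                      # 40
print(to_minutes('2 hr 45 min'))                 # 165
print(to_minutes('1 day 1 hr 5 min'))            # 1505
print(to_minutes('Cooking Time Not Available'))  # None
```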

### 3.6 Column 'Ingredients'

In [11]:
# Convert each ingredient list from its string form (as stored in the CSV) back to a Python list
recipe_df['Ingredients']=recipe_df['Ingredients'].apply(literal_eval)
recipe_df

Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,Recipe courtesy of Bobby Flay,Bbq Chicken Cobb Salad,"[Deselect All, 4 chicken thighs, bone in, 2 cu...",Level of Difficulty Not Available,Cooking Time Not Available,https://www.foodnetwork.com/recipes/bobby-flay...
1,Recipe courtesy of Bobby Flay,"Breakfast Burritos with Mocha-Rubbed Steak, Gr...","[Deselect All, 2 tablespoons canola oil, 1 med...",Intermediate,1 hr 40 min,https://www.foodnetwork.com/recipes/bobby-flay...
2,Recipe courtesy of Linda Hacker,Rajas Salsa,"[Deselect All, 2 roasted red and yellow bell p...",Easy,30 min,https://www.foodnetwork.com/recipes/rajas-sals...
3,Recipe courtesy of Bobby Flay,Grilled Corn,"[Deselect All, 8 ears of corn, Unsalted butter...",Easy,Cooking Time Not Available,https://www.foodnetwork.com/recipes/bobby-flay...
4,Recipe courtesy of Flip Cuddy,Korean-style BBQ Short Ribs,"[Deselect All, 4 large short ribs, 2 scallions...",Easy,2 hr 50 min,https://www.foodnetwork.com/recipes/korean-sty...
...,...,...,...,...,...,...
8113,Recipe courtesy of Michael Voltaggio,Kale Noodles with Chorizo Breadcrumbs and Parm...,"[Deselect All, 2 tablespoons extra-virgin oliv...",Intermediate,2 hr 30 min,https://www.foodnetwork.com/recipes/kale-noodl...
8114,Recipe courtesy of Michael Voltaggio,"Cauliflower Hummus, Cauliflower Pickles and Ca...","[Deselect All, Neutral oil, like canola, for f...",Intermediate,1 hr 40 min,https://www.foodnetwork.com/recipes/cauliflowe...
8115,Recipe courtesy of Michael Voltaggio,Seaweed Mashed Potatoes,"[Deselect All, 5 cups (2000 g) water, 1 sheet ...",Intermediate,1 hr 50 min,https://www.foodnetwork.com/recipes/seaweed-ma...
8116,Recipe courtesy of Michael Voltaggio,Hibiscus Whiskey Sour,"[Deselect All, 2 cups dried hibiscus flowers, ...",Easy,1 hr 10 min,https://www.foodnetwork.com/recipes/hibiscus-w...
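
A minimal sketch of what `literal_eval` does to a single value. The string below is a toy example modelled on the 'Grilled Corn' row.

```python
from ast import literal_eval

# After read_csv, each ingredient list is a plain string (toy value)
raw = "['Deselect All', '8 ears of corn', 'Unsalted butter', 'Salt']"
print(type(raw).__name__)  # str

# literal_eval safely parses the text back into a Python list of strings
parsed = literal_eval(raw)
print(type(parsed).__name__, len(parsed))  # list 4
print(parsed[1])                           # 8 ears of corn
```

Unlike `eval`, `literal_eval` only accepts Python literals (strings, numbers, lists, and so on), which makes it a safer choice for text read from a file.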


#### 3.6.1 Remove 'Deselect All' (this appears as the first item in list of ingredients for every recipe)

In [12]:
recipe_df['Ingredients']=recipe_df['Ingredients'].apply(lambda x: x[1:] if isinstance(x, list) and len(x) > 1 and x[0] == 'Deselect All' else x)
recipe_df

Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe
0,Recipe courtesy of Bobby Flay,Bbq Chicken Cobb Salad,"[4 chicken thighs, bone in, 2 cups Mesa BBQ Sa...",Level of Difficulty Not Available,Cooking Time Not Available,https://www.foodnetwork.com/recipes/bobby-flay...
1,Recipe courtesy of Bobby Flay,"Breakfast Burritos with Mocha-Rubbed Steak, Gr...","[2 tablespoons canola oil, 1 medium Spanish on...",Intermediate,1 hr 40 min,https://www.foodnetwork.com/recipes/bobby-flay...
2,Recipe courtesy of Linda Hacker,Rajas Salsa,"[2 roasted red and yellow bell peppers, peeled...",Easy,30 min,https://www.foodnetwork.com/recipes/rajas-sals...
3,Recipe courtesy of Bobby Flay,Grilled Corn,"[8 ears of corn, Unsalted butter, Salt]",Easy,Cooking Time Not Available,https://www.foodnetwork.com/recipes/bobby-flay...
4,Recipe courtesy of Flip Cuddy,Korean-style BBQ Short Ribs,"[4 large short ribs, 2 scallions, chopped, 3 g...",Easy,2 hr 50 min,https://www.foodnetwork.com/recipes/korean-sty...
...,...,...,...,...,...,...
8113,Recipe courtesy of Michael Voltaggio,Kale Noodles with Chorizo Breadcrumbs and Parm...,"[2 tablespoons extra-virgin olive oil, 8 ounce...",Intermediate,2 hr 30 min,https://www.foodnetwork.com/recipes/kale-noodl...
8114,Recipe courtesy of Michael Voltaggio,"Cauliflower Hummus, Cauliflower Pickles and Ca...","[Neutral oil, like canola, for frying, 500 gra...",Intermediate,1 hr 40 min,https://www.foodnetwork.com/recipes/cauliflowe...
8115,Recipe courtesy of Michael Voltaggio,Seaweed Mashed Potatoes,"[5 cups (2000 g) water, 1 sheet (15 g) kombu ,...",Intermediate,1 hr 50 min,https://www.foodnetwork.com/recipes/seaweed-ma...
8116,Recipe courtesy of Michael Voltaggio,Hibiscus Whiskey Sour,"[2 cups dried hibiscus flowers, 2 ounces bourb...",Easy,1 hr 10 min,https://www.foodnetwork.com/recipes/hibiscus-w...


#### 3.6.2 Create New Columns 'Ingredients_Cleaned' & 'Number of Ingredients in the Recipe'
- 'Number of Ingredients in the Recipe' counts the number of ingredients in each recipe
- 'Ingredients_Cleaned' column will be used to perform cleaning techniques such as: removing punctuation, stopwords, etc.

In [15]:
recipe_df['Ingredients_Cleaned']=recipe_df['Ingredients']

# Create a Column to Determine the Number of Ingredients in this Recipe
recipe_df['Number of Ingredients in the Recipe']=recipe_df['Ingredients_Cleaned'].apply(lambda x: len(x) if isinstance(x, list) else None)
recipe_df

Unnamed: 0,Chef,Recipe Title,Ingredients,Level of Difficulty,Total Time Required,Link to Recipe,Ingredients_Cleaned,Number of Ingredients in the Recipe
0,Recipe courtesy of Bobby Flay,Bbq Chicken Cobb Salad,"[4 chicken thighs, bone in, 2 cups Mesa BBQ Sa...",Level of Difficulty Not Available,Cooking Time Not Available,https://www.foodnetwork.com/recipes/bobby-flay...,"[4 chicken thighs, bone in, 2 cups Mesa BBQ Sa...",30
1,Recipe courtesy of Bobby Flay,"Breakfast Burritos with Mocha-Rubbed Steak, Gr...","[2 tablespoons canola oil, 1 medium Spanish on...",Intermediate,1 hr 40 min,https://www.foodnetwork.com/recipes/bobby-flay...,"[2 tablespoons canola oil, 1 medium Spanish on...",38
2,Recipe courtesy of Linda Hacker,Rajas Salsa,"[2 roasted red and yellow bell peppers, peeled...",Easy,30 min,https://www.foodnetwork.com/recipes/rajas-sals...,"[2 roasted red and yellow bell peppers, peeled...",7
3,Recipe courtesy of Bobby Flay,Grilled Corn,"[8 ears of corn, Unsalted butter, Salt]",Easy,Cooking Time Not Available,https://www.foodnetwork.com/recipes/bobby-flay...,"[8 ears of corn, Unsalted butter, Salt]",3
4,Recipe courtesy of Flip Cuddy,Korean-style BBQ Short Ribs,"[4 large short ribs, 2 scallions, chopped, 3 g...",Easy,2 hr 50 min,https://www.foodnetwork.com/recipes/korean-sty...,"[4 large short ribs, 2 scallions, chopped, 3 g...",23
...,...,...,...,...,...,...,...,...
8113,Recipe courtesy of Michael Voltaggio,Kale Noodles with Chorizo Breadcrumbs and Parm...,"[2 tablespoons extra-virgin olive oil, 8 ounce...",Intermediate,2 hr 30 min,https://www.foodnetwork.com/recipes/kale-noodl...,"[2 tablespoons extra-virgin olive oil, 8 ounce...",18
8114,Recipe courtesy of Michael Voltaggio,"Cauliflower Hummus, Cauliflower Pickles and Ca...","[Neutral oil, like canola, for frying, 500 gra...",Intermediate,1 hr 40 min,https://www.foodnetwork.com/recipes/cauliflowe...,"[Neutral oil, like canola, for frying, 500 gra...",23
8115,Recipe courtesy of Michael Voltaggio,Seaweed Mashed Potatoes,"[5 cups (2000 g) water, 1 sheet (15 g) kombu ,...",Intermediate,1 hr 50 min,https://www.foodnetwork.com/recipes/seaweed-ma...,"[5 cups (2000 g) water, 1 sheet (15 g) kombu ,...",10
8116,Recipe courtesy of Michael Voltaggio,Hibiscus Whiskey Sour,"[2 cups dried hibiscus flowers, 2 ounces bourb...",Easy,1 hr 10 min,https://www.foodnetwork.com/recipes/hibiscus-w...,"[2 cups dried hibiscus flowers, 2 ounces bourb...",5


In [107]:
#SAVE DF AS A CSV FILE AS A BACKUP (if required)
recipe_df.to_csv('recipes_clean.csv', index=False)
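
Note that a CSV cell can only hold text, so each ingredient list is effectively saved as its string representation and comes back from `pd.read_csv` as a string rather than a list (which is why the `.str` methods in the next section apply). A minimal sketch of that round trip, using only the standard library and a made-up ingredient list; `literal_eval` is the function imported at the top of this notebook:

```python
from ast import literal_eval

# Hypothetical ingredient list, standing in for one cell of recipe_df['Ingredients']
ingredients = ['8 ears of corn', 'Unsalted butter', 'Salt']

# What ends up in the CSV cell: the list written out as text
as_text = str(ingredients)

# literal_eval parses that text back into a real Python list
restored = literal_eval(as_text)

print(type(as_text).__name__, type(restored).__name__)
print(len(as_text), len(restored))  # characters in the text vs. items in the list
```

If a list-based step (such as counting ingredients) is needed after reloading, applying `literal_eval` to the column first is one way to recover the lists.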

#### 3.6.3 Cleaning the 'Ingredients_Cleaned' Column 
NLP text-cleaning methods are used to simplify the ingredients:

- Remove numbers
- Remove punctuation
- Convert all text to lower case
- Remove stopwords using NLTK's English stopword list
- Remove cooking-related measurements (ex. tablespoon(s), tbsp(s), cup(s), teaspoon(s), tsp(s), pound(s), slice(s), pinch(es), ounce(s), packet(s), zest, etc.)
- Remove cooking actions (ex. chopped, julienned, seeded, sliced, cored, peeled, toasted, etc.)
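
The steps above can be sketched on a single ingredient string as follows. This is only an illustration: the three word sets are tiny made-up samples (the cells below build much longer lists on top of NLTK's English stopwords), and `simplify` is a hypothetical helper name, not a function used elsewhere in this notebook.

```python
import re

# Tiny illustrative word sets (assumptions, not the notebook's full lists)
ENGLISH_STOPWORDS = {"and", "of", "or"}
MEASUREMENTS = {"tablespoon", "tablespoons", "cup", "cups", "medium"}
ACTIONS = {"coarsely", "chopped", "peeled"}

def simplify(ingredient: str) -> str:
    """Apply the cleaning steps in order to a single ingredient string."""
    text = re.sub(r"\d+", "", ingredient)      # remove numbers
    text = re.sub(r"[^a-zA-Z\s]", "", text)    # remove punctuation
    text = text.lower()                        # convert to lower case
    unwanted = ENGLISH_STOPWORDS | MEASUREMENTS | ACTIONS
    # keep only the words that are not stopwords, measurements or actions
    return " ".join(word for word in text.split() if word not in unwanted)

print(simplify("1 medium Spanish onion, coarsely chopped"))  # spanish onion
print(simplify("2 tablespoons canola oil"))                  # canola oil
```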

In [26]:
df=pd.read_csv('recipes_clean.csv')

In [27]:
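# Remove the literal text '\xa0' (an escaped non-breaking space) from each string
# regex=False makes the raw string match those four characters exactly, whatever the pandas version
df["Ingredients"]=df["Ingredients"].str.replace(r'\xa0', '', regex=False)
df["Ingredients_Cleaned"]=df["Ingredients_Cleaned"].str.replace(r'\xa0', '', regex=False)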
#df["Ingredients_Cleaned"]=df["Ingredients_Cleaned"].str.replace(r'\xa', '')
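
The `\xa0` text presumably appears because each list was stored as its string representation, in which Python escapes the non-breaking space character (U+00A0). A small sketch with a made-up ingredient, showing the escaped form and removing it with a plain (non-regex) replacement:

```python
# Hypothetical ingredient containing a real non-breaking space (U+00A0)
ingredients = ['1\xa0cup sugar']

# str() of a list uses repr() for each element, which escapes U+00A0
as_text = str(ingredients)
print(as_text)

# The text now holds a backslash followed by 'xa0' (four characters),
# which the raw-string literal below matches exactly
cleaned = as_text.replace(r'\xa0', '')
print(cleaned)
```

Replacing with an empty string joins the neighbouring words; substituting a single space instead would keep them apart, if that matters for later steps.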

In [29]:
# Remove Numbers (raw string so that \d is passed to the regex engine unchanged)
df["Ingredients_Cleaned"]=df["Ingredients_Cleaned"].str.replace(r'\d+', '',regex=True)

In [30]:
# Convert All Text to Lower Case
df["Ingredients_Cleaned"]=df["Ingredients_Cleaned"].str.lower()

In [31]:
#Remove Different Type of Punctuation/Characters
def punctuation(text):
    cleaned_text = re.sub(r'[^a-zA-Z0-9\s]', '', text)
    return cleaned_text

df["Ingredients_Cleaned"]=df["Ingredients_Cleaned"].apply(punctuation)

In [32]:
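# Remove Stopwords, Cooking Measurements & Cooking Terminology (created my own lists)
# The module is imported under an alias so that the list below does not shadow it
from nltk.corpus import stopwords as nltk_stopwords
nltk.download('stopwords')
stopwords=nltk_stopwords.words('english')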

measurement_stopwords=['tablespoon','tablespoons','tbsp','tbsps',"teaspoon","teaspoons","tsp","tsps",
                       "pound","pounds",'lbs',"pinch","pinches","ounce","ounces","packet","packets",
                       "pinch","pinches","pint","pints",'pt','zest','cup','cups','oz','slice','slices',
                       'quart','qt','one','two','three','four','five','six','seven','eight','nine','ten',
                       'inch','inches',"inchthick",'small','medium','large','dozen',"half","halves","eighteen"]

cooking_terminology=["ground","room","temperature","recipe","cube","cubes","finely",'cut',
                    "chopped","fresh","roughly","torn","crumble","crumbled","peel","peeled",
                     "remove","skinned","removed","thin","thinly","sliced","thick","quarter",
                     "quartered","dice","diced",'clean','cleaned',"coarse","coarsely","coarsed",
                     "available","specialty","markets","market","seed","seeded","char","charred",
                     "crush","crushed","approximately","approximate","fresh","chiffonade","freshly",
                     "follow","follows","preferably",'segmented','serving','end','trimmed','halved',
                     'prepare','prepared','toasted','coarse','coarsely','smash','smashed',"unpeel",
                     "unpeeled","style","mince","minced","julienned","recommend","recommended",
                     "optional","drain","drained","lightly","pitted","grilled","needed","cold",
                     "warm","water","chill","chilled","plus","organic","shred","shredded","slivered",
                    "season","seasoned","pureed","extravirgin","seasoned","garnish","grated","zest","zester","good",
                    "squeeze","squeezed","using","diagonally","diagonal","extra","mortar","pestle",
                    "sweet","slight","slightly","soften","softened","sautee","sauteed","grill","grilled",
                    "roast","roasted","highquality","patted","dry"]

basic_ingredients=["salt","pepper"]

stopwords.extend(measurement_stopwords)
stopwords.extend(cooking_terminology)
stopwords.extend(basic_ingredients)
# A set makes the membership test below much faster than scanning the list
stopword_set=set(stopwords)

df["Ingredients_Cleaned"]=df["Ingredients_Cleaned"].apply(lambda x: ' '.join([word for word in x.split() if word not in stopword_set]))

[nltk_data] Downloading package stopwords to C:\Users\Alicia
[nltk_data]     Ionata\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [33]:
# Lemmatizing (reduce each word to its base form)
# Requires the NLTK 'wordnet' corpus and the tokenizer models used by word_tokenize
from nltk.stem import WordNetLemmatizer
lemmatizer=WordNetLemmatizer()

def lemmatize_text(text):
    # Split the cleaned string into tokens, lemmatize each one, then rejoin
    tokens=nltk.word_tokenize(text)
    return ' '.join([lemmatizer.lemmatize(word) for word in tokens])

df["Ingredients_Cleaned"]=df["Ingredients_Cleaned"].apply(lemmatize_text)

In [34]:
print(df.iloc[1]['Ingredients'])
print('\n')
print(df.iloc[1]['Ingredients_Cleaned'])

['2 tablespoons canola oil', '1 medium Spanish onion, coarsely chopped', '3 cloves garlic, coarsely chopped', '1 1/2 cups ketchup', '1 cup chocolate stout beer', '2 tablespoons dark brown sugar', '1 tablespoon molasses', '1 heaping tablespoon Dijon mustard', '1 tablespoon red wine vinegar', '1 tablespoon Worcestershire sauce', '1 canned chipotle chile in adobo sauce, chopped', 'Kosher salt and freshly ground black pepper', '2 tablespoons ancho chile powder', '2 tablespoons high-quality Dutch processed cocoa powder, such as Valrhona or Callebaut', '1 tablespoon paprika', '1/4 teaspoon chile de arbol or cayenne chile powder', '1 ounce bittersweet chocolate, chopped', '3 tablespoons ancho chile powder', '1 tablespoon high-quality Dutch processed cocoa powder, such as Valrhona or Callebaut', '1 tablespoon kosher salt, plus more for seasoning', '1 tablespoon ground espresso', '2 teaspoons ground celery seed', '2 teaspoons ground coriander', '2 teaspoons dried oregano', '2 teaspoons Spanish 

In [35]:
# Save a copy of the dataframe after the preliminary cleaning
df.to_csv('recipes_clean_v2.csv', index=False)