# Epicurious (Pagination, scraping 1x per row)
## Andrew Williams

https://www.epicurious.com/search/cucumbers - or whatever other search term

Remember, we're scraping multiple pages of search results, so the URL will be different for each page!

Scrape 10 pages of cucumber search results, and save as a CSV file. Include the following fields:

- Tag/category
- Title
- Summary
- Rating (We'll only want the 2, not the 2 / 4)
- Would make again
- Link/URL
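If the rating text comes back in the "2 / 4" form, one way to keep only the first number is to split on the slash. This is a small sketch that assumes the raw string looks like `2 / 4` or is already a bare number; `rating_number` is a made-up helper name, not something from the site or from BeautifulSoup.

```python
def rating_number(raw):
    """Return the leading number from a rating string such as '2 / 4'."""
    # Everything before the first slash is the score itself
    first = raw.split("/")[0].strip()
    return float(first)

print(rating_number("2 / 4"))
print(rating_number("2.5 / 4"))
print(rating_number("4"))
```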

Tip: You'll need to use try/except on some of these fields, since not every search result has all of them
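As a rough illustration of why try/except is needed: `.find()` returns `None` when nothing matches, and asking `None` for `.text` raises an `AttributeError`. The sketch here wraps that in a small helper. The HTML snippet and the helper name `text_or_none` are invented for the example; only the class names `hed`, `dek` and `tag` come from the page being scraped.

```python
from bs4 import BeautifulSoup

# Tiny stand-in for one search result (made-up markup, no "tag" element)
html = '<article><h4 class="hed">Lemon Cake</h4><p class="dek">A bright cake.</p></article>'
doc = BeautifulSoup(html, "html.parser")
article = doc.find("article")

def text_or_none(parent, **kwargs):
    """Return the stripped text of the first match, or None if nothing matches."""
    try:
        return parent.find(**kwargs).text.strip()
    except AttributeError:
        # .find() returned None, so there is no .text to read
        return None

title = text_or_none(article, class_="hed")
tag = text_or_none(article, class_="tag")
print(title, tag)
```

Catching only `AttributeError` means other problems, such as a typo in a variable name, still show up as errors instead of being silently skipped.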


In [1]:
import pandas as pd
import requests
from bs4 import BeautifulSoup

In [2]:
response = requests.get('https://www.epicurious.com/search/lemon')
doc = BeautifulSoup(response.text, 'html.parser')  # name the parser so bs4 doesn't have to guess
#print(doc.prettify())

In [3]:
lemons = doc.find_all('article')

In [4]:
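rows = []
for lemon in lemons:
    row = {}
    # .find() returns None when a field is missing, so .text raises AttributeError
    try:
        row['Tag'] = lemon.find(class_="tag").text.strip()
    except AttributeError:
        pass
    try:
        row['Title'] = lemon.find(class_="hed").text.strip()
    except AttributeError:
        pass
    try:
        row['Summary'] = lemon.find(class_="dek").text.strip()
    except AttributeError:
        pass
    try:
        row['Rating'] = lemon.find(itemprop="ratingValue").text.strip()
    except AttributeError:
        pass
    try:
        row['Again?'] = lemon.find(class_="make-again-percentage").text.strip()
    except AttributeError:
        pass
    try:
        href = lemon.find('a')['href']
    except (TypeError, KeyError):
        # no link in this article, or a link without an href
        href = None
    # skip the user-agreement box, which is not a search result
    if href == 'https://www.condenast.com/user-agreement/':
        continue
    if href is not None:
        row['URL'] = href
    rows.append(row)
rows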

[{'Tag': 'recipe',
  'Title': 'East 62nd Street Lemon Cake',
  'Summary': 'The zest of two whole lemons goes into this cake batter, giving it a real punch of citrus flavor, accented by the drippy lemon glaze that gets drizzled overtop.',
  'Rating': '4',
  'Again?': '100%',
  'URL': '/recipes/food/views/east-62nd-street-lemon-cake'},
 {'Tag': 'recipe',
  'Title': 'Lemon-Pepper Salami Bites',
  'Summary': 'Bright citrus and fresh-ground pepper put a festive spin on two party staples: cured meat and Marcona almonds.',
  'Rating': '0',
  'Again?': '0%',
  'URL': '/recipes/food/views/lemon-pepper-salami-bites'},
 {'Tag': 'recipe',
  'Title': 'Herb-Infused Lemon-Strawberry Loaf',
  'Summary': 'Fresh herbs infuse this buttery lemon cake with floral flavor that beautifully complements swirls of sweet strawberry jam.',
  'Rating': '3',
  'Again?': '85%',
  'URL': '/recipes/food/views/herb-infused-lemon-strawberry-loaf'},
 {'Tag': 'recipe',
  'Title': 'Chicken with Lemon and Spicy Spring Onions

## Making URLs + Looping

In [3]:
for page_num in range(1, 11): # range stops before the end value, so this gives pages 1-10
    url = f"https://www.epicurious.com/search/lemon?page={page_num}"
    print(url)

https://www.epicurious.com/search/lemon?page=1
https://www.epicurious.com/search/lemon?page=2
https://www.epicurious.com/search/lemon?page=3
https://www.epicurious.com/search/lemon?page=4
https://www.epicurious.com/search/lemon?page=5
https://www.epicurious.com/search/lemon?page=6
https://www.epicurious.com/search/lemon?page=7
https://www.epicurious.com/search/lemon?page=8
https://www.epicurious.com/search/lemon?page=9
https://www.epicurious.com/search/lemon?page=10
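The same URLs can also be built with the standard library, which might help if the search term ever contains spaces or other characters that need escaping. This is only a sketch: `search_url` is a hypothetical helper, and it assumes the site accepts a percent-encoded term in the path the same way it accepts a plain one.

```python
from urllib.parse import quote, urlencode

def search_url(term, page_num):
    """Build a search results URL for one term and one page number."""
    base = f"https://www.epicurious.com/search/{quote(term)}"
    return f"{base}?{urlencode({'page': page_num})}"

urls = [search_url("lemon", n) for n in range(1, 11)]
print(urls[0])
print(urls[-1])
```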


In [131]:
# make empty list first 
print("Making empty list")
rows = []

for page_num in range(1, 11): # 1-10
    url = f"https://www.epicurious.com/search/lemon?page={page_num}"
    print("Now scraping", url)
    
    #Download the appropriate page
    response = requests.get(url)
    doc = BeautifulSoup(response.text, 'html.parser')
    
    #Scraping
    lemons = doc.find_all('article')
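    for lemon in lemons:
        row = {}
        # .find() returns None when a field is missing, so .text raises AttributeError
        try:
            row['Tag'] = lemon.find(class_="tag").text.strip()
        except AttributeError:
            pass
        try:
            row['Title'] = lemon.find(class_="hed").text.strip()
        except AttributeError:
            pass
        try:
            row['Summary'] = lemon.find(class_="dek").text.strip()
        except AttributeError:
            pass
        try:
            row['Rating'] = lemon.find(itemprop="ratingValue").text.strip()
        except AttributeError:
            pass
        try:
            row['Again?'] = lemon.find(class_="make-again-percentage").text.strip()
        except AttributeError:
            pass
        try:
            href = lemon.find('a')['href']
        except (TypeError, KeyError):
            # no link in this article, or a link without an href
            href = None
        # skip the user-agreement box, which is not a search result
        if href == 'https://www.condenast.com/user-agreement/':
            continue
        if href is not None:
            row['URL'] = href
        rows.append(row)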
    
# Don't create dataframe until end
print("Building Pandas DataFrame")
df = pd.DataFrame(rows)

Making empty list
Now scraping https://www.epicurious.com/search/lemon?page=1
Now scraping https://www.epicurious.com/search/lemon?page=2
Now scraping https://www.epicurious.com/search/lemon?page=3
Now scraping https://www.epicurious.com/search/lemon?page=4
Now scraping https://www.epicurious.com/search/lemon?page=5
Now scraping https://www.epicurious.com/search/lemon?page=6
Now scraping https://www.epicurious.com/search/lemon?page=7
Now scraping https://www.epicurious.com/search/lemon?page=8
Now scraping https://www.epicurious.com/search/lemon?page=9
Now scraping https://www.epicurious.com/search/lemon?page=10
Building Pandas DataFrame
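When looping over many pages it can be kinder to the site to pause between requests and to skip pages that fail to download. This is a sketch of that loop shape, not the code that was actually run: `scrape_pages`, `fake_fetch` and `fake_parse` are invented names, and the fake functions stand in for the real download and parsing steps so the example runs without touching the network.

```python
import time

def scrape_pages(fetch, parse, page_nums, delay=1.0):
    """Fetch each page, parse it into rows, and pause between requests."""
    rows = []
    for page_num in page_nums:
        html = fetch(page_num)
        if html is None:
            # Skip pages that failed to download
            continue
        rows.extend(parse(html))
        time.sleep(delay)
    return rows

# Stand-in functions (assumptions for the demo): page 2 pretends to fail
def fake_fetch(page_num):
    return None if page_num == 2 else f"page {page_num}"

def fake_parse(html):
    return [{"Title": html}]

rows = scrape_pages(fake_fetch, fake_parse, range(1, 4), delay=0)
print(rows)
```

In real use, `fetch` could wrap `requests.get` and return `response.text` only when `response.ok` is true.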


In [132]:
df.head()

| | Again? | Rating | Summary | Tag | Title | URL |
|---|---|---|---|---|---|---|
| 0 | 100% | 4 | The zest of two whole lemons goes into this ca... | recipe | East 62nd Street Lemon Cake | /recipes/food/views/east-62nd-street-lemon-cake |
| 1 | 0% | 0 | Bright citrus and fresh-ground pepper put a fe... | recipe | Lemon-Pepper Salami Bites | /recipes/food/views/lemon-pepper-salami-bites |
| 2 | 85% | 3 | Fresh herbs infuse this buttery lemon cake wit... | recipe | Herb-Infused Lemon-Strawberry Loaf | /recipes/food/views/herb-infused-lemon-strawbe... |
| 3 | 0% | 0 | Roasting two chickens on one baking sheet is a... | recipe | Chicken with Lemon and Spicy Spring Onions | /recipes/food/views/chicken-with-lemon-and-spi... |
| 4 | 0% | 0 | Whole wheat phyllo dough makes a crispy crust ... | recipe | Potato Tart with Mustard Greens and Thyme | /recipes/food/views/potato-tart-with-mustard-g... |


In [26]:
df.shape

(180, 6)
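The scraped `Rating` and `Again?` values are still strings at this point. If numbers are wanted for sorting or filtering, something like the following could convert them. It runs on a small made-up sample (with one missing row) instead of the real `df`, and `errors="coerce"` turns anything unparseable into `NaN`.

```python
import pandas as pd

# Made-up sample in the same shape as the scraped columns
sample = pd.DataFrame({
    "Rating": ["4", "0", "2.5", None],
    "Again?": ["100%", "0%", "67%", None],
})

# Convert rating strings to floats; missing values become NaN
sample["Rating"] = pd.to_numeric(sample["Rating"], errors="coerce")

# Drop the trailing percent sign before converting
sample["Again?"] = pd.to_numeric(sample["Again?"].str.rstrip("%"), errors="coerce")

print(sample.dtypes)
```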

In [135]:
df.to_csv("Epicurious.csv", index=False)

## Epicurious, Part 2: Once-per-row scraping
Then open your search results CSV and filter for ONLY recipes. Scrape the following fields from each recipe page, merge them with your original recipes, and save as a new CSV file:

- Ingredients
- Directions
- Tags

Tip: If you use .find for your directions/ingredients, it'll print them all on one line. But if you use .find_all to separate them, it makes your life a lot harder! ...unless you just steal this code:
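One possible way to keep the items separate is to collect them with `.find_all` and join them with newlines, or to let `get_text` insert the separator. This is a sketch on made-up HTML: the `<li class="ingredient">` markup is an assumption about what sits inside `ingredient-groups`, so the tag name may need adjusting for the real page.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the ingredients block on a recipe page
html = ('<ul class="ingredient-groups">'
        '<li class="ingredient">3 cups flour</li>'
        '<li class="ingredient">2 lemons</li>'
        '</ul>')
doc = BeautifulSoup(html, "html.parser")
group = doc.find(class_="ingredient-groups")

# .text runs every item together
one_line = group.text.strip()

# Option 1: find each item, then join with newlines
items = [li.text.strip() for li in group.find_all("li")]
separated = "\n".join(items)

# Option 2: ask get_text to put a newline between pieces of text
with_separator = group.get_text(separator="\n", strip=True)

print(one_line)
print(separated)
```

Either way the result is still a single string, so it fits in one DataFrame cell.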

In [12]:
df = pd.read_csv("Epicurious.csv")
df.shape

(180, 6)

In [13]:
dfr = df[df.Tag == "recipe"]
dfr.shape
dfr

| | Again? | Rating | Summary | Tag | Title | URL |
|---|---|---|---|---|---|---|
| 0 | 100% | 4.0 | The zest of two whole lemons goes into this ca... | recipe | East 62nd Street Lemon Cake | /recipes/food/views/east-62nd-street-lemon-cake |
| 1 | 0% | 0.0 | Bright citrus and fresh-ground pepper put a fe... | recipe | Lemon-Pepper Salami Bites | /recipes/food/views/lemon-pepper-salami-bites |
| 2 | 85% | 3.0 | Fresh herbs infuse this buttery lemon cake wit... | recipe | Herb-Infused Lemon-Strawberry Loaf | /recipes/food/views/herb-infused-lemon-strawbe... |
| 3 | 0% | 0.0 | Roasting two chickens on one baking sheet is a... | recipe | Chicken with Lemon and Spicy Spring Onions | /recipes/food/views/chicken-with-lemon-and-spi... |
| 4 | 0% | 0.0 | Whole wheat phyllo dough makes a crispy crust ... | recipe | Potato Tart with Mustard Greens and Thyme | /recipes/food/views/potato-tart-with-mustard-g... |
| 5 | 100% | 4.0 | Whatever fruit you have on hand—pears, berries... | recipe | Lemon Cake with Fruit | /recipes/food/views/lemon-cake-with-fruit |
| 6 | 0% | 0.0 | Earthy-sweet figs and bright lemon zest and ju... | recipe | Lemon and Fig Cupcakes | /recipes/food/views/lemon-and-fig-cupcakes |
| 7 | 0% | 0.0 | Cooking summer squash low and slow yields swee... | recipe | Slow-Cooked Summer Squash with Lemon and Thyme | /recipes/food/views/slow-cooked-summer-squash-... |
| 8 | 100% | 4.0 | How to pull off a weeknight roast chicken: set... | recipe | Roast Chicken with Bell Peppers, Lemon, and Thyme | /recipes/food/views/roast-chicken-with-bell-pe... |
| 9 | 67% | 2.5 | If you want to go with the same flavors and ba... | recipe | Grilled Lemon-Pepper Chicken | /recipes/food/views/grilled-lemon-pepper-chicken |


In [14]:
rows = []

url = "https://www.epicurious.com/recipes/food/views/east-62nd-street-lemon-cake"
print("Now scraping", url)

#Download the appropriate page
response = requests.get(url)
doc = BeautifulSoup(response.text, 'html.parser')

row = {}
row['Ingredients'] = doc.find(class_="ingredient-groups").text.strip()
row['Directions'] = doc.find(class_="preparation-groups").text.strip()
row['Tags'] = doc.find(class_="tags").text.strip()

Now scraping https://www.epicurious.com/recipes/food/views/east-62nd-street-lemon-cake
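The search results store relative links such as `/recipes/food/views/...`, so the site address has to be added before a recipe page can be requested. String concatenation works; `urljoin` from the standard library is another option, sketched here, and it also leaves already-absolute links alone.

```python
from urllib.parse import urljoin

BASE = "https://www.epicurious.com"

# A relative link from the search results gets the site address prepended
relative = "/recipes/food/views/east-62nd-street-lemon-cake"
full_url = urljoin(BASE, relative)
print(full_url)

# An absolute link is returned unchanged
absolute = urljoin(BASE, "https://www.condenast.com/user-agreement/")
print(absolute)
```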


In [15]:
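def scrape_page(row):
    # Build the full recipe URL from the relative link saved in the search results
    url = f"https://www.epicurious.com{row['URL']}"
    print("Scraping:", url)
    
    response = requests.get(url)
    doc = BeautifulSoup(response.text, 'html.parser')

    # Use a new name so the incoming row isn't overwritten
    scraped = {}
    scraped['Ingredients'] = doc.find(class_="ingredient-groups").text.strip()
    scraped['Directions'] = doc.find(class_="preparation-groups").text.strip()
    scraped['Tags'] = doc.find(class_="tags").text.strip()
    
    # Returning a Series makes .apply build one column per key
    return pd.Series(scraped)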
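If one recipe page is missing a section, a single failed `.find` would stop the whole `.apply` run. A more forgiving variant might separate the parsing from the download and fill in `None` for anything not found. This is a sketch with an invented helper name (`parse_recipe`) and a made-up HTML sample; the three class names are the ones used in this notebook.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Column name -> class name on the recipe page
FIELDS = {
    "Ingredients": "ingredient-groups",
    "Directions": "preparation-groups",
    "Tags": "tags",
}

def parse_recipe(html):
    """Pull the three fields out of a recipe page, using None for missing ones."""
    doc = BeautifulSoup(html, "html.parser")
    data = {}
    for name, css_class in FIELDS.items():
        element = doc.find(class_=css_class)
        data[name] = element.text.strip() if element is not None else None
    return pd.Series(data)

# Made-up page with no directions section
sample = '<div class="ingredient-groups">2 lemons</div><div class="tags">Lemon</div>'
result = parse_recipe(sample)
print(result)
```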

In [16]:
scraped_df = dfr.apply(scrape_page, axis=1) #take df, go through every row, scrape page

Scraping: https://www.epicurious.com/recipes/food/views/east-62nd-street-lemon-cake
Scraping: https://www.epicurious.com/recipes/food/views/east-62nd-street-lemon-cake
Scraping: https://www.epicurious.com/recipes/food/views/lemon-pepper-salami-bites
Scraping: https://www.epicurious.com/recipes/food/views/herb-infused-lemon-strawberry-loaf
Scraping: https://www.epicurious.com/recipes/food/views/chicken-with-lemon-and-spicy-spring-onions
Scraping: https://www.epicurious.com/recipes/food/views/potato-tart-with-mustard-greens-and-lemon-thyme
Scraping: https://www.epicurious.com/recipes/food/views/lemon-cake-with-fruit
Scraping: https://www.epicurious.com/recipes/food/views/lemon-and-fig-cupcakes
Scraping: https://www.epicurious.com/recipes/food/views/slow-cooked-summer-squash-with-lemon-and-thyme
Scraping: https://www.epicurious.com/recipes/food/views/roast-chicken-with-bell-peppers-lemon-and-thyme
Scraping: https://www.epicurious.com/recipes/food/views/grilled-lemon-pepper-chicken
...

Scraping: https://www.epicurious.com/recipes/food/views/filipino-chicken-barbecue-skewers-inihaw-na-manok
Scraping: https://www.epicurious.com/recipes/food/views/cold-roast-salmon-with-smashed-green-bean-salad
Scraping: https://www.epicurious.com/recipes/food/views/crispy-fish-sandwich
Scraping: https://www.epicurious.com/recipes/food/views/cold-toddy
Scraping: https://www.epicurious.com/recipes/food/views/campari-rose-spritz-cocktail
Scraping: https://www.epicurious.com/recipes/food/views/cherry-bourbon-soda-can-cocktail
Scraping: https://www.epicurious.com/recipes/food/views/grain-salad-with-olives-and-whole-lemon-vinaigrette
Scraping: https://www.epicurious.com/recipes/food/views/easy-canned-chickpea-hummus
Scraping: https://www.epicurious.com/recipes/food/views/sheet-pan-curry-pork-chops-and-sweet-potatoes
Scraping: https://www.epicurious.com/recipes/food/views/sheet-pan-old-bay-trout-and-succotash
Scraping: https://www.epicurious.com/recipes/food/views/winter-of-our-content-apple-

In [17]:
scraped_df.head()

| | Ingredients | Directions | Tags |
|---|---|---|---|
| 0 | Cake:3 cups sifted unbleached all-purpose flou... | For the cake: ... | Mother's DaySpringCakeBakeLemonSoy FreePeanut ... |
| 1 | 2 lb. thinly sliced salamiHandfuls of Marcona ... | Layer salami with some almonds in a shallow bo... | Bon AppétitAppetizerHors D'OeuvrePorkMeatAlmon... |
| 2 | 3/4 cup (1 1/2 sticks) unsalted butter, plus m... | Preheat oven to 350°F. Butter an 8 1/2x4 1/2" ... | Mother's DayBrunchDessertBakeCakeButterThymeRo... |
| 3 | 1 lemon2 (3 1/2–4-lb.) chickensKosher salt, fr... | Preheat oven to 350°F. Very thinly slice 1 lem... | Bon AppétitDinnerChickenSpringGreen Onion/Scal... |
| 4 | 4 sheets whole wheat phyllo dough1 large Yukon... | Preheat the oven to 375°F. Line a 10-inch tart... | HarperCollinsTartBreakfastBrunchSpringEasterPo... |


In [18]:
scraped_df.shape

(141, 3)

In [19]:
recipes = dfr.merge(scraped_df, left_index=True, right_index=True)
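Merging on the index works here because `.apply` keeps the original row labels, so each scraped row lines up with the search result it came from even after non-recipes were filtered out. A toy version with made-up data and gaps in the index shows the idea; `.join` is shown as an equivalent shortcut for this particular case.

```python
import pandas as pd

# Made-up frames sharing the same (non-consecutive) index labels
left = pd.DataFrame({"Title": ["A", "B", "C"]}, index=[0, 2, 5])
right = pd.DataFrame({"Tags": ["x", "y", "z"]}, index=[0, 2, 5])

# Match rows by index label on both sides
merged = left.merge(right, left_index=True, right_index=True)

# .join matches on the index by default
joined = left.join(right)

print(merged)
```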

In [20]:
recipes.head()

| | Again? | Rating | Summary | Tag | Title | URL | Ingredients | Directions | Tags |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 100% | 4.0 | The zest of two whole lemons goes into this ca... | recipe | East 62nd Street Lemon Cake | /recipes/food/views/east-62nd-street-lemon-cake | Cake:3 cups sifted unbleached all-purpose flou... | For the cake: ... | Mother's DaySpringCakeBakeLemonSoy FreePeanut ... |
| 1 | 0% | 0.0 | Bright citrus and fresh-ground pepper put a fe... | recipe | Lemon-Pepper Salami Bites | /recipes/food/views/lemon-pepper-salami-bites | 2 lb. thinly sliced salamiHandfuls of Marcona ... | Layer salami with some almonds in a shallow bo... | Bon AppétitAppetizerHors D'OeuvrePorkMeatAlmon... |
| 2 | 85% | 3.0 | Fresh herbs infuse this buttery lemon cake wit... | recipe | Herb-Infused Lemon-Strawberry Loaf | /recipes/food/views/herb-infused-lemon-strawbe... | 3/4 cup (1 1/2 sticks) unsalted butter, plus m... | Preheat oven to 350°F. Butter an 8 1/2x4 1/2" ... | Mother's DayBrunchDessertBakeCakeButterThymeRo... |
| 3 | 0% | 0.0 | Roasting two chickens on one baking sheet is a... | recipe | Chicken with Lemon and Spicy Spring Onions | /recipes/food/views/chicken-with-lemon-and-spi... | 1 lemon2 (3 1/2–4-lb.) chickensKosher salt, fr... | Preheat oven to 350°F. Very thinly slice 1 lem... | Bon AppétitDinnerChickenSpringGreen Onion/Scal... |
| 4 | 0% | 0.0 | Whole wheat phyllo dough makes a crispy crust ... | recipe | Potato Tart with Mustard Greens and Thyme | /recipes/food/views/potato-tart-with-mustard-g... | 4 sheets whole wheat phyllo dough1 large Yukon... | Preheat the oven to 375°F. Line a 10-inch tart... | HarperCollinsTartBreakfastBrunchSpringEasterPo... |


In [21]:
recipes.to_csv("Epicurious-Recipes.csv", index=False)