# Web Scraping Exercise

## 1. Introduction and Planning

### Objective:
The goal of this exercise is to build a web scraper that collects data from a chosen website. You will learn how to send HTTP requests, parse HTML content, extract relevant data, and store it in a structured format.

### Tasks:
1. Identify the data you want to scrape.
2. Choose the target website(s).
3. Plan the structure of your project.

### Example:
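For example, you could scrape job listings from Indeed.com and extract job titles, company names, locations, and job descriptions. The code in Section 3 applies the same steps to Allrecipes.com instead, extracting recipe titles, descriptions, ingredients, and preparation steps.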

## 2. Understanding the Target Website

### Objective:
Analyze the structure of the web pages to be scraped.

### Tasks:
* Inspect the target website using browser developer tools.
* Identify the HTML elements that contain the desired data.

### Instructions:

* Open your browser and navigate to the target website (e.g., Indeed.com).
* Right-click on the webpage and select "Inspect", or press Ctrl+Shift+I (Cmd+Option+I on macOS).
* Use the developer tools to explore the HTML structure of the webpage.
* Identify the tags and classes of the elements that contain the job titles, company names, locations, and descriptions.
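
Once the tags and classes are known, they translate fairly directly into BeautifulSoup selectors. The sketch below runs against a small inline HTML snippet rather than a live page; the markup and the `ingredients-list` class are hypothetical stand-ins, so the selectors would need adjusting to whatever the developer tools actually show.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modeled on a recipe page; real pages will differ.
SAMPLE_HTML = """
<html><body>
  <h1 class="article-heading type--lion">Warm Berry Compote</h1>
  <p class="article-subheading type--dog">A short description.</p>
  <ul class="ingredients-list">
    <li><p>6 cups frozen mixed berries</p></li>
    <li><p>1/2 cup white sugar</p></li>
  </ul>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# "h1.article-heading" matches any <h1> whose class list contains
# "article-heading", regardless of its other classes or their order.
title_tag = soup.select_one("h1.article-heading")
title = title_tag.get_text(strip=True) if title_tag is not None else None

description_tag = soup.select_one("p.article-subheading")
description = description_tag.get_text(strip=True) if description_tag is not None else None

# Descendant selector: every <p> nested inside the ingredients <ul>.
ingredients = [p.get_text(strip=True) for p in soup.select("ul.ingredients-list p")]

print(title)
print(description)
print(ingredients)
```

Selecting on a single stable class tends to survive small markup changes better than matching the full class attribute string.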

## 3. Writing the Scraper

### Objective:
Develop the code to scrape data from the target website.

### Tasks:
* Send HTTP requests to the target website.
* Parse the HTML content and extract the required data.
* Handle pagination to scrape data from multiple pages.
* Implement error handling.
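
The pagination and error-handling tasks can be sketched as two small helpers. This is one possible approach, not the method used in the cells that follow: `fetch_html`, `iter_pages`, the `a.next` selector, and the retry settings are all assumptions made for illustration, and a real site may paginate differently (for example with a page number in the query string).

```python
import time
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


def fetch_html(url, headers=None, retries=3, backoff=1.0, get=requests.get):
    """Return the page HTML, or None if every attempt fails."""
    for attempt in range(retries):
        try:
            response = get(url, headers=headers, timeout=10)
            response.raise_for_status()  # turn 4xx/5xx responses into exceptions
            return response.text
        except requests.RequestException:
            # Wait a little longer after each failure before retrying
            if attempt < retries - 1:
                time.sleep(backoff * (attempt + 1))
    return None


def iter_pages(start_url, fetch=fetch_html, max_pages=5, next_selector="a.next"):
    """Yield (url, soup) for each page, following the hypothetical "next" link."""
    url, seen = start_url, set()
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch(url)
        if html is None:
            break  # stop paginating when a page cannot be fetched
        soup = BeautifulSoup(html, "html.parser")
        yield url, soup
        next_link = soup.select_one(next_selector)
        if next_link is not None and next_link.has_attr("href"):
            url = urljoin(url, next_link["href"])  # resolve relative links
        else:
            url = None


# Example usage (performs real requests, so it is left commented out):
# for page_url, page_soup in iter_pages("https://example.com/list?page=1"):
#     print(page_url, len(page_soup.find_all("a")))
```

Passing the fetch function in as a parameter keeps the pagination logic testable without network access, and the `max_pages` cap together with the `seen` set guards against links that loop back on themselves.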

In [36]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://www.allrecipes.com/'
# url = 'https://www.allrecipes.com/recipes/17562/dinner/'

# Send browser-like headers; some sites reject the default python-requests User-Agent
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Accept-Encoding": "gzip, deflate",
    "Connection": "keep-alive",
    "Upgrade-Insecure-Requests": "1",
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
    "Cache-Control": "max-age=0",
}

# Send an HTTP GET request to the target URL; the timeout keeps the call from hanging indefinitely
response = requests.get(url, headers=headers, timeout=10)


In [37]:
response

<Response [200]>

In [38]:
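if response.status_code == 200:
    soup = BeautifulSoup(response.content, 'html.parser')

    # Each recipe card on the landing page is an <a> whose href points to a recipe.
    # A multi-class string passed to class_ must match the class attribute exactly.
    recipes_cards = soup.find_all('a', class_='comp mntl-card-list-items mntl-document-card mntl-card card--image-top card card--no-image')
    recipes_urls = [card['href'] for card in recipes_cards]

    recipes_data = []

    # Use a separate loop variable so the landing-page `url` is not overwritten
    for recipe_url in recipes_urls:
        recipe_page = requests.get(recipe_url, headers=headers, timeout=10)
        if recipe_page.status_code != 200:
            continue  # skip recipe pages that fail to load
        recipe_soup = BeautifulSoup(recipe_page.content, 'html.parser')

        # Stored as a bs4 Tag; call .get_text(strip=True) on it for a plain string
        recipe_title = recipe_soup.find('h1', class_='article-heading type--lion')

        # Ingredients: one <p> per ingredient inside the structured list
        recipe_ingredients_list = recipe_soup.find('ul', class_='mm-recipes-structured-ingredients__list')
        recipe_ingredients_elmnts = recipe_ingredients_list.find_all('p') if recipe_ingredients_list is not None else []
        recipe_ingredients = [[p.text] for p in recipe_ingredients_elmnts]

        # Steps: every <p> inside the ordered list of instructions
        recipes_steps_list = recipe_soup.find('ol', class_='comp mntl-sc-block mntl-sc-block-startgroup mntl-sc-block-group--OL')
        recipes_steps_elmnt = recipes_steps_list.find_all('p') if recipes_steps_list is not None else []
        recipes_steps = [[p.text] for p in recipes_steps_elmnt]

        # Description: a single <p> with no nested <p>, so read its text directly
        recipes_description_elmnt = recipe_soup.find('p', class_='article-subheading type--dog')
        recipes_description = [[recipes_description_elmnt.text]] if recipes_description_elmnt is not None else []

        recipes_data.append({
            'Title': recipe_title,
            'Description': recipes_description,
            'Ingredientes': recipe_ingredients,
            'Steps': recipes_steps
        })

    df = pd.DataFrame(recipes_data)
    print(df)

else:
    # Report the failing status code for the landing page
    print(response.status_code)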


                                        Title  \
0                  [Slow-Cooker Pepper Steak]   
1         [Slow Cooker Lime Cilantro Chicken]   
2                   [Slow Cooker Pulled Pork]   
3        [Chicken Enchilada Slow Cooker Soup]   
4     [Slow Cooker Texas Smoked Beef Brisket]   
5                        [Warm Berry Compote]   
6           [Chef John's Hot Water Cornbread]   
7                           [Baked Feta Eggs]   
8                     [Watermelon Margarita ]   
9         [Pennsylvania Coal Region Barbecue]   
10  [Easy Three Ingredient Raspberry Dessert]   
11                        [Peach Custard Pie]   

                                          Description  \
0   [[This crockpot pepper steak recipe is very te...   
1   [[This slow cooker cilantro-lime chicken is bu...   
2   [[Root beer and pulled pork might not seem lik...   
3   [[Make chicken enchilada soup in your crockpot...   
4   [[Cook brisket in a crockpot with this wonderf...   
5   [[This berry com

In [39]:
df

| | Title | Description | Ingredientes | Steps |
|---|---|---|---|---|
| 0 | [Slow-Cooker Pepper Steak] | [[This crockpot pepper steak recipe is very te... | [[2 pounds beef sirloin, cut into 2 inch strip... | [[ Gather all ingredients.\n], [Dotdash Meredi... |
| 1 | [Slow Cooker Lime Cilantro Chicken] | [[This slow cooker cilantro-lime chicken is bu... | [[1 (16 ounce) jar salsa], [1 (1.25 ounce) pac... | [[ Gather all ingredients.\n], [Dotdash Meredi... |
| 2 | [Slow Cooker Pulled Pork] | [[Root beer and pulled pork might not seem lik... | [[1 (2 pound) pork tenderloin], [1 (12 fluid o... | [[ Gather the ingredients.\n], [ALLRECIPES /JU... |
| 3 | [Chicken Enchilada Slow Cooker Soup] | [[Make chicken enchilada soup in your crockpot... | [[1 pound skinless, boneless chicken breast ha... | [[ Rinse chicken breasts and pat dry. Place ch... |
| 4 | [Slow Cooker Texas Smoked Beef Brisket] | [[Cook brisket in a crockpot with this wonderf... | [[3 tablespoons smoked paprika], [2 tablespoon... | [[ Gather all ingredients.\n], [Dotdash Meredi... |
| 5 | [Warm Berry Compote] | [[This berry compote is made in a slow cooker ... | [[6 cups frozen mixed berries], [½ cup white s... | [[ Stir frozen berries, sugar, orange juice, a... |
| 6 | [Chef John's Hot Water Cornbread] | [[This deep-fried hot water cornbread might be... | [[2 cups cornmeal], [1/4 cup all-purpose flour... | [[ Combine cornmeal, flour, sugar, salt, and b... |
| 7 | [Baked Feta Eggs] | [[These baked feta eggs are so very simple to ... | [[2 small tomatoes, diced (such as Campari®)],... | [[ Preheat the oven to 375 degrees F (190 degr... |
| 8 | [Watermelon Margarita ] | [[This watermelon margarita tastes amazing and... | [[½ cup white sugar], [½ cup water], [3 strips... | [[ To make a simple syrup: Bring 1/2 cup sugar... |
| 9 | [Pennsylvania Coal Region Barbecue] | [[Traditional sweet and sour Pennsylvania barb... | [[1 pound ground beef], [1 medium onion, chopp... | [[ Cook ground beef and onion in a large skill... |
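
The objective also calls for storing the data in a structured format. Because ingredients and steps are lists, a flat CSV file is an awkward fit; JSON Lines (one JSON object per line) is one option that keeps the lists intact. The sketch below uses only the standard library and hypothetical records; `write_jsonl` and `read_jsonl` are illustrative helper names, not part of pandas or the notebook above.

```python
import io
import json

# Hypothetical records in the same general shape as the scraper's dictionaries.
records = [
    {"Title": "Warm Berry Compote",
     "Ingredients": ["6 cups frozen mixed berries", "1/2 cup white sugar"]},
    {"Title": "Baked Feta Eggs",
     "Ingredients": ["2 small tomatoes, diced"]},
]


def write_jsonl(rows, fh):
    """Write one JSON object per line to an open text file handle."""
    for row in rows:
        fh.write(json.dumps(row, ensure_ascii=False) + "\n")


def read_jsonl(fh):
    """Read the records back, skipping blank lines."""
    return [json.loads(line) for line in fh if line.strip()]


# An in-memory buffer stands in for a file opened with
# open("recipes.jsonl", "w", encoding="utf-8").
buffer = io.StringIO()
write_jsonl(records, buffer)
buffer.seek(0)
restored = read_jsonl(buffer)
print(restored == records)
```

The values must be plain strings and lists for `json.dumps` to accept them, so any BeautifulSoup objects would need converting to text first.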


In [40]:
df['Description'][0]

[["This crockpot pepper steak recipe is very tender and flavorful and is one of our family's favorites. It's great to make ahead of time in the slow cooker and then serve over rice, egg noodles, or chow mein."]]
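
The doubly nested list above comes from wrapping each piece of text in its own list during extraction. One way to clean such values after the fact is a small recursive helper; `flatten_text` is a hypothetical name, and the sample steps below are made up for illustration.

```python
def flatten_text(value):
    """Collapse nested lists of strings into a flat list of stripped strings."""
    if isinstance(value, str):
        text = value.strip()
        return [text] if text else []  # drop whitespace-only entries
    flat = []
    for item in value:
        flat.extend(flatten_text(item))
    return flat


# Hypothetical sample in the same nested shape as the scraped columns
nested_steps = [[" Gather all ingredients.\n"], [" Preheat the oven.\n"], ["\n"]]
print(flatten_text(nested_steps))
# → ['Gather all ingredients.', 'Preheat the oven.']
```

Applied to a DataFrame column, something like `df['Steps'].apply(flatten_text)` should give one flat list per recipe, though extracting plain strings during scraping avoids the extra pass.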