## What is Web Scraping
### Web scraping, also called web harvesting or web data extraction, is the automated process of collecting large amounts of unstructured data from websites. You can extract all the data on a particular site or only the specific data you need. The collected data can then be stored in a structured format for further analysis.

## Steps involved in web scraping:

### 1. Find the URL of the webpage that you want to scrape
### 2. Select the particular elements by inspecting the page
### 3. Write the code to get the content of the selected elements
### 4. Store the data in the required format
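
The steps above can be sketched on a small inline HTML string before touching a live site. This is a minimal illustration, not the notebook's final scraper: `SAMPLE_HTML`, `extract_reviews`, and `to_csv_text` are hypothetical names, and the sample markup only assumes that each review sits in a `p` tag inside a `div` with class `rvw-bd`.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page
# (step 1 would fetch the real page from its URL).
SAMPLE_HTML = """
<html><body>
  <div class="rvw-bd"><p>Pizza arrived cold.</p></div>
  <div class="rvw-bd"><p>Friendly driver, quick delivery.</p></div>
  <div class="ad"><p>Not a review.</p></div>
</body></html>
"""


def extract_reviews(html):
    """Steps 2 and 3: select the review elements and read their text."""
    soup = BeautifulSoup(html, "html.parser")
    review_divs = soup.find_all("div", attrs={"class": "rvw-bd"})
    # Keep only divs that actually contain a p tag.
    return [div.find("p").get_text(strip=True) for div in review_divs if div.find("p")]


def to_csv_text(reviews):
    """Step 4: store the data in a structured format (CSV text here)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["review"])
    writer.writerows([review] for review in reviews)
    return buffer.getvalue()


reviews = extract_reviews(SAMPLE_HTML)
csv_text = to_csv_text(reviews)
print(csv_text)
```

Working on a fixed string first makes it easier to check the element selection before sending any HTTP requests.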

In [1]:
! pip install requests



In [2]:
! pip install BeautifulSoup4



In [3]:
# Importing libraries
import pandas as pd  # pandas: to create a DataFrame
import requests  # requests: to send HTTP requests and fetch the HTML content of the target webpage
from bs4 import BeautifulSoup as bs  # BeautifulSoup: a Python library for parsing HTML

In [4]:
base_url = "https://www.consumeraffairs.com/food/dominos.html"
all_pages_reviews =[]

In [11]:
def scraper():
    for i in range(1, 11):  # fetch reviews from pages 1 to 10
        pagewise_reviews = []  # reviews found on the current page
        # The query parameter "?page=<n>" selects which page of reviews to load.
        query_parameter = "?page=" + str(i)  # i is the page number
        url = base_url + query_parameter  # construct the URL
        response = requests.get(url)  # send the HTTP request and store the response
        soup = bs(response.content, 'html.parser')  # parse the HTML page into a soup object
        # Find all the div elements with the class name "rvw-bd"
        rev_div = soup.find_all("div", attrs={"class": "rvw-bd"})

        # Loop through the review divs and keep the text of the first p tag in each
        for div in rev_div:
            paragraph = div.find("p")
            if paragraph is not None:  # skip divs that contain no p tag
                pagewise_reviews.append(paragraph.text)

        # Append this page's reviews to the single list "all_pages_reviews"
        all_pages_reviews.extend(pagewise_reviews)
    return all_pages_reviews  # return the final list of reviews
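
The page URLs can also be built with the standard library instead of string concatenation, which takes care of escaping if a parameter value ever contains special characters. The sketch below is one possible approach, not the notebook's method; `build_page_url` is a hypothetical helper name, and it assumes the site keeps using a `page` query parameter.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.consumeraffairs.com/food/dominos.html"


def build_page_url(base_url, page):
    """Return base_url with a ?page=<n> query string appended."""
    return base_url + "?" + urlencode({"page": page})


# One URL per page, for pages 1 to 10.
page_urls = [build_page_url(BASE_URL, page) for page in range(1, 11)]
print(page_urls[0])
print(len(page_urls))
```

`requests.get` also accepts a `params` dictionary, for example `requests.get(BASE_URL, params={"page": 1})`, which builds the same kind of query string for you.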


In [12]:
# Call the function scraper() and store the output to a variable 'reviews'
# Driver code
reviews = scraper()
i = range(1, len(reviews)+1)
reviews_df = pd.DataFrame({'review':reviews}, index=i)
print(reviews_df)

                                               review
1   I ordered a pizza from local outlet on 4 April...
2   My daughter ordered a pizza online, the driver...
3   My daughter ordered a pizza online, the driver...
4   Dominos duped me. I gave correct name, address...
5   I’ve always ordered from Domino's but lately I...
6   Ordered pizza at 1:47 pm with a 23-33 minute d...
7   I ordered a pizza and a salad from the Domino'...
8   Be forthcoming with delivery times. I would ne...
9   I am a business owner that has grown up in the...
10  I called Domino's Pizza located at 1601 Hwy. 5...
11  I order in Domino's most often but today when ...
12  So first off I ordered my pizza without garlic...
13  Was Nov. 1st or 2nd this year... Ordered a 2 t...
14  I place a order online at the Ralph Ave store ...
15  The store in Millington TN. 38053. They are to...
16  My Experience – I had ordered 2 farmhouse Pizz...
17  This was at Marrickville 2204 store. The filth...
18  I placed a order and wai

In [13]:
reviews_df

                                               review
1   I ordered a pizza from local outlet on 4 April...
2   My daughter ordered a pizza online, the driver...
3   My daughter ordered a pizza online, the driver...
4   Dominos duped me. I gave correct name, address...
5   I’ve always ordered from Domino's but lately I...
6   Ordered pizza at 1:47 pm with a 23-33 minute d...
7   I ordered a pizza and a salad from the Domino'...
8   Be forthcoming with delivery times. I would ne...
9   I am a business owner that has grown up in the...
10  I called Domino's Pizza located at 1601 Hwy. 5...


In [15]:
# Writing the content of the data frame to a tab-separated text file
reviews_df.to_csv('reviews.txt', sep='\t')
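
To check that a tab-separated file like this can be loaded again, it can be read back with `pd.read_csv`. The sketch below uses a small hypothetical DataFrame (`sample_df`) and a temporary directory instead of the real scraped reviews. Passing `index_col=0` tells pandas to treat the first column as the index, so it does not come back as an extra `Unnamed: 0` column.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the scraped reviews.
sample_df = pd.DataFrame(
    {"review": ["Pizza arrived cold.", "Friendly driver, quick delivery."]},
    index=range(1, 3),
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "reviews.txt")
    sample_df.to_csv(path, sep="\t")  # tab-separated, index in the first column
    restored_df = pd.read_csv(path, sep="\t", index_col=0)  # first column becomes the index

print(restored_df)
```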