### Web Scraping Tutorial
Web scraping involves three main steps:
1) Crawl => Reach the target website and download the response from the target URL by making an HTTP request.
2) Parse and Transform => Load the downloaded data into an HTML parser such as BeautifulSoup and extract the required data.
3) Store => Save the extracted data in a format such as JSON, CSV, or Excel, or write it directly to a database such as MongoDB.

### Step 1. Crawl

In [60]:
# import the required libraries
import requests as req
from bs4 import BeautifulSoup
import pandas as pd
import xlsxwriter  # Excel engine used by pd.ExcelWriter when saving the data

In [6]:
# target URL to scrape
url = "https://www.goibibo.com/hotels/hotels-in-shimla-ct/"

# request headers (a browser-like User-Agent string)
headers = {
    'User-Agent': "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36"
}

In [8]:
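# send a GET request to download the page (the 30-second timeout is an arbitrary choice)
response = req.get(url, headers=headers, timeout=30)
response.raise_for_status()  # stop early on a 4xx/5xx response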
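
Network requests can fail transiently or come back with an error status. The following is a minimal sketch of a fetch helper that retries a few times before giving up. The name `fetch_html` and the `retries`, `backoff`, and `timeout` values are assumptions made for illustration; they are not part of the original notebook.

```python
import time

import requests


def fetch_html(url, headers=None, session=None, retries=3, backoff=1.0, timeout=30):
    """Return the response body as text, retrying on network or HTTP errors.

    `fetch_html` and its default values are illustrative choices,
    not something the original notebook defines.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    session = session or requests.Session()
    last_error = None
    for attempt in range(retries):
        try:
            response = session.get(url, headers=headers, timeout=timeout)
            # raises requests.HTTPError for 4xx/5xx responses
            response.raise_for_status()
            return response.text
        except requests.RequestException as error:
            last_error = error
            # wait a little longer after each failed attempt, except the last
            if attempt < retries - 1:
                time.sleep(backoff * (attempt + 1))
    # every attempt failed: re-raise the most recent error
    raise last_error
```

With the `url` and `headers` defined earlier, a call would look like `html = fetch_html(url, headers=headers)`, and `html` could then be passed to `BeautifulSoup` in place of `response.text`.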

In [26]:
# parse the downloaded data
data = BeautifulSoup(response.text, 'html.parser')
print(data)


<!DOCTYPE html>

<html lang="en">
<head>
<script>
          var starttime = new Date();
        </script>
<title data-react-helmet="true">Hotels in Shimla Book from 816 hotels</title>
<meta content="#2d67b2" data-react-helmet="true" name="theme-color"/><meta content="122023101161980" data-react-helmet="true" property="fb:app_id"/><meta content="239522418693" data-react-helmet="true" property="fb:pages"/><meta content="l3rQIge7B2N_G1cQl0VZP0y7-nE" data-react-helmet="true" name="alexaVerifyID"/><meta charset="utf-8" data-react-helmet="true"/><meta content="width=device-width,initial-scale=1.0, maximum-scale=1.0, user-scalable=0" data-react-helmet="true" name="viewport"/><meta content="Book Hotels in Shimla at lowest Prices on Goibibo. Get Free Cancellation &amp; Instant Refund on 816  Shimla Hotels starting from  ₹1003. Use code GETSETGO to get discounts upto 30% off on Hotels in Shimla." data-react-helmet="true" name="description"/><meta content="Book Hotels in Shimla at lowest Prices 
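
Printing the whole document produces a lot of output. A lighter sanity check is to read one known element, such as the page title. The sketch below runs on a small inline snippet modeled on the output above rather than on the live page, so `sample_html` is an assumption and stands in for `response.text`.

```python
from bs4 import BeautifulSoup

# a tiny stand-in for response.text, modeled on the output above
sample_html = """
<!DOCTYPE html>
<html lang="en">
<head>
<title data-react-helmet="true">Hotels in Shimla Book from 816 hotels</title>
</head>
<body></body>
</html>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# soup.title is the first <title> tag, or None if the page has none
page_title = soup.title.get_text(strip=True) if soup.title is not None else None
print(page_title)
```

If the title is missing or unexpected, the request was probably blocked or redirected, and there is little point in parsing further.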

### Step 2. Parse And Transform

In [52]:
# find all div sections that carry the hotel-card class name
hotel_cards_data = data.find_all('div', attrs={
    'class':'HotelCardstyles__WrapperSectionMetaDiv-sc-1s80tyk-8'
})

# total number of cards
print(f"The total number of cards is {len(hotel_cards_data)}")

The total number of cards is 30
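
Class names such as `HotelCardstyles__WrapperSectionMetaDiv-sc-1s80tyk-8` look auto-generated, so the suffix may change when the site is rebuilt (this is an assumption; nothing in the notebook confirms it). A minimal sketch of matching on the stable-looking prefix with a CSS attribute selector follows. The sample markup is made up; only the class-name prefix is taken from the notebook.

```python
from bs4 import BeautifulSoup

# made-up markup; only the class-name prefix comes from the notebook
sample_html = """
<div class="HotelCardstyles__WrapperSectionMetaDiv-sc-1s80tyk-8">Card A</div>
<div class="HotelCardstyles__WrapperSectionMetaDiv-sc-zzzzzz-9">Card B</div>
<div class="SomethingElse-sc-1s80tyk-1">Not a card</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# [class*="..."] matches any div whose class attribute contains the substring
cards = soup.select('div[class*="HotelCardstyles__WrapperSectionMetaDiv"]')
card_texts = [card.get_text(strip=True) for card in cards]
print(card_texts)
```

The trade-off is that a substring match is looser than an exact class match, so it is worth checking the number of matched cards against what the page shows.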


### Step 3. Store the Data in JSON, CSV, Excel, or a Database

In [62]:
# list to store the scraped data
scraped_data = []

# extract the hotel name and price per room
for card in hotel_cards_data:

    # fetch the anchor that holds the hotel name
    hotel_name_anchor = card.find('a', attrs={
        'itemprop': 'name',
        'class': 'HotelCardstyles__HotelNameSeoAnchor-sc-1s80tyk-23'
    })
    hotel_name_span = hotel_name_anchor.find('span') if hotel_name_anchor is not None else None

    # fetch the element that holds the hotel price
    hotel_price_div = card.find('div', attrs={
        'class': 'HotelCardstyles__CurrentPriceTextWrapper-sc-1s80tyk-39'
    })
    hotel_price_p = hotel_price_div.find('p', attrs={
        'itemprop': 'priceRange',
        'class': 'HotelCardstyles__CurrentPrice-sc-1s80tyk-40'
    }) if hotel_price_div is not None else None

    # skip cards that are missing a name or a price instead of raising AttributeError
    if hotel_name_span is None or hotel_price_p is None:
        continue

    # store the hotel name and the price per room as one record
    scraped_data.append({
        'Hotel_Name': hotel_name_span.text.strip(),
        'PricePerRoom': hotel_price_p.text.strip()
    })


# create a dataframe from the list of dictionaries
dataframe = pd.DataFrame(scraped_data)

# save the data to an Excel file
with pd.ExcelWriter('Hotels_Shimla_City.xlsx', engine='xlsxwriter') as writer:
    dataframe.to_excel(writer, index=False, sheet_name='Hotels')
    worksheet = writer.sheets['Hotels']
    # widen columns A and B so the names and prices are readable
    worksheet.set_column('A:B', 30)
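
The introduction also mentions JSON and CSV as storage formats. The following is a minimal sketch that uses only the standard library and assumes `scraped_data` is a list of dictionaries with the keys `Hotel_Name` and `PricePerRoom`. The sample records and the helper names `save_json` and `save_csv` are placeholders for illustration, not real scraped values or functions from the notebook.

```python
import csv
import json

# placeholder records with the same keys as the scraped records
scraped_data = [
    {"Hotel_Name": "Example Hotel A", "PricePerRoom": "1003"},
    {"Hotel_Name": "Example Hotel B", "PricePerRoom": "2500"},
]


def save_json(records, path):
    """Write the records as a JSON array, keeping non-ASCII characters readable."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, ensure_ascii=False, indent=2)


def save_csv(records, path, fieldnames=("Hotel_Name", "PricePerRoom")):
    """Write the records as CSV with a header row."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(fieldnames))
        writer.writeheader()
        writer.writerows(records)
```

Calling `save_json(scraped_data, "Hotels_Shimla_City.json")` and `save_csv(scraped_data, "Hotels_Shimla_City.csv")` would write both files next to the Excel workbook. JSON keeps the list-of-records shape as is, while CSV is easier to open in a spreadsheet.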