# WEB SCRAPING

Web scraping is the automated extraction of data from websites.
Common uses include data collection, price monitoring, and news aggregation.
The most widely used Python tools and libraries for web scraping are:

🔧 Essential Libraries:

1. requests – for sending HTTP requests.

2. BeautifulSoup (from bs4) – for parsing HTML and XML.

3. lxml – a faster parser that BeautifulSoup can use in place of the built-in html.parser.

4. selenium – for scraping JavaScript-heavy websites.

5. pandas – for organizing and exporting scraped data.
 

🌐 HTTP response status codes:

1. Informational responses (100 – 199)

2. Successful responses (200 – 299)

3. Redirection messages (300 – 399)

4. Client error responses (400 – 499)

5. Server error responses (500 – 599)
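
The five ranges above can be told apart by the first digit of the status code. A minimal sketch, assuming a helper named `classify_status` (a hypothetical name, not part of requests):

```python
def classify_status(code: int) -> str:
    """Return the response class for an HTTP status code."""
    classes = {
        1: "Informational",
        2: "Successful",
        3: "Redirection",
        4: "Client error",
        5: "Server error",
    }
    # Integer division by 100 maps 200-299 to 2, 400-499 to 4, and so on.
    return classes.get(code // 100, "Unknown")


for code in (200, 301, 404, 503):
    print(code, classify_status(code))
```

Anything outside 100 to 599 falls through to "Unknown" rather than raising an error.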


In [48]:
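# Basic example using requests and BeautifulSoup

import requests
from bs4 import BeautifulSoup
import pandas as pd

# Store-locator page for Domino's outlets in Chennai
URL = "https://www.dominos.co.in/store-location/chennai"

# Send a GET request; the timeout keeps the call from hanging indefinitely
response = requests.get(URL, timeout=10)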

print(response)


<Response [200]>
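
The request above returned status 200, but a scraper should not assume that every time. One possible way to fail early on a 4xx or 5xx response is sketched below. The function name `fetch_html` and the 10-second default timeout are assumptions made for illustration.

```python
import requests


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its HTML, raising on HTTP errors."""
    # A timeout stops the call from hanging forever on an unresponsive server.
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses.
    response.raise_for_status()
    return response.text
```

Calling `fetch_html(URL)` would then either return the page's HTML or raise an exception that names the failing status code.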


In [71]:
# View the response body

response.text  # returns the page's HTML as a string




In [54]:
# Parse the HTML string into a BeautifulSoup object

soup = BeautifulSoup(response.text, "html.parser")

print(soup)


<!DOCTYPE html>

<html lang="en" style="overflow-x:hidden;width:100%;">
<head>
<meta charset="utf-8"/><script type="text/javascript">(window.NREUM||(NREUM={})).init={ajax:{deny_list:["bam.nr-data.net"]}};(window.NREUM||(NREUM={})).loader_config={licenseKey:"NRBR-3e999030a12e14fece2",applicationID:"685069429"};;/*! For license information please see nr-loader-rum-1.288.1.min.js.LICENSE.txt */
<meta content="Find nearby pizza restaurants in Chennai for free pizza delivery. Get address, phone number &amp; menu of your nearest pizza shops. Order pizza online from Domino’s Chennai." name="description"/>
<meta content="Domino’s in Chennai,list of domino’s pizza in Chennai" name="keywords"/>
<link href="https://www.dominos.co.in/store-locations/chennai" rel="canonical"/>
<meta content="width=device-width, initial-scale=1, minimal-ui" name="viewport"/>
<title> Domino's Pizza Restaurants in Chennai | Nearby Pizza Shops in Chennai – Domino’s India</title>
<!-- CSS For Coupons -->
<!--<
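
Once the page is parsed, individual elements can be read from the tree instead of printing the whole document. The sketch below runs on a shortened, hand-written copy of the `<head>` shown above, so it needs no network access. The names `sample_head` and `head_soup` are placeholders.

```python
from bs4 import BeautifulSoup

# Abbreviated sample based on the <head> of the printed output.
sample_head = """
<html><head>
<meta content="Find nearby pizza restaurants in Chennai for free pizza delivery." name="description"/>
<title> Domino's Pizza Restaurants in Chennai | Nearby Pizza Shops in Chennai</title>
</head><body></body></html>
"""

head_soup = BeautifulSoup(sample_head, "html.parser")

# soup.title is the first <title> tag; strip=True trims surrounding spaces.
page_title = head_soup.title.get_text(strip=True)

# attrs= is required here because "name" is also find()'s first parameter.
description = head_soup.find("meta", attrs={"name": "description"})["content"]

print(page_title)
print(description)
```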

In [66]:
# Extract the branch names: every <p> tag with class "city-main-sub-title"

Location = soup.find_all("p", {"class": "city-main-sub-title"})

Location


[<p class="city-main-sub-title"> R.K.ROAD</p>,
 <p class="city-main-sub-title"> ADYAR</p>,
 <p class="city-main-sub-title"> ASHOK NAGAR</p>,
 <p class="city-main-sub-title"> BEACH CASTLE</p>,
 <p class="city-main-sub-title"> INFOSYS CHENNAI</p>,
 <p class="city-main-sub-title"> ANNA NAGAR</p>,
 <p class="city-main-sub-title"> NUNGAMBAKAM</p>,
 <p class="city-main-sub-title"> VELACHERY</p>,
 <p class="city-main-sub-title"> TIDEL PARK</p>,
 <p class="city-main-sub-title"> TEYNAMPET</p>,
 <p class="city-main-sub-title"> INFOSYS MAHINDRA WORLD CITY</p>,
 <p class="city-main-sub-title"> MOGAPPAIR</p>,
 <p class="city-main-sub-title"> FLOWERS ROAD PURUSWAKKAM</p>,
 <p class="city-main-sub-title"> ALWARTHIRUNAGAR</p>,
 <p class="city-main-sub-title"> PERAMBUR</p>,
 <p class="city-main-sub-title"> PORUR</p>,
 <p class="city-main-sub-title"> TAMBARAM</p>,
 <p class="city-main-sub-title"> OKKIM THORAPAKKAM (OMR)</p>,
 <p class="city-main-sub-title"> CHROMPET</p>,
 <p class="city-main-sub-title">
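
The dictionary filter used above is one of several equivalent ways to match a class in BeautifulSoup. The sketch below compares three of them on a small hand-written fragment that imitates the page structure. The fragment and the name `demo_soup` are assumptions, not the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment imitating the structure of the scraped page.
sample_html = """
<div>
  <p class="city-main-sub-title"> R.K.ROAD</p>
  <p class="grey-text mb-0"> NO.100/1,RADHA KRISHNAN SALAI</p>
  <p class="city-main-sub-title"> ADYAR</p>
</div>
"""
demo_soup = BeautifulSoup(sample_html, "html.parser")

# 1. Attribute dictionary, as in the cell above.
by_dict = demo_soup.find_all("p", {"class": "city-main-sub-title"})
# 2. The class_ keyword (trailing underscore because "class" is reserved).
by_keyword = demo_soup.find_all("p", class_="city-main-sub-title")
# 3. A CSS selector.
by_selector = demo_soup.select("p.city-main-sub-title")

names = [tag.get_text(strip=True) for tag in by_selector]
print(names)
```

All three calls return the same two tags here; which one to use is largely a matter of readability.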

In [79]:
# Print only the text of each tag found above

for i in Location:
    print(i.text)

 R.K.ROAD
 ADYAR
 ASHOK NAGAR
 BEACH CASTLE
 INFOSYS CHENNAI
 ANNA NAGAR
 NUNGAMBAKAM
 VELACHERY
 TIDEL PARK
 TEYNAMPET
 INFOSYS MAHINDRA WORLD CITY
 MOGAPPAIR
 FLOWERS ROAD PURUSWAKKAM
 ALWARTHIRUNAGAR
 PERAMBUR
 PORUR
 TAMBARAM
 OKKIM THORAPAKKAM (OMR)
 CHROMPET
 LITTLE MOUNT RD, GUINDY
 VENKATAKRISHNA ROAD, R. A. PURAM
 EAST COAST ROAD
 MEDAVAKKAM
 MADDIPAKKAM
 NELSON MANIKAM ROAD
 KOLATHUR
 DLF IT PARK, CHENNAI
 GANDHI ROAD, TAMBARAM
 PHOENIX MARKET CITY
 COROMANDEL PLAZA
 VANAGARAM, CHENNAI
 NANGANALLUR CHENNAI
 KRISHNAGIRI BY PASS ROAD
 THE FORUM VIJAYA MALL, VADAPALANI(CHENNAI)
 KK.NAGAR, CHENNAI
 G.N. CHETTY, T.NAGAR
 AMPA SKYWALK MALL
 THIRUVANMIYUR
 PAMMAL MAIN ROAD
 MONTEITH ROAD, EGMORE,
 ASCENDAS IT PARK
 KELAMBAKKAM (PADURVILLAGE) CHENNAI
 Guduvanchery
 PALLIKARNAI, CHENNAI
 Maraimalai Nagar - 2
 Poonthamallee, Chennai
 AMBATTUR-CHENNAI
 PERUNGUDI,OMR ROAD
 SELAIYUR,KANCHEEPURAM,CHENNAI
 MOULIVAKKAM,CHENNAI
 KODAMBAKKAM,CHENNAI
 EXPRESS AVENUE, CHENNAI
 ROYAPURAM,CHENNAI


In [81]:
# Store the text of each tag as a list
# (this rebinds Location from tags to strings, so run this cell only once)

Location = [i.text for i in Location]

print(Location)


[' R.K.ROAD', ' ADYAR', ' ASHOK NAGAR', ' BEACH CASTLE', ' INFOSYS CHENNAI', ' ANNA NAGAR', ' NUNGAMBAKAM', ' VELACHERY', ' TIDEL PARK', ' TEYNAMPET', ' INFOSYS MAHINDRA WORLD CITY', ' MOGAPPAIR', ' FLOWERS ROAD PURUSWAKKAM', ' ALWARTHIRUNAGAR', ' PERAMBUR', ' PORUR', ' TAMBARAM', ' OKKIM THORAPAKKAM (OMR)', ' CHROMPET', ' LITTLE MOUNT RD, GUINDY', ' VENKATAKRISHNA ROAD, R. A. PURAM', ' EAST COAST ROAD', ' MEDAVAKKAM', ' MADDIPAKKAM', ' NELSON MANIKAM ROAD', ' KOLATHUR', ' DLF IT PARK, CHENNAI', ' GANDHI ROAD, TAMBARAM', ' PHOENIX MARKET CITY', ' COROMANDEL PLAZA', ' VANAGARAM, CHENNAI', ' NANGANALLUR CHENNAI', ' KRISHNAGIRI BY PASS ROAD', ' THE FORUM VIJAYA MALL, VADAPALANI(CHENNAI)', ' KK.NAGAR, CHENNAI', ' G.N. CHETTY, T.NAGAR', ' AMPA SKYWALK MALL', ' THIRUVANMIYUR', ' PAMMAL MAIN ROAD', ' MONTEITH ROAD, EGMORE,', ' ASCENDAS IT PARK', ' KELAMBAKKAM (PADURVILLAGE) CHENNAI', ' Guduvanchery', ' PALLIKARNAI, CHENNAI', ' Maraimalai Nagar - 2', ' Poonthamallee, Chennai', ' AMBATTUR-CHENN
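
Every name in the list above starts with a space carried over from the HTML. A minimal sketch of removing it, using the first three values from the output (`raw_names` and `clean_names` are illustrative names):

```python
# First three values copied from the printed list, leading spaces included.
raw_names = [" R.K.ROAD", " ADYAR", " ASHOK NAGAR"]

# str.strip() removes whitespace from both ends of each string.
clean_names = [name.strip() for name in raw_names]

print(clean_names)
```

When working with the tags directly, `tag.get_text(strip=True)` gives the same trimmed text in one step.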

In [97]:
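# Extract the addresses: every <p> tag with class "grey-text mb-0"

Address = soup.find_all("p", {"class": "grey-text mb-0"})

# Print each address as plain text
for j in Address:
    print(j.text)

# Keep only the text of each tag, then display the list
Address = [j.text for j in Address]

Address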


 NO.100/1,RADHA KRISHNAN SALAI,MYLAPORE,CHENNAI - 600004 PH.NO. 044-28474444/ 42134444
 NO.55, 1ST STREET,KAMARAJ AVENUE, ADITYA ARCADE,KASTURIBAI NAGAR,ADYAR,CHENNAI - 600020 PH.NO. 044-24424444/ 42124444/ 24451692/ 24424411/ 33/ 88
 CEEBROS, NO. 20,10TH AVENUE,ASHOK NAGARA,CHENNAI - 600083 PH.NO. 044-24714555/ 42044445/ 24712762/2764/2779/ 24719856/57/58/59
 34, ELIOT'S BEACH ROAD, BESANT NAGAR
 INFOSYS TECHNOLOGIES,138, OLD MAHABALIPURAM ROAD,SHOLINGANALLURINFOSYS,CHENNAI - 600119 PH.NO. 9445027294
 No. J22/1, 1st Floor, 3rd Avenue, Annan agar East,Opp. Yesyesi Super Market, Chennai.Ph no.:26207333,26201334/2334/4334/5334/7331/7334/9334
 152 & 153,DR. MGR SALAIKODAMBAKKAM HIGH ROADBIG FM BUILDINGNUNGUMBAKKAM,CHENNAI - 600034 PH.NO. 044-28221111/2011/2012/2013/2014/2015
 No 137 Block No 182 Opp to Grand Mall Velachery Tambaram Main Road Velachery Chennai Tamil Nadu 600042 Phone 044 4904 3232100 FT, TARAMANI MAIN ROAD, VELACHERY,CHENNAI - 600042 PH.NO-44-49043232
 COUNTER NO. 2, FOOD 

[' NO.100/1,RADHA KRISHNAN SALAI,MYLAPORE,CHENNAI - 600004 PH.NO. 044-28474444/ 42134444',
 ' NO.55, 1ST STREET,KAMARAJ AVENUE, ADITYA ARCADE,KASTURIBAI NAGAR,ADYAR,CHENNAI - 600020 PH.NO. 044-24424444/ 42124444/ 24451692/ 24424411/ 33/ 88',
 ' CEEBROS, NO. 20,10TH AVENUE,ASHOK NAGARA,CHENNAI - 600083 PH.NO. 044-24714555/ 42044445/ 24712762/2764/2779/ 24719856/57/58/59',
 " 34, ELIOT'S BEACH ROAD, BESANT NAGAR",
 ' INFOSYS TECHNOLOGIES,138, OLD MAHABALIPURAM ROAD,SHOLINGANALLURINFOSYS,CHENNAI - 600119 PH.NO. 9445027294',
 ' No. J22/1, 1st Floor, 3rd Avenue, Annan agar East,Opp. Yesyesi Super Market, Chennai.Ph no.:26207333,26201334/2334/4334/5334/7331/7334/9334',
 ' 152 & 153,DR. MGR SALAIKODAMBAKKAM HIGH ROADBIG FM BUILDINGNUNGUMBAKKAM,CHENNAI - 600034 PH.NO. 044-28221111/2011/2012/2013/2014/2015',
 ' No 137 Block No 182 Opp to Grand Mall Velachery Tambaram Main Road Velachery Chennai Tamil Nadu 600042 Phone 044 4904 3232100 FT, TARAMANI MAIN ROAD, VELACHERY,CHENNAI - 600042 PH.NO-44-
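
Many of the addresses above carry a phone number after a "PH.NO." marker. If the two parts are wanted separately, a regular expression can split them. This is only a rough sketch: the pattern is an assumption based on the rows shown here, it will not handle every variant in the data (for example "Phone ..."), and `split_address` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Matches markers such as "PH.NO.", "PH.NO-" or "Ph no.:" in any letter case.
PHONE_MARKER = re.compile(r"\s*PH\.?\s*NO\.?\s*[:\-]?\s*", flags=re.IGNORECASE)


def split_address(raw: str) -> Tuple[str, Optional[str]]:
    """Split a scraped address into (street part, phone part or None)."""
    parts = PHONE_MARKER.split(raw.strip(), maxsplit=1)
    street = parts[0]
    # No marker found, or nothing after it, means there is no phone part.
    phone = parts[1] if len(parts) == 2 and parts[1] else None
    return street, phone


print(split_address(" NO.100/1,RADHA KRISHNAN SALAI,MYLAPORE,CHENNAI - 600004 PH.NO. 044-28474444/ 42134444"))
print(split_address(" 34, ELIOT'S BEACH ROAD, BESANT NAGAR"))
```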

In [101]:
# Store the Location and Address lists in a dictionary

Data = {"Branch": Location, "Address": Address}
Data


{'Branch': [' R.K.ROAD',
  ' ADYAR',
  ' ASHOK NAGAR',
  ' BEACH CASTLE',
  ' INFOSYS CHENNAI',
  ' ANNA NAGAR',
  ' NUNGAMBAKAM',
  ' VELACHERY',
  ' TIDEL PARK',
  ' TEYNAMPET',
  ' INFOSYS MAHINDRA WORLD CITY',
  ' MOGAPPAIR',
  ' FLOWERS ROAD PURUSWAKKAM',
  ' ALWARTHIRUNAGAR',
  ' PERAMBUR',
  ' PORUR',
  ' TAMBARAM',
  ' OKKIM THORAPAKKAM (OMR)',
  ' CHROMPET',
  ' LITTLE MOUNT RD, GUINDY',
  ' VENKATAKRISHNA ROAD, R. A. PURAM',
  ' EAST COAST ROAD',
  ' MEDAVAKKAM',
  ' MADDIPAKKAM',
  ' NELSON MANIKAM ROAD',
  ' KOLATHUR',
  ' DLF IT PARK, CHENNAI',
  ' GANDHI ROAD, TAMBARAM',
  ' PHOENIX MARKET CITY',
  ' COROMANDEL PLAZA',
  ' VANAGARAM, CHENNAI',
  ' NANGANALLUR CHENNAI',
  ' KRISHNAGIRI BY PASS ROAD',
  ' THE FORUM VIJAYA MALL, VADAPALANI(CHENNAI)',
  ' KK.NAGAR, CHENNAI',
  ' G.N. CHETTY, T.NAGAR',
  ' AMPA SKYWALK MALL',
  ' THIRUVANMIYUR',
  ' PAMMAL MAIN ROAD',
  ' MONTEITH ROAD, EGMORE,',
  ' ASCENDAS IT PARK',
  ' KELAMBAKKAM (PADURVILLAGE) CHENNAI',
  ' Guduvanchery'
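
A dictionary like the one above only converts cleanly to a DataFrame when both lists have the same length; pandas raises a `ValueError` otherwise. If one selector misses an element, the mismatch is easier to diagnose when it is checked up front. A minimal sketch, with `build_table` as a hypothetical helper:

```python
def build_table(branches, addresses):
    """Return a column dictionary, refusing lists of unequal length."""
    if len(branches) != len(addresses):
        raise ValueError(
            f"Got {len(branches)} branches but {len(addresses)} addresses"
        )
    return {"Branch": list(branches), "Address": list(addresses)}


table = build_table(["R.K.ROAD", "ADYAR"], ["address one", "address two"])
print(table)
```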

In [107]:
# Create a DataFrame with the above results

df = pd.DataFrame(Data)
df


| | Branch | Address |
| --- | --- | --- |
| 0 | R.K.ROAD | NO.100/1,RADHA KRISHNAN SALAI,MYLAPORE,CHENNA... |
| 1 | ADYAR | NO.55, 1ST STREET,KAMARAJ AVENUE, ADITYA ARCA... |
| 2 | ASHOK NAGAR | CEEBROS, NO. 20,10TH AVENUE,ASHOK NAGARA,CHEN... |
| 3 | BEACH CASTLE | 34, ELIOT'S BEACH ROAD, BESANT NAGAR |
| 4 | INFOSYS CHENNAI | INFOSYS TECHNOLOGIES,138, OLD MAHABALIPURAM R... |
| ... | ... | ... |
| 68 | RMZ MILLENIA BUSINESS PARK II CHENNAI | THE MARKET PLACE UNIT NO 06 FC GROUND FLOOR I... |
| 69 | Vellore Institute Of Technology Chennai | Vellore Institute of Technology VIT Vandalur ... |
| 70 | Valasaravakkam, Chennai, Tamil Nadu | First Floor, Pe Ve Plaza, Plot No.9, Arcot ... |
| 71 | Vikas Mantra Tower, Kotturpuram, Chennai | First Floor In Vikas Mantra Tower, Plot No.63... |


In [109]:
# Save the DataFrame to a CSV file (by default the row index is written as the first column)

df.to_csv("Web Scraping.csv")
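
With its default settings, `to_csv` writes the row index as an extra, unlabeled first column, which shows up as "Unnamed: 0" when the file is read back. Passing `index=False` leaves it out. The sketch below uses a two-row stand-in DataFrame (`demo_df`, built from values seen earlier) and returns the CSV as a string rather than writing a file.

```python
import pandas as pd

# Two-row stand-in for the scraped data.
demo_df = pd.DataFrame(
    {
        "Branch": ["R.K.ROAD", "BEACH CASTLE"],
        "Address": [
            "NO.100/1,RADHA KRISHNAN SALAI,MYLAPORE,CHENNAI - 600004",
            "34, ELIOT'S BEACH ROAD, BESANT NAGAR",
        ],
    }
)

# With no path argument, to_csv returns the CSV text instead of saving it.
with_index = demo_df.to_csv()
without_index = demo_df.to_csv(index=False)

print(with_index.splitlines()[0])     # header row including the index column
print(without_index.splitlines()[0])  # header row without it
```

Whether to keep the index depends on whether the row numbers carry meaning; for a scraped list like this one they usually do not.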
