# Swiggy Top Rated Restaurants in Delhi


- In this web scraping project, we scrape the list of top-rated restaurants in Delhi from Swiggy, an Indian online food ordering and delivery platform. The source page is https://www.swiggy.com/city/delhi/top-rated-collection

- For each restaurant on that list, we collect the name, cuisine, rating, price for two, and URL.

- The list is stored in a CSV file with the following format:

```
name,cuisine,rating,price_two,url
Third Wave Coffee Roasters,"Beverages, Pizzas",4.8,₹400 FOR TWO,https://swiggy.com/restaurants/third-wave-coffee-roasters-connaught-place-delhi-554643
Makery- Healthy Meal Bowls,"Healthy Food, Salads",4.7,₹800 FOR TWO,https://swiggy.com/restaurants/makery-healthy-meal-bowls-khan-market-delhi-491839
```
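A cuisine value such as `Beverages, Pizzas` contains a comma, so it is quoted in the file. As a quick illustration (a sketch using Python's standard `csv` module on the two sample rows above, not part of the original notebook), each row parses back into five fields:

```python
import csv
import io

# The sample rows shown above, held in a string instead of a file.
sample = '''name,cuisine,rating,price_two,url
Third Wave Coffee Roasters,"Beverages, Pizzas",4.8,₹400 FOR TWO,https://swiggy.com/restaurants/third-wave-coffee-roasters-connaught-place-delhi-554643
Makery- Healthy Meal Bowls,"Healthy Food, Salads",4.7,₹800 FOR TWO,https://swiggy.com/restaurants/makery-healthy-meal-bowls-khan-market-delhi-491839
'''

# DictReader uses the header row as keys and honours the quoting,
# so the comma inside the cuisine field does not split the column.
rows = list(csv.DictReader(io.StringIO(sample)))

for row in rows:
    print(row['name'], '|', row['cuisine'], '|', row['rating'])
```

Note that every value read this way is a string, including the rating.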



##### Importing required packages

In [2]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

##### Using Requests to get the page and BeautifulSoup to parse its content

In [3]:
# Send a browser-like User-Agent header with the request
my_header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'}
rest_page_url = 'https://www.swiggy.com/city/delhi/top-rated-collection?page=1'

response = requests.get(rest_page_url, headers=my_header)

# Stop early if the page did not load successfully
if response.status_code != 200:
    raise Exception('Failed to load page {}'.format(rest_page_url))

# Parse the returned HTML
page_contents = response.text
doc = BeautifulSoup(page_contents, 'html.parser')

##### Getting the tags that hold the required information

In [4]:
rest_name_tag= doc.find_all('div', {'class': 'nA6kb'})

In [5]:
rest_name_tag[:5]

[<div class="nA6kb">HRX by Eatfit</div>,
 <div class="nA6kb">PIZZA NEVER LIES</div>,
 <div class="nA6kb">ROLL BADSHAH</div>,
 <div class="nA6kb">Downtown Delhi</div>,
 <div class="nA6kb">CURRY &amp; BIRYANI ZONE</div>]

In [6]:
rest_names= []

for tag in rest_name_tag:
    rest_names.append(tag.text)
rest_names[:5]

['HRX by Eatfit',
 'PIZZA NEVER LIES',
 'ROLL BADSHAH',
 'Downtown Delhi',
 'CURRY & BIRYANI ZONE']

In [7]:
rest_cuisine_tag= doc.find_all('div', {'class':'_1gURR'})

In [8]:
rest_cuisine_tag[:5]

[<div class="_1gURR" title="Healthy Food, Salads, Keto">Healthy Food, Salads, Keto</div>,
 <div class="_1gURR" title="Pizzas">Pizzas</div>,
 <div class="_1gURR" title="Fast Food, Snacks">Fast Food, Snacks</div>,
 <div class="_1gURR" title="North Indian, Indian, Chinese, Snacks">North Indian, Indian, Chinese, Snacks</div>,
 <div class="_1gURR" title="Indian, Chinese, Beverages">Indian, Chinese, Beverages</div>]

In [9]:
rest_cuisine = []
for tag in rest_cuisine_tag:
    rest_cuisine.append(tag.text.strip())
rest_cuisine[:5]

['Healthy Food, Salads, Keto',
 'Pizzas',
 'Fast Food, Snacks',
 'North Indian, Indian, Chinese, Snacks',
 'Indian, Chinese, Beverages']

In [10]:
rest_det_tag= doc.find_all('div', {'class':'_3Mn31'})

In [11]:
rest_det_tag[:5]

[<div class="_3Mn31"><div class="_9uwBC wY0my"><span class="icon-star _537e4"></span><span>5.0</span></div><div>•</div><div>48 MINS</div><div>•</div><div class="nVWSi">₹270 FOR TWO</div></div>,
 <div class="_3Mn31"><div class="_9uwBC wY0my"><span class="icon-star _537e4"></span><span>5.0</span></div><div>•</div><div>53 MINS</div><div>•</div><div class="nVWSi">₹500 FOR TWO</div></div>,
 <div class="_3Mn31"><div class="_9uwBC wY0my"><span class="icon-star _537e4"></span><span>5.0</span></div><div>•</div><div>49 MINS</div><div>•</div><div class="nVWSi">₹249 FOR TWO</div></div>,
 <div class="_3Mn31"><div class="_9uwBC wY0my"><span class="icon-star _537e4"></span><span>4.8</span></div><div>•</div><div>65 MINS</div><div>•</div><div class="nVWSi">₹300 FOR TWO</div></div>,
 <div class="_3Mn31"><div class="_9uwBC wY0my"><span class="icon-star _537e4"></span><span>4.8</span></div><div>•</div><div>61 MINS</div><div>•</div><div class="nVWSi">₹200 FOR TWO</div></div>]

In [12]:
rest_det_tag[0].text.strip()

'5.0•48 MINS•₹270 FOR TWO'

In [13]:
rest_rating = []
for tag in rest_det_tag:
    text = tag.text.strip()
    # The first three characters hold the rating, e.g. '5.0'
    rest_rating.append(float(text[0:3]))
rest_rating[:5]

[5.0, 5.0, 5.0, 4.8, 4.8]

In [14]:
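rest_price_two = []
for tag in rest_det_tag:
    text = tag.text.strip()
    # Skip the 13 leading characters ('5.0•48 MINS•₹') and the 8 trailing ones (' FOR TWO');
    # this assumes a two-digit delivery time
    rest_price_two.append(int(text[13:-8]))
rest_price_two[:5]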

[270, 500, 249, 300, 200]
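The fixed slices `tag[0:3]` and `tag[13:-8]` work for strings shaped exactly like `'5.0•48 MINS•₹270 FOR TWO'`, but the price slice would shift if, for example, the delivery time had three digits. Below is a minimal sketch of a less position-dependent alternative, assuming the three parts are always separated by `•`. The helper name `parse_details` is hypothetical and not part of the notebook.

```python
import re

def parse_details(text):
    """Split a details string such as '5.0•48 MINS•₹270 FOR TWO'.

    Returns (rating, price_for_two); either value is None if it
    cannot be read as a number.
    """
    parts = [part.strip() for part in text.split('•')]
    if len(parts) != 3:
        return None, None
    rating_text, _delivery_time, price_text = parts

    # Rating: a plain decimal number such as '4.8'.
    try:
        rating = float(rating_text)
    except ValueError:
        rating = None

    # Price: keep only the digits of '₹270 FOR TWO'.
    digits = re.sub(r'\D', '', price_text)
    price = int(digits) if digits else None
    return rating, price

print(parse_details('5.0•48 MINS•₹270 FOR TWO'))    # (5.0, 270)
print(parse_details('4.8•105 MINS•₹1200 FOR TWO'))  # (4.8, 1200)
```

Splitting on the separator means the rating and price are found by their position among the parts rather than by character offsets.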

In [15]:
url_selection_class = "_1j_Yo"
url_tags= doc.find_all('a', {'class': url_selection_class})

In [16]:
url_tags[:2]

[<a class="_1j_Yo" href="/restaurants/hrx-by-eatfit-payara-lal-road-karol-bagh-delhi-558794"><div class="_1HEuF"><div class="_3FR5S"><div class="efp8s"><img alt="HRX by Eatfit" class="_12_oN" height="160" width="254"/></div><div class="_3Ztcd"><div class="nA6kb">HRX by Eatfit</div><div class="_1gURR" title="Healthy Food, Salads, Keto">Healthy Food, Salads, Keto</div></div><div class="_3Mn31"><div class="_9uwBC wY0my"><span class="icon-star _537e4"></span><span>5.0</span></div><div>•</div><div>48 MINS</div><div>•</div><div class="nVWSi">₹270 FOR TWO</div></div><div class="Zlfdx"><span class="icon-offer-filled _2fujs"></span><span class="sNAfh">10% off | Use TRYNEW</span></div></div><div class="_3B2qG"><span aria-label="Open" class="_2ECk4 _24tlh" role="button">Quick View</span></div></div></a>,
 <a class="_1j_Yo" href="/restaurants/pizza-never-lies-karol-bagh-delhi-476110"><div class="_1HEuF"><div class="_3FR5S"><div class="efp8s"><img alt="PIZZA NEVER LIES" class="_12_oN" height="160" 

In [17]:
rest_urls=[]
base_url = 'https://swiggy.com'
for tag in url_tags:
    rest_urls.append(base_url + tag['href'])
rest_urls[:5]

['https://swiggy.com/restaurants/hrx-by-eatfit-payara-lal-road-karol-bagh-delhi-558794',
 'https://swiggy.com/restaurants/pizza-never-lies-karol-bagh-delhi-476110',
 'https://swiggy.com/restaurants/roll-badshah-karol-bagh-delhi-576779',
 'https://swiggy.com/restaurants/downtown-delhi-laxmi-nagar-delhi-572530',
 'https://swiggy.com/restaurants/curry-and-biryani-zone-karol-bagh-west-patel-nagar-delhi-531700']
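String concatenation works here because every `href` in the output starts with `/`. As an alternative sketch, `urllib.parse.urljoin` from the standard library resolves a relative link against the base and also copes with an `href` that is already absolute:

```python
from urllib.parse import urljoin

base_url = 'https://swiggy.com'

# A relative href as found in the anchor tags above.
href = '/restaurants/pizza-never-lies-karol-bagh-delhi-476110'
relative_result = urljoin(base_url, href)
print(relative_result)

# An absolute href is returned unchanged rather than being prefixed twice.
absolute_href = 'https://swiggy.com/restaurants/roll-badshah-karol-bagh-delhi-576779'
absolute_result = urljoin(base_url, absolute_href)
print(absolute_result)
```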

##### Putting all information in a dictionary

In [18]:
rest_dict = {
    'name': rest_names,
    'cuisine': rest_cuisine,
    'rating': rest_rating,
    'price_two': rest_price_two,
    'url': rest_urls
    }
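
The dictionary above assumes that the five lists have the same length and the same order. If one card lacked, say, a rating tag, the lists would no longer line up, and `pd.DataFrame` raises a `ValueError` for lists of unequal length. One possible way around this, sketched below, is to read every field from within its own `<a class="_1j_Yo">` card. The class names are taken from the output shown earlier; the helper `card_to_row` and the abbreviated sample HTML are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# One card, abbreviated from the anchor tag printed earlier.
html = '''
<a class="_1j_Yo" href="/restaurants/hrx-by-eatfit-payara-lal-road-karol-bagh-delhi-558794">
  <div class="nA6kb">HRX by Eatfit</div>
  <div class="_1gURR" title="Healthy Food, Salads, Keto">Healthy Food, Salads, Keto</div>
  <div class="_3Mn31"><div class="_9uwBC wY0my"><span class="icon-star _537e4"></span><span>5.0</span></div><div>•</div><div>48 MINS</div><div>•</div><div class="nVWSi">₹270 FOR TWO</div></div>
</a>
'''

def card_to_row(card, base_url='https://swiggy.com'):
    """Build one row from a single card; missing fields become None."""
    def text_of(class_name):
        tag = card.find('div', {'class': class_name})
        return tag.get_text(strip=True) if tag else None

    return {
        'name': text_of('nA6kb'),
        'cuisine': text_of('_1gURR'),
        'rating': text_of('_9uwBC'),     # still a string, e.g. '5.0'
        'price_two': text_of('nVWSi'),   # e.g. '₹270 FOR TWO'
        'url': base_url + card['href'],
    }

doc = BeautifulSoup(html, 'html.parser')
rows = [card_to_row(card) for card in doc.find_all('a', {'class': '_1j_Yo'})]
print(rows[0])
```

A list of such dictionaries can be passed directly to `pd.DataFrame`, with one row per card.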

##### Creating a data frame using Pandas

In [19]:
rest_df = pd.DataFrame(rest_dict)

In [20]:
rest_df[:5]

,name,cuisine,rating,price_two,url
0,HRX by Eatfit,"Healthy Food, Salads, Keto",5.0,270,https://swiggy.com/restaurants/hrx-by-eatfit-p...
1,PIZZA NEVER LIES,Pizzas,5.0,500,https://swiggy.com/restaurants/pizza-never-lie...
2,ROLL BADSHAH,"Fast Food, Snacks",5.0,249,https://swiggy.com/restaurants/roll-badshah-ka...
3,Downtown Delhi,"North Indian, Indian, Chinese, Snacks",4.8,300,https://swiggy.com/restaurants/downtown-delhi-...
4,CURRY & BIRYANI ZONE,"Indian, Chinese, Beverages",4.8,200,https://swiggy.com/restaurants/curry-and-birya...


##### Saving in a csv file

In [21]:
rest_df.to_csv('Rest_delhi.csv', index=False)

##### Putting the code in a loop to get data from multiple pages

In [22]:

rest_names = []
rest_urls = []
rest_cuisine = []
rest_rating = []
rest_price_two = []

# These values do not change between pages, so they are defined once
my_header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'}
rest_page_url = 'https://www.swiggy.com/city/delhi/top-rated-collection?page='
base_url = 'https://swiggy.com'

# range(2) requests ?page=0 and ?page=1
for page in range(2):
    page_url = rest_page_url + str(page)
    response = requests.get(page_url, headers=my_header)

    # Report the exact page that failed, including its page number
    if response.status_code != 200:
        raise Exception('Failed to load page {}'.format(page_url))

    doc = BeautifulSoup(response.text, 'html.parser')

    # Names
    for tag in doc.find_all('div', {'class': 'nA6kb'}):
        rest_names.append(tag.text)

    # Cuisine
    for tag in doc.find_all('div', {'class': '_1gURR'}):
        rest_cuisine.append(tag.text.strip())

    # Rating and price for two come from the same details tag
    for tag in doc.find_all('div', {'class': '_3Mn31'}):
        text = tag.text.strip()
        rest_rating.append(float(text[0:3]))
        rest_price_two.append(text[12:])  # e.g. '₹270 FOR TWO'

    # URLs
    for tag in doc.find_all('a', {'class': '_1j_Yo'}):
        rest_urls.append(base_url + tag['href'])

    # The CSV is rewritten after every page with everything collected so far
    rest_dict = {
        'name': rest_names,
        'cuisine': rest_cuisine,
        'rating': rest_rating,
        'price_two': rest_price_two,
        'url': rest_urls
    }
    rest_df = pd.DataFrame(rest_dict)
    rest_df.to_csv('Rest_delhi.csv', index=False)
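
The loop above requests a fixed number of pages back to back. Below is a hedged sketch of a variant that pauses between requests and stops at the first page that yields no rows. Here `fetch_rows` is a hypothetical stand-in for the request-and-parse steps, and the one-second default delay is an assumption rather than a documented Swiggy limit.

```python
import time

def collect_pages(fetch_rows, max_pages, delay_seconds=1.0):
    """Call fetch_rows(page) for page 0..max_pages-1 and gather the rows.

    Stops at the first page that yields no rows and sleeps after
    each non-empty page.
    """
    all_rows = []
    for page in range(max_pages):
        rows = fetch_rows(page)
        if not rows:
            break
        all_rows.extend(rows)
        time.sleep(delay_seconds)
    return all_rows

# Stand-in fetcher so the sketch runs without network access:
# pages 0 and 1 return two rows each, later pages are empty.
def fake_fetch(page):
    return [f'restaurant-{page}-{i}' for i in range(2)] if page < 2 else []

print(collect_pages(fake_fetch, max_pages=60, delay_seconds=0))
# ['restaurant-0-0', 'restaurant-0-1', 'restaurant-1-0', 'restaurant-1-1']
```

Passing the fetcher in as an argument keeps the paging logic separate from the scraping itself, which makes it easy to try out offline.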

### Final code

In [23]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

rest_names = []
rest_urls = []
rest_cuisine = []
rest_rating = []
rest_price_two = []

# These values do not change between pages, so they are defined once
my_header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'}
rest_page_url = 'https://www.swiggy.com/city/delhi/top-rated-collection?page='
base_url = 'https://swiggy.com'

for page in range(60):
    page_url = rest_page_url + str(page)
    response = requests.get(page_url, headers=my_header)

    # Check the response status and report the exact page that failed
    if response.status_code != 200:
        raise Exception('Failed to load page {}'.format(page_url))

    # Parse using BeautifulSoup
    doc = BeautifulSoup(response.text, 'html.parser')

    # Getting names
    for tag in doc.find_all('div', {'class': 'nA6kb'}):
        rest_names.append(tag.text)

    # Getting cuisine
    for tag in doc.find_all('div', {'class': '_1gURR'}):
        rest_cuisine.append(tag.text.strip())

    # Getting rating and price for two from the same details tag
    for tag in doc.find_all('div', {'class': '_3Mn31'}):
        text = tag.text.strip()
        rest_rating.append(float(text[0:3]))
        rest_price_two.append(text[12:])  # e.g. '₹270 FOR TWO'

    # Getting URLs
    for tag in doc.find_all('a', {'class': '_1j_Yo'}):
        rest_urls.append(base_url + tag['href'])

    # Putting all info in a dict
    rest_dict = {
        'name': rest_names,
        'cuisine': rest_cuisine,
        'rating': rest_rating,
        'price_two': rest_price_two,
        'url': rest_urls
    }

    # Creating a DataFrame
    rest_df = pd.DataFrame(rest_dict)

    # The CSV is rewritten after every page with everything collected so far
    rest_df.to_csv('Rest_delhi.csv', index=False)

Goodbye, friend!