# WEB SCRAPING 

## Scraping Dineout's Website

**Part 1: Data Collection, by scraping data from Dineout's website.**


The notebook scrapes the following data points about the restaurants in Delhi-NCR:

   * Name
   * Cost_for_2
   * Address
   * About
   * Type
   * Facilities
   * Contact
   * Cuisine
   * Latitude
   * Longitude
   * Stars
   * Vote_Count
   * Review_Count

In [365]:
# Import libraries
import pandas as pd
import numpy as np
import nltk
from nltk.tokenize import sent_tokenize
import re
from selenium.webdriver.common.by import By

In [15]:
# Set Up Selenium Webdriver
from selenium import webdriver as wb
from webdriver_manager.chrome import ChromeDriverManager
driver = wb.Chrome(ChromeDriverManager().install())



Current google-chrome version is 92.0.4515
Get LATEST driver version for 92.0.4515
Driver [C:\Users\Chitwan Manchanda\.wdm\drivers\chromedriver\win32\92.0.4515.107\chromedriver.exe] found in cache


In [75]:
# Extract the required data

'''
Following data points are extracted:
1) Name
2) Cost for 2 (scraped together with the cuisines)
3) Address
4) Opening timings
5) Co-ordinates
6) Offers
7) About
8) Type
9) Facilities
10) Ratings
11) Reviews
12) Contact
13) Extra discount


class_name = "restnt-detail-wrap"
'''
# Xpaths for extracting cuisines
cuisine_show_more_xpath = '//*[@id="locality-show-more"]' # Click on show more button
show_more_cancel_xpath = '//*[@id="w1-filters-filter_popup"]/div[2]/div[2]/span' # Click on Cancel button to go back
cuisine_xpath_list = '//*[@id="filterlist_pop"]' # Extract cuisines

# Xpaths for extracting names of restaurant
rst_name_xpath = '//*[@id="detailDiv"]/section[1]/div[1]/h1'

# Xpaths for extracting cost for 2
rst_cost_xpath = '//*[@id="detailDiv"]/section[1]/div[1]/div[1]'

# Xpath for restaurant directions
rst_directions_xpath = '//*[@id="detailDiv"]/section[1]/div[1]/div[2]/a[3]' # Click on the directions
rst_address_xpath = '//*[@id="w1-w2"]/div[2]/div[2]/p' # Extract address

# Xpath for restaurant opening timing
rst_timings_xpath = '//*[@id="dropdownMenu1"]/span'

# Xpath for restaurant co-ordinates
rst_location_xpath = '//*[@id="w1-w2-0[0]"]/iframe'

#rst_coordinates_xpath = '//*[@id="pane"]/div/div[1]/div/div/div[2]/div[1]/div[1]/div[1]/h1' # Extract Co-ordinates

# Xpath for closing the directions pop-up
cancel_button_xpath = '//*[@id="direction"]/div/button' # To be clicked

# Class name for extracting offers (found by trial and error)
offer_class_name = 'free-offers_content'

# Xpath for extracting the restaurant about
rst_about_xpath = '//*[@id="w1-6[0]"]/span[2]/a' # Click to read more
rst_about_comp_xpath = '//*[@id="w1-6[0]"]'

# Xpath for extracting the restaurant type
rst_type_xpath = '//*[@id="about"]/div/div/div[2]/div[2]/p'

#cuisine_xpath = '//*[@id="about"]/div/div/div[1]/div[2]'

# Xpath for extracting the restaurant facilities
rst_facilities_xpath = '//*[@id="about"]/div/div/div[4]'
rst_facilities_xpath_exc = '//*[@id="about"]/div/div/div[4]/div[2]'

# Xpath for extracting ratings and reviews
rst_ratings_xpath = '//*[@id="review-section"]/div[1]/div[1]/div[1]'
rst_ratings_xpath_exc = '//*[@id="review-section"]/div[1]/div[1]'
rst_review_xpath = '//*[@id="review-section"]/div[2]'

# Xpath for extracting the restaurant's contact
rst_contact_xpath = '//*[@id="help"]/ul/li[2]/a/div/div[2]/p'

# Xpath for clicking on the next page
next_button_xpath = '//*[@id="w1-pagination"]/li[13]/a' # Unused: the paging loop clicks '//*[@id="w1-pagination"]/li[14]/a' directly
go_home_xpath = '//*[@id="w1-noresult"]/div/div/a'

# Xpath for extracting extra offers
extra_discount_xpath = '//*[@id="detailDiv"]/section[3]/div/div' # Not for every restaurant

# food not found xpath
food_not_found_xpath = '//*[@id="w1-noresult"]/div/div/h2'

def extract_data():
    
    
    # Get URL
    BASE_URL = "https://www.dineout.co.in/delhi-restaurants/welcome-back"
    driver.get(BASE_URL)
    
    #driver.back()
    '''
    # Extract Cuisines
    driver.find_element_by_xpath(cuisine_show_more_xpath).click()
    cuisines = driver.find_element_by_xpath(cuisine_xpath_list).text.split()
    driver.find_element_by_xpath(show_more_cancel_xpath).click()
    '''
    
    # List to store data
    rest_names = []
    cost_for_two = []
    rest_addresses = []
    rest_timings = []
    rest_coordinates = []
    rest_offers = []
    rest_about = []
    rest_type = []
    rest_facilities = []
    rest_ratings = []
    rest_reviews = []
    rest_contact = []
    rest_extra_offers = []
    
    '''
    # Extract data by cuisines
    for cuisine in cuisines:
        
        print("Extracting {} cuisine data".format(cuisine))
        driver.implicitly_wait(15)
        driver.find_element_by_xpath('//*[@id="location-auto-suggest"]').send_keys(cuisine)
        driver.find_element_by_xpath('//*[@id="w0-0[0]"]').click()
    '''

    # Loop for all the pages
    j = 1
    while j >= 0:

        if len(rest_names) > 5000:
            print("We have 5000 data points")
            break

        try:

            # Loop for the page
            i = 1
            while i >= 0:

                try:
                    # Click at the first restaurant
                    driver.find_element_by_xpath('//*[@id="w1-restarant"]/div[{}]'.format(i)).click()

                    # Append the cuisine
                    #rest_cuisine.append(cuisine)

                    # Extract Names
                    print("Extracting the Restaurant Names")
                    try:
                        rest_name = driver.find_element_by_xpath(rst_name_xpath).text
                        print(rest_name)
                    except Exception:
                        # rest_name may be unbound or stale here, so set the fallback before logging
                        rest_name = "No_Name"
                        print("Exception occurred while extracting the restaurant name")
                    rest_names.append(rest_name)

                    # Extract Cost for 2
                    print("Extracting the Cost for 2")
                    cost = driver.find_element_by_xpath(rst_cost_xpath).text
                    cost_for_two.append(cost)

                    # Extract addresses
                    print("Extracting the addresses")
                    driver.find_element_by_xpath(rst_directions_xpath).click()
                    rest_address = driver.find_element_by_xpath(rst_address_xpath).text
                    rest_addresses.append(rest_address)

                    # Extract Opening Timings
                    print("Extracting the Opening Timings")
                    timings = driver.find_element_by_xpath(rst_timings_xpath).text
                    rest_timings.append(timings)

                    # Extract the Restaurant Co-ordinates
                    print("Extracting the Restaurant Co-ordinates")
                    coord = re.findall(r'\d*\.?\d+',driver.find_element_by_xpath(rst_location_xpath).get_attribute('src'))[ : -1]
                    rest_coordinates.append(coord)

                    # Close the directions pop-up
                    try:
                        driver.implicitly_wait(10)
                        driver.find_element_by_xpath(cancel_button_xpath).click()
                    except Exception:
                        print("Exception occurred while trying to close the directions pop-up for restaurant {}".format(rest_name))
                        driver.refresh()

                    # Extract Offers
                    print("Extracting the Offers")
                    try:
                        rest_offer = driver.find_element_by_class_name(offer_class_name).text
                        rest_offers.append(rest_offer)
                    except:
                        print("Exception occured while extracting offer for restaurants {}".format(rest_name))
                        rest_offers.append("No_Offer")

                    # Extract the restaurant about section 
                    print("Extracting the Restaurant About")
                    try:
                        driver.find_element_by_xpath(rst_about_xpath).click() # Click on the read more button
                        about = driver.find_element_by_xpath(rst_about_comp_xpath).text
                        rest_about.append(about)
                    except:
                        print("Exception occured while extracting about for restaurants {}".format(rest_name))
                        rest_about.append('No_About')

                    # Extracting restaurant type
                    print("Extracting the Restaurant Type")
                    try:
                        rest_type_ = driver.find_element_by_xpath(rst_type_xpath).text
                        rest_type.append(rest_type_)
                    except:
                        print("Exception occured while extracting restaurant type for restaurants {}".format(rest_name))
                        rest_type.append("No_Type")

                    # Extract the restaurant facilities
                    print("Extracting the Restaurant Facility")
                    try:
                        facilities = driver.find_element_by_xpath(rst_facilities_xpath).text.split('\n')[1:]
                        rest_facilities.append(facilities)
                    except:
                        try:
                            print("Exception occured while extracting facility for restaurants {}".format(rest_name))
                            facilities = driver.find_element_by_xpath(rst_facilities_xpath_exc).text.split('\n')[1:]
                            rest_facilities.append(facilities)
                        except:
                            print("Another Exception occured while extracting facility for restaurants {}".format(rest_name))
                            rest_facilities.append("No_Facilities")

                    # Extract the restaurant ratings
                    print("Extracting the Restaurant Ratings")
                    try :
                        ratings = driver.find_element_by_xpath(rst_ratings_xpath).text
                        rest_ratings.append(ratings)
                    except:
                        ratings = driver.find_element_by_xpath(rst_ratings_xpath_exc).text
                        rest_ratings.append(ratings)

                    # Extract the restaurant reviews
                    print("Extracting the Restaurant Reviews")
                    review = driver.find_element_by_xpath(rst_review_xpath).text
                    rest_reviews.append(review)

                    # Extract the restaurant contact
                    print("Extracting the Restaurant Contact")
                    try:
                        contact = driver.find_element_by_xpath(rst_contact_xpath).text
                        rest_contact.append(contact)
                    except:
                        print("Exception occured while extracting contact of restaurant {}".format(rest_name))
                        rest_contact.append("No_Contact")

                    # Extract the extra discount
                    print("Extracting the Restaurant Extra Discount")
                    try:
                        ext_discount = driver.find_element_by_xpath(extra_discount_xpath).text
                        rest_extra_offers.append(ext_discount)
                    except:
                        rest_extra_offers.append('No_Extra_Discount')

                    # Move to the next restaurant and go back to the listing page
                    i = i+1
                    driver.back()
                    driver.implicitly_wait(15)


                except:
                    print("Total restaurant on this page are {}".format(i-1))
                    print()
                    break


            # Click on the next page
            driver.implicitly_wait(15)
            driver.find_element_by_xpath('//*[@id="w1-pagination"]/li[14]/a').click() 
            j += 1

            try:
                # Check if Food Not Found text is found
                driver.implicitly_wait(15)
                button = driver.find_element_by_xpath(food_not_found_xpath)
                driver.implicitly_wait(15)

                if button.text == 'Food not found':
                    driver.implicitly_wait(15)
                    break
            except:
                pass

        except:
            #print("Number of Pages for the {} cuisine are {}".format(cuisine, j))
            print()
            break
                
    # Store the data in a dictionary
    data_dictionary = { "Name" : rest_names,
                        "Cost_for_2" : cost_for_two,
                        "Address" : rest_addresses,
                        "Opening_Timings" : rest_timings,
                        "Co-Ordinates" : rest_coordinates,
                        "Offers" : rest_offers,
                        "About" : rest_about,
                        "Type" : rest_type,
                        "Facilities" : rest_facilities,
                        "Ratings" : rest_ratings,
                        "Reviews" : rest_reviews,
                        "Contact" : rest_contact,
                        "Exta_Discount" : rest_extra_offers
                      }
    # Convert into dataframe
    # data = pd.DataFrame(data_dictionary)
    return data_dictionary

dineout_data = extract_data()

Extracting the Restaurant Names
IKURA
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants IKURA
Extracting the Restaurant About
Exception occured while extracting about for restaurants IKURA
Extracting the Restaurant Type
Extracting the Restaurant Facility
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Extracting the Restaurant Names
Sassy Begum - Biryani, Kebabs & Curries
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants Sassy Begum - Biryani, Kebabs & Curries
Extracting the Restaurant About
Exception occured while extracting about for restaurants Sassy Begum - Biryani, Kebabs & Curries
Extracting 

Extracting the Restaurant Names
Fat Lulu's Pizza
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants Fat Lulu's Pizza
Extracting the Restaurant About
Exception occured while extracting about for restaurants Fat Lulu's Pizza
Extracting the Restaurant Type
Extracting the Restaurant Facility
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Extracting the Restaurant Names
Call Chotu
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants Call Chotu
Extracting the Restaurant About
Exception occured while extracting about for restaurants Call Chotu
Extracting the Restaurant Type
Extracting the Restaurant Facility

Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants Burger King
Extracting the Restaurant About
Extracting the Restaurant Type
Extracting the Restaurant Facility
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Extracting the Restaurant Names
Mitra Da Noodle Oodle
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants Mitra Da Noodle Oodle
Extracting the Restaurant About
Exception occured while extracting about for restaurants Mitra Da Noodle Oodle
Extracting the Restaurant Type
Extracting the Restaurant Facility
Exception occured while extracting facility for restaurants Mitra Da Noodle Oodle
Another Exception occured while extracting facility for restaur

Exception occured while extracting about for restaurants Deli Salad Co.
Extracting the Restaurant Type
Extracting the Restaurant Facility
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Total restaurant on this page are 21

Extracting the Restaurant Names
Cafe Coffee Day
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants Cafe Coffee Day
Extracting the Restaurant About
Exception occured while extracting about for restaurants Cafe Coffee Day
Extracting the Restaurant Type
Extracting the Restaurant Facility
Exception occured while extracting facility for restaurants Cafe Coffee Day
Another Exception occured while extracting facility for restaurants Cafe Coffee Day
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Res

Extracting the Restaurant Type
Extracting the Restaurant Facility
Exception occured while extracting facility for restaurants Pizza Hut
Another Exception occured while extracting facility for restaurants Pizza Hut
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Extracting the Restaurant Names
La Pino'z Pizza
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants La Pino'z Pizza
Extracting the Restaurant About
Exception occured while extracting about for restaurants La Pino'z Pizza
Extracting the Restaurant Type
Extracting the Restaurant Facility
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Extracting the Restaurant Names
Royal Punjab
Extracting the Cost 

Exception occured while extracting facility for restaurants Costa Coffee
Another Exception occured while extracting facility for restaurants Costa Coffee
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Extracting the Restaurant Names
Haldiram's
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants Haldiram's
Extracting the Restaurant About
Exception occured while extracting about for restaurants Haldiram's
Extracting the Restaurant Type
Extracting the Restaurant Facility
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Extracting the Restaurant Names
Pot Pot - Yum Yum Indian Delivery
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening 

Exception occured while extracting about for restaurants Walk In The Woods
Extracting the Restaurant Type
Extracting the Restaurant Facility
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Extracting the Restaurant Names
Sagar Ratna
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants Sagar Ratna
Extracting the Restaurant About
Extracting the Restaurant Type
Extracting the Restaurant Facility
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Extracting the Restaurant Names
Country Curries
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extr

Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Extracting the Restaurant Names
The Big Chill
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants The Big Chill
Extracting the Restaurant About
Extracting the Restaurant Type
Extracting the Restaurant Facility
Extracting the Restaurant Ratings
Extracting the Restaurant Reviews
Extracting the Restaurant Contact
Extracting the Restaurant Extra Discount
Extracting the Restaurant Names
Number 7 -The Asian Story
Extracting the Cost for 2
Extracting the addresses
Extracting the Opening Timings
Extracting the Restaurant Co-ordinates
Extracting the Offers
Exception occured while extracting offer for restaurants Number 7 -The Asian Story
Extracting the Restaurant About
Exception occured while extracting about
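
In the scraping loop, several fields (cost, address, timings, co-ordinates, reviews) are read without their own `try`/`except`. If one of them is missing, the page loop stops after the name has already been appended, and the parallel lists can end up with different lengths. One possible way to keep every record complete is to route each lookup through a small helper that returns a default on failure, and to build one dictionary per restaurant. The sketch below is illustrative only: `safe_extract`, `build_record`, and `missing_element` are hypothetical names, and the getter functions stand in for the Selenium lookups so that the example runs without a browser.

```python
from typing import Callable, Dict, List, Tuple


def safe_extract(getter: Callable[[], str], default: str) -> str:
    """Return getter(), or `default` if the lookup raises."""
    try:
        return getter()
    except Exception:
        return default


def build_record(fields: Dict[str, Tuple[Callable[[], str], str]]) -> Dict[str, str]:
    """Build one complete row; a failed lookup falls back to its default."""
    return {name: safe_extract(getter, default)
            for name, (getter, default) in fields.items()}


def missing_element() -> str:
    # Stand-in for a Selenium lookup that fails on this page
    raise LookupError("element not found")


# Hypothetical getters standing in for driver.find_element_by_xpath(...).text
fields = {
    "Name": (lambda: "Local", "No_Name"),
    "Offers": (missing_element, "No_Offer"),
    "Contact": (lambda: "09599553145", "No_Contact"),
}

rows: List[Dict[str, str]] = [build_record(fields)]
print(rows[0])
```

Because every row is built in one step, a list of such dictionaries can be passed to `pd.DataFrame` directly without any risk of columns of unequal length.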

# Data Cleaning

In [40]:
df = pd.DataFrame(dineout_data)
df.head()

Unnamed: 0,Name,Cost_for_2,Address,Opening_Timings,Co-Ordinates,Offers,About,Type,Facilities,Ratings,Reviews,Contact,Exta_Discount
0,Local,"₹ 1,800 for 2 | North Indian, Italian, Contine...","Scindia House, 11, Ground Floor, Atmaram Mansi...",(Open Now),"[28.630053, 77.221065]",No_Offer,"Local, situated in the heart of the capital, i...","Happy Hours, Dineout Pay, Casual Dining, Night...","[Ghar Ki Baat, Botalan Sharab Diyanm Paan Cosm...",4.1\n2464 Votes\n253 Reviews,Aarjav Jain\n2\nDisliked: Food\nfood was ok ok...,09599553145,No_Extra_Discount
1,Tamasha,"₹ 2,000 for 2 | Continental, Finger Food, Nort...","28 , Kasturba Gandhi Marg Connaught Place Near...",,"[28.629487, 77.221558]",20% Off the Total Bill\nAvailable nowOffer ava...,Tamasha is one of the most popular places to h...,"Happy Hours, Lounge, Dineout Pay, Casual Dinin...","[Dahi Kebab, Dragon Roll, Alfredo Penne,Golden...",4.2\n5074 Votes\n473 Reviews,Kapil Nath\n1\nDisliked: Customer Service\nOne...,09999477661,20% Off the Total Bill\nAvailable nowOffer ava...
2,The Junkyard Cafe,"₹ 1,800 for 2 | North Indian, Mediterranean, A...","91, 2nd Floor, Block N, Outer Circle Connaught...",,"[28.630223, 77.220512]",No_Offer,The Junkyard Cafe is a small chain of casual d...,"Happy Hours, Lounge, Dineout Pay, Nightlife, B...","[Bourbon Wreckage Fluid, Floral Salvage Fluid,...",4.1\n2850 Votes\n370 Reviews,Abhishek Jain\n2\nDisliked: Customer Service\n...,011-9599947642,No_Extra_Discount
3,The G.T. ROAD,"₹ 1,800 for 2 | North Indian","M- Block, 39, Outer Circle Connaught Place Nea...",,"[28.633195, 77.222756]",15% Off On Dinner Buffet (Veg Original Price -...,One of the best multi-cuisine restaurants in C...,"Happy Hours, Buffet, Lunch Buffet, Dinner Buff...","[GTR Flavored Mojitos, Vodka Shikanji, Jamun-t...",4.3\n2420 Votes\n358 Reviews,"Pankaj\n5\nLiked: Food, Customer Service, Musi...",09871121428,Complimentary Drinks\nOnly for members\nBuy Now
4,Unplugged Courtyard,"₹ 2,000 for 2 | North Indian, Continental","L 23/7, Middle Circle Connaught Place Near Ode...",,"[28.634146, 77.221615]",No_Offer,Unplugged Courtyard is a specialty restaurant ...,"Dineout Pay, Casual Dining, Nightlife, Bar, Sa...","[Smoking Area, Air Conditioned, Wifi, Valet Av...",4.1\n1845 Votes\n211 Reviews,Vansh Gulati\n1\nDisliked:\nThey are not follo...,09999396661,25% Off the Food and Soft Beverage Bill\nOnly ...


In [307]:
df_copy = df.copy()

In [328]:
def preprocess_cost(x):
    # Raw string avoids the invalid-escape warning for '\d'
    return int(re.findall(r'\d+', re.sub(',', '', x))[0])

def preprocess_cuisine(x):
    # Strip the space that follows the '|' separator
    return x.split('|')[1].strip()

def preprocess_stars(x):
    return x.split('\n')[0]

def preprocess_vote_counts(x):
    try:
        vote_count = x.split('\n')[1].split(' ')[0]
    except:
        vote_count = 0
    return int(vote_count)

def preprocess_review_count(x):
    try:
        review_count = x.split('\n')[2].split(' ')[0]
        if review_count == 'No':
            review_count = 0
            
    except:
        review_count = 0
    return int(review_count)

def preprocess_reviews_liked(x):
    sub_string_1 = "Liked"
    try:
        liked =  x[x.index(sub_string_1) : ]
    except:
        liked = "Not_Mentioned"
        
    return list(liked.split('\n'))

def preprocess_reviews_disliked(x):
    sub_string_2 = "Disliked"
    try:
        disliked =  x[x.index(sub_string_2) : ]
    except:
        disliked = "Not_Mentioned"
        
    return list(disliked.split('\n'))
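
To make the parsing steps concrete, here is a standalone sketch that applies the same logic to raw values of the shape shown in the `df.head()` output. The names `parse_cost_and_cuisine` and `parse_ratings` are hypothetical and only mirror the notebook's helpers; the two sample strings are the `Cost_for_2` and `Ratings` values of the row for The G.T. ROAD.

```python
import re
from typing import Tuple


def parse_cost_and_cuisine(raw: str) -> Tuple[int, str]:
    """Split a 'cost for 2 | cuisines' string into (cost, cuisines)."""
    cost_part, _, cuisine_part = raw.partition("|")
    # Drop thousands separators, then take the first run of digits
    cost = int(re.findall(r"\d+", cost_part.replace(",", ""))[0])
    return cost, cuisine_part.strip()


def parse_ratings(raw: str) -> Tuple[str, int, int]:
    """Split a three-line ratings string into (stars, votes, reviews)."""
    lines = raw.split("\n")
    stars = lines[0]

    def leading_int(index: int) -> int:
        # Missing lines or non-numeric words such as 'No' count as zero
        try:
            return int(lines[index].split(" ")[0])
        except (IndexError, ValueError):
            return 0

    return stars, leading_int(1), leading_int(2)


cost, cuisine = parse_cost_and_cuisine("₹ 1,800 for 2 | North Indian")
stars, votes, reviews = parse_ratings("4.3\n2420 Votes\n358 Reviews")
print(cost, cuisine, stars, votes, reviews)
```

Like `preprocess_stars`, the sketch keeps the star rating as a string; it would need a `float` conversion before any numeric analysis.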

In [309]:
df_copy.drop('Opening_Timings', axis=1, inplace=True)

In [330]:
df_copy['Cuisine'] = df_copy['Cost_for_2'].apply(lambda x : preprocess_cuisine(x))

In [331]:
df_copy['Cost_for_2'] = df_copy['Cost_for_2'].apply(lambda x : preprocess_cost(x))

In [332]:
df_copy['Latitude'] = df_copy['Co-Ordinates'].apply(lambda x : x[0])
df_copy['Longitude'] = df_copy['Co-Ordinates'].apply(lambda x : x[1])
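
The `Co-Ordinates` lists split here were produced during scraping by running the regular expression `\d*\.?\d+` over the map iframe's `src` attribute and dropping the last match. The sketch below replays that step on a made-up URL. The URL format is an assumption, since the real `src` value is not shown in this notebook; it was chosen so that the last numeric match is a non-coordinate parameter. The sketch also converts the matches to floats, which the notebook itself does not do.

```python
import re

# Hypothetical iframe src; the real map URL may look different
src = "https://maps.example.com/embed?q=28.630053,77.221065&zoom=15"

# Same pattern as the scraper: optional integer part, optional dot, digits
matches = re.findall(r"\d*\.?\d+", src)

# The scraper discards the final match (here, the zoom level)
coord = matches[:-1]

# The regex returns strings, so convert them before numeric use
latitude, longitude = (float(value) for value in coord)
print(coord, latitude, longitude)
```

The pattern does not capture a leading minus sign, which is harmless for Delhi-NCR but would lose the sign for southern or western coordinates.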

In [333]:
df_copy['Stars'] = df_copy['Ratings'].apply(lambda x : preprocess_stars(x))

In [334]:
df_copy['Vote_Count'] = df_copy['Ratings'].apply(lambda x : preprocess_vote_counts(x))

In [335]:
df_copy['Review_Count'] = df_copy['Ratings'].apply(lambda x : preprocess_review_count(x))

In [337]:
df_copy.drop(['Ratings', 'Reviews', 'Offers', 'Co-Ordinates', 'Exta_Discount'], axis=1, inplace=True)

In [338]:
df_copy.head()

Unnamed: 0,Name,Cost_for_2,Address,About,Type,Facilities,Contact,Cuisine,Latitude,Longitude,Stars,Vote_Count,Review_Count
0,Local,1800,"Scindia House, 11, Ground Floor, Atmaram Mansi...","Local, situated in the heart of the capital, i...","Happy Hours, Dineout Pay, Casual Dining, Night...","[Ghar Ki Baat, Botalan Sharab Diyanm Paan Cosm...",09599553145,"North Indian, Italian, Continental, Asian, Fa...",28.630053,77.221065,4.1,2464,253
1,Tamasha,2000,"28 , Kasturba Gandhi Marg Connaught Place Near...",Tamasha is one of the most popular places to h...,"Happy Hours, Lounge, Dineout Pay, Casual Dinin...","[Dahi Kebab, Dragon Roll, Alfredo Penne,Golden...",09999477661,"Continental, Finger Food, North Indian, Itali...",28.629487,77.221558,4.2,5074,473
2,The Junkyard Cafe,1800,"91, 2nd Floor, Block N, Outer Circle Connaught...",The Junkyard Cafe is a small chain of casual d...,"Happy Hours, Lounge, Dineout Pay, Nightlife, B...","[Bourbon Wreckage Fluid, Floral Salvage Fluid,...",011-9599947642,"North Indian, Mediterranean, Asian, Italian, ...",28.630223,77.220512,4.1,2850,370
3,The G.T. ROAD,1800,"M- Block, 39, Outer Circle Connaught Place Nea...",One of the best multi-cuisine restaurants in C...,"Happy Hours, Buffet, Lunch Buffet, Dinner Buff...","[GTR Flavored Mojitos, Vodka Shikanji, Jamun-t...",09871121428,North Indian,28.633195,77.222756,4.3,2420,358
4,Unplugged Courtyard,2000,"L 23/7, Middle Circle Connaught Place Near Ode...",Unplugged Courtyard is a specialty restaurant ...,"Dineout Pay, Casual Dining, Nightlife, Bar, Sa...","[Smoking Area, Air Conditioned, Wifi, Valet Av...",09999396661,"North Indian, Continental",28.634146,77.221615,4.1,1845,211


In [302]:
# A plain relative filename works on Windows, Linux, and macOS alike
df_copy.to_csv('dineout_data_cleaned.csv', index=False)