<h2>Data Requirements and Sources</h2>

<h2>Introducton</h2>

This project explores various eateries in Kolkata and attributes the data based on user ratings and average price. To explore this information, this project involves the data from both the <b>Foursquare API</b> and the <b>Zomato API</b> to fetch complete information of various venues (including name, address, category, rating, and price). Further, a map of the venues with specific color attributes will be plotted to
highlight their position, and information about these venues. Such plots imbibe bountiful information in the form of their colored representations and location on the map. This enables any visitor to take <b>a quick glance and decide what place to visit.</b>

We will first determine the latitude and longitude of Kolkata from google. THe coordinates obtained give a central position of the city

In [4]:
TARGET_LATITUDE = 22.572645
TARGET_LONGITUDE = 88.363892
TARGET = 'Kolkata'
print('The geograpical coordinates of {} are {}, {}.'.format(TARGET, TARGET_LATITUDE, TARGET_LONGITUDE))

The geograpical coordinates of Kolkata are 22.572645, 88.363892.


<h3>Foursquare API</h3>

Next we will fetch venues around a radius of 10 km from the centre of Kolkata using Foursqare API. 

In [5]:
CLIENT_ID = 'VAWXW5EBT00H3V1AN143DJRQHF3FGRF5Z1EJG44TWHPIZ3ZF' # your Foursquare ID
CLIENT_SECRET = 'Z5K3X30QBGQS15JEXAAEIR0BWIXNDH3UL0WXRZYEI1NCD4VB' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentails:Riddhiman')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)
RADIUS = 10000 # 10 Km
NO_OF_VENUES = 100
VERSION = '20200425' # Current date

Your credentails:Riddhiman
CLIENT_ID: VAWXW5EBT00H3V1AN143DJRQHF3FGRF5Z1EJG44TWHPIZ3ZF
CLIENT_SECRET:Z5K3X30QBGQS15JEXAAEIR0BWIXNDH3UL0WXRZYEI1NCD4VB


Defining a function for fetching the correct category types

In [6]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

We'll call the API over and over till we get all venues from the API within the given distance. The maximum venues this API can fetch is 100, so we will fetch all venues by iteratively calling this API and increasing the offset each time.

* Foursquare API requires client_id, and client_secret to function which can be accessed after creating a developer account. 
* We will set the radius as 10 Kilometers. 
* The version is a required parameter which defines the date on which we are browsing so that it retrieves the latest data.

In [8]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors

from pandas.io.json import json_normalize
import requests

pd.set_option('display.max_rows', None)

offset = 0
total_venues = 0
foursquare_venues = pd.DataFrame(columns = ['name', 'categories', 'lat', 'lng'])

while (True):
    url = ('https://api.foursquare.com/v2/venues/explore?client_id={}'
           '&client_secret={}&v={}&ll={},{}&radius={}&limit={}&offset={}').format(CLIENT_ID, 
                                                                        CLIENT_SECRET, 
                                                                        VERSION, 
                                                                        TARGET_LATITUDE, 
                                                                        TARGET_LONGITUDE, 
                                                                        RADIUS,
                                                                        NO_OF_VENUES,
                                                                        offset)
    result = requests.get(url).json()
    venues_fetched = len(result['response'])
    total_venues = total_venues + venues_fetched
    print("Total {} venues fetched within a total radius of {} Km".format(venues_fetched, RADIUS/1000))

    venues = result['response']['groups'][0]['items']
    venues = pd.json_normalize(venues)

    # Filter the columns
    filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
    venues = venues.loc[:, filtered_columns]

    # Filter the category for each row
    venues['venue.categories'] = venues.apply(get_category_type, axis = 1)

    # Clean all column names
    venues.columns = [col.split(".")[-1] for col in venues.columns]
    foursquare_venues = pd.concat([foursquare_venues, venues], axis = 0, sort = False)
    
    if (venues_fetched < 100):
        break
    else:
        offset = offset + 100

foursquare_venues = foursquare_venues.reset_index(drop = True)
print("\nTotal {} venues fetched".format(total_venues))

Total 7 venues fetched within a total radius of 10.0 Km

Total 7 venues fetched


In [10]:
foursquare_venues

Unnamed: 0,name,categories,lat,lng
0,The Oberoi Grand,Hotel,22.561749,88.351594
1,Blue & Beyond,Pub,22.559131,88.35328
2,Eden Garden,Cricket Ground,22.564542,88.343296
3,Lalit Great Eastern Hotel,Hotel,22.567967,88.35001
4,Arsalan,Mughlai Restaurant,22.553897,88.354063
5,The Blue Poppy,Asian Restaurant,22.548543,88.351353
6,Girish Chandra Dey & Nakur Chandra Nandy,Indian Sweet Shop,22.59604,88.367485
7,Peter Cat,Indian Restaurant,22.552365,88.352544
8,Maidan,Field,22.549906,88.344219
9,Aqua,Lounge,22.554734,88.35218


<h3>Zomato API<h3>

The Zomato API allows using its search API to search for any given venue based on certain search filters such as query, latitude, longitude and more. Zomato also requires a Zomato user key which can be accessed with a developer account.

We'll use the name, lat, and lng values of various venues fetched from Foursquare API to use the search API and get more information regarding each venue.

The query will be the name of the venue.
The start defines from what offset we want to start, so we'll keep it at 0.
The count defines the number of restaurants we want to fetch. As we have the exact location coordinates, we'll fetch only one.
We will supply the latitude and longitude values.
We will set the sorting criteria as real_distance so each time we get the venue we're searching based on location coordinates.

In [11]:
headers = {'user-key': '4eedf76601ee45bd257d1c6753b851c8'}
venues_information = []

for index, row in foursquare_venues.iterrows():
    print("Fetching data for venue: {}".format(index + 1))
    venue = []
    url = ('https://developers.zomato.com/api/v2.1/search?q={}' + 
          '&start=0&count=1&lat={}&lon={}&sort=real_distance').format(row['name'], row['lat'], row['lng'])
    try:
        result = requests.get(url, headers = headers).json()
    except:
        print("There was an error...")
    try:
        
        if (len(result['restaurants']) > 0):
            venue.append(result['restaurants'][0]['restaurant']['name'])
            venue.append(result['restaurants'][0]['restaurant']['location']['latitude'])
            venue.append(result['restaurants'][0]['restaurant']['location']['longitude'])
            venue.append(result['restaurants'][0]['restaurant']['average_cost_for_two'])
            venue.append(result['restaurants'][0]['restaurant']['price_range'])
            venue.append(result['restaurants'][0]['restaurant']['user_rating']['aggregate_rating'])
            venue.append(result['restaurants'][0]['restaurant']['location']['address'])
            venues_information.append(venue)
        else:
            venues_information.append(np.zeros(6))
    except:
        pass
        
zomato_venues = pd.DataFrame(venues_information, 
                                  columns = ['venue', 'latitude', 
                                             'longitude', 'price_for_two', 
                                             'price_range', 'rating', 'address'])

Fetching data for venue: 1
Fetching data for venue: 2
Fetching data for venue: 3
Fetching data for venue: 4
Fetching data for venue: 5
Fetching data for venue: 6
Fetching data for venue: 7
Fetching data for venue: 8
Fetching data for venue: 9
Fetching data for venue: 10
Fetching data for venue: 11
Fetching data for venue: 12
Fetching data for venue: 13
Fetching data for venue: 14
Fetching data for venue: 15
Fetching data for venue: 16
Fetching data for venue: 17
Fetching data for venue: 18
Fetching data for venue: 19
Fetching data for venue: 20
Fetching data for venue: 21
Fetching data for venue: 22
Fetching data for venue: 23
Fetching data for venue: 24
Fetching data for venue: 25
Fetching data for venue: 26
Fetching data for venue: 27
Fetching data for venue: 28
Fetching data for venue: 29
Fetching data for venue: 30
Fetching data for venue: 31
Fetching data for venue: 32
Fetching data for venue: 33
Fetching data for venue: 34
Fetching data for venue: 35
Fetching data for venue: 36
F

In [12]:
zomato_venues

Unnamed: 0,venue,latitude,longitude,price_for_two,price_range,rating,address
0,The Bar - The Oberoi Grand,22.5606509835,88.351511322,3300.0,4.0,3.8,"The Oberoi Grand, 15, Jawaharlal Nehru Road, N..."
1,Blue And Beyond,22.5590873999,88.3532292768,1600.0,3.0,3.9,"The Lindsay, 8A & 8B, Lindsay Street, New Mark..."
2,Shree Balaji South Indian Mess Veg,22.5701528559,88.3266398683,200.0,1.0,3.5,"Shop 11, Block A2, 106 Kiran Chandra Singha Ro..."
3,The Tea Lounge - The Lalit Great Eastern,22.5679531486,88.3500418067,1500.0,3.0,3.6,"The Lalit Great Eastern, 1 - 3, Old Court Hous..."
4,Kareem's,22.5538791454,88.3543313295,1200.0,3.0,4.3,"55 B, Mirza Ghalib Street, Park Street Area, K..."
5,Biggies Burger,22.549124634,88.3505276218,600.0,2.0,4.2,"6, Russel Street, Camac Street Area, Kolkata"
6,Nobin Chandra Das & Sons,22.5992008678,88.3661628887,100.0,1.0,3.0,"77, Jatindra Mohan Avenue, Shobha Bazar, Kolkata"
7,Peter Cat,22.5524600857,88.352618739,1000.0,3.0,4.2,"18A, Park Street, Park Street Area, Kolkata"
8,The Pancake Centre,22.5474596648,88.3491224796,200.0,1.0,3.6,"Food Court, Metro Shopping Centre, 1 Ho Chi Mi..."
9,Aqua - The Park,22.5542185038,88.3513721824,3000.0,4.0,4.0,"The Park, 17, Park Street Area, Kolkata"
